Working With Data





REST APIs

When we visit a website, our browser makes an HTTP request for that page's HTML, as well as any subsequent HTTP requests needed to load its assets (fonts, images, videos, etc). In the case of our projects (which we host on GitHub's servers), someone visiting our work in their browser sends requests to the GitHub servers, which send back the code we wrote (and any other files/assets we might be storing there). But our code can also make requests to other servers. In this way our work can incorporate data and other assets from various other parts of the web.

One of the most common approaches is to send requests from our JavaScript code to other servers which make data, assets and other services available through an interface known as a REST API. These 3rd party (meaning controlled by someone else) REST APIs usually provide data formatted in JSON (though sometimes you see other formats like XML) which we can access by sending HTTP requests to specific URLs.
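To make the JSON part concrete, here is a small sketch of what the text sent back by such an API might look like, and how JavaScript turns it into an object we can work with. The response text and its field names are made up for illustration; every real API documents its own structure.

```javascript
// A hypothetical JSON response body: the raw text a REST API might send back.
// (These field names are invented for this example.)
const responseText = '{"status":"success","items":[{"id":1,"title":"first"},{"id":2,"title":"second"}]}'

// JSON.parse turns that text into a regular JavaScript object.
const data = JSON.parse(responseText)

// Once parsed, we can read values with the usual dot and bracket notation.
const titles = data.items.map(item => item.title)
console.log(data.status) // "success"
console.log(titles.join(', ')) // "first, second"
```

When a request is made with the Fetch API, this parsing step is usually handled for us by the response's json() method.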

There are loads of these sorts of APIs online; apilist.fun and programmableweb.com are just a couple of the sites which attempt to aggregate as many of them as they can. The City of Chicago also has a REST API, which gives us access to all sorts of city data, at data.cityofchicago.org.

While it's possible to send requests to these REST APIs by entering their URLs in our browser's address bar (a great way to inspect what sort of data you'll be getting back and how it's structured), in order to create work using this data we need a way to send HTTP requests from our JavaScript code. There have been different ways of accomplishing this over the years, beginning in the early days of Web 2.0 (mid 2000s) with the XMLHttpRequest object, a browser API for making HTTP requests. This was followed by an easier-to-use browser API called Fetch.

Below you'll find 3 netnet examples, each of which sends a request to the same REST API, called dog.ceo, which returns a random image of a dog. The first uses the older XMLHttpRequest API, the second uses the Fetch API, and the third uses the newer "async / await" syntax to use the Fetch API with cleaner and easier to read code. All three examples technically do the same thing (the difference is the syntax).

XMLHttpRequest (old way)
Fetch API: then(callback) (newer way)
Fetch API: async/await (newest way)
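
For reference, the three approaches might look roughly like the sketch below. The endpoint URL and the shape of the response (an object whose "message" property holds the image URL) are assumptions based on dog.ceo's public documentation, and the function names are made up for this example. The optional fetchFn argument is not part of the Fetch API; it's only there so the functions can be tried out with a stand-in for fetch.

```javascript
// Assumed endpoint: check dog.ceo's documentation for the current URL.
const DOG_URL = 'https://dog.ceo/api/breeds/image/random'

// XMLHttpRequest (old way): set up event handlers, then send the request.
function getDogWithXHR (callback) {
  const xhr = new XMLHttpRequest()
  xhr.open('GET', DOG_URL)
  xhr.onload = () => {
    const data = JSON.parse(xhr.responseText) // parse the JSON text ourselves
    callback(data.message) // assumed: "message" holds the image URL
  }
  xhr.onerror = () => console.error('request failed')
  xhr.send()
}

// Fetch API with then(callback): each step is a callback passed to .then()
function getDogWithThen (fetchFn = fetch) {
  return fetchFn(DOG_URL)
    .then(res => res.json()) // parse the response body as JSON
    .then(data => data.message)
}

// Fetch API with async/await: the same steps, written top to bottom.
async function getDogWithAwait (fetchFn = fetch) {
  const res = await fetchFn(DOG_URL)
  const data = await res.json()
  return data.message
}

// In a browser, add the image to the page once the request finishes.
if (typeof document !== 'undefined') {
  getDogWithAwait()
    .then(url => {
      const img = document.createElement('img')
      img.src = url
      document.body.appendChild(img)
    })
    .catch(err => console.error('request failed:', err))
}
```

All three functions end up with the same image URL; what changes is how we say "wait for the response, then do something with it".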



Other Data Sources

Not all data-driven projects you come across online make use of these 3rd party REST APIs. Sometimes data is made available for download so you can host it locally with your project (i.e. upload it directly to your GitHub project like you would images or other assets). There are lots of places to find datasets online. One popular repository of data is kaggle.com, or check out Jeremy Singer-Vine's Data Is Plural newsletter, where he shares interesting datasets on a weekly basis (every single dataset he's previously shared in the newsletter can be found on this spreadsheet).
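
As a sketch of what using a locally hosted dataset could look like, the code below assumes a CSV file named data/dataset.csv has been uploaded alongside the project (the file path and function names are hypothetical). The parser is deliberately naive: it assumes a header row and no commas or line breaks inside quoted values, so a real dataset may call for a proper CSV library.

```javascript
// Turn CSV text into an array of objects, using the first row as the keys.
// Naive on purpose: does not handle quoted values that contain commas.
function parseCSV (text) {
  const lines = text.trim().split(/\r?\n/)
  const headers = lines[0].split(',').map(h => h.trim())
  return lines.slice(1).map(line => {
    const cells = line.split(',')
    const row = {}
    headers.forEach((key, i) => { row[key] = (cells[i] || '').trim() })
    return row
  })
}

// Request the file with a relative path, the same way we'd link to an image.
// 'data/dataset.csv' is a placeholder for wherever the file actually lives.
async function loadLocalData (path = 'data/dataset.csv') {
  const res = await fetch(path)
  const text = await res.text()
  return parseCSV(text)
}
```

Note that every value comes back as a string, so numeric columns would still need converting (for example with Number()) before doing any math with them.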

Sometimes the data we want is out there on some website, but there's no "download" button nor is there a REST API to conveniently request the data from. When this is the case you can create a "web scraper", a little bit of JavaScript code which acts as a bot that goes out onto the Internet and downloads the data off the website you want. These typically need to be custom written to ensure you get only the data you want, organized the way you want. This is beyond the scope of what we'll cover in class, but if this is something you're interested in learning more about, email me.
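
To give a rough sense of the idea, here is a minimal sketch of a scraper that collects every link address on a page. The function names are hypothetical, and the regular expression is a simplification that only matches double-quoted href attributes; real scrapers usually rely on an HTML parsing library instead. It also assumes the code runs outside the browser (for example in Node.js version 18 or later, where fetch is built in), since browsers generally block reading pages from other sites unless those sites allow it.

```javascript
// Pull the address out of every <a href="..."> in a string of HTML.
// Simplified: only matches href values wrapped in double quotes.
function extractLinks (html) {
  const pattern = /<a\s[^>]*?href="([^"]*)"/gi
  return Array.from(html.matchAll(pattern), match => match[1])
}

// Download a page as text and return the links found in it.
async function scrapeLinks (url) {
  const res = await fetch(url)
  const html = await res.text()
  return extractLinks(html)
}
```

Most of the custom work in a real scraper is in the equivalent of extractLinks: deciding which parts of the page hold the data you want and how to organize them.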

Lastly, you could create a project which generates its own data. This can be an automated process, or it could be user-generated data (like in Aaron Koblin's Mechanical Turk projects). In these cases you need somewhere to put the generated data, which requires "server side" or "back-end" JavaScript. In this class we're mostly focused on "client side" or "front-end" JavaScript, so this is again beyond the scope of the class, but email me if you're interested in experimenting with something like this.