Web Application: selecting a dataset

You will work with your web application team for this assignment.

The web application

For the remainder of the term, you'll be working on various aspects of a database-driven web application. Think about a site like the World Population Clock, Baseball Reference, this super-cool baby name visualizer, Worldometer, IMDb, or this ridiculous little toy for movie lovers. Each of these sites offers various ways of searching and reporting on aspects of a complex (or not-so-complex) dataset.
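To make "searching and reporting" a little more concrete, here is a minimal sketch of the idea in Python. Everything in it is an assumption made for illustration: the in-memory SQLite database, the `names` table, the function names, and the sample rows (which are invented, not real statistics). It is not a description of the setup we will actually use in this course.

```python
import sqlite3

# Hypothetical toy data, invented for illustration only (not real statistics).
SAMPLE_ROWS = [
    ("Ada", 2001, 120),
    ("Ada", 2002, 150),
    ("Grace", 2001, 300),
    ("Grace", 2002, 280),
    ("Adam", 2002, 90),
]


def build_database():
    """Create an in-memory SQLite database holding the toy rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE names (name TEXT, year INTEGER, births INTEGER)"
    )
    connection.executemany("INSERT INTO names VALUES (?, ?, ?)", SAMPLE_ROWS)
    connection.commit()
    return connection


def total_births_by_prefix(connection, prefix):
    """Search: names starting with prefix. Report: total births per name."""
    query = """
        SELECT name, SUM(births) AS total
        FROM names
        WHERE name LIKE ? || '%'
        GROUP BY name
        ORDER BY total DESC
    """
    # The prefix is passed as a parameter rather than pasted into the SQL text.
    return connection.execute(query, (prefix,)).fetchall()


connection = build_database()
results = total_births_by_prefix(connection, "Ad")
for name, total in results:
    print(f"{name}: {total}")
```

The "search" part is the `WHERE` clause and the "report" part is the `GROUP BY` with `SUM`. A real site wraps queries like this in a web interface, but the underlying pattern is much the same.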

Our application will use a pretty typical setup:

There are performance, usability, and maintainability/extensibility tradeoffs in this structure that we'll discuss as we go along.

First step: pick a dataset

For the purposes of this project, you're going to start by selecting a dataset suitable for the pedagogical goals of the project. Normally, you would enter into a project knowing what data is involved, since there wouldn't be a project at all unless you or somebody else had an idea for what you wanted to build. But class projects are a little weird. Let's roll with it.

We want data that has the following properties:

Where can you find interesting data?

If you brainstorm about a website you would find interesting, you can certainly search for a relevant dataset on your own. But if you need some inspiration, the Carleton library has assembled a good list of sites with datasets suitable for this class. Check it out.

Here are some other places that might have something interesting:

In my experience, many students find their way to kaggle.com. That's not necessarily bad, but Kaggle datasets are extremely variable in quality and interest. Many of them are pretty bad or just kinda weird. So don't look only there.
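Since quality varies so much, it can help to take a quick look inside a candidate dataset before committing to it. Here is a minimal sketch, assuming the dataset is a CSV file with a header row. The function name `summarize_csv` and the sample data are hypothetical, invented for illustration. It counts the rows and, for each column, how many values are blank or missing.

```python
import csv
import io


def summarize_csv(text_stream):
    """Return (row_count, {column: missing_count}) for CSV text with a header row."""
    reader = csv.DictReader(text_stream)
    missing = {column: 0 for column in (reader.fieldnames or [])}
    row_count = 0
    for row in reader:
        row_count += 1
        for column in missing:
            value = row.get(column)
            # A short row yields None for absent fields; an empty cell yields "".
            if value is None or value.strip() == "":
                missing[column] += 1
    return row_count, missing


# Hypothetical sample, invented for illustration.
SAMPLE = """title,year,rating
Movie A,1999,8.1
Movie B,,7.4
Movie C,2005,
"""

row_count, missing = summarize_csv(io.StringIO(SAMPLE))
print(row_count, missing)
```

To try this on a downloaded file, pass an open file object instead of the `io.StringIO` wrapper, for example one opened with `open(path, newline="", encoding="utf-8")`. A dataset with very few rows, or with columns that are mostly empty, is probably worth a second thought.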

Your tasks

Have fun!