Building and using a database

Due noon Friday, 5/16/03, on paper.

What to build

For this assignment, you will build a database of information about movies. Your database should start with three tables, named movies, directors, and performers. The movies table should include at least the title and year of each movie, and the directors and performers tables should contain the names and birth years of each person.

Once the database is built and filled with some data, you are going to want to ask questions about the data using SELECT statements of various sorts. To answer the question "list all the movies directed by Billy Wilder," for example, you will need some kind of connection between movies and directors. You might be tempted to introduce a director_id field into the movies table, but you'll get in trouble with any movie directed by more than one person (the Coen brothers, for example). Instead, you should create a table named something like directed_by or director_movie_links, each of whose rows contains a director ID and a movie ID. For example, your table might include one row with Joel Coen's director ID and the movie ID for "Fargo," and another row for Ethan Coen's director ID and the movie ID for "Fargo." This setup allows you to have movies with multiple directors, as well as directors with multiple movies. You will want to use a similar table to show the connections between performers and movies.

Your database should include at least ten movies, ten performers, and five directors.

There are many on-line resources for the movie-related data you will need for this assignment, the best known of which is probably the Internet Movie Database.

What to hand in

Once you have built your database and added data, hand in the following items, along with the SQL query that produced each item.

  1. A list of all your tables.

  2. The full contents of each table.

  3. For some director (pick one), a list of the movies he/she directed (according to your database, that is--no need to include all sixty-seven Hitchcock). This list should be ordered by year.

  4. For some performer, a list of the movies he/she acted in, ordered by movie title.

  5. For some movie, a list of the performers in the movie, alphabetized by last name.

  6. For some director, a list of the performers who acted in the director's movies, alphabetized by last name.

  7. A list of performers who acted in movies released before 1970. Include the movies and the years, and order the list by year. Note that a performer might appear in this list many times (for example, Ingrid Bergman in "Casablanca," "Notorious," and "Murder on the Orient Express"), and a movie might also appear many times (Ingrid Berman, Peter Lorre, and Humphrey Bogart in "Casablanca").

  8. A list of movies directed by directors born after 1950, alphabetized by movie title.

  9. Any other interesting list of your own choice.

To generate the many lists I have asked you for, first figure out the SQL query that creates each list. Then launch mysql and run a session in which you make each query in turn. Finally, copy the mysql session into a text editor, adjust margins or delete mistakes to make it readable, and print it out. If you have comments about any of the queries or lists, feel free to add them before printing or by hand after printing.

You might want some SQL tutorial information or reference material. Some possibilities include the W3 Schools SQL Tutorial, the tutorial at mysql.com, and the MySQL reference page.

Start early, have fun, and let me know if you have questions.