This assignment should be done individually. You may work with other people, get ideas, help each other debug, and so on; but the scripts that you submit should be your own.
PostgreSQL is installed on the department machine named lime. Make sure to connect to lime before doing anything else for the assignment. If you're connecting remotely via ssh from outside the building, you may have to connect to prism first.
This assignment will use a database containing college-type data. The data is located in seven flat files. You can find all the flat files in this zip file.
Your first goal is to import this data into relations in PostgreSQL. The schema of the database that you should create is as follows (keys are in bold):
student (sid, sname, sex, age, year, gpa)
dept (dname, numphds)
prof (pname, dname)
course (cno, cname, dname)
major (dname, sid)
section (dname, cno, sectno, pname)
enroll (sid, grade, dname, cno, sectno)
To import a flat file into PostgreSQL, here are the steps you should follow:
You should import the data from the other six relations similarly. The order that you import them in is relevant if you're using foreign keys.
Design SQL queries that answer the questions given below (one query per question) and run them using PostgreSQL. Your queries should be correct with respect to the design of this dataset. In other words, we should be able to use your SQL queries on another dataset with the same schema and get correct answers even if the actual data within is different.
The query answers should be duplicate free, but you should use distinct only when necessary in general. Determining whether or not you need distinct should not depend on the particular dataset you have before you, but on its design.
The following questions are roughly in the order of degree of difficulty.
All of your SQL queries should be contained in a single file called queries.sql.
Does the dataset that I've given you allow one to "cheat" at any of the above questions, i.e. issue a query that is wrong with respect to the schema but still gives the correct answers for this particular instance? If so, which questions are they, and what is it about this sample instance that allows "wrong" queries to work? Feel free to include any thoughts you have on this in a separate file that you can submit with your assignment.
An important application for database systems is supplying data for building web pages. Create a web page that will show the results from your first 3 queries above. Here's how do so:
Remember that these scripts are readable by anyone in the class! That's why I've asked you to put the SQL for your first 3 queries only on the web page. Grading the web page will not be a significant portion of the assignment, but is intended to be a fun way for you to see how your results could be used.
Part 1: Submit queries 1-4.
Part 2: Submit queries 5-12 and the web page.
Submit via Moodle your queries.sql as well as a copy of your index.php file (for part 2).