This is intended to give you a sense of what I think is important from the course so far, and what I will be thinking of when creating the exam. Here are some disclaimers. This is not a contract. I may have inadvertently left something off this list that ends up in an exam question. I make no guarantees that the exam will be 100% limited to items listed below. Moreover, I will not be able to test all of this material given the time limitations of the exam. I will have to pick and choose some subset of it. You are permitted one 8.5 x 11 sheet of paper with notes (both sides) for use as a reference during the exam. Here are the specifics: Students should be able to... SQL: Be able to produce, or interpret the results of, complex queries of the level of difficulty that have appeared in assignments. Indexing: Be able to describe alternative file organizations to the heap file, and quantitatively describe advantages and disadvantages. Be able to define and quantitatively assess merit of indexing strategies such as primary vs. secondary indexes, clustered indexes, dense vs. sparse indexes, built-in vs. external vs. external with list indexes (in book as Alternatives 1, 2, and 3). Be able to show detailed examples of how inserting and deleting works in B+ trees, extendible hashing, and linear hashing when sufficient assumptions on implementation are provided. Be able to explain advantages and disadvantages of each of the above techniques, and why each might be chosen. Be able to work out approximate I/O costs for retrieving data using a particular indexing technique. Query Evaluation: Be able to explain and/or demonstrate... - algorithms used for selection, projection, join, and aggregation - how approximate costs for such operations are calculated - various join algorithms such as tuple nested loops, block nested loops, index nested loops, sort merge join, and hash join, and I/O costs associated with these techniques - external sort, and "replacement sort" optimization (and associated costs) Query Optimization: Be able to evaluate alternative query evaluation strategies to determine which is more likely to be chosen. Be able to generate query evaluations strategies for a particular query, and indicate how to decide amongst them. Be able to describe how query optimizer approaches above problems, and show specific examples. Normalization strategies: Be able to interpret BCNF and 3rd normal form. Be able to indicate whether or not certain examples satisfy these criteria. Be able to define a functional dependency and be able to determine whether or not a given FD is derived from others. Be able to determine whether a given decomposition is dependency preserving and/or lossless join. Transactions: Be able to describe need for transactions. Be able to work through transaction schedules to show what occurs. Be able to describe various locking protocol (2PL, strict 2PL), show how they work, and address what problems they handle/don't handle. Show how deadlock and cascading aborts are affected by above locking techniques.