CS257 Software Design
Friday, 7 October 2022

0. questions?

1. a bad database

    1 table

        title   pubyear   author        author_years
        Emma    1815      Jane Austen   1775-1817

    Problems
    - Author name is not parsed into pieces
    - Same deal w/ author_years
    - Doesn't support multiple authors for a book
    - Duplicate data: "Jane Austen" appears in the table multiple times

2. principles and observations
    - data is a mess
    - data is repetitive
    - data duplication is a problem
        - memory and disk space
        - errors in multiple locations
    - one-to-many and many-to-many relationships require separation of data
    - conceptual coherence
    - Abstract Data Type thinking
    - aesthetic judgment

3. looking at the Olympics data
    - downloading
    - file formats
    - Proposed Tables

        - olympic_games: year, events in games, who attended, city, season
            uh-oh: events in games (a zillion per year)
            uh-oh: who attended (a zillion per year)

        Try again

        olympic_games: id, year, season, city
        events: id, name
        nocs: id, abbreviation, name
        athletes: id, name, sex, weight? height?

        olympic_games
            id   year   season   city
            1    2012   Summer   London
            2    1980   Winter   Lake Placid
            3    1984   Winter   Sarajevo

        events
            id   name
            1    Men's Ice Hockey

        events_olympic_games (linking table)
            event_id   olympic_games_id
            1          2
            1          3

        athletes
            id

        - player_attributes: age, sex, weight, height, team, ID, medals, events

4. CSV processing

    "ID","Name","Sex","Age","Height","Weight","Team","NOC","Games","Year","Season","City","Sport","Event","Medal"
    "462","Ilze bola","F",19,169,60,"Latvia","LAT","1998 Winter",1998,"Winter","Nagano","Alpine Skiing","Alpine Skiing Women's Giant Slalom",NA
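
A rough sketch of the CSV processing step, assuming the goal is to split the flat file into the tables proposed in section 3. The function names (split_into_tables, na_to_none), the use of (year, season, city) as the key for one Olympic Games, and the choice of plain dicts and a set are illustrative assumptions, not something fixed in these notes. The sample input is just the header and the single row shown above.

```python
import csv
import io

# Sample input: the header and the one data row quoted in the notes.
SAMPLE_CSV = '''"ID","Name","Sex","Age","Height","Weight","Team","NOC","Games","Year","Season","City","Sport","Event","Medal"
"462","Ilze bola","F",19,169,60,"Latvia","LAT","1998 Winter",1998,"Winter","Nagano","Alpine Skiing","Alpine Skiing Women's Giant Slalom",NA
'''


def na_to_none(value):
    """The sample row writes a missing value as the bare token NA."""
    return None if value == "NA" else value


def split_into_tables(csv_file):
    """Read flat athlete-event rows and return four normalized tables.

    Each lookup dict maps a natural key to a generated integer id, so a
    repeated value (a city, an event name) is stored only once.
    """
    athletes = {}        # athlete id from the file -> (name, sex)
    olympic_games = {}   # (year, season, city) -> generated id
    events = {}          # event name -> generated id
    links = set()        # (event_id, olympic_games_id) pairs (linking table)

    for row in csv.DictReader(csv_file):
        athletes[int(row["ID"])] = (row["Name"], na_to_none(row["Sex"]))

        # setdefault returns the existing id if this key was seen before;
        # otherwise it stores and returns the next unused id.
        games_key = (int(row["Year"]), row["Season"], row["City"])
        games_id = olympic_games.setdefault(games_key, len(olympic_games) + 1)
        event_id = events.setdefault(row["Event"], len(events) + 1)

        # A set drops duplicates: many athletes share one (event, games) pair.
        links.add((event_id, games_id))

    return athletes, olympic_games, events, links


if __name__ == "__main__":
    tables = split_into_tables(io.StringIO(SAMPLE_CSV))
    for table in tables:
        print(table)
```

To run this on a real download, pass an open file instead of the StringIO object, for example open(path, newline="", encoding="utf-8"), where path is wherever the CSV was saved. Each returned table could then be written out as its own CSV file with csv.writer.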