Assignment 6 - Parsing and Visualizing Data

Due: Monday, May 6, 2024, at 10pm

You may work alone or with a partner, but you must type up the code yourself. You may also discuss the assignment at a high level with other students. You should list any student with whom you discussed each part, and the manner of discussion (high-level, partner, etc.) in a comment at the top of each file. You should only have one partner for an entire assignment.

You should submit your assignment as an a6.zip file on Moodle.

Parts of this assignment:

You will work in two different files for this assignment. Additionally, you will need some data files.

  • Part 1: skeleton code and data file. You should save the code file as shapeParser.py and the data file as shapes.csv, both in a folder with graphics.py.
  • Part 2: skeleton code and data file. You should save the code file as weatherPlotter.py and the data file as weatherData.csv.

Comments and collaboration

As with all assignments in this course, for each file in this assignment, you are expected to provide top-level comments (lines that start with # at the top of the file) with your name and a collaboration statement. For this assignment, you have multiple programs; each needs a similar prelude.

You need a collaboration statement, even if just to say that you worked alone.

Note on style:

The following style guidelines are expected moving forward, and will typically constitute 5-10 points of each assignment (out of 100 points).

  • Variable names should be clear and easy to understand, should not start with a capital letter, and should only be a single letter when appropriate (usually for i, j, and k as indices, potentially for x and y as coordinates, and maybe p as a point, c for a circle, r for a rectangle, etc.).
  • It’s good to use empty lines to break code into logical chunks.
  • Comments should be used for anything complex, and typically for chunks of 3-5 lines of code, but not every line.
  • Don’t leave extra print statements in the code, even if you left them commented out.
  • Make sure not to have code that computes the right answer by doing extra work (e.g., leaving a computation in a for loop when it could have occurred after the for loop, only once).
  • Avoid having many similar lines of code immediately after one another that could have been in a loop.

Note: The example triangle-drawing program on page 108 of the textbook demonstrates a great use of empty lines and comments, and has very clear variable names. It is a good model to follow for style.

Part 1: Parsing a CSV file into shapes

# You should already be fully equipped to complete this part (it’s primarily based on Lesson 14—Wednesday Apr. 24).

In this part, you will complete the implementation of a program that parses a .csv file into a window of shapes. An example is in shapes.csv:

window,800,600

circle,600,200,40,blue
triangle,700,200,700,300,600,100,red
rectangle,50,60,200,400,yellow
square,40,40,30,magenta

circle,60,400,25
triangle,400,200,400,300,300,400
rectangle,770,500,650,250
square,500,400,100

As you can see, most lines of the file contain either the window information or information for a single shape, and a shape may or may not have a color specified. Some lines are empty, and your code should simply ignore them (but not crash when they are encountered).
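As a minimal sketch of how blank lines can be handled safely, the snippet below strips each line, skips it if nothing is left, and otherwise splits it on commas. The helper name splitLine and the in-memory list of sample lines are assumptions made for illustration; they are not part of the skeleton code.

```python
def splitLine(line):
    """
    Hypothetical helper (not in the skeleton code): returns the
    comma-separated fields of line as a list of strings, or an
    empty list if the line is blank.
    """
    stripped = line.strip()  # removes the trailing newline and any spaces

    if stripped == "":
        return []

    return stripped.split(",")

# A few sample lines in the same format as shapes.csv
sampleLines = ["window,800,600\n", "\n", "circle,60,400,25\n"]

# Collect the fields of every non-blank line
fields = []
for line in sampleLines:
    vals = splitLine(line)
    if len(vals) > 0:
        fields.append(vals)
```

Note that every field is still a string after splitting, so numeric values need to be converted before they are used.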

We talked in class about top-down design. An alternative is bottom-up design. In bottom-up design, you determine the functions you need to complete the smaller-scale tasks, and then determine how to connect them together.

Note: there are useful libraries that can help you parse CSV files, but I do not want you to use them for this assignment. I want to be sure that you can parse the file on your own first.

Part a: complete shape-parsing functions

For this first subpart, you should complete the implementations of the four shape-parsing functions:

  • parseCircle
  • parseSquare
  • parseRectangle
  • parseTriangle

Each function should take in a list of information about the shape, including a possible color. The specific list varies for each shape type. Your functions should create the relevant shape object from graphics.py, fill it in (if the provided list contains a color string), and return it. These functions have a return value (a shape object), but no side effect. They should not draw their shape.

Pay careful attention to the expected input types in the docstring for each function. For example, for a circle, note that it takes in a list of strings, and that list should have either three or four values in it:

def parseCircle(vals):
    """
    Creates a Circle object based on the values given in vals.

    Assumes vals is either a list [x,y,r,color] of strings,
    or [x,y,r].

    x: x-coordinate of center point (expect an int in the str)
    y: y-coordinate of center point (expect an int in the str)
    r: radius (expect a float in the str)
    color: optional color (str)

    Returns: the Circle object
    """
    # TODO
    return Circle(Point(0,0),0) # replace with your code
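One possible way to deal with the optional color is to compare the length of the list to the number of required values. The sketch below assumes a hypothetical helper named splitColor (not part of the skeleton code) and only shows the string handling; it does not create any graphics.py objects.

```python
def splitColor(vals, numRequired):
    """
    Hypothetical helper: separates the required values in vals
    from an optional trailing color string.

    Returns: the list of required strings and the color
    (None if no color was given)
    """
    required = vals[:numRequired]

    color = None
    if len(vals) > numRequired:
        color = vals[numRequired]

    return required, color

# Example with the circle format [x,y,r,color]
nums, color = splitColor(["600", "200", "40", "blue"], 3)

# Convert the strings to the types the docstring expects
x = int(nums[0])
y = int(nums[1])
r = float(nums[2])
```

Whether you use a helper like this or an if statement inside each parsing function is a design choice; either approach fits the docstrings.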

Part b: putting the pieces together

All of these functions are needed to parse a file into a GraphWin object and a list of shapes. An early part of software design is writing pseudocode. Pseudocode represents plans for code without the actual syntax (so no colons, function calls, etc.). The function parseShapesFromFile needs to read in the lines of the CSV file and parse each line (at least, each one that isn’t blank) into the window or a shape. The pseudocode for this function is given to you in comments:

def parseShapesFromFile(filename):
    """
    Opens the file specified by filename for reading, and parses
    the window dimensions and the shapes.

    Returns: the GraphWin object and a list of shapes
    """
    # Specify variables for the window and the shape list
    # Open the file for reading
        # Go through each line in the file
            # Skip over any empty lines
            # Split the line into strings on the commas
            # Use the first string in the line to determine which object to create
    # Return the window and shapes list

Complete this function. Note that you should plan to keep the pseudocode comments around; you don’t have to do work to comment this function, and they will hopefully help you as you code!

Part c: drawing the shapes

One of the beautiful things about the shapes in graphics.py is that they all have a draw method. For a variable shape that represents any type of shape object, you can call shape.draw(win) to draw that shape in the GraphWin object win. This is called polymorphism.
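To illustrate the idea without graphics.py, here is a small sketch using two stand-in classes. FakeCircle, FakeSquare, and drawAny are hypothetical names invented for this example, and their draw methods only record a message in a list rather than drawing anything.

```python
class FakeCircle:
    """Hypothetical stand-in for a shape class; not part of graphics.py."""
    def draw(self, log):
        log.append("circle drawn")

class FakeSquare:
    """Hypothetical stand-in for a shape class; not part of graphics.py."""
    def draw(self, log):
        log.append("square drawn")

def drawAny(shape, log):
    # Works for any object with a draw method, whatever its type
    shape.draw(log)

log = []
drawAny(FakeCircle(), log)
drawAny(FakeSquare(), log)
```

The function drawAny never checks which kind of shape it was given; the same call does the right thing for each type.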

To see this for yourself, you need to add a couple of lines of code to main to actually draw the shapes to the window.

def main():
    # Read in the provided file and parse it into a window and shapes
    filename = "shapes.csv"
    win, shapes = parseShapesFromFile(filename)

    # Draw each shape (in order) in the window
    pass # TODO: replace with your code

    # Wait to close until the user clicks
    win.getMouse()

Here is the output you should see for the example file:

<image: shapes>

Part 2: Visualizing weather data

# You should be fully equipped to complete this part after Lesson 16 (Wednesday May 1).

This part will require Matplotlib, so make sure to get it if you haven’t already. You can get it from the command prompt (search “cmd” in Windows or “Terminal” on a Mac), with the following command:

pip3 install matplotlib

Part a: parsing the data

I downloaded a bunch of weather data from NOAA’s website. It is provided to you in a CSV file called weatherData.csv.

Disclaimer: All of the data is as-provided by NOAA, except that I changed the station name to not include a comma.

Here are the first few lines of the file:

NAME,DATE,PRCP,SNOW,TMAX,TMIN
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/1/2023,0,0,35,22
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/2/2023,0.02,0.1,27,22
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/3/2023,0.65,6,31,24
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/4/2023,0.61,8.8,33,30
MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/5/2023,0.01,0.2,30,18

For the first task of this part, you should implement the parseData function. This function should grab the date strings, precipitation and snow amounts (as floats) and minimum and maximum temperatures (as ints) for each date from the file, and return five lists.

def parseData(filename):
    """
    Opens the CSV file with name filename for reading, and reads in the data.

    filename: string
    returns: five lists:
      - one of dates (strings)
      - one of precipitation (floats)
      - one of snow (floats)
      - one of minimum temps (ints)
      - one of maximum temps (ints)
    """
    dates = []
    precip = []
    snow = []
    minTemps = []
    maxTemps = []

    # TODO: Part 2a
    # your code here

    return dates, precip, snow, minTemps, maxTemps
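As a sketch of the type conversions involved, the snippet below handles a single data line copied from the excerpt above. It is only an illustration of one line, not the full function: your code still needs to read the file, skip the header line, and build the lists. The variable names here are assumptions.

```python
# One data line from the excerpt of weatherData.csv
line = "MINNEAPOLIS ST. PAUL INTERNATIONAL AIRPORT,1/3/2023,0.65,6,31,24\n"

# Columns are NAME,DATE,PRCP,SNOW,TMAX,TMIN
vals = line.strip().split(",")

date = vals[1]                 # kept as a string
precipAmount = float(vals[2])
snowAmount = float(vals[3])    # float("6") gives 6.0
maxTemp = int(vals[4])         # TMAX comes before TMIN in the file
minTemp = int(vals[5])
```

Note that the file lists the maximum temperature before the minimum, while parseData returns the minimum list before the maximum list.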

Part b: plotting the data

Given the data you’ve parsed out of the file from Part a, you should plot the data using Matplotlib. Your plot should have:

  • both the minimum and maximum values in the same plot
  • a title
  • axis labels
  • a legend

Hint: Check out the function plot. Note that formal parameters listed with [] are optional (like the x-axis values, x, and the line format, fmt).

Here is what it should look like:

<image: plot of temperatures in 2023>

To do this, fill in the function plotTempData:

def plotTempData(minTemps, maxTemps):
    """
    Plots both the minimum and maximum temperature for each day
    in the same plot.
    """
    # TODO: Part 2b
    pass # replace with your code

For this function (and no others except Part c), you are encouraged to explore online for help using Matplotlib. This can include the Matplotlib documentation and Stack Overflow, but should not include tools to generate code for you, like ChatGPT or Bard. Just make sure to cite any website (with the full URL) you visit in your readme.txt. Note that you should still only be collaborating closely (e.g., viewing code) with your partner, if you choose to have one.

Part c: another plot

Come up with another interesting plot that uses:

  • at least one of precipitation or snow data
  • the date

For full credit, you must do something interesting with the data, not just make a line plot for the entire year like in Part b.

For example, you could calculate the total snowfall for each month and make a bar graph with one bar per month. Or, you could make separate precipitation lists for each month, and make a line graph with all twelve months on the same plot as different lines.
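The monthly-total idea could be sketched roughly as follows, covering only the data preparation and not the plotting. The helper monthOf is a hypothetical name, the January values come from the excerpt of weatherData.csv shown earlier, and the two February values are made up purely for illustration.

```python
def monthOf(date):
    """
    Hypothetical helper: returns the month number (int) from a
    date string in the form M/D/YYYY.
    """
    return int(date.split("/")[0])

# The January entries come from the file excerpt; the two
# February entries are made-up values for illustration only
dates = ["1/1/2023", "1/2/2023", "1/3/2023", "1/4/2023", "1/5/2023",
         "2/1/2023", "2/2/2023"]
snow = [0.0, 0.1, 6.0, 8.8, 0.2, 1.0, 2.5]

# One running total per month (index 0 is January)
monthlySnow = [0.0] * 12
for i in range(len(dates)):
    monthlySnow[monthOf(dates[i]) - 1] += snow[i]
```

A list like monthlySnow has exactly one value per month, which is the shape of data a bar graph with twelve bars would need.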

Put your new plotting code in another function, and make sure to modify main to call it. Your function should have a docstring, and your plot should have a title and a legend.

def main():
    filename = "weatherData.csv"

    dates, precip, snow, mins, maxes = parseData(filename)

    plotTempData(mins, maxes)

    # TODO: Part 2c
    pass # replace with a call to another plotting function

The same rules apply for Part c as for Part b: you are encouraged to explore online for help using Matplotlib. This can include the Matplotlib documentation and Stack Overflow, but should not include tools to generate code for you, like ChatGPT or Bard. Just make sure to cite any website (with the full URL) you visit in your readme.txt.

Reflection

# You should be equipped to complete this part after finishing your assignment.

Were there any particular issues or challenges you dealt with in completing this assignment? How long did you spend on this assignment? Write a brief discussion (a sentence or two is fine) in your readme.txt file.

Grading

This assignment will be graded out of 100 points, as follows:

  • 5 points - submit a valid a6.zip file with all files correctly named

  • 5 points - all code files contain top-level comments with file name, purpose, and author names

  • 5 points - each code file’s top-level comments contain a collaboration statement

  • 10 points - code style enables readable programs

  • 16 points - parseCircle, parseSquare, parseRectangle, and parseTriangle functions correctly create and return shape objects (Part 1a)

  • 10 points - parseShapesFromFile function correctly parses a .csv file and returns a window and list of shape objects (Part 1b; should call functions from Part 1a)

  • 4 points - main function in shapeParser.py correctly draws the shapes returned from parseShapesFromFile (Part 1c)

  • 15 points - parseData function correctly parses a valid .csv weather data file and returns the lists of dates, precipitation, snow, and min/max temperatures (Part 2a)

  • 10 points - plotTempData function correctly plots the data returned from parseData (6 pts), and includes a title, legend, and x- and y-axis labels (4 pts) (Part 2b)

  • 15 points - a new graph is created by a function (2 pts), uses dates and at least one of precip/snow data (2 pts) in an interesting way (9 pts), and the plot has a title and legend (2 pts) (Part 2c)

  • 5 points - readme.txt file contains reflection and citations

What you should submit

You should submit a single .zip file on Moodle. It should contain the following files:

  • readme.txt (reflection, and any websites visited for Parts 2b and 2c)
  • shapeParser.py (Part 1)
  • weatherPlotter.py (Part 2)