CS 111: Introduction to Computer Science

Winter 2017

HW07: Weather statistics

Due: Wednesday, 02/01 at 22:00

This assignment contains some extremely common tasks (e.g., calculating mean and standard deviation). As such, code implementing these functions abounds, both on the web and in the textbook. I ask you to put in a good-faith effort to write the code yourself without looking up code samples.

Introduction

When you write a function, you are giving a name to a collection of operations. We do this in non-programming life all the time. "Make breakfast" is a name for a complex sequence of operations, most of them conditional ("if the milk hasn't gone bad, get the bowl out for cereal, otherwise if there is bread, get out the toaster"). In a computer program, writing a function enables you to name a complex operation and reuse it anywhere else in your program—even in another function. By naming operations in this way, we can decompose very complex problems into simple parts, which enables us to think more effectively about the problems.
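As a small illustration of naming operations and reusing them, here is a sketch built on the breakfast example. The function names and the seven-day freshness rule are made up for this illustration and have nothing to do with the assignment itself.

```python
def milkIsFresh(daysSinceOpened):
    '''Returns True if the milk was opened fewer than 7 days ago.
       (The 7-day rule is an invented assumption.)'''
    return daysSinceOpened < 7

def chooseBreakfast(daysSinceOpened, haveBread):
    '''Returns the name of the breakfast to make, reusing milkIsFresh.'''
    if milkIsFresh(daysSinceOpened):
        return 'cereal'
    elif haveBread:
        return 'toast'
    else:
        return 'nothing'

def planMorning(daysSinceOpened, haveBread):
    '''Reuses chooseBreakfast inside another function.'''
    return 'Wake up, make ' + chooseBreakfast(daysSinceOpened, haveBread) + ', go to class'

print(planMorning(2, True))
```

Each function is short enough to understand on its own, and planMorning does not need to know how the breakfast decision gets made.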

In this assignment, you will write some functions and then use those functions to solve a problem involving weather data.

The program

Your program, called weatherstats.py, will read data from a file of temperature recordings and produce a report summarizing some of the interesting features. Here is a file of daily high temperatures recorded at the Minneapolis-St. Paul airport between 1939 and 2014. Your program should accept the name of the data file as a command-line argument, and then produce output looking something like this:

$ python3 weatherstats.py msp-temperatures.txt

High temperature data (Fahrenheit)

Year      Mean     Std Dev
1939     52.74       24.47
1940     53.89       23.03
...
2014     53.28       25.95

Highest mean temperature: 55.87 (1989)
Lowest mean temperature: 53.22 (1954)
Highest standard deviation: 24.34 (1972)
Lowest standard deviation: 21.03 (2003)
Highest temperature: 107 (1963)
Lowest temperature: -43 (1945)

Note: My numbers in the example above are fictional. I made them up to demonstrate the kinds of display I want to see. Your results will be different from these.
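Reading the command-line argument and lining up the columns might be sketched as follows. This is only one possible approach: the helper names getFileName and formatRow are hypothetical, and the column widths are my guess based on the sample output above.

```python
import sys

def getFileName(argv):
    '''Returns the first command-line argument, or None if the
       user did not supply one.  argv[0] is the program name.'''
    if len(argv) < 2:
        return None
    return argv[1]

def formatRow(year, meanValue, stdDev):
    '''Returns one line of the report table, with the numbers
       rounded to two decimal places and right-aligned.'''
    return '{:<4}{:>10.2f}{:>12.2f}'.format(year, meanValue, stdDev)

if __name__ == '__main__':
    fileName = getFileName(sys.argv)
    if fileName is None:
        print('Usage: python3 weatherstats.py <data file>')
    else:
        print('Data file:', fileName)
    # The values below are the fictional ones from the sample output.
    print(formatRow(1939, 52.74, 24.47))
```

Keeping the argument check in its own function means the rest of the program never has to look at sys.argv directly.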

Though you are free to implement much of your program however you wish, I want you to start by writing the following three functions:

def loadWeatherData(fileName):
    '''Loads temperature data from the specified file into a
       list of lists of integers, and returns the list.  This
       function assumes that each line of the file consists of
       a year followed by some number of temperatures, all
       separated by commas.  For example, if the file looks
       like this: 

           1939,-10,5,3,14,...
           1940,2,8,-1,4,...
           ...
           2014,7,12,3,5,-2,...

       (which means that on Jan 1, 1939, the high temperature
       was -10, etc.), then loadWeatherData will return a list
       of lists like this:

           [[1939,-10,5,3,14,...], [1940,2,...], ..., [2014,7,12,...]]
    '''
    pass

def mean(listOfNumbers):
    '''Returns the mean of the specified list of numbers
       if the list contains at least one element.  Otherwise,
       returns 0.  This function assumes that its parameter refers
       to a (possibly empty) list of numbers.'''
    pass

def standardDeviation(listOfNumbers):
    '''Returns the standard deviation of the specified list of
       numbers if the list contains 2 or more elements.  Otherwise,
       returns 0.  This function assumes that its parameter refers
       to a (possibly empty) list of numbers.'''
    pass

You may download a skeleton file as a starting point.

Note that the Python statement pass does nothing at all; here it is just a placeholder for the function body. You will replace the pass in each function with the actual code that function requires.
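To see what pass does, consider this tiny sketch. The function notWrittenYet is a made-up stub, not part of the assignment.

```python
def notWrittenYet(listOfNumbers):
    '''A stub whose body is only pass, so calling it does nothing.'''
    pass

result = notWrittenYet([1, 2, 3])
print(result)  # prints None, because the function never returns a value
```

A stub like this lets the rest of the file run without syntax errors while you fill in the functions one at a time.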

The idea here is that if you write these three fairly simple functions, your main program will be able to exploit these functions (in some cases repeatedly) to perform the overall task of the program. Of course, you are free to define additional functions if you find it useful to do so. But you should definitely implement the three functions described above, and use them in your main program where appropriate.

The raw data for this exercise came from the United States Historical Climatology Network.

Some suggestions

Optional challenge

The typical code students turn in tends to have a lot of code duplication. Try to get rid of as much code duplication as you can.
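As a generic illustration of removing duplication (deliberately unrelated to the weather data), two nearly identical loops that find the longest and the shortest word in a list can be folded into one function with an extra parameter. The name extremeWord and its wantLongest parameter are invented for this sketch.

```python
def extremeWord(words, wantLongest):
    '''Returns the longest word if wantLongest is True, otherwise
       the shortest.  Assumes words contains at least one string.'''
    best = words[0]
    for word in words[1:]:
        if wantLongest and len(word) > len(best):
            best = word
        elif not wantLongest and len(word) < len(best):
            best = word
    return best

words = ['snow', 'blizzard', 'ice', 'thaw']
print(extremeWord(words, True))   # prints blizzard
print(extremeWord(words, False))  # prints ice
```

The loop is written once, and the parameter captures the only thing that differed between the two original copies.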

Grading

Please name your file weatherstats.py.

This is a somewhat long assignment. Start early, have fun, and discuss questions on Moodle.