CS 111: Introduction to Computer Science
Winter 2017
HW07: Weather statistics
Due: Wednesday, 02/01 at 22:00
Introduction
When you write a function, you are giving a name to a collection of operations. We do this in non-programming life all the time. "Make breakfast" is a name for a complex sequence of operations, most of them conditional ("if the milk hasn't gone bad, get the bowl out for cereal, otherwise if there is bread, get out the toaster"). In a computer program, writing a function enables you to name a complex operation and reuse it anywhere else in your program—even in another function. By naming operations in this way, we can decompose very complex problems into simple parts, which enables us to think more effectively about the problems.
In this assignment, you will write some functions and then use those functions to solve a problem involving weather data.
The program
Your program, called weatherstats.py, will read data from a
file of temperature recordings and produce a report summarizing some of the
interesting features.
Here is a file of daily high temperatures
recorded at the Minneapolis-St. Paul airport between 1939 and 2014.
Your program should accept the name of the data file as a
command-line argument, and then produce output looking something
like this:
    $ python3 weatherstats.py msp-temperatures.txt
    High temperature data (Fahrenheit)
    Year    Mean    Std Dev
    1939    52.74   24.47
    1940    53.89   23.03
    ...
    2014    53.28   25.95
    Highest mean temperature: 53.22 (1954)
    Lowest mean temperature: 55.87 (1989)
    Highest standard deviation: 24.34 (1972)
    Lowest standard deviation: 21.03 (2003)
    Highest temperature: 107 (1963)
    Lowest temperature: -43 (1945)
Note: My numbers in the example above are fictional. I made them up to demonstrate the kinds of display I want to see. Your results will be different from these.
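As a rough sketch of the two mechanical pieces here (picking up the file name from the command line and lining up the columns), something like the following would work. The helper names getFileName and formatRow are hypothetical, and the column widths are my guess at a reasonable layout, not a required format.

```python
import sys

def getFileName(argv):
    '''Returns the data file name from an argument list shaped like
    sys.argv, or None if no file name was given.
    (Hypothetical helper, not part of the required functions.)'''
    if len(argv) < 2:
        return None
    return argv[1]

def formatRow(year, meanValue, stdDev):
    '''Formats one line of the per-year table. The column widths
    are an assumption; any similar layout is fine.'''
    return '{:<6d}{:>8.2f}{:>10.2f}'.format(year, meanValue, stdDev)

if __name__ == '__main__':
    fileName = getFileName(sys.argv)
    if fileName is None:
        print('Usage: python3 weatherstats.py datafile', file=sys.stderr)
    else:
        print('Reading from', fileName)
        # Fictional numbers, taken from the sample output above.
        print(formatRow(1939, 52.74, 24.47))
```

Passing the argument list into getFileName as a parameter, rather than reading sys.argv inside it, keeps the function easy to test with a made-up list.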
Though you are free to implement much of your program however you wish, I want you to start by writing the following three functions:
    def loadWeatherData(fileName):
        '''Loads temperature data from the specified file into a list
        of lists of integers, and returns the list. This function
        assumes that each line of the file consists of a year followed
        by some number of temperatures, all separated by commas.
        For example, if the file looks like this:

            1939,-10,5,3,14,...
            1940,2,8,-1,4,...
            ...
            2014,7,12,3,5,-2,...

        (which means that on Jan 1, 1939, the high temperature
        was -10, etc.), then loadWeatherData will return a list of
        lists like this:

            [[1939,-10,5,3,14,...], [1940,2,...], ..., [2014,7,12,...]]
        '''
        pass

    def mean(listOfNumbers):
        '''Returns the mean of the specified list of numbers if the
        list contains at least one element. Otherwise, returns 0.
        This function assumes that its parameter refers to a
        (possibly empty) list of numbers.'''
        pass

    def standardDeviation(listOfNumbers):
        '''Returns the standard deviation of the specified list of
        numbers if the list contains 2 or more elements. Otherwise,
        returns 0. This function assumes that its parameter refers
        to a (possibly empty) list of numbers.'''
        pass
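To make the target data shape concrete, here is a small sketch of how a single line in the format described in the loadWeatherData docstring could become a list of integers. The helper name parseLine is hypothetical, and the sample line is made up (the real lines are much longer); the sketch also assumes every field is a valid integer.

```python
def parseLine(line):
    '''Converts one comma-separated line into a list of integers.
    (Hypothetical helper; assumes every field is a valid integer.)'''
    return [int(field) for field in line.strip().split(',')]

# A made-up line in the same format as the data file.
print(parseLine('1939,-10,5,3,14\n'))   # [1939, -10, 5, 3, 14]
```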
You may download a skeleton file as a starting point.
Note that the Python instruction pass is just a place-holder
for the function body that does nothing at all. You will replace the
pass in each case with the actual code required by these three
functions.
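A quick illustration of what the place-holder does: a function whose body is only pass can be called without error, but it returns None, so any results you see before replacing it will not be numbers.

```python
def mean(listOfNumbers):
    '''Stub version: the real body has not been written yet.'''
    pass

# The call succeeds, but there is no return statement, so the result is None.
print(mean([4, 4, 4, 4]))   # None
```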
The idea here is that if you write these three fairly simple functions, your main program will be able to exploit these functions (in some cases repeatedly) to perform the overall task of the program. Of course, you are free to define additional functions if you find it useful to do so. But you should definitely implement the three functions described above, and use them in your main program where appropriate.
The raw data for this exercise came from the United States Historical Climatology Network.
Some suggestions
- Before you start, develop an incremental development plan, where you approach the project with small, testable steps. For example, Stage 1 might be to implement the mean function and test it like so:

      print(mean([23]))
      print(mean([4, 4, 4, 4]))
      print(mean([5, -2, 3, 4]))
      print(mean([]))

  Stage 2 could be to implement standardDeviation, and test it using example data from the Wikipedia page, etc.
- Don't know the formula for standard deviation? See the Wikipedia article, notably the examples in the "Basic examples" section and the definition in the "Discrete random variable" section.
- Don't forget that functions can call functions. For example, standardDeviation will probably want to call mean.
- If you have a list where the first element is special (e.g., myList = [1986, -4, 9, 6,...]), you can get a list containing everything other than the first element using slicing, like so: myList[1:].
- Command-line arguments are accessible via sys.argv. See the skeleton file for more details.
- You already know how to iterate through lines of a file via a for loop. This is sufficient to do this assignment. However, you may wish to peruse Zelle §5.9 titled "File Processing."
- Write some data files with different test cases (using the same format, of course) to make sure your program behaves correctly. We will be running your program on other test files.
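Two of the suggestions above can be tried out in a few lines. The sketch below first uses slicing to separate a year from its temperatures (the row is made up), and then uses Python's standard statistics module to get reference values you could compare your own standardDeviation against. The data set is the eight-value example from the Wikipedia article. The assignment text does not say whether the population or the sample standard deviation is wanted, so both are shown; this is only a way to check your answers, not a substitute for writing the functions yourself.

```python
import statistics

# Slicing: separate the special first element from the rest.
row = [1986, -4, 9, 6]      # made-up row: a year followed by temperatures
year = row[0]
temperatures = row[1:]      # everything except the first element
print(year, temperatures)   # 1986 [-4, 9, 6]

# Reference values for testing a hand-written standardDeviation.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.pstdev(sample))   # population version (divide by n): 2.0
print(statistics.stdev(sample))    # sample version (divide by n - 1): about 2.138
```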
Optional challenge
The code students typically turn in tends to have a lot of duplication. Try to get rid of as much code duplication as you can.
Grading
Please name your file weatherstats.py.
- Style/comments [4 points]
  - All the style guidelines from previous assignments, as always.
  - In particular, do not use global variables!
  - Close file as soon as you are done reading.
- Command-line processing [2 points]
- Required functions [9 points]
  - Absolute conformance to the function specifications above
  - Takes the arguments as specified
  - Returns the correct result
- Output display [8 points]
  - Presents all the data in a similar way to the sample output above
  - Need not be formatted exactly the same
This is a somewhat long assignment. Start early, have fun, and discuss questions on Moodle.