HW11 Part 2: A Histogram Class

This is the second part of HW11.

A histogram is, abstractly, a collection of bins that store counts of values encountered in some sample. For example, a three-bin histogram for test scores might have one bin for scores from 0 to 60, another for those greater than 60 but less than 80, and another for those from 80 to 100. If you add the score 75 to this histogram, the count in the middle bin increases by 1.

In this portion of the assignment you will write a full-featured Java class representing a histogram with equal-sized bins.

Creating your class

  1. Create a new file Histogram.java and define the class Histogram inside it. Your histogram will store counts of doubles. Create empty methods with the following signatures:

    public Histogram(double min, double max, int num_bins) {
        // ...
    }
    public void addValue(double val) {
        // ...
    }
    public void print() {
        // ...
    }
    

    As above, use the mystery List implementation to create your class; later on you'll replace it with the implementation that you write. Your class should need only one data member, a List representing the bins. Remember that the bins store counts, not values, so think about what this implies for the type that your List should store.

  2. Your bins should be equally sized and together cover the range from min to max. Treat the overall range as exclusive on its low end and inclusive on its high end, and do the same for the range covered by each bin. So, for example, the call

    Histogram hist = new Histogram(0.0, 1.0, 2);
    

    should create a histogram with two bins, covering the ranges $0.0 < x \le 0.5$ and $0.5 < x \le 1.0$, respectively. Note that the value $0$ does not fall in either bin.

  3. When your user calls addValue(), you should figure out which bin the value belongs in, and increment the count in that bin. Counts are all initially zero.

    What happens when you add a value outside of the range covered by the histogram? It should increment the count in one of two special “out-of-range” bins, representing the ranges $-\infty < x \le$ min and max $< x < \infty$.

  4. Your print() method should print each bin on its own line (including each out-of-range bin, if it's non-empty), with the range of the bin indicated, followed by a vertical separator, followed by a number of asterisks equal to the count in that bin (see the example in the next section).
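The bin rules in steps 2 and 3 can be sketched as a small index computation. This is only one possible approach, not a required design: the class name BinIndexSketch, the method binIndex, and the convention of returning -1 and numBins for the two out-of-range bins are all assumptions made for illustration.

```java
// Hypothetical helper; none of these names come from the assignment.
final class BinIndexSketch {
    /**
     * Maps a value to a bin index for numBins equal-width bins over (min, max].
     * Returns -1 for the low out-of-range bin (val <= min), numBins for the
     * high out-of-range bin (val > max), and otherwise the index i of the bin
     * covering (min + i*width, min + (i+1)*width].
     */
    static int binIndex(double min, double max, int numBins, double val) {
        if (val <= min) {
            return -1;          // low out-of-range bin
        }
        if (val > max) {
            return numBins;     // high out-of-range bin
        }
        double width = (max - min) / numBins;
        // Ceiling minus one makes each bin inclusive on its high end.
        int i = (int) Math.ceil((val - min) / width) - 1;
        // Clamp in case floating-point rounding pushes the index out of range.
        return Math.max(0, Math.min(numBins - 1, i));
    }
}
```

With min 0.0, max 1.0, and two bins, the value 0.5 maps to index 0 and 0.75 maps to index 1. When the bin width is not exactly representable as a double, a value that lies on a bin boundary may land in a neighboring bin because of rounding.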

Testing your histogram

Write a main method that takes four command-line arguments: the min value, max value, and number of bins for a histogram, and a filename. The given file is expected to be a plain-text file with a single floating-point value on each line. Your main method should construct a histogram according to the command-line arguments, add all the values in the file to it, and then print it out.
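A minimal sketch of the argument parsing and file reading described above follows. The class name ReadValuesSketch and the helper forEachValue are hypothetical, and the choice to skip blank lines is an assumption; the sketch only prints each value, where your main method would pass it to addValue() instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.DoubleConsumer;

// Hypothetical helper class; the names are illustrative only.
final class ReadValuesSketch {
    /** Parses one double per line and hands each to sink; blank lines are skipped. */
    static void forEachValue(Path file, DoubleConsumer sink) throws IOException {
        for (String line : Files.readAllLines(file)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                sink.accept(Double.parseDouble(trimmed));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: java ReadValuesSketch <min> <max> <numBins> <file>");
            return;
        }
        double min = Double.parseDouble(args[0]);
        double max = Double.parseDouble(args[1]);
        int numBins = Integer.parseInt(args[2]);
        System.out.println("histogram: " + min + " to " + max + ", " + numBins + " bins");
        // In the real program the consumer would be something like hist::addValue.
        forEachValue(Paths.get(args[3]), v -> System.out.println("value: " + v));
    }
}
```

Taking a DoubleConsumer keeps the reading code independent of the Histogram class, so the same helper works whether the values are printed, counted, or added to a histogram.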

For example, if the file test_scores.txt consists of:

25.0
36.0
12.0
37.0
0.0
4.0
22.0
50.0

then your program should behave something like this:

> java Histogram 0.0 50.0 10 test_scores.txt
<= 0  | *
0-5   | *
5-10  |
10-15 | *
15-20 |
20-25 | **
25-30 |
30-35 |
35-40 | **
40-45 |
45-50 | *

Note that the high out-of-range bin was not printed, since it was empty, but all other empty bins were printed.

To get the output to line up nicely, you may find it useful to use the Formatter and StringBuilder classes. To keep the output clean, don't print more than two decimal places in the bin bounds.
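One way to build such lines is sketched below, assuming a fixed label width chosen by the caller. The names BinLineSketch, binLabel, and binLine are hypothetical. The sketch uses DecimalFormat with the pattern "0.##" to cap the bounds at two decimal places, and String.format (which uses a Formatter internally) to pad the label.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical formatting helpers; the names are illustrative only.
final class BinLineSketch {
    // "0.##" prints at most two decimal places and drops trailing zeros.
    private static final DecimalFormat BOUND =
            new DecimalFormat("0.##", DecimalFormatSymbols.getInstance(Locale.ROOT));

    /** Builds a range label such as "20-25" from the bounds of a bin. */
    static String binLabel(double low, double high) {
        return BOUND.format(low) + "-" + BOUND.format(high);
    }

    /** Left-justifies the label in labelWidth columns, then adds the separator and stars. */
    static String binLine(String label, int labelWidth, int count) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%-" + labelWidth + "s |", label));
        if (count > 0) {
            sb.append(' ');
            for (int i = 0; i < count; i++) {
                sb.append('*');
            }
        }
        return sb.toString();
    }
}
```

For instance, binLine(binLabel(20.0, 25.0), 5, 2) produces the line "20-25 | **", matching the example output above. In a full program the label width would more likely be computed from the longest label than hard-coded.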