Getting started with C

Starter code: starting-c-package.tar
Upload solutions via Moodle as: starting-c.tar

Goals

Learn some of the fundamentals of C programming
Start thinking of your data in terms of bytes
Get used to using some simple program testing automation

Rubric

1 - author name(s) in comment at top of each C source file
3 - "depunctuate" program correctness
6 - "sorter" program correctness
2 - code quality

How I learn new languages

Once you have learned a couple of programming languages, getting started in a new language is mostly a matter of finding good reference materials, getting a source of sample programs, and writing a bunch of small programs to get the syntax and core libraries under control.

When I want to learn a new programming language, I start by writing a few small programs to make sure I can handle the basics. Sometimes, I'll just do the first 10 problems at Project Euler. But usually, I prefer to ramp up faster by writing programs specifically aimed at teaching myself key elements of the new language. For example:

Output. Write a "hello, world" program to print a simple text message to standard output.
Input. Write a program that asks the user for their name, age in years, and some non-integer number (e.g. their hourly wage), and prints the information back.
A function. Write a recursive factorial function as an example of a function that takes an integer parameter and returns an integer.
Arithmetic and conditionals. Write a change-making program. Here's a description of such a program as assigned to a CS 111 class many years ago.
Input/output parameters. Write a function that takes two parameters and swaps them. Whether this is possible depends on whether your programming language supports pass-by-reference or pointers.
File input, loops, and command-line arguments. Read the contents of one text file and write the same contents, in all uppercase, to a second text file. Both the input and the output file names should be specified as command-line arguments.
Lists/arrays. Given a text file consisting of one word on each line, read in the list of words, sort them into alphabetical order, and print the sorted list. Let the user specify the text file as a command-line argument.
String manipulation and searching. Count the number of times each word in a text file appears. Print the results sorted in decreasing order by word count. Let the user specify the text file as a command-line argument.
Dictionaries/hash tables. Do the word-counting exercise mentioned in the string manipulation item above, but keep track of the counts using a hash table (also known as a dictionary in some languages, including Python). This should run much faster than the one using lists.
Classes. If your language has some form of object-orientation (which C does not!), create a class called Circle with instance variables to keep track of the center and radius of a circle. The class should have a suitable constructor (or constructors), plus methods getArea, getCircumference, and a collection of appropriate accessors. Your program should read a list of circles from a text file whose lines consist of three numbers separated by spaces (e.g. "3.2 4 2.7" represents the circle of radius 2.7 centered at (3.2, 4)), instantiating Circle objects for each one. Once you have a list of Circle objects, run through the list reporting the center, radius, area, and circumference of each circle. Let the user specify the file of circle data as a command-line argument.
Pointers, references, and memory allocation. Read a list of integers from a file into a linked list of your own construction. Sort the linked list (insertion sort and merge sort both work well with linked lists) and print the sorted list. Let the user specify the file as a command-line argument.

After getting these basics under control, I start exploring the language's standard libraries. I will always want to know more about string manipulation, the file system (create/delete/move files, traverse a directory tree, etc.), simple GUIs and line graphics, invoking other programs, networking, etc. But this is a longer-term project. If I have a good personal project to work on, most of these libraries come up naturally.

Your assignment

Write C versions of the files/loops/command-line and lists/arrays programs described above.

More specifically:

depunctuate.c. This program's command-line syntax will be:

./depunctuate inputfile outputfile

The program will copy the contents of inputfile to outputfile unchanged except that only digits (ASCII 48-57), tabs and spaces (9, 32), newlines and carriage returns (10, 13), and letters (65-90 uppercase, 97-122 lowercase) will be written to the output file.
sorter.c. This program's command-line syntax will be:

./sorter textfile

The program will read the lines of the text file into an array, sort them lexicographically (i.e. using the return value of strcmp or strncmp as the comparison function), and print the sorted list to standard output. You may assume that:
- the input file is an ASCII file; that is, every byte in the file has a value between 0 and 127
- lines are delimited by the newline char '\n' (ASCII 10), except possibly for the last line in the file, which may or may not end in '\n'
- no line contains more than 200 bytes, including the '\n'
- there are no more than 500 lines in the input file
- it's OK to use an O(N^2) sorting algorithm

Getting the starter package

For many assignments this term, you'll receive some starter code, some testing tools, or miscellaneous other materials to help you get started. These will generally be delivered to you via downloadable tar files. As noted in this handy tutorial from Indiana University, you can extract the files and folders contained in a tar file by using the command:

tar xvf whatever.tar

To get started on this first assignment:

Log in to mantis.mathcs.carleton.edu using VS Code and open your cs208 folder
In your VS Code terminal, run:

wget https://cs.carleton.edu/faculty/jondich/courses/cs208_s23/assignments/packages/starting-c-package.tar
Still in your VS Code terminal, extract the starting-c-package folder:

tar xvf starting-c-package.tar

This will create a folder named "starting-c-package" with some stuff in it.
Read the readme.txt file and get started.

Automated testing

In the starting-c-package.tar file linked at the top of this page, you will find:

Makefile: a file that you'll use to perform some very simple automated tests for your depunctuate.c and sorter.c programs
readme.txt: an explanation of how to run the tests
some test data files

Note that for most assignments, I will only give you very simple tests as part of an assignment's starter package. The grader and I will certainly add some more sophisticated tests to explore the boundaries of a given assignment. You are, of course, free to use the testing infrastructure from the starter package to add your own tests. Getting used to automated testing and to writing detailed tests of your own will serve you well in the long run.

Submitting your work

put your source files (depunctuate.c and sorter.c) in a folder named starting-c/
cd to the parent directory of starting-c/
create a tar file:

tar cvf starting-c.tar starting-c
download the tar file to your local machine (in Visual Studio Code while connected to mantis.mathcs.carleton.edu, you can right-click on starting-c.tar and select Download)
use the Moodle web interface to submit your tar file

A little advice

To help simplify the code for depunctuate.c, try looking at the manual pages for isspace, isalpha, and isdigit. (For example, in a terminal, type man isalpha.)
Similarly, take a look at character escape sequences in C. You might not need these for the current assignment, but if you do, you would probably care especially about: '\n' (newline, ASCII 10), '\r' (carriage return, ASCII 13), and '\t' (tab, ASCII 9).

Using these library functions and escape sequences will make your code much easier to read and understand. This code, for example:

    char ch;
    ...[ch gets a value]...
    if (isdigit(ch)) {
        [blah blah];
    } else if (ch == '\n') {
        [blah blah];
    }
    ...

is much better than this code:

    char ch;
    ...[ch gets a value]...
    if (ch >= 48 && ch <= 57) {
        [blah blah];
    } else if (ch == 10) {
        [blah blah];
    }
    ...

Add one test case to each program consisting of an empty input file. You always want your programs to be able to deal with this very common boundary case.

Was it all over too soon?

Want an extra C challenge to fill your quiet hours? Try writing the word-counting program for fun. I will not grade this program, nor will you get any course credit for it. But the practice won't hurt.

Here's a more detailed description of this program.

wordcounter. This program's command-line syntax will be:

./wordcounter textfile

The program will read the words from the input file, count the number of times each word occurs in the file, and print to standard output a list of words and their counts, in the following format:

the,37
and,12
for,6
in,4
of,4
were,4
...

That is, each line of output consists of a word, then a comma, then the base-10 count of the number of occurrences of that word, with no spaces. The lines of output are sorted in decreasing order of their counts, with ties broken by putting words in alphabetical order (see in/of/were above). You should assume that:

a "word" consists of any contiguous block of Latin letters (a-z, A-Z), delimited by non-Latin-letter characters (e.g. punctuation, spaces) or the beginning or end of the file. (Note that with this definition, contractions like "don't" will generate weird "words" like "don" and "t"—don't try to fix that!)
no word will be longer than 30 bytes (including the null-termination)
the input file will contain no more than 1000 distinct words