Now that you have chosen data to work with and written a first draft of your feature list, let's put together a skeleton prototype of the code infrastructure your application will use. The goals of this phase are:
When this phase is complete, a user will be able to go to your web app with a browser, enter a small amount of data, and get an answer back that includes data extracted from the database. For example, suppose my web application concerned campaign finance data. Then in this phase of the project, I might provide a button to click to generate the complete list of election candidates in my database, alphabetized by name. The point is to put together the simplest possible application that includes access to the database. Adding complexity to get a finished app will be phase 3.
When you're working on a networked application, it's extremely handy to be comfortable with three basic tools:
An SSH client. SSH ("secure shell") gives you command-line access to remote machines on which you have login access. When I want to log in to one of my web host accounts, I open the Mac Terminal program, type "ssh myaccountname@name.of.server.com", and enter my password when asked. If the remote computer is running some variant of Unix (including Linux, FreeBSD, Solaris, Mac OS X, etc.), I can do all the usual Unix command-line things. If the remote computer is running Windows, then I'll need to know suitable Windows command-line commands, but the principle is the same.
On Windows, you can run SSH using PuTTY, or by making sure to install OpenSSH when installing cygwin. On Mac, it's easiest to run SSH from within Terminal.
An SFTP client. Secure File Transfer Protocol helps you move files from computer to computer. As with SSH, you can use something like "sftp myaccountname@name.of.server.edu" from within Mac's Terminal or cygwin. Many people prefer to use a GUI file transfer system like Transmit for Mac or WinSCP for Windows or Cyberduck for either. SCP ("secure copy") is similar.
A terminal-based editor, so you can edit files directly on a computer to which you're connected via SSH. The titans in this area are vi (often just an alias to vim, i.e. "VI iMproved") and emacs. Many programmers will talk to you all day about their relative merits. Both vi and emacs have extensive documentation and tutorials on-line, and both require an investment of time to learn how to use them effectively. Pick one and start learning it; it's worth the effort. That said, most Unix systems also support an editor named nano, which is less powerful than emacs or vi, but also easier to learn for simple editing tasks.
If you take time now to learn to use SSH, SFTP, and either vi or emacs, it will make your work on networked applications and other remote tasks smoother. There's no time like the present!
Since the user, the browser, and the web server (Apache running on cs-research1.mathcs.carleton.edu, for our purposes) already exist, setting up the code infrastructure requires only that you place your rudimentary web application and data in a place where Apache can get at them.
Follow these steps to get started.
Log in to your account in the cs.carleton.edu network, either by sitting in front of one of the machines in CMC 304 or 306, or by connecting remotely via ssh to thacker.mathcs.carleton.edu.
Make a subdirectory of your web space to store Phase 2:
Download copies of tinywebapp.html and tinywebapp.py, and copy them to the /Accounts/courses/cs257/jondich/web/yourusername/phase2 directory.
Make sure your tinywebapp.py file is executable by typing the commands:
(That sequence has you run "ls -l" before and after the chmod command. Look at the permissions information for tinywebapp.py on the far left of the ls listing. Did anything change when you ran chmod?)
Test your setup by using a browser to go to http://thacker.mathcs.carleton.edu/cs257/yourusername/phase2/tinywebapp.html. Try the form and see if it works.
Change the names of tinywebapp.html and tinywebapp.py to webapp.html and webapp.py. (These are more generic than one would usually use, but if we all name our apps with the same name, it will be easier to look at one another's code.) Try the new app. Does it work? Why not? Fix it.
Now that the basic idea is working, we will want the front page (currently named webapp.html) to be generated by a Python script rather than just a flat HTML file. With that in mind, redesign your setup so that when I go to http://thacker.mathcs.carleton.edu/cs257/jondich/yourusername/phase2/webapp.py, I'll get the front page, and when I click on Submit or Go or whatever your button is labelled, I will get appropriate results.
Finally (for now), add whatever HTML input elements you need to allow people to request the simplest service your application is intended to provide. Maybe you need radio buttons or checkboxes or dropdown lists or whatever seems suitable. This tutorial has some good examples, and here is some tutorial information on HTML 5 forms.
See webapp.py for a couple of techniques that weren't in tinywebapp.py.
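To make the "one script serves both the front page and the results" idea concrete, here is a minimal sketch. It is not the course's webapp.py. It assumes the script runs as a CGI program under Apache, and the parameter name "candidate" and the function build_page are invented for illustration only.

```python
#!/usr/bin/env python3
"""Hypothetical sketch: one CGI script serving both the front page and the results."""
import html
import os
from urllib.parse import parse_qs


def build_page(query_string):
    """Return the HTML body for the given URL query string.

    With no 'candidate' parameter, the front-page form is returned.
    Otherwise, a results page echoing the sanitized input is returned.
    """
    parameters = parse_qs(query_string)
    # 'candidate' is a made-up parameter name used only for illustration.
    candidate = parameters.get('candidate', [''])[0].strip()

    if not candidate:
        return (
            '<form action="webapp.py" method="get">'
            '<input type="text" name="candidate">'
            '<input type="submit" value="Go">'
            '</form>'
        )
    # Escape user input before echoing it, so it cannot inject HTML.
    return '<p>You asked about: {}</p>'.format(html.escape(candidate))


def main():
    # Under CGI, the web server passes the part of the URL after '?'
    # in the QUERY_STRING environment variable.
    body = build_page(os.environ.get('QUERY_STRING', ''))
    print('Content-type: text/html')
    print()
    print('<!DOCTYPE html><html><body>{}</body></html>'.format(body))


if __name__ == '__main__':
    main()
```

Keeping the page-building logic in a function that takes the query string as an argument makes it possible to try it from the command line without a web server. A real results page would replace the echo with a query against your data.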
Your web application, whose entry point is webapp.py, will need to access your data to provide the desired services for people. Next week, we will get your PostgreSQL databases set up with your data. Regardless of the PostgreSQL details, in the meantime you're going to create a class to act as an interface between your main application code and the data. This is an important idea, so pay attention.
Suppose, for example, your data is federal election campaign finance data. Then you would create a file called datasource.py, containing a class more or less like this:
The idea is that your application needs to ask certain idiosyncratic questions of your data. This class's methods will be designed to provide answers to those questions, regardless of where and how the data are stored. You could, for example, write one version of DataSource that assumes the data are stored in a CSV file, and another version that assumes the data are stored in a PostgreSQL database. By isolating the main application from the details of the data source, you get all sorts of great benefits, which you can undoubtedly imagine (and which we will discuss in detail in class during the next couple days).
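As a rough illustration of that isolation, the sketch below shows two interchangeable data-source classes and a piece of "main application" code that works with either. Everything here is hypothetical: the class names, the method get_candidate_names, and the CSV layout (a header row with a "name" column) are assumptions. A plain in-memory list stands in for the PostgreSQL version, since no database is set up yet.

```python
import csv
import io


class CsvDataSource:
    """Hypothetical DataSource variant that reads candidates from CSV text."""

    def __init__(self, csv_file):
        # csv_file is any open text file (or file-like object) whose
        # header row contains a 'name' column; this layout is assumed.
        self._rows = list(csv.DictReader(csv_file))

    def get_candidate_names(self):
        """Return all candidate names as an alphabetized list of strings."""
        return sorted(row['name'] for row in self._rows)


class InMemoryDataSource:
    """Hypothetical variant with the same interface, backed by a plain list."""

    def __init__(self, names):
        self._names = list(names)

    def get_candidate_names(self):
        """Return all candidate names as an alphabetized list of strings."""
        return sorted(self._names)


def render_candidate_list(data_source):
    """Main-application code: it depends only on get_candidate_names()."""
    return '\n'.join(data_source.get_candidate_names())


# Invented sample data, used only to exercise both variants.
sample_csv = io.StringIO('name,party\nSmith,X\nAdams,Y\n')
print(render_candidate_list(CsvDataSource(sample_csv)))
print(render_candidate_list(InMemoryDataSource(['Smith', 'Adams'])))
```

Because render_candidate_list never looks at how the names are stored, swapping in a database-backed class later should require no change to it.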
Please think carefully about what methods this class should have, and what their signatures should be (i.e. names, parameter lists, and return values). Document these methods in detail in a comment below the "def" line before trying to implement them. Your documentation should succinctly explain the meaning and type of each parameter, the operation of the function, and the meaning and type of the return value. If the method might raise an exception, it's good to document the conditions under which that might happen as well.
As you can see in my example, I have prepared my DataSource class's methods as stubs. That is, they return empty lists or zero or whatever type of "nothing" is appropriate for the method in question. Alternatively, I could return dummy data by just hard-coding it in each stub's return statement.
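For reference, documented stubs of this kind might look roughly like the following sketch. The method names, parameters, and dummy values are invented for illustration and are not taken from the course's example. Your own class should reflect the questions your application needs to ask.

```python
class DataSource:
    """Hypothetical interface between the web app and campaign finance data."""

    def get_candidates(self):
        """Return a list of all candidates in the data set.

        Returns: a list of strings (candidate names), alphabetized.
        An empty list means no candidates are available.
        """
        return []  # stub: no real data access yet

    def get_total_contributions(self, candidate_name):
        """Return the total contributions to one candidate, in whole dollars.

        candidate_name: a string, the candidate's full name.
        Returns: an int; 0 if the candidate has no recorded contributions.
        Raises: TypeError if candidate_name is not a string.
        """
        if not isinstance(candidate_name, str):
            raise TypeError('candidate_name must be a string')
        return 0  # stub: the appropriate "nothing" for a dollar total


class DummyDataSource(DataSource):
    """Same interface, but returning hard-coded dummy data."""

    def get_candidates(self):
        """Return a fixed, invented list of names for testing the front end."""
        return ['Adams, Pat', 'Smith, Lee']
```

Either version lets the rest of the application be written and tested before any real data is connected.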
You will hand in your code by making sure http://thacker.mathcs.carleton.edu/cs257/yourusername/webapp.py presents a page with:
webapp.py contains one approach to linking to source files.
If I were working on this project, I would create a new git repository on Bitbucket, and put webapp.py and its supporting files at the root level of the repository. Then, I would log in to thacker, cd to /Accounts/courses/cs257/jondich/web/yourusername, and run "git clone [repository address] phase2". That would make the phase2 directory a clone of my webapp repository, which would then give me all the benefits of version control.
If you do this, be careful about what files you put into the repository. You don't want private information to be visible on the web.