CS 257 Assignment

You have now selected your data and identified and prioritized a collection of features in the form of user stories. For this next phase of the project, you'll think about what services these features will need from your database.

Your API will be an HTTP-based API like the ones you worked with last week. The goal of your API will be to provide your web application code (i.e. the user interface of your envisioned application) with convenient access to your data. For my books and authors web application, for example, I need API access to the list of all authors, the list of all books, author(s) given a specific book, book(s) given a specific author, etc.

For this phase of the web application, you will create structured documentation for all the services your API will provide. This documentation will focus on endpoints—structured URLs that act as a request for specific JSON responses. For example, as you will see below, I will use an endpoint "http://host:port/author/<author_id>" (typically written as just "/author/<author_id>") to request all the available information for the author with ID author_id.

What to hand in for Phase 3

Create a text file webapp/phase3.txt. In this file, include:

All partners' names
A list of your API endpoints. As illustrated below, each endpoint should include the Endpoint format, a brief Summary, a description of the Response Format, and an example query and response.
For each user story in phase2.txt, a brief explanation of which API endpoints will be used to satisfy the data needs of the user story.

The rest of this document describes the design and documentation of a couple of endpoints for the books-and-authors project. This should give you a template for thinking about and documenting your API's design.

Books and authors: user stories

Consider two simple user stories for my books-and-authors project.

1. Alice can click the "Authors" tab to see a list of the names of all the authors in the database, alphabetized by last name. Each author is formatted as a link.
2. Bob can click on an author's name and see the author's name, birth and death years, and a list of the author's books, sorted in increasing order by publication date.

Books and authors: API, first draft

Suppose my API provided the following services.

(a) Return the complete list of author IDs.
(b) Given an author's ID, return the author's name, birth year, and death year.
(c) Given an author's ID, return the author's list of book IDs.
(d) Given a book's ID, return the book's title and publication date.

Then story #1 from above can be implemented by combining API services (a) and (b), while story #2 can be implemented by combining API services (b), (c), and (d).

Books and authors: API, second draft

As an API designer, though, the first draft of my API worries me on performance grounds. Running story #1 would require one query of type (a), and an additional query of type (b) for each author in the database. Similarly, story #2 would have to invoke service (b) once, (c) once, and (d) for every book written by the author. That's a lot of distinct API calls (with the attendant network overhead) for two very simple stories.

So let's give this API design another try, with attention on reducing the number of queries.

(A) Return the complete list of authors, where each author is represented by a JSON dictionary containing the author's ID, first name, last name, birth year, and death year.
(B) Given an author's ID, return the author as a JSON dictionary including the author's first name, last name, birth year, death year, and list of books. Each book, in turn, would be represented by a dictionary containing the book's title and publication date.

That's cleaner. Story #1's data needs can be completely served by one query of type (A), and story #2 can be handled by one query of type (B). There's potential redundancy here—if we've already done story #1 and then Bob clicks on a link from story #1, then we'll retrieve some of the data we already had a second time (the author's name). But balancing that redundancy against the dramatically reduced number of separate API calls, we might choose this second version of the API.
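
To make the comparison concrete, here is a rough sketch that counts the API calls each draft needs for the two stories. The function names are made up for illustration, and the counts simply restate the reasoning above; they are not measurements of any real system.

```python
# Hypothetical helpers: count API calls per story for each draft of the API.

def calls_first_draft_story1(author_count):
    """Story #1 with the first draft: one (a) call, then one (b) call per author."""
    return 1 + author_count


def calls_first_draft_story2(book_count):
    """Story #2 with the first draft: one (b) call, one (c) call, one (d) call per book."""
    return 2 + book_count


def calls_second_draft_story1():
    """Story #1 with the second draft: a single (A) call."""
    return 1


def calls_second_draft_story2():
    """Story #2 with the second draft: a single (B) call."""
    return 1


if __name__ == "__main__":
    # Assumed sizes, purely for illustration.
    print("first draft, story #1, 500 authors:", calls_first_draft_story1(500))
    print("first draft, story #2, 6 books:", calls_first_draft_story2(6))
    print("second draft, story #1:", calls_second_draft_story1())
    print("second draft, story #2:", calls_second_draft_story2())
```

With the first draft, the number of calls grows with the size of the data; with the second draft, it stays at one call per story.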

Books and authors: API documentation

Since we're now settled on the A, B version of the API rather than the a, b, c, d version, let's articulate our API more formally. That might look something like this:

Endpoint: /authors

Summary: Returns the complete list of authors contained in the database.

Response format: a JSON list of author dictionaries. Each author dictionary will have keys "id", "first_name", "last_name", "birth_year", and "death_year". A typical response to a query of this type will look like this.

[ {"id": 27, "first_name": "Jane", "last_name": "Austen", "birth_year": 1775, "death_year": 1817}, {"id": 15, "first_name": "Lois McMaster", "last_name": "Bujold", "birth_year": 1949, "death_year": null}, ... ]

Endpoint: /author/<author_id>

Summary: Returns all information in the database related to the author with the specified ID.

Response format: a JSON dictionary with keys "id", "first_name", "last_name", "birth_year", "death_year", and "books". The "books" value will be a list of book dictionaries. Each book dictionary will have keys "id", "title", and "publication_year".

A typical response to a query like /author/27 will look like this.

{"id": 27, "first_name": "Jane", "last_name": "Austen", "birth_year": 1775, "death_year": 1817, "books": [ {"id": 18, "title": "Pride and Prejudice", "publication_year": 1813}, {"id": 5, "title": "Emma", "publication_year": 1815} ] }
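
As a minimal sketch of how responses in these two formats might be assembled, the code below builds them from in-memory data using only Python's standard library. The AUTHORS and BOOKS structures and the function names get_authors and get_author are assumptions for illustration; a real implementation would query your database and sit behind a web framework.

```python
import json

# Hypothetical in-memory tables standing in for the database.
AUTHORS = {
    27: {"first_name": "Jane", "last_name": "Austen",
         "birth_year": 1775, "death_year": 1817},
    15: {"first_name": "Lois McMaster", "last_name": "Bujold",
         "birth_year": 1949, "death_year": None},
}
BOOKS = [
    {"id": 5, "author_id": 27, "title": "Emma", "publication_year": 1815},
    {"id": 18, "author_id": 27, "title": "Pride and Prejudice",
     "publication_year": 1813},
]


def get_authors():
    """Data for /authors: a list of author dictionaries, sorted by last name."""
    ordered = sorted(AUTHORS.items(), key=lambda item: item[1]["last_name"])
    return [{"id": author_id, **fields} for author_id, fields in ordered]


def get_author(author_id):
    """Data for /author/<author_id>, or None if the ID is unknown."""
    fields = AUTHORS.get(author_id)
    if fields is None:
        return None
    books = sorted(
        (book for book in BOOKS if book["author_id"] == author_id),
        key=lambda book: book["publication_year"],
    )
    response = {"id": author_id, **fields}
    response["books"] = [
        {"id": book["id"], "title": book["title"],
         "publication_year": book["publication_year"]}
        for book in books
    ]
    return response


if __name__ == "__main__":
    # json.dumps converts Python's None to JSON's null.
    print(json.dumps(get_authors()))
    print(json.dumps(get_author(27)))
```

Note that a missing death year is stored as None in Python and is written as null once the data is serialized to JSON.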

Books and authors: GET parameters

Suppose I want to be able to filter my requests for books and authors by miscellaneous attributes. For example, I might want to ask for books published during the 1920s. One API-design approach might look like this:

/books-by-date-range/1920/1929

So far, so good. Now suppose I also want to be able to filter by, say, genre:

/books-by-genre/mystery

But what if I want to filter by both date and genre? Do I do this?

/books-by-genre-and-date/mystery/1920/1929

And what if I add yet another constraint type, maybe publisher?

/books-by-genre-and-date-and-publisher/mystery/1920/1929/doubleday
/books-by-date-and-publisher/1920/1929/doubleday
/books-by-genre-and-publisher/mystery/doubleday
... UGH!

You start to get a combinatorial explosion of endpoints, and it's a mess that's hard to think about, hard to remember, hard to parse reliably, etc.
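
To see how quickly this grows, here is a small sketch that enumerates one path-style endpoint for every non-empty combination of filters. The naming scheme it generates ("/books-by-X-and-Y") is only an illustration of the pattern above, not a recommendation. With n optional filters you end up with 2^n - 1 endpoints.

```python
from itertools import combinations


def path_style_endpoints(filters):
    """Return one endpoint name per non-empty combination of filters."""
    endpoints = []
    for size in range(1, len(filters) + 1):
        for combo in combinations(filters, size):
            endpoints.append("/books-by-" + "-and-".join(combo))
    return endpoints


if __name__ == "__main__":
    endpoints = path_style_endpoints(["genre", "date", "publisher"])
    for endpoint in endpoints:
        print(endpoint)
    print(len(endpoints), "endpoints for 3 filters")
```

Three filters already need seven endpoints, and each additional filter roughly doubles the count.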

So is there a better way? When you're confronted with a collection of optional constraints like this, the best approach is usually to use GET parameters. For example:

/books?genre=mystery&start_year=1920&end_year=1929

Note that with GET parameters, you can put them in any order, and you can leave some or all of them out. The core endpoint syntax remains simple and clear ("/books") and the filters can be made clear through well-named parameters ("genre", "start_year", "end_year").
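
Here is a minimal sketch of how the server side might read these optional parameters, using Python's standard urllib.parse module. The BOOKS list and the filter_books function are hypothetical stand-ins for illustration; in a real web application your framework would typically hand you the parsed parameters, and the filtering would usually happen in a database query.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical sample data standing in for the database.
BOOKS = [
    {"title": "The Mysterious Affair at Styles", "genre": "mystery",
     "publication_year": 1920},
    {"title": "The Great Gatsby", "genre": "fiction",
     "publication_year": 1925},
    {"title": "Murder on the Orient Express", "genre": "mystery",
     "publication_year": 1934},
    {"title": "Pride and Prejudice", "genre": "romance",
     "publication_year": 1813},
]


def filter_books(url):
    """Apply whichever of genre, start_year, end_year appear in the URL."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each name to a list of values; take the first, if any.
    genre = query.get("genre", [None])[0]
    start_year = query.get("start_year", [None])[0]
    end_year = query.get("end_year", [None])[0]

    results = []
    for book in BOOKS:
        if genre is not None and book["genre"] != genre:
            continue
        if start_year is not None and book["publication_year"] < int(start_year):
            continue
        if end_year is not None and book["publication_year"] > int(end_year):
            continue
        results.append(book)
    return results


if __name__ == "__main__":
    for book in filter_books("/books?genre=mystery&start_year=1920&end_year=1929"):
        print(book["title"])
```

Because every filter is looked up by name, the order of the parameters does not matter, and a missing parameter simply means that filter is skipped.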

Some last design advice

Your API's URL components should generally be nouns: singular if you want to retrieve a single thing, or plural if you want a list. For example:

/book/<book_id>

is asking for one book (represented by a JSON dictionary), while

/books

is asking for a list of books (represented by a JSON list of dictionaries).

Optional constraints on an API request should generally be represented by GET parameters, while required constraints should be included in the main URL.

For example, my linguistic API includes an endpoint for verb conjugations. Performing a verb conjugation naturally requires that the caller specify both the language and the verb, so my API includes those parameters as distinct URL components:

/conjugations/<language>/<verb>
/conjugations/french/parler

On the other hand, the question of how I want the response to be formatted (e.g. JSON, XML, plain text, or HTML) can be treated as optional. You can specify it if you want:

/conjugations/french/parler?output_format=XML

or you can just accept the default (JSON in this case):

/conjugations/french/parler
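
The split between required URL components and optional GET parameters can be sketched as follows. The function parse_conjugation_request is a hypothetical helper written for this example and is not part of the linguistic API itself. It treats the language and verb as mandatory path components and falls back to JSON when output_format is absent.

```python
from urllib.parse import parse_qs, urlsplit


def parse_conjugation_request(url):
    """Split a /conjugations/<language>/<verb> URL into its parts."""
    parts = urlsplit(url)
    components = [c for c in parts.path.split("/") if c]
    # Required constraints: the request is invalid without both of them.
    if len(components) != 3 or components[0] != "conjugations":
        raise ValueError("expected /conjugations/<language>/<verb>")
    _, language, verb = components

    # Optional constraint: use the default when the caller leaves it out.
    query = parse_qs(parts.query)
    output_format = query.get("output_format", ["JSON"])[0]

    return {"language": language, "verb": verb, "output_format": output_format}


if __name__ == "__main__":
    print(parse_conjugation_request("/conjugations/french/parler"))
    print(parse_conjugation_request("/conjugations/french/parler?output_format=XML"))
```

A request that omits the verb fails outright, while a request that omits output_format still succeeds with the default.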

CS 257: Software Design

Web Application, Phase 3: API design