Carleton Comps Project: Automated Software Engineering


Advisor: Dave Musicant

Final Results - LosCat - A Tool for Source Code Monitoring

Final Results - Automated Code Retrieval and Summarization

Background

The ACM Automated Software Engineering (ASE) Conference is an annual research conference dedicated to applying computer science techniques to its own development tools. Each year, a variety of papers are presented and published that demonstrate many different approaches for automating software design. Here are links to lists of papers from recent years. Take a look (and click the "Table of Contents" tab when you get there):

2018 2017 2016 2014 2013 2012 2011 2010 2009 2008

This project will begin with a larger group of students (perhaps 12). Each student will find a paper of interest and present it to the rest of the group. Discussion will follow regarding the interest among other students and the feasibility of doing the project. After the presentations, the students will coalesce into subgroups of approximately 4 students each, depending on the scope of the projects. Each subgroup will pick one of the papers to work with and implement the project.

It should be noted that many of these papers have a significant algorithmic component to them, and so understanding and possibly implementing AI or optimization-based algorithms will likely form an important part of the project that you'll do.

Sample projects

What might some of these projects be? I don't want to supply examples that you'll do, because I want the team members to find projects. Therefore, I've picked a couple of old ones (from ASE 2007) to illustrate the concept; you'll pick something more recent. These projects may or may not be sized appropriately for the team at hand, and thus perhaps might need tweaking; I've merely included them here to give you ideas of the sorts of things you might find.

Automated code styling. It can be jarring when different programmers collaborate on a shared set of code but use different styles. This paper describes a tool to help keep the code style consistent. It automatically detects which formatting style a program uses, and then applies that style to new code that is intended to be added to it. (This paper extracted from each program a number of features relevant to the formatting style, and both applied and compared a variety of supervised machine learning approaches to identify the style for each.)
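To make the idea of "features relevant to the formatting style" a bit more concrete, here is a minimal sketch. It is not the paper's tool: the two features, the function names, and the style labels are all assumptions chosen for illustration, and a simple 1-nearest-neighbor rule stands in for the supervised learners the paper compared.

```python
import math

def style_features(source):
    """Turn source text into a small vector of formatting features (illustrative only)."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    indented = [ln for ln in lines if ln[0] in " \t"]
    braces = [ln for ln in lines if "{" in ln]
    return (
        # Fraction of indented lines that start with a tab rather than a space.
        sum(ln[0] == "\t" for ln in indented) / max(len(indented), 1),
        # Fraction of opening braces that sit alone on their own line.
        sum(ln.strip() == "{" for ln in braces) / max(len(braces), 1),
    )

def predict_style(source, labeled_examples):
    """1-nearest-neighbor: return the label of the closest labeled example."""
    target = style_features(source)
    _, label = min(
        (math.dist(target, style_features(text)), name)
        for text, name in labeled_examples
    )
    return label

# Hypothetical training data: one tiny program per style.
EXAMPLES = [
    ("int main()\n{\n    return 0;\n}\n", "allman"),
    ("int main() {\n\treturn 0;\n}\n", "k&r"),
]

new_code = "void f()\n{\n    g();\n}\n"
detected = predict_style(new_code, EXAMPLES)
```

A real system would need many more features and training programs; the point is only that formatting can be reduced to numbers that a classifier can compare.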

Nighthawk: a two-level genetic-random unit test data generator. Writing unit tests is critical for having reliable code, but they can be cumbersome to write, especially if a large code base exists and unit test writing has been historically neglected. Wouldn't it be awesome if a system could automatically generate unit tests? This paper describes a tool for doing exactly that, using a genetic-algorithm approach to evolve tests.
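As a rough illustration of evolving tests with a genetic algorithm, here is a minimal single-level sketch. It is not Nighthawk's two-level design: the function under test (branch_taken), the fitness measure (number of distinct branches reached), and all parameter values are assumptions made up for this example.

```python
import random

NUM_BRANCHES = 5

def branch_taken(a, b, c):
    """Hypothetical function under test; returns the id of the branch it executes."""
    if a <= 0 or b <= 0 or c <= 0:
        return 0  # invalid side length
    if a + b <= c or a + c <= b or b + c <= a:
        return 1  # not a triangle
    if a == b == c:
        return 2  # equilateral
    if a == b or b == c or a == c:
        return 3  # isosceles
    return 4      # scalene

def coverage(suite):
    """Fitness: how many distinct branches the suite's inputs reach."""
    return len({branch_taken(*case) for case in suite})

def mutate(suite, rng, rate=0.3):
    """With probability `rate` per test case, nudge one argument by +1 or -1."""
    out = []
    for case in suite:
        if rng.random() < rate:
            i = rng.randrange(3)
            case = case[:i] + (case[i] + rng.choice([-1, 1]),) + case[i + 1:]
        out.append(case)
    return out

def crossover(p1, p2, rng):
    """Single-point crossover: a prefix of one suite joined to a suffix of the other."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def evolve(seed=0, pop_size=20, suite_size=5, generations=30):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    pop = [[tuple(rng.randint(-5, 20) for _ in range(3)) for _ in range(suite_size)]
           for _ in range(pop_size)]
    initial_best = max(coverage(s) for s in pop)
    for _ in range(generations):
        pop.sort(key=coverage, reverse=True)
        elite = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = elite + children
    best = max(pop, key=coverage)
    return best, coverage(best), initial_best

best_suite, best_cov, start_cov = evolve(seed=1)
```

Because the fitter half of each generation is carried over unchanged, the best coverage never drops from one generation to the next. Whether it reaches all five branches depends on the seed and the parameters.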

Combined Static and Dynamic Mutability Analysis. When a method in an object-oriented language takes objects as parameters, the method might or might not change the contents of those objects. In CS 251 lingo, we call that a side effect. It's important for a user of this method to know whether the objects supplied as parameters might be changed when calling the method. However, it is not always obvious from a quick scan of the method code whether or not such changes, also known as mutations, might be made. This paper describes a tool that can automatically determine whether or not a method's parameters might be mutated during the execution of the method; the results could ultimately be used for documentation generation, debugging, test generation, or other purposes. The approach models parameter references as a graph and uses a variety of graph algorithms to propagate mutability status for variables through the graph.
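The graph idea can be sketched in a few lines. This is a simplified illustration, not the paper's algorithm: it covers only one static propagation step, and the graph encoding, function name, and method names are assumptions invented for the example. Each node is a "method.parameter" string, and an edge records that a method passes its parameter along as an argument to another method. If the receiving parameter may be mutated, so may the one that was passed in.

```python
from collections import defaultdict, deque

def propagate_mutability(directly_mutated, passes_to):
    """
    directly_mutated: set of parameter nodes whose method body writes to them.
    passes_to: dict mapping a parameter node to the set of parameter nodes
               it is passed to as an argument.
    Returns the set of all parameter nodes that may be mutated.
    """
    # Reverse the edges: mutability flows from callee parameter back to caller parameter.
    passed_from = defaultdict(set)
    for src, dests in passes_to.items():
        for dest in dests:
            passed_from[dest].add(src)

    mutable = set(directly_mutated)
    worklist = deque(directly_mutated)
    while worklist:
        node = worklist.popleft()
        for caller_param in passed_from[node]:
            if caller_param not in mutable:
                mutable.add(caller_param)
                worklist.append(caller_param)
    return mutable

# Hypothetical program: render passes its shape to scale, which passes it to set_width.
passes_to = {
    "render.shape": {"scale.shape"},
    "scale.shape": {"set_width.shape"},
    "render.canvas": {"draw.canvas"},
}
result = propagate_mutability({"set_width.shape"}, passes_to)
# result contains set_width.shape, scale.shape, and render.shape, but not render.canvas
```

The worklist visits each node at most once, so the propagation finishes even when the call graph contains cycles (for example, recursive methods).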

Deliverables

You'll be expected to produce the following two deliverables associated with your project:

A working implementation. You should be able to demonstrate an implementation of your project that works well and that you are proud of.
An assessment of how well it works. What this involves will depend on the project, and the paper you choose will generally describe strategies for doing so. For example, if you are developing a tool for users, this would be a user study. If you are implementing an algorithm to automate some aspect of software development, you might test it on large open source code repositories. This portion of the project should not be overlooked; it is one of the two key expectations.

Prerequisites

Students will have some flexibility to pick projects that fit their backgrounds, so there is no particular expertise required.