Carleton Comps Project:

There are already apps that take speech as input and turn it into a song (songify, etc.), creating such gems as Autotune the News. There are also ways to alter the speed of a recording so that playback is faster yet retains the same pitch, e.g., for watching lectures at double speed. These applications are related in that both involve decoupling pitch and speed in an audio signal.
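To see why pitch and speed need to be decoupled in the first place, here is a minimal sketch of what happens without any decoupling. It assumes NumPy and a synthetic 220 Hz test tone; the helper name dominant_frequency is made up for illustration. Speeding the tone up naively, by dropping every other sample, halves the duration but also doubles the pitch.

```python
import numpy as np

def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest FFT bin."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

sample_rate = 8000                          # assumed sample rate for the demo
t = np.arange(sample_rate) / sample_rate    # one second of sample times
tone = np.sin(2 * np.pi * 220.0 * t)        # synthetic 220 Hz tone

# Naive 2x speed-up: keep every second sample, play back at the same rate.
fast = tone[::2]

original_hz = dominant_frequency(tone, sample_rate)   # 220.0
fast_hz = dominant_frequency(fast, sample_rate)       # 440.0, one octave up
fast_seconds = len(fast) / sample_rate                # 0.5
```

The project needs the opposite behavior: change the duration while leaving the pitch alone, and change the pitch while leaving the duration alone.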

The Project

I want an application that will allow me to take recorded speech and songify it. In addition, I want it to have a tool that will allow me to alter the speech speed to match parts of the song. This will involve some sort of visualization of the audio signals, and a way to match timepoints. I envision being able to set timepoints t_1, t_2, t_3, etc., in my speech signal, and corresponding timepoints t_1, t_2, t_3, etc., in my song signal. Between each pair of consecutive timepoints, the speech will be sped up or slowed down to match the duration of the corresponding song segment, and the pitch of the speech signal will be altered to match the pitch of the song, producing an output audio signal that contains the song plus the songified speech.
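The timepoint matching can be sketched as simple arithmetic. This is only an illustration under stated assumptions: the function names segment_stretch_factors and pitch_shift_semitones are hypothetical, timepoints are given in seconds, and each segment is assumed to have a single representative pitch in Hz. For each pair of consecutive timepoints, the sketch computes the factor by which the speech segment's duration would have to be multiplied to fill the matching song segment, and the pitch shift in semitones needed to move one pitch onto another.

```python
import math

def segment_stretch_factors(speech_times, song_times):
    """Per-segment duration multipliers that map speech segments onto song segments.

    A factor above 1 means the speech must be slowed down (made longer);
    a factor below 1 means it must be sped up (made shorter).
    """
    if len(speech_times) != len(song_times) or len(speech_times) < 2:
        raise ValueError("need the same number (at least two) of timepoints in each signal")
    factors = []
    for i in range(len(speech_times) - 1):
        speech_duration = speech_times[i + 1] - speech_times[i]
        song_duration = song_times[i + 1] - song_times[i]
        if speech_duration <= 0 or song_duration <= 0:
            raise ValueError("timepoints must be strictly increasing")
        factors.append(song_duration / speech_duration)
    return factors

def pitch_shift_semitones(speech_hz, song_hz):
    """Semitones by which the speech pitch must move to reach the song pitch."""
    return 12.0 * math.log2(song_hz / speech_hz)

# Hypothetical timepoints, in seconds.
factors = segment_stretch_factors([0.0, 1.0, 3.0], [0.0, 2.0, 3.0])  # [2.0, 0.5]
shift = pitch_shift_semitones(220.0, 440.0)                          # 12.0 (one octave up)
```

These per-segment factors and shifts are the kind of parameters a time-scaling and pitch-scaling stage would consume.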

Deliverables

The baseline should be an application as described above that includes:

Automatic pitch detection, which should entail:
1. Research of automatic pitch detection methods
2. Implementation of one or more algorithms for pitch detection
An implementation of a phase vocoder to allow for pitch and time scaling of an audio signal
Visualization of the two audio signals
A way to set at least two timepoints on each audio signal and produce an output that is the “songified” mix of the two.
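As one possible starting point for the pitch detection deliverable, here is a minimal sketch of an autocorrelation-based estimator, which is only one of several methods the research step might turn up. It assumes NumPy, a short mono frame containing a single voiced sound, and a search range of 80 to 500 Hz chosen for illustration; the function name detect_pitch_autocorr is hypothetical.

```python
import numpy as np

def detect_pitch_autocorr(frame, sample_rate, fmin=80.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of one frame via autocorrelation."""
    frame = frame - np.mean(frame)                  # remove any DC offset
    # Keep only non-negative lags of the full autocorrelation.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    min_lag = int(sample_rate / fmax)               # shortest period considered
    max_lag = min(int(sample_rate / fmin), len(corr) - 1)  # longest period considered
    # The lag with the strongest self-similarity is taken as the period.
    lag = min_lag + np.argmax(corr[min_lag:max_lag + 1])
    return sample_rate / lag

# Synthetic check: a 200 Hz sine sampled at 8 kHz (period of 40 samples).
sample_rate = 8000
n = np.arange(1000)
frame = np.sin(2 * np.pi * 200.0 * n / sample_rate)
estimated_hz = detect_pitch_autocorr(frame, sample_rate)
```

Real speech is noisier than a sine wave, so a practical version would likely need windowing, a voiced/unvoiced decision, and smoothing across frames.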

Stretch Goals

An automatic “best fit” between a speech signal and song signal
Multiple speech signals to integrate
A way to handle multiple types of sound files

Recommended Experience

The following are some courses that will be useful to have taken. If you haven’t taken them (or don’t plan to), don’t despair, but I want to try to make sure that a few students have previous experience with signal processing.

Natural Language Processing (CS 322)
Computer Music and Sound (MUSC 208)
Fourier Series and Boundary Value Problems (MATH 341)