2023–24 Projects:
Advisor: Jeff Ondich
Meeting time: TTh 10:10-11:55
Current versions of both Windows XP and Mac OS X come standard with spoken language interface tools. If you activate the Windows "Speech Tools," for example, you can speak menu commands and dictate to your word processor. The speech recognition is so-so, but getting better. With practice, you can dictate nearly as quickly as you can type. For people who have repetitive motion disorders or who work in contexts where they need to use the computer but don't have their hands free, these speech tools can be handy.
For years, people have been predicting that everybody will throw away the mouse and start talking to their computers any day now, but it hasn't happened yet. Maybe that's because the speech recognition rates are still too low. But maybe it's because the speech-driven systems don't really add enough new value for most people. Can I do what I want to do more easily, more naturally, more quickly, or more effectively? No? Then I'll keep my keyboard, thank you.
For this project, you will explore ways to make speech-driven command-and-control systems more sophisticated, and, if all goes well, more useful than the currently available systems. Right now, it's relatively easy to have simple conversations with the computer that go like this: "File...Save As...OK." But what if you want to have a conversation like this?
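To make the contrast concrete, here is a rough Python sketch of how a one-shot command grammar behaves: each recognized phrase maps to a fixed action sequence, and nothing is remembered between utterances. The phrase table and the printed actions are illustrative stand-ins, not any particular vendor's speech API.

# A toy one-shot command grammar: every recognized phrase triggers a fixed
# sequence of actions, with no memory of earlier turns in the conversation.
COMMANDS = {
    "file": ["open the File menu"],
    "save as": ["open the File menu", "choose Save As"],
    "ok": ["press the OK button"],
}

def handle_utterance(utterance):
    """Look up a recognized phrase and perform its fixed actions."""
    actions = COMMANDS.get(utterance.strip().lower())
    if actions is None:
        print("Sorry, I didn't catch that.")
        return
    for action in actions:
        print("->", action)   # stand-in for driving the real GUI

handle_utterance("save as")                              # works fine
handle_utterance("move it a little more to the left")    # no hope of working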
You: Shrink the browser a bit and move it to the left.
You: A little bit more...that's good.
You: Now bring the other window forward.
Computer: This one?
You: No, the big one...good.
This conversation hints at the kind of ease of use that speech-driven command and control might bring. Making such a conversation a reality, however, will require a sophisticated system in which the computer resolves pronouns, remembers previous requests, and so on. This project will focus on improving the usefulness of speech-driven interfaces by making more natural, multi-sentence conversations possible.
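By way of illustration, here is a minimal Python sketch of the dialogue state such a system might keep: a short memory of which windows have been mentioned, so that "it" and "the other one" can be resolved against earlier turns. The window titles are invented, and the call to wmctrl (a standard Linux window-control utility, which must be installed for the last lines to work) is just one possible back end; treat this as a sketch of the idea, not a design.

import subprocess

class DialogueContext:
    def __init__(self, known_windows):
        self.known_windows = known_windows   # window titles reported by the window manager
        self.focus = None                    # the window the conversation is currently about

    def resolve(self, phrase):
        """Turn a referring expression into a window title, using the conversation so far."""
        if phrase in ("it", "that one"):
            return self.focus                               # plain anaphora: reuse the current focus
        if phrase == "the other one" and self.focus:
            others = [w for w in self.known_windows if w != self.focus]
            return others[0] if len(others) == 1 else None  # ambiguous: time to ask "This one?"
        if phrase in self.known_windows:
            return phrase                                   # an explicit new referent
        return None

    def bring_forward(self, phrase):
        window = self.resolve(phrase)
        if window is None:
            print("Computer: This one?")                    # the clarification turn from the dialogue above
            return
        self.focus = window
        subprocess.run(["wmctrl", "-a", window])            # raise/activate the window by title

context = DialogueContext(["Firefox", "Terminal"])
context.bring_forward("Firefox")          # establishes the focus
context.bring_forward("the other one")    # resolved to "Terminal" from context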
For this project, you will develop a command-and-control dialogue system for Linux, with an emphasis on natural multi-sentence interactions and with the goal of building tools that people would actually like to use. Your tasks will include:
References:
Michael F. McTear, "Spoken Dialogue Technology: Enabling the Conversational User Interface," in ACM Computing Surveys, Vol. 34, No. 1, March 2002, pp. 90-169.
The Galaxy Communicator project.
The home page of SIGdial, the Special Interest Group on Discourse and Dialogue of the Association for Computational Linguistics.
Barbara Grosz, Discourse and Dialogue, Chapter 6 of Survey of the State of the Art in Human Language Technology, edited by Ronald A. Cole et al., 1996.
Hossein Motallebipour and August Bering, A Spoken Dialogue System to Control Robots. This is a typical example of the sort of project that can be done with the tools being made available to programmers through academic and industrial research programs.