Peter Gorniak and Deb Roy. (2003). A Visually Grounded Natural Language Interface for Reference to Spatial Scenes. In Proceedings of the International Conference on Multimodal Interfaces.