Deb Roy and Alex Pentland. (1998). Learning words from natural audio-visual input. Fifth International Conference on Spoken Language Processing (ICSLP), 5 pages.