Speechbot

SpeechBot was a web search engine for streaming media content{{cite book|last=Gibbon|first=David C.|title=Introduction to video search engines|year=2008|publisher=Springer|location=Berlin|isbn=978-3540793366|pages=226–227|url=https://books.google.com/books?id=6yyj2GIhWH8C|author2=Zhu Liu }} developed at Compaq's (later HP) research laboratories in Cambridge, MA and Australia.{{cite news|url=http://www.pcworld.idg.com.au/article/93182/australian_research_gives_compaq_voice/|title=Australian research gives Compaq a voice|first=Byron|last=Kaye|date=10 January 2000|agency=PC World}} Compaq launched the website at Streaming Media West 1999 in San Jose, CA.{{cite news|title=Compaq Unveils First Website for Indexing Spoken Streamed Media; SpeechBot Research and Development Site Furthers Innovation Leadership.|agency=PR Newswire|date=7 December 1999|url=http://www.thefreelibrary.com/Compaq+Unveils+First+Website+for+Indexing+Spoken+Streamed+Media%3B...-a058056491}}{{cite news|first=Linda|last=Leung|title=Compaq's Speechbot site is an Internet first|url=http://www.v3.co.uk/v3-uk/news/1951962/compaqs-speechbot-site-internet|accessdate=18 June 2012|date=8 December 1999|agency=V3}}{{cite news|url=http://www.infotoday.com/online/OL2000/engine3.html|agency=ONLINE|date=March 2000|last=Notess|first=Greg|title=Internet Search Engine Update}} The internet radio shows indexed by SpeechBot included The Motley Fool, Fresh Air, Talk of the Nation, The Dr. Laura Program, and Dreamland with Art Bell. By June 2003, the service had indexed over 17,000 hours of multimedia content. The website was taken offline in 2005, after HP closed its Cambridge research lab.{{cite news|last=Price|first=Gary|title=Multimedia Searching: Speechbot is No Longer Available|url=http://blog.searchenginewatch.com/blog/051104-150521|date=4 November 2005|agency=Search Engine Watch}}

The SpeechBot indexing workflow involved a farm of Windows workstations that retrieved the streaming content and a Linux cluster that ran speech recognition to transcribe the spoken audio. The web server, search index, and metadata library were hosted on AlphaServers running Tru64 UNIX.

Where a transcript was already available, it was aligned to the audio stream; otherwise, an approximate transcript was produced using speech recognition. The recognizer, Calista, was derived from Sphinx-3. Because streaming audio of the period was of low quality, the word error rate was high, but most searches still retrieved relevant hits.{{cite journal|url=http://eprints.whiterose.ac.uk/4710/|title=The relationship of word error rate to document ranking|year=2004|last1=Mang Shou|first1=X.|last2=Sanderson|first2=M.|last3=Tuffs|first3=N.|journal=Proceedings of the AAAI Spring Symposium Intelligent Multimedia Knowledge Management Workshop|pages=28–33|isbn=1577351908}} Each search result linked to the offset in the stream that corresponded to the search phrase, so users did not need to listen to an entire program to find the section of interest.
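The idea of linking a search hit to an offset in the stream can be illustrated with a minimal sketch. It is not SpeechBot's actual implementation: it assumes that a time-aligned transcript can be represented as a list of words with start times in seconds, and the names `TimedWord`, `build_index` and `phrase_offsets`, along with the sample transcript, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class TimedWord(NamedTuple):
    """One transcript word with its start time (hypothetical structure)."""
    word: str
    start: float  # seconds from the beginning of the stream


def build_index(transcript):
    """Map each lowercased word to the transcript positions where it occurs."""
    index = defaultdict(list)
    for position, timed in enumerate(transcript):
        index[timed.word.lower()].append(position)
    return index


def phrase_offsets(transcript, index, phrase):
    """Return the start time of every exact occurrence of the phrase."""
    words = phrase.lower().split()
    if not words:
        return []
    offsets = []
    # Only positions of the first word can begin a match.
    for position in index.get(words[0], []):
        window = transcript[position:position + len(words)]
        if [timed.word.lower() for timed in window] == words:
            offsets.append(transcript[position].start)
    return offsets


# Invented sample data standing in for an aligned or recognized transcript.
transcript = [
    TimedWord("welcome", 0.0),
    TimedWord("to", 0.6),
    TimedWord("fresh", 0.8),
    TimedWord("air", 1.2),
    TimedWord("today", 1.7),
    TimedWord("on", 2.3),
    TimedWord("fresh", 2.5),
    TimedWord("air", 2.9),
]
index = build_index(transcript)
print(phrase_offsets(transcript, index, "fresh air"))
```

Each returned offset could be used to start playback at the matching point in the stream. An exact-match lookup of this kind would miss phrases that the recognizer transcribed incorrectly, so a system working with error-prone transcripts would presumably need more tolerant ranking than this sketch shows.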

References

{{Reflist}}

Further reading

  • {{cite journal|last=Swain |first=Michael J. |title=Searching for Multimedia on the World Wide Web |journal=Compaq Technical Report |date=March 1999 |volume=CRL 99/1 |url=http://hpl.hp.com/techreports/Compaq-DEC/CRL-99-1.pdf |url-status=unfit |archiveurl=https://web.archive.org/web/20051031113112/http://hpl.hp.com/techreports/Compaq-DEC/CRL-99-1.pdf |archivedate=October 31, 2005 }}
  • {{cite journal|last1=Eberman |first1=B. |last2=Fidler |first2=B. |last3=Iannucci |first3=R.A. |last4=Joerg |first4=C. |last5=Kontothanassis |first5=L. |last6=Kovalcin |first6=D.E. |last7=Moreno |first7=P. |last8=Swain |first8=M.J. |last9=Van Thong |first9=J-M |title=Indexing Multimedia for the Internet |journal=Compaq Technical Report |date=March 1999 |volume=CRL 99/2 |url=http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-99-2.html |url-status=unfit |archiveurl=https://web.archive.org/web/20060320223954/http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-99-2.html |archivedate=March 20, 2006 }}
  • {{cite journal|first1=F.|last1=Dufaux|first2=B.|last2=Eberman|first3=L.|last3=Kontothanassis|first4=P.|last4=Moreno|first5=M.|last5=Swain|first6=C.|last6=Weikart|title=A system for indexing web multimedia|journal=Compaq Technical Report|date=March 1999|volume=CRL 99/3}}
  • {{cite journal|last1=Kontothanassis |first1=Leonidas |last2=Joerg |first2=Chris |last3=Swain |first3=Michael J. |last4=Eberman |first4=Brian |last5=Iannucci |first5=Robert A. |title=Design Implementation and Analysis of a Multimedia Indexing and Delivery Server |journal=Compaq Technical Report |date=August 1999 |volume=CRL 99/5 |url=http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-99-5.html |url-status=unfit |archiveurl=https://web.archive.org/web/20060320225546/http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-99-5.html |archivedate=March 20, 2006 }}
  • {{cite journal|last1=Moreno|first1=P.J.|last2=Van Thong|first2=J.-M.|last3=Logan|first3=B.|last4=Jones|first4=G.J.F.|title=From multimedia retrieval to knowledge management|journal=Computer|date=1 January 2002|volume=35|issue=4|pages=58–66|doi=10.1109/MC.2002.993772}}
  • {{cite journal|last1=Van Thong|first1=J.-M.|last2=Moreno|first2=P.J.|last3=Logan|first3=B.|last4=Fidler|first4=B.|last5=Maffey|first5=K.|last6=Moores|first6=M.|title=Speechbot: an experimental speech-based search engine for multimedia content on the web|journal=IEEE Transactions on Multimedia|date=March 2002|volume=4|issue=1|pages=88–96|doi=10.1109/6046.985557|url=http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-2001-6.pdf}}
  • {{cite book|doi=10.1109/ICASSP.2005.1416475|date=March 2005|last1=Logan|first1=Beth|last2=Goddeau|first2=Dave|last3=Van Thong|first3=Jean-Manuel|title=Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005 |chapter=Real-World Audio Indexing Systems |journal=Proc. ICASSP'05|volume=5|pages=1001–1004|isbn=0-7803-8874-7|s2cid=30576691 }}
  • {{cite news|last=Olsen|first=Stefanie|title=Search engines try to find their sound|url=http://news.cnet.com/Search-engines-try-to-find-their-sound/2100-1032_3-5221267.html|accessdate=18 June 2012|date=27 May 2004|agency=CNET News}}

Category:Hewlett-Packard

Category:Defunct internet search engines

Category:1999 software

Category:Internet properties established in 1999

Category:Internet properties disestablished in 2005

{{web-software-stub}}