Diffbot

{{short description|American machine learning and knowledge management company}}

{{Infobox company

| name = Diffbot

| logo = Diffbot Logo

| type = Private company

| traded_as =

| predecessor =

| successor =

| founder = Mike Tung

| defunct =

| fate =

| area_served = Worldwide

| key_people =

{{plain list|

* Mike Tung (CEO)

}}

| industry = Internet

| genre =

| products =

| production =

| services = Web APIs, Enterprise Search, Web Scraping, Web Crawling

| revenue =

| operating_income =

| net_income =

| aum =

| assets =

| equity =

| owner =

| num_employees =

| parent =

| divisions =

| subsid =

| footnotes =

| intl =

| caption =

| foundation =

| hq_location_city = Menlo Park, California

| hq_location_country = U.S.

| locations =

| homepage = [https://www.diffbot.com/ www.diffbot.com]

}}

Diffbot is a developer of machine learning and computer vision algorithms and public APIs that extract data from web pages (web scraping) in order to build a knowledge base.

The company has attracted interest for its application of computer vision technology to web pages: it visually parses a web page for important elements and returns them in a structured format.{{cite web|url=https://thenextweb.com/apps/2011/08/25/diffbot-lets-developers-navigate-code-the-way-our-eyes-see-the-world/ |title=Diffbot Lets Developers Navigate Code the Way Our Eyes See the World |publisher=TheNextWeb |date=August 25, 2011 |accessdate=April 21, 2013}} In 2015, Diffbot announced that it was working on its own automated "Knowledge Graph", built by crawling the web and applying its automatic web page extraction to assemble a large database of structured web data.{{cite web|url=https://www.wired.com/2015/06/startup-shares-google-knowledge-graph-clone-everyone/ |title=Startup Unleashes Its Clone of Google's 'Knowledge Graph' |publisher=Wired |date=June 4, 2015 |accessdate=June 15, 2015}} In 2019, Diffbot released its Knowledge Graph, which has since grown to include over two billion entities (corporations, people, articles, products, discussions, and more) and ten trillion "facts".

The company's products allow software developers to analyze web home pages and article pages,{{cite web|url=http://gigaom.com/2011/08/25/diffbot-helps-apps-read-the-web-like-humans/ |title=Diffbot Helps Apps Read the Web Like Humans |work=Gigaom |date=August 25, 2011 |accessdate=March 14, 2013 |last1=Kim |first1=Ryan }} and extract the "important information" while ignoring elements deemed not core to the primary content.{{cite web|url=https://blogs.wsj.com/venturecapital/2012/05/31/investors-back-diffbots-visual-learning-robot-for-web-content |title=Investors Back Diffbot's Visual Learning Robot for Web Content |publisher=The Wall Street Journal |date=May 31, 2012 |accessdate=March 14, 2013}}

In August 2012 the company released its Page Classifier API, which automatically categorizes web pages into specific "page types".{{cite web|url=https://venturebeat.com/2012/08/16/diffbot-api-links |title=DiffBot's new API brilliantly reveals what's hiding behind any link |date=August 16, 2012 |accessdate=March 14, 2013}} As part of this, Diffbot analyzed 750,000 web pages shared on the social media service Twitter and revealed that photos, followed by articles and videos, are the predominant web media shared on the social network.{{cite web|url=http://mashable.com/2012/08/16/twitter-day-in-the-life-infographic/ |title=Twitter: A Day in the Life |website=Mashable |date=August 16, 2012 |accessdate=March 14, 2013}}

In September 2020 the company released a Natural Language Processing API for automatically building Knowledge Graphs from text.{{Cite web |date=2020-09-17 |title=New AI Tool Maps the Families of the Bible, A Song of Ice and Fire |url=https://www.datanami.com/2020/09/17/new-ai-tool-maps-the-families-of-the-bible-a-song-of-ice-and-fire/ |access-date=2022-06-08 |website=Datanami}}


The company raised $2 million in funding in May 2012 from investors including Andy Bechtolsheim and Sky Dayton.{{cite web|url=https://www.theverge.com/2012/5/31/3054444/diffbot-raises-2-million-apps-open-web |title=Diffbot raises $2 million to help apps understand the open, unstructured web |publisher=TheVerge |date=May 31, 2012 |accessdate=March 14, 2013}}

Diffbot's customers include Adobe, AOL, Cisco, DuckDuckGo, eBay, Instapaper, Microsoft, Onswipe and Springpad.{{cite web|url=https://www.forbes.com/sites/anthonykosner/2015/06/04/diffbot-bests-googles-knowledge-graph-to-feed-the-need-for-structured-data/ |title=Diffbot Bests Google's Knowledge Graph To Feed The Need For Structured Data |date=June 4, 2015 |work=Forbes |accessdate=June 15, 2015}}

References

{{Reflist|2}}