Hugging Face

{{Short description|French-American software company}}

{{about|the company|the emoji|Emoji}}

{{primary|date=February 2024}}

{{Infobox company

| name = Hugging Face, Inc.

| logo = Hugging Face logo.svg

| logo_upright = 1.2

| logo_alt =

| type = Private

| industry = {{Unbulleted list|Artificial intelligence|Machine learning|Software development}}

| founded = {{Start date and age|2016}}

| founder = {{Unbulleted list|Clément Delangue|Julien Chaumond|Thomas Wolf}}

| hq_location_city = Manhattan, New York City

| hq_location_country = United States

| area_served = Worldwide

| key_people = {{Unbulleted list|Clément Delangue (CEO)| Julien Chaumond (CTO)| Thomas Wolf (CSO)}}

| products = {{Unbulleted list|Models|Datasets|Spaces}}

| owner =

| revenue = {{increase}} {{US$|15}}{{nbsp}}million

| revenue_year = 2022

| num_employees = 170

| num_employees_year = 2023

| parent =

| website = {{URL|https://huggingface.co/}}

}}

Hugging Face, Inc. is a French-American company based in New York City that develops tools for building applications using machine learning. It is most notable for its Transformers library, built for natural language processing applications, and for its platform that allows users to share machine learning models and datasets and showcase their work.

History

The company was founded in 2016 by French entrepreneurs Clément Delangue, Julien Chaumond, and Thomas Wolf in New York City, originally as the developer of a chatbot app targeted at teenagers.{{Cite web |title=Hugging Face wants to become your artificial BFF |url=https://techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |access-date=2023-09-17 |website=TechCrunch |date=9 March 2017 |language=en-US |archive-date=2022-09-25 |archive-url=https://web.archive.org/web/20220925012620/https://techcrunch.com/2017/03/09/hugging-face-wants-to-become-your-artificial-bff/ |url-status=live }} The company was named after the {{unichar|1F917}} emoji. After open-sourcing the model behind the chatbot, the company pivoted to focus on being a platform for machine learning.

In March 2021, Hugging Face raised US$40 million in a Series B funding round.{{cite web |title=Hugging Face raises $40 million for its natural language processing library |date=11 March 2021 |url=https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library |access-date=5 August 2022 |archive-date=28 July 2023 |archive-url=https://web.archive.org/web/20230728113102/https://techcrunch.com/2021/03/11/hugging-face-raises-40-million-for-its-natural-language-processing-library/ |url-status=live }}

On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.{{cite web |date=10 January 2022 |title=Inside BigScience, the quest to build a powerful open language model |url=https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/ |access-date=5 August 2022 |archive-date=1 July 2022 |archive-url=https://web.archive.org/web/20220701073233/https://venturebeat.com/2022/01/10/inside-bigscience-the-quest-to-build-a-powerful-open-language-model/ |url-status=live }} In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large language model with 176 billion parameters.{{Cite web |title=BLOOM |url=https://bigscience.huggingface.co/blog/bloom |access-date=2022-08-20 |website=bigscience.huggingface.co |archive-date=2022-11-14 |archive-url=https://web.archive.org/web/20221114122342/https://bigscience.huggingface.co/blog/bloom |url-status=live }}{{Cite web |title=Inside a radical new project to democratize AI |url=https://www.technologyreview.com/2022/07/12/1055817/inside-a-radical-new-project-to-democratize-ai/ |access-date=2023-08-25 |website=MIT Technology Review |language=en |archive-date=2022-12-04 |archive-url=https://web.archive.org/web/20221204184214/https://www.technologyreview.com/2022/07/12/1055817/inside-a-radical-new-project-to-democratize-ai/ |url-status=live }}

In December 2021, the company acquired Gradio, an open-source library built for developing machine learning applications in Python.{{Cite web |last=Nataraj |first=Poornima |date=2021-12-23 |title=Hugging Face Acquires Gradio, A Customizable UI Components Library For Python |url=https://analyticsindiamag.com/hugging-face-acquires-gradio-a-customizable-ui-components-library-for-python/ |access-date=2024-01-26 |website=Analytics India Magazine |language=en-US |archive-date=2021-12-23 |archive-url=https://web.archive.org/web/20211223120242/https://analyticsindiamag.com/hugging-face-acquires-gradio-a-customizable-ui-components-library-for-python/ |url-status=live }}

On May 5, 2022, the company announced its Series C funding round, led by Coatue and Sequoia, at a $2 billion valuation.{{Cite web |last=Cai |first=Kenrick |title=The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution |url=https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |access-date=2022-08-20 |website=Forbes |language=en |archive-date=2022-11-03 |archive-url=https://web.archive.org/web/20221103121236/https://www.forbes.com/sites/kenrickcai/2022/05/09/the-2-billion-emoji-hugging-face-wants-to-be-launchpad-for-a-machine-learning-revolution/ |url-status=live }}

On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports SaaS or on-premises deployment.{{Cite web |title=Introducing the Private Hub: A New Way to Build With Machine Learning |url=https://huggingface.co/blog/introducing-private-hub |access-date=2022-08-20 |website=huggingface.co |archive-date=2022-11-14 |archive-url=https://web.archive.org/web/20221114122333/https://huggingface.co/blog/introducing-private-hub |url-status=live }}

In February 2023, the company announced a partnership with Amazon Web Services (AWS) that would make Hugging Face's products available to AWS customers as building blocks for their custom applications. The company also said the next generation of BLOOM would run on Trainium, a proprietary machine learning chip created by AWS.{{cite news |last=Bass |first=Dina |date=2023-02-21 |title=Amazon's Cloud Unit Partners With Startup Hugging Face as AI Deals Heat Up |url=https://www.bloomberg.com/news/articles/2023-02-21/amazon-s-aws-joins-with-ai-startup-hugging-face-as-chatgpt-competition-heats-up |work=Bloomberg News |access-date=2023-02-22 |archive-date=2023-05-22 |archive-url=https://web.archive.org/web/20230522030130/https://www.bloomberg.com/news/articles/2023-02-21/amazon-s-aws-joins-with-ai-startup-hugging-face-as-chatgpt-competition-heats-up |url-status=live }}{{cite news |last=Nellis |first=Stephen |date=2023-02-21 |title=Amazon Web Services pairs with Hugging Face to target AI developers |url=https://www.reuters.com/technology/amazon-web-services-pairs-with-hugging-face-target-ai-developers-2023-02-21/ |work=Reuters |access-date=2023-02-22 |archive-date=2023-05-30 |archive-url=https://web.archive.org/web/20230530091325/https://www.reuters.com/technology/amazon-web-services-pairs-with-hugging-face-target-ai-developers-2023-02-21/ |url-status=live }}{{Cite web |date=2023-02-21 |title=AWS and Hugging Face collaborate to make generative AI more accessible and cost efficient {{!}} AWS Machine Learning Blog |url=https://aws.amazon.com/blogs/machine-learning/aws-and-hugging-face-collaborate-to-make-generative-ai-more-accessible-and-cost-efficient/ |access-date=2023-08-25 |website=aws.amazon.com |language=en-US |archive-date=2023-08-25 |archive-url=https://web.archive.org/web/20230825202343/https://aws.amazon.com/blogs/machine-learning/aws-and-hugging-face-collaborate-to-make-generative-ai-more-accessible-and-cost-efficient/ |url-status=live }}

In August 2023, the company announced that it had raised $235 million in Series D funding at a $4.5 billion valuation. The round was led by Salesforce, with participation from Google, Amazon, Nvidia, AMD, Intel, IBM, and Qualcomm.{{Cite web |last=Leswing |first=Kif |date=2023-08-24 |title=Google, Amazon, Nvidia and other tech giants invest in AI startup Hugging Face, sending its valuation to $4.5 billion |url=https://www.cnbc.com/2023/08/24/google-amazon-nvidia-amd-other-tech-giants-invest-in-hugging-face.html |access-date=2023-08-24 |website=CNBC |language=en |archive-date=2023-08-24 |archive-url=https://web.archive.org/web/20230824141538/https://www.cnbc.com/2023/08/24/google-amazon-nvidia-amd-other-tech-giants-invest-in-hugging-face.html |url-status=live }}

In June 2024, the company, together with Meta and Scaleway, announced the launch of a new AI accelerator program for European startups. The initiative aims to help startups integrate open foundation models into their products, accelerating the EU AI ecosystem. The program, based at STATION F in Paris, will run from September 2024 to February 2025. Selected startups will receive mentoring, access to AI models and tools, and Scaleway's computing power.{{Cite web |date=2024-06-25 |title=META Collaboration Launches AI Accelerator for European Startups |url=https://finance.yahoo.com/news/meta-collaboration-launches-ai-accelerator-151500146.html |access-date=2024-07-11 |website=Yahoo Finance |language=en-US |archive-date=2024-07-11 |archive-url=https://web.archive.org/web/20240711201409/https://finance.yahoo.com/news/meta-collaboration-launches-ai-accelerator-151500146.html |url-status=live }}

On September 23, 2024, in support of the International Decade of Indigenous Languages, Hugging Face teamed up with Meta and UNESCO to launch a new online language translator{{Cite web |date=2024-09-23 |title=Hugging Face Spaces Translator |url=https://huggingface.co/spaces/UNESCO/nllb.html}} built on Meta's No Language Left Behind open-source AI model, enabling free text translation across 200 languages, including many low-resource languages.{{Cite web |date=2024-09-23 |title=UNESCO Translator Event |url=https://www.unesco.org/en/event/unesco-language-translator-powered-meta-and-hugging-face-launching-event?hub=68184.html}}

In April 2025, Hugging Face announced that it had acquired Pollen Robotics, a French humanoid robotics startup founded by Matthieu Lapeyre and Pierre Rouanet in 2016.{{Cite web |last=Wiggers |first=Kyle |date=2025-04-14 |title=Hugging Face buys a humanoid robotics startup |url=https://techcrunch.com/2025/04/14/hugging-face-buys-a-humanoid-robotics-startup/ |access-date=2025-04-15 |website=TechCrunch |language=en-US}}{{Cite web |last=Koetsier |first=John |title=Open Source Humanoid Robots That You Can 3D Print Yourself: Hugging Face Buys Pollen Robotics |url=https://www.forbes.com/sites/johnkoetsier/2025/04/14/open-source-humanoid-robots-hugging-face-buys-pollen-robotics/ |access-date=2025-04-15 |website=Forbes |language=en}} In a post on X, Hugging Face CEO Clément Delangue shared his vision of making AI robotics open source.{{Cite magazine |last=Knight |first=Will |title=An Open Source Pioneer Wants to Unleash Open Source AI Robots |url=https://www.wired.com/story/hugging-face-acquires-open-source-robot-startup/ |access-date=2025-04-15 |magazine=Wired |language=en-US |issn=1059-1028}}

Services and technologies

= Transformers Library =

The Transformers library is a Python package that contains open-source implementations of transformer models for text, image, and audio tasks. It is compatible with the PyTorch, TensorFlow and JAX deep learning libraries and includes implementations of notable models like BERT and GPT-2.{{Cite web |title=🤗 Transformers |url=https://huggingface.co/docs/transformers/index |access-date=2022-08-20 |website=huggingface.co |archive-date=2023-09-27 |archive-url=https://web.archive.org/web/20230927023923/https://huggingface.co/docs/transformers/index |url-status=live }} The library was originally called "pytorch-pretrained-bert"{{cite web |date=Nov 17, 2018 |title=First release |url=https://github.com/huggingface/transformers/releases/tag/v0.1.2 |access-date=28 March 2023 |website=GitHub |archive-date=30 April 2023 |archive-url=https://web.archive.org/web/20230430011038/https://github.com/huggingface/transformers/releases/tag/v0.1.2 |url-status=live }} which was then renamed to "pytorch-transformers" and finally "transformers."
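
For example, a pretrained model and its tokenizer can be loaded in a few lines. The following is a minimal sketch; the model name is illustrative, and a PyTorch installation is assumed:

<syntaxhighlight lang="python">
from transformers import AutoTokenizer, AutoModel

# Both calls download weights and configuration from the Hugging Face Hub
# and cache them locally.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hugging Face is based in New York City.", return_tensors="pt")
outputs = model(**inputs)  # per-token hidden states from BERT
</syntaxhighlight>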

A JavaScript version (transformers.js{{cite web |title=xenova/transformers.js |url=https://github.com/xenova/transformers.js |website=GitHub |access-date=2024-05-26 |archive-date=2023-03-07 |archive-url=https://web.archive.org/web/20230307035125/https://github.com/xenova/transformers.js |url-status=live }}) has also been developed, allowing models to run directly in the browser through the ONNX Runtime.

= Hugging Face Hub =

The Hugging Face Hub is a platform (centralized web service) for hosting:{{Cite web |title=Hugging Face Hub documentation |url=https://huggingface.co/docs/hub/index |access-date=2022-08-20 |website=huggingface.co |archive-date=2023-09-20 |archive-url=https://web.archive.org/web/20230920185949/https://huggingface.co/docs/hub/index |url-status=live }}

  • Git-based code repositories, including discussions and pull requests for projects;
  • models, also with Git-based version control;
  • datasets, mainly in text, images, and audio;
  • web applications ("spaces" and "widgets"), intended for small-scale demos of machine learning applications.
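
Repositories on the Hub can also be accessed programmatically through the huggingface_hub Python library. The following is a minimal sketch; the repository and file names are illustrative:

<syntaxhighlight lang="python">
from huggingface_hub import hf_hub_download

# Download a single file from a model repository on the Hub;
# the file is cached locally and its filesystem path is returned.
config_path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")
print(config_path)
</syntaxhighlight>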

There are numerous pre-trained models that support common tasks in different modalities, such as:

  • Natural Language Processing: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
  • Computer Vision: image classification, object detection, and segmentation.
  • Audio: automatic speech recognition and audio classification.
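
In the Transformers library, such tasks are exposed through the pipeline API, which selects a default pretrained model from the Hub for the requested task. A minimal sketch, with illustrative inputs (the exact outputs depend on the chosen models):

<syntaxhighlight lang="python">
from transformers import pipeline

# Text classification: returns a label and a confidence score.
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face makes machine learning easier."))

# Translation: task names can also encode the language pair.
translator = pipeline("translation_en_to_fr")
print(translator("Hello, world!"))
</syntaxhighlight>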

= Other libraries =


In addition to Transformers and the Hugging Face Hub, the Hugging Face ecosystem contains libraries for other tasks, such as dataset processing ("Datasets"), model evaluation ("Evaluate"), and machine learning demos ("Gradio").{{Cite web |title=Hugging Face - Documentation |url=https://huggingface.co/docs |access-date=2023-02-18 |website=huggingface.co |archive-date=2023-09-30 |archive-url=https://web.archive.org/web/20230930074626/https://huggingface.co/docs |url-status=live }}
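
For instance, Gradio can wrap an ordinary Python function in a shareable web demo. A minimal sketch, with an illustrative function:

<syntaxhighlight lang="python">
import gradio as gr

def greet(name):
    return f"Hello, {name}!"

# Build a web interface around the function and serve it locally.
demo = gr.Interface(fn=greet, inputs="text", outputs="text")
demo.launch()
</syntaxhighlight>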

= Safetensors =

The safetensors format was developed around 2021 to solve problems with the pickle format in Python. It was designed for saving and loading tensors; compared to pickle, it allows lazy loading and avoids the security problems of executing arbitrary code during deserialization.{{Citation |title=huggingface/safetensors |date=2024-09-21 |url=https://github.com/huggingface/safetensors#yet-another-format- |access-date=2024-09-22 |publisher=Hugging Face}} After a security audit, it became the default format in 2023.{{Cite web |title=🐶Safetensors audited as really safe and becoming the default |url=https://huggingface.co/blog/safetensors-security-audit |access-date=2024-09-22 |website=huggingface.co}}
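
In the safetensors Python library, tensors are saved and loaded as a flat dictionary mapping names to tensors. A minimal sketch using PyTorch tensors, with illustrative names:

<syntaxhighlight lang="python">
import torch
from safetensors.torch import save_file, load_file

tensors = {"weight": torch.zeros((16, 256))}
save_file(tensors, "model.safetensors")

# Loading parses the header and reads raw tensor bytes;
# no arbitrary code is executed, unlike unpickling.
loaded = load_file("model.safetensors")
print(loaded["weight"].shape)
</syntaxhighlight>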

The file format:

  • size of the header: 8 bytes, an unsigned little-endian 64-bit integer.
  • header: a JSON UTF-8 string, formatted as {"TENSOR_NAME": {"dtype": "F16", "shape": [1, 16, 256], "data_offsets": [BEGIN, END]}, "NEXT_TENSOR_NAME": {…}, …}.
  • file: a byte buffer containing the tensors.
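
This layout can be read directly. The following sketch parses the header of a file saved as in the earlier example; the optional "__metadata__" key, if present, holds free-form string metadata rather than a tensor:

<syntaxhighlight lang="python">
import json
import struct

# Read the 8-byte header size, then the JSON header itself.
with open("model.safetensors", "rb") as f:
    (header_size,) = struct.unpack("<Q", f.read(8))  # unsigned little-endian 64-bit
    header = json.loads(f.read(header_size).decode("utf-8"))

for name, entry in header.items():
    if name == "__metadata__":  # optional metadata, not a tensor entry
        continue
    print(name, entry["dtype"], entry["shape"], entry["data_offsets"])
</syntaxhighlight>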


References

{{Reflist}}