DBRX

{{Short description|Open-source large language model}}

{{Infobox software

| developer = Mosaic ML team at Databricks

| screenshot = DBRX chatbot example screenshot.webp

| screenshot_alt = Screenshot of a DBRX chatbot answer, describing Wikipedia in a thoughtful way

| caption = Screenshot of DBRX describing Wikipedia

| released = March 27, 2024

| repo = https://github.com/databricks/dbrx

| license = Databricks Open Model License

| website = https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

}}

DBRX is an open-source large language model (LLM) developed by the Mosaic ML team at Databricks and released on March 27, 2024.{{Cite web |date=2024-03-27 |title=Introducing DBRX: A New State-of-the-Art Open LLM |url=https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm |access-date=2024-03-28 |website=Databricks |language=en-US}}{{Cite web |title=New Databricks open source LLM targets custom development {{!}} TechTarget |url=https://www.techtarget.com/searchbusinessanalytics/news/366575678/New-Databricks-open-source-LLM-targets-custom-development |access-date=2024-03-28 |website=Business Analytics |language=en}}{{Cite web |last=Ghoshal |first=Anirban |date=2024-03-27 |title=Databricks' open-source DBRX LLM beats Llama 2, Mixtral, and Grok |url=https://www.infoworld.com/article/3714625/databricks-open-source-dbrx-llm-beats-llama-2-mixtral-and-grok.html |access-date=2024-03-28 |website=InfoWorld |language=en}} It is a mixture-of-experts transformer model with 132 billion parameters in total, of which 36 billion (4 out of 16 experts) are active for each token.{{Cite web |first= |date=Mar 28, 2024 |title=A New Open Source LLM, DBRX Claims to be the Most Powerful – Here are the Scores |url=https://www.gizmochina.com/2024/03/28/open-source-llm-dbrx-powerful/ |website=GIZMOCHINA}} The model was released in two versions: a base foundation model and an instruction-tuned variant.{{Cite web |last=Wiggers |first=Kyle |date=2024-03-27 |title=Databricks spent $10M on new DBRX generative AI model |url=https://techcrunch.com/2024/03/27/databricks-spent-10m-on-a-generative-ai-model-that-still-cant-beat-gpt-4/ |access-date=2024-03-29 |website=TechCrunch |language=en-US}}
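
In a mixture-of-experts layer, a router sends each token's representation to only a few of the available feed-forward "experts", which is why only 36 of DBRX's 132 billion parameters are active per token. The following PyTorch sketch illustrates top-4-of-16 routing of this general kind; the class name <code>MoELayer</code>, the layer dimensions, and the routing details are illustrative assumptions, not DBRX's actual implementation.

<syntaxhighlight lang="python">
# Minimal sketch of top-k mixture-of-experts routing (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, n_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (tokens, d_model)
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep the 4 best experts per token
        weights = F.softmax(weights, dim=-1)            # normalise their mixing weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in idx[:, k].unique().tolist():
                mask = idx[:, k] == e                   # tokens whose k-th choice is expert e
                out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(8, 1024)
print(layer(tokens).shape)  # torch.Size([8, 1024])
</syntaxhighlight>

Because only 4 of the 16 experts run per token, roughly a quarter of the expert parameters are exercised on any one token, while the full parameter count remains available across different tokens.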

At the time of its release, DBRX outperformed other prominent open-source models, such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok, on several benchmarks covering language understanding, programming, and mathematics.{{Cite web |date=2024-03-28 |title=Data and AI company DataBrix has launched a general-purpose large language model (LLM) DBRX that out.. |url=https://www.mk.co.kr/en/world/10976197 |access-date=2024-03-28 |website=Maeil Business Newspaper |language=en}}{{Cite magazine |last=Knight |first=Will |title=Inside the Creation of the World's Most Powerful Open Source AI Model |url=https://www.wired.com/story/dbrx-inside-the-creation-of-the-worlds-most-powerful-open-source-ai-model/ |access-date=2024-03-28 |magazine=Wired |language=en-US |issn=1059-1028}}

It was trained for 2.5 months on 3,072 Nvidia H100 GPUs connected by 3.2 terabits per second InfiniBand, at a training cost of US$10 million.
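
As a rough back-of-envelope illustration (an approximation, not a figure reported by Databricks), the stated cluster size and duration imply on the order of five and a half million GPU-hours:

<syntaxhighlight lang="python">
# Back-of-envelope estimate from the figures above; the 2.5-month
# duration is approximated as 75 days. Illustrative, not official.
gpus = 3072
days = 2.5 * 30                  # ~75 days
gpu_hours = gpus * days * 24
print(f"{gpu_hours:,.0f} GPU-hours")  # 5,529,600 GPU-hours
</syntaxhighlight>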

== References ==

{{reflist}}

{{Generative AI}}

[[Category:Large language models]]