data build tool
{{Short description|Data analytics transformation tool}}
{{Infobox software
| name = dbt
| logo = data build tool (dbt) logo.svg
| developer = dbt Labs
| released = {{Start date and age|2021|12|03}}
| latest release version = 1.8.9
| latest release date = {{Start date and age|2024|11|21}} {{cite web |title=Release dbt-core v1.8.9 · dbt-labs/dbt-core |url=https://github.com/dbt-labs/dbt-core/releases/tag/v1.8.9 |website=GitHub |access-date=6 Dec 2024}}
| latest preview version =
| latest preview date =
| operating system = Microsoft Windows, macOS, Linux
| size =
| programming language = Python
| genre = Data analytics, data management
| license = Apache License 2.0
| website = {{URL|https://docs.getdbt.com/}}
| language = Python
}}
Data build tool (dbt) is an open-source command line tool that helps analysts and engineers transform data in their warehouse more effectively.{{cite book |last1=Atwal |first1=Harvinder |title=Practical DataOps: Delivering Agile Data Science at Scale |date=9 December 2019 |publisher=Apress |isbn=978-1-4842-5104-1 |page=223 |url=https://books.google.com/books?id=ADLDDwAAQBAJ&pg=PA223 |language=en}}
History
dbt started at RJMetrics in 2016 as a way to add basic transformation capabilities to Stitch (acquired by Talend in 2018).{{cite web |url=https://www.stitchdata.com/blog/stitch-is-joining-talend |title=Stitch is joining Talend |publisher=Stitch Data |date=2018-11-07 |access-date=2021-11-07 |archive-date=2021-11-07 |archive-url=https://web.archive.org/web/20211107174227/https://www.stitchdata.com/blog/stitch-is-joining-talend/ |url-status=live }} The earliest versions of dbt allowed analysts to contribute to the data transformation process while following software engineering best practices.{{cite web |url=https://blog.getdbt.com/goodbye-rjmetrics-hello-fishtown-analytics/ |title=Goodbye RJMetrics, Hello Fishtown Analytics |publisher=dbt Blog |date=2016-08-01 |access-date=2021-11-07 |archive-date=2021-11-07 |archive-url=https://web.archive.org/web/20211107174141/https://blog.getdbt.com/goodbye-rjmetrics-hello-fishtown-analytics/ |url-status=live }}
From the beginning, dbt was open source.{{Cite web |last=Cai |first=Kenrick |title=Dbt Labs In Talks To Raise At $6 Billion Valuation, Six Months After Becoming A Unicorn |url=https://www.forbes.com/sites/kenrickcai/2021/12/15/dbt-labs-in-talks-to-raise-at-6-billion-valuation-six-months-after-becoming-a-unicorn/ |access-date=2023-04-01 |website=Forbes |language=en}} In 2018, the dbt Labs team (then called Fishtown Analytics) released a commercial product on top of dbt Core.{{cite web |url=https://blog.getdbt.com/sinter-release-notes-august-2018-pull-request-builder-fine-grained-github-permissions-and-more/ |title=Sinter Release Notes, August 2018: pull request builder, fine-grained GitHub permissions, and more |date=2018-07-31 |access-date=2021-11-07 |archive-date=2021-11-07 |archive-url=https://web.archive.org/web/20211107174141/https://blog.getdbt.com/sinter-release-notes-august-2018-pull-request-builder-fine-grained-github-permissions-and-more/ |url-status=live }}
Funding
In April 2020, dbt Labs announced its Series A led by Andreessen Horowitz.{{cite web |date=2020-04-22 |title=Fishtown Analytics raises $12.9M Series A for its open-source analytics engineering tool |url=https://techcrunch.com/2020/04/22/fishtown-analytics-raises-12-9m-series-a-for-its-open-source-analytics-engineering-tool/ |publisher=TechCrunch |access-date=2021-11-07 |archive-date=2021-11-07 |archive-url=https://web.archive.org/web/20211107174139/https://techcrunch.com/2020/04/22/fishtown-analytics-raises-12-9m-series-a-for-its-open-source-analytics-engineering-tool/ |url-status=live }} In November 2020, dbt Labs announced its Series B led by Andreessen Horowitz and Sequoia.{{cite web |date=2020-11-11 |title=Fishtown Analytics raises $29.5M Series B for its data engineering platform |url=https://techcrunch.com/2020/11/11/fishtown-analytics-raises-29-5m-series-b-for-its-data-engineering-platform/ |publisher=TechCrunch |access-date=2021-11-07 |archive-date=2021-11-07 |archive-url=https://web.archive.org/web/20211107174140/https://techcrunch.com/2020/11/11/fishtown-analytics-raises-29-5m-series-b-for-its-data-engineering-platform/ |url-status=live }} In June 2021, dbt Labs raised its Series C led by Altimeter, Sequoia, and Andreessen Horowitz.{{cite web |date=2021-06-30 |title=Of the Community, By the Community, For the Community |url=https://blog.getdbt.com/of-the-community-by-the-community-for-the-community/ |publisher=dbt Blog |access-date=2021-11-07 |archive-date=2021-11-07 |archive-url=https://web.archive.org/web/20211107215943/https://blog.getdbt.com/of-the-community-by-the-community-for-the-community/ |url-status=live }} In February 2022, the company raised $222 million for its Series D, at a $4.2 billion valuation.{{cite web |last1=Cai |first1=Kenrick |date=24 Feb 2022 |title=Dbt Labs Raises At $4.2 Billion Valuation, $2 Billion Less Than First Planned |url=https://www.forbes.com/sites/kenrickcai/2022/02/24/dbt-labs-series-d-4-billion-less-than-planned/?sh=76c4dd1d67c3 |archive-url=https://archive.today/20220511115331/https://www.forbes.com/sites/kenrickcai/2022/02/24/dbt-labs-series-d-4-billion-less-than-planned/?sh=26c7c1cc67c3 |archive-date=11 May 2022 |access-date=11 May 2022 |website=Forbes |publisher=Forbes |language=English |quote=The Philadelphia-based data analytics startup revealed Thursday that it had settled on a $4.2 billion valuation as part of a $222 million Series D funding round}}
Overview
Dbt enables analytics engineers to transform data in their warehouses by writing SELECT statements, which it turns into tables and views. Dbt performs the transformation (T) step in extract, load, transform (ELT) processes – it does not extract or load data, but is designed to be performant at transforming data already inside a warehouse. Dbt has the goal of allowing analysts to work more like software engineers, in line with the dbt viewpoint.{{cite web |url=https://docs.getdbt.com/docs/about/viewpoint |title=dbt viewpoint |access-date=2021-11-07 |archive-date=2021-11-07 |archive-url=https://web.archive.org/web/20211107174742/https://docs.getdbt.com/docs/about/viewpoint |url-status=live }}
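The idea of turning a SELECT statement into a table or view can be illustrated with a minimal Python sketch. This is not dbt's implementation: it uses SQLite from the Python standard library as a stand-in for a warehouse, and the helper `materialize`, the table `raw_orders`, and the names `paid_orders` and `revenue` are hypothetical, chosen only for illustration.

```python
import sqlite3


def materialize(conn, name, select_sql, kind="view"):
    """Wrap a SELECT statement in DDL so that its result becomes a view or a table.

    Hypothetical helper for illustration; dbt's own materialization logic is
    more involved and depends on the target warehouse.
    """
    if kind not in ("view", "table"):
        raise ValueError("kind must be 'view' or 'table'")
    conn.execute(f"CREATE {kind.upper()} {name} AS {select_sql}")


# An in-memory SQLite database stands in for the warehouse; the raw data is
# assumed to have been loaded already (the "E" and "L" of ELT).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 10.0, "paid"), (2, 5.0, "refunded"), (3, 7.5, "paid")],
)

# Each transformation is written as a plain SELECT statement.
materialize(
    conn,
    "paid_orders",
    "SELECT id, amount FROM raw_orders WHERE status = 'paid'",
    kind="view",
)
# A later transformation can select from an earlier one.
materialize(
    conn,
    "revenue",
    "SELECT SUM(amount) AS total FROM paid_orders",
    kind="table",
)

total = conn.execute("SELECT total FROM revenue").fetchone()[0]
kinds = dict(
    conn.execute(
        "SELECT name, type FROM sqlite_master WHERE name IN ('paid_orders', 'revenue')"
    )
)
```

In this sketch the author of a transformation writes only the SELECT statement, and the surrounding `CREATE VIEW ... AS` or `CREATE TABLE ... AS` statement is generated for them.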
Dbt uses YAML files to declare properties. A seed is a type of reference table used in dbt for static or infrequently changed data (for example, country codes or lookup tables). Seeds are CSV-based and typically stored in a seeds folder.
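What loading a seed amounts to can be sketched in Python as reading a small CSV file and creating a reference table from it. This is an illustration only, not dbt's code: SQLite stands in for the warehouse, the function `load_seed` and the sample `country_codes` data are hypothetical, and every column is stored as text for simplicity.

```python
import csv
import io
import sqlite3


def load_seed(conn, table, csv_text):
    """Create a table from CSV text: the header row gives the column names.

    Hypothetical helper for illustration; all columns are created as TEXT.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{col}" TEXT' for col in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


# Sample contents of a seed file such as seeds/country_codes.csv (assumed name).
COUNTRY_CODES_CSV = "code,name\nDE,Germany\nFR,France\nJP,Japan\n"

seed_conn = sqlite3.connect(":memory:")
loaded = load_seed(seed_conn, "country_codes", COUNTRY_CODES_CSV)

# Once loaded, the seed can be queried like any other table.
country = seed_conn.execute(
    "SELECT name FROM country_codes WHERE code = 'FR'"
).fetchone()[0]
```

Because the data lives in a plain CSV file, it can be kept under version control alongside the SELECT statements that reference it.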
References
{{Reflist|30em}}
{{DEFAULTSORT:dbt}}