Extract, load, transform
Extract, load, transform (ELT) is an alternative to extract, transform, load (ETL) that is used with data lake implementations. Unlike ETL, ELT does not transform data on entry to the data lake; the data is stored in its original raw format, which shortens loading times. The trade-off is that the data processing engine must have enough processing power to carry out transformations on demand and still return results in a timely manner.{{Cite web |title=What is ELT (Extract, Load, Transform)? {{!}} IBM |url=https://www.ibm.com/topics/elt |access-date=2024-01-30 |website=www.ibm.com |date=October 2021 |language=en-us}}{{Cite web |last=Abdullahi |first=Aminu |date=2023-06-30 |title=ETL vs ELT: What Are the Main Differences and Which Is Better? |url=https://www.techrepublic.com/article/etl-vs-elt/ |access-date=2024-01-30 |website=TechRepublic |language=en-US}} Because the data is not processed on entry to the data lake, neither the queries nor the schema need to be defined in advance. In practice, a schema is often available at load time anyway, since many data sources are extracts from databases or similar structured data systems and therefore have an associated schema. ELT is a data pipeline model.{{usurped|1=[https://web.archive.org/web/20210118172247/https://deductive.com/blogs/using-amazon-redshift-spectrum-data-pipelines/ Using Redshift Spectrum to load data pipelines]}} Published by deductive.com on January 17, 2018, retrieved on April 3, 2019.{{Cite web |date=2024-01-30 |title=What is ELT (Extract, Load, Transform)? {{!}} dbt Developer Hub |url=https://docs.getdbt.com/terms/elt |access-date=2024-01-30 |website=docs.getdbt.com |language=en}}
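The load-then-transform order described above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database standing in for the data lake's processing engine; the table name <code>raw_orders</code>, the sample rows, and the function names are hypothetical and are not taken from any cited source. The extract is stored as untyped text exactly as it arrived, and trimming, casting and aggregation happen only when a query is run.

```python
import sqlite3

# Hypothetical raw extract: every field arrives as text, exactly as exported,
# including stray whitespace and inconsistent letter case.
RAW_ROWS = [
    ("1001", "2024-01-03", " 19.99", "eu"),
    ("1002", "2024-01-03", "5.00", "US"),
    ("1003", "2024-01-04", "12.50 ", "us"),
]


def load_raw(conn, rows):
    """Load step: store the extract untouched, with no cleaning or typing."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id TEXT, order_date TEXT, amount TEXT, region TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def revenue_by_region(conn):
    """Transform step: clean, cast and aggregate inside the engine, on demand."""
    query = """
        SELECT UPPER(TRIM(region)) AS region_code,
               ROUND(SUM(CAST(TRIM(amount) AS REAL)), 2) AS revenue
        FROM raw_orders
        GROUP BY region_code
        ORDER BY region_code
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
load_raw(conn, RAW_ROWS)          # fast: no per-row processing on entry
print(revenue_by_region(conn))    # transformation cost is paid at query time
```

In an ETL pipeline the trimming and casting would instead run before the insert, so the stored table would already hold cleaned, typed values. Here the raw table keeps the original strings, and a different query could later interpret the same rows in a different way.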
Benefits
Some of the benefits of an ELT process include speed and the ability to handle both structured and unstructured data.{{Cite web |last=Mishra |first=Tanya |date=2023-09-02 |title=ETL vs ELT: Meaning, Major Differences & Examples |url=https://www.analyticsinsight.net/etl-vs-elt-meaning-major-differences-examples/ |access-date=2024-01-30 |website=Analytics Insight |language=en-US}}
Cloud data lake components
=Common storage options=
- AWS
  - Simple Storage Service (S3)
  - Amazon RDS
- Azure
  - Azure Blob Storage
- GCP
  - Google Cloud Storage (GCS)
=Querying=
- AWS
  - Redshift Spectrum
  - Athena
  - EMR (Presto)
- Azure
  - Azure Data Lake
- GCP
  - BigQuery
References
{{Reflist}}
External links
- Dull, Tamara, [https://www.smartdatacollective.com/data-lake-debate-pro-s-first/ "The Data Lake Debate: Pro is Up First"], smartdatacollective.com, March 20, 2015.
- [https://www.astera.com/type/blog/elt-extract-load-and-transform/ "ELT: Extract, Load, and Transform A Complete Guide"], Astera Software.