YugabyteDB
{{primary sources|date=January 2022}}
{{Short description|Transactional distributed SQL database}}
{{Infobox software
| title =
| name = YugabyteDB
| logo = YugabyteLogo.png
| author = Kannan Muthukkaruppan, Karthik Ranganathan, Mikhail Bautin
| developer = Yugabyte, Inc.
| released = {{Start date and age|2016}}
| latest release version = 2024.2 (Stable)
2.23 (Preview)
| latest release date = {{Start date and age|2024|12|09}}
{{Start date and age|2024|09|13}}
| programming language = C++
| operating system = Alma Linux 8.x and derivatives, macOS
| platform = Bare Metal, Virtual Machine, Docker, Kubernetes and various container management platforms
| size =
| language = English
| genre = RDBMS
| license = Apache 2.0
| standard =
}}
{{Infobox company
| name = Yugabyte, Inc.
| logo =
| type = Private
| industry = Database
| founded = {{start date and age|2016|02}}
| founder = Kannan Muthukkaruppan, Karthik Ranganathan, Mikhail Bautin
| hq_location = Sunnyvale, California, US
| key_people = Kannan Muthukkaruppan (co-founder & president, product development)<br />Karthik Ranganathan (co-founder & CTO)<br />Mikhail Bautin (co-founder & software architect)<br />Bill Cook (CEO)
| services = Commercial database management systems
| website = {{URL| yugabyte.com}}
}}
YugabyteDB is a high-performance transactional distributed SQL database for cloud-native applications, developed by Yugabyte.{{cite web |url=https://db-engines.com/en/system/YugabyteDB |website=DB-Engines |title=YugabyteDB System Properties |access-date=30 December 2021}}
History
Yugabyte was founded by former Facebook engineers Kannan Muthukkaruppan, Karthik Ranganathan, and Mikhail Bautin. At Facebook, they were part of the team that built and operated Cassandra and HBase{{cite web |title=Karthik Ranganathan |url=https://www.dataversity.net/contributors/karthik-ranganathan/ |website=Dataversity |access-date=30 December 2021}}{{cite book |url=https://dl.acm.org/doi/abs/10.1145/1989323.1989438 |website=Association for Computing Machinery |year=2011 |doi=10.1145/1989323.1989438 |access-date=15 January 2022|last1=Borthakur |first1=Dhruba |last2=Rash |first2=Samuel |last3=Schmidt |first3=Rodrigo |last4=Aiyer |first4=Amitanand |last5=Gray |first5=Jonathan |last6=Sarma |first6=Joydeep Sen |last7=Muthukkaruppan |first7=Kannan |last8=Spiegelberg |first8=Nicolas |last9=Kuang |first9=Hairong |last10=Ranganathan |first10=Karthik |last11=Molkov |first11=Dmytro |last12=Menon |first12=Aravind |title=Proceedings of the 2011 ACM SIGMOD International Conference on Management of data |chapter=Apache hadoop goes realtime at Facebook |page=1071 |isbn=9781450306614 |s2cid=207188340 }} for workloads such as Facebook Messenger and Facebook's Operational Data Store.{{cite web |title=YugaByte Raises $8M in Series A Funding |access-date=30 December 2021|url=https://www.finsmes.com/2017/11/yugabyte-raises-8m-in-series-a-funding.html |website=FINSMES|date=2 November 2017 }}
The founders came together in February 2016 to build YugabyteDB.{{cite web |title=Yugabyte CTO outlines a PostgreSQL path to distributed cloud |url=https://venturebeat.com/2021/07/26/yugabyte-cto-outlines-a-postgresql-path-to-distributed-cloud/ |website=VentureBeat |date=26 July 2021 |access-date=31 December 2021}}{{cite web |title=Yugabyte expands its fully managed enterprise cloud service with $188M |access-date=30 December 2021|url=https://venturebeat.com/2021/10/28/yugabyte-expands-its-fully-managed-enterprise-cloud-service-with-188m/ |website=VentureBeat|date=28 October 2021 }}
YugabyteDB was initially available in two editions: community and enterprise. In July 2019, Yugabyte open-sourced previously commercial features and launched YugabyteDB as open-source under the Apache 2.0 license.{{cite web |title=Yugabyte Expands Multi-Region Database Capabilities and Enterprise-Grade Security with YugabyteDB 2.5 |url=https://www.businesswire.com/news/home/20201112005160/en/Yugabyte-Expands-Multi-Region-Database-Capabilities-and-Enterprise-Grade-Security-with-YugabyteDB-2.5 |website=businesswire.com |date=12 November 2020 |access-date=30 November 2024}}
= Funding =
In October 2021, five years after the company's inception, Yugabyte closed a $188 million Series C funding round, becoming a unicorn startup with a valuation of $1.3 billion.{{cite web |title=Another cloud native SQL database unicorn: Yugabyte raises $188M Series C funding at $1.3B valuation |url=https://www.zdnet.com/article/another-globally-distributed-cloud-native-sql-database-unicorn-yugabyte-raises-188m-series-c-funding-at-1-3b-valuation/ |website=ZDNet |access-date=12 January 2022}}
Architecture
YugabyteDB is a distributed SQL database that aims to be strongly transactionally consistent across failure zones (i.e. ACID compliance).{{cite web |title=ACID Transactions |url=https://devopedia.org/acid-transactions |website=Devopedia |date=18 August 2019 |access-date=12 January 2022}}{{cite web |title= ICT Solutions for local flexibility markets |url=https://www.conferenceie.ase.ro/wp-content/uploads/2020/06/ProceedingsIE2020/ict_solutions_for_local_flexibility_markets.pdf |website=Academia de Studii Economice din Bucuresti |publisher=Proceedings of the IE 2020 International Conference |access-date=15 January 2022}} YugabyteDB has never fully passed Jepsen testing, the de facto industry standard for verifying correctness, mainly due to race conditions during schema changes.{{cite web |title=YugaByte DB 1.3.1 |url=https://jepsen.io/analyses/yugabyte-db-1.3.1 |access-date=30 December 2021|website=Jepsen.io}} In CAP theorem terms, YugabyteDB is a consistent and partition-tolerant (CP) database.{{cite web |title=YugaByteDB: A Distributed Cloud Native Database for a Highly Scalable Data Store |url=https://www.opensourceforu.com/2020/09/yugabytedb-a-distributed-cloud-native-database-for-a-highly-scalable-data-store/ |website=Open Source Foru |date=14 September 2020 |access-date=15 January 2022}}{{cite web |title=Yugabyte Design Goals |url=https://docs.yugabyte.com/latest/architecture/design-goals/ |website=Yugabyte.com |access-date=15 January 2022}}{{cite journal |title=A Generic and Extensible Core and Prototype of Consistent, Distributed, and Resilient LIS |year=2020 |doi=10.3390/ijgi9070437 |doi-access=free |last1=Galić |first1=Zdravko |last2=Vuzem |first2=Mario |journal=ISPRS International Journal of Geo-Information |volume=9 |issue=7 |page=437 |bibcode=2020IJGI....9..437G }}
YugabyteDB has two layers:{{cite web |title=Yugabyte Layered Architecture |url=https://docs.yugabyte.com/latest/architecture/layered-architecture/ |website=Yugabyte |access-date=15 January 2022}} a storage engine known as DocDB and the Yugabyte Query Layer.{{cite web |last1=Hirsch |first1=Orhan Henrik |title=Scalability of NewSQL Databases in a Cloud Environment |url=https://ntnuopen.ntnu.no/ntnu-xmlui/bitstream/handle/11250/2777732/no.ntnu%3ainspera%3a57320302%3a31535683.pdf?sequence=1&isAllowed=y |website=Norwegian University of Science and Technology |publisher=NTNU Open |access-date=15 January 2022}}
= DocDB =
The storage engine consists of a customized RocksDB combined with sharding and load-balancing algorithms for the data. In addition, the Raft consensus algorithm controls the replication of data between the nodes. A distributed transaction manager and multiversion concurrency control (MVCC) support distributed transactions.{{cite web |last1=Budholia |first1=Akash |title=NewSQL Monitoring System |url=https://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1996&context=etd_projects |website=San Jose State University Scholar Works |access-date=15 January 2022}}
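The general idea behind MVCC can be illustrated with a minimal Python sketch. It is not DocDB's implementation: the class <code>MVCCStore</code>, its methods, and the integer timestamps are hypothetical names chosen for illustration. Each write adds a new version tagged with a commit timestamp, and a read at a given timestamp sees the newest version committed at or before it.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Any, Optional


class MVCCStore:
    """Toy multiversion key-value store (illustrative only, not DocDB)."""

    def __init__(self) -> None:
        # key -> list of (commit_ts, value), kept sorted by commit_ts
        self._versions = defaultdict(list)

    def write(self, key: str, value: Any, commit_ts: int) -> None:
        """Add a new version instead of overwriting the old one."""
        versions = self._versions[key]
        idx = bisect_right([ts for ts, _ in versions], commit_ts)
        versions.insert(idx, (commit_ts, value))

    def read(self, key: str, read_ts: int) -> Optional[Any]:
        """Return the newest value committed at or before read_ts."""
        versions = self._versions.get(key, [])
        idx = bisect_right([ts for ts, _ in versions], read_ts)
        return versions[idx - 1][1] if idx else None


store = MVCCStore()
store.write("balance", 100, commit_ts=10)
store.write("balance", 80, commit_ts=20)
snapshot = store.read("balance", read_ts=15)  # reader pinned between the writes
```

Because older versions are retained, a reader pinned to an earlier timestamp is unaffected by later writes, which is what allows readers and writers to proceed without blocking each other in this model.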
The engine also exploits a Hybrid Logical Clock{{cite web |title=Hybrid Clock |url=https://martinfowler.com/articles/patterns-of-distributed-systems/hybrid-clock.html |website=Martin Fowler |access-date=30 December 2021}} that combines coarsely-synchronized physical clocks with Lamport clocks to track causal relationships.{{cite web |title=Distributed Transactions without Atomic Clocks |url=https://www.yugabyte.com/wp-content/uploads/2021/05/Distributed-Transactions-Without-Atomic-Clocks.pdf |website=Yugabyte |access-date=15 January 2022}}
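A hybrid logical clock of this kind can be sketched in Python as follows. This is the textbook algorithm under stated assumptions, not YugabyteDB's code: the names <code>Timestamp</code> and <code>HybridLogicalClock</code>, and the injectable <code>wall_clock</code> function, are hypothetical. A timestamp pairs a physical component with a Lamport-style counter, so an event that follows a received message always sorts after it, even when the local physical clock lags.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True, order=True)
class Timestamp:
    physical: int  # wall-clock component (here: microseconds)
    logical: int   # Lamport-style counter that breaks ties


class HybridLogicalClock:
    """Textbook hybrid logical clock (illustrative sketch)."""

    def __init__(self, wall_clock: Callable[[], int] = lambda: time.time_ns() // 1000):
        self._wall_clock = wall_clock
        self._last = Timestamp(0, 0)

    def now(self) -> Timestamp:
        """Timestamp for a local event or an outgoing message."""
        wall = self._wall_clock()
        if wall > self._last.physical:
            self._last = Timestamp(wall, 0)
        else:
            # Physical clock has not advanced: bump the logical counter.
            self._last = Timestamp(self._last.physical, self._last.logical + 1)
        return self._last

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp received from another node."""
        wall = self._wall_clock()
        physical = max(wall, self._last.physical, remote.physical)
        if physical == self._last.physical and physical == remote.physical:
            logical = max(self._last.logical, remote.logical) + 1
        elif physical == self._last.physical:
            logical = self._last.logical + 1
        elif physical == remote.physical:
            logical = remote.logical + 1
        else:
            logical = 0
        self._last = Timestamp(physical, logical)
        return self._last


# Node B's physical clock lags node A's by 10 units.
node_a = HybridLogicalClock(wall_clock=lambda: 100)
node_b = HybridLogicalClock(wall_clock=lambda: 90)
sent = node_a.now()
received = node_b.update(sent)
later = node_b.now()
```

In this example the receive event on node B is ordered after the send on node A despite B's slower clock, because the logical counter advances when the physical components tie.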
= YugabyteDB Query Layer =
YugabyteDB has a pluggable query layer that is decoupled from the storage layer below.{{cite web |title=Yugabyte DB 2.0 Ships Production-Ready Distributed SQL Database for Going Cloud Native |url=https://www.idevnews.com/stories/7298/Yugabyte-DB-20-Ships-Production-Ready-Distributed-SQL-Database-for-Going-Cloud-Native |website=Integration Developer News |access-date=15 January 2022}} There are currently two APIs that can access the database:
YSQL{{cite web |title=Yugabyte Structured Query Language (YSQL) |url=https://docs.yugabyte.com/latest/api/ysql/ |website=Yugabyte |access-date=15 January 2022}} is a PostgreSQL code-compatible API{{cite web |title=Yugabyte Meets Developer Demand for Comprehensive PostgreSQL Compatibility with YugabyteDB 2.11 |url=https://www.businesswire.com/news/home/20211123005572/en/Yugabyte-Meets-Developer-Demand-for-Comprehensive-PostgreSQL-Compatibility-with-YugabyteDB-2.11 |website=BusinessWire |date=23 November 2021 |access-date=15 January 2022}}{{cite web |title=PostgreSQL Compatibility in YugabyteDB 2.0 |url=https://blog.yugabyte.com/postgresql-compatibility-in-yugabyte-db-2-0/ |website=Yugabyte|date=17 September 2019 }} based around v11.2. YSQL is accessed via standard PostgreSQL drivers using native protocols.{{cite web |title=Client Drivers for YSQL |url=https://docs.yugabyte.com/latest/reference/drivers/ysql-client-drivers/ |website=Yugabyte}} It exploits the native PostgreSQL code for the query layer{{cite web |title=Why We Built YugabyteDB by Reusing the PostgreSQL Query Layer |url=https://blog.yugabyte.com/why-we-built-yugabytedb-by-reusing-the-postgresql-query-layer/ |website=Yugabyte |date=24 April 2020 |access-date=15 January 2022}} and replaces the storage engine with calls to the pluggable query layer. This re-use means that Yugabyte supports many features, including:
- Triggers & Stored Procedures
- PostgreSQL extensions that operate in the query layer
- Native JSONB support
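Because YSQL speaks the PostgreSQL wire protocol, an ordinary PostgreSQL driver can be pointed at a YugabyteDB node. The following Python sketch uses the third-party <code>psycopg2</code> driver. The helper names <code>ysql_dsn</code> and <code>fetch_server_version</code> are hypothetical, and the host, port 5433, database name and user are assumptions about a default local installation that are not stated in this article; adjust them to the actual deployment.

```python
def ysql_dsn(host: str = "127.0.0.1", port: int = 5433,
             dbname: str = "yugabyte", user: str = "yugabyte") -> str:
    """Build a libpq-style connection string (defaults are assumptions)."""
    return f"host={host} port={port} dbname={dbname} user={user}"


def fetch_server_version(dsn: str) -> str:
    """Run a trivial query through a standard PostgreSQL driver."""
    import psycopg2  # third-party driver, imported lazily

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            return cur.fetchone()[0]
    finally:
        conn.close()


dsn = ysql_dsn()
# fetch_server_version(dsn) requires a reachable YSQL endpoint.
```

Nothing in the sketch is specific to YugabyteDB apart from the connection parameters, which is the practical consequence of reusing the PostgreSQL protocol.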
YCQL{{cite web |title=Yugabyte Cloud Query Language (YCQL) |url=https://docs.yugabyte.com/latest/api/ycql/ |website=Yugabyte |access-date=15 January 2022}} is a Cassandra-like API based around v3.10 and re-written in C++. YCQL is accessed via standard Cassandra drivers{{cite web |title=Client drivers for YCQL |url=https://docs.yugabyte.com/latest/reference/drivers/ycql-client-drivers/ |website=Yugabyte}} using the native protocol port of 9042. In addition to the 'vanilla' Cassandra components, YCQL is augmented with the following features:
- Transactional consistency - unlike Cassandra, Yugabyte YCQL is transactional.{{cite web |title=ACID Transactions |url=https://docs.yugabyte.com/latest/develop/learn/acid-transactions-ycql/ |website=Yugabyte}}
- JSON data types supported natively{{cite web |title=YCQL JSONB Data Type |url=https://docs.yugabyte.com/latest/api/ycql/type_jsonb/ |website=Yugabyte |access-date=15 January 2022}}
- Tables can have secondary indexes{{cite web |title=YCQL Secondary Indexes |url=https://docs.yugabyte.com/latest/develop/learn/data-modeling-ycql/#secondary-indexes |website=Yugabyte |access-date=15 January 2022}}
Currently, data written to either API is not accessible via the other API; however, YSQL can access YCQL using the PostgreSQL foreign data wrapper feature.{{cite web |title=YugabyteDB: Postgres foreign data wrapper |url=https://gruchalski.com/posts/2021-11-08-yugabytedb-postgres-foreign-data-wrapper/ |website=Gruchalski |date=8 November 2021 |access-date=15 January 2022}}
The security model for accessing the system is inherited from the API, so YSQL access controls resemble those of PostgreSQL,{{cite web |title=YSQL Access Control |url=https://docs.yugabyte.com/latest/secure/authorization/rbac-model/ |website=Yugabyte |access-date=15 January 2022}} and YCQL access controls resemble those of Cassandra.{{cite web |title=YCQL Access Controls |url=https://docs.yugabyte.com/latest/secure/authorization/rbac-model-ycql/ |website=Yugabyte |access-date=15 January 2022}}
Cluster-to-cluster replication
In addition to its core functionality of distributing a single database, YugabyteDB can replicate between database instances.{{cite web |title=Yugabyte Expands Multi-Region Database Capabilities and Enterprise-Grade Security with YugabyteDB 2.5 |url=https://www.businesswire.com/news/home/20201112005160/en/Yugabyte-Expands-Multi-Region-Database-Capabilities-and-Enterprise-Grade-Security-with-YugabyteDB-2.5 |website=Business Wire |date=12 November 2020 |access-date=15 January 2022}}{{cite web |title=xCluster Replication |url=https://docs.yugabyte.com/latest/architecture/docdb-replication/async-replication/ |website=Yugabyte |access-date=15 January 2022}} The replication can be one-way or bi-directional and is asynchronous.
One-way replication is used either to create a read-only copy for workload off-loading or in a read-write mode to create an active-passive standby.
Bi-directional replication is generally used in read-write configurations, such as active-active deployments and geo-distributed applications.
Migration tooling
Yugabyte also provides YugabyteDB Voyager, tooling to facilitate the migration of Oracle and other similar databases to YugabyteDB.{{cite web |title=Yugabyte simplifies SQL database migration with YugabyteDB Voyager |url=https://siliconangle.com/2023/01/24/yugabyte-simplifies-sql-database-migration-yugabytedb-voyager/?hss_channel=lcp-10643910 |website=siliconANGLE |date=24 January 2023 |access-date=15 March 2023}}{{cite web |title=Yugabyte chomps into cloud migration |url=https://www.techzine.eu/blogs/data-management/101380/yugabyte-chomps-into-cloud-migration/ |website=Techzine|date=2 February 2023 |access-date=15 March 2023}} This tool supports the migration of schemas, procedural code and data from the source platform to YugabyteDB.
See also
{{Portal|Free and open-source software}}
References
{{reflist|30em}}
External links
- {{Official website|https://www.yugabyte.com/}}
- {{GitHub|yugabyte/yugabyte-db}}
- [https://yugabytedb.tips/ YugabyteDB Tips]
Category:Bigtable implementations
Category:Database-related software for Linux
Category:Distributed computing