tidyverse
{{short description|Collection of R packages}}
{{Infobox software
| title = Tidyverse
| logo = Tidyverse hex logo.svg
| logo caption = The tidyverse hex logo
| logo alt = A black hexagon logo with the word "tidyverse" in white letters in the middle, with smaller colorful hexagons scattered throughout the larger black hexagon
| logo size =
| collapsible =
| screenshot =
| screenshot size =
| screenshot alt =
| caption =
| other_names =
| author =
| developer =
| released = {{Start date and age|2016|09|15|df=no}}{{cite web |last1=Wickham |first1=Hadley |title=tidyverse 1.0.0 |url=https://posit.co/blog/tidyverse-1-0-0/ |website=Posit Software, PBC}}{{cite web |last1=Wickham |first1=Hadley |date=April 15, 2025 |title=A personal history of the tidyverse |url=https://hadley.github.io/25-tidyverse-history/index.pdf}}
| ver layout =
| discontinued =
| latest release version =
| latest release date =
| latest preview version =
| latest preview date =
| repo = {{URL|https://github.com/tidyverse/tidyverse}}
| qid =
| programming language = R
| middleware =
| engine =
| operating system =
| platform =
| included with =
| replaces =
| replaced_by =
| service_name =
| size =
| standard =
| language =
| language count =
| language footnote =
| genre = Package collection
| license = MIT
| website = {{official URL|www.tidyverse.org}}
| AsOf =
}}
{{Portal|Free software}}
The tidyverse is a collection of open-source packages for the R programming language introduced by Hadley Wickham{{Cite web|url=https://blog.revolutionanalytics.com/2016/09/tidyverse.html|title=Welcome to the Tidyverse|website=Revolutions|access-date=2018-11-26}} and his team that "share an underlying design philosophy, grammar, and data structures" of tidy data.{{Cite web|url=https://www.tidyverse.org/|title=Tidyverse|website=www.tidyverse.org|language=en-us|access-date=2018-11-26}} Characteristic features of tidyverse packages include extensive use of non-standard evaluation and encouragement of piping.{{Citation|author1=Stefan Milton Bache |author2=Hadley Wickham|title=magrittr: A Forward-Pipe Operator for R|date=2014-11-22|url=https://cran.r-project.org/package=magrittr|access-date=2020-04-20}}{{Cite book|last=Wickham|first=Hadley|url=https://style.tidyverse.org/pipes.html|title=4 Pipes {{!}} The tidyverse style guide}}{{cite book |last1=Wickham |first1=Hadley |title=Advanced R |date=May 30, 2019 |publisher=Chapman & Hall |isbn=978-0815384571 |edition=2nd |location=New York}}
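The two characteristic features named above, piping and non-standard evaluation, can be illustrated with a minimal sketch. It assumes the dplyr package is installed and uses the <code>mtcars</code> data set that ships with R; the choice of columns and the variable names <code>nested</code> and <code>piped</code> are purely illustrative.

```r
# Illustrative sketch only; assumes the dplyr package is installed.
library(dplyr)

# Nested style: the calls must be read from the inside out.
nested <- head(arrange(filter(mtcars, cyl == 4), desc(mpg)), 3)

# Piped style: each step passes its result to the next one.
# `cyl` and `mpg` are written as bare names and are looked up as
# columns of the data frame (non-standard evaluation).
piped <- mtcars %>%
  filter(cyl == 4) %>%      # keep four-cylinder cars
  arrange(desc(mpg)) %>%    # sort by fuel economy, highest first
  head(3)                   # keep the first three rows

# Both styles describe the same computation.
identical(nested, piped)
```

The piped form expresses the same sequence of operations as the nested form, but in the order in which they are carried out.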
As of November 2018, the tidyverse package and some of its individual packages accounted for 5 of the top 10 most downloaded R packages.{{Cite web|url=https://www.rdocumentation.org/trends|title=RDocumentation|website=www.rdocumentation.org|access-date=2018-11-26}} The tidyverse is the subject of multiple books and papers.{{Cite journal|last=Duggan|first=Jim|date=2018-09-07|title=Input and output data analysis for system dynamics modelling using the tidyverse libraries of R|journal=System Dynamics Review|volume=34|issue=3|pages=438–461|language=en|doi=10.1002/sdr.1600|issn=0883-7066|hdl=10379/15029|s2cid=70005357|hdl-access=free}}{{Cite book|url=https://books.google.com/books?id=_iVFgKTRYrQC&q=ggplot2|title=R Graphics Cookbook|last=Chang|first=Winston|date=2013|publisher="O'Reilly Media, Inc."|isbn=9781449316952|language=en}}{{Cite book|title=Data wrangling with R|last=Boehmke|first=Bradley C.|isbn=9783319455990|publisher=Springer |location=Cham|oclc=964404346|date = 2016-11-17}}{{Cite book|title=R for data science : import, tidy, transform, visualize, and model data|last=Wickham|first=Hadley|others=Grolemund, Garrett|isbn=9781491910399|edition= First|publisher=O'Reilly Media|location=Sebastopol, CA|oclc=968213225|year = 2017}} In 2019, a paper describing the ecosystem was published in the Journal of Open Source Software.{{cite journal |last1=Wickham |first1=Hadley |last2=Averick |first2=Mara |last3=Bryan |first3=Jennifer |last4=Chang |first4=Winston |last5=McGowan |first5=Lucy D'Agostino |last6=François |first6=Romain |last7=Grolemund |first7=Garrett |last8=Hayes |first8=Alex |last9=Henry |first9=Lionel |last10=Hester |first10=Jim |last11=Kuhn |first11=Max |last12=Pedersen |first12=Thomas Lin |last13=Miller |first13=Evan |last14=Bache |first14=Stephan Milton |last15=Müller |first15=Kirill |last16=Ooms |first16=Jeroen |last17=Robinson |first17=David |last18=Seidel |first18=Dana Paige |last19=Spinu |first19=Vitalie |last20=Takahashi |first20=Kohske |last21=Vaughan |first21=Davis
|last22=Wilke |first22=Claus |last23=Woo |first23=Kara |last24=Yutani |first24=Hiroaki |title=Welcome to the Tidyverse |journal=Journal of Open Source Software |date=21 November 2019 |volume=4 |issue=43 |pages=1686 |doi=10.21105/joss.01686 |bibcode=2019JOSS....4.1686W |s2cid=214002773 |doi-access=free }}
Its syntax has been referred to as "supremely readable",{{Cite web |last=Steinmetz |first=Art |date=2024-04-10 |title=Outsider Data Science - The Truth About Tidy Wrappers |url=https://outsiderdata.netlify.app/posts/2024-04-10-the-truth-about-tidy-wrappers/benchmark_wrappers.html |access-date=2024-04-11 |website=outsiderdata.netlify.app |language=en}} and some{{Cite web |last=Heppler |first=Jason |date=2018-02-27 |title=Teaching the tidyverse to R novices |url=https://medium.com/@jaheppler/teaching-the-tidyverse-to-r-novices-7747e8ce14e |access-date=2023-08-24 |website=Medium |language=en}} have argued that the tidyverse is an effective way to introduce complete beginners to programming, because it allows students to begin doing data processing tasks quickly.{{Cite web |last=Robinson |first=David |title=Teach the tidyverse to beginners |url=http://varianceexplained.org/r/teach-tidyverse/ |access-date=2022-07-15 |website=Variance Explained |date=5 July 2017 |language=en}} Moreover, some practitioners have pointed out that data processing tasks are intuitively easier to chain together with the tidyverse than with pandas, Python's equivalent data processing package.{{Cite web |title=Why pandas feels clunky when coming from R |url=https://sumsar.net/blog/pandas-feels-clunky-when-coming-from-r/ |access-date=2024-03-30 |website=Rasmus Bååth's Blog |language=en-us}} There is also an active R community around the tidyverse.
For example, the TidyTuesday social data project, organised by the Data Science Learning Community (DSLC),{{Cite web |title=dslc.io |url=https://dslc.io/ |access-date=2024-08-11 |website=dslc.io |language=en}} releases varied real-world datasets each week so that community members can practice working with data and share their results.{{Citation |title=rfordatascience/tidytuesday |date=2024-08-11 |url=https://github.com/rfordatascience/tidytuesday |access-date=2024-08-11 |publisher=Data Science Learning Community}} Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their built-in base R equivalents and that are too dissimilar to some programming languages.{{cite web |last1=Matloff |first1=Norm |date=30 September 2019 |title=An opinionated view of the Tidyverse "dialect" of the R language |url=https://github.com/matloff/TidyverseSkeptic |access-date=28 October 2019 |website=GitHub}}{{cite web |last1=Muenchen |first1=Bob |date=23 March 2017 |title=The Tidyverse Curse |url=http://r4stats.com/2017/03/23/the-tidyverse-curse/ |website=r4stats.com |language=en}}
More generally, the tidyverse principles encourage the development of a universe of streamlined packages that, in principle, helps alleviate dependency issues and maintain compatibility with current and future features.{{Cite web |title=The Power of Transitioning to a '-verse' Approach in R Package Development |url=https://www.appsilon.com/post/the-power-of-transitioning-to-a-verse |access-date=2024-08-11 |website=www.appsilon.com |language=en}} An example of this approach is the pharmaverse, a collection of R packages for clinical reporting in the pharmaceutical industry.{{Cite web |title=pharmaverse |url=https://pharmaverse.org/ |access-date=2024-08-11 |website=pharmaverse.org}}
Packages
The core tidyverse packages, which provide functionality to model, transform, and visualize data, include:{{Cite news|url=https://www.tidyverse.org/packages/|title=Tidyverse packages - Tidyverse|access-date=2018-11-26|language=en-us}}
- ggplot2 – for data visualization
- dplyr – for wrangling and transforming data
- [https://tidyr.tidyverse.org/ tidyr] – helps transform data into tidy data, where each variable is a column, each observation is a row, and each value is a cell
- [https://readr.tidyverse.org/ readr] – helps read in data from common delimited text files
- [https://purrr.tidyverse.org/ purrr] – a functional programming toolkit
- [https://tibble.tidyverse.org/ tibble] – a modern implementation of the built-in data frame data structure
- [https://stringr.tidyverse.org/ stringr] – helps to manipulate string data types
- [https://forcats.tidyverse.org/ forcats] – helps to manipulate categorical data (factors)
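
The notion of tidy data described in the list above can be sketched with a short example. It assumes the tibble and tidyr packages are installed; the table <code>wide</code>, its values, and the column names <code>country</code>, <code>year</code> and <code>cases</code> are hypothetical and chosen only for illustration.

```r
# Illustrative sketch only; assumes the tibble and tidyr packages are installed.
library(tibble)
library(tidyr)

# A hypothetical "wide" table: the variable "year" is spread across
# two column names instead of being stored in a column of its own.
wide <- tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 22)
)

# Reshape so that each variable is a column, each observation
# (one country in one year) is a row, and each value is a cell.
long <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "cases"
)

print(long)
```

In this sketch the reshaped table has one row per country and year, with the former column names stored as values of <code>year</code>.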
Additional packages assist the core collection.{{Cite web|title=Tidyverse packages|url=https://www.tidyverse.org/packages/|access-date=2020-12-22|website=www.tidyverse.org|language=en-us}} Other packages based on the tidy data principles are regularly developed, such as tidytext{{Citation |last=Silge |first=Julia |title=tidytext: Text mining using tidy tools |date=2023-02-01 |url=https://github.com/juliasilge/tidytext |access-date=2023-02-03}} for text analysis, tidymodels{{Cite web |title=Tidymodels |url=https://www.tidymodels.org/ |access-date=2023-02-03 |website=www.tidymodels.org |language=en-us}} for machine learning, or tidyquant{{Cite web |title=Tidy Quantitative Financial Analysis |url=https://business-science.github.io/tidyquant/ |access-date=2023-02-03 |website=business-science.github.io |language=en}} for financial operations.
References
{{Reflist}}
{{R (programming language)}}
Category:Data analysis software