arXiv
{{Short description|Online archive of e-preprints}}
{{lowercase title}}
{{Distinguish|archive.org{{!}}Internet Archive (archive.org)}}
{{Use mdy dates|date=May 2016}}
{{Infobox website
| name = arXiv
| logo = ArXiv logo 2022.svg
| logo_size =
| screenshot = ArXiv-org screenshot 20140706.png
| url = {{official URL}}
| commercial = No
| type = Science
| language = English
| owner = Cornell University
| author = Paul Ginsparg
| oclc = 228652809
| issn = 2331-8422
| current_status = Online
| launch_date = {{start date and age|1991|8|14}}
}}
arXiv (pronounced as "archive"—the X represents the Greek letter chi ⟨χ⟩){{cite web |url=http://ezramagazine.cornell.edu/FALL12/CoverStorySidebar2.html |title=Library-managed 'arXiv' spreads scientific advances rapidly and worldwide |work=Ezra |date=Fall 2012 |page=9 |archive-url=https://web.archive.org/web/20150111003819/http://ezramagazine.cornell.edu/FALL12/CoverStorySidebar2.html |archive-date=January 11, 2015 |url-status=live |first=Bill |last=Steele |publisher=Cornell University |location=Ithaca, New York |volume=V |issue=1 |oclc=263846378 |quote=Pronounce it 'archive'. The X represents the Greek letter chi {{bracket|{{nnbsp}}χ{{nnbsp}}}}.}} is an open-access repository of electronic preprints and postprints (known as e-prints) approved for posting after moderation, but not peer reviewed. It consists of scientific papers in the fields of mathematics, physics, astronomy, electrical engineering, computer science, quantitative biology, statistics, mathematical finance, and economics, which can be accessed online. In many fields of mathematics and physics, almost all scientific papers are self-archived on the arXiv repository before publication in a peer-reviewed journal. Some publishers also grant permission for authors to archive the peer-reviewed postprint. 
Begun on August 14, 1991, arXiv.org passed the half-million-article milestone on October 3, 2008,{{cite arXiv|last=Ginsparg|first=Paul |eprint=1108.2700 |title=It was twenty years ago today ...|class= cs.DL|date=2011 }}{{cite web|url=http://news.library.cornell.edu/content/online-scientific-repository-hits-milestone |title=Online Scientific Repository Hits Milestone: With 500,000 Articles, arXiv Established as Vital Library Resource |publisher=News.library.cornell.edu |date=October 3, 2008 |access-date=July 21, 2013}} reached one million by the end of 2014{{citation|title=One Million Preprints and Counting: A conversation with arXiv founder Paul Ginsparg |journal=The Scientist |first=Tracy|last=Vence |date=December 29, 2014 |url=http://www.the-scientist.com/?articles.view/articleNo/41677/title/Q-A--One-Million-Preprints-and-Counting}} and two million by the end of 2021.{{Cite web |title=Monthly Submissions |url=https://arxiv.org/stats/monthly_submissions |access-date=2023-05-16 |website=arxiv.org}}{{Cite web |title=Reports – arXiv info |url=https://info.arxiv.org/about/reports/index.html |access-date=2023-05-16 |website=info.arxiv.org}} As of November 2024, the submission rate is about 24,000 articles per month.{{cite web|url=https://arxiv.org/show_monthly_submissions |title=arXiv monthly submission rate statistics |publisher=Arxiv.org |access-date=November 19, 2024}}
History
File:ArXiv 1994.png (caption: At the time, HTML forms were a new technology.)
File:ArXiv's yearly submission rate plot.svg
arXiv was made possible by the compact TeX file format, which allowed scientific papers to be easily transmitted over the Internet and rendered client-side.{{cite journal |last1=O'Connell|first1=Heath|title=Physicists Thriving with Paperless Publishing |url=https://core.ac.uk/download/pdf/11876491.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://core.ac.uk/download/pdf/11876491.pdf |archive-date=2022-10-09 |url-status=live |journal=High Energy Physics Libraries Webzine|volume=6|issue=6|pages=3|year=2002 |arxiv=physics/0007040 |bibcode=2000physics...7040O}} Around 1990, Joanne Cohn began emailing physics preprints to colleagues as TeX files, but the number of papers being sent soon filled mailboxes to capacity.{{Cite journal|last=Feder|first=Toni|title=Joanne Cohn and the email list that led to arXiv |date=2021-11-08|url=https://physicstoday.scitation.org/do/10.1063/PT.6.4.20211108a/abs/|journal=Physics Today|volume=2021|issue=4|pages=1108a|language=EN|doi=10.1063/PT.6.4.20211108a|bibcode=2021PhT..2021d1108.|s2cid=244015728}} Paul Ginsparg recognized the need for central storage, and in August 1991 he created a central repository mailbox stored at the Los Alamos National Laboratory (LANL) that could be accessed from any computer.{{cite journal |last1=Feder |first1=Toni |title=Joanne Cohn and the email list that led to arXiv |url=https://physicstoday.scitation.org/do/10.1063/PT.6.4.20211108a/full/ |journal=Physics Today |date=8 November 2021 |volume=2021 |issue=4 |pages=1108a |doi=10.1063/PT.6.4.20211108a|bibcode=2021PhT..2021d1108. |s2cid=244015728 }} Additional modes of access were soon added: FTP in 1991, Gopher in 1992, and the World Wide Web in 1993.{{cite web|last=Ginsparg|first=Paul|date=October 1, 2008|title=The global-village pioneers|url=https://physicsworld.com/a/the-global-village-pioneers/|access-date=October 10, 2020|work=Physics World}} The term e-print was quickly adopted to describe the articles.
arXiv began as a physics archive, called the LANL preprint archive, but soon expanded to include astronomy, mathematics, computer science, quantitative biology and, later, statistics. Its original domain name was xxx.lanl.gov. Because of LANL's lack of interest in the rapidly expanding technology, Ginsparg moved to Cornell University in 2001 and renamed the repository arXiv.org.{{cite journal
| last = Butler | first = Declan
| s2cid = 1527860
| title = Los Alamos Loses Physics Archive as Preprint Pioneer Heads East
| journal = Nature
| date = July 5, 2001
| pages = 3–4
| volume = 412 | issue = 6842
| doi = 10.1038/35083708
| pmid = 11452262
| bibcode = 2001Natur.412....3B
| doi-access = free
}}
Ginsparg brainstormed the new name with his wife; the domain "archive" was already claimed, so "ch" was replaced with "X", standing in for the Greek letter chi, and the final "e" was dropped for symmetry around the "X".{{Cite magazine |last=Han |first=Sheon |title=Inside arXiv—the Most Transformative Platform in All of Science |url=https://www.wired.com/story/inside-arxiv-most-transformative-code-science/ |access-date=2025-03-28 |magazine=Wired |language=en-US |issn=1059-1028}}
arXiv was an early adopter and promoter of preprints.{{cite web |title=Celebrating 30 Years of arXiv and Its Lasting Legacy on Scientific Advancement |url=https://sparcopen.org/news/2021/celebrating-30-years-of-arxiv-and-its-lasting-legacy-on-scientific-advancement/ |website=SPARC |date=25 October 2021}} Its success in sharing preprints was one of the precipitating factors that led to the later movement in scientific publishing known as open access. Mathematicians and scientists regularly upload their papers to arXiv.org for worldwide access{{cite news |first=James |last=Glanz |title=The World of Science Becomes a Global Village; Archive Opens a New Realm of Research |url=https://www.nytimes.com/2001/05/01/science/world-science-becomes-global-village-archive-opens-new-realm-research.html |newspaper=The New York Times |date=May 1, 2001 |url-access=limited}} and sometimes for reviews before they are published in peer-reviewed journals. Ginsparg was awarded a MacArthur Fellowship in 2002 for his establishment of arXiv.{{cite web|author=Bill Steele|date=23 September 2002|title=Cornell professor Paul Ginsparg, science communication rebel, named a MacArthur Foundation fellow; three other alumni also receive 'genius award' fellowships|url=http://www.news.cornell.edu/releases/sept02/ginsparg-MacArthur.ws.html |url-status=dead |archive-url=https://web.archive.org/web/20211027011146/https://news.cornell.edu/stories/2002/09/paul-ginsparg-named-macarthur-genius-fellow |archive-date=Oct 27, 2021 |website=Cornell Chronicle }} The annual budget for arXiv was approximately $826,000 for 2013 to 2017, funded jointly by Cornell University Library, the Simons Foundation (in both gift and challenge grant forms) and annual fee income from member institutions.{{cite web|url=https://confluence.cornell.edu/download/attachments/127116484/arXiv+Business+Model.pdf |title=Cornell University Library arXiv Financial Projections for 2013-2017 |date=March 28, 2012 |website=Confluence.cornell.edu 
|access-date=2017-02-26}} This model arose in 2010, when Cornell sought to broaden the project's funding by asking institutions to make annual voluntary contributions based on each institution's download usage. Each member institution pledges a five-year funding commitment to support arXiv. Based on institutional usage ranking, the annual fees are set in four tiers from $1,000 to $4,400. Cornell's goal is to raise at least $504,000 per year through membership fees generated by approximately 220 institutions.{{Cite web |title=arXiv Member Institutions (2021) – arXiv about – Our Members |url=https://arxiv.org/about/ourmembers |access-date=2021-12-27 |website=arXiv.org}}
In September 2011, Cornell University Library took overall administrative and financial responsibility for arXiv's operation and development. Ginsparg was quoted in the Chronicle of Higher Education as joking that it "was supposed to be a three-hour tour, not a life sentence".{{cite news | url=http://chronicle.com/blogs/wiredcampus/the-first-free-research-sharing-site-arxiv-turns-20/32778 | title=The First Free Research-Sharing Site, arXiv, Turns 20 With an Uncertain Future | work=Chronicle of Higher Education | date=August 10, 2011 | access-date=August 12, 2011 | author=Fischman, Josh}} However, Ginsparg remains on arXiv's Scientific Advisory Board and its Physics Advisory Committee.{{Cite web|title=arXiv Scientific Advisory Board {{!}} arXiv e-print repository|url=https://arxiv.org/about/people/scientific_ad_board|access-date=2020-10-10|website=arxiv.org}}{{Cite web|title=About the Physics Archive {{!}} arXiv e-print repository|url=https://arxiv.org/help/physics|access-date=2020-10-10|website=arxiv.org}}
In January 2022, arXiv began assigning DOIs to articles, in collaboration with DataCite.{{cite web |title=New arXiv articles are now automatically assigned DOIs |url=https://blog.arxiv.org/2022/02/17/new-arxiv-articles-are-now-automatically-assigned-dois/ |access-date=4 April 2023}}
Data format
Each arXiv paper has a unique identifier:
- YYMM.NNNNN, e.g. 1507.00123
- YYMM.NNNN, e.g. 0704.0001
- arch-ive/YYMMNNN for older papers, e.g. hep-th/9901001
Different versions of the same paper are specified by a version number at the end, for example 1709.08980v1. If no version number is specified, the default is the latest version.
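The identifier forms above can be recognized mechanically. The following Python sketch is illustrative only: the names `ArxivId` and `parse_arxiv_id` are hypothetical, the regular expressions are assumptions inferred from the examples in this section, and they do not check that the month digits are valid.

```python
import re
from typing import NamedTuple, Optional


class ArxivId(NamedTuple):
    """Parsed pieces of an arXiv identifier (hypothetical helper type)."""
    base: str               # identifier without the version suffix
    version: Optional[int]  # None means "use the latest version"
    scheme: str             # "new" (YYMM.NNNN or YYMM.NNNNN) or "old" (arch-ive/YYMMNNN)


# Assumed pattern for new-style identifiers: four digits, a dot, four or five digits.
_NEW = re.compile(r"^(?P<base>\d{4}\.\d{4,5})(?:v(?P<version>\d+))?$")

# Assumed pattern for old-style identifiers: an archive name, an optional
# two-letter subject class (as in math.DG), a slash, and seven digits.
_OLD = re.compile(
    r"^(?P<base>[a-z]+(?:-[a-z]+)*(?:\.[A-Z]{2})?/\d{7})(?:v(?P<version>\d+))?$"
)


def parse_arxiv_id(text: str) -> ArxivId:
    """Split an identifier into its base, optional version, and scheme."""
    candidate = text.strip()
    for scheme, pattern in (("new", _NEW), ("old", _OLD)):
        match = pattern.match(candidate)
        if match:
            version = match.group("version")
            return ArxivId(
                base=match.group("base"),
                version=int(version) if version else None,
                scheme=scheme,
            )
    raise ValueError(f"not a recognized arXiv identifier: {text!r}")
```

Under these assumptions, `parse_arxiv_id("1709.08980v1")` yields the base identifier `1709.08980` with version 1, while `parse_arxiv_id("hep-th/9901001")` yields no version, which corresponds to the latest version.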
arXiv uses a category system. Each paper is tagged with one or more categories. Some categories have two layers: for example, q-fin.TR is the "Trading and Market Microstructure" category within "quantitative finance". Other categories have one layer: for example, hep-ex is "high energy physics experiments".
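The one-layer and two-layer category tags described above can be separated with a simple split on the dot. This is a minimal Python sketch; the function name `split_category` is hypothetical, and it assumes that a tag contains at most one meaningful dot.

```python
from typing import Optional, Tuple


def split_category(tag: str) -> Tuple[str, Optional[str]]:
    """Return (archive, subject_class); subject_class is None for one-layer tags."""
    archive, dot, subject = tag.strip().partition(".")
    if not archive or (dot and not subject):
        # Reject empty tags and tags such as "q-fin." with nothing after the dot.
        raise ValueError(f"malformed category tag: {tag!r}")
    return archive, (subject if dot else None)
```

With this sketch, `split_category("q-fin.TR")` gives `("q-fin", "TR")` and `split_category("hep-ex")` gives `("hep-ex", None)`.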
Moderation process and endorsement
Although arXiv is not peer reviewed, moderators for each area review the submissions; they may recategorize any that are deemed off-topic,{{citation|last=McKinney|first=Michelle|title=ArXiv.org |journal=Reference Reviews|volume=25|issue=7|year=2011|pages=35–36|doi=10.1108/09504121111168622}} or reject submissions that are not scientific papers or, sometimes, for undisclosed reasons. The lists of moderators for many sections of arXiv are publicly available,{{cite web|url=https://arxiv.org/moderators/ |title= Current arXiv moderators |publisher=Arxiv.org |access-date=October 3, 2024}} but moderators for most of the physics sections remain unlisted.
Additionally, an "endorsement" system was introduced in 2004 as part of an effort to ensure content is relevant and of interest to current research in the specified disciplines.{{citation|title=As we may read|last=Ginsparg|first=Paul|year=2006|journal=Journal of Neuroscience|volume=26|issue=38|pages=9606–9608|doi=10.1523/JNEUROSCI.3161-06.2006|pmid=16988030|pmc=6674456}} Under the system, for categories that use it, an author must be endorsed by an established arXiv author before being allowed to submit papers to those categories. Endorsers are not asked to review the paper for errors but to check whether the paper is appropriate for the intended subject area. New authors from recognized academic institutions generally receive automatic endorsement, which in practice means that they do not need to deal with the endorsement system at all. However, the endorsement system has attracted criticism for allegedly restricting scientific inquiry.{{citation|journal=International Journal of Theoretical Physics|date=July 2005|volume=44|issue=7|pages=691–692|title=Preface to the Proceedings of Quantum Structures 2002|first1=Richard|last1=Greechie|first2=Sylvia|last2=Pulmannova|first3=Karl|last3=Svozil|s2cid=121442106|doi=10.1007/s10773-005-7053-z|quote=The new endorsement system may contribute to an effective barrier, a digital divide|bibcode = 2005IJTP...44..691G }}{{cite journal |last1=Josephson |first1=Brian |title=Vital resource should be open to all physicists |journal=Nature |date=23 February 2005 |volume=433 |issue=7028 |page=800 |doi=10.1038/433800a |pmid=15729314 |bibcode=2005Natur.433..800J |doi-access=free }}
A majority of the e-prints are also submitted to journals for publication, but some work, including some very influential papers, remains purely as e-prints and is never published in a peer-reviewed journal. A well-known example of the latter is an outline of a proof of Thurston's geometrization conjecture, including the Poincaré conjecture as a particular case, uploaded by Grigori Perelman in November 2002.{{cite arXiv|author=Perelman, Grisha|title=The entropy formula for the Ricci flow and its geometric applications|eprint = math.DG/0211159|date=November 11, 2002}} Perelman appears content to forgo the traditional peer-reviewed journal process, stating: "If anybody is interested in my way of solving the problem, it's all there {{bracket|on the arXiv}}{{spnd}}let them go and read about it".{{cite news |first1=Nadejda |last1=Lobastova |first2=Michael |last2=Hirst |url=http://www.smh.com.au/news/world/maths-genius-living-in-poverty/2006/08/20/1156012411120.html |title=Maths genius living in poverty |newspaper=Sydney Morning Herald |date=August 21, 2006}} Despite this non-traditional method of publication, other mathematicians recognized this work by offering the Fields Medal and Clay Mathematics Millennium Prizes to Perelman, both of which he refused.{{citation|url=https://www.washingtonpost.com/wp-dyn/content/article/2010/07/01/AR2010070106247.html|title=Russian mathematician wins $1 million prize, but he appears to be happy with $0|newspaper=Washington Post|date=July 2, 2010|first=Marc|last=Kaufman}}
While arXiv does contain some dubious e-prints, such as those claiming to refute famous theorems or to prove famous conjectures such as Fermat's Last Theorem using only high-school mathematics, a 2002 article in Notices of the American Mathematical Society described those as "surprisingly rare".{{cite journal
| last = Jackson | first = Allyn
| title = From Preprints to E-prints: The Rise of Electronic Preprint Servers in Mathematics
| journal = Notices of the American Mathematical Society
| volume = 49
| issue = 1
| date = 2002
| pages = 23–32
| url = https://www.ams.org/notices/200201/fea-preprints.pdf
}} arXiv generally re-classifies these works, e.g. in "General mathematics", rather than deleting them;{{Cite journal |last=Ginsparg |first=Paul |title=ArXiv at 20 |date=August 2011 |journal=Nature |volume=476 |issue=7359 |pages=145–147 |doi=10.1038/476145a |pmid=21833066 |bibcode=2011Natur.476..145G |s2cid=4421407 |issn=0028-0836|doi-access=free }} however, some authors have voiced concern over the lack of transparency in the arXiv screening process.{{cite journal |url=http://www.nature.com/news/arxiv-rejections-lead-to-spat-over-screening-process-1.19267 |first1=Zeeya |last1=Merali |title=ArXiv rejections lead to spat over screening process |s2cid=189061969 |date=29 January 2016 |journal=Nature |access-date=December 14, 2017 |doi=10.1038/nature.2016.19267 }}
Withdrawn preprints
A study posted on arXiv in December 2024 reported that 14,000 preprints had been withdrawn from arXiv, most commonly due to "crucial errors".{{cite arXiv |eprint=2412.03775 |last1=Rao |first1=Delip |last2=Young |first2=Jonathan |last3=Dietterich |first3=Thomas |last4=Callison-Burch |first4=Chris |title=WithdrarXiv: A Large-Scale Dataset for Retraction Study |date=2024 |class=cs.CL }} A smaller number of the withdrawals were due to the preprint being subsumed by another publication.
Submission formats
Papers can be submitted in any of several formats, including LaTeX and PDF produced by a word processor rather than by TeX or LaTeX. The arXiv software rejects a submission if generating the final PDF file fails, if any image file is too large, or if the total size of the submission is too large. arXiv allows authors to store and modify an incomplete submission and to finalize it only when ready. The time stamp on the article is set when the submission is finalized.
Access
File:Arxiv.org abstract view.png
The standard access route is through the arXiv.org website. Unaffiliated organizations have also created other interfaces and access routes.
Metadata for arXiv is made available through OAI-PMH, the standard for open access repositories.{{cite web|access-date=2020-04-25|title=Open Archives Initiative (OAI) |url=https://arxiv.org/help/oa/index|website=arxiv.org}} Content is therefore indexed by all major consumers of such data, such as BASE, CORE and Unpaywall. As of 2020, the Unpaywall dump linked over 500,000 arXiv URLs as the open-access versions of works found in Crossref data from the publishers, making arXiv a top-10 global host of green open access.
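As an illustration of how an OAI-PMH consumer might address a repository, the following Python sketch builds a `ListRecords` request URL. The verb and the `metadataPrefix`, `from` and `set` arguments come from the OAI-PMH 2.0 protocol; the function name `list_records_url`, the `https://example.org/oai` placeholder endpoint and the set name are assumptions, and the actual arXiv endpoint should be taken from arXiv's own OAI documentation.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    from_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL (no network access is performed)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by date, e.g. "2024-01-01"
    if set_spec is not None:
        params["set"] = set_spec    # selective harvesting by set; names are repository-specific
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and set name, both hypothetical.
url = list_records_url("https://example.org/oai", from_date="2024-01-01", set_spec="math")
```

A harvester would fetch this URL, parse the returned XML, and repeat the request with the `resumptionToken` from each response until the list is complete.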
Finally, researchers can select subfields and receive daily emails or RSS feeds of all submissions in them.
Copyright status of files
Files on arXiv can have a number of different copyright statuses:{{cite web|url=https://arxiv.org/help/license |title=arXiv License Information |publisher=Arxiv.org |access-date=July 21, 2013}}
- Some are public domain, in which case they will have a statement saying so.
- Some are available under either the Creative Commons 4.0 Attribution-ShareAlike license or the Creative Commons 4.0 Attribution-Noncommercial-ShareAlike license.
- Some are copyright to the publisher, but the author has the right to distribute them and has given arXiv a non-exclusive irrevocable license to distribute them.
- Most are copyright to the author, and arXiv has only a non-exclusive irrevocable license to distribute them.
General and cited sources
{{Refbegin}}
- {{cite journal
| last = Butler | first = Declan
| s2cid = 4374168
| date = 2003
| title = Biologists Join Physics Preprint Club
| journal = Nature
| volume = 425 | issue = 6958 | page = 548
| doi = 10.1038/425548b
| pmid = 14534551
|bibcode = 2003Natur.425..548B | doi-access = free
}}
- {{cite journal
| last = Choi
| first = Charles Q.
| date = 2003
| title = Biology's New Online Archive
| journal = The Scientist
| url = http://www.biomedcentral.com/news/20030930/03/
| access-date = June 21, 2005
| archive-date = March 13, 2005
| archive-url = https://web.archive.org/web/20050313151648/http://www.biomedcentral.com/news/20030930/03
| url-status = dead
}}
- {{cite journal
| last = Giles | first = Jim
| s2cid = 29003994
| date = 2003
| title = Preprint Server Seeks Way to Halt Plagiarists
| journal = Nature
| volume = 426 | issue = 6962 | page = 7
| pmid = 14603280
| doi = 10.1038/426007a
| bibcode = 2003Natur.426Q...7G
| doi-access = free
}}
- {{cite journal
| last = Ginsparg
| first = Paul
| date = 1997
| title = Winners and Losers in the Global Research Village
| url = https://www.cs.cornell.edu/~ginsparg/physics/blurb/pg96unesco.html
| journal = The Serials Librarian
| volume = 30
| issue = 3–4
| pages = 83–95
| doi = 10.1300/J123v30n03_13
| doi-access = free
}}
- {{cite journal
| last = Halpern | first = Joseph Y.
| date = 1998
| title = A Computing Research Repository
| journal = D-Lib Magazine
| volume = 4 | issue = 11 | doi = 10.1045/november98-halpern
| doi-access = free}}
- {{cite journal
| last = Halpern | first = Joseph Y.
| s2cid = 5453868
| date = 2000
| title = CoRR: A computing research repository
| journal = ACM Journal of Computer Documentation
| volume = 24 | issue = 2 | pages = 41–48
| arxiv = cs.DL/0005003
| doi=10.1145/337271.337274
| bibcode = 2000cs........5003H
}}
- {{cite journal
| last = Luce
| first = Richard E.
| date = 2001
| title = e-Prints Intersect the Digital Library: Inside the Los Alamos arXiv
| url = http://www.istl.org/01-winter/article3.html
| journal = Issues in Science and Technology Librarianship
| issue = 29
| doi = 10.5062/F44B2Z95
}}
- {{cite journal
|last=McKiernan
|first=Gerry
|title=ArXiv.org: The Los Alamos National Laboratory e-print server
|date=2000
|url=http://www.public.iastate.edu/~gerrymck/arXiv.org.pdf
|journal=International Journal on Grey Literature
|volume=1
|issue=3
|pages=127–138
|doi=10.1108/14666180010345564
|url-status=dead
|archive-url=https://web.archive.org/web/20050505044157/http://www.public.iastate.edu/~gerrymck/arXiv.org.pdf
|archive-date=May 5, 2005
}}
- {{cite journal
| last = Pinfield | first = Stephen
| date = 2001
| title = How Do Physicists Use an E-Print Archive? Implications for Institutional E-Print Services
| journal = D-Lib Magazine
| volume = 7 | issue = 12 | doi = 10.1045/december2001-pinfield
| doi-access = free
}}
- {{cite journal
| last = Quigley
| first = Brian
| date = 2000
| title = Physics Databases and the Los Alamos e-Print Archive
| url = http://www.allbusiness.com/technology/internet-search-engines/1149636-1.html
| journal = EContent
| volume = 23
| issue = 5
| pages = 22–26
}}
- {{cite journal
| last = Taubes | first = Gary
| date = 1993
| title = Publication by Electronic Mail Takes Physics by Storm
| journal = Science
| volume = 259 | issue = 5099 | pages = 1246–1248
| bibcode = 1993Sci...259.1246T
| doi = 10.1126/science.259.5099.1246
| pmid = 17732237
}}
- {{cite arXiv
| last = Warner | first = Simeon
| title = Open Archives Initiative protocol development and implementation at arXiv
| date = 2001
| eprint = cs/0101027
}}
- {{cite journal
| date = 2004
| title = What Is q-bio?
| journal = Open Access Now
}}
{{Refend}}
External links
{{Commons category|ArXiv.org}}
- {{official website|https://arxiv.org}}
{{Cornell}}
Category:1991 establishments in New Mexico
Category:American digital libraries