Yaniv Erlich
{{Short description|Israeli-American scientist}}
{{Infobox scientist
| name = Yaniv Erlich
| native_name = יניב ארליך
| native_name_lang = he
| alma_mater = Watson School of Biological Sciences
| doctoral_advisor = Greg Hannon
| website = https://teamerlich.org/
| field = Genomics, bioinformatics, genetic privacy, crowdsourcing
| work_institution = Columbia University
}}
Yaniv Erlich (Hebrew: יניב ארליך) is an Israeli-American scientist. He formerly served as an Associate Professor of Computer Science at Columbia University and was the Chief Science Officer of MyHeritage.{{Cite web|url=http://teamerlich.org | title = Erlich lab's website}} Erlich's work combines computer science and genomics.
Biography
Erlich was born in Israel. He earned a BSc in brain sciences from Tel Aviv University in 2006 and a PhD in bioinformatics from the Watson School of Biological Sciences at Cold Spring Harbor Laboratory in 2010. From 2010 to 2015, Erlich was a Fellow at the Whitehead Institute, MIT. From 2015 to 2019, he led a computational genomics lab at Columbia University.{{Cite web|url=http://www.tedxdanubia.com/speaker/yaniv-erlich | title = TEDxDanubia speakers}} Since 2020, he has served as CEO of Eleven Therapeutics.{{Cite web |title=About Us |url=https://www.eleventx.com/about-us/ |access-date=2023-04-16 |website=Eleven Therapeutics |language=en-US}}
Scientific work
= Crowdsourcing genomic information =
Erlich's team published a study in the journal Science that reported the crowdsourcing of tens of millions of genealogical records from the website Geni.com.{{Cite journal| title = Quantitative analysis of population-scale family trees with millions of relatives| year = 2018| doi = 10.1126/science.aam9309| last1 = Kaplanis| first1 = Joanna| last2 = Gordon| first2 = Assaf| last3 = Shor| first3 = Tal| last4 = Weissbrod| first4 = Omer| last5 = Geiger| first5 = Dan| last6 = Wahl| first6 = Mary| last7 = Gershovits| first7 = Michael| last8 = Markus| first8 = Barak| last9 = Sheikh| first9 = Mona| last10 = Gymrek| first10 = Melissa| last11 = Bhatia| first11 = Gaurav| last12 = MacArthur| first12 = Daniel G.| last13 = Price| first13 = Alkes L.| last14 = Erlich| first14 = Yaniv| journal = Science| volume = 360| issue = 6385| pages = 171–175| pmid = 29496957| pmc = 6593158| bibcode = 2018Sci...360..171K}} The team assembled a single family tree of 13 million connected people that spans tens of generations and over 600 years of history.{{Cite web|url=https://directorsblog.nih.gov/2018/03/13/crowdsourcing-600-years-of-human-history/|archive-url=https://web.archive.org/web/20180819182106/https://directorsblog.nih.gov/2018/03/13/crowdsourcing-600-years-of-human-history/|url-status=dead|archive-date=August 19, 2018| title = Crowdsourcing 600 Years of Human History| date = 13 March 2018}} The study used the data to analyze the genetics of longevity and familial dispersion.{{Cite news|url=https://www.wsj.com/articles/the-13-million-people-in-your-family-tree-1519930813| title = WSJ| newspaper = Wall Street Journal| date = March 2018| last1 = Hotz| first1 = Robert Lee}}
In a separate line of work, Erlich and Joseph Pickrell created a website called DNA.Land to crowdsource genomic datasets from customers of consumer genomics companies.{{Cite web|url=https://www.nature.com/articles/s41588-017-0021-89| title = DNA.Land is a framework to collect genomes and phenomes in the era of abundant genetic information
}} The website had collected over 130,000 datasets by November 2018.
= Genetic privacy =
The Erlich group published several studies on genetic privacy. In 2013, they reported that it is possible to recover a male's surname from his supposedly anonymous genomic dataset, which can lead to tracing his full identity.{{Cite journal|url=https://www.science.org/doi/10.1126/science.1229566| title = Identifying Personal Genomes by Surname Inference| year = 2013| doi = 10.1126/science.1229566| last1 = Gymrek| first1 = Melissa| last2 = McGuire| first2 = Amy L.| last3 = Golan| first3 = David| last4 = Halperin| first4 = Eran| last5 = Erlich| first5 = Yaniv| journal = Science| volume = 339| issue = 6117| pages = 321–324| pmid = 23329047| bibcode = 2013Sci...339..321G| s2cid = 3473659| url-access = subscription}} The technique exploits the co-inheritance of surnames and Y-chromosomes in most societies. By comparing the Y-chromosome of the person of interest against genetic genealogy databases of Y-chromosomes, it is possible in some cases to infer the surname. The team estimated that 12% of males in the US are subject to successful surname recovery. The team also demonstrated that, once the surname is recovered, basic demographic identifiers such as age and state of residence can be enough to trace the identity of the individual. To demonstrate the power of the technique, they recovered the identities of multiple 1000 Genomes Project participants by surname inference.
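The matching step described above can be illustrated with a minimal sketch. It is not the method used in the study: the toy database, the repeat counts, the mismatch threshold and the function names are all hypothetical, and the marker names are used only as labels. The sketch compares a query Y-STR profile with surname-labelled profiles and returns the most frequent surname among close matches, or no surname when nothing is close enough.

```python
from collections import Counter

# Hypothetical toy database of (surname, Y-STR profile) records.
# Repeat counts are made up for illustration only.
DATABASE = [
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}),
    ("Jones", {"DYS19": 15, "DYS390": 21, "DYS391": 10, "DYS393": 12}),
]


def count_mismatches(profile_a, profile_b):
    """Count markers typed in both profiles whose repeat counts differ.

    Returns None when the two profiles share no markers at all.
    """
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        return None
    return sum(1 for marker in shared if profile_a[marker] != profile_b[marker])


def infer_surname(query, database, max_mismatches=1):
    """Return the most common surname among close matches, or None.

    max_mismatches is an assumed tolerance, not a value from the study.
    """
    candidates = Counter()
    for surname, profile in database:
        distance = count_mismatches(query, profile)
        if distance is not None and distance <= max_mismatches:
            candidates[surname] += 1
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


query_profile = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}
print(infer_surname(query_profile, DATABASE))
```

Returning no surname when every record is too distant reflects the point that recovery succeeds only in some cases. A realistic approach would also need to weigh how informative each match is, which this sketch does not attempt.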
In 2014, Erlich and Arvind Narayanan published a survey of techniques for breaching the privacy of genomic datasets.{{Cite journal| title = Routes for breaching and protecting genetic privacy
| year = 2014
| doi = 10.1038/nrg3723
| last1 = Erlich
| first1 = Yaniv
| last2 = Narayanan
| first2 = Arvind
| journal = Nature Reviews Genetics
| volume = 15
| issue = 6
| pages = 409–421
| pmid = 24805122
| pmc = 4151119
}} They predicted that autosomal searches in GEDmatch could be used to trace the identity of anonymous people once the GEDmatch user base reached a certain size. This indeed happened in 2018, when the website was used to identify the Golden State Killer.
In 2018, the Erlich team published a study in Science that reported that about 60% of US individuals of European descent have at least a third-cousin match in GEDmatch, which can theoretically permit their identification.{{Cite journal| title = Identity inference of genomic data using long-range familial searches| year = 2018| doi = 10.1126/science.aau4832| last1 = Erlich| first1 = Yaniv| last2 = Shor| first2 = Tal| last3 = Pe'Er| first3 = Itsik| last4 = Carmi| first4 = Shai| journal = Science| volume = 362| issue = 6415| pages = 690–694| pmid = 30309907| pmc = 7549546| bibcode = 2018Sci...362..690E}} The study projected that within two to three years, virtually any person in this group could theoretically be traced using this technique if GEDmatch continued to grow at its then-current rate.{{Cite news |url= https://www.nytimes.com/2018/10/11/science/science-genetic-genealogy-study.html |title= Most White Americans' DNA Can Be Identified Through Genealogy Databases|newspaper= The New York Times|date= 11 October 2018|last1= Murphy|first1= Heather}} The team suggested a cryptographic signature technique to reduce the chance of direct-to-consumer websites being misused in police searches.
References
{{reflist}}
{{authority control}}
{{DEFAULTSORT:Erlich, Yaniv}}
Category:Columbia University faculty
Category:American people of Israeli descent
Category:Tel Aviv University alumni