Hierarchical editing language for macromolecules

The hierarchical editing language for macromolecules (HELM) is a method of describing complex biological molecules. It is a machine-readable notation that represents the composition and structure of peptides, proteins, oligonucleotides, and related small-molecule linkers.{{cite journal| title= HELM: A Hierarchical Notation Language for Complex Biomolecule Structure Representation| last1=Zhang|first1=Tianhong | last2=Li|first2= Hongli | last3= Xi|first3= Hualin |last4= Stanton|first4= Robert V. |last5=Rotstein| first5= Sergio H. | journal= J. Chem. Inf. Model.| year= 2012 |volume= 52 |issue= 10| pages= 2796–2806 | doi= 10.1021/ci3001925| pmid=22947017| doi-access= free}}

HELM was developed by the Pistoia Alliance, a consortium of pharmaceutical companies. Development began in 2008. In 2012 the notation was published openly and free of charge.{{cite web|url=http://www.openhelm.org/about-us| website= OpenHELM.org| title=About| accessdate= 14 Nov 2014}}

The HELM open source project can be found on GitHub.

Background

The need for HELM became apparent as researchers began working on modeling and computational projects involving large molecules and engineered biomolecules. No existing language could accurately describe both the composition of these entities and the complex branching and structure common to them. Protein sequence formats can describe larger proteins, and chemical file formats such as mol files can describe simple peptides. However, large, complex research biomolecules are difficult to describe with chemical formats, and peptide formats are not flexible enough to describe non-natural amino acids and other chemistries.{{cite web|url=http://chembl.blogspot.com/2014/02/helm-in-chembl.html|title=HELM in ChEMBL|date= Feb 2014| accessdate= 17 Nov 2014}}

Design

In HELM, molecules are represented at four levels in a hierarchy:{{cite AV media|people= Tianhong Zhang|date=2014-07-09 |title=2014 07 09 Pistoia Alliance HELM Showcase Webinar |trans-title= |medium= |language= |url= https://www.youtube.com/watch?v=QeOkbnBCgh4|access-date= 2016-01-28|format= |time= |location= |publisher= youtube.com|id= |isbn= |oclc= |quote= }}

  • Complex polymer
  • Simple polymer
  • Monomer
  • Atom

Monomers are assigned short unique identifiers in internal HELM databases, and a molecule can then be written as a string of those identifiers. The approach is similar to that used in the simplified molecular-input line-entry system (SMILES). An exchangeable file format allows data to be shared between companies that have assigned different identifiers to the same monomers.{{cite web|url=http://www.bio-itworld.com/2014/7/18/universal-language-pistoia-alliance-takes-indescribable-biology.html|title= Universal Language: The Pistoia Alliance Takes on Indescribable Biology| last=Krol| first= Aaron|date= 18 Jul 2014| accessdate = 17 Nov 2014| website= bio-itworld.com}}
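
The four levels and the role of monomer identifiers can be illustrated with a minimal Python sketch. It is not part of any HELM toolkit: the class names, the two small monomer libraries, and the <code>translate</code> helper are hypothetical, and monomers are matched between libraries by comparing SMILES strings literally (with stereochemistry omitted), which a real exchange format would not rely on.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Monomer:
    """Monomer level: a short identifier plus its atom-level structure."""
    monomer_id: str  # short identifier used in notation strings
    name: str
    smiles: str      # atom level (stereochemistry omitted for brevity)


@dataclass
class SimplePolymer:
    """Simple polymer level: an ordered chain of monomer identifiers."""
    polymer_id: str
    monomer_ids: List[str]


@dataclass
class ComplexPolymer:
    """Complex polymer level: simple polymers plus links between them."""
    simple_polymers: List[SimplePolymer]
    # (polymer id, monomer position, polymer id, monomer position)
    connections: List[Tuple[str, int, str, int]] = field(default_factory=list)


# Hypothetical in-house libraries of two organizations (assumed data).
LIBRARY_A: Dict[str, Monomer] = {
    "G": Monomer("G", "glycine", "NCC(=O)O"),
    "A": Monomer("A", "alanine", "CC(N)C(=O)O"),
    "C": Monomer("C", "cysteine", "NC(CS)C(=O)O"),
}
LIBRARY_B: Dict[str, Monomer] = {
    "Gly": Monomer("Gly", "glycine", "NCC(=O)O"),
    "Ala": Monomer("Ala", "alanine", "CC(N)C(=O)O"),
    "Cys": Monomer("Cys", "cysteine", "NC(CS)C(=O)O"),
}


def expand(polymer: SimplePolymer, library: Dict[str, Monomer]) -> List[Monomer]:
    """Resolve each identifier to its monomer; raises KeyError if unknown."""
    return [library[monomer_id] for monomer_id in polymer.monomer_ids]


def translate(polymer: SimplePolymer,
              source: Dict[str, Monomer],
              target: Dict[str, Monomer]) -> SimplePolymer:
    """Rewrite a polymer using the target library's identifiers.

    Monomers are matched by literal SMILES equality, a simplification.
    """
    by_structure = {m.smiles: m.monomer_id for m in target.values()}
    new_ids = [by_structure[source[i].smiles] for i in polymer.monomer_ids]
    return SimplePolymer(polymer.polymer_id, new_ids)


peptide = SimplePolymer("PEPTIDE1", ["A", "G", "C"])
print([m.name for m in expand(peptide, LIBRARY_A)])
print(translate(peptide, LIBRARY_A, LIBRARY_B).monomer_ids)
```

The same chain is stored once as identifiers and resolved to atoms only on demand, which is what allows a large molecule to be handled at the sequence level without losing its chemical detail.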

Examples

{{empty section|date=December 2023}}

(For now, see the following external links: [https://pistoiaalliance.atlassian.net/wiki/spaces/HELM/pages/2535522305/HELM+Notation "HELM notation" on HELM wiki], and [https://pistoiaalliance.atlassian.net/wiki/download/attachments/13795362/Test%20Set%20V1_0.xlsx?version=1&modificationDate=1551284920016&cacheVersion=1&api=v2&download=true test data file].)
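
As a rough illustration, a HELM string for a simple peptide is commonly written in a form such as <code>PEPTIDE1{A.G.C}$$$$</code>: a polymer identifier, a dot-separated list of monomer identifiers in braces, and further sections separated by <code>$</code>. The Python sketch below reads only the first two sections (simple polymers and connections) of such a string. The function name <code>parse_helm</code> and its regular expressions are hypothetical, the example string is illustrative rather than taken from the sources cited here, and the sketch does not handle RNA-style monomers, inline SMILES, or annotations; the HELM specification remains the authoritative definition.

```python
import re
from typing import Dict, List, Tuple

# Assumed shapes, e.g. "PEPTIDE1{A.G.C}" and "PEPTIDE1,PEPTIDE1,8:R3-3:R3".
POLYMER_RE = re.compile(r"^([A-Z]+\d+)\{(.*)\}$")
CONNECTION_RE = re.compile(r"^(\w+),(\w+),(\d+):(R\d+)-(\d+):(R\d+)$")

Connection = Tuple[str, int, str, str, int, str]


def parse_helm(helm: str) -> Tuple[Dict[str, List[str]], List[Connection]]:
    """Split a simple HELM string into polymers and connections.

    Only the first two '$'-separated sections are read; the rest are ignored.
    """
    sections = helm.split("$")

    polymers: Dict[str, List[str]] = {}
    for chunk in filter(None, sections[0].split("|")):
        match = POLYMER_RE.match(chunk)
        if match is None:
            raise ValueError(f"unrecognized simple polymer: {chunk!r}")
        polymer_id, body = match.groups()
        # Multi-character identifiers are written in brackets, e.g. [dA].
        polymers[polymer_id] = [token.strip("[]") for token in body.split(".")]

    connections: List[Connection] = []
    if len(sections) > 1:
        for chunk in filter(None, sections[1].split("|")):
            match = CONNECTION_RE.match(chunk)
            if match is None:
                raise ValueError(f"unrecognized connection: {chunk!r}")
            src, dst, src_pos, src_r, dst_pos, dst_r = match.groups()
            connections.append(
                (src, int(src_pos), src_r, dst, int(dst_pos), dst_r)
            )

    return polymers, connections


# A ten-residue peptide whose two cysteines (positions 3 and 8) are linked.
polymers, connections = parse_helm(
    "PEPTIDE1{A.R.C.A.A.K.T.C.D.A}$PEPTIDE1,PEPTIDE1,8:R3-3:R3$$$"
)
print(polymers)
print(connections)
```

Here the connection section records a link that a plain amino-acid sequence cannot express, which is the kind of branching and cyclization the notation was designed to capture.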

Adoption

In February 2014, ChEMBL announced plans to adopt HELM by the end of that year.{{cite web|url=http://www.ebi.ac.uk/about/news/service-news/HELM-collaboration|title=The Pistoia Alliance and EMBL-EBI announce HELM collaboration for cheminformatics|accessdate= 17 Nov 2014|website=ebi.ac.uk|date=4 February 2014 }} The informatics company BIOVIA developed a modified Molfile format called the Self-Contained Sequence Representation (SCSR). A goal of HELM is to provide a single standard that can incorporate such individual attempts to solve the problem and be used universally, avoiding a proliferation of competing standards.

Tools

An editor is needed to visualize and work with biomolecules at the appropriate level of detail. It must let the user "zoom out" to view a large molecule at the amino-acid sequence level, then "zoom in" to the atomic level at a particular site of conjugation or derivatization.{{cite web|url=https://www.chemaxon.com/app/uploads/2013/04/Sergio-Rotstein-ChemAxon-2013-UGM-Talk.pdf| title= What about the "big guys"? The emerging HELM standard for macromolecular representation and the Pistoia Alliance| last= Rotstein| first = Sergio H.| date = May 2013| accessdate = 28 Jan 2016 | website = www.chemaxon.com/library}}

The HELM Editor and HAbE (HELM Antibody Editor) are two client tools which may in the future be released as web-based applications.{{cite web|url=http://www.pistoiaalliance.org/rfi-published-helm-web-based-editor/| title = RFI published: HELM Web-based Editor | date = 7 Jan 2016| accessdate= 15 Jan 2016}}

Pistoia Alliance

At a conference in Pistoia, Italy, a group of researchers from Pfizer, AstraZeneca, GlaxoSmithKline, and Novartis formed what came to be known as the Pistoia Alliance. All parties were interested in solving problems in data aggregation, data sharing, and analytics for pharmaceutical research. The alliance was incorporated in 2008. It is now composed of informatics experts and researchers from industry, academia, and life science service organizations.{{cite web|url=http://www.pistoiaalliance.org/about/mission_history.html |archive-url=https://web.archive.org/web/20110429073214/http://www.pistoiaalliance.org/about/mission_history.html |url-status=dead |archive-date=29 April 2011 |title=Mission&History |website=www.pistoiaalliance.org/ |accessdate=15 Nov 2014 }}

References

{{Reflist}}

{{DEFAULTSORT:Hierarchical Editing Language for Macromolecules}}

Category:Chemical nomenclature

Category:Encodings

Category:Chemical file formats

Category:Bioinformatics