Glycoinformatics
Glycoinformatics is a field of bioinformatics that pertains to the study of carbohydrates involved in protein post-translational modification. It broadly includes (but is not restricted to) database, software, and algorithm development for the study of carbohydrate structures, glycoconjugates, enzymatic carbohydrate synthesis and degradation, as well as carbohydrate interactions. Conventional usage of the term does not currently include the treatment of carbohydrates from the better-known nutritive aspect.
Issues to consider
Figure (ArabinoXylanBranchingSequence.PNG): arabinoxylan molecule (Dervilly-Pinel G, et al. (2004). Carbohydrate Polymers 55:171–177). The carbohydrate structure is expressed as a sequence of numbers representing the branches in the main chain. As the complexity of the chain increases, the numerical representation of the carbohydrate becomes more complex.
Although glycosylation is the most common form of protein modification and produces highly complex carbohydrate structures, bioinformatics support for the glycome is still very poor (Helenius A, Aebi M (2001). Intracellular functions of N-linked glycans. Science 291:2364–2369; Kikuchi N, et al. (2005). Bioinformatics 21:1717–1718. http://bioinformatics.oxfordjournals.org/cgi/content/full/21/8/1717).
Unlike proteins and nucleic acids, which are linear, carbohydrates are often branched and extremely complex (Seeberger PH (2005). Nature 437:1239). For instance, just four sugars can be strung together to form more than 5 million different types of carbohydrates (Service RF (2001). Science 291:805–806. http://www.sciencemag.org/cgi/content/full/291/5505/805a), or nine different sugars may be assembled into 15 million possible four-sugar chains (Dove A (2001). Nature Biotechnology 19:913–917. http://www.columbia.edu/cu/biology/courses/w3034/LACpapers/bittersweetNatBiot01.pdf, archived 2010-06-29 at https://web.archive.org/web/20100629060647/http://www.columbia.edu/cu/biology/courses/w3034/LACpapers/bittersweetNatBiot01.pdf).
In addition, the number of distinct simple sugars that make up glycans is greater than the number of nucleotides that make up DNA or RNA, so evaluating glycan structures is more computationally expensive (von der Lieth CW, et al. (2011). EUROCarbDB: An open-access platform for glycoinformatics. Glycobiology 21(4):493–502).
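The growth in the number of possible structures can be illustrated with a rough count. The sketch below is only an illustration under simplified assumptions (unbranched chains, four candidate linkage positions per glycosidic bond, two anomeric configurations); it is not the derivation used in the cited papers and does not reproduce their figures, which also account for branching and other variations. The function names and default parameters are hypothetical.

```python
def linear_sequence_count(alphabet_size: int, length: int) -> int:
    """Number of linear sequences of a given length (e.g. DNA or RNA)."""
    return alphabet_size ** length


def linear_glycan_count(n_monosaccharides: int, length: int,
                        linkage_positions: int = 4, anomers: int = 2) -> int:
    """Unbranched oligosaccharides under simplified, assumed rules.

    Each residue is one of n_monosaccharides types, and each of the
    (length - 1) glycosidic bonds can use any of linkage_positions
    hydroxyl positions in either of the anomeric configurations.
    Branching is ignored, so this is a lower bound on real diversity.
    """
    residue_choices = n_monosaccharides ** length
    linkage_choices = (linkage_positions * anomers) ** (length - 1)
    return residue_choices * linkage_choices


dna_tetramers = linear_sequence_count(4, 4)      # 4 nucleotides, length 4
glycan_tetramers = linear_glycan_count(9, 4)     # 9 sugars, length 4
print(dna_tetramers, glycan_tetramers)
```

Even with branching left out, the linkage term multiplies the count by several hundred for a four-residue chain, which is why a larger monosaccharide alphabet alone understates the difference from nucleic acids.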
One of the main constraints in glycoinformatics is the difficulty of representing sugars in sequence form, especially because of their branching nature. Owing to the lack of a genetic blueprint, carbohydrates do not have a "fixed" sequence. Instead, the sequence is largely determined by the presence of a variety of enzymes, their kinetic differences, and variations in the biosynthetic micro-environment of the cells. This increases the complexity of analysis and reduces the experimental reproducibility of the carbohydrate structure of interest (Lutteke T. (2012). The use of glycoinformatics in glycochemistry. Beilstein J. Org. Chem. 8:915–929. doi:10.3762/bjoc.8.104). For this reason, carbohydrates are often considered "information poor" molecules.
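The representation problem can be made concrete with a small sketch. The code below stores a branched glycan as a tree and writes it out as a single string, sorting the branches so that the same structure always gives the same text. The `Residue` class, the `linearize` function, and the bracketed notation are hypothetical and chosen only for illustration; they do not correspond to any established glycan format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Residue:
    """One monosaccharide plus the residues attached to it."""
    name: str  # e.g. "Xyl" or "Ara"
    # Each child is (linkage label, attached residue), e.g. ("a1-3", Residue("Ara")).
    children: List[Tuple[str, "Residue"]] = field(default_factory=list)


def linearize(residue: Residue) -> str:
    """Serialize the tree into one string, root residue first.

    Child substrings are sorted alphabetically; the last one is written
    as the continuing chain and the others are enclosed in brackets.
    Sorting makes the output independent of the order in which the
    branches were entered.
    """
    parts = sorted(f"({linkage}){linearize(child)}"
                   for linkage, child in residue.children)
    if not parts:
        return residue.name
    *branches, main = parts
    return residue.name + "".join(f"[{b}]" for b in branches) + main


# The same two-branch structure entered in two different orders.
first = Residue("Xyl", [("b1-4", Residue("Xyl")), ("a1-3", Residue("Ara"))])
second = Residue("Xyl", [("a1-3", Residue("Ara")), ("b1-4", Residue("Xyl"))])
print(linearize(first))   # Xyl[(a1-3)Ara](b1-4)Xyl
print(linearize(second))  # Xyl[(a1-3)Ara](b1-4)Xyl
```

Without an ordering rule of this kind, one branched structure could be written as several different strings, which complicates database lookups and comparisons that are trivial for linear sequences.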
Databases
Table of major glyco-databases (Aoki-Kinoshita KF. (2011). Introduction to Glycoinformatics And Computational Applications. Beilstein-Institut. http://www.beilstein-institut.de/download/383/aoki_kinoshita.pdf; Egorova K.S., Toukach Ph.V. (2018). Glycoinformatics: bridging isolated islands in the sea of data. Angewandte Chemie International Edition 57:14986–14990. doi:10.1002/anie.201803576).