Chemistry Development Kit
{{short description|Computer software}}
{{Infobox software
| name = Chemistry Development Kit
| logo = Cdklogo.svg
| logo alt = Burgundy-colored ball and stick pseudo-molecule diagram spelling the three letters C, D, and K.
| screenshot =
| caption =
| author = Christoph Steinbeck, Egon Willighagen, Dan Gezelter
| developer = The CDK Project
| released = {{Start date and age|2001|05|11|df=yes}}{{Cite web|url=https://sourceforge.net/projects/cdk/files/OldFiles/|title = The Chemistry Development Kit - Browse /OldFiles at SourceForge.net}}
| programming language= Java
| operating system = Windows, Linux, Unix, macOS
| language = English
| genre = Chemoinformatics, molecular modelling, bioinformatics
| license = LGPL 2.0
| website = {{URL|cdk.github.io}}
| repo = {{URL|github.com/cdk/cdk}}
}}
The Chemistry Development Kit (CDK) is computer software, a library in the programming language Java, for chemoinformatics and bioinformatics.{{cite journal |last1=Steinbeck |first1=C. |last2=Han |first2=Y. Q. |last3=Kuhn |first3=S. |last4=Horlacher |first4=O. |last5=Luttmann |first5=E. |last6=Willighagen |first6=E. L. |title=The Chemistry Development Kit (CDK): An open-source Java library for chemo- and bioinformatics |journal=Journal of Chemical Information and Computer Sciences |volume=43 |issue=2 |pages=493–500 |year=2003 |pmid=12653513 |doi=10.1021/ci025584y |pmc=4901983 }}{{Cite journal|last1=Willighagen|first1=Egon L.|last2=Mayfield|first2=John W.|last3=Alvarsson|first3=Jonathan|last4=Berg|first4=Arvid|last5=Carlsson|first5=Lars|last6=Jeliazkova|first6=Nina|last7=Kuhn|first7=Stefan|last8=Pluskal|first8=Tomáš|last9=Rojas-Chertó|first9=Miquel|date=2017-06-06|title=The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching|journal=Journal of Cheminformatics|language=En|volume=9|issue=1|pages=33|doi=10.1186/s13321-017-0220-4|pmid=29086040|pmc=5461230|issn=1758-2946 |doi-access=free }} It is available for Windows, Linux, Unix, and macOS. It is free and open-source software distributed under the GNU Lesser General Public License (LGPL) 2.0.
History
The CDK was created on 27–29 September 2000 at the University of Notre Dame by Christoph Steinbeck, Egon Willighagen and Dan Gezelter, then developers of Jmol and JChemPaint, to provide a common code base. The first source code release was made on 11 May 2001.{{Cite web|url=http://sourceforge.net/projects/cdk/files/OldFiles/|title=The Chemistry Development Kit - Browse /OldFiles at SourceForge.net}} Since then, more than 100 people have contributed to the project,{{Cite web|url=https://github.com/cdk/cdk/blob/master/AUTHORS.txt|title = The Chemistry Development Kit (CDK)|website = GitHub|date = 12 October 2021}} leading to the set of functions listed below. Between 2004 and 2007, the project published a newsletter, CDK News; all of its articles are available from a public archive.{{Cite web|url=https://sourceforge.net/projects/cdk/files/CDK%20News/|title=The Chemistry Development Kit - Browse /CDK News at SourceForge.net}} The newsletter was put on hold because of an unsteady rate of contributions.
{{Infobox journal
|title = CDK News
|abbreviation=CDK News
|language = English
|editors = Egon Willighagen, Christoph Steinbeck
|history = 2004-2007
|ISSN = 1614-7553
|italic title=no
}}
Later, unit testing, code quality checking, and Javadoc validation were introduced. Rajarshi Guha developed a nightly build system, named Nightly, which is still operating at Uppsala University.{{cite web |url=http://pele.farmbio.uu.se/nightly/ |title=CDK 1.5.x Nightly Build - 2013-05-10 (21:21) [Commit 2abcb5d61304e58d55ea26a23ebd0d375deea36d] |accessdate=2013-08-05 |url-status=dead |archive-url=https://web.archive.org/web/20130524042151/http://pele.farmbio.uu.se/nightly/ |archive-date=2013-05-24 }} In 2012, the project became a supporter of the InChI Trust, to encourage continued development. The library uses JNI-InChI{{cite web |url=http://jni-inchi.sourceforge.net/ |title=Home |website=jni-inchi.sourceforge.net}} to generate International Chemical Identifiers (InChIs).{{Cite journal |last1= Spjuth |first1= O. |last2= Berg |first2= A. |last3= Adams |first3= S. |last4= Willighagen |first4= E. L. |title= Applications of the InChI in cheminformatics with the CDK and Bioclipse |doi= 10.1186/1758-2946-5-14 |journal= Journal of Cheminformatics |volume= 5 |issue= 1 |pages= 14 |year= 2013 |pmid= 23497723 |pmc= 3674901 |doi-access= free }}
In April 2013, John Mayfield (né May) joined the ranks of release managers of the CDK, to handle the development branch.{{Cite web|url=http://chem-bla-ics.blogspot.nl/2013/04/john-may-is-now-release-manager-of-cdk.html|title = John May is now release manager of CDK 1.5.x}}
Library
The CDK is a library rather than an end-user program. It has been integrated into various environments to make its functions available, including the programming language R,{{cite journal |last=Guha |first=R. |title=Chemical informatics functionality in R |journal=Journal of Statistical Software |volume=18 |issue=5 |pages=1–16 |year=2007 |doi=10.18637/jss.v018.i05|doi-access=free }} CDK-Taverna (a Taverna workbench plugin),{{cite journal |last1=Kuhn |first1=T. |last2=Willighagen |first2=E. L. |last3=Zielesny |first3=A. |last4=Steinbeck |first4=C. |title=CDK-Taverna: an open workflow environment for cheminformatics |journal=BMC Bioinformatics |volume=11 |pages=159 |year=2010 |pmid=20346188 |pmc=2862046 |doi=10.1186/1471-2105-11-159 |doi-access=free }} Bioclipse, PaDEL,{{Cite journal |doi= 10.1002/jcc.21707 |pmid= 21425294 |title= PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints |journal= Journal of Computational Chemistry |volume= 32 |issue= 7 |pages= 1466–74 |year= 2011 |last1= Yap |first1= C. W. |s2cid= 206032727 |doi-access= free }} and Cinfony.{{cite journal |doi= 10.1186/1752-153X-2-24 |pmid=19055766 |pmc=2646723 |title=Cinfony – combining Open Source cheminformatics toolkits behind a common interface |journal=Chemistry Central Journal |date=2008 |volume=2 |issue=1 |pages=24 |first=Noel M |last=O'Boyle |doi-access=free }} CDK extensions also exist for Konstanz Information Miner (KNIME){{Cite journal |last1= Beisken |first1= S. |last2= Meinl |first2= T. |last3= Wiswedel |first3= B. |last4= De Figueiredo |first4= L. F. |last5= Berthold |first5= M. |last6= Steinbeck |first6= C. |doi= 10.1186/1471-2105-14-257 |title= KNIME-CDK: Workflow-driven Cheminformatics |journal= BMC Bioinformatics |volume= 14 |pages= 257 |year= 2013 |pmid= 24103053 |pmc= 3765822 |doi-access= free }} and for Excel, called LICSS ([https://github.com/KevinLawson/excel-cdk]).{{Cite journal |last1= Lawson |first1= K. R. |last2= Lawson |first2= J. |doi= 10.1186/1758-2946-4-3 |title= LICSS - a chemical spreadsheet in microsoft excel |journal= Journal of Cheminformatics |volume= 4 |issue= 1 |pages= 3 |year= 2012 |pmid= 22301088 |pmc =3310842 |doi-access= free }}
In 2008, pieces of GPL-licensed code were removed from the library. Although that code was independent of the main CDK library and no copylefting was involved, the ChemoJava project was created to reduce confusion among users.[https://github.com/egonw/chemojava ChemoJava]
Major features
=Chemoinformatics=
- 2D molecule editor and generator
- 3D geometry generation
- ring finding{{cite journal|last1=Berger|first1=Franziska|last2=Flamm|first2=Christoph|last3=Gleiss|first3=Petra M.|last4=Leydold|first4=Josef|last5=Stadler|first5=Peter F.|title=Counterexamples in Chemical Ring Perception|journal=Journal of Chemical Information and Computer Sciences|date=March 2004|volume=44|issue=2|pages=323–331|doi=10.1021/ci030405d|pmid=15032507|url=http://ul.qucosa.de/api/qucosa%3A33096/attachment/ATT-0/}}{{cite journal|last1=May|first1=John W|last2=Steinbeck|first2=Christoph|title=Efficient ring perception for the Chemistry Development Kit|journal=Journal of Cheminformatics|date=2014|volume=6|issue=1|pages=3|doi=10.1186/1758-2946-6-3|pmid=24479757|pmc=3922685 |doi-access=free }}
- substructure search using exact structures and a SMILES arbitrary target specification (SMARTS)-like query language
- QSAR descriptor calculation{{cite journal |last1=Steinbeck |first1=C. |last2=Hoppe |first2=C. |last3=Kuhn |first3=S. |last4=Floris |first4=M. |last5=Guha |first5=R. |last6=Willighagen |first6=E. L. |title=Recent developments of the chemistry development kit (CDK) — an open-source java library for chemo- and bioinformatics |journal=Curr. Pharm. Des. |volume=12 |issue=17 |pages=2111–20 |year=2006 |pmid=16796559 |doi=10.2174/138161206777585274 |url=http://www.benthamdirect.org/pages/content.php?CPD/2006/00000012/00000017/0005B.SGM |url-status=dead |archive-url=https://web.archive.org/web/20110725062137/http://www.benthamdirect.org/pages/content.php?CPD%2F2006%2F00000012%2F00000017%2F0005B.SGM |archive-date=2011-07-25 |hdl=2066/35445 |hdl-access=free }}{{cite journal |last1=Guangli |first1=M. |last2=Yiyu |first2=C. |title=Predicting Caco-2 permeability using support vector machine and chemistry development kit |journal=J Pharm Pharm Sci |volume=9 |issue=2 |pages=210–21 |year=2006 |pmid=16959190 |url=https://www.ualberta.ca/~csps/JPPS9_2/Dr_Guangli/MS_538.htm }}
- fingerprint calculation, including the ECFP and FCFP fingerprints{{cite journal |last1=Clark |first1=Alex M |last2=Sarker |first2=Malabika |author-link2=Malabika Sarker |last3=Ekins |first3=Sean |year=2014 |title=New target prediction and visualization tools incorporating open source molecular fingerprints for TB Mobile 2.0 |journal=Journal of Cheminformatics |volume=6 |pages=38 |doi=10.1186/s13321-014-0038-2 |pmc=4190048 |pmid=25302078 |doi-access=free }}
- force field calculations
- input and output of many chemical file formats, including simplified molecular-input line-entry system (SMILES), Chemical Markup Language (CML), and chemical table file (MDL)
- structure generators{{Cite journal |last1= Peironcely |first1= J. E. |last2= Rojas-Chertó |first2= M. |last3= Fichera |first3= D. |last4= Reijmers |first4= T. |last5= Coulier |first5= L. |last6= Faulon |first6= J. L. |last7= Hankemeier |first7= T. |doi= 10.1186/1758-2946-4-21 |title= OMG: Open molecule generator |journal= Journal of Cheminformatics |volume= 4 |issue= 1 |pages= 21 |year= 2012 |pmid= 22985496 |pmc= 3558358 |doi-access= free }}
- International Chemical Identifier support, via JNI-InChI
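Molecular fingerprints such as those listed above are bit vectors, and a common way to compare two of them is the Tanimoto coefficient: the number of bits set in both, divided by the number of bits set in either. The following is a minimal sketch in plain Java, assuming fingerprints are held as <code>java.util.BitSet</code> objects. The class <code>FingerprintSimilarity</code> and its methods are hypothetical names chosen for illustration; they are not part of the CDK API, and the bit positions are made up rather than computed from real molecules.

```java
import java.util.BitSet;

/** Illustrative only: hypothetical helper, not part of the CDK API. */
final class FingerprintSimilarity {

    /**
     * Tanimoto coefficient of two bit-vector fingerprints:
     * (bits set in both) / (bits set in either).
     * Assumption: two empty fingerprints are given a similarity of 0.0.
     */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);               // bits set in both inputs
        BitSet union = (BitSet) a.clone();
        union.or(b);                       // bits set in at least one input
        int unionBits = union.cardinality();
        if (unionBits == 0) {
            return 0.0;                    // convention chosen for this sketch
        }
        return (double) intersection.cardinality() / unionBits;
    }

    /** Builds a fingerprint from the positions of its set bits. */
    static BitSet fromSetBits(int... positions) {
        BitSet bits = new BitSet();
        for (int position : positions) {
            bits.set(position);
        }
        return bits;
    }

    public static void main(String[] args) {
        // Made-up bit positions standing in for two molecules' fingerprints.
        BitSet first = fromSetBits(1, 5, 9, 42);
        BitSet second = fromSetBits(5, 9, 77);
        // 2 shared bits out of 5 distinct bits in total.
        System.out.println(tanimoto(first, second)); // prints 0.4
    }
}
```

A value of 1.0 means the two fingerprints are identical, and values near 0 mean they share few features. Fingerprints produced by a real toolkit would take the place of the hand-built bit sets.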
=Bioinformatics=
- protein active site detection
- cognate ligand detection{{Cite journal |doi= 10.1016/j.jmb.2006.09.041 |last1= Bashton |first1= M. |last2= Nobeli |first2= I. |last3= Thornton |first3= J. M. |title= Cognate Ligand Domain Mapping for Enzymes |journal= Journal of Molecular Biology |volume= 364 |issue= 4 |pages= 836–52 |year= 2006 |pmid= 17034815|doi-access= free }}
- metabolite identification{{Cite journal |last1= Rojas-Cherto |first1= M. |last2= Kasper |first2= P. T. |last3= Willighagen |first3= E. L. |last4= Vreeken |first4= R. J. |last5= Hankemeier |first5= T. |last6= Reijmers |first6= T. H. |doi= 10.1093/bioinformatics/btr409 |title= Elemental composition determination based on MSn |journal= Bioinformatics |volume= 27 |issue= 17 |pages= 2376–2383 |year= 2011 |pmid= 21757467 |doi-access= free }}
- pathway databases
- 2D and 3D protein descriptors{{cite journal |doi= 10.1186/s12859-015-0586-0 |title= ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins |journal= BMC Bioinformatics |volume= 16 |pages= 162 |year= 2015 |last1= Ruiz-Blanco |first1= Yasser B |last2= Paz |first2= Waldo |last3= Green |first3= James |last4= Marrero-Ponce |first4= Yovani |pmid=25982853 |pmc=4432771 |doi-access= free }}
=General=
- Python wrapper; see Cinfony
- Ruby wrapper
- active user community
See also
{{Portal|Free and open-source software}}
{{Scholia|topic}}
- Bioclipse – an Eclipse–RCP based chemo-bioinformatics workbench
- Blue Obelisk
- JChemPaint – Java 2D molecule editor, applet and application
- Jmol – Java 3D renderer, applet and application
- JOELib – Java version of Open Babel, OELib
- List of free and open-source software packages
- List of software for molecular mechanics modeling
References
{{Reflist|30em}}
External links
- {{Official website|github.com/cdk}}
- [https://archive.today/20121225081348/http://apps.sourceforge.net/mediawiki/cdk CDK Wiki] – the community wiki
- [https://web.archive.org/web/20101107071751/http://pele.farmbio.uu.se/planetcdk/ Planet CDK] – a blog planet
- [https://www.simolecule.com/cdkdepict/depict.html CDK Depict]
- [http://www.openscience.org/ OpenScience.org]
{{Chemistry software}}
Category:Bioinformatics software
Category:Chemistry software for Linux
Category:Computational chemistry software
Category:Free chemistry software
Category:Free software programmed in Java (programming language)