KNIME
{{Short description|Data science software}}
{{Use dmy dates|date=May 2025}}
{{Infobox software
| name = KNIME
| logo = KNIMELogoTM.svg
| logo upright = 1
| screenshot = Knime 5.2 GUI.png
| caption =
| released = {{Start date and age|2006}}
| latest release version = 5.4
| latest release date = {{Start date and age|2024|12|06|df=yes}}{{cite web |url=https://www.knime.com/blog/whats-new-knime-analytics-platform-54 |title=What's New in KNIME Analytics Platform 5.4 |website=KNIME.com |access-date=2024-12-07}}
| developer = KNIME
| programming language = Java
| operating system = Linux, macOS, Windows
| language = English
| genre = guided analytics, enterprise reporting, business intelligence, data mining, deep learning, data analysis, text mining, big data
| license = GNU General Public License
| website = {{URL|www.knime.com}}
}}
KNIME ({{IPAc-en|n|aɪ|m|audio=en-us-KNIME.oga}}), the Konstanz Information Miner,{{cite journal |last1=Berthold |first1=Michael R. |last2=Cebron |first2=Nicolas |last3=Dill |first3=Fabian |last4=Gabriel |first4=Thomas R. |last5=Kötter |first5=Tobias |last6=Meinl |first6=Thorsten |last7=Ohl |first7=Peter |last8=Thiel |first8=Kilian |last9=Wiswedel |first9=Bernd |title=KNIME - the Konstanz information miner |journal=ACM SIGKDD Explorations Newsletter |date=16 November 2009 |volume=11 |issue=1 |pages=26 |doi=10.1145/1656274.1656280 |s2cid=408188 |url=http://centaur.reading.ac.uk/6139/1/2006_DiFatta06-MASS-ISC.pdf}} is a data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining "Building Blocks of Analytics" concept. A graphical user interface and the use of Java Database Connectivity (JDBC) allow users to assemble nodes that blend different data sources, including preprocessing (extract, transform, load (ETL)), for modeling, data analysis and visualization with little or no programming.{{Citation needed|date=May 2024}} It is free and open-source software released under the GNU General Public License.
Since 2006, KNIME has been used in pharmaceutical research,{{cite journal |first1=Abhishek |last1=Tiwari |first2=Arvind K.T. |last2=Sekhar |title=Workflow based framework for life science informatics |journal=Computational Biology and Chemistry |volume=31 |issue=5–6 |pages=305–319 |date=October 2007 |doi=10.1016/j.compbiolchem.2007.08.009 |pmid=17931570}} and in other areas including customer relationship management (CRM) and data analysis, business intelligence, text mining and financial data analysis.{{cite journal |first1=Coro |last1=Chasco |first2=Fernando H. |last2=Taques |first3=Flávio H. |last3=Taques |title=Integrating big data with KNIME as an alternative without programming code: an application to the PATSTAT patent database |journal=Journal of Geographical Systems |volume=27 |pages=31–61 |date=January 2025 |issue=1 |doi=10.1007/s10109-024-00445-0|doi-access=free|bibcode=2025JGS....27...31T }} Attempts have also been made to use KNIME as a robotic process automation (RPA) tool.{{cite web |url=https://www.airslate.com/bot/explore/knime-analytics-platform-bot |title=KNIME Analytics Platform Bot |access-date=2021-04-12 |archive-date=2021-06-03 |archive-url=https://web.archive.org/web/20210603175928/https://www.airslate.com/bot/explore/knime-analytics-platform-bot |url-status=live}}
KNIME's headquarters are in Zurich, with further offices in Konstanz, Berlin, and Austin (USA).{{Citation needed|date=May 2024}}
History
Development of KNIME began in January 2004 as an open-source platform, with a team of software engineers at the University of Konstanz. The original team, headed by Michael Berthold, came from a Silicon Valley pharmaceutical industry software company. The initial goal was to create a modular, highly scalable and open data processing platform that allows easy integration of different modules for data loading, processing, transformation, analysis, and visual exploration, without a focus on any one application area. The platform was intended for collaboration, research, and the integration of various other data analysis projects.{{cite web |url=https://www.knime.com/open-for-innovation |title=Open for Innovation |website=KNIME.com |url-status=live |archive-url=https://web.archive.org/web/20200310023208/https://www.knime.com/open-for-innovation |archive-date=2020-03-10 |access-date=2020-02-24}}
In 2006, the first version of KNIME was released. Several pharmaceutical companies began using KNIME, and several life science software vendors began integrating their tools into the platform.{{cite web |author1= |date= |url=http://www.tripos.com/knime |title=Tripos |website=Tripos, Inc. |url-status=dead |archive-url=https://web.archive.org/web/20110717104252/http://www.tripos.com/knime/ |archive-date=2011-07-17}}{{cite web |author1= |date=2005–2025 |url=https://www.schrodinger.com/platform/products/knime-extensions/ |title=KNIME Extensions: Modular, highly configurable framework for easy workflow automation and data analysis |website=Schrödinger, Inc. |url-status=live |archive-url=https://web.archive.org/web/20090925095936/http://www.schrodinger.com/ProductDescription.php?mID=6&sID=33&cID=0 |archive-date=2009-09-25 |access-date=22 May 2025}}[http://www.infocom.co.jp/bio/develop/jchemextension_en.html ChemAxon] {{webarchive|url=https://web.archive.org/web/20110717125630/http://www.infocom.co.jp/bio/develop/jchemextension_en.html |date=2011-07-17}}{{Cite web |url=http://enalosplus.novamechanics.com/ |title=NovaMechanics Ltd. |access-date=2017-11-14 |url-status=live |archive-url=https://web.archive.org/web/20230418200803/http://enalosplus.novamechanics.com/ |archive-date=2023-04-18}}{{Cite web |url=http://www.treweren.com/ |title=Treweren Consultants |access-date=2010-12-07 |url-status=live |archive-date=2017-04-24 |archive-url=https://web.archive.org/web/20170424123140/http://treweren.com/}} Later that year, after an article in the German magazine c't (Datenbank-Mosaik: Data Mining oder die Kunst, sich aus Millionen Datensätzen ein Bild zu machen, c't 20/2006, pp. 164ff, Heise Verlag), users from a number of other areas{{Cite web |url=http://tech.knime.org/forum |title=Forum auf der KNIME Webseite |access-date=2010-12-07 |archive-date=2017-04-26 |archive-url=https://web.archive.org/web/20170426063901/http://tech.knime.org/forum |url-status=live}}{{Cite web |url=http://www.pervasivedatarush.com/products/Pages/DataRushforKnime.aspx |title=Pervasive |access-date=2010-12-07 |url-status=dead |archive-url=https://web.archive.org/web/20100829035639/http://www.pervasivedatarush.com/products/Pages/DataRushforKnime.aspx |archive-date=2010-08-29}} also adopted the platform. As of 2012, KNIME was in use by more than 15,000 users who regularly retrieved updates (downloads alone were not counted), in the life sciences and at banks, publishers, car manufacturers, telcos, consulting firms, and various other industries, as well as by a large number of research groups worldwide.{{Citation needed|date=May 2024}}{{Update span|date=May 2024}} Updates to KNIME Server and KNIME Big Data Extensions provide support for Apache Spark 2.3, Parquet and HDFS-type storage.{{Citation needed|date=May 2024}}
For the sixth year in a row, KNIME has been placed as a leader for data science and machine learning platforms in Gartner's Magic Quadrant.{{Citation needed|date=May 2024}}{{Which year?|date=May 2024}}
Design philosophy, features
KNIME software follows these design principles and features:{{cite journal |last1=Berthold |first1=Michael R. |last2=Cebron |first2=Nicolas |last3=Dill |first3=Fabian T. |last4=Gabriel |first4=Thomas R. |last5=Kötter |first5=Tobias |last6=Meinl |first6=Thorsten |last7=Ohl |first7=Peter |last8=Thiel |first8=Kilian |last9=Wiswedel |first9=Bernd |date=16 November 2009 |title=KNIME-the Konstanz information miner: version 2.0 and beyond |journal=ACM SIGKDD Explorations Newsletter |volume=11 |issue=1 |pages=26–31 |doi=10.1145/1656274.1656280}}
- Visual, Interactive Framework: KNIME Software prioritizes a user-friendly and intuitive approach to data analysis. This is achieved through a visual and interactive framework in which data flows are combined using a drag-and-drop interface. Users can develop customized and interactive applications by creating data pipelines that range from simple to advanced and highly automated. These may include, for example, access to databases, machine learning libraries, logic for workflow control (e.g., loops and switches), abstraction (e.g., interactive widgets), invocation, dynamic data apps, integrated deployment, or error handling.
- Modularity: processing units and data containers should remain independent of each other. This design choice enables easy distribution of computation and allows for the independent development of different algorithms. Data types within KNIME are encapsulated, meaning no types are predefined. This makes it easier to add new data types and integrate them with existing types, together with type-specific renderers and comparators. This principle also makes it possible to inspect the results after every individual data operation.
- Extensibility: KNIME Software is designed to be extensible. Adding new processing nodes or views is made simple through a plug-in mechanism. This mechanism ensures that users can distribute their custom functionalities without the need for complicated install or uninstall procedures.
- Interleaving No-Code with Code: the platform supports integrating both visual programming (no-code) and script-based programming (e.g., Python, R, JavaScript) approaches to data analysis. This design principle is termed low-code.
- Automation and Scalability: parameterization via flow variables and the encapsulation of workflow segments in components, for example, help reduce manual work and errors in analyses. Further, the scheduling of workflow execution (available in KNIME Business Hub and KNIME Community Hub for Teams) reduces dependency on human resources. Examples of scalability include the ability to handle large datasets (millions of rows), to execute multiple processes simultaneously out of the box, and to reuse workflow segments.
- Full Usability: because it is open source, KNIME Analytics Platform is fully usable free of charge, with no limited trial periods.
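The modularity and flow-variable principles listed above can be illustrated with a short Python sketch. It is not KNIME code and does not use any KNIME API: the names `Node`, `Workflow`, `row_filter` and `add_tax`, as well as the sample data, are hypothetical and chosen only for illustration. Each node is independent of its neighbours, shared parameters play the role of flow variables, and every node keeps its output so that intermediate results can be inspected.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Table = List[Row]
FlowVars = Dict[str, object]  # shared parameters, analogous to flow variables


@dataclass
class Node:
    """Hypothetical processing unit that knows nothing about its neighbours."""
    name: str
    func: Callable[[Table, FlowVars], Table]
    output: Optional[Table] = None  # cached result, inspectable after execution


class Workflow:
    """Hypothetical linear pipeline of nodes sharing one set of flow variables."""

    def __init__(self, nodes: List[Node], flow_vars: FlowVars):
        self.nodes = nodes
        self.flow_vars = flow_vars

    def execute(self, source: Table, upto: Optional[str] = None) -> Table:
        """Run nodes in order, stopping after the node named `upto` if given."""
        data = source
        for node in self.nodes:
            if node.output is None:  # nodes that already ran are not re-executed
                node.output = node.func(data, self.flow_vars)
            data = node.output
            if node.name == upto:
                break
        return data


def row_filter(table: Table, fv: FlowVars) -> Table:
    # Keep rows whose amount reaches the threshold held in a flow variable.
    return [r for r in table if r["amount"] >= fv["min_amount"]]


def add_tax(table: Table, fv: FlowVars) -> Table:
    # Append a derived column; the tax rate also comes from a flow variable.
    return [{**r, "gross": round(r["amount"] * (1 + fv["tax_rate"]), 2)} for r in table]


rows = [{"id": 1, "amount": 50.0}, {"id": 2, "amount": 200.0}, {"id": 3, "amount": 120.0}]
wf = Workflow(
    [Node("filter", row_filter), Node("tax", add_tax)],
    flow_vars={"min_amount": 100.0, "tax_rate": 0.2},
)
partial = wf.execute(rows, upto="filter")  # selective execution of the first step
final = wf.execute(rows)                   # remaining step runs; the filter is reused
```

After the first call only the filter node has run, and its output (the rows with `id` 2 and 3) can be inspected before the rest of the pipeline executes. The cache in this sketch is never invalidated when the input changes, which a real workflow engine would need to handle.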
Internals
KNIME allows users to visually create data flows (or pipelines), selectively execute some or all analysis steps, and later inspect the results and models using interactive widgets and views. KNIME is written in Java and based on Eclipse. It uses an extension mechanism to add plug-ins that provide additional functions. The core version includes hundreds of modules for data integration (file input/output (I/O), database nodes supporting all common database management systems through JDBC or native connectors: SQLite, MS-Access, SQL Server, MySQL, Oracle, PostgreSQL, Vertica and H2), data transformation (filter, converter, splitter, combiner, joiner), and the commonly used methods of statistics, data mining, analysis and text analytics. Visualization is supported with the Report Designer extension. KNIME workflows can be used as data sets to create report templates that can be exported to document formats such as doc, ppt, xls, pdf and others. Other KNIME capabilities include:
- KNIME's core architecture allows processing of large data volumes that are limited only by the available hard disk space, not by the available RAM. For example, KNIME allows analyzing 300 million customer addresses, 20 million cell images, and 10 million molecular structures.
- Added plug-ins allow integrating methods for text mining, image mining, time series analysis, and networking.
- KNIME integrates various other open-source projects, e.g., machine learning algorithms from Weka, H2O.ai, Keras, Spark, the R project and LIBSVM; plotly, JFreeChart, ImageJ, and the Chemistry Development Kit.{{Cite journal |last1=Beisken |first1=S. |last2=Meinl |first2=T. |last3=Wiswedel |first3=B. |last4=De Figueiredo |first4=L. F. |last5=Berthold |first5=M. |last6=Steinbeck |first6=C. |doi=10.1186/1471-2105-14-257 |title=KNIME-CDK: Workflow-driven Cheminformatics |journal=BMC Bioinformatics |volume=14 |pages=257 |year=2013 |pmid=24103053| pmc=3765822 |doi-access=free}}
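The general idea behind disk-bound rather than memory-bound processing, mentioned in the first item above, can be sketched in Python. This is an assumption-laden illustration, not KNIME's actual buffering mechanism: the helper names `spill_to_disk`, `read_chunks` and `generate_rows` are hypothetical, and a CSV file stands in for whatever on-disk format a real engine would use. Rows are written to disk as they are produced and read back in fixed-size chunks, so only one chunk is held in memory at a time.

```python
import csv
import os
import tempfile
from typing import Iterable, Iterator, List


def spill_to_disk(rows: Iterable[List[str]]) -> str:
    """Write rows to a temporary CSV file one at a time and return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow(row)
    return path


def read_chunks(path: str, chunk_size: int) -> Iterator[List[List[str]]]:
    """Yield the file contents in lists of at most `chunk_size` rows."""
    with open(path, newline="") as handle:
        chunk: List[List[str]] = []
        for row in csv.reader(handle):
            chunk.append(row)
            if len(chunk) == chunk_size:
                yield chunk
                chunk = []
        if chunk:  # final, possibly shorter chunk
            yield chunk


def generate_rows(n: int) -> Iterator[List[str]]:
    """Produce n synthetic rows lazily, without building a list in memory."""
    for i in range(n):
        yield [str(i), str(i * 2)]


path = spill_to_disk(generate_rows(10_000))
total = 0
chunks_seen = 0
for chunk in read_chunks(path, chunk_size=1_000):
    chunks_seen += 1
    total += sum(int(value) for _, value in chunk)  # aggregate the second column
os.remove(path)
```

With this layout the peak memory use depends on `chunk_size`, not on the number of rows, which is the property the list item describes.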
KNIME is implemented in Java and allows wrappers that call other code. It also provides nodes that run Java, Python, R, Ruby and other code fragments.{{Citation needed|date=May 2024}}
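A wrapper that passes tabular data to an external code fragment and collects the result can be sketched as follows. This is a generic Python illustration under stated assumptions, not the way KNIME's scripting nodes are implemented: the function `run_snippet`, the `SNIPPET` text and the JSON-over-standard-streams exchange format are all hypothetical.

```python
import json
import subprocess
import sys

# Code fragment executed in a separate interpreter process. It reads rows
# from standard input, adds a column, and writes the rows to standard output.
SNIPPET = """
import json, sys
rows = json.load(sys.stdin)
for row in rows:
    row["length"] = len(row["name"])
json.dump(rows, sys.stdout)
"""


def run_snippet(rows, snippet=SNIPPET):
    """Send rows to an external script as JSON and return the rows it emits."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],  # any other interpreter could be wrapped
        input=json.dumps(rows),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the fragment fails
    )
    return json.loads(result.stdout)


output = run_snippet([{"name": "aspirin"}, {"name": "caffeine"}])
```

Running the fragment in a separate process keeps the host application independent of the language and libraries used by the fragment, at the cost of serializing the data in both directions.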
License
In 2024, KNIME version 5.3 was released under the same GPLv3 license as previous versions.{{cite web |author1= |date=2024-08-03 |url=https://www.knime.com/downloads/full-license |title=KNIME 5.3 License Terms and Conditions |url-status=live |archive-url=https://web.archive.org/web/20240803064319/https://www.knime.com/downloads/full-license |archive-date=2024-08-03 |access-date=2024-08-03}}
As of version 2.1, KNIME is released under the GPLv3 license, with an exception that allows others to use the well-defined node application programming interface (API) to add proprietary extensions.[http://knime.org/blog/knime-210-released KNIME 2.1.0 released] {{webarchive|url=https://web.archive.org/web/20100417223831/http://knime.org/blog/knime-210-released |date=2010-04-17}}{{Update span|date=May 2024}} This allows commercial software vendors to add wrappers calling their tools from KNIME.
Courses
KNIME allows data analysis to be performed without programming skills. Several free online courses are provided.{{cite web |url=https://www.knime.com/learning |title=KNIME Learning Center |publisher=KNIME |access-date=December 4, 2024 |archive-date=November 23, 2024 |archive-url=https://web.archive.org/web/20241123135047/https://www.knime.com/learning |url-status=live}}
See also
- Weka – machine-learning algorithms that can be integrated in KNIME
- ELKI – data mining framework with many clustering algorithms
- Keras – neural network library
- Orange – an open-source data visualization, machine learning and data mining toolkit with a similar visual programming front-end
- List of free and open-source software packages
References
{{Reflist}}
External links
- {{Official website|www.knime.com}}
- {{GitHub|knime|KNIME}}
- [http://hub.knime.com/ KNIME Hub] - Official community platform to search and find nodes, components, workflows and collaborate on new solutions
- [https://nodepit.com/ Nodepit] - KNIME node collection supporting versioning and node installation
{{DEFAULTSORT:Knime}}
Category:Data mining and machine learning software
Category:Extract, transform, load tools
Category:Free bioinformatics software
Category:Free software programmed in Java (programming language)