Collective Knowledge (software)

{{Short description|Open-source framework for researchers}}

{{Infobox software

| name = Collective Knowledge (CK)

| logo = Collective Knowledge and cTuning logo.png

| developer = Grigori Fursin and the cTuning foundation

| released = {{start date and age|2015|df=yes}}

| latest release version = 2.6.3 (discontinued in favor of the new Collective Mind framework{{citation |url=https://pypi.org/project/ck |title=CK package at PyPI}})

| latest release date = {{release date|2022|11|30}}

| operating system = Linux, Mac OS X, Microsoft Windows, Android

| genre = Knowledge management, FAIR data, MLOps, Data management, [https://cTuning.org/ae Artifact Evaluation], Package management system, Scientific workflow system, DevOps, Continuous integration, Reproducibility

| programming language = Python

| license = Apache License for CK version 2.0; 3-clause BSD License for CK version 1.0

| website = {{URL|https://github.com/ctuning/ck}}, {{URL|https://cknow.io}}

}}

The Collective Knowledge (CK) project is an open-source framework and repository to enable collaborative, reproducible and sustainable research and development of complex computational systems.

{{cite conference

| first=Grigori

| last=Fursin

| author-link=Grigori Fursin

| title=Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs

| conference=Philosophical Transactions of the Royal Society

| date=29 March 2021

| doi=10.1098/rsta.2020.0211

| arxiv=2011.01149

}} CK is a small, portable, customizable and decentralized infrastructure that helps researchers and practitioners:

  • share their code, data and models as reusable Python components and automation actions [https://cKnowledge.io/actions Reusable CK components and actions to automate common research tasks] with a unified JSON API, JSON meta information, and a UID, based on FAIR principles
  • assemble portable workflows from shared components (such as multi-objective autotuning and design space exploration)
  • automate, crowdsource and reproduce benchmarking of complex computational systems{{citation |url=https://cKnowledge.io |title=Online repository with reproduced results}}
  • unify predictive analytics (scikit-learn, R, DNN)
  • enable reproducible and interactive papers{{citation |url=https://cKnowledge.io/reproduced-papers |title=Index of reproduced papers}}
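
The idea of components that carry JSON meta information and a UID, and are reached through one unified entry point, can be illustrated with a minimal sketch. Everything here is hypothetical and written only for illustration: the `make_component` and `access` functions, the registry dictionary and the 16-character UID format are assumptions, not CK's actual API.

```python
import json
import uuid


def make_component(alias, tags):
    """Build a hypothetical component record: a UID, a readable alias and JSON meta."""
    return {
        "uid": uuid.uuid4().hex[:16],   # illustrative UID format (assumption)
        "alias": alias,
        "meta": {"tags": sorted(tags)},
    }


def access(registry, request):
    """Single entry point: a dictionary goes in, a dictionary with a 'return' code comes out."""
    action = request.get("action")
    if action == "add":
        component = make_component(request["alias"], request.get("tags", []))
        registry[component["uid"]] = component
        return {"return": 0, "uid": component["uid"]}
    if action == "search":
        wanted = set(request.get("tags", []))
        found = [c for c in registry.values() if wanted <= set(c["meta"]["tags"])]
        return {"return": 0, "components": found}
    # Unknown actions are reported through the same dictionary format.
    return {"return": 1, "error": f"unknown action: {action!r}"}


registry = {}
added = access(registry, {"action": "add", "alias": "image-classifier", "tags": ["model", "dnn"]})
found = access(registry, {"action": "search", "tags": ["dnn"]})
print(json.dumps(found, indent=2))
```

Because every request and response is a plain dictionary, the same call can be serialized to JSON and issued from a command line, a script or a web service without changing the component itself.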

Notable uses

  • ARM uses CK to accelerate computer engineering{{citation |url=https://github.com/ctuning/ck/wiki/Demo-ARM-TechCon'16 |title=ARM TechCon'16 presentation "Know Your Workloads: Design more efficient systems!"|author=Ed Plowman |author2=Grigori Fursin}}
  • Several ACM-sponsored conferences use CK to automate the Artifact Evaluation process{{citation |url=http://cTuning.org/ae |title=Artifact Evaluation for systems and machine learning conferences}}{{citation |url= https://learning.acm.org/techtalks/reproducibility | title=ACM TechTalk about reproducing 150 research papers and testing them in the real world}}
  • Imperial College (London) uses CK to automate and crowdsource compiler bug detection{{citation |url=http://es.iet.unipi.it/tetracom/content/uploads/Posters/TTP35.pdf |title=EU TETRACOM project to combine CK and CLSmith |access-date=2016-09-15 |archive-url=https://web.archive.org/web/20170305003204/http://es.iet.unipi.it/tetracom/content/uploads/Posters/TTP35.pdf |archive-date=2017-03-05 |url-status=dead }}
  • Researchers from the University of Cambridge used CK to help the community reproduce results of their publication in the International Symposium on Code Generation and Optimization (CGO'17) during Artifact Evaluation{{citation |url=https://github.com/SamAinsworth/reproduce-cgo2017-paper |title=Artifact Evaluation Reproduction for "Software Prefetching for Indirect Memory Accesses", CGO 2017, using CK|date=16 October 2022 }}
  • General Motors (USA) uses CK to crowd-benchmark convolutional neural network optimizations{{citation |url=http://github.com/dividiti/ck-caffe |title=GitHub development website for CK-powered Caffe|date=11 October 2022 }}{{citation |url=http://cknowledge.org/android-apps.html |title=Open-source Android application to let the community participate in collaborative benchmarking and optimization of various DNN libraries and models}}
  • The Raspberry Pi Foundation and the cTuning foundation released a CK workflow with a reproducible "live" paper to enable collaborative research into multi-objective autotuning and machine learning techniques{{citation |url=https://cknowledge.io/report/rpi3-crowd-tuning-2017-interactive|title=Live paper with reproducible experiments to enable collaborative research into multi-objective autotuning and machine learning techniques}}
  • IBM uses CK to reproduce quantum results from ''Nature''{{citation |url=https://www.linkedin.com/pulse/reproducing-quantum-results-from-nature-how-hard-could-lickorish |title=Reproducing quantum results from nature – how hard could it be?}}
  • CK is used to automate the MLPerf benchmark{{citation | url=https://cknowledge.io/c/solution/demo-obj-detection-coco-tf-cpu-benchmark-linux-portable-workflows |title=MLPerf crowd-benchmarking}}{{citation | url=https://github.com/mlcommons/ck/tree/master/ck/docs/mlperf-automation |title=MLPerf inference benchmark automation guide|date=17 October 2022 }}

Portable package manager for portable workflows

CK has an integrated cross-platform package manager that uses Python scripts, a JSON API and JSON meta-descriptions to automatically rebuild, on a user's machine, the software environment required to run a given research workflow.{{citation |url=https://cKnowledge.io/packages |title=List of shared CK packages}}
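
How JSON meta-descriptions can drive the rebuilding of an environment may be easier to see in a small sketch. It assumes that each package lists its dependencies under a `deps` key and that packages are installed in dependency order; the package names, the `deps` key and the `install_order` helper are hypothetical and do not reproduce CK's actual package format.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical JSON meta-descriptions: each package names what it depends on.
PACKAGES_JSON = """
{
  "compiler-x": {"deps": []},
  "lib-math":   {"deps": ["compiler-x"]},
  "dataset-a":  {"deps": []},
  "workflow-1": {"deps": ["lib-math", "dataset-a"]}
}
"""


def install_order(packages, target):
    """Return the packages needed by `target`, with dependencies listed first."""
    needed = {}
    stack = [target]
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed[name] = packages[name]["deps"]
        stack.extend(needed[name])
    # TopologicalSorter takes {node: predecessors} and yields predecessors first.
    return list(TopologicalSorter(needed).static_order())


packages = json.loads(PACKAGES_JSON)
order = install_order(packages, "workflow-1")
print(order)
```

A real package manager would also detect software already present on the machine and select versions; the sketch covers only the ordering step.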

Reproducibility of experiments

CK supports reproducibility of experimental results through community involvement, similar to the approach used in Wikipedia and in physics. Whenever a new workflow with all its components is shared via GitHub, anyone can try it on a different machine, in a different environment and with slightly different choices (compilers, libraries, data sets). Whenever unexpected or wrong behavior is encountered, the community explains it, fixes the affected components and shares them back.
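
The process of rerunning a shared workflow under different choices and flagging unexpected behavior can be sketched as follows. This is an illustration under stated assumptions, not CK functionality: the `find_unexpected` helper, the 5% tolerance, the environment fields and the result values are all invented for the example.

```python
import math


def find_unexpected(reference, runs, rel_tol=0.05):
    """Return the environments whose result deviates from the reference value."""
    unexpected = []
    for run in runs:
        if not math.isclose(run["result"], reference, rel_tol=rel_tol):
            unexpected.append(run["env"])
    return unexpected


# Hypothetical results of one workflow rerun by different users.
runs = [
    {"env": {"compiler": "gcc", "dataset": "small"}, "result": 1.00},
    {"env": {"compiler": "clang", "dataset": "small"}, "result": 1.02},
    {"env": {"compiler": "gcc", "dataset": "large"}, "result": 1.40},
]

flagged = find_unexpected(1.00, runs)
print(flagged)
```

Recording the environment next to each result is what lets the community see which choice (here, the data set) is associated with the deviation before a component is fixed and shared back.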

References

{{Reflist|30em}}