Nextflow


{{Infobox software

| name = Nextflow

| logo = File:Logo Nextflow (new).png

| caption =

| author = Paolo Di Tommaso

| developer = Seqera Labs, Centre for Genomic Regulation

| released = [https://github.com/nextflow-io/nextflow/releases/tag/v0.2.2 {{Start date and age|2013|04|09}}]

| latest release version = v23.10.1

| latest release date = {{Start date and age|2024|01|12}}

| latest preview version = v24.02.0-edge

| latest preview date = {{Start date and age|2024|03|09}}

| repo = https://github.com/nextflow-io/nextflow

| programming language = Groovy, Java

| operating system = Linux, macOS, WSL

| size =

| genre = Scientific workflow system, Dataflow programming, Big data

| license = Apache License 2.0

| website = {{URL|https://nextflow.io}}

}}

Nextflow is a scientific workflow system predominantly used for bioinformatic data analysis. It establishes standards for programmatically creating a series of dependent computational steps and facilitates their execution on various local and cloud resources.{{cite book |doi=10.1007/978-1-4939-9074-0_24 |pmc=7613310 |pmid=31278683 |chapter=Scalable Workflows and Reproducible Data Analysis for Genomics |title=Evolutionary Genomics |series=Methods in Molecular Biology |year=2019 |last1=Strozzi |first1=Francesco |last2=Janssen |first2=Roel |last3=Wurmus |first3=Ricardo |last4=Crusoe |first4=Michael R. |last5=Githinji |first5=George |last6=Di Tommaso |first6=Paolo |last7=Belhachemi |first7=Dominique |last8=Möller |first8=Steffen |last9=Smant |first9=Geert |last10=De Ligt |first10=Joep |last11=Prins |first11=Pjotr |volume=1910 |pages=723–745 |isbn=978-1-4939-9073-3 }}{{cite journal |doi=10.1093/bib/bbaa116 |pmid=34020539 |title=Comparison of high-throughput single-cell RNA sequencing data processing pipelines |year=2021 |last1=Gao |first1=Mingxuan |last2=Ling |first2=Mingyi |last3=Tang |first3=Xinwei |last4=Wang |first4=Shun |last5=Xiao |first5=Xu |last6=Qiao |first6=Ying |last7=Yang |first7=Wenxian |last8=Yu |first8=Rongshan |journal=Briefings in Bioinformatics |volume=22 |issue=3 }}

Purpose

Many scientific data analyses require a significant number of sequential processing steps. Custom scripts may suffice when developing new methods or infrequently running particular analyses, but they scale poorly to complex task successions or many samples.{{cite journal |last1=Wratten |first1=Laura |last2=Wilm |first2=Andreas |last3=Göke |first3=Jonathan |title=Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers |journal=Nature Methods |volume=18 |issue=10 |pages=1161–1168 |date=October 2021 |pmid=34556866 |doi=10.1038/s41592-021-01254-9 |s2cid=237616424 }}{{cite journal |doi=10.3390/genes13122280 |pmc=9777648 |pmid=36553546 |doi-access=free |title=Comparison of Metagenomics and Metatranscriptomics Tools: A Guide to Making the Right Choice |year=2022 |last1=Terrón-Camero |first1=Laura C. |last2=Gordillo-González |first2=Fernando |last3=Salas-Espejo |first3=Eduardo |last4=Andrés-León |first4=Eduardo |journal=Genes |volume=13 |issue=12 |page=2280 }}{{cite journal |doi=10.3389/fgene.2019.00614 |pmc=6609566 |pmid=31316552 |doi-access=free |title=Pipeliner: A Nextflow-Based Framework for the Definition of Sequencing Data Processing Pipelines |year=2019 |last1=Federico |first1=Anthony |last2=Karagiannis |first2=Tanya |last3=Karri |first3=Kritika |last4=Kishore |first4=Dileep |last5=Koga |first5=Yusuke |last6=Campbell |first6=Joshua D. |last7=Monti |first7=Stefano |journal=Frontiers in Genetics |volume=10 |page=614 }}

Scientific workflow systems like Nextflow allow an analysis to be formalized as a data analysis pipeline. Pipelines, also known as workflows, specify the order and conditions of computing steps. They are run by special-purpose programs, so-called workflow executors, which ensure predictable and reproducible behavior in various computing environments.{{cite journal |doi=10.1093/nar/gkac286 |pmc=9252820 |pmid=35536253 |title=BioUML—towards a universal research platform |year=2022 |last1=Kolpakov |first1=Fedor |last2=Akberdin |first2=Ilya |last3=Kiselev |first3=Ilya |last4=Kolmykov |first4=Semyon |last5=Kondrakhin |first5=Yury |last6=Kulyashov |first6=Mikhail |last7=Kutumova |first7=Elena |last8=Pintus |first8=Sergey |last9=Ryabova |first9=Anna |last10=Sharipov |first10=Ruslan |last11=Yevshin |first11=Ivan |last12=Zhatchenko |first12=Sergey |last13=Kel |first13=Alexander |journal=Nucleic Acids Research |volume=50 |issue=W1 |pages=W124–W131 }}{{cite journal |doi=10.1186/s12864-020-6714-x |pmc=7168977 |pmid=32306927 |title=Dolphin Next: A distributed data processing platform for high throughput genomics |year=2020 |last1=Yukselen |first1=Onur |last2=Turkyilmaz |first2=Osman |last3=Ozturk |first3=Ahmet Rasit |last4=Garber |first4=Manuel |last5=Kucukural |first5=Alper |journal=BMC Genomics |volume=21 |issue=1 |page=310 |doi-access=free }}{{cite journal |doi=10.1093/nar/gkab346 |pmc=8218198 |pmid=33978761 |title=The Dockstore: Enhancing a community platform for sharing reproducible and accessible computational protocols |year=2021 |last1=Yuen |first1=Denis |last2=Cabansay |first2=Louise |last3=Duncan |first3=Andrew |last4=Luu |first4=Gary |last5=Hogue |first5=Gregory |last6=Overbeck |first6=Charles |last7=Perez |first7=Natalie |last8=Shands |first8=Walt |last9=Steinberg |first9=David |last10=Reid |first10=Chaz |last11=Olunwa |first11=Nneka |last12=Hansen |first12=Richard |last13=Sheets |first13=Elizabeth |last14=O'Farrell |first14=Ash |last15=Cullion |first15=Kim |last16=O'Connor |first16=Brian D |last17=Paten |first17=Benedict |last18=Stein |first18=Lincoln |journal=Nucleic Acids Research |volume=49 |issue=W1 |pages=W624–W632 }}

Workflow systems also provide built-in solutions to common challenges of workflow development, such as the application to multiple samples, the validation of input and intermediate results, conditional execution of steps, error handling, and report generation. Advanced features of workflow systems may also include scheduling capabilities, graphical user interfaces for monitoring workflow executions, and the management of dependencies by containerizing the whole workflow or its components.{{cite journal |doi=10.1038/s41598-021-99288-8 |pmc=8569008 |pmid=34737383 |title=Design considerations for workflow management systems use in production genomics research and the clinic |year=2021 |last1=Ahmed |first1=Azza E. |last2=Allen |first2=Joshua M. |last3=Bhat |first3=Tajesvi |last4=Burra |first4=Prakruthi |last5=Fliege |first5=Christina E. |last6=Hart |first6=Steven N. |last7=Heldenbrand |first7=Jacob R. |last8=Hudson |first8=Matthew E. |last9=Istanto |first9=Dave Deandre |last10=Kalmbach |first10=Michael T. |last11=Kapraun |first11=Gregory D. |last12=Kendig |first12=Katherine I. |last13=Kendzior |first13=Matthew Charles |last14=Klee |first14=Eric W. |last15=Mattson |first15=Nate |last16=Ross |first16=Christian A. |last17=Sharif |first17=Sami M. |last18=Venkatakrishnan |first18=Ramshankar |last19=Fadlelmola |first19=Faisal M. |last20=Mainzer |first20=Liudmila S. |journal=Scientific Reports |volume=11 |issue=1 |page=21680 |bibcode=2021NatSR..1121680A }}{{cite journal |doi=10.1186/s12859-018-2446-1 |pmc=6264621 |pmid=30486782 |title=Developing reproducible bioinformatics analysis workflows for heterogeneous computing environments to support African genomics |year=2018 |last1=Baichoo |first1=Shakuntala |last2=Souilmi |first2=Yassine |last3=Panji |first3=Sumir |last4=Botha |first4=Gerrit |last5=Meintjes |first5=Ayton |last6=Hazelhurst |first6=Scott |last7=Bendou |first7=Hocine |last8=Beste |first8=Eugene de |last9=Mpangase |first9=Phelelani T. 
|last10=Souiai |first10=Oussema |last11=Alghali |first11=Mustafa |last12=Yi |first12=Long |last13=o'Connor |first13=Brian D. |last14=Crusoe |first14=Michael |last15=Armstrong |first15=Don |last16=Aron |first16=Shaun |last17=Joubert |first17=Fourie |last18=Ahmed |first18=Azza E. |last19=Mbiyavanga |first19=Mamana |last20=Heusden |first20=Peter van |last21=Magosi |first21=Lerato E. |last22=Zermeno |first22=Jennie |last23=Mainzer |first23=Liudmila Sergeevna |last24=Fadlelmola |first24=Faisal M. |last25=Jongeneel |first25=C. Victor |last26=Mulder |first26=Nicola |journal=BMC Bioinformatics |volume=19 |issue=1 |page=457 |doi-access=free }}

Typically, scientific workflow systems initially present a steep learning curve, because their features and complexities must be mastered in addition to the actual analysis. However, the standards and abstraction imposed by workflow systems ultimately improve the traceability of analysis steps, which is particularly relevant when collaborating on pipeline development, as is customary in scientific settings.{{cite journal |doi=10.1371/journal.pcbi.1008622 |pmc=7906312 |pmid=33630841 |title=Using prototyping to choose a bioinformatics workflow management system |year=2021 |last1=Jackson |first1=Michael |last2=Kavoussanakis |first2=Kostas |last3=Wallace |first3=Edward W. J. |journal=PLOS Computational Biology |volume=17 |issue=2 |pages=e1008622 |bibcode=2021PLSCB..17E8622J |doi-access=free }}

Characteristics

= Specification of workflows =

In Nextflow, pipelines are constructed from individual processes that work in parallel to perform computational tasks. Each process is defined with input requirements and output declarations. Instead of running in a fixed sequence, a process starts executing when all its input requirements are fulfilled. By specifying the output of one process as the input of another, a logical and sequential connection between processes is established.{{cite journal |doi=10.1051/jbio/2017029 |pmid=29412134 |title=Nextflow : Un outil efficace pour l'amélioration de la stabilité numérique des calculs en analyse génomique |year=2017 |last1=Tommaso |first1=Paolo Di |last2=Floden |first2=Evan W. |last3=Magis |first3=Cedrik |last4=Palumbo |first4=Emilio |last5=Notredame |first5=Cedric |journal=Biologie Aujourd'hui |volume=211 |issue=3 |pages=233–237 }}

This reactive implementation is a key design pattern of Nextflow and is also known as the functional dataflow model.{{Cite web |title=Nextflow Documentation - Channels |url=https://www.nextflow.io/docs/latest/channel.html?highlight=dataflow |access-date=6 June 2022 |website=docs.nextflow.io}}

Processes and entire workflows are programmed in a domain-specific language (DSL) that is provided by Nextflow and based on Apache Groovy.{{Cite web |title=Nextflow Documentation - Domain Specific Language (DSL) 2 |url=https://www.aacc.org/cln/articles/2020/march/next-generation-sequencing-bioinformatics-pipelines |access-date=6 June 2022 |website=docs.nextflow.io|date=March 2020 }} While Nextflow's DSL is used to declare the workflow logic, developers can use their scripting language of choice within a process and mix multiple languages in a workflow. It is also possible to port existing scripts and workflows to Nextflow. Supported scripting languages include bash, csh, ksh, Python, Ruby, and R. Any scripting language that uses the standard Unix shebang declaration (for example, #!/bin/bash) is compatible with Nextflow.
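As a hedged illustration of mixing languages, the following sketch embeds a Python script in a process by starting the script block with a shebang line. The process name count_chars and the input values are hypothetical, and the example assumes that python3 is available on the machine that runs the task.

```groovy
// Illustrative sketch only: count_chars and its inputs are made-up names.
process count_chars {
    input:
    val word

    output:
    stdout

    script:
    // The shebang line makes Nextflow run this block with Python instead of bash.
    """
    #!/usr/bin/env python3
    print(len("${word}"))
    """
}

workflow {
    // Each value becomes one independent task; results are printed as they arrive.
    Channel.of("Hello", "Ciao") | count_chars | view
}
```

The workflow logic stays in Nextflow's DSL, while the task itself is plain Python; the value of word is substituted into the script text before the task starts.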

Below is an example of a workflow consisting of only one process:

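// A process declares its inputs, its outputs, and the script that produces them.
process hello_world {
    input:
    val greeting

    output:
    path "${greeting}.txt"

    script:
    """
    echo "${greeting} World!" > ${greeting}.txt
    """
}

// The workflow block feeds each value of the channel to the process as a separate task.
workflow {
    Channel.of("Hello", "Ciao", "Hola", "Bonjour") | hello_world
}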
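The connection of processes through their inputs and outputs can be sketched in the same way. In the following minimal example, the process names to_upper and add_suffix and the input words are hypothetical. The second process has no fixed position in a sequence; it runs for each item once the first process has emitted it.

```groovy
// Illustrative sketch only: process names and inputs are made up.
process to_upper {
    input:
    val word

    output:
    stdout

    script:
    """
    printf '%s' "${word}" | tr '[:lower:]' '[:upper:]'
    """
}

process add_suffix {
    input:
    val shouted

    output:
    stdout

    script:
    """
    echo "${shouted}!"
    """
}

workflow {
    words = Channel.of("hello", "ciao")
    to_upper(words)
    // Using the output of to_upper as the input of add_suffix links the two processes.
    add_suffix(to_upper.out)
    add_suffix.out.view()
}
```

Because tasks for different items run independently, the order in which results are printed is not guaranteed to match the order of the input values.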

To enable easy collaboration on workflows, Nextflow natively supports source-code management systems and DevOps platforms including GitHub, GitLab, and others.{{Cite web |title=Nextflow Documentation - Pipeline Sharing |url=https://www.nextflow.io/docs/latest/sharing.html?highlight=scm |access-date=6 June 2022 |website=docs.nextflow.io}}

= Execution of workflows =

Nextflow's DSL allows workflows to be deployed and run across different computing environments without having to modify the pipeline code. Nextflow comes with specific executors for various platforms, including major cloud providers. It supports the following environments for pipeline execution:{{Cite web |title=Nextflow Documentation - Executors |url=https://www.nextflow.io/docs/latest/executor.html |access-date=6 June 2022 |website=docs.nextflow.io}}

Local: This is the default executor. Pipelines run on Linux or macOS, on the same computer where the pipeline is launched.
HPC workload managers: Nextflow supports workload managers such as Slurm, SGE, LSF, Moab, PBS Pro, PBS/Torque, HTCondor, NQSII, and OAR.
Kubernetes: Nextflow can be used with local or cloud-based Kubernetes implementations (GKE, EKS, or AKS).
Cloud batch services: Nextflow is compatible with AWS Batch{{Cite web |title=Nextflow Documentation - Amazon Cloud |url=https://www.nextflow.io/docs/latest/aws.html |access-date=6 June 2022 |website=docs.nextflow.io}} and Azure Batch.{{Cite web |title=Nextflow Documentation - Azure Cloud |url=https://www.nextflow.io/docs/latest/azure.html |access-date=6 June 2022 |website=docs.nextflow.io}}
Other environments: Nextflow can also be used with Apache Ignite, Google Life Sciences, and various container frameworks for portability.{{Cite web |title=Nextflow Documentation - Google Cloud |url=https://www.nextflow.io/docs/latest/google.html |access-date=6 June 2022 |website=docs.nextflow.io}}
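
One way to switch between such environments without touching the pipeline code is a configuration profile. The following nextflow.config fragment is a sketch under stated assumptions: the profile names and the queue name are hypothetical, and a real cluster would likely need further site-specific settings.

```groovy
// nextflow.config (illustrative fragment; profile and queue names are assumptions)
profiles {
    standard {
        // Run tasks on the machine where Nextflow was launched.
        process.executor = 'local'
    }
    cluster {
        // Submit each task as a job to a Slurm workload manager.
        process.executor = 'slurm'
        process.queue    = 'long'   // hypothetical queue name
    }
}
```

A profile would then be selected at launch time, for example with `nextflow run main.nf -profile cluster`, where main.nf stands in for the pipeline script.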

= Containers for portability across computing environments =

Nextflow integrates tightly with software containers. Workflows and single processes can utilize containers for their execution across different computing environments, eliminating the need for complex installation and configuration routines.{{Cite web |last=Di Tommaso |first=Paolo |date=14 October 2021 |title=The story of Nextflow: Building a modern pipeline orchestrator |url=https://elifesciences.org/labs/d193babe/the-story-of-nextflow-building-a-modern-pipeline-orchestrator |access-date=6 June 2022 |website=eLifeSciences.org}}

Nextflow supports container frameworks such as Docker, Singularity, Charliecloud, Podman, and Shifter. Container images can be automatically retrieved from external repositories when the pipeline is executed. Additionally, it was announced at Nextflow Summit 2022 that future versions of Nextflow would support a dedicated container provisioning service for better integration of customized containers into workflows.{{Cite web |title=Nextflow Documentation - Containers |url=https://www.nextflow.io/docs/latest/container.html |access-date=7 June 2022 |website=docs.nextflow.io}}{{Cite web |last=Di Tommaso |first=Paolo |date=13 October 2022 |title=Nextflow and the future of containers |url=https://www.youtube.com/watch?v=PTbiCVq0-sE?t=661 |access-date=17 November 2022 |website=YouTube}}
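
As a minimal sketch of this integration, the nextflow.config fragment below asks Nextflow to run every process inside a Docker container. The image name is only an example, and the fragment assumes that Docker is installed on the execution host; other engines are switched on through their own configuration scopes, such as singularity.enabled.

```groovy
// nextflow.config (illustrative fragment; the image name is an example)
process.container = 'ubuntu:22.04'   // image used for all processes unless overridden
docker.enabled    = true             // run each task inside a Docker container
```

With such a configuration the pipeline script itself does not change; the same processes run inside the container wherever a compatible container engine is available.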

Developmental history

Nextflow was originally developed at the Centre for Genomic Regulation in Spain and released as an open-source project on GitHub in July 2013.{{Cite web |title=Release Version 0.3.0 · nextflow-io/nextflow |url=https://github.com/nextflow-io/nextflow/releases/tag/v0.3.0 |access-date=31 May 2022 |website=GitHub |language=en}} In October 2018, the project license for Nextflow was changed from GPLv3 to Apache 2.0.{{Cite web |last=Di Tommaso |first=Paolo |date=24 October 2018 |title=Goodbye zero, Hello Apache! |url=https://www.nextflow.io/blog/2018/goodbye-zero-hello-apache.html |access-date=7 June 2022 |website=Nextflow.io/blog}}

In July 2018, Seqera Labs was launched as a spin-off from the Centre for Genomic Regulation. The company employs many of Nextflow's core developers and maintainers and provides commercial services and consulting with a focus on Nextflow.{{Cite web |last=Di Tommaso |first=Paolo |date=8 October 2019 |title=Introducing Nextflow Tower - Seamless monitoring of data analysis workflows from anywhere |url=https://seqera.io/blog/introducing-nextflow-tower/ |access-date=7 June 2022 |website=Seqera.IO}}

In July 2020, a major extension and revision of Nextflow's domain-specific language was introduced to allow for sub-workflows and additional improvements.{{Cite web |last=Di Tommaso |first=Paolo |date=24 July 2020 |title=Nextflow DSL 2 is here! |url=https://nextflow.io/blog/2020/dsl2-is-here.html |access-date=7 June 2022 |website=Nextflow.IO/blog}} In the same year, monthly downloads of Nextflow reached approximately 55,000.

Adoption and reception

= The ''nf-core'' community =

Nextflow has been adopted by several sequencing facilities, including the Centre for Genomic Regulation,{{Cite journal |last1=Di Tommaso |first1=Paolo |last2=Chatzou |first2=Maria |last3=Floden |first3=Evan |last4=Prieto Barja |first4=Pablo |last5=Palumbo |first5=Emilio |last6=Notredame |first6=Cedric |date=11 April 2017 |title=Nextflow enables reproducible computational workflows |url=https://www.nature.com/articles/nbt.3820 |journal=Nature Biotechnology |volume=35 |issue=4 |pages=316–319 |doi=10.1038/nbt.3820 |pmid=28398311 |s2cid=9690740 |access-date=7 June 2022|url-access=subscription }} the Quantitative Biology Center in Tübingen, the Francis Crick Institute, A*STAR Genome Institute of Singapore, and the Swedish National Genomics Infrastructure, as their preferred scientific workflow system. These facilities have collaborated to share, harmonize, and curate bioinformatic pipelines,{{cite journal |last1=Fellows Yates |first1=James A. |last2=Lamnidis |first2=Thiseas C. |last3=Borry |first3=Maxime |last4=Andrades Valtueña |first4=Aida |last5=Fagernäs |first5=Zandra |last6=Clayton |first6=Stephen |last7=Garcia |first7=Maxime U. |last8=Neukamm |first8=Judith |last9=Peltzer |first9=Alexander |year=2021 |title=Reproducible, portable, and efficient ancient genome reconstruction with nf-core/Eager |journal=PeerJ |volume=9 |pages=e10947 |doi=10.7717/peerj.10947 |pmc=7977378 |pmid=33777521 |doi-access=free}}{{cite journal |last1=Krakau |first1=Sabrina |last2=Straub |first2=Daniel |last3=Gourlé |first3=Hadrien |last4=Gabernet |first4=Gisela |last5=Nahnsen |first5=Sven |year=2022 |title=Nf-core/Mag: A best-practice pipeline for metagenome hybrid assembly and binning |journal=Nar Genomics and Bioinformatics |volume=4 |pages=lqac007 |doi=10.1093/nargab/lqac007 |pmc=8808542 |pmid=35118380}}{{cite journal |last1=Garcia |first1=Maxime |last2=Juhos |first2=Szilveszter |last3=Larsson |first3=Malin |last4=Olason |first4=Pall I. 
|last5=Martin |first5=Marcel |last6=Eisfeldt |first6=Jesper |last7=Dilorenzo |first7=Sebastian |last8=Sandgren |first8=Johanna |last9=Díaz De Ståhl |first9=Teresita |last10=Ewels |first10=Philip |last11=Wirta |first11=Valtteri |last12=Nistér |first12=Monica |last13=Käller |first13=Max |last14=Nystedt |first14=Björn |year=2020 |title=Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants |journal=F1000Research |volume=9 |page=63 |doi=10.12688/f1000research.16665.2 |pmc=7111497 |pmid=32269765 |doi-access=free}}{{cite journal |last1=Digby |first1=Barry |last2=Finn |first2=Stephen P. |last3=ó Broin |first3=Pilib |year=2023 |title=Nf-core/Circrna: A portable workflow for the quantification, miRNA target prediction and differential expression analysis of circular RNAs |journal=BMC Bioinformatics |volume=24 |issue=1 |page=27 |doi=10.1186/s12859-022-05125-8 |pmc=9875403 |pmid=36694127 |doi-access=free}} leading to the creation of the nf-core project.{{Cite web |last1=Ewels |first1=Philip |last2=Peltzer |first2=Alexander |last3=Fillinger |first3=Sven |last4=Alneberg |first4=Johannes |last5=Patel |first5=Harshil |last6=Wilm |first6=Andreas |last7=Garcia |first7=Maxime Ulysse |last8=Di Tommaso |first8=Paolo |last9=Nahnsen |first9=Sven |date=April 1, 2019 |title=Nf-core: Community curated bioinformatics pipelines |url=https://www.researchgate.net/publication/332446405 |access-date=June 30, 2022 |website=Research Gate}} Led by Phil Ewels, at the Swedish National Genomics Infrastructure at the time,{{Cite web |last=Zapata Garin |first=Claire-Alix |title=nf-core: a community-driven initiative to standardise Nextflow-based pipelines |url=https://www.lifebit.ai/blog/nf-core-a-community-driven-initiative-to-standardise-nextflow-based-pipelines/ |access-date=June 30, 2022 |website=Lifebit.ai}}{{Cite web |date=February 14, 2020 |title=The nf-core community provides computational pipelines 
|url=https://www.scilifelab.se/news/nf-core-community-provides-computational-pipelines/ |access-date=June 30, 2022 |website=SciLifeLab}} nf-core focuses on ensuring reproducibility and portability of pipelines across different hardware, operating systems, and software versions. In July 2020, Nextflow and nf-core received a grant from the Chan Zuckerberg Initiative in recognition of their importance as open-source software.{{Cite web |date=27 July 2020 |title=Nextflow and nf-core: Reproducible Workflows for the Scientific Community |url=https://chanzuckerberg.com/eoss/proposals/nextflow-and-nf-core-reproducible-workflows-for-the-scientific-community/ |access-date=15 June 2022 |website=Chan Zuckerberg Initiative}} As of 2024, the nf-core organization hosts 117 Nextflow pipelines for the biosciences and more than 1382 process modules. With more than 1200 developers and scientists involved, it is the largest collaborative effort and community for developing bioinformatic data analysis pipelines.{{Cite web |title=nf-core Github organization |url=https://github.com/nf-core |access-date=12 November 2024 |website=GitHub}}

= By domain and research subject =

Nextflow is widely used for processing sequencing data and conducting genomic data analysis. Over the past five years, numerous pipelines have been published for various applications and analyses in the genomics field.

One notable use case is its role in pathogen surveillance during the COVID-19 pandemic.{{Cite web |last=Floden |first=Evan |date=5 November 2021 |title=Genetic Sequencing Will Enable Us To Win The Global Battle Against COVID-19 |url=https://www.bio-itworld.com/news/2021/11/05/genetic-sequencing-will-enable-us-to-win-the-global-battle-against-covid-19}} Swift and highly automated processing of raw data, variant analysis, and lineage designation were essential for monitoring the emergence of new virus variants and tracing their global spread. Nextflow-enabled pipelines played a crucial role in this effort.{{cite journal |doi=10.1093/cid/ciab785 |pmc=8634317 |pmid=34850839 |title=Overcoming Data Bottlenecks in Genomic Pathogen Surveillance |year=2021 |last1=Afolayan |first1=Ayorinde O. |last2=Bernal |first2=Johan Fabian |last3=Gayeta |first3=June M. |last4=Masim |first4=Melissa L. |last5=Shamanna |first5=Varun |last6=Abrudan |first6=Monica |last7=Abudahab |first7=Khalil |last8=Argimón |first8=Silvia |last9=Carlos |first9=Celia C. |last10=Sia |first10=Sonia |last11=Ravikumar |first11=Kadahalli L. |last12=Okeke |first12=Iruka N. |last13=Donado-Godoy |first13=Pilar |last14=Aanensen |first14=David M. |last15=Underwood |first15=Anthony |last16=Harste |first16=Harry |last17=Kekre |first17=Mihir |last18=Muddyman |first18=Dawn |last19=Taylor |first19=Ben |last20=Wheeler |first20=Nicole |last21=David |first21=Sophia |last22=Arevalo |first22=Alejandra |last23=Fernanda Valencia |first23=Maria |last24=Osma Castro |first24=Erik C D. |last25=Nagaraj |first25=Geetha |last26=Govindan |first26=Vandana |last27=Prabhu |first27=Akshata |last28=Sravani |first28=D. |last29=Shincy |first29=M. R. |last30=Rose |first30=Steffimole |journal=Clinical Infectious Diseases |volume=73 |issue=Suppl_4 |pages=S267–S274 |display-authors=1 }}

{{cite journal |doi=10.1371/journal.pone.0262953 |pmc=8791494 |pmid=35081137 |doi-access=free |title=ASPICov: An automated pipeline for identification of SARS-Cov2 nucleotidic variants |year=2022 |last1=Tilloy |first1=Valentin |last2=Cuzin |first2=Pierre |last3=Leroi |first3=Laura |last4=Guérin |first4=Emilie |last5=Durand |first5=Patrick |last6=Alain |first6=Sophie |journal=PLOS ONE |volume=17 |issue=1 |pages=e0262953 |bibcode=2022PLoSO..1762953T }}

{{cite journal |doi=10.1128/mSystems.00190-20 |pmc=7406220 |pmid=32753501 |title=Bactopia: A Flexible Pipeline for Complete Analysis of Bacterial Genomes |year=2020 |last1=Petit |first1=Robert A. |last2=Read |first2=Timothy D. |journal=mSystems |volume=5 |issue=4 }}

{{cite journal |doi=10.1128/msystems.00741-22 |pmc=9599279 |pmid=36069454 |title=Meta Phage: An Automated Pipeline for Analyzing, Annotating, and Classifying Bacteriophages in Metagenomics Sequencing Data |year=2022 |last1=Pandolfo |first1=Mattia |last2=Telatin |first2=Andrea |last3=Lazzari |first3=Gioele |last4=Adriaenssens |first4=Evelien M. |last5=Vitulo |first5=Nicola |journal=mSystems |volume=7 |issue=5 |pages=e0074122 }}

{{cite journal |doi=10.3390/biology11020263 |pmc=8868628 |pmid=35205129 |doi-access=free |title=Side-by-Side Comparison of Post-Entry Quarantine and High Throughput Sequencing Methods for Virus and Viroid Diagnosis |year=2022 |last1=Gauthier |first1=Marie-Emilie A. |last2=Lelwala |first2=Ruvini V. |last3=Elliott |first3=Candace E. |last4=Windell |first4=Craig |last5=Fiorito |first5=Sonia |last6=Dinsdale |first6=Adrian |last7=Whattam |first7=Mark |last8=Pattemore |first8=Julie |last9=Barrero |first9=Roberto A. |journal=Biology |volume=11 |issue=2 |page=263 }}

{{cite journal |doi=10.3389/fgene.2021.711437 |pmc=8355734 |pmid=34394197 |doi-access=free |title=Pore Cov-An Easy to Use, Fast, and Robust Workflow for SARS-CoV-2 Genome Reconstruction via Nanopore Sequencing |year=2021 |last1=Brandt |first1=Christian |last2=Krautwurst |first2=Sebastian |last3=Spott |first3=Riccardo |last4=Lohde |first4=Mara |last5=Jundzill |first5=Mateusz |last6=Marquet |first6=Mike |last7=Hölzer |first7=Martin |journal=Frontiers in Genetics |volume=12 |page=711437 }}

{{cite journal |doi=10.3390/genes13081330 |pmc=9394340 |pmid=35893066 |doi-access=free |title=A Comparison of Bioinformatics Pipelines for Enrichment Illumina Next Generation Sequencing Systems in Detecting SARS-CoV-2 Virus Strains |year=2022 |last1=Afiahayati |last2=Bernard |first2=Stefanus |last3=Gunadi |last4=Wibawa |first4=Hendra |last5=Hakim |first5=Mohamad Saifudin |last6=Marcellus |last7=Parikesit |first7=Arli Aditya |last8=Dewa |first8=Chandra Kusuma |last9=Sakakibara |first9=Yasubumi |journal=Genes |volume=13 |issue=8 |page=1330 }}

The non-profit plasmid repository Addgene also relies on Nextflow, using it to confirm the integrity of all deposited plasmids.{{Cite web |last=Niehaus |first=Jason |date=14 July 2022 |title=Bioinformatics at Addgene |url=https://blog.addgene.org/bioinformatics-at-addgene |access-date=25 February 2023 |website=Addgene corporate blog}}

In addition to genomics, Nextflow is gaining popularity in other domains of biomedical data processing, where complex workflows on large amounts of primary data are required. These domains include drug screening,{{cite journal |last1=Ssekagiri |first1=Alfred |last2=Jjingo |first2=Daudi |last3=Lujumba |first3=Ibra |last4=Bbosa |first4=Nicholas |last5=Bugembe |first5=Daniel L. |last6=Kateete |first6=David P. |last7=Jordan |first7=I King |last8=Kaleebu |first8=Pontiano |last9=Ssemwanga |first9=Deogratius |year=2022 |title=Quasi Flow: A Nextflow pipeline for analysis of NGS-based HIV-1 drug resistance data |journal=Bioinformatics Advances |volume=2 |pages=vbac089 |doi=10.1093/bioadv/vbac089 |pmc=9722223 |pmid=36699347}} diffusion magnetic resonance imaging (dMRI) in radiology,{{cite journal |last1=Theaud |first1=Guillaume |last2=Houde |first2=Jean-Christophe |last3=Boré |first3=Arnaud |last4=Rheault |first4=François |last5=Morency |first5=Felix |last6=Descoteaux |first6=Maxime |year=2020 |title=Tracto Flow: A robust, efficient and reproducible diffusion MRI pipeline leveraging Nextflow & Singularity |journal=NeuroImage |volume=218 |page=116889 |doi=10.1016/j.neuroimage.2020.116889 |pmid=32447016 |s2cid=164318811 |doi-access=free}} and mass spectrometry data processing,{{cite journal |last1=Van Maldegem |first1=Febe |last2=Valand |first2=Karishma |last3=Cole |first3=Megan |last4=Patel |first4=Harshil |last5=Angelova |first5=Mihaela |last6=Rana |first6=Sareena |last7=Colliver |first7=Emma |last8=Enfield |first8=Katey |last9=Bah |first9=Nourdine |last10=Kelly |first10=Gavin |last11=Tsang |first11=Victoria Siu Kwan |last12=Mugarza |first12=Edurne |last13=Moore |first13=Christopher |last14=Hobson |first14=Philip |last15=Levi |first15=Dina |year=2021 |title=Characterisation of tumour microenvironment remodelling following oncogene inhibition in preclinical studies with imaging mass cytometry |journal=Nature Communications |volume=12 |issue=1 |page=5906 
|bibcode=2021NatCo..12.5906V |doi=10.1038/s41467-021-26214-x |pmc=8501076 |pmid=34625563 |last16=Molina-Arcas |first16=Miriam |last17=Swanton |first17=Charles |last18=Downward |first18=Julian}}{{cite journal |last1=Li |first1=Chenxin |last2=Gao |first2=Mingxuan |last3=Yang |first3=Wenxian |last4=Zhong |first4=Chuanqi |last5=Yu |first5=Rongshan |year=2021 |title=Diamond: A multi-modal DIA mass spectrometry data processing pipeline |journal=Bioinformatics |volume=37 |issue=2 |pages=265–267 |doi=10.1093/bioinformatics/btaa1093 |pmid=33416868}}{{cite journal |last1=Luu |first1=Gordon T. |last2=Freitas |first2=Michael A. |last3=Lizama-Chamu |first3=Itzel |last4=McCaughey |first4=Catherine S. |last5=Sanchez |first5=Laura M. |last6=Wang |first6=Mingxun |year=2022 |title=TIMSCONVERT: A workflow to convert trapped ion mobility data to open data formats |journal=Bioinformatics |volume=38 |issue=16 |pages=4046–4047 |doi=10.1093/bioinformatics/btac419 |pmc=9991885 |pmid=35758608}} the latter with a particular focus on proteomics{{cite journal |last1=Perez-Riverol |first1=Yasset |last2=Moreno |first2=Pablo |year=2020 |title=Scalable Data Analysis in Proteomics and Metabolomics Using Bio Containers and Workflows Engines |journal=Proteomics |volume=20 |issue=9 |pages=e1900147 |doi=10.1002/pmic.201900147 |pmc=7613303 |pmid=31657527}}

{{cite journal |last1=Vlasova |first1=Anna |last2=Hermoso Pulido |first2=Toni |last3=Camara |first3=Francisco |last4=Ponomarenko |first4=Julia |last5=Guigó |first5=Roderic |year=2021 |title=FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow |journal=Genes |volume=12 |issue=10 |page=1645 |doi=10.3390/genes12101645 |pmc=8535801 |pmid=34681040 |doi-access=free}}

{{cite journal |last1=Miller |first1=Rachel M. |last2=Jordan |first2=Ben T. |last3=Mehlferber |first3=Madison M. |last4=Jeffery |first4=Erin D. |last5=Chatzipantsiou |first5=Christina |last6=Kaur |first6=Simi |last7=Millikin |first7=Robert J. |last8=Dai |first8=Yunxiang |last9=Tiberi |first9=Simone |last10=Castaldi |first10=Peter J. |last11=Shortreed |first11=Michael R. |last12=Luckey |first12=Chance John |last13=Conesa |first13=Ana |last14=Smith |first14=Lloyd M. |last15=Deslattes Mays |first15=Anne |year=2022 |title=Enhanced protein isoform characterization through long-read proteogenomics |journal=Genome Biology |volume=23 |issue=1 |page=69 |doi=10.1186/s13059-022-02624-y |pmc=8892804 |pmid=35241129 |doi-access=free |last16=Sheynkman |first16=Gloria M.}}

{{cite journal |last1=Othman |first1=Houcemeddine |last2=Jemimah |first2=Sherlyn |last3=Da Rocha |first3=Jorge Emanuel Batista |year=2022 |title=SWAAT Bioinformatics Workflow for Protein Structure-Based Annotation of ADME Gene Variants |journal=Journal of Personalized Medicine |volume=12 |issue=2 |page=263 |doi=10.3390/jpm12020263 |pmc=8875676 |pmid=35207751 |doi-access=free}}

{{cite journal |last1=Bichmann |first1=Leon |last2=Gupta |first2=Shubham |last3=Rosenberger |first3=George |last4=Kuchenbecker |first4=Leon |last5=Sachsenberg |first5=Timo |last6=Ewels |first6=Phil |last7=Alka |first7=Oliver |last8=Pfeuffer |first8=Julianus |last9=Kohlbacher |first9=Oliver |last10=Röst |first10=Hannes |year=2021 |title=DIAproteomics: A Multifunctional Data Analysis Pipeline for Data-Independent Acquisition Proteomics and Peptidomics |url=https://refubium.fu-berlin.de/handle/fub188/32129 |journal=Journal of Proteome Research |volume=20 |issue=7 |pages=3758–3766 |doi=10.1021/acs.jproteome.1c00123 |pmid=34153189 |s2cid=235597603 |doi-access=free}}

{{cite journal |last1=Walzer |first1=Mathias |last2=García-Seisdedos |first2=David |last3=Prakash |first3=Ananth |last4=Brack |first4=Paul |last5=Crowther |first5=Peter |last6=Graham |first6=Robert L. |last7=George |first7=Nancy |last8=Mohammed |first8=Suhaib |last9=Moreno |first9=Pablo |last10=Papatheodorou |first10=Irene |last11=Hubbard |first11=Simon J. |last12=Vizcaíno |first12=Juan Antonio |year=2022 |title=Implementing the reuse of public DIA proteomics datasets: From the PRIDE database to Expression Atlas |journal=Scientific Data |volume=9 |issue=1 |page=335 |bibcode=2022NatSD...9..335W |doi=10.1038/s41597-022-01380-9 |pmc=9197839 |pmid=35701420}}

{{cite journal |last1=Hulstaert |first1=Niels |last2=Shofstahl |first2=Jim |last3=Sachsenberg |first3=Timo |last4=Walzer |first4=Mathias |last5=Barsnes |first5=Harald |last6=Martens |first6=Lennart |last7=Perez-Riverol |first7=Yasset |year=2020 |title=ThermoRawFile Parser: Modular, Scalable, and Cross-Platform RAW File Conversion |journal=Journal of Proteome Research |volume=19 |issue=1 |pages=537–542 |doi=10.1021/acs.jproteome.9b00328 |pmc=7116465 |pmid=31755270}}

{{cite journal |last1=Li |first1=Kai |last2=Jain |first2=Antrix |last3=Malovannaya |first3=Anna |last4=Wen |first4=Bo |last5=Zhang |first5=Bing |year=2020 |title=Deep Rescore: Leveraging Deep Learning to Improve Peptide Identification in Immunopeptidomics |journal=Proteomics |volume=20 |issue=21–22 |pages=e1900334 |doi=10.1002/pmic.201900334 |pmc=7718998 |pmid=32864883}}

References

External links

[https://nextflow.io Official website]
[https://nf-co.re/ nf-core project]
[https://seqera.io Seqera Labs]