MACS (software)
{{Short description|Peak finding software}}
{{About|MACS software|3=MACS (disambiguation)}}{{Infobox software
| name = Model-based Analysis of ChIP-Seq (MACS)
| author = Yong Zhang, Tao Liu, et al. (original)<br />Tao Liu et al. (MACS2/3)
| developer = Tao Liu and contributors
| released = {{Start date and age|2008}}
| discontinued = No
| latest release version = 3.0.3
| latest release date = {{Start date and age|2025|02|20|df=yes}}
| programming language = Python, C
| operating system = Cross-platform (Linux, macOS, Windows via WSL)
| genre = Bioinformatics software, Peak calling
| license = BSD 3-Clause
| website = {{URL|https://github.com/macs3-project/MACS}}
}}
Model-based Analysis of ChIP-Seq (MACS) is a bioinformatics tool designed primarily for peak calling in ChIP-Seq data.{{Cite journal |last1=Feng |first1=Jianxing |last2=Liu |first2=Tao |last3=Qin |first3=Bo |last4=Zhang |first4=Yong |last5=Liu |first5=Xiaole Shirley |date=2012-09-01 |title=Identifying ChIP-seq enrichment using MACS |journal=Nature Protocols|volume=7 |issue=9 |pages=1728–1740 |doi=10.1038/nprot.2012.101 |pmid=22936215 |pmc=3868217 }} It detects peaks by modeling the characteristic shift between the read distributions on the forward and reverse DNA strands. The method was published in 2008 by Yong Zhang, Tao Liu, and colleagues in Genome Biology.{{Cite journal |last1=Zhang |first1=Yong |last2=Liu |first2=Tao |last3=Meyer |first3=Clifford A |last4=Eeckhoute |first4=Jérôme |last5=Johnson |first5=David S |last6=Bernstein |first6=Bradley E |last7=Nusbaum |first7=Chad |last8=Myers |first8=Richard M |last9=Brown |first9=Myles |last10=Li |first10=Wei |last11=Liu |first11=X Shirley |date=2008-09-17 |title=Model-based Analysis of ChIP-Seq (MACS) |journal=Genome Biology |volume=9 |issue=9 |doi=10.1186/gb-2008-9-9-r137 |issn=1474-760X |doi-access=free |hdl-access=free |hdl=1721.1/59206}}
MACS has been cited over 17,000 times, and is routinely used in epigenetics, particularly for identifying narrow peaks associated with transcription factor binding sites or H3K4me3 histone modifications.{{Cite journal |last1=Thomas |first1=Reuben |last2=Thomas |first2=Sean |last3=Holloway |first3=Alisha K |last4=Pollard |first4=Katherine S |date=2017-05-01 |title=Features that define the best ChIP-seq peak calling algorithms |url=https://academic.oup.com/bib/article/18/3/441/2453291 |journal=Briefings in Bioinformatics |volume=18 |issue=3 |pages=441–450 |doi=10.1093/bib/bbw035 |issn=1467-5463|pmc=5429005 }} MACS is distributed as open-source software under the permissive BSD 3-Clause License.
Methodology
MACS analyzes mapped reads from ChIP-Seq experiments. When a control sample (e.g., input DNA or an IgG immunoprecipitation) is available, MACS compares the ChIP sample, which is enriched for DNA bound by a specific protein, against that control to distinguish genuine enrichment from background noise and biases.
A key innovation in MACS is its model for the spatial distribution of sequencing reads around binding sites. In a typical ChIP-Seq experiment, reads map to the ends of the DNA fragments generated during immunoprecipitation. This results in clusters of reads mapping to the forward strand upstream of the binding site and clusters mapping to the reverse strand downstream. MACS empirically estimates the average distance, d, between the modes of these forward and reverse strand read distributions. It then shifts all reads by d/2 towards the interior of the fragment, effectively centering the signal at the putative binding site before identifying peaks.
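The shift step described above can be illustrated with a minimal Python sketch. It is not the MACS implementation: the function names, the binned-mode estimator, the bin size, and the toy read positions are all assumptions made for illustration, and it uses a single toy site rather than a model built from many regions.

```python
from collections import Counter

def mode_position(positions, bin_size=10):
    """Return the centre of the most populated bin of read positions."""
    bins = Counter(p // bin_size for p in positions)
    # Ties are broken in favour of the leftmost bin so the result is deterministic.
    best_bin = min(bins, key=lambda b: (-bins[b], b))
    return best_bin * bin_size + bin_size // 2

def estimate_d(forward_5p, reverse_5p, bin_size=10):
    """Distance between the modes of the forward- and reverse-strand 5' ends."""
    return mode_position(reverse_5p, bin_size) - mode_position(forward_5p, bin_size)

def shift_reads(forward_5p, reverse_5p, d):
    """Shift every read by d/2 toward the interior of its fragment."""
    half = d // 2
    # Forward-strand reads move downstream, reverse-strand reads move upstream.
    return sorted([p + half for p in forward_5p] + [p - half for p in reverse_5p])

# Toy data (hypothetical): forward reads pile up near 900, reverse reads near 1100.
forward = [895, 898, 901, 903, 905, 907, 940]
reverse = [1060, 1093, 1096, 1100, 1102, 1104, 1108]

d = estimate_d(forward, reverse)
shifted = shift_reads(forward, reverse, d)
print("estimated d:", d)
print("shifted positions:", shifted)
```

For this toy input the estimated d is 200, and after shifting, reads from both strands cluster around position 1000, the putative binding site.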
To assess the significance of signal enrichment at any given genomic location, MACS models the background read count using a Poisson distribution with a dynamic parameter, λlocal, that is allowed to vary along the genome. This model accounts for local biases by comparing the read count in a candidate peak region to the read count in larger flanking regions (e.g., 1 kb, 5 kb, 10 kb) or, preferably, to the scaled read count in the same region (λregion) within the control sample. A p-value is calculated from the Poisson model, giving the probability of observing a read count at least as large as the ChIP read count under the estimated background level. MACS also adjusts for differences in sequencing depth between the ChIP and control samples by linearly scaling down the larger sample (the default behavior) or scaling up the smaller one. To control for multiple testing across the genome, MACS calculates a false discovery rate (FDR) for the identified peaks. This can be done empirically by swapping the ChIP and control samples and counting the peaks called under these null conditions.
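The significance test can be sketched in a few lines of Python. The sketch assumes that the local background rate is the largest of a genome-wide rate and the rates estimated from the 1 kb, 5 kb, and 10 kb control windows, each rescaled to the width of the candidate region. The function names, window counts, and library sizes are hypothetical, and the Poisson tail is computed with the standard library rather than with MACS's own routines.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), obtained by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)          # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i            # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def control_scale(chip_total, control_total):
    """Factor that scales a larger control library down to the ChIP depth."""
    return min(1.0, chip_total / control_total)

def local_lambda(control_counts, region_bp, genome_rate_per_bp, scale=1.0):
    """Largest expected background count for a region of region_bp bases.

    control_counts maps a window size in bp to the number of control reads
    in a window of that size centred on the candidate region.
    """
    rates = [genome_rate_per_bp * region_bp]
    for window_bp, count in control_counts.items():
        rates.append(scale * count * region_bp / window_bp)
    return max(rates)

def empirical_fdr(n_chip_peaks, n_swapped_peaks):
    """Share of peaks expected to be false, estimated from the sample swap."""
    if n_chip_peaks == 0:
        return 0.0
    return min(1.0, n_swapped_peaks / n_chip_peaks)

# Hypothetical numbers: 10 M ChIP reads, 20 M control reads, a 200 bp candidate.
scale = control_scale(10_000_000, 20_000_000)
lam = local_lambda({1000: 20, 5000: 60, 10000: 100}, region_bp=200,
                   genome_rate_per_bp=0.005, scale=scale)
p_value = poisson_sf(12, lam)      # 12 ChIP reads observed in the candidate
print("lambda_local:", lam)
print("p-value:", p_value)
print("empirical FDR:", empirical_fdr(200, 4))
```

Taking the maximum over several window sizes makes the test conservative: a region counts as enriched only if it exceeds the highest of the background estimates. The direct tail summation is adequate for the small rates used here but would underflow for very large λ, where a log-space computation would be needed.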
Later versions extended the methodology to handle paired-end sequencing data and included options specifically for calling broader regions of enrichment,{{Cite journal |last1=Beacon |first1=Tasnim H. |last2=Delcuve |first2=Geneviève P. |last3=López |first3=Camila |last4=Nardocci |first4=Gino |last5=Kovalchuk |first5=Igor |last6=van Wijnen |first6=Andre J. |last7=Davie |first7=James R. |date=2021-07-08 |title=The dynamic broad epigenetic (H3K4me3, H3K27ac) domain as a mark of essential genes |journal=Clinical Epigenetics |volume=13 |issue=1 |doi=10.1186/s13148-021-01126-1 |doi-access=free |issn=1868-7075|hdl=1993/35765 |hdl-access=free }} such as those associated with certain histone modifications, often by grouping nearby significant regions.
Development
The original version was re-written in Python by Tao Liu and released as MACS 2. This version improved command-line usability and the handling of various input formats, and added algorithms for identifying broad peaks (the '--broad' option). The current iteration, MACS 3.0, includes performance enhancements and new features such as variant calling from ChIP-seq data (the callvar subcommand).{{Cite web |title=callvar — MACS3 3.0.1 documentation |url=https://macs3-project.github.io/MACS/docs/callvar.html |access-date=2025-05-15 |website=macs3-project.github.io}}
Impact
{{Portal|Biology|Evolutionary biology|Free and open-source software}}
MACS is one of the most highly cited peak-calling algorithms in the field of genomics. Its approach to modeling ChIP-Seq data characteristics significantly improved the accuracy and resolution of binding site detection compared to earlier methods based solely on read counts in fixed windows.{{Cite journal |last1=Thomas |first1=Reuben |last2=Thomas |first2=Sean |last3=Holloway |first3=Alisha K. |last4=Pollard |first4=Katherine S. |date=2017-05-01 |title=Features that define the best ChIP-seq peak calling algorithms |journal=Briefings in Bioinformatics|volume=18 |issue=3 |pages=441–450 |doi=10.1093/bib/bbw035 |pmid=27169896 |pmc=5429005 }} It remains a benchmark tool for new methods and assays,{{Cite journal |last1=Jeon |first1=Hyeongrin |last2=Lee |first2=Hyunji |last3=Kang |first3=Byunghee |last4=Jang |first4=Insoon |last5=Roh |first5=Tae-Young |date=2020-12-01 |title=Comparative analysis of commonly used peak calling programs for ChIP-Seq analysis |journal=Genomics & Informatics|volume=18 |issue=4 |pages=e42 |doi=10.5808/GI.2020.18.4.e42 |pmid=33412758 |pmc=7808876 }}{{Citation |last1=Nooranikhojasteh |first1=Amin |title=Benchmarking Peak Calling Methods for CUT&RUN |date=2024-11-15 |url=https://www.biorxiv.org/content/10.1101/2024.11.13.622880v1 |access-date=2025-05-15 |publisher=bioRxiv |language=en |doi=10.1101/2024.11.13.622880 |last2=Tavallaee |first2=Ghazaleh |last3=Orouji |first3=Elias}} and it is frequently integrated into standardized analysis pipelines and platforms like Galaxy, Cistrome, and Pluto Bio.{{Cite web |title=Choosing between MACS2 and SEACR for peak calling with epigenomics datasets |url=https://help.pluto.bio/en/articles/choosing-between-macs2-and-seacr-for-peak-calling-with-dna-sequencing-datasets |access-date=2025-05-15 |website=help.pluto.bio |language=en}}{{Cite journal |last1=Liu |first1=Tao |last2=Ortiz |first2=Jorge A. |last3=Taing |first3=Len |last4=Meyer |first4=Clifford A. 
|last5=Lee |first5=Bernett |last6=Zhang |first6=Yong |last7=Shin |first7=Hyunjin |last8=Wong |first8=Swee S. |last9=Ma |first9=Jian |last10=Lei |first10=Ying |last11=Pape |first11=Utz J. |last12=Poidinger |first12=Michael |last13=Chen |first13=Yiwen |last14=Yeung |first14=Kevin |last15=Brown |first15=Myles |date=2011-08-22 |title=Cistrome: an integrative platform for transcriptional regulation studies |journal=Genome Biology|volume=12 |issue=8 |doi=10.1186/gb-2011-12-8-r83 |doi-access=free |pmc=3245621 }}{{Cite web |title=Galaxy—MACS2 call peak |url=https://usegalaxy.org/?tool_id=toolshed.g2.bx.psu.edu/repos/iuc/macs2/macs2_callpeak |access-date=2025-05-07 |website=usegalaxy.org}}
References
External links
- {{GitHub|macs3-project/MACS}}
Category:Free bioinformatics software