Genome Biology 2008, 9:R137
Open Access
2008 Zhang et al. Volume 9, Issue 9, Article R137
Method
Model-based Analysis of ChIP-Seq (MACS)
Yong Zhang¤*, Tao Liu¤*, Clifford A Meyer*, Jérôme Eeckhoute†, David S Johnson‡, Bradley E Bernstein§¶, Chad Nusbaum¶, Richard M Myers¥, Myles Brown†, Wei Li# and X Shirley Liu*
Addresses:
*Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, 44 Binney Street, Boston, MA 02115, USA.
†Division of Molecular and Cellular Oncology, Department of Medical Oncology, Dana-Farber Cancer Institute and Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, 44 Binney Street, Boston, MA 02115, USA.
‡Gene Security Network, Inc., 2686 Middlefield Road, Redwood City, CA 94063, USA.
§Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital and Department of Pathology, Harvard Medical School, 13th Street, Charlestown, MA 02129, USA.
¶Broad Institute of Harvard and MIT, 7 Cambridge Center, Cambridge, MA 02142, USA.
¥Department of Genetics, Stanford University Medical Center, Stanford, CA 94305, USA.
#Division of Biostatistics, Dan L Duncan Cancer Center, Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
¤ These authors contributed equally to this work.
Correspondence: Wei Li. Email: wl1@bcm.edu. X Shirley Liu. Email: xsliu@jimmy.harvard.edu
© 2008 Zhang et al.; licensee BioMed Central Ltd.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
ChIP-Seq analysis: MACS performs model-based analysis of ChIP-Seq data generated by short read sequencers.
Abstract
We present Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short
read sequencers such as Solexa's Genome Analyzer. MACS empirically models the shift size of
ChIP-Seq tags, and uses it to improve the spatial resolution of predicted binding sites. MACS also
uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for
more robust predictions. MACS compares favorably to existing ChIP-Seq peak-finding algorithms,
and is freely available.
Background
The determination of the 'cistrome', the genome-wide set of
in vivo cis-elements bound by trans-factors [1], is necessary
to determine the genes that are directly regulated by those
trans-factors. Chromatin immunoprecipitation (ChIP) [2]
coupled with genome tiling microarrays (ChIP-chip) [3,4]
and sequencing (ChIP-Seq) [5-8] have become popular
techniques to identify cistromes. Although early ChIP-Seq efforts
were limited by sequencing throughput and cost [2,9],
tremendous progress has been achieved in the past year in the
development of next generation massively parallel sequencing.
Tens of millions of short tags (25-50 bases) can now be
simultaneously sequenced at less than 1% of the cost of
traditional Sanger sequencing methods. Technologies such as
Illumina's Solexa or Applied Biosystems' SOLiD™ have made
ChIP-Seq a practical and potentially superior alternative to
ChIP-chip [5,8].
While providing several advantages over ChIP-chip, such as
less starting material, lower cost, and higher peak resolution,
ChIP-Seq also poses challenges (or opportunities) in the
analysis of data. First, ChIP-Seq tags represent only the ends of
the ChIP fragments, instead of precise protein-DNA binding
sites. Although tag strand information and the approximate
distance to the precise binding site could help improve peak
resolution, a good tag to site distance estimate is often
Published: 17 September 2008
Genome Biology 2008, 9:R137 (doi:10.1186/gb-2008-9-9-r137)
Received: 4 August 2008
Revised: 3 September 2008
Accepted: 17 September 2008
The electronic version of this article is the complete one and can be
found online at http://genomebiology.com/2008/9/9/R137