Comparing and Choosing Software Tools for Visualizing Hi-C Data


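"This article is a review of software tools for visualizing Hi-C data, written by Galip Gürkan Yardımcı and William Stafford Noble. Hi-C is a high-throughput experimental technique that reveals the three-dimensional structure of DNA and helps explain the relationship between DNA conformation and function. The article presents five software tools that require no programming expertise; each has its own strengths and suits different Hi-C interpretation tasks."

With the development of high-throughput sequencing, Hi-C (a chromatin interaction assay) has become an important means of studying the three-dimensional structure of the genome. Interpreting Hi-C data is essential for understanding the 3D conformation of DNA in the nucleus and its relationship to biological processes. Because the data is large and complex, however, standard genome browsers cannot display it effectively. Dedicated Hi-C visualization tools have therefore emerged; they offer several visualization modes and can show the data alongside existing, complementary data.

In their review, Yardımcı and Noble examine five tools that can be used without programming experience: Hi-Browse, my5C, Juicebox, the Epigenome Browser, and the 3D Genome Browser. Tools in wider use for processing and visualizing Hi-C data, not all of which are covered by the review, include:

1. **HiC-Pro**: a pipeline for processing Hi-C data, from read mapping to normalized contact maps. It is a command-line tool rather than a viewer; the contact maps it produces can be loaded into visualization tools to identify regions of frequent chromatin interaction.
2. **Juicebox**: a desktop application for interactive exploration of Hi-C contact maps as heatmaps at multiple resolutions, from whole-chromosome views down to individual TADs (topologically associating domains).
3. **3DGenomes**: a tool that offers several ways of displaying 3D structure, so that chromosome structure can be examined from different angles. It can also be combined with other epigenetic data for multi-dimensional analysis.
4. **HiGlass**: a web-based viewer with synchronized side-by-side views that supports multi-scale, multi-sample comparison. Several Hi-C maps can be viewed at once, which makes it easier to compare chromosome structure across conditions.
5. **HiCExplorer**: a set of tools that covers data processing as well as visualization. Its plots are well suited to examining TAD boundaries and chromosome compartments.

Each tool has its own advantages, and the choice depends on the research question and the goal of the analysis. For example, Juicebox may be the best choice for examining TADs in detail, whereas HiGlass has the advantage for large multi-sample comparisons. These tools have greatly advanced the understanding of 3D genome structure by letting researchers interpret complex Hi-C data more effectively, and so reveal how DNA conformation affects gene expression, transcriptional regulation, and DNA replication. As the technology develops, further tools can be expected to deepen that insight.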

Yardımcı and Noble Genome Biology (2017) 18:26
DOI 10.1186/s13059-017-1161-y

REVIEW Open Access

Software tools for visualizing Hi-C data

Galip Gürkan Yardımcı and William Stafford Noble

Abstract

High-throughput assays for measuring the three-dimensional (3D) configuration of DNA have provided unprecedented insights into the relationship between DNA 3D configuration and function. Data interpretation from assays such as ChIA-PET and Hi-C is challenging because the data is large and cannot be easily rendered using standard genome browsers. An effective Hi-C visualization tool must provide several visualization modes and be capable of viewing the data in conjunction with existing, complementary data. We review five software tools that do not require programming expertise. We summarize their complementary functionalities, and highlight which tool is best equipped for specific tasks.

Introduction

The three-dimensional (3D) conformation of the genome in the nucleus influences many key biological processes, such as transcriptional regulation and DNA replication timing. Over the past decade, chromosome conformation capture assays have been developed to characterize 3D contacts associated with a single locus (chromosome conformation capture (3C), chromosome conformation capture-on-chip (4C)) [1–3], a set of loci (chromosome conformation capture carbon copy (5C), chromatin interaction analysis by paired-end tag sequencing (ChIA-PET)) [4, 5] or the whole genome (Hi-C) [6]. Using these assays, researchers have profiled the conformation of chromatin in a variety of organisms and systems, which has revealed a hierarchical, domain-like organization of chromatin.

Here, we focus on the Hi-C assay and variants thereof, which provide a genome-wide view of chromosome conformation. The assay consists of five steps: (1) crosslinking DNA with formaldehyde, (2) cleaving cross-linked DNA with an endonuclease, (3) ligating the ends of cross-linked fragments to form a circular molecule marked with biotin, (4) shearing circular DNA and pulling down fragments marked with biotin, and (5) paired-end sequencing of the pulled-down fragments. A pair of sequence reads from a single ligated molecule map to two distinct regions of the genome, and the abundance of such fragments provides a measure of how frequently, within a population of cells, the two loci are in contact. Thus, by contrast with assays such as DNase-seq and chromatin immunoprecipitation sequencing (ChIP-seq) [7, 8], which yield a one-dimensional count vector across the genome, the output of Hi-C is a two-dimensional matrix of counts, with one entry for each pair of genomic loci. Production of this matrix involves a series of filtering and normalization steps (reviewed in [9] and [10]).

*Correspondence: william-noble@uw.edu
Department of Genome Sciences, Department of Computer Science and Engineering, University of Washington, 3720 15th Ave NE, Seattle, WA 98105, USA
Full list of author information is available at the end of the article

A critical parameter in Hi-C analysis pipelines is the effective resolution at which the data is analyzed [10, 11]. In this context, “resolution” simply refers to the size of the loci for which Hi-C counts are aggregated. At present, deep sequencing to achieve very high resolution data for large genomes is prohibitively expensive. A basepair resolution analysis of the human genome would require the aggregation of counts across a matrix of size approximately $(3 \times 10^{9})^{2} = 9 \times 10^{18}$. Reads that fall within a contiguous genomic window are binned together, which reduces the size and sparsity of the matrix at the cost of resolution. Following this process, Hi-C data can be represented as a “contact matrix” $M$, where entry $M_{ij}$ is the number of Hi-C read pairs, or contacts, between genomic locations designated by bin $i$ and bin $j$.
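As a rough illustration of the binning step described above, the following Python sketch first counts how many entries a genome-wide matrix has at a given bin size, then aggregates read pairs into a symmetric contact matrix. The function names, the toy coordinates, and the 10-bp bin size are hypothetical and chosen only for this example; the genome length of 3 × 10^9 bp is the approximation used in the text. Real pipelines typically work per chromosome and use sparse storage.

```python
GENOME_SIZE = 3_000_000_000  # approximate human genome length in bp (assumption)

def matrix_entries(resolution_bp, genome_size=GENOME_SIZE):
    """Number of entries in a full genome-wide contact matrix at a given bin size."""
    n_bins = -(-genome_size // resolution_bp)  # ceiling division
    return n_bins * n_bins

def bin_contacts(read_pairs, region_length, resolution_bp):
    """Aggregate (pos1, pos2) read pairs into a symmetric contact matrix M."""
    n_bins = -(-region_length // resolution_bp)
    M = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in read_pairs:
        i, j = pos1 // resolution_bp, pos2 // resolution_bp
        M[i][j] += 1
        if i != j:
            M[j][i] += 1  # a contact between bins i and j is unordered
    return M

# Matrix size shrinks quadratically as bins grow
for res in (1, 10_000, 1_000_000):
    print(f"{res:>9} bp bins -> {matrix_entries(res):.1e} entries")

# Toy example: five read pairs in a 50-bp region, binned at 10 bp
pairs = [(3, 47), (5, 41), (12, 18), (22, 39), (48, 2)]
M = bin_contacts(pairs, region_length=50, resolution_bp=10)
for row in M:
    print(row)
```

At 1-bp bins the first function reproduces the figure in the text, and three of the five toy read pairs fall into the same pair of bins, which shows how coarser bins trade resolution for less sparse counts.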

Hi-C data presents substantial analytical challenges for researchers who study chromatin conformation. Filtering and normalization strategies can be employed to correct experimental artifacts and biases [9–11]. Statistical confidence measures can be estimated to identify sets of high confidence contacts [12]. Hi-C data can be compared with and correlated against complementary data sets measuring protein–DNA interactions, gene expression, and replication timing [13–15]. And 3D conformation of the DNA itself can be estimated from Hi-C data, with the potential to consider data derived from other assays or from multiple experimental conditions [16–19].
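One common family of normalization strategies is matrix balancing, which rescales rows and columns so that every bin ends up with the same total contact count. The sketch below is a minimal illustration of that idea on a toy matrix. It is not the procedure of any specific tool or reference cited here, and the function name, iteration limit, and tolerance are hypothetical. It assumes a symmetric matrix in which every bin has at least one contact.

```python
def balance(matrix, n_iter=200, tol=1e-9):
    """Iteratively rescale a symmetric contact matrix so all row sums are equal.

    Returns (balanced, bias) with balanced[i][j] = matrix[i][j] / (bias[i] * bias[j]).
    Assumes empty bins were filtered out beforehand (no all-zero rows).
    """
    n = len(matrix)
    bias = [1.0] * n
    balanced = [[float(v) for v in row] for row in matrix]
    for _ in range(n_iter):
        row_sums = [sum(row) for row in balanced]
        mean = sum(row_sums) / n
        # Relative coverage of each bin; 1.0 means the bin is already balanced
        scale = [s / mean for s in row_sums]
        if max(abs(s - 1.0) for s in scale) < tol:
            break
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= scale[i] * scale[j]
            bias[i] *= scale[i]
    return balanced, bias

# Toy matrix built as b[i] * b[j] * T[i][j], with per-bin bias b = (1, 2, 4)
# and an already balanced T = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
raw = [[2, 2, 4], [2, 8, 8], [4, 8, 32]]
balanced, bias = balance(raw)
print([round(sum(row), 6) for row in balanced])  # row sums after balancing
print([round(b / bias[0], 6) for b in bias])     # bias relative to the first bin
```

Because the toy matrix is constructed from a known per-bin bias, the recovered bias vector can be compared with it up to a common factor. On real data the true bias is unknown, and convergence on very sparse or strongly diagonal matrices may need more iterations or a damped update.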

International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
