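We previously shared a review that thoroughly describes the tools available for every step of ATAC-seq data analysis; see our earlier post "综述:ATAC-Seq 数据分析工具大全" (Review: a complete collection of ATAC-seq data analysis tools). This time we introduce another article, which describes an updated and optimized ATAC-seq protocol called Omni-ATAC. The citation details are as follows: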
Title: Chromatin accessibility profiling by ATAC-seq
Published: Nat Protoc. 2022 Apr 27;17(6):1518–1552.
DOI: 10.1038/s41596-022-00692-9
Figure 2: Schematic overview of ATAC-seq protocol
Feature | ATAC-seq | DNase-seq | MNase-seq | CUT&Tag or related ChIC techniques |
Enzyme type | Tn5 | Endonuclease | Endonuclease and exonuclease | Tn5 conjugated to an antibody via Protein A. |
Sequence bias? | Yes; complex, Tn5 insertion bias, with preference for A/Ts in insertion site and C/Gs flanking [133-135] | Yes; complex, partially dependent on enzyme concentration and on methylation status of CpGs [85,136] | Yes; preferential cutting upstream of A/T compared to G/C [137,138] | Yes; dictated by antibody used to guide Tn5 and by Tn5 bias. |
Input cells/nuclei for a standard assay | 500-50,000 | 1-10 million | 10,000-100,000 | 100,000-500,000 |
Low-input/single-cell methods available? | Yes [86,87]; commercial solutions available. | Yes [67] | Yes [66] | Yes [62,64,139-141] |
Sample types | Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues. | Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues. Formaldehyde cross-linked or formalin-fixed paraffin-embedded samples. | Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues. Formaldehyde cross-linked samples. | Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues. |
Library preparation time | ~10 hours for 12 samples (this protocol) | 1-3 days | ~2 days | 1-2 days |
Technical considerations | Library quality is highly dependent on cell viability. Protocol alterations are required for use on fixed cells and data quality is often reduced for those samples. | Enzyme concentration and digestion duration may need to be optimized to sample type. Size of fragments selected affects downstream analysis [28]. | Enzyme concentration and digestion duration may need to be optimized to sample type. Apparent nucleosome occupancy is a function of MNase concentration. | The amount of antibody used must be titrated for the cell type or sample. This will be a function of the strength of the antibody and the abundance of the target protein. The assay is as specific as the primary antibody used. Additionally, this is a targeted technique, so additional libraries must be made of each modification or protein tested. |
Sequencing type | Paired-end | Single-end | Single-end | Single-end or paired-end |
Sequencing depth | Low; 10 million read-pairs per sample with Omni-ATAC. | Medium/high; 20-50 million uniquely mapping reads per sample; 200 million for TF footprinting. | High; 150-200 million reads per sample (human) [142] | Very low; 3 million read-pairs per sample. |
Data obtained | Tn5-accessible chromatin. | DNase-accessible chromatin; TF footprinting. | Nucleosome positioning, inaccessible chromatin. | Location of target on DNA. |
Main advantages | Links labeling of accessible regions and NGS library preparation, making preparation of library straightforward. | Footprinting analysis. | Method of choice for nucleosome positioning and quantitative nucleosome dynamics. | Enables mapping of specific TF or histone modification in low cell numbers. Some histone modifications, like H3K27ac, can be used to look for active enhancers. |
Earlier ATAC-seq methods still had several shortcomings.
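The Omni-ATAC protocol improves on the original ATAC-seq method by reducing the number of reads that map to mitochondrial DNA and by raising the signal-to-noise ratio across a range of cell lines, tissues, and frozen samples. These gains come from optimizing cell lysis, nuclei isolation, and the transposition reaction. In particular, adding Tween-20 and digitonin alongside the conventional Nonidet P40 (NP40) makes it possible to lyse many different cell types.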
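Since a lower share of mitochondrial reads is one of the stated improvements, it can help to see how that share might be measured. The following is a minimal sketch, not part of the protocol: it assumes you already have per-contig mapped read counts in the four-column, tab-separated layout that `samtools idxstats` prints (contig name, length, mapped, unmapped). The function name `mito_fraction`, the contig names `chrM`/`MT`, and the example counts are all illustrative assumptions.

```python
def mito_fraction(idxstats_text, mito_names=("chrM", "MT")):
    """Return the fraction of mapped reads assigned to mitochondrial contigs.

    idxstats_text: tab-separated lines of (contig, length, mapped, unmapped).
    mito_names: assumed mitochondrial contig names; adjust to your reference.
    """
    mito = 0
    total = 0
    for line in idxstats_text.strip().splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":  # the "*" row only holds unplaced, unmapped reads
            continue
        total += int(mapped)
        if name in mito_names:
            mito += int(mapped)
    return mito / total if total else 0.0


# Made-up counts for illustration only.
example = (
    "chr1\t248956422\t700000\t0\n"
    "chr2\t242193529\t200000\t0\n"
    "chrM\t16569\t100000\t0\n"
    "*\t0\t0\t5000\n"
)
print(mito_fraction(example))
```

Comparing this fraction between a library made with the original protocol and one made with Omni-ATAC is one simple way to check whether the optimization had the intended effect on your samples.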
Figure 1: Schematic of the ATAC-seq transposition reaction and library preparation
Figure 3: Assessing ATAC-seq library quality
Sequencing application | Insight gained | Minimum read length† | Index length* | Paired- or single-end | Sequencing depth (reads per sample) |
Gene regulatory landscape profiling | Peaks, differential peaks between samples, motif analysis of peaks | 36 bp | 8 | Paired | 10M |
Genotyping | Gene regulatory landscape + genotype of sample; useful for patient samples and to determine if sequence variants affect a peak. | 100 bp | 8 | Paired | 10M |
Footprinting Analysis | Footprinting of different TFs to determine binding sequence at base-pair resolution | 36 bp | 8 | Paired | 200M |
Nucleosome occupancy | Location of nucleosomes along DNA | 36 bp | 8 | Paired | 60M |
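The depth recommendations in the table translate directly into how many samples can share one sequencing run. Below is a rough sketch of that arithmetic. The per-application read counts come from the table; the run yield of 400 million reads and the names `RECOMMENDED_READS` and `samples_per_run` are hypothetical, chosen only for illustration.

```python
# Recommended reads per sample, taken from the table above.
RECOMMENDED_READS = {
    "Gene regulatory landscape profiling": 10_000_000,
    "Genotyping": 10_000_000,
    "Footprinting analysis": 200_000_000,
    "Nucleosome occupancy": 60_000_000,
}


def samples_per_run(application, run_yield_reads):
    """Return how many samples fit in one run at the recommended depth."""
    return run_yield_reads // RECOMMENDED_READS[application]


# Hypothetical run yield; substitute the figure for your own instrument.
run_yield = 400_000_000
for application in RECOMMENDED_READS:
    print(application, samples_per_run(application, run_yield))
```

Footprinting needs twenty times the depth of standard profiling, so the same hypothetical run holds 40 profiling samples but only 2 footprinting samples.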
Figure 4: Overview of the steps of ATAC-seq data analysis
Step/Process | ENCODE ATAC-seq | PEPATAC | nf-core atacseq |
Version compared | v1.10.0 | v0.10.0 | v1.2.1 |
Execution framework | Cromwell/caper | Pypiper | Nextflow |
Adapter trimming, alignment, and deduplication | Cutadapt, bowtie2, Picard | Trimmomatic, skewer, bowtie2, BWA, samblaster, Picard | TrimGalore, BWA, Picard |
Tn5 shift correction | Yes | Yes | No |
Mitochondrial read filtering | Yes | Yes | Yes |
Peak-calling method | MACS2 | MACS2 (default), F-seq, Genrich | MACS2 |
Peak-merging method | Based on the irreproducible discovery rate (IDR) for replicates; does not merge for a whole set of samples | Fixed-width, iterative overlap | Raw peak overlap using bedtools merge [109] |
Outputs | BAM files, bigwig files (one representing fold enrichment over expected background and the other representing statistical significance), BED file of peaks for each file and for the merged peak set | QC plots including alignment scoring, TSS scores and library complexity, BED peaks and counts, BAM files, bigwig files (nucleotide resolution and smoothed) | QC HTML report, BAM files, normalized bigwig files, BED peaks, annotation of peaks (HOMER), merged peak set, differential accessibility (DESeq2), IGV output. |
Code repository | https://github.com/ENCODE-DCC/atac-seq-pipeline | https://github.com/databio/pepatac | https://github.com/nf-core/atacseq |
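The table only records whether each pipeline applies a Tn5 shift correction, not how. As a rough illustration, the sketch below uses the commonly cited convention of moving the 5' end of plus-strand reads by +4 bp and of minus-strand reads by -5 bp, so that the reported position approximates the centre of the Tn5 insertion. This convention is an assumption brought in from general ATAC-seq practice, not something stated in the table, and individual pipelines may use slightly different offsets or coordinate systems. The function name `shift_tn5` is hypothetical.

```python
def shift_tn5(five_prime_pos, strand):
    """Shift a read's 5' end to approximate the Tn5 insertion centre.

    Assumes the widely used +4 / -5 bp convention; check the
    documentation of your pipeline before relying on these offsets.
    """
    if strand == "+":
        return five_prime_pos + 4
    if strand == "-":
        return five_prime_pos - 5
    raise ValueError(f"unknown strand: {strand!r}")


# Made-up read positions for illustration only.
reads = [(1000, "+"), (1200, "-")]
print([shift_tn5(pos, strand) for pos, strand in reads])
```

The correction matters most for base-pair-resolution analyses such as footprinting; for ordinary peak calling a shift of a few base pairs changes little.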
Figure 5: Schematic of peak merging strategies and the resulting merged peak sets
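Two of the merging strategies named in the pipeline comparison, raw peak overlap and fixed-width iterative overlap, can be contrasted with a small sketch. This is a simplified illustration under stated assumptions, not the code any of the three pipelines runs: it handles a single chromosome, uses half-open intervals, and the function names, the `half_width` of 250 bp, and the example peaks are all made up for the demonstration.

```python
def merge_raw_overlap(peaks):
    """Union overlapping or touching (start, end) intervals into one peak each."""
    merged = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(p) for p in merged]


def merge_iterative_overlap(summits, half_width=250):
    """Keep the best-scoring fixed-width peak, drop peaks that overlap it, repeat.

    summits: iterable of (summit_position, score).
    """
    kept = []
    for summit, score in sorted(summits, key=lambda s: s[1], reverse=True):
        start, end = summit - half_width, summit + half_width
        # Keep the window only if it overlaps nothing already retained.
        if all(end <= k_start or start >= k_end for k_start, k_end, _ in kept):
            kept.append((start, end, score))
    return sorted(kept)


# Made-up peaks: three overlapping windows and one isolated window.
raw_peaks = [(100, 600), (500, 1000), (900, 1400), (3000, 3500)]
summits = [(350, 8.0), (750, 20.0), (1150, 5.0), (3250, 12.0)]

print(merge_raw_overlap(raw_peaks))
print(merge_iterative_overlap(summits))
```

With these inputs, raw overlap chains the three neighbouring peaks into a single 1,300 bp region, whereas the iterative approach keeps only the strongest 500 bp window in that cluster. That difference in peak width is the practical trade-off between the two strategies.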
Omni-ATAC is designed specifically for bulk ATAC-seq. For single-cell ATAC-seq, established commercial solutions such as those from 10x Genomics are a good reference.