Comparison of four ChIP-Seq analytical algorithms using rice endosperm H3K27 trimethylation profiling data.
Malone, Brandon M.
Bridges, Susan M.
Chromatin immunoprecipitation coupled with high-throughput DNA sequencing (ChIP-Seq) has emerged as a powerful tool for genome-wide profiling of the binding sites of DNA-associated proteins such as histones and transcription factors. However, no peak calling program has gained consensus acceptance by the scientific community as the preferred tool for ChIP-Seq data analysis, and analyzing the large data sets generated by ChIP-Seq studies remains highly challenging for most molecular biology laboratories. Here we profile H3K27me3 enrichment sites in young rice endosperm using the ChIP-Seq approach and analyze the data using four peak calling algorithms (FindPeaks, PeakSeq, USeq, and MACS). Comparison of the four algorithms reveals that these programs produce very different peaks in terms of peak size, number, and position relative to genes. We verify the predicted peaks by ChIP-PCR to evaluate the prediction accuracy of the four algorithms. We discuss the approach of each algorithm and compare similarities and differences in the results. Despite the differences in the peaks they identify, all of the programs lead to similar conclusions about the effect of H3K27me3 on gene expression: the presence of H3K27me3 either upstream or downstream of a gene is predominantly associated with repression of that gene. Additionally, GO analysis finds that a substantially higher proportion of H3K27me3-associated genes than of genes in the rest of the rice genome is involved in multicellular organism development, signal transduction, response to external and endogenous stimuli, and secondary metabolic pathways.
College: James Worth Bagley College of Engineering
College of Agriculture and Life Sciences
Department: Department of Computer Sciences and Engineering
Department of Biochemistry and Molecular Biology