Making ChIP-Sequencing User-Friendly

Evaluation of Genome-wide Analysis Platforms for Chromatin Immunoprecipitation

Chromatin states can influence transcription directly by altering the packaging of DNA to allow or prevent access to DNA-binding proteins, or they can modify the nucleosome surface to enhance or impede recruitment of effector protein complexes. Genome-wide mapping of protein-DNA interactions and epigenetic marks helps us better understand transcriptional regulation.

Prior to the development of next-gen sequencing technologies, ChIP-on-chip, or location analysis, was the method of choice for exploring genome-wide protein:DNA occupancy patterns. However, ChIP-seq has been increasing in popularity due to the data density and lack of interrogation bias that this platform provides. Despite this growing popularity, few studies have explored differences in the binding site profiles obtained from the two methods.

In this article, we present data that compare the performance of ChIP antibodies in ChIP-chip and in ChIP-seq. For these studies, ChIP-chip was performed using Agilent microarrays and ChIP-seq using the Illumina Genome Analyzer platform. These studies may serve as a basis for developing specifications for acceptable performance on these genome-wide platforms, facilitating comparisons of data sets in shared databases and in published research.

The macromolecular structure of chromatin in eukaryotic cells is dynamic, and various epigenetic marks help define a static chromatin state.1 This chromatin state is a reflection of accessibility and/or presence of certain protein:DNA or protein:protein interactions in a location- or region-specific manner. This dynamic and coordinated interaction directly influences the expression of a particular gene locus. Thus, the elucidation of these interactions is essential for a deeper understanding of a variety of biological processes and disease states.

One of the main tools for investigating these interactions is ChIP (chromatin immunoprecipitation). ChIP is a powerful technique classically used for mapping the in vivo distribution of proteins associated with chromosomal DNA. These proteins can be histone subunits, transcription factors, chromatin modifiers (enzymes), or other regulatory or structural proteins bound either directly or indirectly to DNA. With high-quality antibodies, both the regions of chromosomal DNA that interact with these proteins and the proteins' post-translational modifications can be detected. Typically, either end-point or quantitative PCR is performed to verify whether a particular DNA sequence (the gene or region of the genome) is associated with the protein of interest. Using this classical approach, laboratories can evaluate the interactions of the proteins of interest for a limited number of known target genes.

However, as the need to map, characterize, and understand these interactions across the epigenome has grown, labs have turned to genome-wide approaches for the analysis of ChIP using either DNA-bearing microarrays (ChIP-chip) or next-gen sequencing (ChIP-seq). This combination of chromatin immunoprecipitation with genome-wide analysis represents a powerful approach that can provide a comprehensive look at transcription factor binding as well as histone modifications and other chromatin-associated proteins (Figure 1). Although ChIP-chip is a powerful approach and lends itself to the detailed analysis of specific regions of the genome or gene families using high-density arrays, the ChIP-seq approach provides genome-wide data with high resolution and a wide dynamic range, thus allowing for comprehensive coverage of the genome.

Figure 1. ChIP-chip and ChIP-seq chromatin immunoprecipitation workflow

In the ChIP-chip workflow, proteins bound to DNA are cross-linked and the DNA is sonicated or digested by enzymes to fragment the chromatin into soluble, lower-molecular-weight species. For tightly bound proteins such as histones, crosslinking may not be necessary and the procedure known as native ChIP can be utilized. Chromatin immunoprecipitation is performed to isolate DNA bound to the protein(s) of interest. ChIP’d DNA is then amplified, labeled and hybridized to a DNA microarray. The array is scanned and the array image is analyzed to identify DNA segments bound to the protein.

The ChIP-seq workflow is similar in that proteins bound to DNA are cross-linked and chromatin fragments are generated. Immunoprecipitation isolates the desired fragments. The resulting ChIP DNA is end-repaired, a dA overhang may be added, and platform-specific adaptors are ligated to the processed ChIP DNA. The ligated DNA library is then size-selected by agarose gel or other methods and amplified by PCR. The final ChIP-seq library is examined by a two-step quality control process: microfluidic electrophoresis (e.g., the Agilent 2100 Bioanalyzer or Bio-Rad Experion) checks the size and concentration, while real-time PCR checks for enrichment of the library as compared to the input library. The ChIP library is then sequenced.

In this study, two kits from EMD Millipore containing the required reagents and protocols for either microarray or next-gen sequencing were used, specifically the Magna ChIP (microarray) and the Magna ChIP-seq chromatin immunoprecipitation kit (next-gen sequencing). We targeted Sp1, a transcription factor that binds with high affinity to GC-rich motifs and regulates the expression of a large number of genes involved in a variety of processes such as cell growth, apoptosis, differentiation, and immune responses. The ChIP-seq library was constructed using either 10 ng or 1 ng of Sp1 ChIP’d DNA. The size and concentration of the resulting library were assessed using an automated gel electrophoresis system (Bio-Rad Experion). Data shown in Figure 2 validate the ChIP-seq library from 1 ng of Sp1 ChIP DNA. DNA fragments following end repair, adapter addition, and amplification were in the 180-300 base pair range.

Figure 2. ChIP-seq library validation from 1 ng of Sp1 ChIP DNA. Electrophoresis was used to assess DNA quality and purity.

Figure 3A shows real-time PCR verification of target DNA enrichment from a ChIP reaction performed with a Magna ChIP™ Chromatin Immunoprecipitation kit and an Sp1 ChIP-validated antibody (ChIPAb+, EMD Millipore). Successful enrichment of Sp1-associated DNA fragments was verified by qPCR, both pre- and post-library construction, using ChIP primers flanking the human DHFR promoter, which contains an Sp1 binding site (Figure 3B). The results confirm that the ChIP reaction produced enriched chromatin using this antibody (A) and, by comparison with equivalent amounts of input library, that enrichment of this region was maintained through library construction (B).
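The qPCR check described above comes down to simple arithmetic on threshold-cycle (Ct) values. The Python sketch below is illustrative only: the article does not state which calculation was used, so it assumes the common percent-of-input and 2^ΔCt fold-enrichment conventions with 100% amplification efficiency. The function names and all Ct values are hypothetical.

```python
import math


def percent_input(ct_chip, ct_input, input_fraction=0.01):
    """Percent-of-input signal for a ChIP sample.

    Assumes 100% PCR efficiency (signal doubles every cycle) and that the
    input sample represents `input_fraction` of the chromatin used in the
    ChIP (0.01 = a 1% input; this default is an assumption, not from the
    article).
    """
    # Shift the input Ct to the value expected for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_chip)


def fold_enrichment(ct_sample, ct_reference):
    """Fold enrichment of a sample over a reference measured with the same
    primers on equal amounts of DNA (for example, ChIP library versus input
    library), again assuming 100% PCR efficiency."""
    return 2.0 ** (ct_reference - ct_sample)


if __name__ == "__main__":
    # Hypothetical Ct values for a promoter amplicon, not measured data.
    print(percent_input(ct_chip=26.0, ct_input=25.0))
    print(fold_enrichment(ct_sample=24.0, ct_reference=29.0))
```

A lower Ct means more template, so a ChIP library that crosses threshold five cycles earlier than an equal mass of input library is about 32-fold enriched under these assumptions.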

Figure 3. ChIP-seq library QC validation pre (A) and post (B) amplification.

A comparison of ChIP-chip and ChIP-seq Sp1 targets is shown in Figure 4. The ChIP-chip workflow identified 9,462 putative Sp1 target genes (threshold log2 ratio of Signal/WCE ≥2), and the ChIP-seq workflow identified 13,874 peaks of occupancy when considering non-redundant target gene binding sites (peak threshold of ≥6.5-fold enrichment, Signal/Input). A total of 6,083 putative target genes were identified by both methods. The smaller number of targets identified by ChIP-chip may result from the reduced representation associated with the promoter array, whereas the ChIP-seq method allows genome-wide coverage.
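The thresholding and overlap behind this comparison can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis pipeline used in the study: it assumes that ChIP-seq peaks have already been assigned to genes, and the gene scores in the toy dictionaries are invented. Only the two cutoffs and the reported counts come from the text.

```python
from typing import Dict, Set

# Cutoffs quoted in the article.
CHIP_CHIP_LOG2_CUTOFF = 2.0  # log2(Signal/WCE) >= 2, i.e. at least 4-fold
CHIP_SEQ_FOLD_CUTOFF = 6.5   # Signal/Input >= 6.5-fold


def chip_chip_targets(log2_ratios: Dict[str, float]) -> Set[str]:
    """Genes whose array log2 ratio meets the ChIP-chip cutoff."""
    return {g for g, r in log2_ratios.items() if r >= CHIP_CHIP_LOG2_CUTOFF}


def chip_seq_targets(fold_enrichment: Dict[str, float]) -> Set[str]:
    """Genes whose assigned peak meets the ChIP-seq fold-enrichment cutoff."""
    return {g for g, f in fold_enrichment.items() if f >= CHIP_SEQ_FOLD_CUTOFF}


# Toy, made-up scores (hypothetical; not data from the study).
array_scores = {"DHFR": 3.1, "SP1": 2.0, "GENE_A": 1.2, "GENE_B": 2.7}
peak_scores = {"DHFR": 12.0, "SP1": 8.4, "GENE_A": 7.0, "GENE_C": 3.0}

# Targets called by both platforms.
shared = chip_chip_targets(array_scores) & chip_seq_targets(peak_scores)

# Counts reported in the article: 9,462 ChIP-chip target genes, of which
# 6,083 were also identified by ChIP-seq.
shared_fraction_of_chip_chip = 6083 / 9462

if __name__ == "__main__":
    print(sorted(shared))
    print(f"{shared_fraction_of_chip_chip:.1%} of ChIP-chip targets were shared")
```

With the reported counts, roughly 64% of the ChIP-chip target genes were also found by ChIP-seq. The same fraction cannot be computed for ChIP-seq from the text alone, because its total is given in peaks rather than genes.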

Figure 4. Comparison of Sp1 ChIP-chip and ChIP-seq targets.

Sp1 binding at autoregulatory binding sites within the Sp1 promoter region is confirmed by both the 1 ng and 10 ng ChIP-seq libraries when compared to the input library (Figure 5A). A genome browser view showed enrichment of Sp1 at the Sp1 promoter region in the ChIP-chip experiment (upper panel), which is consistent with results observed in ChIP-seq Sp1 analysis of an alternate cell line by the ENCODE consortium (Figure 5B).

Figure 5. Comparison of replicate libraries to existing ChIP-seq profiles.

ChIP-chip and ChIP-seq are powerful tools for the exploration of nuclear protein:DNA interactions on a locus-specific and genome-wide basis. In this study, we compared both platforms using an Sp1 ChIP-validated antibody. While the data generated from ChIP-chip and ChIP-seq were consistent with each other, the ChIP-seq technique used less ChIP DNA as a starting point (1-10 ng) and offered better coverage of the genome. As the costs of whole-genome sequencing continue to fall, ChIP-seq will come within reach of more laboratories. Use of this technique will be further enabled through kits and reagents specifically designed to facilitate sample preparation and immunoprecipitation.