Proteomics is a field of study that, although in its fourth decade, is really still in the throes of birth. The avowed mission of proteomics is to characterize all proteins, a goal that has only become a glimmer of hope with contemporary advances in liquid chromatography-mass spectrometry (LC-MS) systems, their capabilities, and their accessibility. The task, however, is daunting. Depending on how one estimates the potential size of the human proteome, there may be a million or more distinct proteins derived from our roughly 20,000 genes. Although, according to some estimates, about 10,000 proteins are involved in normal homeostatic maintenance, there is constant turnover, in addition to post-translational modifications, cell type-specific functions, and disease-relevant single nucleotide polymorphisms. The discrepancy between gene number and protein number points to the dynamic and complex nature of generating diverse molecules with immediate, but transient, biological function from comparatively finite and circumscribed codes, and to the conceptual difference between genetic instructions bound within chromosomal units and their progeny, free to mingle throughout biological systems. The proposed characterization of all of these proteins, including their subcellular localizations, interactions, domain structures, and activities, implies an infinite arc for the field of proteomics.
The popularization of LC-MS in proteomics has been in large part the story of technological innovation catching up with big ideas, and has opened the field vastly in comparison to when it was limited by legacy techniques such as two-dimensional gel electrophoresis and microarray studies. In principle, a mass spectrometer optimized for proteomics consists of four parts in series:
- An ion source;
- A mass analyzer that measures the mass-to-charge (m/z) ratio of ionized analytes;
- A detector that quantifies ions at a given m/z ratio; and
- An analytical algorithm to identify ions of interest and obtain structural information, based on MS spectra.
Variations in ionization and mass analyzer systems are suitable for different proteomics strategies. Two of the most common ionization platforms are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). Invention of the former garnered a share of the 2002 Nobel Prize in Chemistry for John Fenn, for its contribution to revolutionizing “bottom-up” and high-throughput proteomics, in which it is often combined with ion trap or Orbitrap mass analyzers; Koichi Tanaka shared the same prize for soft laser desorption ionization. Mass analyzers vary in their intrinsic sensitivity, resolution, accuracy, and ability to derive unique spectra for fragmented peptides. Analyzer platforms include quadrupole, Fourier transform ion cyclotron resonance, and time-of-flight (TOF); MALDI-TOF analysis is commonly used in smaller-scale proteomic studies to characterize protein interaction networks.
For broad, discovery-oriented proteomic analyses, investigators favor the “bottom-up” approach. In this strategy, proteins within biological samples such as cell extracts or animal tissues are digested into smaller peptides, fractionated, and subjected to LC-MS to acquire m/z and intensity measurements, resulting in distinct mass fingerprints. In tandem mass spectrometry (MS/MS) systems, fragmentation of selected peptides yields sequence information that bolsters confidence in the initial identification. From peptide spectra, investigators can infer protein identity, and from peak intensities they can quantify relative abundance in the starting sample. This process can be automated to an appreciable extent, with analytical algorithms that match acquired spectra against MS spectra databases. Because the sensitivity of current instrumentation can reach the attomole (10⁻¹⁸ mol) range, broad and reproducible coverage of a reasonably comprehensive snapshot of the functional proteome is possible.
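The digestion and m/z steps of the bottom-up workflow can be illustrated with a minimal Python sketch. It assumes trypsin as the protease (the text does not name one), uses the common simplified rule of cleaving after lysine (K) or arginine (R) unless the next residue is proline (P), and ignores missed cleavages and modifications. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565   # added once per peptide for the free N- and C-termini
PROTON_MASS = 1.007276   # mass of each charge-carrying proton


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


def peptide_mz(peptide, charge):
    """Return the monoisotopic m/z of an unmodified peptide at a given charge."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical sequence, used only to exercise the functions.
for pep in tryptic_digest("AAKGGRPLLKWW"):
    print(pep, round(peptide_mz(pep, 1), 4), round(peptide_mz(pep, 2), 4))
```

In practice, search engines compare lists of theoretical values like these against measured spectra within a mass tolerance; the sketch covers only the theoretical side of that comparison.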
Proteomics using LC-MS has, in recent years, become extremely attractive to the pharmaceutical industry as a means to streamline drug development pipelines, and to academic researchers identifying new potential lead compounds for small molecule-based therapeutics. The pharmaceutical pipeline generally proceeds as follows:
- Target ID and validation
- Lead generation
- Lead optimization
- Preclinical development, from animal studies into phase I trials
- Clinical development, from phase I through phase II and III trials and approval
The academic approach is somewhat more freewheeling, and often involves a high-throughput chemical screen to first identify a new molecule with an interesting biological function, followed by various empirical studies to try to ascertain its protein target(s). Most investigational new drugs are expected to fail somewhere along the pharma pipeline, but companies can forfeit tens or hundreds of millions of dollars when candidates fail in later trial stages. Via an emerging proteomic technique called CETSA-MS, pharma companies can potentially make go/no-go decisions very early in development to avoid wasting time and money, and academic researchers can identify targets more quickly and with greater confidence. The cellular thermal shift assay (CETSA) relies on the thermodynamic stabilization that occurs when a molecule or ligand binds to its cognate target protein, which imparts a conformational stability that resists heat-induced unfolding and aggregation. Combined with LC-MS, this is a “bottom-up” proteomic technique in which the proteins that remain soluble after heating are digested and labeled with mass tags so that thousands of them can be identified and quantified in parallel. An increase in a protein's melting temperature in the presence of the compound points to a potential targeting event, and iterative analysis can successively narrow the list of potential targets to define target engagement. Just as importantly, this analysis can identify off-target binding events with unexpected proteins.
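The melting-temperature comparison behind CETSA can be sketched in a few lines of Python. This is a simplification under stated assumptions: soluble fractions are taken to be normalized to the lowest temperature, the melting temperature (Tm) is estimated by linear interpolation at the 50% crossing rather than by the sigmoidal curve fitting that real analyses typically use, and the temperatures, fractions, and function names are invented for illustration.

```python
def melting_temperature(temps, soluble_fraction):
    """Estimate Tm as the temperature where the soluble fraction drops to 0.5.

    Uses linear interpolation between the two measurements that bracket 0.5.
    Returns None if the curve never crosses 0.5 in the measured range.
    """
    points = list(zip(temps, soluble_fraction))
    for (t_lo, f_lo), (t_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 > f_hi:
            return t_lo + (f_lo - 0.5) * (t_hi - t_lo) / (f_lo - f_hi)
    return None


def thermal_shift(temps, vehicle, treated):
    """Return Tm(treated) - Tm(vehicle), or None if either Tm is undefined."""
    tm_vehicle = melting_temperature(temps, vehicle)
    tm_treated = melting_temperature(temps, treated)
    if tm_vehicle is None or tm_treated is None:
        return None
    return tm_treated - tm_vehicle


# Invented example data for one protein: temperatures in degrees Celsius and
# the fraction of protein remaining soluble, with and without compound.
temps = [37, 41, 45, 49, 53, 57, 61]
vehicle = [1.0, 0.95, 0.80, 0.40, 0.15, 0.05, 0.02]
treated = [1.0, 0.98, 0.92, 0.75, 0.35, 0.10, 0.03]

shift = thermal_shift(temps, vehicle, treated)
print("Estimated thermal shift (deg C):", shift)
```

A positive shift suggests stabilization by the compound. In a proteome-wide experiment the same comparison would be repeated for every quantified protein, with replicates and statistical filtering that this sketch omits.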
In this way, investigators have, for instance, unraveled a mechanism responsible for resistance to a small molecule inhibitor of anaplastic lymphoma kinase (ALK), a protein involved in several types of cancer. First, a CETSA study confirmed that the inhibitor, crizotinib, binds ALK protein; second, CETSA-MS identified a competitive effect in a subpopulation of cells that expressed high levels of a protein called beta-catenin, which interfered with crizotinib-ALK binding and made the drug less effective. Because of the simple power of proteomics studies like this, pharma companies are devoting research and development resources to building and augmenting their CETSA-MS capabilities, and academic biomedical research institutions are developing dedicated CETSA-MS facilities within their proteomics cores. In addition, contract-based corporations such as Pelago Bioscience are offering CETSA-based services for target ID discovery and/or confirmation, off-target discovery, pathway deconvolution, and mode of action studies. These technologies and services are at the forefront of a budding proteomics revolution.