Identification of Analytical Factors Affecting Complex Proteomics Profiles Acquired in a Factorial Design Study with Analysis of Variance: Simultaneous Component Analysis

Vikram Mitra, Natalia Govorukhina, Gooitzen Zwanenburg, Huub Hoefsloot, Inge Westra, Age Smilde, Theo Reijmers, Ate G.J. Van Der Zee, Frank Suits, Rainer Bischoff, Péter Horvatovich*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Complex shotgun proteomics peptide profiles obtained in quantitative differential protein expression studies, such as in biomarker discovery, may be affected by multiple experimental factors. These preanalytical factors may affect the measured protein abundances which in turn influence the outcome of the associated statistical analysis and validation. It is therefore important to determine which factors influence the abundance of peptides in a complex proteomics experiment and to identify those peptides that are most influenced by these factors. In the current study we analyzed depleted human serum samples to evaluate experimental factors that may influence the resulting peptide profile such as the residence time in the autosampler at 4 °C, stopping or not stopping the trypsin digestion with acid, the type of blood collection tube, different hemolysis levels, differences in clotting times, the number of freeze-thaw cycles, and different trypsin/protein ratios. To this end we used a two-level fractional factorial design of resolution IV ($2^{7-3}_{\mathrm{IV}}$). The design required analysis of 16 samples in which the main effects were not confounded by two-factor interactions. Data preprocessing using the Threshold Avoiding Proteomics Pipeline (Suits, F.; Hoekman, B.; Rosenling, T.; Bischoff, R.; Horvatovich, P. Anal. Chem. 2011, 83, 7786-7794, ref 1) produced a data-matrix containing quantitative information on 2 559 peaks. The intensity of the peaks was log-transformed, and peaks having intensities of a low t-test significance (p-value > 0.05) and a low absolute fold ratio (<2) between the two levels of each factor were removed. The remaining peaks were subjected to analysis of variance (ANOVA)-simultaneous component analysis (ASCA, ref 2). Permutation tests were used to identify which of the preanalytical factors influenced the abundance of the measured peptides most significantly.
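To illustrate why a $2^{7-3}$ design needs only 16 runs, the sketch below builds one such design in Python. The abstract does not state which generators were used, so the generators here (E = ABC, F = BCD, G = ACD) are an assumption, chosen because they are a common textbook choice that yields resolution IV. The factor letters A to G are placeholders and are not mapped to the seven factors of the study.

```python
from itertools import combinations, product

def build_design():
    """Return the 16 runs of a 2^(7-3) design as tuples of -1/+1 levels.

    Assumed generators (not taken from the paper): E=ABC, F=BCD, G=ACD.
    """
    runs = []
    # Full factorial in the four base factors A, B, C, D: 2^4 = 16 runs.
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c  # generated factor E
        f = b * c * d  # generated factor F
        g = a * c * d  # generated factor G
        runs.append((a, b, c, d, e, f, g))
    return runs

def main_effects_unconfounded(design):
    """Check that every main-effect column is orthogonal to every
    two-factor interaction column (the resolution IV property)."""
    n_factors = len(design[0])
    for i in range(n_factors):
        for j, k in combinations(range(n_factors), 2):
            dot = sum(run[i] * run[j] * run[k] for run in design)
            if dot != 0:
                return False
    return True

design = build_design()
n_runs = len(design)
resolution_ok = main_effects_unconfounded(design)
```

With these generators every word in the defining relation has length four, so main effects are aliased only with three-factor and higher interactions, while some two-factor interactions remain aliased with each other.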
The most important preanalytical factors affecting peptide intensity were (1) the hemolysis level, (2) stopping trypsin digestion with acid, and (3) the trypsin/protein ratio. This provides guidelines for the experimentalist to keep the ratio of trypsin/protein constant and to control the trypsin reaction by stopping it with acid at an accurately set pH. The hemolysis level cannot be controlled tightly as it depends on the status of a patient's blood (e.g., red blood cells are more fragile in patients undergoing chemotherapy) and the care with which blood was sampled (e.g., by avoiding shear stress). However, its level can be determined with a simple UV spectrophotometric measurement and samples with extreme levels or the peaks affected by hemolysis can be discarded from further analysis. The loadings of the ASCA model led to peptide peaks that were most affected by a given factor, for example, to hemoglobin-derived peptides in the case of the hemolysis level. Peak intensity differences for these peptides were assessed by means of extracted ion chromatograms confirming the results of the ASCA model.
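The idea of ranking factors by permutation test can be sketched as follows. This is a simplified illustration, not the authors' implementation: it takes the sum of squares of the level-mean effect matrix for a single factor as the test statistic and shuffles that factor's level labels across samples. The function names, the toy data, and the number of permutations are all assumptions made for the example.

```python
import random

def effect_sum_of_squares(data, levels):
    """Sum of squares of one factor's effect matrix.

    data   : list of samples, each a list of log-transformed peak intensities
    levels : one level label (-1 or +1) per sample
    """
    n_peaks = len(data[0])
    grand = [sum(row[j] for row in data) / len(data) for j in range(n_peaks)]
    ssq = 0.0
    for level in sorted(set(levels)):
        rows = [row for row, lab in zip(data, levels) if lab == level]
        for j in range(n_peaks):
            level_mean = sum(r[j] for r in rows) / len(rows)
            ssq += len(rows) * (level_mean - grand[j]) ** 2
    return ssq

def permutation_p_value(data, levels, n_perm=999, seed=0):
    """Fraction of label permutations whose effect is at least as large
    as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = effect_sum_of_squares(data, levels)
    shuffled = list(levels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small relative tolerance guards against floating-point ties.
        if effect_sum_of_squares(data, shuffled) >= observed * (1 - 1e-12):
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 16 samples, 3 peaks; the factor shifts peak 0 only.
toy_rng = random.Random(1)
levels = [-1] * 8 + [1] * 8
data = [[10.0 + (2.0 if lab == 1 else 0.0) + toy_rng.gauss(0, 0.1),
         8.0 + toy_rng.gauss(0, 0.1),
         6.0 + toy_rng.gauss(0, 0.1)] for lab in levels]
p_value = permutation_p_value(data, levels)
```

In a full ASCA analysis the effect matrices of all factors are estimated together and each is then decomposed by simultaneous component analysis; the loadings of that decomposition are what point to the peaks most affected by a factor.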

Original language: English
Pages (from-to): 4229-4238
Number of pages: 10
Journal: Analytical Chemistry
Volume: 88
Issue number: 8
Publication status: Published - 3 May 2016
Externally published: Yes
