TY - JOUR
T1 - Parallel processing of large datasets from nanoLC-FTICR-MS measurements
AU - van der Burgt, Y.E.M.
AU - Taban, I.M.
AU - Konijnenburg, M.
AU - Biskup, M.
AU - Duursma, M.C.
AU - Heeren, R.M.A.
AU - Römpp, A.
AU - van Nieuwpoort, R.V.
AU - Bal, H.E.
N1 - burgt07:_paral_fticr_ms
PY - 2007
Y1 - 2007
AB - A new approach for automatic parallel processing of large mass spectral datasets in a distributed computing environment is demonstrated to significantly decrease the total processing time. The implementation of this novel approach is described and evaluated for large nanoLC-FTICR-MS datasets. The speed benefits are determined by the network speed and file transfer protocols only and allow almost real-time analysis of complex data (e.g., a 3-gigabyte raw dataset is fully processed within 5 min). Key advantages of this approach are not limited to the improved analysis speed, but also include the improved flexibility, reproducibility, and the possibility to share and reuse the pre- and postprocessing strategies. The storage of all raw data combined with the massively parallel processing approach described here allows the scientist to reprocess data with a different set of parameters (e.g., apodization, calibration, noise reduction), as is recommended by the proteomics community. This approach of parallel processing was developed in the Virtual Laboratory for e-Science (VL-e), a science portal that aims at allowing access to users outside the computer research community. As such, this strategy can be applied to all types of serially acquired large mass spectral datasets such as LC-MS, LC-MS/MS, and high-resolution imaging MS results. © 2007 American Society for Mass Spectrometry.
DO - 10.1016/j.jasms.2006.09.005
M3 - Article
SN - 1044-0305
VL - 18
SP - 152
EP - 161
JO - Journal of the American Society for Mass Spectrometry
JF - Journal of the American Society for Mass Spectrometry
ER -