1. ABOUT THE DATASET
------------

Title: Dataset to support 'Speciation Analysis of Fungi by LAP-MALDI Mass Spectrometry'

Creator(s): Lily R. Adair (https://orcid.org/0009-0001-2652-9960), Ian M. Jones, Rainer Cramer (https://orcid.org/0000-0002-8037-2511)

Organisation(s): University of Reading

Rights-holder(s): University of Reading

Publication Year: 2025

Description: This dataset contains raw and processed liquid atmospheric pressure matrix-assisted laser desorption/ionisation (LAP-MALDI) mass spectrometry (MS) data from two fungal species, Saccharomyces cerevisiae and Candida albicans, along with the corresponding MS/MS data of lipids and proteins detected in the MS profiles. The m/z values from the lipid MS profiles were searched against the LIPID MAPS database at www.lipidmaps.org/bulk_search. Raw protein MS/MS data were processed using Mascot Distiller (version 2.8.5.1, 64-bit; Matrix Science, London, UK). The resulting protein fragment ion peak lists were submitted to the Mascot MS/MS Ions Search tool (version 3.1; Matrix Science) for protein characterisation and species identification. Protein database searches were conducted against the Mascot contaminants database (29 January 2016; 247 sequences; 128,130 residues) and the Swiss-Prot database (22 May 2024; 571,282 sequences; 206,678,396 residues). The Mascot-identified amino acid sequences were submitted to the BLAST search routine at www.uniprot.org/blast. All MS and MS/MS data were acquired using a modified Synapt G2-Si instrument coupled to an in-house-built atmospheric pressure MALDI source.

Changelog: N/A

Cite as: Adair, Lily Rose and Cramer, Rainer (2025): Dataset to support 'Speciation analysis of fungi by LAP-MALDI mass spectrometry'. University of Reading. Dataset. https://doi.org/10.17864/1947.001426

Related publication: Adair, L. R., Jones, I. M., & Cramer, R. Speciation Analysis of Fungi by LAP-MALDI Mass Spectrometry.
Frontiers in Cellular and Infection Microbiology. Submitted for review.

Contact: Prof. Rainer Cramer, Department of Chemistry, University of Reading, Reading RG6 6DX, United Kingdom, Tel: 0118 378 4550, Email: r.k.cramer@reading.ac.uk

Acknowledgements: This research was supported by the Engineering and Physical Sciences Research Council (EPSRC) through grant EP/V047485/1.

2. TERMS OF USE
------------

Copyright 2025 University of Reading. This dataset is licensed under a Creative Commons Attribution 4.0 International Licence: https://creativecommons.org/licenses/by/4.0/.

3. PROJECT AND FUNDING INFORMATION
------------

Title: A Cost-Effective High-Speed Clinical Diagnostics Instrument for Large Population Screening Based on Novel Liquid AP-MALDI MS Technology
Dates: 2021-2025
Funding organisation: Engineering and Physical Sciences Research Council
Grant no.: EP/V047485/1

Title: Advancing LAP-MALDI mass spectrometry profiling/biotyping for the analysis of microbes and their pathogenicity
Dates: 2022-2025
Funding organisation: University of Reading

4. CONTENTS
------------

Raw_Data.zip
Processed_Peak_Lists.zip
Mascot_Search_Results.zip
Lipid_Maps.zip
Blast_Searches.zip

Raw_Data.zip contains two folders, one for each species, each with two subfolders, 'Proteins' and 'Lipids'. For S. cerevisiae, there are 9 files in the 'Lipids' folder (1 MS and 8 MS/MS) and 7 files in the 'Proteins' folder (1 MS and 6 MS/MS). For C. albicans, there are 4 files in the 'Lipids' folder (1 MS and 3 MS/MS) and 6 files in the 'Proteins' folder (1 MS and 5 MS/MS).

Processed_Peak_Lists.zip contains 3 .txt files (2 for S. cerevisiae and 1 for C. albicans), which were used for database searching.

Mascot_Search_Results.zip contains 3 PDF files for each protein searched, with the output from the Mascot searches.

Lipid_Maps.zip contains 1 Excel file with 2 sheets, 1 for each species.
Blast_Searches.zip contains 3 *.png files, which show the complete BLAST search results for each of the 3 proteins identified.

5. METHODS
------------

The LAP-MALDI MS and MS/MS data were generated using a commercial mass spectrometer, a Synapt G2-Si (Waters Corporation), equipped with a custom-built AP-MALDI source. MassLynx software (ver. 4.2; Waters) was used to acquire and process the data (.raw data folders). Further processing was carried out using Mascot Distiller (ver. 2.8.5.1; Matrix Science). Protein database searches were conducted using the Mascot MS/MS Ions Search tool (version 3.1; Matrix Science) against the Mascot contaminants database (29 January 2016; 247 sequences; 128,130 residues) and the Swiss-Prot database (22 May 2024; 571,282 sequences; 206,678,396 residues). Lipids were assigned using LIPID MAPS (www.lipidmaps.org/bulk_search). BLAST searches were performed at www.uniprot.org/blast with the Mascot-identified amino acid sequences, using the default parameters and 'UniProtKB reference proteomes + Swiss-Prot' as the target database.
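
As a convenience for checking a download against the CONTENTS section, the following is a minimal Python sketch that counts the files in each archive by extension. It is not part of the dataset or of the authors' workflow. It assumes the five archives sit in the current working directory under the names listed in CONTENTS. The helper names (count_by_extension, summarise_archive) are hypothetical. Because MassLynx .raw acquisitions are folders rather than single files, the per-extension counts for Raw_Data.zip are not expected to match the acquisition counts quoted in CONTENTS directly.

```python
import os
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Archive names as listed in the CONTENTS section of this README.
EXPECTED_ARCHIVES = [
    "Raw_Data.zip",
    "Processed_Peak_Lists.zip",
    "Mascot_Search_Results.zip",
    "Lipid_Maps.zip",
    "Blast_Searches.zip",
]


def count_by_extension(member_names):
    """Count file entries (not directory entries) by lower-cased extension."""
    counts = Counter()
    for name in member_names:
        if name.endswith("/"):  # directory entry inside the archive
            continue
        counts[PurePosixPath(name).suffix.lower()] += 1
    return dict(counts)


def summarise_archive(path):
    """Open a zip archive and return its per-extension file counts."""
    with zipfile.ZipFile(path) as archive:
        return count_by_extension(archive.namelist())


if __name__ == "__main__":
    for archive_name in EXPECTED_ARCHIVES:
        if os.path.exists(archive_name):
            print(archive_name, summarise_archive(archive_name))
        else:
            print(archive_name, "not found in the current directory")
```

Run from the download folder, the script prints one line per archive. For example, Processed_Peak_Lists.zip would be expected to report 3 .txt files if it contains nothing else.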