Aberystwyth Repository of Metabolite Characteristics

The ARMeC Repository Project

ARMeC is a repository of species-specific metabolomic characteristics. It was originally developed as a tool for annotating flow injection electrospray mass spectrometry (FIE-MS) fingerprints using signal characteristics such as the (de)protonated molecular ion, salt adducts, neutral losses and associated dimeric combinations, coupled with MS/MSn ion fragmentation trees and/or output from supervised multivariate analyses. Database construction began by compiling, for a given species, a metabolome list with accompanying molecular structure, formula and pathway information drawn from literature reports and publicly available databases. A protocol outlining a workflow for the creation of species-specific electrospray ionization mass spectral (ESI-MS) databases and the annotation of explanatory signals within FIE-MS fingerprints has recently been submitted for online publication at Nature Protocols. ARMeC was originally designed to query FIE-MS fingerprints, but it can also be readily applied to the annotation of HPLC-ESI-MS chromatograms (queries based on retention time are not yet possible). We are currently expanding the database content by adding further species, with a primary focus on particular food crops, as well as the human metabolome for use with nutrition-based problems.
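The signal characteristics listed above (protonated and deprotonated ions, salt adducts, neutral losses and dimers) can be illustrated with a short calculation. The following is a minimal Python sketch, not ARMeC's own implementation: the adduct list, the function names `adduct_mz` and `annotate`, and the default tolerance are all assumptions chosen for illustration. The mass shifts are standard monoisotopic values, corrected for the electron mass.

```python
# Illustrative sketch only: the adduct list, names and tolerance are assumptions,
# not taken from ARMeC.
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)

# adduct label -> (number of molecules n, mass shift in Da) for singly charged ions
ADDUCTS = {
    "[M+H]+": (1, PROTON),
    "[M+Na]+": (1, 22.989218),
    "[M+K]+": (1, 38.963158),
    "[M+H-H2O]+": (1, PROTON - WATER),   # neutral loss of water
    "[2M+H]+": (2, PROTON),              # protonated dimer
    "[M-H]-": (1, -PROTON),
    "[M+Cl]-": (1, 34.969402),
    "[2M-H]-": (2, -PROTON),             # deprotonated dimer
}


def adduct_mz(mono_mass):
    """Return the expected m/z of each adduct for a neutral monoisotopic mass."""
    return {name: n * mono_mass + shift for name, (n, shift) in ADDUCTS.items()}


def annotate(observed_mz, mono_mass, tol=0.2):
    """List the adduct labels whose expected m/z lies within tol of observed_mz."""
    return [
        name
        for name, mz in adduct_mz(mono_mass).items()
        if abs(mz - observed_mz) <= tol
    ]


if __name__ == "__main__":
    glucose = 180.06339  # monoisotopic mass of C6H12O6
    print(annotate(203.05, glucose, tol=0.01))  # ['[M+Na]+']
```

The tolerance should reflect the mass accuracy of the instrument in use; the default of 0.2 here is only a placeholder.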

An issue increasingly recognized by the metabolomics community is that each species is biochemically unique; reliable annotation of metabolite signals therefore requires that metabolome queries be parameterized by species. Searching a species-specific database, rather than an all-purpose, general metabolite database, allows a more reasonable initial annotation of signals, restricted to the metabolome of the species under investigation. Querying a species-specific metabolome database reduces the number of probable "hits", producing a shortlist of putative m/z signal identifications.
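The effect of a species constraint on the number of hits can be shown with a toy example. This Python sketch assumes a hypothetical in-memory record layout (`Entry`), placeholder compound and species names, and a hypothetical `search` function; none of these reflect the actual ARMeC schema or its contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical database record; not the ARMeC schema."""
    name: str
    mono_mass: float       # neutral monoisotopic mass (Da)
    species: frozenset     # species in which the compound has been reported


# Placeholder records for illustration only; names and species are made up.
TOY_DB = [
    Entry("compound_1", 180.0634, frozenset({"species_A", "species_B"})),
    Entry("compound_2", 180.0634, frozenset({"species_B"})),
    Entry("compound_3", 180.0423, frozenset({"species_B"})),
    Entry("compound_4", 342.1162, frozenset({"species_A"})),
]


def search(db, target_mass, tol=0.5, species=None):
    """Return entries within tol of target_mass, optionally limited to one species."""
    hits = [e for e in db if abs(e.mono_mass - target_mass) <= tol]
    if species is not None:
        hits = [e for e in hits if species in e.species]
    return hits


if __name__ == "__main__":
    print(len(search(TOY_DB, 180.06)))                       # general search
    print(len(search(TOY_DB, 180.06, species="species_A")))  # species-constrained
```

In this toy data the general search returns three candidates, while the search constrained to `species_A` returns one, which is the kind of shortlist reduction described above.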

ARMeC has been designed to support queries for ESI-MS or ESI-MS/MSn signal interpretation, as well as general queries on metabolite identity or pathway associations; both options can be constrained by biological provenance. A database query for the annotation of ESI-MS signals can return multiple putative identifications for any given m/z signal. Further investigation of all putative annotations by ESI-MS/MSn queries is required to assign metabolite identity. Putative metabolite identifications and their associated adduct, neutral loss or dimeric states can be supported by ESI-MS/MSn experiments, because the ESI-MS/MSn ion tree of a particular adduct is unique to its molecular structure. ARMeC performs ESI-MS/MSn queries using the parent m/z signal and the resulting predominant daughter ion fragments (m/z signals with a relative abundance of 60% or greater). For the interpretation of explanatory signals from the output of supervised multivariate analysis algorithms such as Random Forest (RF), specific RF mining instructions and a calculation spreadsheet have been provided. We are currently automating the annotation of multivariate statistical output, with updates to ARMeC planned for March 2007.
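Selecting the predominant daughter ions for an ESI-MS/MSn query could be sketched as follows. This is an illustration in Python, not the ARMeC query code: the function names and the query dictionary layout are hypothetical, and relative abundance is assumed to be measured against the most intense fragment (the base peak) of the spectrum.

```python
def predominant_fragments(peaks, threshold=60.0):
    """Return the m/z values of fragments at or above threshold % relative abundance.

    peaks is a list of (mz, intensity) pairs. Relative abundance is assumed to be
    computed against the base peak, i.e. the most intense fragment.
    """
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return []
    return sorted(
        mz for mz, intensity in peaks if 100.0 * intensity / base >= threshold
    )


def build_query(parent_mz, peaks, threshold=60.0):
    """Assemble a hypothetical query from the parent m/z and its main fragments."""
    return {
        "parent_mz": parent_mz,
        "fragments": predominant_fragments(peaks, threshold),
    }


if __name__ == "__main__":
    # Made-up fragment spectrum for illustration only.
    spectrum = [(163.06, 1000.0), (145.05, 650.0), (127.04, 590.0), (85.03, 600.0)]
    print(build_query(181.07, spectrum))
```

With the made-up spectrum above, the fragment at 59% relative abundance is dropped, while the fragment at exactly 60% is kept, matching the "60% or greater" criterion.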

The content of ARMeC is still growing, thanks to the valued input of various contributors. Submissions that update existing entries or add species not currently supported in the ARMeC database are strongly encouraged, as they increase the range of species covered and the general applicability of ARMeC to metabolomics experiments. Please contact the curators for instructions regarding the submission of new data content.