We introduce a method to determine the sum formula of an unknown
metabolite solely from its mass and isotope pattern. Metabolites, such
as sugars or lipids, participate in almost all cellular processes, but
the majority still remains uncharacterized. Our input is a measured
isotope pattern from a high resolution mass spectrometer, and we want
to find those molecules that best match this pattern. Since the size
of the search space is prohibitive, generating all potential
solutions, simulating their isotope patterns, and matching them
against the input is often not feasible.
To this end, we define the Joint Decomposition Problem: Given an
alphabet Sigma of size n, and k mass functions mu_i:Sigma to IN,
i=1,...,k, let a_ij denote the mass mu_i(sigma_j) of the j'th
character. Our input is a vector m=(m_1,...,m_k). We want to find a
non-negative integer vector c=(c_1,...,c_n) (a compomer) such that
Ac=m.
We whow how to extract a joint decomposition problem from the mass
spectrum and give an efficient algorithm for producing all joint
decompositions of the query vector. We demonstrate its fitness on real
data extracted from a metabolite database. We also present a method
for efficiently computing the isotope pattern of a given molecule.