Paley, S. and Karp, P.D. Evaluation of computational metabolic-pathway predictions for Helicobacter pylori. Bioinformatics, vol. 18, no. 5, pp. 715-24, May 2002.
We seek to determine the accuracy of computational methods for predicting metabolic pathways in sequenced genomes, and to understand how both the prediction algorithms and the reference pathway databases they use contribute to that accuracy.
The comparisons we performed were as follows. (1) We compared two predictions of the pathway complements of Helicobacter pylori that were computed by an early version of our pathway-prediction algorithm: prediction A used the EcoCyc E. coli pathway DB as the reference database (DB) for prediction, and prediction B used the MetaCyc pathway DB (a superset of EcoCyc) as the reference pathway DB. The MetaCyc-based prediction contained 75% more pathway predictions, but we believe a significant number of those predictions were false positives. (2) We compared two predictions of the pathway complement of H. pylori that used MetaCyc as the reference pathway DB, but that used different algorithms: the original PathoLogic algorithm, and an enhanced version of the algorithm designed to eliminate false-positive pathway predictions. The improved algorithm predicted 30% fewer metabolic pathways than the original algorithm; all of the eliminated pathways are believed to be false-positive predictions. (3) We compared the 98 pathways predicted by the enhanced algorithm with the results of a manual analysis of the pathways of H. pylori. Results: 40 of the computationally predicted pathways were consistent with the manual analysis, 13 pathways are considered false-positive predictions, and four pathways had partially overlapping topologies. Twenty-six predicted pathways were not mentioned in the manual analysis; we believe these are correct predictions by PathoLogic that were not found by the manual analysis. Five pathways from the manual analysis were not found computationally. Agreement between the computational and manual predictions was good overall, with the computational analysis inferring many pathways that the manual analysis did not identify. Ultimately the manual analysis is also partially speculative, and therefore is not an absolute measure of correctness.
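The bookkeeping behind comparison (3) can be illustrated with a minimal Python sketch. It is not the PathoLogic algorithm or the authors' evaluation procedure: the function name `compare_pathway_sets` and the pathway names are hypothetical, and the sketch assumes each pathway is identified by a single comparable name. It covers only the set-based categories (consistent, found only computationally, found only manually); judging a prediction to be a false positive or a partial topological overlap requires expert review that this sketch does not attempt.

```python
def compare_pathway_sets(predicted, manual):
    """Partition pathway names by which analysis reported them.

    Both arguments are iterables of pathway names (hypothetical identifiers).
    """
    predicted, manual = set(predicted), set(manual)
    return {
        # Reported by both the computational and the manual analysis.
        "consistent": predicted & manual,
        # Predicted computationally but not mentioned in the manual analysis.
        "computational_only": predicted - manual,
        # Present in the manual analysis but not found computationally.
        "manual_only": manual - predicted,
    }


# Illustrative, made-up inputs; these are not the paper's pathway lists.
predicted = {"glycolysis", "TCA cycle variant", "urea degradation",
             "histidine biosynthesis"}
manual = {"glycolysis", "urea degradation", "pentose phosphate pathway"}

result = compare_pathway_sets(predicted, manual)
for category, names in result.items():
    print(category, sorted(names))
```

Using sets keeps each category a direct intersection or difference, so the three groups are disjoint by construction and every input name falls into exactly one of them.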
The algorithm is designed to err on the side of more false positives to bring more potential pathways to the user’s attention. The resulting H. pylori pathway DB is freely available at http://ecocyc.org:1555/HPY/organism-summary?object=HPY.
The Pathway Tools software is freely available to academic users, and is available to commercial users for a fee. Contact firstname.lastname@example.org for information on obtaining the software.