The Pathway Tools software provides a suite of capabilities for storing and analyzing collections of genomic and metabolic information.
The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality. This article highlights recent improvements to the curated data in the areas of metabolism, transport, DNA repair, and regulation of gene expression. New and revised data analysis and visualization tools include an interactive metabolic network explorer, a circular genome viewer, and various improvements to the speed and usability of existing tools.
Background: The Metabolic Network Explorer is a new addition to the BioCyc.org website and the Pathway Tools software suite that supports the interactive exploration of metabolic networks. Any metabolic network visualization tool must by necessity show only a subset of all possible metabolite connections, or the results will be visually overwhelming. Existing tools, even those that purport to show an organism’s full metabolic network, limit the set of displayed connections based on predefined pathways or other preselected criteria. We sought instead to provide a tool that would give the user dynamic control over which connections to follow.
Results: The Metabolic Network Explorer is an easy-to-use, web-based software tool that allows the user to specify a starting metabolite of interest and interactively explore its immediate metabolic neighborhood in either or both directions to any desired depth, letting the user select from the full set of connected reactions. Although, as for other tools, only a small portion of the metabolic network is visible at a time, that portion is selected by the user, based on the full reaction complement, and it is easy to switch among alternate paths of interest. The display is intuitive, customizable, and provides copious links to more detailed information pages.
Conclusions: The Metabolic Network Explorer fills a gap in the set of metabolic network visualization tools and complements other modes of exploration. Its primary strengths are its ease of use, diagrams that are intuitive to biologists, and its integration with the broader corpus of data provided by a BioCyc Pathway/Genome Database.
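The kind of stepwise exploration described above can be illustrated with a minimal sketch. It is not the Metabolic Network Explorer's implementation: the reaction list, the function names (build_index, neighbors, follow), and the data layout are all assumptions made for illustration. The sketch lists every reaction connected to a metabolite in a chosen direction and then follows the path the user picks.

```python
from collections import defaultdict

# Hypothetical toy reactions: (reaction id, substrates, products).
REACTIONS = [
    ("R1", ["glucose"], ["glucose-6-phosphate"]),
    ("R2", ["glucose-6-phosphate"], ["fructose-6-phosphate"]),
    ("R3", ["fructose-6-phosphate"], ["fructose-1,6-bisphosphate"]),
    ("R4", ["glucose-6-phosphate"], ["6-phosphogluconolactone"]),
]


def build_index(reactions):
    """Index reactions by the metabolites they consume and produce."""
    consumes, produces = defaultdict(list), defaultdict(list)
    for rxn_id, substrates, products in reactions:
        for metabolite in substrates:
            consumes[metabolite].append((rxn_id, products))
        for metabolite in products:
            produces[metabolite].append((rxn_id, substrates))
    return consumes, produces


def neighbors(metabolite, reactions, direction="downstream"):
    """Return every (reaction, metabolite) pair one step away in one direction."""
    consumes, produces = build_index(reactions)
    index = consumes if direction == "downstream" else produces
    return sorted(
        (rxn_id, other)
        for rxn_id, others in index[metabolite]
        for other in others
    )


def follow(start, picks, reactions, direction="downstream"):
    """Follow user-selected (reaction, metabolite) steps starting from `start`."""
    path = [start]
    for rxn_id, metabolite in picks:
        if (rxn_id, metabolite) not in neighbors(path[-1], reactions, direction):
            raise ValueError(f"{rxn_id} does not connect {path[-1]} to {metabolite}")
        path.append(metabolite)
    return path
```

Calling neighbors("glucose-6-phosphate", REACTIONS) offers both R2 and R4 as next steps, and the caller decides which to follow. This mirrors the idea that the visible portion of the network is chosen by the user from the full set of connected reactions rather than fixed by predefined pathways.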
Leveraging Curation Among Escherichia coli Pathway/Genome Databases Using Ortholog-Based Annotation Propagation
Abstract: Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors […]
Abstract: Metabolomics, synthetic biology, and microbiome research demand information about organism-scale metabolic networks. The convergence of genome sequencing and computational inference of metabolic networks has enabled great progress toward satisfying that demand by generating metabolic reconstructions from the genomes of thousands of sequenced organisms. Visualization of whole metabolic networks is critical for aiding researchers in […]
MetaCyc (MetaCyc.org) is a comprehensive reference database of metabolic pathways and enzymes from all domains of life. It contains 2749 pathways derived from more than 60 000 publications, making it the largest curated collection of metabolic pathways. The data in MetaCyc are evidence-based and richly curated, resulting in an encyclopedic reference tool for metabolism. MetaCyc is also used as a knowledge base for generating thousands of organism-specific Pathway/Genome Databases (PGDBs), which are available in BioCyc.org and other genomic portals. This article provides an update on the developments in MetaCyc during September 2017 to August 2019, up to version 23.1. Some of the topics that received intensive curation during this period include cobamides biosynthesis, sterol metabolism, fatty acid biosynthesis, lipid metabolism, carotenoid metabolism, protein glycosylation, antibiotics and cytotoxins biosynthesis, siderophore biosynthesis, bioluminescence, vitamin K metabolism, brominated compound metabolism, plant secondary metabolism and human metabolism. Other additions include modifications to the GlycanBuilder software that enable displaying glycans using symbolic representation, improved graphics and fonts for web displays, improvements in the PathoLogic component of Pathway Tools, and the optional addition of regulatory information to pathway diagrams.
Background: High-throughput experiments can bring to light associations between genes, proteins and/or metabolites, many of which will be explainable by existing knowledge. Our aim is to speed elucidation of such explanations and, in some cases, find explanations that scientists might otherwise overlook.
Results: We describe the MultiOmics Explainer, a new tool within the Pathway Tools software suite that leverages what is known about an organism’s metabolic and regulatory network to suggest explanations for the results of omics experiments. Querying a database such as EcoCyc, the MultiOmics Explainer searches the organism’s network of metabolic reactions, transporters, cofactors, enzyme substrate-level activation and inhibition relationships, and transcriptional and translational regulation relationships to identify paths of influence among input genes, proteins and metabolites. Results are presented in a combined metabolic and regulatory diagram. We present several examples of explanations generated for associations found in the Escherichia coli literature.
Conclusions: The MultiOmics Explainer is a valuable tool that helps researchers understand and interpret the results of their omics experiments in the context of what is known about an organism’s metabolic and regulatory network. It showcases the rich set of computational inferences that can be drawn from a database such as EcoCyc that encodes a diverse range of biological interactions.
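The idea of searching a combined metabolic and regulatory network for paths of influence can be sketched as a breadth-first search over a graph with typed edges. This is an illustrative sketch only, not the MultiOmics Explainer's algorithm: the entity names (regulatorR, geneA, enzymeA, and so on), the edge labels, and the function shortest_influence_path are hypothetical, and returning the shortest path is an assumed simplification.

```python
from collections import deque

# Hypothetical network: (source, target, relationship type).
EDGES = [
    ("regulatorR", "geneA", "transcriptional regulation"),
    ("geneA", "enzymeA", "encodes"),
    ("enzymeA", "metaboliteY", "catalyzes production"),
    ("metaboliteY", "enzymeB", "inhibits"),
    ("geneC", "transporterC", "encodes"),
]


def shortest_influence_path(source, target, edges):
    """Return the shortest chain of typed edges from source to target, or None."""
    adjacency = {}
    for src, dst, kind in edges:
        adjacency.setdefault(src, []).append((dst, kind))

    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for dst, kind in adjacency.get(node, []):
            if dst not in visited:
                visited.add(dst)
                queue.append((dst, path + [(node, kind, dst)]))
    return None  # no path of influence found in this toy network


if __name__ == "__main__":
    for step in shortest_influence_path("regulatorR", "enzymeB", EDGES) or []:
        print(" -> ".join(step))
```

Keeping the relationship type on each edge lets the returned path read as an explanation (regulation, then encoding, then catalysis, then inhibition) instead of a bare list of nodes.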
Microbial genome web portals have a broad range of capabilities that address a number of information-finding and analysis needs for scientists. This article compares the capabilities of the major microbial genome web portals to aid researchers in determining which portal(s) are best suited to their needs. We assessed both the bioinformatics tools and the data content of BioCyc, KEGG, Ensembl Bacteria, KBase, IMG, and PATRIC. For each portal, our assessment compared and tallied the available capabilities. The strengths of BioCyc include its genomic and metabolic tools, multi-search capabilities, table-based analysis tools, regulatory network tools and data, omics data analysis tools, breadth of data content, and large amount of curated data. The strengths of KEGG include its genomic and metabolic tools. The strengths of Ensembl Bacteria include its genomic tools and large number of genomes. The strengths of KBase include its genomic tools and metabolic models. The strengths of IMG include its genomic tools, multi-search capabilities, large number of genomes, table-based analysis tools, and breadth of data content. The strengths of PATRIC include its large number of genomes, table-based analysis tools, metabolic models, and breadth of data content.
The Omics Dashboard is a software tool for interactive exploration and analysis of gene-expression datasets. The Omics Dashboard is organized as a hierarchy of cellular systems. At the highest level of the hierarchy the Dashboard contains graphical panels depicting systems such as biosynthesis, energy metabolism, regulation and central dogma. Each of those panels contains a series of X–Y plots depicting expression levels of subsystems of that panel, e.g. subsystems within the central dogma panel include transcription, translation and protein maturation and folding. The Dashboard presents a visual read-out of the expression status of cellular systems to facilitate a rapid top-down user survey of how all cellular systems are responding to a given stimulus, and to enable the user to quickly view the responses of genes within specific systems of interest. Although the Dashboard is complementary to traditional statistical methods for analysis of gene-expression data, we show how it can detect changes in gene expression that statistical techniques may overlook. We present the capabilities of the Dashboard using two case studies: the analysis of lipid production for the marine alga Thalassiosira pseudonana, and an investigation of a shift from anaerobic to aerobic growth for the bacterium Escherichia coli.
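The hierarchical organization described above (systems containing subsystems containing genes) can be sketched as a nested mapping that is summarized from the top down. The gene identifiers, expression values, the subsystem name subsystemE, and the choice of the mean as the summary statistic are assumptions for illustration; the Dashboard's own aggregation may differ.

```python
from statistics import mean

# Hypothetical hierarchy: system -> subsystem -> member genes.
HIERARCHY = {
    "central dogma": {
        "transcription": ["gene1", "gene2"],
        "translation": ["gene3", "gene4"],
    },
    "energy metabolism": {
        "subsystemE": ["gene5"],
    },
}

# Hypothetical expression values for one condition.
EXPRESSION = {"gene1": 2.0, "gene2": 4.0, "gene3": 1.0, "gene4": 1.0, "gene5": 8.0}


def summarize(hierarchy, expression):
    """Average expression per subsystem, grouped by top-level system."""
    summary = {}
    for system, subsystems in hierarchy.items():
        summary[system] = {}
        for name, genes in subsystems.items():
            values = [expression[g] for g in genes if g in expression]
            if values:  # skip subsystems with no measured genes
                summary[system][name] = mean(values)
    return summary


if __name__ == "__main__":
    for system, subsystems in summarize(HIERARCHY, EXPRESSION).items():
        print(system, subsystems)
```

A per-subsystem summary like this supports the top-down survey, while the underlying gene lists remain available for drilling into a specific system of interest.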