The rapid progression of semiconductor technology has significantly impacted the ability to examine and analyze complex integrated circuits (ICs). Small device feature sizes, combined with large die sizes, impose a heavy processing burden that severely limits our ability to perform timely defect validation and anti-tampering analysis at full scale. In this paper, we describe the algorithmic steps taken in the processing pipeline to quickly create a global image database of an entire advanced IC. We focus specifically on the image alignment and stitching algorithms necessary to support a combined field-of-view of a given layer of a die. We describe key algorithmic challenges, such as contextual semantics, that limit the robustness of the alignment algorithm. We also describe the use of database indexing to manage and traverse the enormous amounts of data.
Reaction gap filling is a computational technique for proposing the addition of reactions to genome-scale metabolic models to permit those models to run correctly. Gap filling completes what are otherwise incomplete models that lack fully connected metabolic networks. The models are incomplete because they are derived from annotated genomes in which not all enzymes have been identified. Here we compare the results of applying an automated likelihood-based gap filler within the Pathway Tools software with the results of manually gap filling the same metabolic model. Both gap-filling exercises were applied to the same genome-derived qualitative metabolic reconstruction for Bifidobacterium longum subsp. longum JCM 1217, and to the same modeling conditions — anaerobic growth under four nutrients producing 53 biomass metabolites.
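As a rough illustration of the gap-filling idea described above, the following minimal sketch treats a model as "growing" when every biomass metabolite is reachable from the nutrients, adds candidate reactions, and then prunes any candidate that is not needed. It is an assumption-laden toy, not the likelihood-based method used by the Pathway Tools gap filler: the reachability test, the greedy pruning, and all metabolite and reaction names (`glc`, `r1`, and so on) are hypothetical.

```python
def producible(reactions, nutrients):
    """Return every metabolite reachable from the nutrients (network expansion)."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            # A reaction fires once all of its substrates are available.
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def grows(reactions, nutrients, biomass):
    """A model 'grows' here if every biomass metabolite is producible."""
    return set(biomass) <= producible(reactions, nutrients)


def gap_fill(model, candidates, nutrients, biomass):
    """Return an irreducible set of candidate names enabling growth, or None."""
    chosen = dict(candidates)
    if not grows(model + list(chosen.values()), nutrients, biomass):
        return None  # even the full candidate pool cannot restore growth
    for name in sorted(candidates):
        trial = {k: v for k, v in chosen.items() if k != name}
        # Drop the candidate if the model still grows without it.
        if grows(model + list(trial.values()), nutrients, biomass):
            chosen = trial
    return set(chosen)


# Hypothetical toy network: each reaction is (substrates, products).
model = [(frozenset({"glc"}), frozenset({"g6p"}))]
candidates = {
    "r1": (frozenset({"g6p"}), frozenset({"pyr"})),
    "r2": (frozenset({"pyr"}), frozenset({"ala"})),
    "r3": (frozenset({"glc"}), frozenset({"xyz"})),  # not needed for biomass
}
solution = gap_fill(model, candidates, {"glc"}, {"ala"})
```

Because reachability can only shrink when reactions are removed, a single pruning pass leaves a set from which no further reaction can be dropped. Such a set is irreducible but not necessarily the smallest possible one.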
The solution computed by the gap-filling program GenDev contained 12 reactions, but closer examination showed that the solution was not minimal; two of the twelve reactions can be removed to yield a set of ten reactions that enable model growth. The manually curated solution contained 13 reactions, eight of which were shared with the 12-reaction computed solution. Thus, GenDev achieved recall of 61.5% (8 of 13) and precision of 66.7% (8 of 12). These results suggest that although computational gap fillers are populating metabolic models with significant numbers of correct reactions, automatically gap-filled metabolic models also contain significant numbers of incorrect reactions.
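The recall and precision figures follow directly from the reaction counts given above. The short sketch below recomputes them from two sets, taking the curated solution as the reference. The reaction identifiers are placeholders invented for illustration; only the set sizes (12 computed, 13 curated, 8 shared) come from the text.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of a predicted set against a reference set."""
    predicted, reference = set(predicted), set(reference)
    shared = len(predicted & reference)
    return shared / len(predicted), shared / len(reference)


# Placeholder identifiers: 8 shared, 4 computed-only, 5 curated-only reactions.
computed = {f"rxn{i}" for i in range(12)}
curated = {f"rxn{i}" for i in range(8)} | {f"cur{i}" for i in range(5)}

precision, recall = precision_recall(computed, curated)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```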
Our conclusion is that manual curation of gap-filler results is needed to obtain high-accuracy models. Many of the differences between the manual and automatic solutions resulted from using expert biological knowledge to direct the choice of reactions within the curated solution, such as reactions specific to the anaerobic lifestyle of B. longum.
EcoCyc (EcoCyc.org) is a freely accessible, comprehensive database that collects and summarizes experimental data for Escherichia coli K-12, the best-studied bacterial model organism. New experimental discoveries about gene products, their function and regulation, new metabolic pathways, enzymes and cofactors are regularly added to EcoCyc. New SmartTable tools allow users to browse collections of related EcoCyc content. SmartTables can also serve as repositories for user- or curator-generated lists. EcoCyc now supports running and modifying E. coli metabolic models directly on the EcoCyc website.
The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases.
The MetaCyc database (MetaCyc.org) is a freely accessible comprehensive database describing metabolic pathways and enzymes from all domains of life.
BioCyc.org is a genome and metabolic pathway web portal covering 5500 organisms, including Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae and Escherichia coli. These organism-specific databases have undergone variable degrees of curation. The EcoCyc (Escherichia coli Encyclopedia) database is the most highly curated; its contents have been derived from 27,000 publications. The MetaCyc (Metabolic Encyclopedia) database within BioCyc is a “universal” metabolic database that describes pathways, reactions, enzymes and metabolites from all domains of life. Metabolic pathways provide an organizing framework for analyzing metabolomics data, and the BioCyc website provides computational operations for metabolomics data that include metabolite search and translation of metabolite identifiers across multiple metabolite databases. The site allows researchers to store and manipulate metabolite lists using a facility called SmartTables, which supports metabolite enrichment analysis. That analysis operation identifies metabolite sets that are statistically over-represented for the substrates of specific metabolic pathways. BioCyc also enables visualization of metabolomics data on individual pathway diagrams and on the organism-specific metabolic map diagrams that are available for every BioCyc organism. Most of these operations are available both interactively and as programmatic web services.
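To make the notion of statistical over-representation concrete, the sketch below computes a one-sided hypergeometric p-value for the overlap between a metabolite list and the substrates of one pathway. This is an assumption made for illustration: the text does not say which statistical test the SmartTables enrichment operation uses, and the function name and parameters here are hypothetical.

```python
from math import comb


def enrichment_p_value(population, pathway_size, sample_size, overlap):
    """P(X >= overlap) when drawing sample_size metabolites without replacement.

    population   -- total number of metabolites considered
    pathway_size -- metabolites that are substrates of the pathway
    sample_size  -- metabolites in the user's list
    overlap      -- list metabolites that belong to the pathway
    """
    total = comb(population, sample_size)
    upper = min(pathway_size, sample_size)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, sample_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: all 3 listed metabolites fall in a 3-substrate pathway out of 10.
p = enrichment_p_value(population=10, pathway_size=3, sample_size=3, overlap=3)
```

In practice, a p-value would be computed for each pathway and then corrected for multiple testing before any pathway is reported as enriched.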
A Framework for Application of Metabolic Modeling in Yeast to Predict the Effects of nsSNV in Human Orthologs
We have previously suggested a method for proteome-wide analysis of variation at functional residues, wherein we identified the set of all human genes with nonsynonymous single nucleotide variation (nsSNV) in the active site residue of the corresponding proteins. Thirty-four of these proteins were shown to have a 1:1:1 enzyme:pathway:reaction relationship, making them ideal candidates for laboratory validation through creation and observation of specific yeast active site knock-outs and downstream targeted metabolomics experiments. Here we present the next step in the workflow toward using yeast metabolic modeling to predict human metabolic behavior resulting from nsSNV.
For the previously identified candidate proteins, we used the reciprocal best BLAST hits method, followed by manual alignment and pathway comparison, to identify six human proteins with yeast orthologs that were suitable for flux balance analysis (FBA). Five of these proteins are known to be associated with diseases, including ribose 5-phosphate isomerase deficiency, myopathy with lactic acidosis and sideroblastic anaemia, anemia due to disorders of glutathione metabolism, and two porphyrias. Based on the work described herein, we suspect that the sixth enzyme has disease associations that are not yet classified or understood.
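The reciprocal best hits criterion mentioned above can be sketched as follows: a human protein and a yeast protein are paired only if each is the other's top-scoring hit. This is a minimal sketch assuming the BLAST results have already been parsed into score tables. The identifiers and scores are hypothetical, and the manual alignment and pathway comparison steps are not modeled.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores -- dict of (query, subject) -> bit score; ties keep the first seen.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(human_vs_yeast, yeast_vs_human):
    """Return sorted (human, yeast) pairs that are each other's best hit."""
    forward = best_hits(human_vs_yeast)
    backward = best_hits(yeast_vs_human)
    return sorted((h, y) for h, y in forward.items() if backward.get(y) == h)


# Hypothetical parsed BLAST bit scores.
human_vs_yeast = {("H1", "Y1"): 250, ("H1", "Y2"): 90, ("H2", "Y2"): 120}
yeast_vs_human = {("Y1", "H1"): 245, ("Y2", "H3"): 300, ("Y2", "H2"): 110}

pairs = reciprocal_best_hits(human_vs_yeast, yeast_vs_human)
```

Here `H2` is not paired with `Y2` because the best human hit for `Y2` is `H3`, so the relationship is not reciprocal.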
Preliminary findings using the Yeast 7.0 FBA model show a lack of growth for only one enzyme, but augmentation of the Yeast 7.0 biomass function to better simulate knockout of certain genes suggested physiological relevance of variations in three additional proteins. Thus, we suggest the following four proteins for laboratory validation: delta-aminolevulinic acid dehydratase, ferrochelatase, ribose 5-phosphate isomerase and mitochondrial tyrosyl-tRNA synthetase. This study indicates that the predictive ability of this method will improve as more advanced, comprehensive models are developed. Moreover, these findings will be useful in the development of simple downstream biochemical or mass-spectrometric assays to corroborate these predictions and detect the presence of certain known nsSNVs with deleterious outcomes. Results may also be useful in predicting as yet unknown outcomes of active site nsSNVs for enzymes that are not yet well classified or annotated.