Title: Developing Mango Graph Studio and its Applications for Bioinformatics and Systems Biology (SBIR Phase I Grant Final Technical Report)

Technical Report · OSTI ID: 1494315

Modern genomic and post-genomic research has produced huge amounts of heterogeneous data that must be integrated and analyzed together. Systems biology aims to use these data to model complex biological systems and further our understanding of living beings, so that we may cure more diseases or harvest more renewable bioenergy in the future. However, constructing and manipulating the complex graph or network data structures that represent linked heterogeneous data requires computer science skills. Software engineers capable of such in-depth programming may not be able to formulate biological analysis algorithms or interpret the analysis results, while biologists who can formulate the hypotheses and understand the results may not be proficient at programming. Although software engineers and biologists can form interdisciplinary research teams to solve systems biology problems together, forming such a team is not always efficient or possible. Independent biologists working on their own research may prefer a software platform that lets them perform sophisticated systems biology analyses without having to learn professional computer science skills. Mango Graph Studio™ is such a platform. It comes at a time when biological big data are no longer a problem only for large genome research centers but have gradually become a problem for every biologist. Mango Graph Studio balances ease of use (a modern graphical user interface), flexibility in applications (the Graph Exploration Language), computational power (automatic use of multi-core CPUs and many-core GPUs) and data scalability (million-node graphs handled easily even on personal computers).
Mango Graph Studio has been published in academic journals, and the focus of this SBIR Phase I project is to advance its applications in bioinformatics research, including systems biology, and to promote its use by scientists in more research fields. Specifically, during this Phase I project a new version of the Graph Exploration Language (Gel) was designed to make it a general-purpose programming language, implementation of the new Gel 2.0 compiler began, and several practical Mango applications were developed to solve PCR primer design and short-interfering RNA (siRNA) design problems. We also studied the optimal approach to integrating our existing many-core GPU-accelerated graph traversal code with Mango Graph Studio, so that certain time-consuming computations on billion-node-scale graphs can be seamlessly pushed onto GPUs and accelerated there. Although the Gel 2.0 compiler has not been completed, owing to additional language design decisions and compiler implementation needs uncovered during the Phase I project, we have resolved all design decisions and made solid progress toward its completion. We are therefore confident that the Gel 2.0 compiler can be completed, likely during Year 1 of the Phase II work. We can then integrate the new compiler into Mango Graph Studio and release version 2.0 to the public, so that users can take advantage of all the new features created during this project and apply them to their bioinformatics and other graph analytics problems.
During Phase II and Phase III, in addition to continuously improving Mango and Gel features, our new focus will be to develop more sophisticated Mango applications that solve difficult network analysis problems in various research and industrial domains, and to provide a built-in Mango App Store that will allow Mango applications developed by us or by other contributors to be easily documented, searched and distributed, all within the Mango user interface. Because Mango Graph Studio is a general-purpose graph analytics platform that can solve graph or network data analysis problems beyond bioinformatics, our business goal is for Mango Graph Studio to become the standard platform for graph analytics and visualization.

Research Organization:
Complex Computation LLC
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
DOE Contract Number:
SC0018492
OSTI ID:
1494315
Type / Phase:
SBIR (Phase I)
Report Number(s):
DOE-CCLLC-0018492
Resource Relation:
Related Information: Two supplementary Mango application code examples are provided online in greater detail and are cited in the Technical Report
Country of Publication:
United States
Language:
English