KBase - Systems Biology Knowledgebase


Program Description: The Systems Biology Knowledgebase (KBase) is a DOE-funded, open-source, open-architecture software and data environment for systems biology research. KBase provides a computational framework and tools for integrating and analyzing large, diverse datasets generated by the scientific community to advance predictive understanding, manipulation, and design of biological processes in an environmental context. Its purpose is to enable users to integrate a wide spectrum of genomics and systems biology data, models, and information for microbes, microbial communities, and plants. Tools within KBase allow users to analyze and simulate data to predict biological behavior, generate and test hypotheses, design new biological functions, and propose new experiments. KBase pulls its data from existing databases, including NCBI and 25 other data stores, to support an integrated understanding of genomes, metagenomes, transcriptomes, proteomes (mapped to genomes), interactomes, phenotypes, 16S amplicons, expression data, enzymes, ontologies, pathway data, protein annotations, protein-protein interactions, regulons, and ribosomes.

Organization Description: The KBase collaboration is led by Lawrence Berkeley, Argonne, Brookhaven, and Oak Ridge national laboratories. Also involved in the multi-institutional program are Cold Spring Harbor Laboratory; the University of California, Davis; Hope College; the University of Illinois at Urbana-Champaign; Yale University; and the University of Tennessee. Key external partners are the Joint Genome Institute, EMSL, and the Bioenergy Centers. Several university projects are also important contributors. SoyKB was developed at the Informatics Institute at the University of Missouri (MU), Columbia, US as part of the Obama administration’s $200 million Big Data Research and Development Initiative. XSEDE developers are currently assisting with new workflow capabilities for the site.

Data Description: N/A

Project Type: Data Repository and Analysis

Project Domains: Biological Sciences

Budget: 12+ million USD

Federal Funding: NSF, DOE

Program Data

Location | Lat/Lon Coordinates | Location Type | Data Type | Data Generation | Single Data Instance Size (TB) | Estimated Daily Data Size (GB) | Estimated Annual Data Size (PB) | Average Sustained Throughput (Gbps) | Maximum Sustained Throughput (Gbps) | Online Repository Size (PB) | Total Repository Size (PB) | Delay Tolerance (minutes) | Jitter Sensitive? | Uses the Cloud?
Unspecified Location Not Specified Not Specified Not Specified Not Specified - -