NCBI - NIH/NLM National Center for Biotechnology Information

Program Description: The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine (NLM) at the National Institutes of Health (NIH), is located in Bethesda, MD. The entire database is about 1.2 TB (333 GB compressed), although users typically download a subset of data in the 10-20 GB range. A more challenging storage problem is the Sequence Read Archive (SRA), which stores raw sequence data from "next-generation" sequencing technologies including 454, IonTorrent, Illumina, SOLiD, Helicos and Complete Genomics. SRA currently stores 1.3 quadrillion bases and is 2.79 PB in size.
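The sizes quoted above imply two rough ratios that help with storage planning: how well the database compresses, and how many bytes SRA spends per stored base. The following is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and the short-scale meaning of "quadrillion" (10^15); the function names are illustrative and not part of any NCBI tool.

```python
# Decimal (SI) unit sizes in bytes; an assumption, since the source does not
# say whether it uses decimal or binary units.
GB = 10**9
TB = 10**12
PB = 10**15
QUADRILLION = 10**15  # short-scale quadrillion (assumption)


def compression_ratio(uncompressed_bytes: float, compressed_bytes: float) -> float:
    """Return how many times smaller the compressed form is."""
    if compressed_bytes <= 0:
        raise ValueError("compressed_bytes must be positive")
    return uncompressed_bytes / compressed_bytes


def bytes_per_base(total_bytes: float, total_bases: float) -> float:
    """Return the average storage cost of one sequenced base, in bytes."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    return total_bytes / total_bases


if __name__ == "__main__":
    ratio = compression_ratio(1.2 * TB, 333 * GB)
    cost = bytes_per_base(2.79 * PB, 1.3 * QUADRILLION)
    print(f"Compression ratio: about {ratio:.1f}x")
    print(f"SRA storage cost: about {cost:.2f} bytes per base")
```

Under these assumptions the database compresses by a factor of roughly 3.6, and SRA averages a little over 2 bytes per base.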

Year Started: 2013.

Organization Description: N/A

Data Description: NCBI is one of three replicated data stores that serve as the repository and archive for the world’s knowledge regarding genes, sequences, proteins, and so forth. This material is constantly being updated and referenced by scientists around the world. Because the downloaded files are so large, scientists have run into problems with transfers failing or taking a very long time. To address this problem, NCBI provides a special high-performance file transfer client called “aspera” (a commercial transfer tool built on the UDP-based FASP protocol, rather than on parallel TCP streams as in GridFTP) so that “Many sites can transfer data at 200-500Mbps. and nearly all sites can transfer at faster than 10Mbps.”
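To put the quoted rates in perspective, the sketch below estimates how long a typical 10-20 GB download would take at 10, 200 and 500 Mbps. It is an idealized calculation, assuming decimal units (1 GB = 8000 megabits), a constant sustained rate and no protocol overhead; the function names are illustrative only.

```python
def transfer_time_seconds(size_gb: float, rate_mbps: float) -> float:
    """Idealized time to move size_gb gigabytes at rate_mbps megabits per second.

    Assumes 1 GB = 8000 megabits (decimal units) and ignores protocol overhead.
    """
    if rate_mbps <= 0:
        raise ValueError("rate_mbps must be positive")
    return size_gb * 8000 / rate_mbps


def format_duration(seconds: float) -> str:
    """Render a duration in seconds as hours, minutes and seconds."""
    hours, remainder = divmod(int(round(seconds)), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    for size_gb in (10, 20):
        for rate_mbps in (10, 200, 500):
            elapsed = transfer_time_seconds(size_gb, rate_mbps)
            print(f"{size_gb} GB at {rate_mbps} Mbps: {format_duration(elapsed)}")
```

Under these assumptions, a 10 GB subset takes over two hours at 10 Mbps but under seven minutes at 200 Mbps, which illustrates why sustained throughput matters for these downloads.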

Project Type: Data Repository and Analysis

Project Domains: Biological Sciences

No budget information

Federal Funding: NIH

Program Data

Location | Lat/Lon Coordinates | Location Type | Data Type | Data Generation | Single Data Instance Size (TB) | Estimated Daily Data Size (GB) | Estimated Annual Data Size (PB) | Average Sustained Throughput (Gbps) | Maximum Sustained Throughput (Gbps) | Online Repository Size (PB) | Total Repository Size (PB) | Delay Tolerance (minutes) | Jitter Sensitive? | Uses the Cloud?
Bethesda, MD, United States 38.98,-77.09 Data Repository Data On demand 3.00 0.00 - -
Bethesda, MD, United States 38.98,-77.09 Data Repository Data On demand 2790.00 0.25 2.79 2.79 - -