The database of Genotypes and Phenotypes (dbGaP) was created to archive the results of studies exploring the interaction between genotype and phenotype. Archived studies include medical sequencing and genome-wide association studies, among others. The database is funded by the National Institutes of Health and was founded in 2007, with planning having begun in 2006 (1).

The website offers the public two levels of access: open and controlled. Together, these levels let anyone view study documents while protecting the privacy of study participants' information.

Open-access data available includes: Studies; Study Documents; Phenotypic Variables; Genotype-Phenotype Analyses.

Controlled-access data available includes: Pedigrees; De-identified phenotypes and genotypes for individual study subjects; Pre-computed univariate associations between genotype and phenotype.

To gain access to controlled-access data, one must submit a request to the NIH Data Access Committee explaining the methods and motivations behind one’s research (1).

On the dbGaP homepage, a selection of the latest studies accepted to the site is available to browse. The name of the Study is shown, as well as the accession number (phsXXXXXX.vX.pX). The Embargo Release shows whether the released version has passed its release deadline. The Details of the study give an overview of which aspects of the study are available to view. The number of Participants indicates the size of the study. The Type of Study lets viewers determine whether the study fits their research criteria and whether they are interested in researching it further. Links give the viewer access to all versions of the study, the methods used, protocols and procedures, etc. Lastly, the Platform shows how data was collected and/or analyzed for the study (1).
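The accession pattern above (phsXXXXXX.vX.pX) can be split into its numeric parts programmatically. The following is a minimal sketch, not an official dbGaP tool. It assumes each X stands for a digit, that the study number is exactly six digits, and that the "v" and "p" parts may run to more than one digit. The names `parse_accession` and `StudyAccession` are hypothetical, and reading the "p" part as a participant-set number is an interpretation that the text above does not state.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: "phs" + six digits, ".v" + digits, ".p" + digits.
ACCESSION_RE = re.compile(r"^phs(\d{6})\.v(\d+)\.p(\d+)$")


class StudyAccession(NamedTuple):
    """Numeric parts of an accession (field names are this sketch's own labels)."""
    study_number: int
    version: int
    participant_set: int  # assumed meaning of the "p" component


def parse_accession(text: str) -> Optional[StudyAccession]:
    """Return the parsed parts, or None if the text does not fit the assumed pattern."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        return None
    return StudyAccession(*(int(group) for group in match.groups()))
```

For example, `parse_accession("phs000101.v3.p1")` yields a tuple with study number 101, version 3, and participant set 1, while a string that does not fit the pattern, such as `"phs101"`, yields `None`.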


An overview of the dbGaP Homepage. SOURCE:

On the webpage for a specific study (in this example, NIH Exome Sequencing of Familial Amyotrophic Lateral Sclerosis Project), tabs let viewers sort through the openly accessible information. The five tabs are: Study; Variables; Documents; Analyses; Datasets. The Study tab provides the description of the study, the kind of access available, the inclusion and exclusion criteria, and the history behind the study, among other details. The Variables tab shows statistical summaries of the variables in the study. The Documents tab lets one view previous versions of the study and access downloadable copies of the study. The Analyses tab provides analytical information and criteria that summarize the study. Lastly, the Datasets tab allows one to download information pertaining to specific variables used in the study (1).

Viewing a study

An overview of viewing a study from the dbGaP. SOURCE:

How to search

A small guide outlining possible search queries on the dbGaP. SOURCE:

The NIH and dbGaP websites provide a helpful guide to finding information within the database. The following search examples make finding exact studies, projects, variables, documents, etc. much easier (1).

1) The National Center for Biotechnology Information. The database of Genotypes and Phenotypes. Updated Oct 28, 2009. Accessed Sept 7, 2014. URL: