Registries and Repositories

Published on · Last Updated 3 months 1 week ago


Databases, registries (data banks), and repositories (tissue banks) all involve the collection and storage of information and/or biological specimens over time. Some registry/repositories serve diagnostic or clinical purposes, while others are solely for research. Many serve more than one purpose.

Rapid advances, particularly in genomics, have allowed registry/repositories to serve as tremendous resources for investigators. They can now be used to address questions that extend beyond those envisioned at the time of their creation.

Research use of data/biospecimens that are stored in registry/repositories is governed by the federal human subject protection regulations known as the Common Rule (45 CFR 46), the HIPAA Privacy Rule (45 CFR 160 & 164), and CHOP IRB Policies and Procedures. Specific requirements depend upon how and why the information or specimens in the resource are collected, stored, used, and shared.

The requirements for and extent of IRB oversight depend on whether or not the data/biospecimens include or are linked to individually identifiable health information, and on the terms of the informed consent under which the data/biospecimens were originally collected.

Data and Tissue Banks

Registries, data banks, and tissue banks are all considered synonymous with the term repositories for regulatory purposes. For the purposes of this webpage they will be referred to as repositories or registry/repositories, and the contents as data, biospecimens, or data/biospecimens.

Definitions and Overview

The terms database, registry, data bank, repository, and tissue bank are often used imprecisely, and sometimes interchangeably. The following definitions are not universally accepted, but are offered to promote a consistent understanding for investigations at CHOP.


A database is a collection of information elements (i.e., data) arranged for ease and speed of search and retrieval. Most databases are now maintained electronically, but the term can also be applied to paper record systems.

Examples of databases include the following:

  • A set of observations (i.e., data) resulting from a research study
  • An electronic file containing patients' records
  • A collection of diagnosis, treatment, and follow-up information for a hospital's oncology patients
  • A file of outcomes information compiled for quality assurance activities
  • A list of prospective research subjects

A registry or data bank is a collection of information elements or databases whose organizers:

  • Receive information from multiple sources;
  • Maintain the information over time;
  • Control access to the information;
  • Permit multiple individuals to use the information for a variety of purposes which may evolve over time;
  • May (often) maintain codes that link information and specimens to the donor's identity. When a key to the code is retained, it may be maintained either by the registry or by the provider of the data.

Registries may be publicly accessible or private. Examples of a few well-known registries and data banks include:

  • Centers for Disease Control & Prevention (CDC)
  • State Cancer Registries
  • The National Library of Medicine Hazardous Substances Data Bank (HSDB)
  • The National Practitioner Data Bank
  • The US Census 2000 Data Bank

A repository or tissue bank is a collection of biological specimens (biospecimens) whose organizers receive specimens from multiple sources. Activities of a repository include:

  • Maintaining the specimens over time.
  • Controlling access to the biospecimens.
  • Permitting multiple individuals to use the biospecimens for a variety of purposes which may evolve over time.
  • Usually, including phenotypic data (demographic and/or medical information) about the individuals from whom the specimens were obtained. When a repository does contain phenotypic data, it is both a registry and a biospecimen repository.
  • Often, maintaining codes that link the information and specimens to the donor's identity. The key to the code may be maintained either by the repository or by the provider of the data/biospecimens.

Examples of a few well-known repositories include:

  • The National Human Radiobiology Tissue Repository
  • The National Institute of General Medical Sciences (NIGMS) Human Genetic Cell Repository

Databases and repositories are frequently created and maintained for purposes totally unrelated to research. For example, CHOP maintains a variety of electronic health records and data warehouses intended for diagnosis, treatment, billing, marketing, and quality improvement/control purposes. Individual departments and divisions use their own databases that have been established for clinical and quality improvement purposes. Examples include the following:

  • Anesthesia electronic record system used in the operating room to record preoperative history and physical examination data, store vital signs and anesthetic medication administration data, and create anesthetic records for every procedure.
  • Pathology Department biospecimens and records for all of the materials received for diagnostic or treatment purposes.
  • Operating room management database
  • EPIC
  • Quality Assurance/Quality Improvement initiatives (e.g. cardiac arrest registry, ED resuscitation registry)

These databases and repositories serve predominantly non-research purposes. However, the information they contain can be used to address important scientific questions. IRB approval is not required for creation or use of a non-research database or biospecimen collection but is required to use these resources for research purposes. See what must be reviewed by the IRB for more details.

Is IRB approval needed before establishing a new Quality Improvement Registry?

The short answer is No. However, the nature of the protocol that is submitted could affect that answer. The following scenarios outline some of the many possible situations.

Registries Developed and Used Exclusively at CHOP

  • If the protocol only includes Quality Improvement objectives for the registry, this is hospital operations and does not require IRB oversight;

  • If the protocol also includes research objectives, or if additional variables are collected purely for research, then an element of the activity is research and requires IRB oversight.

External Registry in Which CHOP is a Participant

(In the scenarios discussed below, it is assumed that the data released to the external registry are not readily identifiable.)

  • If the protocol only includes Quality Improvement objectives, this is part of hospital operations and does not require IRB oversight;

  • If the protocol includes Quality Improvement objectives but also includes research objectives, the registry might require IRB oversight at CHOP depending on the role played by the personnel at CHOP. If the CHOP personnel are merely data providers and will not serve in the role of an investigator (participate in planning, execution, analysis, and publication), then CHOP is only a data provider and is not engaged in the research. (See, for example, OHRP Guidance on Engagement in Human Subjects Research III.B.6.) However, if any member of the CHOP team intends to serve as an investigator, then the IRB must review the protocol as human subjects research.
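
The scenarios above can be summarized as a small decision sketch. This is an illustration only, not an IRB determination tool: the class name, field names, and function name are hypothetical, and the external-registry branch assumes (as stated above) that the data released are not readily identifiable.

```python
from dataclasses import dataclass


@dataclass
class RegistryProtocol:
    """Hypothetical summary of a Quality Improvement registry protocol."""
    external: bool                    # CHOP contributes to a registry run elsewhere
    has_research_objectives: bool     # research aims, or variables collected purely for research
    chop_staff_are_investigators: bool = False  # planning, execution, analysis, publication


def likely_needs_chop_irb_oversight(p: RegistryProtocol) -> bool:
    """Rough reading of the scenarios above; the IRB makes the actual determination."""
    if not p.has_research_objectives:
        # Quality Improvement only: hospital operations.
        return False
    if p.external and not p.chop_staff_are_investigators:
        # CHOP is only a data provider and is not engaged in the research
        # (assumes released data are not readily identifiable).
        return False
    # Research element at CHOP, or CHOP personnel acting as investigators.
    return True
```

For example, an internal registry with purely Quality Improvement objectives returns False, while the same registry with added research variables returns True.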

Research databases may be maintained after the completion of a study. Additional questions may arise in the future that can be addressed using the same dataset. If there is an intent to set up a registry, then the informed consent document should include language for subjects to opt-in or opt-out of storage of their data for future research purposes. The subject's decision as indicated on the consent/authorization must be respected and tracked. IRB oversight is required for each new research protocol that uses identifiable or re-identifiable information contained in the database.

A research database may also be created specifically as a resource for future research. IRB oversight is required to set up and maintain a research database.

Examples of research databases include:

  • Data compiled from a clinical trial
  • A list of names, diagnosis, and contact information developed and maintained to identify prospective research subjects
  • A collection of medical information intended for use in future research studies

Some databases are created and maintained to serve multiple purposes. Examples include:

  • A collection of patients' diagnosis, treatment, and follow-up information intended for and used to conduct (i) internal quality assurance programs and (ii) generalizable studies on the effectiveness of particular treatment interventions.
  • A compilation of patient information that was originally created and used for billing purposes but is now also routinely used to identify prospective research subjects

Biospecimens may be collected to achieve one or more of the objectives of a single study with disposal of the leftover materials at the end of the study. A repository is created if the leftover materials are stored for future use. If there is an intent to set up a repository, then the informed consent document should include language for subjects to opt-in or opt-out of storage of their data for future research purposes, a statement that the subject’s biospecimens may be used for commercial profit and whether the subject will or will not share in this commercial profit, and whether the research will or might include whole genome sequencing. The subject's decision as indicated on the consent/authorization must be respected and tracked.

Specimens may also be collected specifically for the purposes of future research. IRB submission is required for each new research protocol that creates a repository.

Key Human Subjects Concerns for Managing a Repository

The IRB will pay particular attention to the following issues and how they are addressed in the protocol and application:

  • the language in the informed consent/authorization agreement specifying the nature and purposes of the future research;
  • the registry/repository SOPs (unless this information is outlined in the protocol) for storage, use, back-up, and sharing of the data/biospecimens;
  • the security measures and SOPs (unless this information is outlined in the protocol) detailing the methods of Coding Data & Specimens, encrypting or anonymizing data/biospecimens to protect the privacy of subjects and the confidentiality of the data/biospecimens;
  • the methods of sharing data/biospecimens (with or without identifiers) with recipient investigators.

For the purposes of the remainder of this page, both registries and repositories will be referred to as repositories. The advantage to investigators of creating, maintaining, and/or using a research repository lies in the prospective collection, the safe storage and back-up, controlled access to and use of data/biospecimens. The information and specimens can be accessed by multiple investigators for multiple research projects. Some of the potential research uses may be anticipated at the time of collection but most cannot be specified at the time the repository is created.

A repository protocol may be submitted to the IRB either to:

  • Define the operating parameters for establishing and maintaining a research repository; or
  • Convert an existing research database, non-research database, or non-research repository into a research repository.

The IRB can approve relatively broad parameters for collecting, storing, sharing, and using the repository's information and/or specimens in research, provided the protocol incorporates a series of research protections that permit multiple uses of repository information and/or specimens by multiple investigators and/or for multiple research projects with minimal additional review by the IRB. Maximizing the utility of the data/biospecimens retained in the repository requires careful planning and clear standard operating procedures (SOPs). The mechanisms put in place will determine the requirements for IRB oversight of future uses of the materials in the repository.

Researchers who wish to develop or maintain a repository for the purposes of current or future research need to submit an application for IRB review. IRB approval, or determination of exemption, is required before initiating any repository-related activity. Operators of the repository must implement physical and procedural mechanisms for the secure receipt, storage, and transmission of information and specimens. These procedures must be reviewed by the IRB and must be sufficient to ensure the protection of subjects' privacy and the confidentiality of subjects' information.

In addition to the usual information contained in a human research protocol and an IRB application, the IRB expects that the submission will include at least the following specific information:

  • The specific conditions under which data/biospecimens may be accepted into the repository, including the requirement for the FWA number for each site IRB and a copy of the IRB approval letter;
  • A detailed description of the physical and procedural mechanisms for the secure receipt, storage, and transmission of information and specimens to ensure the protection of subjects' privacy and the confidentiality of subjects' data/specimens;
  • The specific conditions under which data and/or specimens may be shared with or released to research investigators, including policies prohibiting or permitting the sharing of PHI associated with the data/biospecimens;
  • Protocol for the collectors of data/specimens;
  • Template consent and written authorization or combined consent/authorization for the collection sites that includes a clear description of each of the following:
    • The general concept and purpose of repository;
    • The name and location of the repository(ies);
    • The nature and types of future research in as much specific detail as possible;
    • A summary of the physical and procedural mechanisms for protecting subjects' privacy and the confidentiality of data/biospecimens;
    • The conditions (if any) under which subjects may withdraw their consent/authorization to use of specimens;
    • The conditions and requirements under which data/biospecimens and materials derived from biospecimens may be shared with recipient-investigators;
    • The elements of PHI (if any) to be shared with recipient investigators;
    • Itemization of the risks related to a breach of confidentiality including impact on privacy, insurability, stigmatization, etc.;
    • Where human genetic research is anticipated, information about the consequences of DNA typing (e.g., regarding possible paternity determinations, impact on insurability, etc.) and related confidentiality risks.

Developing a Repository Protocol:

The IRB has created a Registry-Repository protocol template. It is available on our protocol template page, and it includes guidance on developing a protocol for a Repository. This template will need to be modified, with sections added or deleted depending on the specifics of the data and specimens collected, the potential future uses for the data/specimens, the potential for returning results, the intent to include other sites, the intent to share the data/specimens and research results with other investigators, the structure of the oversight of the repository and other issues. To supplement the template the IRB has developed a Fact Sheet with important considerations when Creating a Registry-Repository.

Developing a Consent Form for a Repository:

Example Repository Consent Form - The IRB has created an example biorepository consent form for studies where the sole procedure is collection of data and specimens for a repository. This form is an example only and might need to be modified substantially depending on the nature of the data and specimens that will be collected, the types of tests routinely performed on the specimens and the potential future uses for the materials collected.

Investigators who collect the data/biospecimens (the collector) that will be stored in a repository must agree in writing to specific conditions stipulated by the Repository IRB (the IRB exercising oversight of the repository). Collectors' other responsibilities include:

  • Obtaining appropriate IRB approval from their local IRB;
  • Obtaining and documenting informed consent/authorization as required by their IRB;
  • Following the protocol procedures including:
    • Obtaining the required data/biospecimens in the appropriate form and storage medium and at the specified timepoints;
    • Coding and labeling the samples before shipping or uploading the data/biospecimens to the registry/repository;
    • Removing all unnecessary PHI from the data/biospecimens;
    • Maintaining and securing PHI associated with the data/biospecimens sent to the registry/repository and any links (key) between the donor and the data/biospecimens as specified in the protocol.
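
As a rough illustration of the coding step described above, the sketch below assigns a random code to a record, strips the identifying fields before the record is sent to the repository, and keeps the link (key) in a separate table held by the collector. The field names in PHI_FIELDS, the record layout, and the function name are hypothetical; the actual fields to remove and the method of securing the key are those specified in the approved protocol.

```python
import secrets

# Hypothetical identifying fields; the protocol defines the real list.
PHI_FIELDS = {"name", "mrn", "date_of_birth"}


def code_record(record: dict, key_table: dict) -> dict:
    """Return a coded copy of `record` with PHI removed.

    The link between the code and the removed identifiers is stored in
    `key_table`, which stays with the collector and is never sent along.
    """
    code = secrets.token_hex(8)  # random code, not derived from any identifier
    key_table[code] = {f: record[f] for f in PHI_FIELDS if f in record}
    coded = {k: v for k, v in record.items() if k not in PHI_FIELDS}
    coded["study_code"] = code
    return coded


# Usage: only `coded` is sent to the repository; `key` is stored securely on site.
key = {}
coded = code_record(
    {"name": "Example Patient", "mrn": "000000", "diagnosis": "asthma"}, key
)
```

A random code is used here rather than a hash of the identifiers so that the code cannot be reversed without the key table.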

Responsibilities of the Recipient

Investigators receiving data/biospecimens from the repository (the recipient) must also agree in writing to specific conditions required by the Repository IRB and must adhere to local IRB oversight requirements. The nature of the local IRB oversight requirements depends upon whether or not the received data/biospecimens are potentially identifiable.

Sharing and Using Data/Biospecimens from Research Repositories

Ensuring that comprehensive subject protection mechanisms are in place at the time the registry/repository is created permits new collecting investigators, new research projects, and new recipient investigators to be added to the repository's activities without protracted IRB review. Many new activities may require only expedited review by the IRB. Under some circumstances, they may simply require timely notification of the IRB or may not require IRB review and approval at all. The use and disclosure of information/specimens from a research repository are determined by the following:

  • the requirements of the Repository IRB, which is responsible for the review, approval, and oversight of the repository; and
  • the requirements of the local IRB, which is responsible for research at the site where the information/specimens will be used.

For more information about sharing data/biospecimens see the following webpages: Coded Data and Sharing Data.

The IRB is responsible for any research repository maintained at CHOP or its affiliates, including their employees or agents (e.g., CHOP faculty). Data/biospecimens from these repositories may be accessed, used, shared, or disclosed in accordance with the IRB-approved repository protocol and consent/authorization and any additional approval conditions stipulated by the IRB. Once provided to recipient-investigators outside CHOP and its affiliates, use and disclosure of the data/specimens must also comply with any additional requirements of the recipient institution and its IRB.

When data/biospecimens are received by CHOP investigators, their use and disclosure must comply with any conditions stipulated by the sending institution's IRB. CHOP policies and procedures for the protection of human subjects and the use and disclosure of protected health information must also be observed.

  • Receipt of de-identified data/biospecimens does not engage CHOP or its investigators in human subjects research.
  • Receipt of a limited dataset requires a duly executed data use agreement between the CHOP investigator and the provider of the data/biospecimens.
  • When data/biospecimens contain PHI, it is up to the IRB, not the investigator, to determine the human subject protection and privacy rule protections required for any research activities. Investigators should consult the IRB before initiating any research involving data/biospecimens obtained from outside repositories (including outside data banks, tissue banks, and registries) if the materials will include PHI.

Additional Resources

CHOP IRB Materials

  1. Registry/Repository Protocol Template
  2. Creating a Registry-Repository
  3. Consent Template for Biorepository

Office for Human Research Protections (OHRP)

  1. OHRP's Guidance on Coded Private Information or Biological Specimens
  2. OHRP: Issues to Consider in the Research Use of Stored Data or Tissues