Data repositories



Last author update: 11 April 2024
Last staff update: 11 April 2024

Copyright: 2022-2024, PathologyOutlines.com, Inc.


Ricardo Gonzalez, M.D., M.Sc., M.P.H.
Andrew P. Norgan, M.D., Ph.D.
Cite this page: Gonzalez R, Norgan AP. Data repositories. PathologyOutlines.com website. https://www.pathologyoutlines.com/topic/informaticsdatarepositories.html. Accessed April 25th, 2024.
Definition / general
Essential features
  • Most healthcare data are unstructured (e.g., freeform text and images) (Healthc Inform Res 2019;25:1)
  • A database management system (DBMS) is software for creating and maintaining databases
  • Traditional relational databases may be inadequate for applications and analyses involving high volume healthcare data
  • Data warehouses and data lakes are commonly used to improve the organization and accessibility of healthcare data
Terminology
Applications
Limitations
Software
Videos

Data management playlist - IBM Technology

Data storage essentials playlist - IBM Technology

Board review style question #1
Which acronym best describes how data are moved into a data warehouse?

  A. ELT (extract, load, transform)
  B. ETL (extract, transform, load)
  C. ITE (import, transform, export)
  D. LTE (load, transform, export)
Board review style answer #1
B. ETL (extract, transform, load). In the first phase (i.e., extract), data are obtained from 1 or more sources according to specific criteria. In the second phase (i.e., transform), data are cleaned and converted into the formats required for storage, which can make data from different sources comparable. In the final phase (i.e., load), data are uploaded to the warehouse (Frisse: Essentials of Clinical Informatics, Illustrated Edition, 2019). Answer A is incorrect because data are transformed before loading in data warehouses (as opposed to data lakes, in which data are transformed on demand after loading). Answers C and D are incorrect because import and export are not steps named in this acronym.
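
The 3 phases described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2 source systems, their field names, the sample records and the cases table are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw records from 2 source systems (illustrative data only).
SOURCE_A = [{"case_id": "S24-001", "diagnosis": " Adenocarcinoma ", "received": "04/11/2024"}]
SOURCE_B = [{"case_id": "s24-002", "diagnosis": "melanoma", "received": "2024-04-12"}]


def extract():
    """Extract: obtain records from each source."""
    return SOURCE_A + SOURCE_B


def to_iso_date(value):
    """Convert MM/DD/YYYY to YYYY-MM-DD; leave other values unchanged."""
    if "/" in value:
        month, day, year = value.split("/")
        return f"{year}-{month.zfill(2)}-{day.zfill(2)}"
    return value


def transform(records):
    """Transform: clean and standardize so the sources become comparable."""
    return [
        (r["case_id"].upper(), r["diagnosis"].strip().lower(), to_iso_date(r["received"]))
        for r in records
    ]


def load(rows, conn):
    """Load: write the already transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases "
        "(case_id TEXT PRIMARY KEY, diagnosis TEXT, received TEXT)"
    )
    conn.executemany("INSERT INTO cases VALUES (?, ?, ?)", rows)
    conn.commit()


# In-memory SQLite database used as a stand-in for a data warehouse.
conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
warehouse_rows = conn.execute("SELECT * FROM cases ORDER BY case_id").fetchall()
```

Because the transform step runs before the load step, every row in the table already shares the same identifier, text and date conventions. Calling load before transform would correspond to the ELT order used with data lakes.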

Board review style question #2
Which of the following statements is true about data lakes?

  A. Are subject oriented
  B. Cannot be used with healthcare data
  C. Cannot store raw data
  D. Can store unstructured data
Board review style answer #2
D. Can store unstructured data. Data lakes can store structured, semistructured and unstructured data. Answers A and C are incorrect because, unlike data warehouses, data lakes are not designed with specific questions in mind (i.e., they are not subject oriented) and can store data in their native form (i.e., raw data). Answer B is incorrect because data lakes can be used to improve the organization and accessibility of healthcare data.
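
The contrast with a warehouse can be illustrated with a minimal Python sketch, assuming a temporary directory as a stand-in for the lake. The ingest and read_on_demand helpers, the file names and the sample contents are hypothetical. Data are written exactly as received, and structure is applied only when they are read.

```python
import json
import tempfile
from pathlib import Path


def ingest(lake_dir, name, payload):
    """Store bytes exactly as received; nothing is cleaned or converted on load."""
    path = Path(lake_dir) / name
    path.write_bytes(payload)
    return path


def read_on_demand(path):
    """Apply structure only when the data are read (transform after loading)."""
    raw = path.read_bytes()
    if path.suffix == ".json":
        return json.loads(raw)       # semistructured data
    if path.suffix == ".txt":
        return raw.decode("utf-8")   # unstructured freeform text
    return raw                       # e.g., image bytes, returned unchanged


# A temporary directory used as a stand-in for a data lake.
with tempfile.TemporaryDirectory() as lake:
    report = ingest(lake, "report.txt", b"Skin, left arm, biopsy: melanoma.")
    result = ingest(lake, "result.json", b'{"test": "HER2", "score": 3}')
    image = ingest(lake, "tile.bin", bytes([0, 255, 128]))

    report_text = read_on_demand(report)
    result_record = read_on_demand(result)
    image_bytes = read_on_demand(image)
```

No schema is defined before loading, so freeform text, a semistructured record and binary data can sit side by side in their native form.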
