



Extracting Databases from Dark Data with DeepDive.

Ce Zhang¹, Jaeho Shin¹, Christopher Ré¹

  • ¹Stanford University, Palo Alto, CA.

Proceedings. ACM-SIGMOD International Conference on Management of Data
March 21, 2017
Summary
This summary is machine-generated.

DeepDive extracts structured, relational data from unstructured "dark data," turning previously inaccessible information into resources that standard database tools can analyze. The system achieves high precision and recall across diverse applications.


Area of Science:

  • Computer Science
  • Data Science
  • Information Extraction

Background:

  • Vast amounts of unstructured data, termed "dark data," exist across scientific papers, web content, and customer records.
  • This dark data is largely inaccessible to traditional relational database tools, limiting its analytical potential.
  • Unlocking this data could create massive new "big data" resources for analysis.

Purpose of the Study:

  • To introduce DeepDive, a novel system for extracting relational databases from dark data.
  • To demonstrate DeepDive's capability in transforming unstructured information into exploitable big data.
  • To highlight the system's potential across various scientific and industrial domains.

Main Methods:

  • DeepDive employs a unique architecture combining large-scale probabilistic inference with a novel developer interaction cycle.
  • Core innovations in probabilistic training and inference enable efficient and accurate data extraction.
  • The system is designed for high precision and recall at a reasonable engineering cost.
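
As a rough illustration of what probabilistic inference over candidate facts can look like, the sketch below computes exact marginal probabilities for two candidate facts in a tiny weighted factor model by enumerating every possible world. This is not DeepDive's code: the variable names, the two rules, and their weights are hypothetical, and exhaustive enumeration only works at toy scale, whereas the summary describes large-scale inference.

```python
from itertools import product
from math import exp


def marginals(variables, factors):
    """Return P(variable = True) for each variable by exact enumeration.

    factors is a list of (weight, predicate) pairs; a world's unnormalized
    score is exp(sum of weights of the predicates it satisfies).
    """
    totals = {v: 0.0 for v in variables}
    z = 0.0  # normalization constant (sum of scores over all worlds)
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        score = exp(sum(w for w, pred in factors if pred(world)))
        z += score
        for v in variables:
            if world[v]:
                totals[v] += score
    return {v: totals[v] / z for v in variables}


# Hypothetical candidate facts extracted from text.
variables = ["spouse(A,B)", "spouse(B,A)"]

# Hypothetical weighted rules (weights are made up for illustration).
factors = [
    # A textual feature that supports spouse(A,B).
    (1.5, lambda w: w["spouse(A,B)"]),
    # A symmetry rule: the two facts should agree.
    (2.0, lambda w: w["spouse(A,B)"] == w["spouse(B,A)"]),
]

result = marginals(variables, factors)
for name, prob in result.items():
    print(f"{name}: {prob:.3f}")
```

The directly supported fact receives the higher marginal, and the symmetry rule raises the probability of the second fact even though no feature mentions it. A downstream step could keep only facts whose marginal exceeds a chosen threshold.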

Main Results:

  • DeepDive successfully creates relational databases from diverse dark data sources.
  • In multiple applications, DeepDive-generated databases achieve accuracy comparable to human annotators.
  • The system has been deployed for insurance, materials science, genomics, paleontology, and law enforcement.
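
One common way to compare an extracted database against human annotation is to compute precision and recall over the sets of extracted and human-labeled facts. The sketch below shows that calculation. The function name and the example facts are hypothetical, and the summary does not state how the paper's evaluation was actually performed.

```python
def precision_recall(extracted, gold):
    """Compute precision and recall of extracted facts against gold facts.

    precision = correct extractions / all extractions
    recall    = correct extractions / all gold facts
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical facts: four extracted tuples, five human-labeled tuples,
# three of which overlap.
extracted = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]
gold = [("A", "B"), ("C", "D"), ("E", "F"), ("I", "J"), ("K", "L")]

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the four extracted facts are correct (precision 0.75) and three of the five gold facts are recovered (recall 0.60).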

Conclusions:

  • DeepDive effectively unlocks the value hidden within dark data.
  • The system offers a powerful solution for creating big data resources from previously inaccessible information.
  • DeepDive presents a significant opportunity for industry, government, and scientific research through enhanced data analysis.