
Querying the public databases for sequences using complex keywords contained in the feature lines.

Olivier Croce¹, Michaël Lamarre, Richard Christen

  • ¹ Laboratoire de Biologie Virtuelle, UMR 6543, CNRS & University of Nice Sophia-Antipolis, Centre de Biochimie, Parc Valrose, Nice, F06108, France. croce@unice.fr

BMC Bioinformatics | January 31, 2006
Summary
This summary is machine-generated.

Retrieving specific DNA subsequences from large datasets can be challenging. New Perl scripts and improved BioPerl tools offer precise data retrieval for complex biological sequence queries.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomic Data Analysis

Background:

  • High-throughput technologies necessitate efficient retrieval of large sequence datasets.
  • Existing tools like ACNUC, Entrez, and SRS have limitations when querying complex keywords or specific subsequences.

Purpose of the Study:

  • To develop a method for precise retrieval of subsequences based on complex descriptors in EMBL Feature qualifiers.
  • To overcome limitations of existing tools in handling complex sequence data queries.

Main Methods:

  • Development of specific Perl scripts for targeted subsequence retrieval.
  • Enhancement of the BioPerl library for parsing large data files.
  • Integration of scripts into a user-friendly, operating system-independent interface.
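
To make the idea of targeted subsequence retrieval concrete, here is a minimal Python sketch (not the authors' Perl/BioPerl code). It parses a complete EMBL-style entry, keeps only features whose chosen qualifier contains every query term, and cuts the matching subsequence out of the sequence. The toy record, the function names `parse_entry` and `find_subsequences`, and the restriction to simple `start..end` locations with single-line qualifiers are all assumptions made for illustration.

```python
import re

# Hypothetical toy entry in EMBL flat-file layout (not a real database record).
EMBL_RECORD = """\
ID   TOY00001; SV 1; linear; genomic DNA; STD; PRO; 60 BP.
FT   source          1..60
FT                   /organism="Escherichia coli"
FT   rRNA            5..24
FT                   /product="16S ribosomal RNA"
FT   CDS             30..50
FT                   /product="hypothetical protein"
SQ   Sequence 60 BP;
     aaaaccccgg ggttttaaaa ccccggggtt ttaaaacccc ggggttttaa aaccccgggg        60
//
"""


def parse_entry(text):
    """Return (features, sequence) for one entry.

    Assumption: qualifiers fit on one line; continuation lines are ignored.
    """
    features, seq_parts, in_seq = [], [], False
    for line in text.splitlines():
        if line.startswith("FT"):
            body = line[5:]
            if body[:1].strip():
                # A feature key starts right after the "FT   " prefix.
                key, location = body.split(None, 1)
                features.append(
                    {"key": key, "location": location.strip(), "qualifiers": {}}
                )
            else:
                # Qualifier line, e.g. /product="16S ribosomal RNA"
                m = re.match(r'\s*/(\w+)="?([^"]*)"?', body)
                if m and features:
                    features[-1]["qualifiers"][m.group(1)] = m.group(2)
        elif line.startswith("SQ"):
            in_seq = True
        elif line.startswith("//"):
            in_seq = False
        elif in_seq:
            # Drop the spacing and the trailing position counter.
            seq_parts.append(re.sub(r"[^a-zA-Z]", "", line))
    return features, "".join(seq_parts)


def find_subsequences(text, feature_key, qualifier, all_terms):
    """Return (location, subsequence) pairs for features of type `feature_key`
    whose `qualifier` value contains every term in `all_terms`."""
    features, sequence = parse_entry(text)
    hits = []
    for feat in features:
        if feat["key"] != feature_key:
            continue
        value = feat["qualifiers"].get(qualifier, "").lower()
        if not all(term.lower() in value for term in all_terms):
            continue
        m = re.fullmatch(r"(\d+)\.\.(\d+)", feat["location"])
        if m is None:
            continue  # complement(...) and join(...) are not handled here
        start, end = int(m.group(1)), int(m.group(2))
        hits.append((feat["location"], sequence[start - 1:end]))  # 1-based, inclusive
    return hits


if __name__ == "__main__":
    for location, subseq in find_subsequences(
        EMBL_RECORD, "rRNA", "product", ["16S", "ribosomal"]
    ):
        print(location, subseq)
```

Because the match is made against one qualifier of one feature rather than against the entry as a whole, the result is the annotated region only, not the full entry. The cost is that every entry has to be read and parsed.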

Main Results:

  • Demonstrated severe limitations of Entrez for subsequence retrieval.
  • Identified issues with SRS for multi-term and complex queries.
  • Showcased ACNUC's inability to perform precise queries within Feature qualifiers.
  • Successfully developed and implemented Perl scripts and BioPerl enhancements for accurate data retrieval.

Conclusions:

  • Parsing complete entries with scripts is essential for exact data retrieval, despite being slower than prebuilt index tools.
  • The user-friendly interface lets biologists run the scripts directly, while bioinformaticians can modify them for specific needs.