Optimizing Interactive Development of Data-Intensive Applications.

Matteo Interlandi, Sai Deep Tetali, Muhammad Ali Gulzar

  • University of California, Los Angeles

Proceedings of the ... ACM Symposium on Cloud Computing [Electronic Resource] : SOCC ... ... Socc (Conference)
April 14, 2017
Summary
This summary is machine-generated.

Modern data-intensive computing systems often re-execute similar programs from scratch, which slows development. Vega, an Apache Spark framework, optimizes the execution of such series of similar Spark programs, which may shorten time-to-market for Big Data applications.

Area of Science:

  • Computer Science
  • Data Engineering
  • Software Optimization

Background:

  • Data-Intensive Scalable Computing (DISC) systems process data via batch jobs executing compiled programs.
  • Interactive development involves posing ad-hoc queries, often with structural overlap between successive queries.
  • Current systems re-execute modified queries from scratch, lengthening development cycles.
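
The structural overlap described above can be illustrated with a small sketch. It is not Vega's implementation and does not use the Spark API. It assumes a query can be modeled as an ordered list of labeled steps, that a label fully identifies what a step does, and that the input data is unchanged between runs. The names `shared_prefix_len`, `run_with_reuse`, `cache`, and the step labels are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A step is a (label, function) pair; a query is an ordered list of steps.
# Assumption: two steps with the same label always compute the same thing.
Step = Tuple[str, Callable[[List[int]], List[int]]]
Cache = Dict[Tuple[str, ...], List[int]]


def shared_prefix_len(old: List[Step], new: List[Step]) -> int:
    """Count the leading steps whose labels match in both queries."""
    n = 0
    for (old_label, _), (new_label, _) in zip(old, new):
        if old_label != new_label:
            break
        n += 1
    return n


def run_with_reuse(data: List[int], query: List[Step], cache: Cache) -> Tuple[List[int], int]:
    """Run `query`, resuming from the longest prefix already in `cache`.

    `cache` maps a tuple of step labels to the result of running that prefix.
    Returns the final result and the number of steps actually executed.
    """
    labels = tuple(label for label, _ in query)
    start, current = 0, data
    # Look for the longest prefix of this query that was computed before.
    for k in range(len(query), 0, -1):
        if labels[:k] in cache:
            start, current = k, cache[labels[:k]]
            break
    executed = 0
    for i in range(start, len(query)):
        _, fn = query[i]
        current = fn(current)
        cache[labels[: i + 1]] = current  # remember every intermediate result
        executed += 1
    return current, executed


# Two successive ad-hoc queries that differ only in their last step.
drop_negatives: Step = ("drop_negatives", lambda xs: [x for x in xs if x >= 0])
square: Step = ("square", lambda xs: [x * x for x in xs])
keep_even: Step = ("keep_even", lambda xs: [x for x in xs if x % 2 == 0])
keep_large: Step = ("keep_gt_10", lambda xs: [x for x in xs if x > 10])

data = [-2, 1, 2, 3, 4]
cache: Cache = {}
first = [drop_negatives, square, keep_even]
second = [drop_negatives, square, keep_large]

r1, n1 = run_with_reuse(data, first, cache)   # nothing cached yet: all steps run
r2, n2 = run_with_reuse(data, second, cache)  # shared two-step prefix is reused
```

In this toy model the second query executes only its final step, whereas a system that re-executes from scratch would run all three again. Real DISC workloads are far more involved (distributed data, changes in the middle of a pipeline, cache eviction), so the sketch only conveys why overlap between successive queries is worth exploiting.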

Purpose of the Study:

  • To introduce Vega, an Apache Spark framework designed for optimizing series of similar Spark programs.
  • To reduce the development cycle time for data scientists and Big Data application developers.

Main Methods:

  • Implementation of the Vega framework within Apache Spark.
  • Leveraging observed structural overlap in queries from development sessions.
  • Optimizing the re-execution of modified Spark programs.

Main Results:

  • Vega significantly reduces the time required to re-execute modified Spark programs.
  • Accelerated development cycles for interactive data analysis and Big Data applications.
  • Potential for reduced overall time-to-market for Big Data applications.

Conclusions:

  • Vega offers an effective solution for optimizing repetitive Spark program executions.
  • The framework addresses a key bottleneck in interactive data science and Big Data development.
  • Adoption of Vega can lead to substantial efficiency gains in Big Data workflows.

Keywords:

  • Big Data
  • H.2.4 [Information Systems]: Database Management - query processing
  • Incremental Evaluation
  • Interactive Development
  • Languages
  • Performance
  • Query Rewriting
  • Spark
  • Theory
  • parallel databases