CODEIPPROMPT: Intellectual Property Infringement Assessment of Code Language Models.

Zhiyuan Yu, Yuhao Wu, Ning Zhang

Washington University in St. Louis

Proceedings of Machine Learning Research, August 27, 2025

Summary
This summary is machine-generated.

Large language models (LMs) for code generation often violate intellectual property (IP) rights because their training data include restrictively licensed code. Our platform, CODEIPPROMPT, evaluates and highlights these IP risks in AI-generated code.

Area of Science:

  • Artificial Intelligence
  • Software Engineering
  • Intellectual Property Law

Background:

  • Large language models (LMs) demonstrate advanced capabilities in synthesizing programming code.
  • The rise of AI-generated code raises significant concerns regarding intellectual property (IP) rights violations.
  • IP issues in code-generating LMs remain relatively underexplored.

Purpose of the Study:

  • To introduce CODEIPPROMPT, a novel platform for the automatic evaluation of IP rights violations in code generated by LMs.
  • To assess the extent to which LMs reproduce licensed programs and identify potential IP infringements.
  • To investigate the root causes of IP violations in code LMs and explore mitigation strategies.

Main Methods:

  • Development of CODEIPPROMPT, featuring prompts derived from a licensed code database to trigger IP-violating code generation.
  • Implementation of a measurement tool within CODEIPPROMPT to quantify the degree of IP violation in LM-generated code.
  • Extensive evaluation of various open-source and commercial code LMs using the CODEIPPROMPT platform.
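
The measurement step above can be illustrated with a minimal sketch. This summary does not describe the metric CODEIPPROMPT actually uses, so everything here is an assumption for illustration: the token-level similarity based on Python's standard `difflib.SequenceMatcher`, the hypothetical helper names `tokenize`, `similarity`, and `flag_violations`, and the 0.8 threshold.

```python
import difflib
import re


def tokenize(code: str) -> list[str]:
    """Split source code into identifiers, digit runs, and single symbols.

    Whitespace is discarded, so formatting changes do not affect the score.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def similarity(generated: str, reference: str) -> float:
    """Return a token-level similarity ratio in [0, 1] (assumed metric)."""
    matcher = difflib.SequenceMatcher(
        None, tokenize(generated), tokenize(reference), autojunk=False
    )
    return matcher.ratio()


def flag_violations(
    generations: list[str], reference: str, threshold: float = 0.8
) -> list[str]:
    """Keep generations whose similarity to the licensed reference
    meets the (hypothetical) threshold."""
    return [g for g in generations if similarity(g, reference) >= threshold]


# Illustrative data only: a "licensed" snippet and two model outputs.
licensed = "def add(a, b):\n    return a + b"
outputs = ["def add(a,b):\n  return a+b", "print('hello')"]
flagged = flag_violations(outputs, licensed)
```

In this sketch the first output differs from the licensed snippet only in whitespace, so it is flagged, while the unrelated second output is not. A real measurement tool would likely need a more robust notion of similarity than this, for example one that tolerates renamed identifiers.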

Main Results:

  • Prevalence of IP violations was observed across all evaluated open-source and commercial code LMs.
  • The primary cause identified is the significant inclusion of restrictively licensed code within the training datasets.
  • Both intentional inclusion and inconsistent real-world licensing practices contribute to the issue.

Conclusions:

  • CODEIPPROMPT serves as a crucial testbed for assessing IP violation risks in current code generation platforms.
  • The study underscores the urgent need for enhanced mitigation strategies to address IP concerns in AI code synthesis.
  • Fine-tuning and dynamic token filtering are explored as potential methods to reduce IP infringements.