CODEIPPROMPT: Intellectual Property Infringement Assessment of Code Language Models.

Zhiyuan Yu¹, Yuhao Wu¹, Ning Zhang¹

¹Washington University in St. Louis.

Proceedings of Machine Learning Research | August 27, 2025

Summary

This summary is machine-generated.

Large language models (LMs) for code generation can produce code that violates intellectual property (IP) rights, largely because restrictively licensed code is present in their training data. CODEIPPROMPT is a platform that evaluates and highlights these IP risks in LM-generated code.

Area of Science:

Artificial Intelligence
Software Engineering
Intellectual Property Law

Background:

Large language models (LMs) demonstrate advanced capabilities in synthesizing programming code.
The rise of AI-generated code raises significant concerns regarding intellectual property (IP) rights violations.
IP issues in code-generating LMs remain relatively underexplored.

Purpose of the Study:

To introduce CODEIPPROMPT, a novel platform for the automatic evaluation of IP rights violations in code generated by LMs.
To assess the extent to which LMs reproduce licensed programs and identify potential IP infringements.
To investigate the root causes of IP violations in code LMs and explore mitigation strategies.

Main Methods:

Development of CODEIPPROMPT, featuring prompts derived from a licensed code database to trigger IP-violating code generation.
Implementation of a measurement tool within CODEIPPROMPT to quantify the degree of IP violation in LM-generated code.
Extensive evaluation of various open-source and commercial code LMs using the CODEIPPROMPT platform.
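
The first two method steps can be illustrated with a minimal Python sketch. It is not the CODEIPPROMPT implementation: the prompt rule (the first lines of a licensed function), the token-level similarity from the standard-library difflib module, the 0.5 threshold, and all function names are assumptions chosen for illustration only.

```python
import difflib
import re


def make_prompt(licensed_source, n_lines=1):
    """Hypothetical prompt rule: keep the first n_lines of a licensed snippet."""
    return "\n".join(licensed_source.splitlines()[:n_lines])


def tokenize(code):
    """Split code into identifiers, numbers, and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def similarity(generated, reference):
    """Token-level similarity in [0, 1]; a stand-in for a real code-similarity tool."""
    matcher = difflib.SequenceMatcher(
        None, tokenize(generated), tokenize(reference), autojunk=False
    )
    return matcher.ratio()


def violation_rate(generations, reference, threshold=0.5):
    """Fraction of generations whose similarity to the reference meets the threshold."""
    if not generations:
        return 0.0
    flagged = [g for g in generations if similarity(g, reference) >= threshold]
    return len(flagged) / len(generations)


# Toy "licensed" snippet, used only for this illustration.
LICENSED = (
    "def clamp(value, low, high):\n"
    "    if value < low:\n"
    "        return low\n"
    "    if value > high:\n"
    "        return high\n"
    "    return value\n"
)

prompt = make_prompt(LICENSED)
# In a real evaluation these strings would be LM completions of `prompt`.
generations = [LICENSED, "print('hi')"]
rate = violation_rate(generations, LICENSED)
```

In this sketch a generation counts as a potential violation when its similarity to the licensed reference reaches the threshold. A real measurement would need a similarity metric that tolerates renaming and reordering, which plain token matching does not.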

Main Results:

IP violations were prevalent across all evaluated open-source and commercial code LMs.
The primary cause identified is the significant inclusion of restrictively licensed code within the training datasets.
Both intentional inclusion and inconsistent real-world licensing practices contribute to the issue.

Conclusions:

CODEIPPROMPT serves as a crucial testbed for assessing IP violation risks in current code generation platforms.
The study underscores the urgent need for enhanced mitigation strategies to address IP concerns in AI code synthesis.
Fine-tuning and dynamic token filtering are explored as potential methods to reduce IP infringements.
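
The summary does not say how dynamic token filtering is implemented. One possible reading, sketched below in Python under stated assumptions, is to block at decoding time any candidate token that would complete an n-gram found in licensed code. The blocklist construction, the toy ranking function, and every name here are hypothetical and are not taken from the study.

```python
def build_ngram_blocklist(licensed_token_seqs, n):
    """Collect every n-gram (as a tuple) that occurs in the licensed token sequences."""
    blocked = set()
    for seq in licensed_token_seqs:
        for i in range(len(seq) - n + 1):
            blocked.add(tuple(seq[i:i + n]))
    return blocked


def filter_candidates(context, candidates, blocked, n):
    """Drop candidate tokens that would complete a blocked n-gram."""
    prefix = tuple(context[-(n - 1):]) if n > 1 else ()
    return [tok for tok in candidates if prefix + (tok,) not in blocked]


def decode(rank_next, prompt_tokens, blocked, n, max_new_tokens):
    """Greedy decoding that picks the best-ranked candidate surviving the filter."""
    out = list(prompt_tokens)
    for _ in range(max_new_tokens):
        allowed = filter_candidates(out, rank_next(out), blocked, n)
        if not allowed:
            break
        out.append(allowed[0])
    return out


def toy_rank_next(context):
    """Hypothetical stand-in for an LM that has memorized one licensed line."""
    memorized = ["return", "a", "+", "b"]
    best = memorized[len(context)] if len(context) < len(memorized) else "<eos>"
    return [best, "<alt>"]  # candidates in descending preference


licensed = [["return", "a", "+", "b"]]
blocked = build_ngram_blocklist(licensed, n=3)

unfiltered = decode(toy_rank_next, ["return"], set(), 3, 3)
filtered = decode(toy_rank_next, ["return"], blocked, 3, 3)
```

With an empty blocklist the toy model reproduces the memorized line verbatim. With the filter active, decoding switches to the next-ranked token at the point where a licensed 3-gram would otherwise be completed. Exact n-gram matching is simple to state but easy to evade with minor edits, so it should be read as an illustration of the idea rather than an adequate mitigation.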