Highlights:

  • Researchers introduce a new method to improve large language models (LLMs) for code vulnerability detection.
  • Study explores two selection criteria for choosing few-shot examples in in-context learning (ICL).
  • Approach combines model consistency and code similarity measures to enhance detection performance.
  • Evaluations conducted on open-source models and multiple software vulnerability datasets.

TLDR:

A team of computer scientists has developed a method to improve how large language models detect vulnerabilities in code by strategically selecting few-shot examples based on performance consistency and code similarity. The approach could lead to more secure software development and more reliable AI-assisted code analysis tools.

The rapid evolution of large language models (LLMs) has revolutionized how developers approach coding tasks, ranging from code summarization to automated generation. Yet one of the toughest challenges remains the automated detection of software vulnerabilities. Addressing this crucial gap, a new study titled ‘On Selecting Few-Shot Examples for LLM-based Code Vulnerability Detection’ introduces a systematic approach to optimizing few-shot example selection in in-context learning (ICL) for code security analysis. The research, authored by Md Abdul Hannan, Ronghao Ni, Chi Zhang, Limin Jia, Ravi Mangal, and Corina S. Pasareanu, presents a framework designed to make LLMs smarter and more reliable when identifying security weaknesses in code.

The paper focuses on improving the few-shot prompting process that underpins LLM adaptability. In in-context learning, an LLM is supplied with a small set of relevant examples — known as few-shot examples — before attempting a new task. However, poor example selection can degrade performance, especially in tasks requiring high accuracy, such as detecting code vulnerabilities. The research team introduces two complementary criteria for selecting these few-shot examples. The first criterion leverages the model’s own performance history, assessing which samples an LLM consistently gets right or wrong. This self-assessment helps identify examples that provide the most informative signals for future predictions. The second criterion is based on similarity: examples most closely resembling the target program are retrieved using a k-nearest-neighbor search. This ensures contextual relevance, helping the LLM recognize the specific coding patterns and potential vulnerabilities in comparable code samples.
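To make the two criteria concrete, here is a minimal Python sketch of one way they could be combined: filter a labeled candidate pool by the model's past consistency on each example, then rank the survivors by embedding similarity to the target program (kNN). Everything here is an illustrative assumption rather than the paper's implementation: the Example fields, the consistency threshold, the cosine similarity measure, and the function names are all hypothetical.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Example:
    code: str              # labeled candidate program
    label: str             # "vulnerable" or "benign"
    embedding: np.ndarray  # precomputed code embedding (assumed given)
    consistency: float     # fraction of past runs the LLM classified it correctly


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def select_few_shot(target_emb: np.ndarray, pool: list[Example],
                    k: int = 4, min_consistency: float = 0.8) -> list[Example]:
    """Criterion 1: keep examples the model answers consistently.
    Criterion 2: rank the survivors by similarity to the target (kNN)."""
    reliable = [ex for ex in pool if ex.consistency >= min_consistency]
    reliable.sort(key=lambda ex: cosine(target_emb, ex.embedding), reverse=True)
    return reliable[:k]


def build_prompt(target_code: str, shots: list[Example]) -> str:
    """Assemble an in-context-learning prompt from the selected examples."""
    blocks = [f"Code:\n{ex.code}\nAnswer: {ex.label}\n" for ex in shots]
    blocks.append(f"Code:\n{target_code}\nAnswer:")
    return "\n".join(blocks)
```

Filtering before the similarity ranking restricts retrieval to examples the model already handles reliably; the two signals could just as plausibly be fused into a single weighted score.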

Through extensive evaluations on multiple open-source datasets and models, the authors demonstrate the combined strength of these selection criteria. Not only did the hybrid approach outperform random or naive selection methods, but it also revealed deeper insights into how LLMs process and learn from structural patterns within code. The implications extend beyond vulnerability detection — the same principles could be used to boost the accuracy of LLMs in code review, debugging, and even automated patch generation. By tackling one of the fundamental problems in LLM learning — choosing what to learn from — this work represents a meaningful step toward safer, AI-assisted software engineering practices.

The study’s findings highlight the potential for more explainable and efficient AI-driven security tools. As large language models become integral to software development pipelines, improving their reasoning about code and their ability to recognize vulnerabilities is paramount. This research bridges a key gap between machine learning optimization and practical cybersecurity, marking an exciting milestone for the future of AI-enabled code intelligence.

Source:

Hannan, M. A., Ni, R., Zhang, C., Jia, L., Mangal, R., & Pasareanu, C. S. (2025). On Selecting Few-Shot Examples for LLM-based Code Vulnerability Detection. arXiv:2510.27675 [cs.SE]. https://doi.org/10.48550/arXiv.2510.27675
