Highlights:
- BadGraph introduces a new class of backdoor attacks on latent diffusion models for text-guided graph generation.
- The method uses textual triggers to implant hidden vulnerabilities during model training.
- Experiments on PubChem, ChEBI-20, PCDes, and MoMu datasets show high attack success with minimal poisoning.
- Research highlights major security concerns for AI applications like drug discovery.
TLDR:
Researchers have unveiled BadGraph, a powerful backdoor attack strategy targeting text-guided graph generation models based on latent diffusion. The discovery underscores serious security risks in AI-powered fields such as molecular design and drug discovery.
A new study published on arXiv introduces BadGraph, a sophisticated backdoor attack strategy that exposes critical vulnerabilities in modern text-guided graph generation models. The research, conducted by Liang Ye (https://arxiv.org/search/cs?searchtype=author&query=Ye,+L), Shengqin Chen (https://arxiv.org/search/cs?searchtype=author&query=Chen,+S), and Jiazhu Dai (https://arxiv.org/search/cs?searchtype=author&query=Dai,+J), explores how latent diffusion models—central to cutting-edge AI graph generation—can be compromised through subtle data poisoning.
Text-guided graph generation systems translate descriptive language into graph structures that represent molecules, chemical compounds, or network layouts. While such models have revolutionized tasks like drug discovery and material design, their growing complexity has introduced new attack surfaces. BadGraph capitalizes on these weaknesses by embedding hidden backdoors into training datasets using specific textual triggers. When these triggers appear during inference, the compromised model generates attacker-defined subgraphs without degrading overall model performance. This dual behavior makes detection almost impossible under typical testing conditions.
The researchers conducted extensive experiments across four recognized benchmark datasets—PubChem, ChEBI-20, PCDes, and MoMu. Their findings revealed that even with less than a 10% poisoning rate, BadGraph achieved nearly a 50% attack success rate, while 24% poisoning yielded over 80% success. Crucially, the model’s accuracy and utility on untouched datasets remained virtually unchanged, demonstrating the stealth and power of the method. Ablation studies further showed that the backdoor is implanted primarily during the Variational Autoencoder (VAE) and diffusion stages of training, rather than the pretraining phase, offering key insights for designing countermeasures.
This discovery raises significant ethical and security questions about using latent diffusion and generative models in sensitive scientific and industrial applications. The authors stress the importance of developing more robust defense strategies—such as data validation pipelines and backdoor detection protocols—to safeguard AI systems deployed in fields like computational chemistry and biotechnology. BadGraph not only challenges current notions of AI reliability but also broadens our understanding of adversarial risks in machine learning pipelines.
Source:
Source:
Ye, L., Chen, S., & Dai, J. (2025). BadGraph: A Backdoor Attack Against Latent Diffusion Model for Text-Guided Graph Generation. arXiv:2510.20792 [cs.LG]. https://doi.org/10.48550/arXiv.2510.20792

