
TL;DR
When using AI-generated code, always verify package names. A paper (arXiv:2406.10279) explores package hallucination, and the term “Slopsquatting” was later coined to describe the related threat.
What Is Slopsquatting?
My definition:
Slopsquatting is the exploitation of plausible but non-existent package names hallucinated by AI.
This is a specific case of AI hallucination, called package hallucination: a large language model (LLM) generates code that recommends or references a package that does not actually exist.
Origin of the Term
Andrew Nesbitt introduced the term Slopsquatting after the paper’s publication:
Slopsquatting – when an LLM hallucinates a non-existent package name, and a bad actor registers it maliciously. The AI brother of typosquatting.
Why software engineers should care
- Widespread AI usage: Studies suggest up to 97% of developers use generative AI tools, with around 30% of code now being AI-generated. (1,2)
- Risk exposure: The more developers copy-paste code from AI tools, the higher the risk of introducing malicious, hallucinated, slopsquatted packages.
Can You Spot a Slopsquatted Package?
Sometimes, yes: hallucinated packages often differ significantly from real ones. Unlike typosquatting (e.g., “Electron” vs “electorn”), where the difference is a single typo, the differences here tend to be more obvious. The study measured Levenshtein distances between hallucinated and real package names and found that many hallucinated names are not simple one-character variations of existing ones. Still, spotting them relies on human attention, and humans make mistakes. Consider code reviews (e.g., as part of pull request reviews) and be transparent about code proposed by LLMs.
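As a rough illustration (a minimal sketch; the hallucinated name `request-helpers` below is made up for this example), a plain Levenshtein computation shows how a hallucinated name can sit much further from a real package than a typical typosquat:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# A typosquat is only a couple of single-character edits away from the real name ...
print(levenshtein("electron", "electorn"))         # 2
# ... while a hallucinated name (hypothetical example) can be much further away.
print(levenshtein("requests", "request-helpers"))  # 7
```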
Mitigation strategies
- Use commercial LLMs: They hallucinate less (see below).
- Lower the temperature: Reducing randomness can improve accuracy.
- Ask the model to verify its own outputs for hallucinations.
- Cross-check outputs against a known list of valid packages (a minimal sketch follows this list).
- Use RAG or fine-tuning: Retrieval-Augmented Generation (RAG) and supervised fine-tuning with real package data can reduce hallucinations.
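
As a concrete take on the cross-checking idea, the sketch below queries the public PyPI JSON API (https://pypi.org/pypi/&lt;name&gt;/json, which returns 404 for unregistered names) before anything gets installed; the suggested package list is hypothetical LLM output:

```python
import urllib.error
import urllib.request

def exists_on_pypi(package: str) -> bool:
    """Return True if the package name is registered on the public PyPI index."""
    url = f"https://pypi.org/pypi/{package}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # not registered: a hallucination candidate
        raise             # other HTTP errors deserve a human look

# Verify every dependency an LLM suggested before running `pip install`.
suggested = ["requests", "flask-jwt-helper", "numpy"]  # hypothetical LLM output
for name in suggested:
    verdict = "exists" if exists_on_pypi(name) else "NOT FOUND - do not install blindly"
    print(f"{name}: {verdict}")
```

Note that an existence check only catches names nobody has registered yet; a slopsquatted package would pass it because the attacker has already claimed the name. For that case, checking against an internal allow-list or a vetted lockfile is safer. The same pattern works for npm, whose registry also answers 404 for unknown names at https://registry.npmjs.org/&lt;name&gt;.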
More on Model Accuracy
- Open-source models hallucinate ~21.7% of package names on average.
- Commercial models do better, with an average of ~5.2%.
- Best performer: GPT-4
- Worst performer: LLaMA
While the study focused on JavaScript and Python, the same effect could occur in other languages and their package ecosystems as well.