RSTR-REDOS-001 — nested quantifier catastrophic backtracking
Summary
A regex contains a nested quantifier of the form (X+)+,
(X*)+, or (X+)*. On crafted input like
aaaaaaaaaaaaaaaaaaab (lots of as followed by a b
that breaks the match), backtracking regex engines try
exponentially many ways to partition the input between the
inner and outer quantifier — O(2ⁿ) work that hangs the
thread.
This is regular-expression denial of service (ReDoS). A single user-controlled string can DoS a server thread for minutes.
Severity
High. Cheap to exploit, hard to recover from once the
process is locked up.
Languages
JavaScript, TypeScript, Python.
What rastray flags
Regex literals (/^(a+)+$/), new RegExp('^(a+)+$'), and
re.compile(r'^(a+)+$') containing the nested-quantifier
shape.
What rastray deliberately does not flag
- Simple single quantifiers:
/^a+$/,r'^[a-z]+$'. - Character classes:
/^[a-z0-9]+$/. - Alternation overlap (
(a|a)+,(ab|a)+) — same risk class but needs a regex parser to detect. Out of scope for this rule; a future slice may add it.
How to fix it
Three options:
-
Collapse the nested quantifier:
(a+)+→a+when both branches accept the same character class. -
Use a character class instead of alternation:
(a|b)+→(?:[ab])+. -
Use a linear-time regex engine. JS engines, PCRE, and Python's
reall use backtracking. Switch to:- Rust's
regexcrate - Google's RE2
- The
re2-wasmJS port - Python's
regexmodule inLOCALE-free mode is still backtracking; for guaranteed linear time use Rust regex via PyO3 orgoogle-re2.
- Rust's
-
Validate input length before matching as defence-in-depth. Even with a backtracking engine, capping input at 1 KB stops most DoS shapes from running long enough to matter.