Regex Greedy vs Lazy Quantifiers
By default, regex repetition operators (*, +, ?, {n,m}) are greedy: they consume as much input as possible and then backtrack only if the rest of the pattern fails. Lazy variants written with a trailing ? (*?, +?, ??, {n,m}?) do the opposite, matching the minimum and expanding outward. Possessive quantifiers (*+, ++, ?+) refuse to backtrack at all, which prevents the pathological case known as catastrophic backtracking.
Most regex engines built on backtracking (Perl, PCRE, Java, .NET, JavaScript, Python's re module) treat the standard repetition operators "*", "+", "?" and "{n,m}" as greedy quantifiers. Given a string, the engine first tries to match the repeated token as many times as it can, then hands control to whatever follows in the pattern. If that next piece fails, the engine gives one character back and retries — a process called backtracking. The classic gotcha is parsing HTML-like text. The pattern "<.*>" against the input "<a><b>" does not match just "<a>". The greedy ".*" swallows everything up to the end of the string, then backtracks until the final ">" lines up, leaving the whole "<a><b>" as the match. Appending "?" to a quantifier flips it to lazy quantifier (also called non-greedy or reluctant) mode. "<.*?>" starts by matching zero characters and expands one at a time until the trailing ">" is satisfied, so the match becomes "<a>" — the nearest closing delimiter rather than the farthest. As a rule of thumb, greedy is right for most patterns and is usually the more efficient choice; lazy is appropriate when the goal is explicitly "shortest match up to the next delimiter." Even lazy matching still uses backtracking under the hood, so it does not by itself cure pathological patterns. Possessive quantifiers — "*+", "++", "?+", "{n,m}+" — are a third mode supported by Java, PCRE, Boost, Ruby, and Python 3.11+. They behave like greedy quantifiers but discard backtracking information once they finish matching, much like an Atomic Group (Regex) written "(?>...)". This is the standard defense against Catastrophic Backtracking: a pattern such as "(a+)+b" applied to "aaaaaaaaaaaaaa!" can take seconds or minutes because the engine tries every way to split the a's between the inner and outer "+" before concluding there is no "b". Rewriting it as "(a+)++b" or "(?>a+)+b" makes the same search fail immediately. When backtracking guarantees are required — for example in user-supplied filters at internet scale — the safer option is a different engine entirely. RE2 (Regex Engine), used by Go's standard library and by services such as Cloudflare's web application firewall, compiles patterns to a finite automaton with linear-time matching and no backtracking, at the cost of dropping features (such as backreferences) that genuinely require it.