Regex Lookaround: Lookahead and Lookbehind Explained
Lookaround assertions — positive and negative lookahead and lookbehind — let a regex test what surrounds the current position without consuming characters. Engine support varies: PCRE has all four; JavaScript got lookbehind in ES2018 (Safari 16.4+); Java relaxed its fixed-width rule in Java 9; Go's RE2 omits lookaround entirely to keep linear-time matching.
Lookaround is the umbrella term for four regex constructs that test a condition at the current position without advancing the match pointer. Because they consume no characters, they are zero-width assertions, close cousins of anchors like `^`, `$`, and `\b`. After a lookaround succeeds, the engine is exactly where it started, free to match the rest of the pattern over the same characters the assertion just inspected.

The four flavors pair direction with polarity. Positive lookahead `(?=...)` succeeds if the inner pattern matches starting at the current position; negative lookahead `(?!...)` succeeds if it does not. Positive lookbehind `(?<=...)` and negative lookbehind `(?<!...)` apply the same logic to the text immediately before the current position.

A classic example is a password policy that requires a digit somewhere in the string: in `^(?=.*\d).{8,}$`, a single lookahead scans ahead for `\d`, and because it consumes nothing, `.{8,}` still measures the length from the start of the string. Negative lookahead expresses "X not followed by Y": `foo(?!bar)` matches `foo` only when `bar` does not come next, which is awkward to write with ordinary alternation and character classes. Lookbehind sharpens boundary work: `(?<=\$)\d+` grabs the number after a dollar sign without including the sign in the match.

Support varies sharply across engines. Perl and the PCRE family have offered all four forms since the late 1990s. JavaScript only gained lookbehind in ES2018; V8 (Chrome, Node) shipped it first, SpiderMonkey followed in Firefox 78, and JavaScriptCore landed full support in Safari 16.4. Cross-browser code that targets older Safari must therefore avoid `(?<=...)` and `(?<!...)` or work around them, for example by consuming the prefix and capturing only the part that is wanted. Java historically required fixed-width lookbehind so the engine knew how far to step back; Java 9 loosened this to allow bounded quantifiers inside the assertion.
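The example patterns above can be tried out directly. The following is a minimal sketch in Python, chosen here only for illustration (the text above does not discuss Python); its standard `re` module accepts all four forms but requires fixed-width lookbehind. The sample strings and the names `PASSWORD`, `FOO_NOT_BAR`, `AFTER_DOLLAR`, `NOT_AFTER_DOLLAR`, and `demo` are hypothetical.

```python
import re

# Positive lookahead: require a digit somewhere, then match 8+ characters.
PASSWORD = re.compile(r"^(?=.*\d).{8,}$")

# Negative lookahead: "foo" that is not immediately followed by "bar".
FOO_NOT_BAR = re.compile(r"foo(?!bar)")

# Positive lookbehind: digits preceded by "$"; the sign is not in the match.
AFTER_DOLLAR = re.compile(r"(?<=\$)\d+")

# Negative lookbehind: digits not preceded by "$" or by another digit.
# (Excluding a preceding digit stops the engine from matching the tail
# of a number that does follow a dollar sign.)
NOT_AFTER_DOLLAR = re.compile(r"(?<![\$\d])\d+")


def demo():
    """Run each pattern on a small, made-up sample input."""
    text = "cost $42, qty 7, fee $3"
    return {
        "password_ok": bool(PASSWORD.match("hunter2hunter")),
        "password_no_digit": bool(PASSWORD.match("hunterhunter")),
        "foo_starts": [m.start() for m in FOO_NOT_BAR.finditer("foobar foobaz foo")],
        "prices": AFTER_DOLLAR.findall(text),
        "non_prices": NOT_AFTER_DOLLAR.findall(text),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value)
```

In each case the lookaround only decides whether a match is allowed at that position; the text it inspects never becomes part of the returned match.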
Go's RE2 engine omits lookaround entirely (and backreferences too): its automaton-based design guarantees linear-time matching, and it leaves out features that would reintroduce catastrophic backtracking.

Performance follows from these design choices. Lookahead in backtracking (NFA) engines is cheap because the engine simply attempts a sub-match and rewinds. Fixed-width lookbehind is also cheap, since the engine steps back a known number of characters and tries the inner pattern once. Variable-width lookbehind, where supported, can be expensive: naive implementations retry the inner pattern at every prior offset. Engines like RE2 sidestep the issue by refusing the feature; PCRE2 and Java implement it with care but warn against unbounded patterns in hot paths.
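When an engine lacks lookbehind, or accepts only fixed-width lookbehind, the usual workaround is to consume the prefix and capture the part that is wanted. The sketch below illustrates this in Python, again only as an assumed example language; `TEXT` and the helper `accepts` are hypothetical names. It also shows Python's `re` module rejecting an unbounded lookbehind at compile time.

```python
import re

TEXT = "cost $42, qty 7, fee $3"

# With lookbehind: the match itself is just the digits.
with_lookbehind = re.findall(r"(?<=\$)\d+", TEXT)

# Without lookaround: consume the "$" and capture only the digits.
# This pattern uses no lookaround at all, so it stays within the
# feature set of engines that omit lookaround.
without_lookbehind = re.findall(r"\$(\d+)", TEXT)


def accepts(pattern: str) -> bool:
    """Return True if Python's re module compiles the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Fixed-width lookbehind (exactly 4 characters) compiles.
fixed_width_ok = accepts(r"(?<=USD )\d+")

# Unbounded lookbehind does not compile in Python's re module.
variable_width_ok = accepts(r"(?<=USD\s+)\d+")

if __name__ == "__main__":
    print(with_lookbehind, without_lookbehind)
    print(fixed_width_ok, variable_width_ok)
```

The capture-group form is not a drop-in replacement in every situation: the consumed prefix becomes part of the overall match, which matters for replacements and for matches that would otherwise share a boundary.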