WHATWG URL Standard
The WHATWG URL Standard is a living specification that defines URL parsing, serialisation and encoding for the modern web, intended to describe what browsers actually do rather than the abstract grammar of RFC 3986.
The standard is maintained by the Web Hypertext Application Technology Working Group (WHATWG). Unlike RFC 3986, which describes a generic URI grammar, the WHATWG document is written as an algorithm: a state machine that takes an input string plus an optional base URL and returns either a parsed URL record or failure. Browsers and the JavaScript `URL` object are expected to implement this algorithm exactly, so that every conforming implementation produces the same result for the same input.
The standard formalises a number of behaviours that diverge from RFC 3986 but match what the web actually does. It defines several percent-encode sets, so the same character can be encoded in one component and left alone in another: an apostrophe, for example, is percent-encoded in the query of an `http` or `https` URL but left as-is in the path and fragment. It tolerates input that strict URI parsers would reject, such as leading and trailing whitespace, backslashes in place of forward slashes for special schemes (`http`, `https` and a few others), and tabs or newlines inside the body of the URL, which are removed before parsing. It specifies how hosts are parsed, including IDNA processing for internationalised domain names and special handling of IPv4 and IPv6 literals.
Because it is a living standard, the document changes continuously rather than via numbered revisions. Implementers track it through its commit history and a public test suite, and many modern URL libraries outside the browser (including the `URL` class in Node.js) follow it to stay compatible with what users paste into address bars.
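The behaviours described above can be observed through the global `URL` class available in browsers and Node.js. The following is a minimal sketch rather than a conformance test: `parseOrNull` is a hypothetical helper introduced here for illustration, the `example.com` and `bücher.example` inputs are made-up sample values, and the results in the comments assume a runtime whose `URL` implementation follows the standard.

```typescript
// Hypothetical helper (not part of any API): wraps the global URL constructor,
// which throws a TypeError when the parsing algorithm returns failure.
function parseOrNull(input: string, base?: string): URL | null {
  try {
    return new URL(input, base);
  } catch (_err) {
    return null;
  }
}

const examples = {
  // Leading/trailing whitespace is stripped, the tab is removed, and
  // backslashes act as slashes because https is a special scheme.
  lenient: parseOrNull("  https:\\\\example.com\\a\tb\n")?.href ?? null,
  // -> "https://example.com/ab"

  // A relative reference is resolved against the optional base URL.
  relative: parseOrNull("../c?x=1", "https://example.com/a/b")?.href ?? null,
  // -> "https://example.com/c?x=1"

  // Without a base, a relative reference is a parse failure.
  noBase: parseOrNull("/relative/only"),
  // -> null

  // Different percent-encode sets per component: the apostrophe is
  // encoded only in the query of this special-scheme URL.
  encodeSets: parseOrNull("https://example.com/it's?q=it's#it's")?.href ?? null,
  // -> "https://example.com/it's?q=it%27s#it's"

  // Host parsing: IDNA turns the Unicode label into its ASCII form.
  idna: parseOrNull("https://bücher.example/")?.hostname ?? null,
  // -> "xn--bcher-kva.example"

  // Host parsing: hexadecimal and shortened IPv4 forms are normalised.
  ipv4: parseOrNull("http://0x7f.1/")?.hostname ?? null,
  // -> "127.0.0.1"

  // Host parsing: IPv6 literals are serialised in compressed form.
  ipv6: parseOrNull("http://[0:0:0:0:0:0:0:1]/")?.hostname ?? null,
  // -> "[::1]"
};

console.log(examples);
```

Returning `null` instead of letting the exception propagate mirrors the standard's own framing, in which the parser yields either a URL record or failure. A strict RFC 3986 parser would typically reject or treat differently several of these inputs, such as the backslash and embedded-tab cases.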