Percent-Encoding
Percent-encoding is the URI mechanism that replaces a byte with `%` plus two hexadecimal digits so characters with special syntactic meaning, or bytes outside ASCII, can be carried inside a URL without ambiguity.
Percent-encoding, sometimes called URL encoding, is the standard way to embed arbitrary bytes inside a URI without colliding with its structural characters. Each byte to be escaped is replaced by `%` followed by two hexadecimal digits that give the byte's value. The digits are case-insensitive (`%2f` and `%2F` name the same byte), but RFC 3986 recommends that producers emit uppercase. A literal `%` must therefore itself be encoded as `%25`. Decoders reverse the process: scan for `%`, read the next two hex digits, and emit the resulting byte.
The mechanism was specified for URLs in RFC 1738 (1994), which treated URLs as sequences of US-ASCII characters and required octets with no printable ASCII representation, along with characters it classed as unsafe, to be escaped. RFC 3986 (2005) generalised the model to URIs and recommended, for new URI schemes, that character data first be converted to UTF-8 and that each resulting byte outside the unreserved set then be percent-encoded.
RFC 3986 partitions the characters it allows into an unreserved set (`A-Z a-z 0-9 - . _ ~`), which producers should not encode and normalizers should decode, and a reserved set of general delimiters and sub-delimiters, which must be encoded when they appear as data in a position where they would otherwise be read as structure. Two URIs that differ only in whether an unreserved character is percent-encoded are considered equivalent. Encoding or decoding a reserved character, by contrast, can change which resource is identified, which is why most percent-encoding bugs involve reserved characters being handled inconsistently between producer and consumer.
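
The encode and decode steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `UNRESERVED`, `HEX_DIGITS`, `percent_encode`, and `percent_decode` are hypothetical, and the encoder assumes its input is a single piece of data (for example one query value), so it escapes every byte outside the unreserved set, including reserved delimiters such as `/`.

```python
# Bytes of the RFC 3986 unreserved set: A-Z a-z 0-9 - . _ ~
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)
HEX_DIGITS = frozenset(b"0123456789ABCDEFabcdef")


def percent_encode(text: str) -> str:
    """Convert text to UTF-8, then escape every byte outside UNRESERVED."""
    out = []
    for byte in text.encode("utf-8"):
        if byte in UNRESERVED:
            out.append(chr(byte))
        else:
            # Two uppercase hex digits, as RFC 3986 recommends for producers.
            out.append(f"%{byte:02X}")
    return "".join(out)


def percent_decode(encoded: str) -> bytes:
    """Scan for '%', read two hex digits of either case, emit that byte."""
    raw = encoded.encode("ascii")
    out = bytearray()
    i = 0
    while i < len(raw):
        if raw[i] == 0x25:  # '%'
            pair = raw[i + 1:i + 3]
            if len(pair) != 2 or any(c not in HEX_DIGITS for c in pair):
                raise ValueError(f"malformed escape at offset {i}")
            out.append(int(pair, 16))
            i += 3
        else:
            out.append(raw[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    sample = "a b/ü%"
    encoded = percent_encode(sample)
    print(encoded)
    print(percent_decode(encoded).decode("utf-8"))
```

The decoder returns bytes rather than text because nothing in the escaped form says which character encoding the producer used; the caller decides whether to interpret the result as UTF-8. In practice Python's standard library already covers both directions with `urllib.parse.quote` and `urllib.parse.unquote_to_bytes`, so a hand-written version like this is mainly useful for seeing the mechanism.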