Text Encoding
From Leo's Notes
Last edited on 30 December 2021, at 02:01.
The contents of a text file containing 'Hello', as saved in various encodings (bytes shown in hexadecimal):
Description | Byte Representation (hex) |
---|---|
This is the traditional ANSI encoding. | 48 65 6C 6C 6F |
This is the Unicode (little-endian) encoding, that is UTF-16LE, with no BOM. | 48 00 65 00 6C 00 6C 00 6F 00 |
This is the Unicode (little-endian) encoding with BOM. The BOM (FF FE) serves two purposes: first, it tags the file as a Unicode document, and second, the order in which the two bytes appear indicates that the file is little-endian. | FF FE 48 00 65 00 6C 00 6C 00 6F 00 |
This is the Unicode (big-endian) encoding, that is UTF-16BE, with no BOM. Notepad does not support this encoding. | 00 48 00 65 00 6C 00 6C 00 6F |
This is the Unicode (big-endian) encoding with BOM. Notice that this BOM (FE FF) is in the opposite order from the little-endian BOM. | FE FF 00 48 00 65 00 6C 00 6C 00 6F |
This is UTF-8 encoding. The first three bytes are the UTF-8 encoding of the BOM. | EF BB BF 48 65 6C 6C 6F |
This is UTF-7 encoding. The first five bytes are the UTF-7 encoding of the BOM. Notepad does not support this encoding. | 2B 2F 76 38 2D 48 65 6C 6C 6F |
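
The byte sequences in the table can be reproduced with a short Python sketch. This is only an illustration under a few assumptions: `cp1252` stands in for "ANSI" (which really means whatever the system code page is), the five UTF-7 BOM bytes are copied from the table rather than computed, and the names `VARIANTS`, `encode_variants`, `hex_dump` and `sniff_bom` are made up for this example.

```python
import codecs
from typing import Dict, Optional

TEXT = "Hello"

# (label, BOM prefix, codec name)
# Assumption: "cp1252" stands in for ANSI; the real code page is system-dependent.
# Assumption: the UTF-7 BOM bytes (2B 2F 76 38 2D) are taken from the table above.
VARIANTS = [
    ("ANSI", b"", "cp1252"),
    ("UTF-16LE, no BOM", b"", "utf-16-le"),
    ("UTF-16LE, BOM", codecs.BOM_UTF16_LE, "utf-16-le"),
    ("UTF-16BE, no BOM", b"", "utf-16-be"),
    ("UTF-16BE, BOM", codecs.BOM_UTF16_BE, "utf-16-be"),
    ("UTF-8, BOM", codecs.BOM_UTF8, "utf-8"),
    ("UTF-7, BOM", b"+/v8-", "utf-7"),
]

# BOM prefixes recognised by sniff_bom; only the ones listed in the table.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (b"+/v8-", "UTF-7"),
]


def encode_variants(text: str) -> Dict[str, bytes]:
    """Encode text once per table row, prepending the BOM where the row has one."""
    return {label: bom + text.encode(codec) for label, bom, codec in VARIANTS}


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex, as in the table."""
    return data.hex(" ").upper()


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    for label, data in encode_variants(TEXT).items():
        print(f"{label:18} {hex_dump(data):38} BOM: {sniff_bom(data)}")
```

Files without a BOM make `sniff_bom` return `None`, which mirrors the point of the table: without the BOM the bytes alone do not say which encoding was used, so a reader has to guess or be told.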