ASCII, Unicode: mappings of characters to integers (codepoints)
Encoding: how strings of characters are stored in files and in RAM at run-time
  UTF-8, UTF-16; little endian & big endian (byte order)

File contents, by byte:
  0x48 -- H
  0x65 -- e
  0x72 -- r
  0x65 -- e
  0x20 -- (space)
  ...

od -x reads this file in 16-bit (2-byte) chunks.
First chunk: 0x48, 0x65.  Does this represent the 16-bit integer 0x4865 or 0x6548?
  big endian    -- Motorola (early Macs)
  little endian -- Intel (PCs, later Macs)
  MIPS          -- configurable: you choose!

My file in UTF-16 LE:
  H    \0   e    \0   r    \0   ...
  0x48 0x00 0x65 0x00
  01001000 00000000 01100101 00000000

Viewed as a sequence of little-endian 16-bit ints:
  0000000001001000 -- 0x0048
  0000000001100101 -- 0x0065

UTF-8 (early 90's, Ken Thompson -- the same Ken Thompson as "Reflections on Trusting Trust", 1983)

Why UTF-8? Two problems with UTF-16:
  - null-termination problem: ASCII characters pick up 0x00 bytes, which C-style string handling treats as the end of the string
  - file size problem: mostly-ASCII text doubles in size

é -- codepoint 0xE9 = 11101001 binary = 233 decimal
  Split the payload bits as 00011 101001 and put them into the two-byte UTF-8 template 110xxxxx 10xxxxxx:
  11000011 10101001  =  0xC3 0xA9

BOM -- byte order mark: the codepoint U+FEFF written at the start of a file so a reader can tell which byte order was used

ADVICE WHEN WORKING WITH ENCODED TEXT
1. Read from the data source, and immediately decode into the "unicode" type of string.
2. Do whatever you need to do, always using the unicode type.
3. Just before output, encode your unicode strings into their target encoding.
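
To make the byte-order discussion above concrete, here is a small Python sketch (Python 3.8+, for the separator argument to bytes.hex) that encodes the example string "Here " in UTF-16 LE and BE and prints the raw bytes. The BOM produced by the plain "utf-16" codec depends on the machine's native byte order.

    # UTF-16 byte layouts for the example string from the byte dump above
    text = "Here "

    # UTF-16 LE: each ASCII character becomes its codepoint byte followed by 0x00
    print(text.encode("utf-16-le").hex(" "))   # 48 00 65 00 72 00 65 00 20 00

    # UTF-16 BE: the same 16-bit values with the bytes swapped
    print(text.encode("utf-16-be").hex(" "))   # 00 48 00 65 00 72 00 65 00 20

    # plain "utf-16" prepends a BOM and uses native byte order
    print(text.encode("utf-16").hex(" "))      # ff fe 48 00 ... on a little-endian machine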
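
The é example can be checked the same way. This sketch redoes the bit arithmetic by hand, then compares the result against Python's built-in UTF-8 codec.

    # UTF-8 encoding of é (codepoint U+00E9 = 233), as worked out above
    ch = "é"
    cp = ord(ch)
    print(hex(cp), bin(cp))                  # 0xe9 0b11101001

    # two-byte UTF-8 template: 110xxxxx 10xxxxxx
    # high 5 payload bits -> first byte, low 6 bits -> second byte
    byte1 = 0b11000000 | (cp >> 6)           # 110xxxxx
    byte2 = 0b10000000 | (cp & 0b111111)     # 10xxxxxx
    print(hex(byte1), hex(byte2))            # 0xc3 0xa9

    # the built-in codec agrees
    print(ch.encode("utf-8").hex())          # c3a9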
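
The three-step advice translates into only a few lines of Python. In the sketch below, the file names and the Latin-1 source encoding are made-up examples, not anything specified in the notes; the point is only the shape of the workflow: decode on input, work with str (Python 3's unicode type) in the middle, encode only at output.

    # step 1: read raw bytes and decode into unicode immediately
    with open("input_latin1.txt", "rb") as f:
        text = f.read().decode("latin-1")

    # step 2: do all processing on the unicode string
    text = text.upper()

    # step 3: encode into the target encoding only at output time
    with open("output_utf8.txt", "wb") as f:
        f.write(text.encode("utf-8"))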