Computing

Character Codes / Sets

A character in a specified language can be coded as a string of bits (binary digits).

Fixed length codes

5-bit Code

  • In the early days, English was the primary language used for communications in most parts of the world.
  • The English alphabet consists of 26 letters.
  • 5 bits (which can represent 32 symbols) are enough to represent either
    (a) the upper case letters, or
    (b) the lower case letters,
    but not both at once, since that would need 52 codes.
  • To represent both cases, a “Shift” character was introduced in some systems.
    Alternatively, one can use a code with more bits per character.
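
The shift idea can be sketched in a few lines of Python. This is only an illustration: the code assignments (0 to 25 for letters, 26 for space, 30 and 31 for the two shift codes) are invented for the example and do not correspond to any historical 5-bit code.

```python
import string

# Hypothetical 5-bit code table (invented for this sketch):
# 0-25 = letters A-Z, 26 = space, 30 = shift to lower, 31 = shift to upper.
LETTERS = string.ascii_uppercase
SPACE, TO_LOWER, TO_UPPER = 26, 30, 31


def encode(text):
    """Encode letters and spaces as 5-bit values, adding a shift code on each case change."""
    codes = []
    upper = True  # assumption: both sides start in upper-case mode
    for ch in text:
        if ch == " ":
            codes.append(SPACE)
            continue
        if ch.upper() not in LETTERS:
            raise ValueError(f"cannot encode {ch!r} in this 5-bit sketch")
        if ch.isupper() != upper:
            upper = ch.isupper()
            codes.append(TO_UPPER if upper else TO_LOWER)
        codes.append(LETTERS.index(ch.upper()))
    return codes


def decode(codes):
    """Reverse encode(): track the current case and apply it to each letter code."""
    out = []
    upper = True
    for code in codes:
        if code == TO_UPPER:
            upper = True
        elif code == TO_LOWER:
            upper = False
        elif code == SPACE:
            out.append(" ")
        else:
            letter = LETTERS[code]
            out.append(letter if upper else letter.lower())
    return "".join(out)


print(encode("Hi"))          # [7, 30, 8]
print(decode(encode("Hi")))  # Hi
```

Every value stays below 32, so each one still fits in 5 bits. The cost is an extra code whenever the case changes.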

6-bit Code

  • Some early computers (e.g. ICL1902) used a 6-bit code.
  • It can represent 64 characters.
  • A subset is reserved for control codes.
  • The remainder is used for printable characters (letters, numerals, punctuation).
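
The figures quoted for each code length follow from the fact that n bits give 2^n distinct patterns. A minimal check, where `capacity` is a helper name made up for this example:

```python
def capacity(bits):
    """Number of distinct symbols a fixed-length code of `bits` bits can represent."""
    return 2 ** bits


for bits in (5, 6, 7, 8):
    print(f"{bits}-bit code: {capacity(bits)} symbols")
```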

7-bit Code

  • ASCII (American Standard Code for Information Interchange)
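
As a small illustration, Python's built-in `ord` returns a character's code, which for ASCII characters is the ASCII value. The helper `to_7bit` below is a name invented for this sketch.

```python
def to_7bit(ch):
    """Return the 7-bit ASCII pattern for a single character as a string of 0s and 1s."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is outside the 7-bit ASCII range")
    return format(code, "07b")


for ch in ("A", "a", "0", " "):
    print(repr(ch), ord(ch), to_7bit(ch))

# Upper and lower case letters differ by 32, that is, by a single bit.
print(ord("a") - ord("A"))  # 32
```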

8-bit Code

  • EBCDIC (Extended Binary Coded Decimal Interchange Code)
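
The same text has different byte values in ASCII and EBCDIC. The sketch below uses Python's `cp037` codec, which is one of several EBCDIC code pages; the choice of code page is an assumption made for the example.

```python
text = "A1"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")  # assumption: code page 037 stands in for "EBCDIC"

print(ascii_bytes.hex())   # 4131
print(ebcdic_bytes.hex())  # c1f1

# Decoding with the matching codec recovers the original text.
print(ebcdic_bytes.decode("cp037"))  # A1
```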

Character Sets

  • With the widespread use of computer technology, character sets were developed for many other languages.
  • Some languages (notably Chinese) have thousands of characters and so need more bits per character than a single byte provides.

Variable Length Codes

  • For large character sets, fixed-length coding gave way to variable-length coding.
  • The most frequently used characters in a language are represented by one byte, and the less frequently used characters by two or more bytes.
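
A toy variable-length scheme along these lines is sketched below. It is not any real encoding: the layout (a first byte below 0x80 is a one-byte code, otherwise the character takes two bytes) and all function names are invented for the illustration.

```python
from collections import Counter


def rank_characters(sample):
    """Return the characters of `sample`, most frequent first."""
    return [ch for ch, _ in Counter(sample).most_common()]


def encode(text, ranked, short_slots=128):
    """One byte for the `short_slots` most frequent characters, two bytes for the rest."""
    index = {ch: i for i, ch in enumerate(ranked)}
    out = bytearray()
    for ch in text:
        n = index[ch]
        if n < short_slots:
            out.append(n)                # one byte, high bit clear
        else:
            n -= short_slots
            if n >= 1 << 15:
                raise ValueError("too many characters for this two-byte sketch")
            out.append(0x80 | (n >> 8))  # first byte, high bit set
            out.append(n & 0xFF)         # second byte
    return bytes(out)


def decode(data, ranked, short_slots=128):
    """Reverse encode() by checking the high bit of each leading byte."""
    out = []
    i = 0
    while i < len(data):
        first = data[i]
        if first < 0x80:
            out.append(ranked[first])
            i += 1
        else:
            n = (((first & 0x7F) << 8) | data[i + 1]) + short_slots
            out.append(ranked[n])
            i += 2
    return "".join(out)


ranked = rank_characters("aaaabbbc")        # ['a', 'b', 'c']
packed = encode("abc", ranked, short_slots=2)
print(packed.hex())                         # 00018000
print(decode(packed, ranked, short_slots=2))
```

With `short_slots=2`, the two most frequent characters take one byte each and the rare `c` takes two.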

Unicode

  • Unicode aims to provide a single standard set of character codes covering the world's languages.
  • Formal and informal bodies help develop, propose and approve additions to Unicode.
  • UTF (Unicode Transformation Format) defines how Unicode character codes are encoded as sequences of fixed-size units (e.g. 8-bit units in UTF-8, 16-bit units in UTF-16).
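
The effect of the two formats can be seen with Python's built-in codecs. The helper name `encoded_sizes` and the sample characters are choices made for this example.

```python
def encoded_sizes(ch):
    """Return (bytes in UTF-8, bytes in UTF-16) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


for ch in ("A", "é", "中", "😀"):
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}  UTF-8: {utf8_len} byte(s)  UTF-16: {utf16_len} byte(s)")
```

Both formats are variable length: UTF-8 uses one to four bytes per character, and UTF-16 uses one or two 16-bit units.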

Standards and Recommendations

Standards may be

  • De Jure (set by law or by an official standards body)
  • De Facto (set by common usage).

Standards must be followed for Compliance.

Recommendations should be followed but are not mandatory, so implementations may vary.
