Character Codes

Updated: August 5, 2019

A character in a specified language can be coded as a string of bits (binary digits).

Coding the English alphabet

In the early days of computing, English was the primary language used for communications in most parts of the world.

  • The English alphabet consists of 26 letters.
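  • 26 symbols are needed to code upper case characters OR lower case characters.
    At least 5 bits are needed for standard coding, since 2^5 = 32 codes cover 26 symbols (4 bits give only 16).
  • 52 symbols are needed to code BOTH upper case and lower case characters.
    At least 6 bits are needed for standard coding, since 2^6 = 64 codes cover 52 symbols (5 bits give only 32).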
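
The bit counts above follow from a simple rule: n bits give 2^n distinct codes, so the code needs the smallest n for which 2^n is at least the number of symbols. A minimal sketch of that arithmetic in Python (the function name bits_needed is only illustrative, not from any standard):

```python
def bits_needed(symbol_count: int) -> int:
    """Smallest number of bits n such that 2**n >= symbol_count."""
    if symbol_count < 1:
        raise ValueError("symbol_count must be at least 1")
    # (symbol_count - 1).bit_length() is an exact integer form of ceil(log2(symbol_count));
    # a single symbol still occupies one bit in this sketch.
    return max(1, (symbol_count - 1).bit_length())


for n in (26, 52):
    b = bits_needed(n)
    print(f"{n} symbols -> {b} bits ({2 ** b} possible codes)")
```

The unused codes (6 of 32 for a single-case alphabet, 12 of 64 for both cases) leave a little room for digits or punctuation, but not much.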

Fixed length coding

All the characters in a character set are coded using a fixed number of bits.

5-bit, 6-bit, 7-bit and 8-bit character codes were developed and used.

Some early standards include

  • 7-bit ASCII (American Standard Code for Information Interchange)
  • 8-bit EBCDIC (Extended Binary Coded Decimal Interchange Code).
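
To make fixed length coding concrete, the sketch below prints the bit patterns of the same text in 7-bit ASCII and in 8-bit EBCDIC. It assumes Python's built-in "cp037" codec as a representative EBCDIC code page (EBCDIC exists in several variants), and the helper name to_fixed_width_bits is hypothetical.

```python
def to_fixed_width_bits(text: str, encoding: str, width: int) -> list[str]:
    """Encode text and show every character as a bit string of the same width."""
    return [format(byte, f"0{width}b") for byte in text.encode(encoding)]


# Every ASCII character fits in 7 bits; every EBCDIC character uses 8 bits.
ascii_bits = to_fixed_width_bits("Hi", "ascii", 7)
ebcdic_bits = to_fixed_width_bits("Hi", "cp037", 8)  # cp037: one EBCDIC variant (assumption)

print("ASCII :", ascii_bits)
print("EBCDIC:", ebcdic_bits)
```

The two standards assign different bit patterns to the same letters, which is why text has to be converted when it moves between systems that use different codes.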

Coding for other languages

  • With the widespread use of computer technology, character sets for various other languages were developed.
  • Some languages (notably Chinese) have thousands of characters and therefore require longer bit strings per character.
  • Fixed length coding gave way to variable length coding.

Variable length coding

  • The characters in a character set are coded using a variable number of bits.
  • The most frequently used characters are represented by one byte; less frequently used characters are represented by two or more bytes.
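
As an illustration, UTF-8 is one widely used variable length code. Choosing it here is an assumption made for the example, and the sample characters are arbitrary. The sketch counts how many bytes each character occupies:

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the single character ch occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = [
    "A",           # Latin capital letter A
    "\u00e9",      # e with acute accent
    "\u4e2d",      # a common Chinese character
    "\U0001F600",  # an emoji outside the 16-bit range
]

lengths = [utf8_length(ch) for ch in samples]
for ch, n in zip(samples, lengths):
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```

Plain English text stays at one byte per character, while characters from larger character sets take more.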

Standards and Recommendations

  • De jure standards are set by law or by an official standards body.
  • De facto standards are set by common usage.
  • Standards must be followed for compliance.
  • Recommendations, on the other hand, should be followed but are not mandatory, so implementations can vary.

Unicode and UTF

  • Unicode aims to assign a standard code (a code point) to every character of each supported language, to facilitate processing.
  • Formal and informal institutions help develop, propose and approve new Unicode characters and scripts.
  • UTF (Unicode Transformation Format) encodes Unicode code points as sequences of fixed-size code units (e.g. 8-bit units in UTF-8, 16-bit units in UTF-16); a character may take one or more units.
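
The relationship between a Unicode code point and its UTF encodings can be sketched as follows. The helper name describe is hypothetical, and "utf-16-be" (big-endian, no byte order mark) is chosen only so that the hexadecimal output is easy to read.

```python
def describe(ch: str) -> dict:
    """Show one character's code point and its UTF-8 / UTF-16 encodings as hex."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf8": ch.encode("utf-8").hex(),
        "utf16": ch.encode("utf-16-be").hex(),
    }


for ch in ("A", "\u4e2d", "\U0001F600"):
    print(describe(ch))
```

The same code point yields different byte sequences under each format: "A" is one byte in UTF-8 but two in UTF-16, while the emoji needs four bytes in both (two 16-bit units in UTF-16).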
