CSCI/MATH 308  Binary Encoding Schemes  (This is copied from some book whose reference was lost years ago--sorry)

 

Although the binary number system has many practical advantages and is widely used in digital computers, in many cases it is convenient to work with the decimal number system, especially when the communication between man and machine is extensive, since most numerical data generated by man are in terms of decimal numbers.  To simplify the communication problem between man and machine, a number of codes have been devised so that the decimal digits are represented by sequences of binary digits.

 

Weighted codes

In order to represent the 10 decimal digits 0,1,...,9 it is necessary to use at least four binary digits.  Since there are 16 combinations of four binary digits of which only 10 combinations are used, it is possible to form a very large number of distinct codes.  Of particular importance is the class of weighted codes, whose main characteristic is that each binary digit is assigned a “weight,” and for each group of four bits, the sum of the weights of those binary digits whose value is 1 is equal to the decimal digit which they represent.  In other words, if w1, w2, w3, and w4 are the weights of the binary digits and x1, x2, x3, and x4 are the corresponding digit values, then the decimal digit N=w4x4 + w3x3 + w2x2 + w1x1 is represented by the binary sequence x4x3x2x1.  A sequence of binary digits which represents a decimal digit is called a code word.  Thus the above sequence x4x3x2x1 is the code word for N.  A number of weighted, four digit binary codes are shown in the following table.

 

Decimal digit

 

8

 

4

 

2

 

1

 

2

 

4

 

2

 

1

 

6

 

4

 

2

 

-3

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

1

0

0

0

1

0

1

0

1

2

0

0

1

0

0

0

1

0

0

0

1

0

3

0

0

1

1

0

0

1

1

1

0

0

1

4

0

1

0

0

0

1

0

0

0

1

0

0

5

0

1

0

1

1

0

1

1

1

0

1

1

6

0

1

1

0

1

1

0

0

0

1

1

0

7

0

1

1

1

1

1

0

1

1

1

0

1

8

1

0

0

0

1

1

1

0

1

0

1

0

9

1

0

0

1

1

1

1

1

1

1

1

1

The binary digits in the first code in the above table are assigned the weights 8, 4, 2, 1.  As a result of this weight assignment, the code word that corresponds to each decimal digit is the binary equivalence of that digit; e.g., 5 is represented by 0101, and so on.  This code is known as the BCD (binary-coded-decimal) code.  For each of the codes in the table, the decimal digit that corresponds to a given code word is equal to the sum of the weights in those binary positions which are 1’s.  Thus, in the second code, where the weights are 2, 4, 2, 1, decimal 5 is represented by 1011, corresponding to the sum 2*1 + 4*0 + 2*1 + 1*1 = 5.  The weights assigned to the binary digits may also be negative, as is shown by the code (6,4,2,-3).  In this code decimal 5 is represented by 1011, since 6*1 + 4*0 + 2*1 - 3*1 = 5.

It is apparent that the representations of some decimal numbers in the (2,4,2,1) and (6,4,2,-3) codes are not unique.  For example, in the (2,4,2,1) code, decimal 7 may be represented by 1101 as well as by 0111.  Adopting the representations shown in the above table causes the codes to become self-complementing.  A code is said to be self-complementing if the code word of the 9’s complement of N, i.e., 9-N, can be obtained from the code word of N by interchanging all the 1’s and 0’s.  For example, in the (6,4,2,-3) code, decimal 3 is represented by 1001, while decimal 6 is represented by 0110.  In the (2,4,2,1) code, decimal 2 is represented by 0010, while decimal 7 is represented by 1101.  Note that the BCD code is not self-complementing.  It can be shown that a necessary condition for a weighted code to be self-complementing is that the sum of the weights must equal 9.  There exist only four positively weighted self-complementing codes, name, (2,4,2,1), (3,3,2,1), (4,3,1,1), (5,2,1,1).  In addition, there exist 13 self-complementing codes with positive and negative weights.

 

 

 

 

 

Nonweighted codes

There are many nonweighted binary codes, two of which are shown in the following table.   The Excess-3 code is formed by adding 0011 to each BCD code word.   Thus, for example, the representation of decimal 7 in Excess-3 is given by 0111 + 0011 = 1010.  The Excess-3 code is a self-complementing code, and it possesses a number of properties that made it practical in earlier decimal computers. 

 

Decimal Digit

 

Excess-3

 

Cyclic

0

0011

0000

1

0100

0001

2

0101

0011

3

0110

0010

4

0111

0110

5

1000

1110

6

1001

1010

7

1010

1000

8

1011

1100

9

1100

0100

In many practical applications, e.g., analog-to-digital conversion, it is desirable to use codes in which all successive code words differ in only one digit.  Codes that have such a property are referred to as cyclic codes.  The second code in the above table is an example of such a code.  (Note that in this, as in all cyclic codes, the code word representing the decimal digits 0 and 9 differ in only one digit.)  A particularly important cyclic code is the Gray code.  A four-bit Gray code is shown in the following table.

 

Decimal Number

 

Gray

 

Binary

0

0000

0000

1

0001

0001

2

0011

0010

3

0010

0011

4

0110

0100

5

0111

0101

6

0101

0110

7

0100

0111

8

1100

1000

9

1101

1001

10

1111

1010

11

1110

1011

12

1010

1100

13

1011

1101

14

1001

1110

15

1000

1111

The feature that makes this cyclic code useful is the simplicity of the procedure for converting from the binary number system into the Gray code.

Let gn...g2g1g0 denote a code word in the (n+1)st-bit Gray code, and let bn...b2b1b0 designate the corresponding binary number, where the subscripts 0 and n denote the least significant and most significant digits, respectively.  Then, the ith digit gi can be obtained from the corresponding binary number as follows:

gi = bi Å bi+1   0 < I < n - 1

gn = bn

where the symbol Å denotes the modulo-2 sum, which is defined as follows:

0 Å 0 = 0                0 Å 1 = 1                1 Å 0 = 1                1 Å 1 = 0

For example, the Gray code word which corresponds to the binary number 101101 is found to be 111011 in the following manner:

b5                            b4                            b3                            b2                            b1                            b0

1                              0                              1                              1                              0                              1

Using the above rules, 

g5 = b5 = 1                                                              g2 = b3Å b2 = 1

g4 = b5Å b4 = 1                                                      g1 = b2Å b1 = 1

g3 = b4Å b3 = 1                                                      g0 = b1Å b0 = 1

resulting in the following gray code word.

       1                        1                             1                              0                              1                              1                             

       g5                     g4                            g3                            g2                            g1                            g0 

To convert from Gray code to binary, start with the leftmost digit and proceed to the least significant digit, making bi = gi if the number of 1’s preceding  gi  is even, and making bi = gi¢ if the number of 1’s preceding gi  is odd. (Note that zero 1’s is an even number of 1’s.)  For example, the (Gray) code word 1001011 represents the binary number 1110010.  The proofs that the preceding conversion procedures indeed work are left to the reader.

The n-bit Gray code is a member of a class called reflected codes.  The term “reflected” is used to designate codes which have the property that the n-bit code can be generated by reflecting the (n-1)st-bit code, as illustrated here (left to right is the 1 bit, the two bit, the three bit, and the four bit codes.

0          0 0       0 00              0 000

1          0 1       0 01              0 001

            1 1       0 11              0 011

            1 0       0 10              0 010

                        1 10              0 110

                        1 11              0 111

                        1 01              0 101

                        1 00              0 100

                                                             1 100

                                                             1 101

                                                             1 111

                                                             1 110

                                                             1 010

                                                             1 011

                                                             1 001

                                                             1 000

ERROR DETECTION AND CORRECTION

 

In the codes presented so far, each code word consists of four binary digits, which is the minimum number needed to represent the 10 decimal digits.  Such codes, although adequate for the representation of the decimal digits, are very sensitive to transmission errors that may occur because of equipment failure or noise in the transmission channel.  In any practical system there is always a finite probability of the occurrence of a single error.  the probability that two or more errors will occur simultaneously, although nonzero, is substantially smaller.  We therefore restrict our discussion mainly to the detection and correction of single errors..

 

 

error-detecting codes

 

In a four-bit binary code, the occurrence of a single error in one of the binary digits may result in another, incorrect but valid, code word.  For example, in the BCD code, if an error occurs in the least significant digit of 0110, the code word 0111 results, and since it is a valid code word, it is incorrectly interpreted by the receiver.  If a code possesses the property that the occurrence of any single error transforms a valid code word into an invalid code word, it is said to be a (single)-error-detecting code.  Two error-detecting codes are shown in the following table.

 

 

Decimal

Digit

 

Even-parity BCD

8421p

 

2-out-of-5

01247

0

00000

00011

1

00011

11000

2

00101

10100

3

00110

01100

4

01001

10010

5

01010

01010

6

01100

00110

7

01111

10001

8

10001

01001

9

10010

00101

The error detection in either of the codes of the above table is accomplished by a parity check.  The basic idea in the parity check is to add an extra digit to each code word of a given code, so as to make the number of 1’s in each code word either odd or even.  In the codes of the above table we have used even parity.  The even-parity BCD code is obtained directly from the BCD code shown previously.  The added bit, denoted p, is called parity bit.  The 2-out-of-5 code consists of all 10 possible combinations of two 1’s in a five-bit code word.  With the exception of the code word for decimal 0, the 2-out-of-5 code is a weighted code and can be derived from the (1,2,4,7) code.

In each of the codes in the above table, the number of 1’s in a code word is even.  Now, if a single error occurs, it transforms the valid code word into an invalid one, thus making the detection of the error straightforward.  Although the parity check is intended only for the detection of single errors, it in fact detects any odd number of errors and some even number of errors.  For example, if the code word 10100 is received in an even parity BCD message, it is clear that the message is erroneous, although the parity check is satisfied.  We cannot determine, however, the original transmitted word.

In general, to obtain an n-bit error-detecting code, no more than half of the possible 2n combinations of digits can be used.  The code words are chosen in such a manner that, in order to change one valid code word into another valid code word, at least two digits must be complemented.  In the case of four-bit codes, this constraint means that only 8 valid code words can be formed of the 16 possible combinations.  Thus, to obtain an error-detecting code for the 10 decimal digits, at least 5 binary digits are needed.  It is useful to define the distance between two code words as the number of digits that must change in one word so that the other word results.  For example, the distance between 1010 and 0100 is three, since the two code words differ in three bit positions.  The minimum distance of a code is the smallest number of bits in which any two code words differ.  Thus the minimum distance of the BCD or the Excess-3 codes is one, while that of the codes in the above table is two.  Clearly, a code is an error-detecting code if and only if its minimum distance is two or more.

 

Error-correcting codes

 

For a code to be error-correcting, its minimum distance must be further increased.  For example, consider the three-bit code which consists of only two valid code words, 000 and 111.  If a single error occurs in the first code word, it can be changed to 001, 010, or 100.  The second code word can be changed due to a single error to 110, 101, or 011.  Note that in each case the invalid code words are different.  Clearly, this code is error-detecting, since its minimum distance is three.  Moreover, if we assume that only a single error can occur, then this error can be located and corrected, since every error results in an invalid code word that can be associated with only one of the valid code words.  Thus the two code words 000 and 111 constitute an error-correcting code whose minimum distance is three.  In general, a code is said to be an error-correcting code if the correct code word can always be deduced from the erroneous word.  In this section we shall discuss a single type of error-correcting codes, known as the Hamming codes.

If the minimum distance of a code is three, then any single error changes a valid code word into an invalid one, which is a distance one away from the original code word and a distance two from any other valid code word.  Therefore, in a code with minimum distance of three, any single error is correctable or any double error detectable.  Similarly, a code whose minimum distance is four may be used for either single-error correction and double-error detection or triple-error detection.  The key to error correction is that it must be possible to detect and locate erroneous digits.  If the location of an error has been determined, then by complementing the erroneous digit the message is corrected.

The basic principles in constructing a Hamming error-correcting code are as follows.  To each group of m information, or message, digits, k parity checking digits, denoted p1, p2,... pk, are added to form an (m+k)-digit code.  The location of each of the m+k digits within a code word is assigned a decimal value, starting by assigning a 1 to the most significant digit and m+k to the least significant digit.  k parity checks are performed on selected digits of each code word.  The result of each parity check is recorded as 1 or 0, depending, respectively, on whether an error has or has not been detected.  These parity checks make possible the development of a binary number, c1 c2 ...ck whose value when an error occurs is equal to the decimal value assigned to the location of the erroneous digit, and is equal to zero if no error occurs.  this number is called the position (or location) number.

The number k of digits in the position number must be large enough to describe the location of any of the m+k possible single errors, and must in addition take on the value zero to describe the “no error” condition.  Consequently, k must satisfy the inequality 2k > m+k+1.  Thus, for example, if the original message is in BCD, where m = 4, then k = 3 and ad at least three parity checking digits must be added to the BCD code.  The resultant error-correcting code thus consists of seven digits.  In this case, if the position number is equal to 101, it means that an error has occurred in position 5.  If, however, the position number is equal to 000, the message is correct.

In order to be able to specify the checking digits by means of only message digits and independently of each other, they are placed in positions 1,2,4,...,2k-1.  Thus, if m = 4 and k = 3, the checking digits are placed in positions 1,2, and 4, while the remaining positions contain the original (BCD) message bits.  For example, in the code word 1100110 the parity digits (in boldface) are p1 = 0, p2 = 1, p3 = 1, while the message digits are 0,1,1,0, which correspond to decimal 6.

We shall now show how the Hamming code is constructed, by constructing the code for m=4 and k=3.  As discussed above, the parity checking digits must be so specified that, when an error occurs, the position number will take on the value assigned to the location of the erroneous digit.  The following table lists the seven error positions and the corresponding values of the position number.

Error Position

Position number

c1 c2 c3

0 (no error)

 0  0  0

1

 0  0  1

2

 0  1  0

3

 0  1  1

4

 1  0  0

5

 1  0  1

6

 1  1  0

7

 1  1  1

It is evident that if an error occurs in position 1, or 3, or 5, or 7, the least significant digit, i.e., c3, of the position number must be equal to 1.  If the code is constructed so that in every code word the digits in positions 1,3,5, and 7 have even parity, then the occurrence of a single error in any one of these positions will cause an odd parity.  In such a case the least significant digit of the position number is recorded as 1.  If no error occurs among these digits, the parity check will show an even parity, and the least significant digit of the position number is recorded as 0.

From the above table we observe that an error in position 2, or 3, or 6, or 7 should result in the recording of a 1 in the center of the position number.  Hence the code must be designed so that the digits in positions 2,3,6, and 7 have even parity.  Again, if the parity check of these digits shows an odd parity, the corresponding position number digit, i.e., c2, is set to 1; otherwise it is set to 0.  Finally, if an error occurs in position 4, or 5, or 6, or 7, the most significant digit of the position number, i.e., c1, should be a 1.  Therefore, if digits 4,5,6, and 7 are designed to have even parity, an error in any one of these digits will be recorded as a 1 in the most significant digit of the position number.  To summarize:

 

p1 is selected so as to establish even parity in positions 4,5,6,7.

p2 is selected so as to establish even parity in positions 2,3,6,7.

p3 is selected so as to establish even parity in positions 1,3,5,7.

 

The code can now be constructed by adding the appropriate checking digits to the message digits.  Consider, for example, the message 0100 (i.e., decimal 4).

Position:                                                                1              2              3              4              5              6              7

p3            p2            m1            p1            m2            m3            m4

Original BCD message:                                                                       0                              1              0              0

Parity check in positions

      4,5,6,7 requires p1 =                                                                        0              1              1              0              0

Parity check in positions

      2,3,6,7 requires p2 =                                                       0              0              1              1              0              0

Parity check in positions

      1,3,5,7 requires p3 =                                        1              0              0              1              1              0              0

 

Coded message:                                                   1              0              0              1              1              0              0

 

p1 is set equal to 1 so as to establish even parity in positions 4, 5, 6, and 7.  Similarly, it is evident that p2 must be a 0 and p3 a 1, so that even parity is established in positions 2,3,6, and 7 and 1, 3, 5, and 7.  The Hamming code for the decimal digits coded in BCD is shown in the following table.

 

Decimal

Digit

 

Position:                1              2              3              4              5              6              7

p3            p2            m1            p1            m2            m3            m4

0

0              0              0              0              0              0              0

1

1              1              0              1              0              0              1

2

0              1              0              1              0              1              0

3

1              0              0              0              0              1              1

4

1              0              0              1              1              0              0

5

0              1              0              0              1              0              1

6

1              1              0              0              1              1              0

7

0              0              0              1              1              1              1

8

1              1              1              0              0              0              0

9

0              0              1              1              0              0              1

The error location and correction is performed in the following manner.  suppose, for example, that the sequence 1101001 is transmitted but, due to an error in the fifth position, the sequence 1101101 is received.  The location of the error can be determined by performing three parity checks as follows:

 

 

Position:                                1              2              3              4              5              6              7

Message received:                               1              1              0              1              1              0              1

4567 parity check:                                                                                1              1              0              1     Þ    c1 = 1 since parity is odd

2367 parity check:                                                1              0                                              0              1     Þ    c2 = 0 since parity is even

1357 parity check:                                1                              0                              1                              1     Þ    c3 = 1 since parity is odd

 

Thus the position number formed of c1 c2 c3 is 101, which means that the location of the error is in position 5.  To correct the error, the digit in position 5 is complemented, and the correct message 1101001 is obtained. 

It is easy to prove that the Hamming code constructed as shown above is a code whose distance is three.  consider, for example, the case where the two original four-bit code words differ in only one position, e.g., 1001 and 0001.  Since each message digit appears in at least two parity checks, the parity checks that involve the digit in which the two code words differ will result in different parities, and hence different checking digits will be added to the two words, making the distance between them equal to three.  For example, consider the two code words below.

 

 

 

Position:                                1              2              3              4              5              6              7

p3            p2            m1            p1            m2            m3            m4

First Word:                                                            1                              0              0              1

       Second Word:                                                               0                              0              0              1

First Word with parity bits:                                0              0              1              1              0              0              1

       Second Word with parity bits:                                    1              1              0              1              0              0              1

 

The two words differ in only m1 (i.e., position 3).  Parity checks 1357 and 2367 for these two words will give different results.  Therefore the parity-checking digits p3 and p2  must be different for these words.  Clearly, the foregoing argument is valid in the case where the original code words differ in two of the four positions.  Thus the Hamming code has a distance of three.

If the distance is increased to four, by adding a parity bit to the code in the above table, so that all eight digits will have even parity, the code may be used for single-error correction and double-error detection in the following manner.  Suppose that two errors occur; then the overall parity check is satisfied but the position number (determined as before from the first seven digits) will indicate an error.  Clearly, such a situation indicates the existence of a double error.  The error positions, however, cannot be located.  If only a single error occurs, the overall parity check will detect it.  Now, if the position number is 0, then the error is in the last parity bit; otherwise it is in the position given by the position number.  If all four parity checks indicate even parities, then the message is correct.

 

PROBLEMS

1.  Write the entire 4 bit Gray code by reflecting.

2.  Using the 4 bit Gray code word for 6, what is the 6 bit Gray code word for 6?

3.  Convert the following Gray code word to binary.

1 0 0 1 1 0 1 0

4.  Convert the following Binary code word to Gray.

1 0 0 1 1 0 1 0

 

The following hamming code word was received.  Use it to answer questions 5 - 10.

0 1 0 1 1 0 1

5.  Circle the parity bits.

6.  What position number is generated to determine if an error has occurred in transmission?

7.  Did an error occur in transmission?

8.  What was the transmitted code word?

9.  What was the transmitted message?

10.  If the message is binary, what is the decimal value?

11.  Encode a decimal 13 using each of the following codes.

A.  binary

B.  BCD

C.  5 bit even parity (4 bit binary with the parity bit on the left).

D.  Gray

E.  Excess-3

F.  7 bit Hamming

12.  For a weighted code to be self-complementing, a necessary--but not sufficient--condition is that the sum of the weights                                                               .

13.  In order for a code to be capable of detecting one transmission error, the minimum distance                                                                        .