6b/8b encoding

{{Short description|Line code used in telecommunications}}


In telecommunications, 6b/8b is a line code that expands 6-bit codes to 8-bit symbols in order to maintain DC balance in a communications system.<ref>{{cite book
|url=https://archive.org/details/codesformassdata0000scho
|title=Codes for Mass Data Storage Systems
|edition=Second fully revised
|publisher=Shannon Foundation Publishers
|location=Eindhoven, The Netherlands
|date=November 2004
|isbn=90-74249-27-2
|author=Kees A. Schouhamer Immink
|author-link=Kees Schouhamer Immink
|access-date=2015-08-23
|url-access=registration
}}</ref>

The 6b/8b encoding is a balanced code: each 8-bit output symbol contains 4 zero bits and 4 one bits. Because any single-bit error changes the number of one bits to 3 or 5, the code can, like a parity bit, detect all single-bit errors.
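The single-bit error detection property can be illustrated with a minimal sketch. The helper name `is_balanced` and the choice of test symbol are assumptions made for illustration only; they are not part of any specification.

```python
def is_balanced(symbol: int) -> bool:
    """Return True if the low 8 bits of `symbol` contain exactly four 1 bits."""
    return bin(symbol & 0xFF).count("1") == 4


# A balanced symbol (four 1 bits, four 0 bits), chosen only as an example.
symbol = 0b10000111

# Flip each of the 8 bit positions in turn to simulate a single-bit error.
corrupted = [symbol ^ (1 << position) for position in range(8)]

# Every corrupted symbol has three or five 1 bits, so none passes the check.
detected = [not is_balanced(c) for c in corrupted]
print(is_balanced(symbol), all(detected))
```

A weight check of this kind only flags that an error occurred; it cannot say which bit was flipped.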

The number of 8-bit patterns with 4 bits set is the binomial coefficient <math>\tbinom{8}{4} = 70</math>. Excluding the patterns 11110000 and 00001111 leaves 68 coded patterns: 64 data codes plus 4 additional control codes.
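The count can be checked by brute-force enumeration. The following is a small sketch; the variable names `balanced` and `usable` are illustrative.

```python
from itertools import product
from math import comb

# Every 8-bit pattern with exactly four 1 bits, as a string of '0'/'1'.
balanced = ["".join(bits) for bits in product("01", repeat=8)
            if bits.count("1") == 4]

# Remove the two patterns that the code excludes.
usable = [p for p in balanced if p not in ("11110000", "00001111")]

print(comb(8, 4), len(balanced), len(usable))
```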

Coding rules

The 64 possible 6-bit input codes can be classified according to their disparity, the number of 1 bits minus the number of 0 bits:

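{| class=wikitable style="text-align:center"
! Ones !! Zeros !! Disparity !! Number
|-
| 0 || 6 || −6 || 1
|-
| 1 || 5 || −4 || 6
|-
| 2 || 4 || −2 || 15
|-
| 3 || 3 || 0 || 20
|-
| 4 || 2 || +2 || 15
|-
| 5 || 1 || +4 || 6
|-
| 6 || 0 || +6 || 1
|}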
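The class sizes in this table can be reproduced by tallying the disparity of all 64 input codes. This is an illustrative sketch; the function name `disparity` is an assumption.

```python
from collections import Counter


def disparity(code: int, width: int = 6) -> int:
    """Number of 1 bits minus number of 0 bits in a `width`-bit code."""
    ones = bin(code).count("1")
    return ones - (width - ones)


# Tally how many of the 64 possible 6-bit codes fall in each disparity class.
counts = Counter(disparity(code) for code in range(64))

for d in sorted(counts):
    print(f"disparity {d:+d}: {counts[d]} codes")
```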

The 6-bit input codes are mapped to 8-bit output symbols as follows:

  • The 20 6-bit codes with disparity 0 are prefixed with 10
    Example: 000111 → 10000111
    Example: 101010 → 10101010
  • The 14 6-bit codes with disparity +2, other than 001111, are prefixed with 00
    Example: 010111 → 00010111
  • The 14 6-bit codes with disparity −2, other than 110000, are prefixed with 11
    Example: 101000 → 11101000
  • The remaining 20 codes: 12 with disparity ±4, 2 with disparity ±6, 001111, 110000, and the 4 control codes, are assigned to codes beginning with 01 as follows:

{| class=wikitable style="text-align:center"
! Type !! Input !! Output
|rowspan=11|
! Type !! Input !! Output
|rowspan=11|
! Complement
|-
| −6 || 000000 || 01011001
| +6 || 111111 || 01100110
| 01_xx__x
|-
|rowspan=6| −4
| 000001 || 01110001
|rowspan=6| +4
| 111110 || 01001110
|rowspan=2| 01xx____
|-
| 000010 || 01110010
| 111101 || 01001101
|-
| 000100 || 01100101
| 111011 || 01011010
|rowspan=2| 01x____x
|-
| 001000 || 01101001
| 110111 || 01010110
|-
| 010000 || 01010011
| 101111 || 01101100
|rowspan=2| 01____xx
|-
| 100000 || 01100011
| 011111 || 01011100
|-
| −2 || 110000 || 01110100
| +2 || 001111 || 01001011
| 01___x__
|-
|rowspan=2|Control
| K 000111 || 01000111
|rowspan=2|Control
| K 111000 || 01111000
|rowspan=2|
|-
| K 010101 || 01010101
| K 101010 || 01101010
|}
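The coding rules and the table above can be combined into a small encoder and decoder. This is a minimal sketch, not a reference implementation: the names `SPECIAL`, `CONTROL`, `PREFIX`, `encode` and `decode`, and the use of bit strings rather than integers, are assumptions made for readability.

```python
# Data codes that receive the 01 prefix, copied from the table above.
SPECIAL = {
    "000000": "01011001", "111111": "01100110",
    "000001": "01110001", "111110": "01001110",
    "000010": "01110010", "111101": "01001101",
    "000100": "01100101", "111011": "01011010",
    "001000": "01101001", "110111": "01010110",
    "010000": "01010011", "101111": "01101100",
    "100000": "01100011", "011111": "01011100",
    "110000": "01110100", "001111": "01001011",
}

# The four control (K) codes, also copied from the table above.
CONTROL = {
    "000111": "01000111", "111000": "01111000",
    "010101": "01010101", "101010": "01101010",
}

# Prefix chosen by input disparity for all remaining data codes.
PREFIX = {0: "10", 2: "00", -2: "11"}


def encode(code: str, control: bool = False) -> str:
    """Map a 6-bit string to its 8-bit symbol (illustrative helper)."""
    if len(code) != 6 or set(code) - {"0", "1"}:
        raise ValueError("expected a string of six '0'/'1' characters")
    if control:
        if code not in CONTROL:
            raise ValueError("not one of the four control codes")
        return CONTROL[code]
    if code in SPECIAL:
        return SPECIAL[code]
    # Every code not listed in SPECIAL has disparity 0, +2 or -2.
    return PREFIX[code.count("1") - code.count("0")] + code


# Reverse lookup: 8-bit symbol -> (6-bit code, is_control).
DECODE = {encode(format(i, "06b")): (format(i, "06b"), False) for i in range(64)}
DECODE.update({symbol: (code, True) for code, symbol in CONTROL.items()})


def decode(symbol: str) -> tuple[str, bool]:
    """Return (6-bit code, is_control); raise ValueError for invalid symbols."""
    if symbol not in DECODE:
        raise ValueError("not a valid 6b/8b symbol")
    return DECODE[symbol]


print(encode("000111"), encode("010111"), encode("101000"))
print(encode("000111", control=True), decode("01000111"))
```

A table-driven decoder is used here because the full code has only 68 valid symbols; any 8-bit pattern missing from `DECODE` is rejected as a detected error.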

No data symbol contains more than four consecutive matching bits, and because the patterns 11110000 and 00001111 are excluded, no data symbol begins or ends with more than three identical bits.

Thus, the longest run of identical bits that will be produced is 6. (I.e. this is a (0,5) RLL code, with a worst-case running disparity of +3 to −3.)
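Both bounds can be checked exhaustively. The sketch below is an illustration under one stated assumption: it takes the symbol set to be every balanced 8-bit pattern except 11110000 and 00001111, which covers the 64 data symbols and the 4 control symbols. The helper names are hypothetical.

```python
from itertools import groupby, product

# All 68 valid symbols: balanced 8-bit patterns minus the two excluded ones.
symbols = ["".join(bits) for bits in product("01", repeat=8)
           if bits.count("1") == 4
           and "".join(bits) not in ("11110000", "00001111")]


def longest_run(bits: str) -> int:
    """Length of the longest run of identical characters in `bits`."""
    return max(len(list(group)) for _, group in groupby(bits))


def disparity_range(bits: str) -> tuple[int, int]:
    """Lowest and highest running disparity reached while reading `bits`."""
    level = lowest = highest = 0
    for bit in bits:
        level += 1 if bit == "1" else -1
        lowest = min(lowest, level)
        highest = max(highest, level)
    return lowest, highest


# No symbol consists of identical bits, so a run can span at most two
# adjacent symbols; checking every ordered pair is therefore sufficient.
max_run = max(longest_run(a + b) for a in symbols for b in symbols)

# Each symbol is balanced, so the running disparity returns to 0 at every
# symbol boundary and only the excursion inside a symbol matters.
lowest = min(disparity_range(s)[0] for s in symbols)
highest = max(disparity_range(s)[1] for s in symbols)

print(max_run, lowest, highest)
```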

Any occurrence of 6 consecutive identical bits constitutes a comma sequence (also called a sync mark or syncword); it identifies the symbol boundaries precisely.

Those 6 bits straddle the inter-symbol boundary, with exactly 3 of the identical bits at the end of one symbol and 3 at the start of the next symbol.
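A receiver could use this property to recover symbol alignment from an unaligned bit stream. The following is a minimal sketch; the function name `find_boundaries` and the example stream are assumptions chosen for illustration.

```python
import re


def find_boundaries(bits: str) -> list[int]:
    """Return the offsets of symbol boundaries implied by comma sequences.

    Each run of 6 identical bits has 3 bits on either side of a boundary,
    so the boundary lies 3 bits after the start of the run.
    """
    return [match.start() + 3 for match in re.finditer(r"0{6}|1{6}", bits)]


# Three valid symbols; the first ends in 111 and the second begins with 111.
stream = "10000111" + "11100001" + "10101010"

# A receiver that starts listening 5 bits late sees an unaligned stream.
late = stream[5:]

print(find_boundaries(stream), find_boundaries(late))
```

Once one boundary offset is known, every later boundary follows at multiples of 8 bits from it. A stream that happens to contain no comma sequence gives no alignment information by this method.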

See also

References

{{Reflist}}