Tuesday, August 2, 2011

Protocol Encoding

Terminology

Protocol: A protocol is a set of rules which have to be followed in the course of some activity. In this article, we use the term “protocol” as a set of formal rules that governs the exchange of information.
Encoding:
Decoding:
PDU: A PDU is an abstraction unit that hides the protocol specific fields.

Representing PDUs

In an abstract manner, we can think PDUs of simple records composed of some fields. It also gives us the opportunity for transforming the communication problem to programming domain. We can retrieve the value of a field or modify the fields of the PDU in a proper way in order to achieve the goal.

General objectives of representation[1] :
  1. Efficiency: The information in the PDU should be coded as compactly as possible.
  2. Delimiting: It must be possible for the receiver to recognise the beginning and end of the PDU.
  3. Ease of decoding: It should be easy for the receiver to find out exactly what information it has received.
  4. Data transparency: The representation should be such that arbitrary sequences of bits can be sent as data within the PDU.

Simple binary encoding:

It just encodes the value without any indicators.This type of encoding is not flexible because size and order of members of the PDU are fixed. The receiver can retrieve the desired value in constant time.

Li be the length of ith item then,



SBE:Algorithms(get parameter, add parameter)
/* Get value of Nth key
Value length is assumed 1
 */
void get(buffer, N, &val)
{
    val = buffer[n];
}

void add(buffer, N, val)
{
    buffer[N] = val;
}

Type-Length-Value(TLV) encoding

TLV is more flexible than simple binary encoding because fields are laid in an arbitrary order, we can find the encoded value using the type as a key. Also it helps us to implementation to retrieve values using generic parsers.

Li be the length of ith item then,



Size of same amount of data that is encoded using TLV is greater than the one encoded using simple binary encoding.

Figure1.1 may help to understand the algorithms below.
Figure1.1
#define TYPE_INDEX 0
#define LEN_INDEX 1
#define VAL_INDEX 2

int get(buffer, type) {
    val = 0;
    i = 0;

    while (1) {
        if (buffer[i] == t)
            val = parser[t](buffer + i + VAL_INDEX); //call the matching parser
        break;
        len = buffer[i + LEN_INDEX];
        i = i + len + 1; //jump to next type
    }
    return val;
}

/* add the value to the end of the buffer */
void add(buffer, type, val, val_len) {
    len = length(buffer);
    buffer[len] = type;
    buffer[len + LEN_INDEX] = val_len;
    buffer[len + VAL_INDEX] = val;
    buffer[len + LEN_INDEX + VAL_INDEX + val_len + 1] = END;
}

int length(buffer) {
    while (1) {
        if (buffer[i] == END)
            break;
        len = buffer[i + LEN_INDEX];
        i = i + len + 1;
    }
    return len;
}
These encodings both may be used for a protocol at the same time. Fixed parts are encoded using simple binary encoding, while optional parts are encoded using TLV.

Case study: DHCP

Let's try to figure out how the theory is applied to the real-world cases. DHCP is an automatic configuration protocol used on IP networks. Below you can find a DHCP packet structure:




As you see DHCP PDU includes two types of parts:
  • fixed(i.e, op, htype, hlen, hops, xid, ..)
  • variable(options)
Fixed fields are encoded in simple binary encoding, so lengths and orders of the fields are fixed. Any variation of field length or field order causes failure in communication.

DHCP options, variable part, are encoded in TLV. So, it is not a problem how the fields are ordered. Because the node retrieving the message can parse the options correctly by the help of self describing format of TLV. Some options are listed:

Code Description Length
53 DHCP Message Type 1
58 Renewal Time 4
59 Rebinding Time 4

We can retrieve the value of option 53 using get pseudo:

val = get(options, 53)


References:
[1] Principles of Protocol Design-Sharp, Robin

No comments:

Post a Comment