This repository was archived by the owner on Nov 4, 2025. It is now read-only.

Commit 571cafa

Initial proposal for binary protocol (#2)
* move from https://github.com/w3c/trace-context/pull/215/files
* added MUST for ordering
* added a link to 256 limit and explained why spec is talking about byte arrays
* use trace-flags to describe the field
* addressed PR feedback and more
* added note about endianness of bytes
* Fix typo (#3)
* Update spec/20-binary-format.md
1 parent d7bb4d8 commit 571cafa

5 files changed

+219
-4
lines changed

index.html

Lines changed: 2 additions & 1 deletion
@@ -54,6 +54,7 @@
5454
<section id='abstract' data-include="spec/01-abstract.md" data-include-format='markdown'></section>
5555
<section id='sotd' data-include="spec/02-sotd.md" data-include-format='markdown'></section>
5656

57-
<section data-include="spec/20-BINARY_FORMAT.md" data-include-format='markdown'></section>
57+
<section data-include="spec/20-binary-format.md" data-include-format='markdown'></section>
58+
<section data-include="spec/31-parsing-algoritm.md" data-include-format='markdown' class="informative"></section>
5859
</body>
5960
</html>

spec/20-BINARY_FORMAT.md

Lines changed: 0 additions & 3 deletions
This file was deleted.

spec/20-binary-format.md

Lines changed: 106 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,106 @@
1+
# Binary format
2+
3+
The binary format document describes how to encode each of the fields `traceparent` and
4+
`tracestate`. The binary format should be used to encode the values of these
5+
fields. This specification does not specify how these fields should be stored
6+
and sent as part of a binary payload. A basic implementation may serialize
7+
each field as its size followed by its value.
8+
9+
This specification operates on bytes: unsigned 8-bit integers
10+
with values from `0` to `255`. The representation of a byte as a set of
11+
bits (big or little endian) MUST be defined by the underlying platform and
12+
is out of scope of this specification.
13+
14+
## `traceparent` binary format
15+
16+
The field `traceparent` encodes the version of the protocol and fields
17+
`trace-id`, `parent-id` and `trace-flags`. Each field starts with a one-byte
18+
field identifier, with the field value following immediately after it. Field
19+
identifiers serve as markers for additional verification of value
20+
consistency and may be used in the future for versioning of the `traceparent`
21+
field.
22+
23+
``` abnf
24+
traceparent = version version_format
25+
version = 1BYTE ; version is 0 in the current spec
26+
version_format = "{ 0x0 }" trace-id "{ 0x1 }" parent-id "{ 0x2 }" trace-flags
27+
trace-id = 16BYTES
28+
parent-id = 8BYTES
29+
trace-flags = 1BYTE ; only the least significant bit is used
30+
```
31+
32+
An unknown field identifier (anything other than `0`, `1` and `2`) should be treated as
33+
an invalid `traceparent`. An all-zero `trace-id` or `parent-id` invalidates the
34+
`traceparent` as well.
35+
36+
## Serialization of `traceparent`
37+
38+
An implementation MUST serialize fields in the order of their field identifiers.
39+
In other words, the `trace-id` field should be serialized first, `parent-id`
40+
second, and `trace-flags` third.
41+
42+
Field identifiers should be treated as unsigned byte numbers and should be
43+
encoded in big-endian bit order.
44+
45+
The fields `trace-id` and `parent-id` are defined as byte arrays, NOT as
46+
long numbers. The first element of an array MUST be copied first. When an array is
47+
represented as a memory block of 16 bytes, serialization of `trace-id`
48+
is identical to a `memcpy` call on that memory block. This
49+
may be a concern for implementations that cast these fields to integers:
50+
the protocol does NOT define whether those byte arrays are ordered as big
51+
endian or little endian, or whether they have a sign bit.
52+
53+
If padding is required (`traceparent` needs to be serialized into
54+
a bigger buffer), any number of bytes can be appended to the end of the
55+
serialized value.
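
The serialization rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `serialize_traceparent` and the `buffer_size` parameter are hypothetical, and filling the padding with zero bytes is an assumption (the text only says that any number of bytes may be appended).

```python
def serialize_traceparent(trace_id: bytes, parent_id: bytes, trace_flags: int,
                          buffer_size: int = 29) -> bytes:
    """Encode a version-0 traceparent (hypothetical helper, not part of the spec)."""
    if len(trace_id) != 16 or len(parent_id) != 8:
        raise ValueError("trace-id must be 16 bytes and parent-id 8 bytes")
    if not 0 <= trace_flags <= 255:
        raise ValueError("trace-flags must fit in one byte")

    out = bytearray([0])              # version byte, 0 in the current spec
    out += b"\x00" + trace_id         # field id 0, then trace-id copied first byte first
    out += b"\x01" + parent_id        # field id 1, then parent-id
    out += bytes([2, trace_flags])    # field id 2, then trace-flags

    if buffer_size < len(out):
        raise ValueError("buffer is smaller than the 29 bytes required")
    # Assumption: pad with zero bytes; the text does not prescribe a fill value.
    out += bytes(buffer_size - len(out))
    return bytes(out)
```

Because the ids are plain byte arrays, the sketch never converts them to integers, which sidesteps the endianness and sign questions mentioned above.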
56+
57+
## `traceparent` example
58+
59+
``` js
60+
{0,
61+
0, 75, 249, 47, 53, 119, 179, 77, 166, 163, 206, 146, 157, 0, 14, 71, 54,
62+
1, 52, 240, 103, 170, 11, 169, 2, 183,
63+
2, 1}
64+
```
65+
66+
This corresponds to:
67+
68+
- `trace-id` is
69+
`{75, 249, 47, 53, 119, 179, 77, 166, 163, 206, 146, 157, 0, 14, 71, 54}` or
70+
`4bf92f3577b34da6a3ce929d000e4736`.
71+
- `parent-id` is `{52, 240, 103, 170, 11, 169, 2, 183}` or `34f067aa0ba902b7`.
72+
- `trace-flags` is `1` with the meaning `recorded` is true.
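
As a quick check of the correspondence above, the example bytes can be sliced at the offsets implied by the format (one version byte, then each field preceded by its one-byte identifier). The variable names here are illustrative only.

```python
example = bytes([0,
                 0, 75, 249, 47, 53, 119, 179, 77, 166, 163, 206, 146, 157, 0, 14, 71, 54,
                 1, 52, 240, 103, 170, 11, 169, 2, 183,
                 2, 1])

version = example[0]          # offset 0: version
trace_id = example[2:18]      # offset 1 is field id 0, then 16 bytes of trace-id
parent_id = example[19:27]    # offset 18 is field id 1, then 8 bytes of parent-id
trace_flags = example[28]     # offset 27 is field id 2, then 1 byte of trace-flags

print(trace_id.hex())         # 4bf92f3577b34da6a3ce929d000e4736
print(parent_id.hex())        # 34f067aa0ba902b7
print(bool(trace_flags & 1))  # True: the `recorded` bit is set
```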
73+
74+
## `tracestate` binary format
75+
76+
A list of up to 32 name-value pairs. Each list member starts with the one-byte field
77+
identifier `0`. A list member consists of a single-byte key length followed
78+
by the key, then a single-byte value length followed by the encoded
79+
value. Note that a single-byte length field can express lengths from `0` to `255`
80+
bytes, while the [trace
81+
context](https://w3c.github.io/trace-context/#header-value)
82+
specification allows keys and values up to 256 characters long. Strings are transmitted in ASCII encoding.
83+
84+
``` abnf
85+
tracestate = list-member 0*31( list-member )
86+
list-member = "0" key-len key value-len value
87+
key-len = 1BYTE ; length of the key string
88+
value-len = 1BYTE ; length of the value string
89+
```
90+
91+
A zero-length key (`key-len == 0`) indicates the end of the `tracestate`. So when
92+
`tracestate` is serialized into a buffer that is longer than it
93+
requires, `{ 0, 0 }` (field id `0` and key-len `0`) will indicate the end of
94+
the `tracestate`.
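
A minimal Python sketch of this encoding follows. The name `serialize_tracestate` and the `buffer_size` parameter are hypothetical. The sketch assumes that unused buffer space is zero-filled, so that the first two padding bytes form the `{ 0, 0 }` end marker, and it rejects anything longer than the 255 bytes a single length byte can express.

```python
from typing import Iterable, Optional, Tuple


def serialize_tracestate(entries: Iterable[Tuple[str, str]],
                         buffer_size: Optional[int] = None) -> bytes:
    """Encode tracestate list members (hypothetical helper, not part of the spec)."""
    entries = list(entries)
    if len(entries) > 32:
        raise ValueError("tracestate holds at most 32 list members")

    out = bytearray()
    for key, value in entries:
        k = key.encode("ascii")      # strings are transmitted in ASCII
        v = value.encode("ascii")
        if not 1 <= len(k) <= 255 or len(v) > 255:
            raise ValueError("key or value does not fit a one-byte length")
        # field id 0, key-len, key, value-len, value
        out += bytes([0, len(k)]) + k + bytes([len(v)]) + v

    if buffer_size is not None:
        if buffer_size < len(out):
            raise ValueError("buffer is too small")
        # Assumption: zero fill, so the padding starts with the {0, 0} end marker.
        out += bytes(buffer_size - len(out))
    return bytes(out)
```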
95+
96+
## `tracestate` example
97+
98+
``` js
99+
{ 0, 3, 102, 111, 111, 16, 51, 52, 102, 48, 54, 55, 97, 97, 48, 98, 97, 57, 48, 50, 98, 55,
100+
0, 3, 98, 97, 114, 4, 48, 46, 50, 53, }
101+
102+
```
103+
104+
This corresponds to 2 tracestate entries:
105+
106+
`foo=34f067aa0ba902b7,bar=0.25`

spec/21-binary-format-rationale.md

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
# Rationale for decision on binary format
2+
3+
The binary format is similar to protobuf encoding but makes no reference to the
4+
protobuf project. It places field identifiers, one byte each, in front of field
5+
values.
6+
7+
## Field identifiers
8+
9+
The protocol uses field identifiers for fields like `trace-id`, `parent-id`,
10+
`trace-flags` and tracestate entries. The purpose of the field
11+
identifiers is two-fold. First, they allow existing fields to be removed and
12+
new ones to be added going forward. Second, they provide an additional layer of
13+
validation of the format.
14+
15+
## How can we add new fields
16+
17+
If we follow the rule that new ids are always appended at the end of the
18+
buffer, we can add up to 127. After that we can either use varint encoding or
19+
just reserve 255 as a continuation byte. The assumption at the moment is
20+
that the specification will never get to this point.
21+
22+
## Why custom binary protocol
23+
24+
We didn't find a widely used, non-proprietary binary protocol that can be
25+
used in this specification.

spec/31-parsing-algoritm.md

Lines changed: 86 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,86 @@
1+
# De-serialization algorithms
2+
3+
This is a non-normative section that describes de-serialization algorithms
4+
that may be used to parse `traceparent` and `tracestate` field values.
5+
6+
## De-serialization of `traceparent`
7+
8+
Let's assume the algorithm takes a buffer (a byte array) and can set
9+
and advance a cursor in the buffer, as well as check whether the end of
10+
the buffer was reached or will be reached after reading the given number
11+
of bytes. This algorithm can work on a stream of bytes. De-serialization
12+
of `traceparent` MAY be done in the following sequence:
13+
14+
1. If buffer is empty - RETURN invalid status `BUFFER_EMPTY`. Set a cursor to
15+
the first byte.
16+
2. Read the `version` byte at the cursor position. Advance the cursor by `1` byte.
17+
3. If at the end of the buffer RETURN invalid status `TRACEPARENT_INCOMPLETE`.
18+
4. **Parse `trace-id`**. Read the field identifier byte at the cursor
19+
position. If NOT `0` - go to step `8. Report invalid field`.
20+
Otherwise - check that the remaining buffer size is greater than or equal to `16`
21+
bytes. If shorter - RETURN invalid status `TRACE_ID_TOO_SHORT`.
22+
Otherwise read the next `16` bytes for `trace-id` and advance the cursor to
23+
the end of those `16` bytes.
24+
5. **Parse `parent-id`**. Read the field identifier byte at the cursor
25+
position. If NOT `1` - go to step `8. Report invalid field`.
26+
Otherwise - check that the remaining buffer size is greater than or equal to `8`
27+
bytes. If shorter - RETURN invalid status `PARENT_ID_TOO_SHORT`.
28+
Otherwise read the next `8` bytes for `parent-id` and advance the cursor
29+
to the end of those `8` bytes.
30+
6. **Parse `trace-flags`**. Read the field identifier byte at the cursor
31+
position. If NOT `2` - go to step `8. Report invalid field`.
32+
Otherwise - check the remaining size of the buffer. If at the end of
33+
the buffer - RETURN invalid status. Otherwise - read the
34+
`trace-flags` byte. The least significant bit represents the `recorded`
35+
value.
36+
7. RETURN status `OK` if `version` is `0` or status `DOWNGRADED_TO_ZERO`
37+
otherwise.
38+
8. **Report invalid field**. If `version` is `0` RETURN invalid status
39+
`INVALID_FIELD_ID`. If `version` has any other value - RETURN invalid status
40+
`INCOMPATIBLE_VERSION`.
41+
42+
_Note_ that invalid status names are given for readability and are not part of the
43+
specification.
44+
45+
_Note_ that parsing should not treat any additional bytes at the end of the
46+
buffer as an invalid status. Those bytes can be added for padding purposes.
47+
Optionally, an implementation can check that the buffer is at least `29` bytes long as
48+
a very first step if this check is not expensive.
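
The `traceparent` steps above can be sketched in Python as follows. This is one possible reading, not a normative implementation. The status strings are the readability names used in the steps, except `TRACE_FLAGS_MISSING`, which is a hypothetical name for the status the steps leave unnamed. The sketch also assumes that running out of bytes before a field identifier is reported with the status of the field being parsed, and it does not reject all-zero ids because the steps do not mention that check.

```python
from typing import NamedTuple, Optional, Tuple


class TraceParent(NamedTuple):
    version: int
    trace_id: bytes
    parent_id: bytes
    trace_flags: int


def parse_traceparent(buf: bytes) -> Tuple[str, Optional[TraceParent]]:
    end = len(buf)
    if end == 0:                                    # step 1
        return "BUFFER_EMPTY", None
    version, pos = buf[0], 1                        # step 2
    if pos == end:                                  # step 3
        return "TRACEPARENT_INCOMPLETE", None
    # Step 8: the status for an unexpected field id depends on the version.
    bad_field = "INVALID_FIELD_ID" if version == 0 else "INCOMPATIBLE_VERSION"

    if buf[pos] != 0:                               # step 4: trace-id
        return bad_field, None
    pos += 1
    if end - pos < 16:
        return "TRACE_ID_TOO_SHORT", None
    trace_id, pos = buf[pos:pos + 16], pos + 16

    if pos == end:                                  # assumption: no field id left
        return "PARENT_ID_TOO_SHORT", None
    if buf[pos] != 1:                               # step 5: parent-id
        return bad_field, None
    pos += 1
    if end - pos < 8:
        return "PARENT_ID_TOO_SHORT", None
    parent_id, pos = buf[pos:pos + 8], pos + 8

    if pos == end:                                  # assumption: no field id left
        return "TRACE_FLAGS_MISSING", None          # hypothetical status name
    if buf[pos] != 2:                               # step 6: trace-flags
        return bad_field, None
    pos += 1
    if pos == end:
        return "TRACE_FLAGS_MISSING", None          # unnamed in the steps
    trace_flags = buf[pos]

    # Step 7: trailing bytes, if any, are treated as padding and ignored.
    status = "OK" if version == 0 else "DOWNGRADED_TO_ZERO"
    return status, TraceParent(version, bytes(trace_id), bytes(parent_id), trace_flags)
```

Returning a status string together with an optional result keeps the sketch close to the wording of the steps; a production parser might prefer exceptions or an enum.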
49+
50+
## De-serialization of `tracestate`
51+
52+
Let's assume the algorithm takes a buffer (a byte array) and can set
53+
and advance a cursor in the buffer, as well as check whether the end of
54+
the buffer was reached or will be reached after reading the given number
55+
of bytes. The algorithm also uses the `version` value parsed from `traceparent`.
56+
If `version` was not given, the value `0` SHOULD be used. This algorithm
57+
can work on a stream of bytes. De-serialization of `tracestate` MAY be
58+
done in the following sequence:
59+
60+
1. If at the end of the buffer - RETURN status `OK`. Otherwise set a
61+
cursor to the first byte.
62+
2. **Parse `list-member` field identifier**. Read the field identifier
63+
byte at the cursor position and advance the cursor by `1` byte. If NOT `0`
64+
and `version` is `0` RETURN invalid status `INVALID_FIELD_ID`. If NOT
65+
`0` and `version` has any other value - RETURN invalid status `INCOMPATIBLE_VERSION`.
66+
3. **Parse key**.
67+
1. If at the end of the buffer - RETURN status `OK`. This situation
68+
indicates that `tracestate` value was padded with `0`.
69+
2. Read the `key-len` byte. Advance the cursor by `1` byte. If the value of
70+
`key-len` is `0` - RETURN status `OK`. This situation indicates an
71+
explicit end of the `tracestate`.
72+
3. Check that buffer has `key-len` more bytes. If not - RETURN
73+
`KEY_TOO_SHORT`.
74+
4. Read `key-len` bytes as `key`. Advance the cursor by `key-len` bytes.
75+
4. **Parse value**.
76+
1. If at the end of the buffer - RETURN status `INCOMPLETE_LIST_MEMBER`.
77+
2. Read the `value-len` byte. Advance the cursor by `1` byte. If the value of
78+
`value-len` is `0` - add `list-member` with the `key` and empty
79+
`value` to the `tracestate` list. RETURN status `OK`.
80+
3. Check that buffer has `value-len` more bytes. If not - RETURN
81+
`VALUE_TOO_SHORT`.
82+
4. Read `value-len` bytes as `value`. Advance the cursor by `value-len`
83+
bytes.
84+
5. Add `list-member` with the `key` and `value` to the `tracestate`
85+
list.
86+
5. Go to step `2. Parse list-member field identifier`.
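
The `tracestate` steps above can be sketched in Python in the same style. The function name `parse_tracestate` is hypothetical, and keys and values are kept as raw bytes rather than decoded. One assumption goes beyond the listed steps: when the loop returns to step 2 and the cursor is already at the end of the buffer (an unpadded `tracestate`), the sketch returns `OK` instead of reading past the end.

```python
from typing import List, Tuple


def parse_tracestate(buf: bytes, version: int = 0) -> Tuple[str, List[Tuple[bytes, bytes]]]:
    entries: List[Tuple[bytes, bytes]] = []
    pos, end = 0, len(buf)
    bad_field = "INVALID_FIELD_ID" if version == 0 else "INCOMPATIBLE_VERSION"

    # Step 1, and (by assumption) the same end-of-buffer check on every loop back.
    while pos < end:
        field_id = buf[pos]                         # step 2
        pos += 1
        if field_id != 0:
            return bad_field, entries

        if pos == end:                              # step 3.1: padded with a single 0
            return "OK", entries
        key_len = buf[pos]                          # step 3.2
        pos += 1
        if key_len == 0:                            # explicit end of the tracestate
            return "OK", entries
        if end - pos < key_len:                     # step 3.3
            return "KEY_TOO_SHORT", entries
        key = bytes(buf[pos:pos + key_len])         # step 3.4
        pos += key_len

        if pos == end:                              # step 4.1
            return "INCOMPLETE_LIST_MEMBER", entries
        value_len = buf[pos]                        # step 4.2
        pos += 1
        if value_len == 0:
            # Mirrors step 4.2 as written: an empty value ends parsing.
            entries.append((key, b""))
            return "OK", entries
        if end - pos < value_len:                   # step 4.3
            return "VALUE_TOO_SHORT", entries
        value = bytes(buf[pos:pos + value_len])     # step 4.4
        pos += value_len
        entries.append((key, value))                # step 4.5, then back to step 2
    return "OK", entries
```

Applied to the `tracestate` example bytes from the binary format document, this sketch yields the entries `foo` and `bar` with the values `34f067aa0ba902b7` and `0.25`.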
