Data trace packets must be distinguished from instruction trace packets, and how this is done depends on the trace transport infrastructure. Several options exist. Instruction and data trace can be issued using different IDs (for example, different ATID values if using ATB transport). Alternatively, an additional field in the packet encapsulation can be used (Siemens uses a 2-bit msg_type field to differentiate trace types from the same source).
By default, all data trace packets include both address and data. However, provision is made for run-time configuration options to exclude either the address or the data, in order to minimize trace bandwidth. For example, if filtering has been configured to only trace from a specific data access address there is no need to report the address in the trace. Alternatively, the user may want to know which locations are accessed but not care about the data value. Information about whether address or data are omitted is not encoded in the packets themselves as it does not change dynamically, and to do so would reduce encoding efficiency. The run-time configuration should be reported in the Format 3, subformat 3 support packet (see [sec:format33]). The following sections include examples for all three cases.
As outlined in [sec:DataInterfaceRequirements], two different signaling protocols between the RISC-V hart and the encoder are supported: unified and split. Accordingly, both unified and split trace packets are defined.
Note: In the following tables, "clog2" is an abbreviation for "ceiling of log2".
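The width expression used for the size field in the tables, max(1, clog2(clog2(data_width_p/8 + 1))), can be evaluated with a short sketch such as the one below. The function names are illustrative only, and integer division is assumed for data_width_p/8.

```python
def clog2(n: int) -> int:
    """Ceiling of log2(n), for n >= 1."""
    return (n - 1).bit_length()


def size_field_width(data_width_p: int) -> int:
    """Width in bits of the size field for a given data_width_p."""
    # data_width_p / 8 is the widest transfer in bytes (integer division assumed).
    return max(1, clog2(clog2(data_width_p // 8 + 1)))
```

Under this reading, a data_width_p of 32 or 64 gives a 2-bit size field and a data_width_p of 128 gives a 3-bit field.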
Types of data trace packets are differentiated by the format field. This field is 2 bits wide if only unified loads and stores are supported, or 3 bits otherwise.
Unified loads and split load request phase share the same code because the encoder will support one or the other, indicated by a discoverable parameter.
Data accesses aligned to their size (e.g. 32-bit loads aligned to 32-bit word boundaries) are expected to be commonplace, and in such cases, encoding efficiency can be improved by not reporting the redundant LSBs of the address.
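As a rough illustration of this optimization, the sketch below drops the redundant address LSBs on the encoder side and restores them on the decoder side. The function names and the boolean `aligned` flag (standing in for the aligned or unaligned distinction carried by the format field) are assumptions made for the example, not names defined by this specification.

```python
from typing import Tuple


def encode_address(byte_addr: int, size: int) -> Tuple[bool, int]:
    """Return (aligned, address_field) for an access of 2**size bytes."""
    aligned = byte_addr % (1 << size) == 0
    # An aligned access has `size` zero LSBs, which need not be reported.
    return aligned, (byte_addr >> size if aligned else byte_addr)


def decode_address(aligned: bool, address_field: int, size: int) -> int:
    """Recover the byte address from the reported address field."""
    return address_field << size if aligned else address_field
```

For a 32-bit access (size = 2) at byte address 0x1000, the encoder would report 0x400 and the decoder shifts left by 2 to recover 0x1000.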
Field name | Bits | Description |
---|---|---|
format | 2 or 3 | Transaction type: |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
diff | 2 | 00: Full address and data (sync) |
data_len | size | Number of bytes of data is data_len + 1 |
data | 8 * (data_len + 1) | Data |
address | daddress_width_p | Byte address if format is unaligned, otherwise shift left by size to recover byte address |
Field name | Bits | Description |
---|---|---|
format | 2 or 3 | Transaction type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
diff | 1 | 0: Full address (sync) |
address | daddress_width_p | Byte address if format is unaligned, otherwise shift left by size to recover byte address |
Field name | Bits | Description |
---|---|---|
format | 2 or 3 | Transaction type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
diff | 1 or 2 | 00: Full data (sync) |
data | data_width_p | Data |
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
lrid | lrid_width_p | Load request ID |
diff | 1 | 0: Full address (sync) |
address | daddress_width_p | Byte address if format is unaligned, otherwise shift left by size to recover byte address |
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
lrid | lrid_width_p | Load request ID |
resp | 2 | 00: Error (no data) |
data | data_width_p | Data |
The width of this field is 2 bits if max size is 64-bits (data_width_p < 128), 3 bits if wider.
Unlike instruction trace, compression options for data trace are somewhat limited. Following a synchronization instruction trace packet, the first data trace packet for a given access size must include the full (unencoded) data access address. Thereafter, the address may be reported differentially (i.e. address of this data access, minus the address of the previous data access of the same size).
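The per-size differential address scheme can be sketched as follows. This is a minimal illustration, assuming one stored "previous address" per access size on both the encoder and decoder side. The class `AddressTracker`, its method names, and the `full` flag are hypothetical, and the sketch always uses the differential form once a full address has been sent, whereas an encoder is free to choose.

```python
from typing import Dict, Tuple


class AddressTracker:
    """Per-size record of the previous data access address (hypothetical helper)."""

    def __init__(self, address_width: int) -> None:
        self.mask = (1 << address_width) - 1
        self.prev: Dict[int, int] = {}

    def sync(self) -> None:
        """Forget history, as after a synchronization instruction trace packet."""
        self.prev.clear()

    def encode(self, size: int, addr: int) -> Tuple[bool, int]:
        """Return (full, value): a full address, or a difference from the last one."""
        last = self.prev.get(size)
        self.prev[size] = addr
        if last is None:
            return True, addr  # first access of this size since sync
        return False, (addr - last) & self.mask

    def decode(self, size: int, full: bool, value: int) -> int:
        """Recover the address from a full or differential value."""
        addr = value if full else (self.prev[size] + value) & self.mask
        self.prev[size] = addr
        return addr
```

Keeping separate history per size means that, for example, a run of 32-bit accesses walking through an array still produces small differences even when interleaved with 64-bit accesses elsewhere in memory.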
Similarly, following a synchronization instruction trace packet, the first data trace packet for a given access size must include the full (unencoded) data value. Beyond this, data may be encoded or unencoded, whichever results in the most efficient representation. Implementors may choose to offer one of XOR or differential compression, or both. XOR compression is simpler to implement, and avoids the need to subtract large values.
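The two data compression options can be compared with a small sketch. The function names are illustrative, and `prev` is assumed to be the previously reported data value of the same access size; the values of the diff field that select each mode are not modeled here.

```python
def xor_encode(value: int, prev: int) -> int:
    """XOR compression: only the bits that changed are set."""
    return value ^ prev


def xor_decode(encoded: int, prev: int) -> int:
    return encoded ^ prev


def diff_encode(value: int, prev: int, width_bits: int) -> int:
    """Differential compression: value minus previous value, modulo 2**width_bits."""
    return (value - prev) & ((1 << width_bits) - 1)


def diff_decode(encoded: int, prev: int, width_bits: int) -> int:
    return (prev + encoded) & ((1 << width_bits) - 1)
```

XOR needs no carry chain, which is why it is the cheaper option in hardware. Differential encoding can give a shorter result when successive values differ by a small amount that crosses a carry boundary (for example 0x0FFF followed by 0x1000).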
If only one data compression type is offered, the diff field can be 1 bit wide rather than 2 in the packet format for unified load or store with data only.
However the data is compressed, upper bytes that are all the same value do not need to be included in the packet; the decoder can recreate the full-width value by sign extending from the most significant received bit. In cases where data is not the final field in the packet, the width of data is indicated by this field.
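The removal of redundant upper bytes and their recovery by sign extension can be sketched as below. The helper names are hypothetical, and little-endian byte order within the payload is an assumption made only for the example. The number of bytes kept corresponds to data_len + 1.

```python
def trim(value: int, width_bytes: int) -> bytes:
    """Drop upper bytes that are pure sign extension of the byte below them."""
    raw = value.to_bytes(width_bytes, "little")  # assumed byte order
    n = width_bytes
    while n > 1:
        fill = 0xFF if raw[n - 2] & 0x80 else 0x00
        if raw[n - 1] != fill:
            break  # top byte carries information, so keep it
        n -= 1
    return raw[:n]


def sign_extend(payload: bytes, width_bytes: int) -> int:
    """Recreate the full-width value from the most significant received bit."""
    signed = int.from_bytes(payload, "little", signed=True)
    return signed & ((1 << (8 * width_bytes)) - 1)
```

Note that a value such as 0x00000080 needs two bytes, not one: sending only 0x80 would be sign-extended to 0xFFFFFF80 by the decoder.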
Strictly, size could be just one bit, as atomics are currently either 32 or 64 bits. Defining it as for regular loads and stores makes provision for future extensions (proprietary or otherwise) that support smaller atomics.
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
subtype | 3 | Atomic sub-type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
diff | 2 | 00: Full address and data (sync) |
op_len | size | Number of bytes of operand is op_len + 1 |
operand | 8 * (op_len + 1) | Operand. Value from rs2 before operator applied |
data_len | size | Number of bytes of data is data_len + 1 |
data | 8 * (data_len + 1) | Data |
address | daddress_width_p | Address, aligned and encoded as per size |
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
subtype | 3 | Atomic sub-type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
diff | 1 | 0: Full address |
address | daddress_width_p | Address, aligned and encoded as per size |
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
subtype | 3 | Atomic sub-type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
diff | 1 or 2 | 00: Full data (sync) |
op_len | size | Number of bytes of operand is op_len + 1 |
operand | 8 * (op_len + 1) | Operand. Value from rs2 before operator applied |
data | data_width_p | Data |
The operand value for the atomic operation. Uncompressed, although upper bytes that are all the same value do not need to be included in the packet; the decoder can recreate the full-width value by sign extending from the most significant received bit; see data_len and op_len fields.
Width of data and operand fields respectively. See data_len field.
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
subtype | 3 | Atomic sub-type |
size | max(1, clog2(clog2(data_width_p/8 + 1))) | Transfer size is 2^size bytes |
lrid | lrid_width_p | Load request ID |
diff | 1 or 2 | 00: Full address and data (sync) |
op_len | size | Number of bytes of operand is op_len + 1 |
operand | 8 * (op_len + 1) | Operand. Value from rs2 before operator applied |
address | daddress_width_p | Address, aligned and encoded as per size |
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
lrid | lrid_width_p | Load request ID |
resp | 2 | 00: Error (no data) |
data_len | size | Number of bytes of data is data_len + 1. Not included if resp indicates an error (sign-extend resp MSB) |
data | 8 * (data_len + 1) | Data. Not included if resp indicates an error (sign-extend resp MSB) |
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
subtype | 2 | CSR sub-type |
diff | 1 or 2 | 00: Full data (sync) |
data_len | 2 or 3 | Number of bytes of data is data_len + 1 |
data | 8 * (data_len + 1) | Data |
addr_msbs | 6 | Address[11:6] |
op_len | 2 or 3 | Number of bytes of operand is op_len + 1 |
operand | 8 * (op_len + 1) | Operand. Value from rs1 before operator applied |
addr_lsbs | 6 | Address[5:0] |
2 bits wide if hart has 32-bit CSRs, 3 bits if 64-bit. Width of data and operand fields respectively. See data_len field.
The address is split into two parts, with the 6 LSBs output last as these are more likely to compress away.
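The split of the 12-bit CSR address into the addr_msbs and addr_lsbs fields amounts to the following. The function names are illustrative only.

```python
from typing import Tuple


def split_csr_address(addr: int) -> Tuple[int, int]:
    """Return (addr_msbs, addr_lsbs) = (Address[11:6], Address[5:0])."""
    if not 0 <= addr < (1 << 12):
        raise ValueError("CSR addresses are 12 bits wide")
    return addr >> 6, addr & 0x3F


def join_csr_address(addr_msbs: int, addr_lsbs: int) -> int:
    """Rebuild the 12-bit CSR address from its two packet fields."""
    return (addr_msbs << 6) | addr_lsbs
```

For example, CSR address 0x341 splits into addr_msbs = 13 and addr_lsbs = 1.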
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
subtype | 2 | CSR sub-type |
diff | 1 or 2 | 00: Full data (sync) |
data_len | 2 or 3 | Number of bytes of data is data_len + 1 |
data | 8 * (data_len + 1) | Data |
addr_msbs | 6 | Address[11:6] |
addr_lsbs | 6 | Address[5:0] |
Field name | Bits | Description |
---|---|---|
format | 3 | Transaction type |
subtype | 3 | CSR sub-type |
diff | 0 or 1 | 0: Full address |
addr_msbs | 6 | Address[11:6] |
addr_lsbs | 6 | Address[5:0] |