diff --git a/sycl/doc/design/CommandGraph.md b/sycl/doc/design/CommandGraph.md
index 23d32fd9ff7e2..7e32844edc67a 100644
--- a/sycl/doc/design/CommandGraph.md
+++ b/sycl/doc/design/CommandGraph.md
@@ -337,6 +337,62 @@ Backends which are implemented currently are: [Level Zero](#level-zero),
 
 ### Level Zero
 
+The command-buffer implementation for the level-zero adapter has 2 different
+implementation paths which are chosen depending on the device and level-zero
+version:
+
+- Immediate Append path - Relies on
+  [zeCommandListImmediateAppendCommandListsExp](https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/api.html#zecommandlistimmediateappendcommandlistsexp)
+  to submit the command-buffer. This function is an experimental extension to the level-zero API.
+- Wait event path - Relies on
+  [zeCommandQueueExecuteCommandLists](https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/api.html#zecommandqueueexecutecommandlists)
+  to submit the command-buffer work. However, this level-zero function has
+  limitations and, as such, this path is used only when the immediate append
+  path is unavailable.
+
+#### Immediate Append Path Implementation Details
+
+This path is only available when the device supports immediate command-lists
+and the [zeCommandListImmediateAppendCommandListsExp](https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/api.html#zecommandlistimmediateappendcommandlistsexp)
+API. This API can wait on a list of event dependencies using the `phWaitEvents`
+parameter and can signal a return event when finished using the `hSignalEvent`
+parameter. This allows for a cleaner and more efficient implementation than
+what can be achieved when using the wait-event path
+(see [this section](#wait-event-path-implementation-details) for
+more details about the wait-event path).
+
+This path relies on 3 different command-lists in order to execute the
+command-buffer:
+
+- `ComputeCommandList` - Used to submit command-buffer work that requires
+the compute engine.
+- `CopyCommandList` - Used to submit command-buffer work that requires the
+[copy engine](#copy-engine). This command-list is not created when none of the
+nodes require the copy engine.
+- `EventResetCommandList` - Used to reset the level-zero events that are
+needed for every submission of the command-buffer. This is executed after
+the compute and copy command-lists have finished executing. For the first
+execution, this command-list is skipped since there is no need to reset events
+at this point. When counter-based events are enabled (i.e. the command-buffer
+is in-order), this command-list is not created since counter-based events do
+not need to be reset.
+
+The following diagram illustrates which commands are executed on
+each command-list when the command-buffer is enqueued:
+![L0 command-buffer diagram](images/diagram_immediate_append.png)
+
+Additionally,
+[zeCommandListImmediateAppendCommandListsExp](https://oneapi-src.github.io/level-zero-spec/level-zero/latest/core/api.html#zecommandlistimmediateappendcommandlistsexp)
+requires an extra command-list which is used to submit the other
+command-lists. This command-list has a specific engine type
+associated with it (i.e. compute or copy engine). Hence, for our implementation,
+we need 2 of these helper command-lists:
+ - The `CommandListHelper` command-list is used to submit the
+`ComputeCommandList`, `EventResetCommandList` and profiling queries.
+ - The `ZeCopyEngineImmediateListHelper` command-list is used to submit the
+`CopyCommandList`.
+
+#### Wait Event Path Implementation Details
 The UR `urCommandBufferEnqueueExp` interface for submitting a command-buffer
 takes a list of events to wait on, and returns an event representing the
 completion of that specific submission of the command-buffer.
@@ -364,7 +420,7 @@ is made only once (during the command-buffer finalization stage). This allows
 the adapter to save time when submitting the command-buffer, by executing only
 this command-list (i.e. without enqueuing any commands of the graph workload).
 
-#### Prefix
+##### Prefix
 
 The prefix's commands aim to:
 1. Handle the list of events to wait on, which is passed by the runtime
@@ -409,7 +465,7 @@ and another reset command for resetting the signal we use to signal the
 completion of the graph workload. This signal is called *SignalEvent* and is
 defined in the `ur_exp_command_buffer_handle_t` class.
 
-#### Suffix
+##### Suffix
 
 The suffix's commands aim to:
 1) Handle the completion of the graph workload and signal a UR return event.
@@ -435,7 +491,7 @@ with extra commands associated with *CB*, and the other after *CB*. These new
 command-lists are retrieved from the UR queue, which will likely reuse existing
 command-lists and only create a new one in the worst case.
 
-#### Drawbacks
+##### Drawbacks
 
 There are three drawbacks of this approach to implementing UR command-buffers
 for Level Zero:
diff --git a/sycl/doc/design/images/diagram_immediate_append.png b/sycl/doc/design/images/diagram_immediate_append.png
new file mode 100644
index 0000000000000..83a0f2fb0a69c
Binary files /dev/null and b/sycl/doc/design/images/diagram_immediate_append.png differ