torchrec change for dynamic embedding #2533
Hi TorchRec experts,

We would like to incorporate NVIDIA HKV into the existing TorchRec workflow to extend TorchRec's capabilities for model-parallel dynamic embedding.

Our goal is to integrate HKV dynamic embedding as a new type of embedding table in the TorchRec workflow. To avoid disrupting the original TorchRec code, we have added a mechanism for registering new embedding tables, which lets us and other users plug a customized embedding table into the TorchRec workflow. Our modifications mainly target the following two parts:
1. Registering a new customized compute table during embedding-table creation and lookup, and accepting its customized parameters.
2. Because the index range of a dynamic embedding is unbounded, the input distribution must shard indices round-robin rather than by contiguous index range. (Our current PR serves as a reference: in the input dist section we have only modified the RW code, but all sharding types, such as TWRW, will need the same support.)
Our code is based on v0.7 and can easily be migrated to the latest code. We are opening this PR as a reference for further discussion with you, and we hope to support a high-performance dynamic embedding feature.
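For the second point, the round-robin input distribution can be sketched as below, assuming a row-wise (RW) layout; `round_robin_bucketize` is an illustrative name, not a function from this PR:

```python
# Minimal sketch: bucketize unbounded ids round-robin across ranks,
# instead of by contiguous index range as in static RW sharding.
import torch

def round_robin_bucketize(ids: torch.Tensor, world_size: int):
    """Assign each id to rank `id % world_size`, since dynamic-embedding
    ids have no upper bound that range-based bucketing could rely on."""
    ranks = ids % world_size
    order = torch.argsort(ranks, stable=True)  # keep per-rank id order stable
    ids_by_rank = ids[order]                   # ids grouped by target rank
    splits = torch.bincount(ranks, minlength=world_size)  # send-counts per rank
    return ids_by_rank, splits
```

For example, with `world_size=2`, the ids `[0, 5, 3, 7, 2]` are grouped into `[0, 2]` for rank 0 and `[5, 3, 7]` for rank 1, with splits `[2, 3]` for the all-to-all exchange.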