fbgemm_gpu test fail #2977
Comments
Hi @ywangwxd
As for C++20 support, we are able to support C++20 with CUDA 11.8 because we add |
1. Then how should I validate that FBGEMM was installed successfully?
Hi @ywangwxd Generally, the installation of FBGEMM can be validated by running It should be |
Is there any way to validate the runtime, not just the import? I encountered a problem when using fbgemm with torchrec.
Hi @ywangwxd This error usually indicates that you're running on a CUDA hardware model that we did not compile the FBGEMM code for. We generally compile FBGEMM for SM 7.0, 8.0, 9.0, and 9.0a. Which hardware model are you running the code on?
Hi @ywangwxd, can you show the result from running |
OK, this is the problem. I used a P100 card, which should be SM 6.0.
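For anyone hitting the same "invalid device function" error, here is a minimal sketch of a runtime check that goes beyond a plain import. It compares the GPU's compute capability with the SM versions mentioned in this thread. The names `COMPILED_ARCHS`, `is_supported`, and `check_current_device` are hypothetical and not part of FBGEMM, and the arch list is an assumption taken from the comment above; a given build may differ.

```python
# Hypothetical helper, not part of FBGEMM.
# Assumption: the installed binary ships kernels only for the SM versions
# named in this thread (7.0, 8.0, 9.0/9.0a), with no PTX fallback embedded.
COMPILED_ARCHS = [(7, 0), (8, 0), (9, 0)]


def is_supported(capability, compiled=COMPILED_ARCHS):
    """Return True if a device with this (major, minor) can run the binary.

    A kernel built for SM X.Y runs on devices with the same major version X
    and a minor version greater than or equal to Y.
    """
    major, minor = capability
    return any(
        major == c_major and minor >= c_minor
        for c_major, c_minor in compiled
    )


def check_current_device():
    """Return (capability, supported) for the current GPU, or None if
    PyTorch or a CUDA device is not available."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    cap = torch.cuda.get_device_capability()  # e.g. (6, 0) on a P100
    return cap, is_supported(cap)


if __name__ == "__main__":
    print(check_current_device())
```

Under these assumptions a P100 (SM 6.0) is reported as unsupported, which is consistent with the error in the original report.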
I am using CUDA 11.8 and installed the 0.8 binary. When I run the test program batched_unary_embeddings_test.py, I get the following error:
ERROR: test_gpu (__main__.TableBatchedEmbeddingsTest)
Traceback (most recent call last):
File "/repo/fbgemm/fbgemm_gpu/test/batched_unary_embeddings_test.py", line 240, in test_gpu
self._test_main(gpu_infer=True)
File "/y/repo/fbgemm/fbgemm_gpu/test/batched_unary_embeddings_test.py", line 152, in _test_main
offsets_tensor[1:] = torch.ops.fbgemm.asynchronous_inclusive_cumsum(
File "//.conda/envs/torchrec/lib/python3.10/site-packages/torch/_ops.py", line 1061, in __call__
return self._op(*args, **(kwargs or {}))
RuntimeError: CUDA error: invalid device function
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

When I tried compiling the package myself, nvcc reported that it does not support C++20. This is strange: if the release binary supports CUDA 11.8, why can't my nvcc (CUDA 11.8) even recognize C++20?