Crash on Windows #37

Open
maleadt opened this issue Aug 29, 2023 · 9 comments

Comments

@maleadt
Member

maleadt commented Aug 29, 2023

Using Julia 1.11, NVTX#master:

$ julia --project -e 'using NVTX; NVTX.activate()'

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x6be45c40 -- nvtxGlobals_v3 at C:\Users\Tim\.julia\artifacts\b4eeaf094ffb6aacf1b20ee5d2ac9aa1818fc732\bin\libnvToolsExt.dll (unknown line)
in expression starting at none:1
nvtxGlobals_v3 at C:\Users\Tim\.julia\artifacts\b4eeaf094ffb6aacf1b20ee5d2ac9aa1818fc732\bin\libnvToolsExt.dll (unknown line)
Allocations: 2905 (Pool: 2896; Big: 9); GC: 0
@maleadt
Member Author

maleadt commented Aug 29, 2023

It's the initialization that crashes, so I guess there's something wrong with the JLL:

$ julia --project -e 'using NVTX_jll; ccall((:nvtxInitialize, libnvToolsExt), Cvoid, (Ptr{Cvoid},), C_NULL)'

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x6be45c40 -- nvtxGlobals_v3 at C:\Users\Tim\.julia\artifacts\b4eeaf094ffb6aacf1b20ee5d2ac9aa1818fc732\bin\libnvToolsExt.dll (unknown line)
in expression starting at none:1
nvtxGlobals_v3 at C:\Users\Tim\.julia\artifacts\b4eeaf094ffb6aacf1b20ee5d2ac9aa1818fc732\bin\libnvToolsExt.dll (unknown line)
Allocations: 2905 (Pool: 2896; Big: 9); GC: 0
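
One way to narrow this down further is to separate loading the library, looking up the symbol, and making the call, so the crash can be attributed to a single stage. The following is only a sketch and was not run in this thread: `probe_library` is a hypothetical helper name, and the call signature (a single pointer argument, no return value) is taken from the `ccall` above.

```julia
using Libdl

# Probe a shared library in stages: load, symbol lookup, then (optionally) the call.
# Returns a Symbol naming the last stage that was reached.
function probe_library(path::AbstractString, symbol::Symbol; call::Bool=false)
    # With throw_error=false, dlopen returns `nothing` instead of throwing.
    handle = Libdl.dlopen(path; throw_error=false)
    handle === nothing && return :load_failed

    # Likewise, dlsym returns `nothing` when the symbol is not exported.
    fptr = Libdl.dlsym(handle, symbol; throw_error=false)
    fptr === nothing && return :symbol_missing

    call || return :symbol_found

    # A native crash cannot be caught from Julia, so report progress first.
    println("calling ", symbol, " at ", fptr)
    flush(stdout)
    ccall(fptr, Cvoid, (Ptr{Cvoid},), C_NULL)
    return :call_returned
end
```

Assuming `NVTX_jll.libnvToolsExt` resolves to a loadable library name or path, as the `ccall` above suggests, it could be used as `probe_library(NVTX_jll.libnvToolsExt, :nvtxInitialize; call=true)`. If the process dies after the "calling" line is printed, loading and symbol lookup both succeeded and the fault is inside the call itself.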

@simonbyrne
Collaborator

Unfortunately I don't have a Windows machine to test it out.

@maleadt
Member Author

maleadt commented Aug 29, 2023

I'm not really familiar with Windows either, i.e., with how best to debug this. The backtrace is sparse; on GCC 8+ it's absent entirely (just a ReadOnlyError); compiling with -g gives a DLL that Windows refuses to load; and compiling with -gcoff and using WinDbg doesn't give any debug info at all...
I'll just disable NVTX support in CUDA.jl for now. I've also filed a bug with NVIDIA.
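
As an illustration of that kind of workaround, a platform guard could look roughly like the following. This is a hypothetical sketch, not the actual change made in CUDA.jl: `activate_unless_windows` is an invented name, and the activation routine is passed in as an argument so the snippet does not depend on any package.

```julia
# Hypothetical guard: run `activate` only on platforms not known to crash.
# `is_windows` defaults to the current platform but can be overridden for testing.
function activate_unless_windows(activate::Function; is_windows::Bool=Sys.iswindows())
    is_windows && return false   # skipped on Windows
    activate()
    return true                  # activation was attempted
end
```

With NVTX.jl loaded, this would be called as `activate_unless_windows(NVTX.activate)`.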

@mkitti

mkitti commented Aug 29, 2023

I have a Windows computer, and I have some ability to help debug this sort of thing. I don't really know what NVTX is though. Reproduction looks pretty straightforward, at least.

@maleadt
Member Author

maleadt commented Aug 29, 2023

I have a Windows system too; I'm just entirely unfamiliar with how to debug crashes like this on Windows. If you know how: this is how we build nvToolsExt.dll, https://github.com/JuliaPackaging/Yggdrasil/blob/master/N/NVTX/build_tarballs.jl, and a simple ccall to initialize the library is enough to crash it. The sources of the library can be found at https://github.com/NVIDIA/NVTX/. I'm guessing there may be some incompatibility between the library and MinGW (it does pretty low-level things).

@simonbyrne
Collaborator

> I don't really know what NVTX is though.

It's an instrumentation library, which the profiler then intercepts. In C it is used as a header-only library, but we build it as a dynamic library, which may be where the issue is.

The source is here: https://github.com/NVIDIA/NVTX/

@huiyuxie

It looks like only one simple hotfix has been done so far. Is anyone already working on this issue, or planning to take it over?

@maleadt
Member Author

maleadt commented Nov 13, 2024

I've reproduced this in C, and filed an issue with NVIDIA. I'll let you know when I hear back.

@huiyuxie

Thanks 👍

4 participants