[libcu++] Dynamically load CUDA library instead of using the runtime #6899
# if _CCCL_OS(WINDOWS)
HMODULE m_cudaDriverLibrary = LoadLibraryEx("nvcuda.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32);
if (m_cudaDriverLibrary == nullptr)
Critical: this loads the library every single time the function is called. Should this be a function-local static?
# else
const char* m_cudaDriverLibraryName = "libcuda.so.1";
# endif
void* m_cudaDriverLibrary = dlopen(m_cudaDriverLibraryName, RTLD_NOW);
Ditto: should this be a function-local static?
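For reference, here is a minimal sketch of what the function-local static suggested in the two comments above could look like. It is not the PR's code: the helper names (`open_driver_library`, `get_driver_library`, `get_driver_symbol`) and the load counter are hypothetical and exist only for illustration, and `_WIN32` stands in for the `_CCCL_OS(WINDOWS)` check. The library names and flags are the ones shown in the diff.

```cpp
#include <atomic>
#if defined(_WIN32)
#  include <windows.h>
#else
#  include <dlfcn.h>
#endif

// Hypothetical counter, present only to make the "loaded once" behavior observable.
static std::atomic<int> g_driver_load_attempts{0};

// Hypothetical helper: opens the driver library and returns an opaque handle,
// or nullptr if the library cannot be found on this machine.
inline void* open_driver_library() noexcept
{
  ++g_driver_load_attempts;
#if defined(_WIN32)
  return reinterpret_cast<void*>(
    ::LoadLibraryExA("nvcuda.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32));
#else
  return ::dlopen("libcuda.so.1", RTLD_NOW);
#endif
}

// The function-local static is initialized exactly once, and that
// initialization is thread-safe since C++11. Later calls reuse the handle.
inline void* get_driver_library() noexcept
{
  static void* const handle = open_driver_library();
  return handle;
}

// Hypothetical lookup that reuses the cached handle instead of reloading.
inline void* get_driver_symbol(const char* name) noexcept
{
  void* const lib = get_driver_library();
  if (lib == nullptr)
  {
    return nullptr; // driver library not available
  }
#if defined(_WIN32)
  return reinterpret_cast<void*>(::GetProcAddress(static_cast<HMODULE>(lib), name));
#else
  return ::dlsym(lib, name);
#endif
}
```

With this shape, a call such as `get_driver_symbol("cuGetProcAddress")` triggers at most one load attempt per process, regardless of how many times it is called. One trade-off worth noting: a failed load is also cached, so the sketch never retries if the library was missing on the first call.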
We can't support minor version compatibility (MVC) in 12.X releases without breaking the dynamic runtime use case described in #5970. That is because older 12.X releases have no `cudaGetDriverEntryPointByVersion`, and the non-versioned `cudaGetDriverEntryPoint` can't support MVC.

This PR instead dynamically loads the CUDA driver library and fetches `cuGetProcAddress` from it, rather than using `cudaGetDriverEntryPoint`.