DAOS-16355 client: pydaos.torch module (#15475) #15536

Open
mjmac wants to merge 3 commits into google/2.6 from mjmac/DAOS-16355-google-2.6


mjmac (Contributor) commented Nov 25, 2024

Introduce the pydaos.torch module, which allows DAOS POSIX containers
to be used as a data source for the PyTorch framework through the
pydaos.torch.Dataset and pydaos.torch.IterableDataset classes.

Signed-off-by: Denis Barakhtanov <[email protected]>
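
The two class names mirror PyTorch's two dataset styles: a map-style dataset exposes `__len__` and `__getitem__` for random access, while an iterable-style dataset exposes `__iter__` for streaming. The sketch below illustrates only that distinction. It is not the pydaos.torch API: `MapStyleFiles` and `IterableFiles` are hypothetical names, and a local directory stands in for a DAOS POSIX container.

```python
import os


class MapStyleFiles:
    """Hypothetical map-style dataset: random access by index."""

    def __init__(self, root):
        # Scan the tree once up front so that samples can be indexed.
        self._paths = sorted(
            os.path.join(dirpath, name)
            for dirpath, _, names in os.walk(root)
            for name in names
        )

    def __len__(self):
        return len(self._paths)

    def __getitem__(self, idx):
        with open(self._paths[idx], "rb") as fh:
            return fh.read()


class IterableFiles:
    """Hypothetical iterable-style dataset: samples are streamed in order."""

    def __init__(self, root):
        self._root = root

    def __iter__(self):
        # Files are read lazily while walking; no index is built.
        for dirpath, _, names in os.walk(self._root):
            for name in sorted(names):
                with open(os.path.join(dirpath, name), "rb") as fh:
                    yield fh.read()
```

Objects with these protocols can be consumed by a standard PyTorch `DataLoader` when they subclass the corresponding `torch.utils.data` base class; the sketch leaves that out so it runs without PyTorch installed.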


github-actions bot commented Nov 25, 2024

Ticket title is 'pydaos.torch modules'
Status is 'Open'
https://daosio.atlassian.net/browse/DAOS-16355

mjmac changed the title from "mjmac/DAOS 16355 google 2.6" to "DAOS-16355 client: pydaos.torch module (#15475)" on Nov 25, 2024
daosbuild1 (Collaborator)

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15536/1/execution/node/387/log

daosbuild1 (Collaborator)

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15536/1/execution/node/385/log

daosbuild1 (Collaborator)

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15536/1/execution/node/295/log

daosbuild1 (Collaborator)

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15536/1/execution/node/273/log

daosbuild1 (Collaborator)

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15536/1/execution/node/317/log

johannlombardi and others added 3 commits November 25, 2024 18:28
This patch marks all pool and container handles in child processes
after fork as if they had been created with g2l.
This prevents unwanted interactions if one of the child processes
closes a handle.
The marking is done by iterating over all pool and container
handles, which the hhash code did not previously support.

This patch also:
- adds support for fork to pydaos.
- introduces daos_reinit(), to be called after fork.
- fixes the IL to set the atfork callback when no extra event queues are used.
- removes support for creating an event queue for each pydaos
  put/get operation. This makes the global event queue the only
  option. This should probably be moved to a per-thread event queue
  in the future.
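
The handle-marking idea can be sketched in pure Python. This is an illustration under stated assumptions, not the DAOS implementation: `Handle`, `_HANDLES`, `open_handle`, and `close_in_child` are hypothetical names, the list plays the role of the handle hash table, and "g2l" is read here as global-to-local handle conversion. Only `os.register_at_fork`, `os.fork`, and the pipe calls are real standard-library APIs.

```python
import os


class Handle:
    """Hypothetical stand-in for a pool or container handle."""

    def __init__(self, name):
        self.name = name
        self.inherited = False  # set to True in a forked child

    def close(self):
        # An inherited handle only drops its local state; the process
        # that opened it remains responsible for the real disconnect.
        return "local-only" if self.inherited else "full-disconnect"


_HANDLES = []  # hypothetical registry of every open handle


def open_handle(name):
    handle = Handle(name)
    _HANDLES.append(handle)
    return handle


def _mark_inherited_after_fork():
    # Runs in the child only: iterate over all handles and flag them.
    for handle in _HANDLES:
        handle.inherited = True


os.register_at_fork(after_in_child=_mark_inherited_after_fork)


def close_in_child(handle):
    """Fork, close the handle in the child, and return the child's report."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        os.write(write_fd, handle.close().encode())
        os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        report = pipe.read()
    os.waitpid(pid, 0)
    return report
```

With this arrangement a close in the child does not tear down the connection that the parent still relies on, and the parent's own copy of the handle stays unmarked.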

Signed-off-by: Johann Lombardi <[email protected]>

Disable the calls to pthread_atfork() and daos_reinit() in pydaos
until DAOS-16637 is understood.

Signed-off-by: Johann Lombardi <[email protected]>

Introduce the pydaos.torch module, which allows DAOS POSIX containers
to be used as a data source for the PyTorch framework through the
pydaos.torch.Dataset and pydaos.torch.IterableDataset classes.

Signed-off-by: Denis Barakhtanov <[email protected]>
mjmac force-pushed the mjmac/DAOS-16355-google-2.6 branch from 117c94f to e50b3c3 on November 25, 2024 18:29