memory issue on HMLE_TGFb_day_8_10 example from MAGIC #23

Open
adhikarirsr opened this issue Feb 21, 2022 · 3 comments

@adhikarirsr

python -W ignore PreprocessingscGNN.py --datasetName HMLE_TGFb_day_8_10.csv.gz --datasetDir magic_HMLE/ --LTMGDir magic_HMLE/ --filetype CSV --geneSelectnum 2000

gives

MemoryError: Unable to allocate 1.36 TiB for an array with shape (12417, 15044000) and data type int64

I got the file from here

@juexinwang
Owner

The file is huge, which is common in single-cell analysis. You can either find a machine with 1.36 TiB of memory or split the file into parts and process them separately.
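As a quick check of where the 1.36 TiB figure comes from, the sketch below multiplies the reported shape by the 8 bytes an int64 value occupies. The helper name `dense_array_bytes` is made up for illustration and is not part of scGNN.

```python
def dense_array_bytes(rows, cols, itemsize=8):
    """Bytes needed for a dense rows x cols array (int64 = 8 bytes per value)."""
    return rows * cols * itemsize


# Shape taken from the MemoryError message in this issue.
needed = dense_array_bytes(12417, 15044000)
print(f"{needed / 2**40:.2f} TiB")  # prints "1.36 TiB"
```

The same function gives a rough estimate of how much memory a smaller piece of the matrix would need if it were loaded as a dense int64 array.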

@adhikarirsr
Author

How can we split the file and process the parts separately? Is this a limitation of the scGNN method?

@juexinwang
Owner

This is just the preprocessing step; scGNN itself has not run yet.
The shape (12417, 15044000) probably means 15,044,000 cells and 12,417 genes. You usually need a large machine to handle data of that size. Splitting means dividing the file into, say, 100 smaller files of 150,440 cells each and running the imputation on each one individually. You can change the number of parts to fit your machine's memory.
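One way the split described above could look is sketched below using only the Python standard library. It rests on assumptions that are not confirmed in this thread: that the CSV has one row per gene and one column per cell, that the first column holds the gene label, and that the first row is a header of cell names. The function names `column_ranges` and `split_csv_by_columns` and the output file naming are hypothetical, not part of scGNN.

```python
import csv
import gzip


def column_ranges(n_cols, n_parts):
    """Divide n_cols data columns into n_parts contiguous (start, stop) ranges."""
    base, extra = divmod(n_cols, n_parts)
    ranges, start = [], 0
    for i in range(n_parts):
        # The first `extra` parts take one additional column each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def split_csv_by_columns(src, out_prefix, n_parts):
    """Stream a gzipped CSV row by row and write n_parts gzipped CSVs.

    Assumes column 0 is the row label (gene) and the remaining columns
    are cells; every part keeps the label column and all rows.
    """
    paths = [f"{out_prefix}_part{i:03d}.csv.gz" for i in range(n_parts)]
    with gzip.open(src, "rt", newline="") as fin:
        reader = csv.reader(fin)
        header = next(reader)
        ranges = column_ranges(len(header) - 1, n_parts)
        outs = [gzip.open(p, "wt", newline="") for p in paths]
        try:
            writers = [csv.writer(f) for f in outs]

            def write_row(row):
                label, values = row[0], row[1:]
                for writer, (a, b) in zip(writers, ranges):
                    writer.writerow([label] + values[a:b])

            write_row(header)
            for row in reader:  # only one input row is held in memory at a time
                write_row(row)
        finally:
            for f in outs:
                f.close()
    return paths
```

With these assumptions, `column_ranges(15044000, 100)` gives ranges of 150,440 columns each, matching the numbers above, and a call such as `split_csv_by_columns("HMLE_TGFb_day_8_10.csv.gz", "HMLE", 100)` would write 100 part files that can then be preprocessed one at a time. If the file is laid out the other way round (cells as rows), splitting by row blocks would be the equivalent step.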
