memory issue on HMLE_TGFb_day_8_10 example from MAGIC #23

adhikarirsr · 2022-02-21T15:19:20Z

python -W ignore PreprocessingscGNN.py --datasetName HMLE_TGFb_day_8_10.csv.gz --datasetDir magic_HMLE/ --LTMGDir magic_HMLE/ --filetype CSV --geneSelectnum 2000

gives

MemoryError: Unable to allocate 1.36 TiB for an array with shape (12417, 15044000) and data type int64

I got the file from here

juexinwang · 2022-02-22T20:10:06Z

The file is very large, which is common in single-cell analysis. You can either find a machine with about 1.36 TiB of memory, or split the file into parts and process them separately.
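For reference, the 1.36 TiB figure follows directly from the shape and dtype in the error message. A minimal sketch of the arithmetic (the helper name `array_tib` is made up for illustration; int64 is taken to be 8 bytes per element):

```python
def array_tib(rows, cols, itemsize=8):
    """Size in TiB of a dense rows x cols array with `itemsize` bytes per element."""
    return rows * cols * itemsize / 2**40

# Shape and dtype (int64 = 8 bytes) taken from the MemoryError above.
full = array_tib(12417, 15044000)
print(f"{full:.2f} TiB")  # 1.36 TiB

# Splitting the second dimension into k parts shrinks each dense block about k-fold.
k = 100  # hypothetical number of parts
per_part = array_tib(12417, 15044000 // k)
print(f"{per_part * 1024:.1f} GiB per part")
```

The same function can be used to pick the smallest number of parts whose dense block fits in the memory actually available.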

adhikarirsr · 2022-02-22T20:13:29Z

How can we split the file and process the parts separately? Is this a limitation of the scGNN method?

juexinwang · 2022-02-22T20:18:26Z

This is just the preprocessing step; it has not reached scGNN itself yet.
The shape (12417, 15044000) may correspond to 12,417 genes and 15,044,000 cells. A matrix that size usually needs a very large machine. Splitting means dividing the file into, say, 100 smaller files of 150,440 cells each, and then running the imputation on each file individually. You can adjust the number of parts to fit your machine's memory.
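One way to do the split is sketched below with pandas. This is not part of scGNN: the function name `split_columns` and the output naming are made up for illustration, and it assumes the CSV has genes as rows, cells as columns, and gene names in the first column. If your file stores cells as rows, you would chunk by rows instead (for example with the `chunksize` argument of `pandas.read_csv`).

```python
import pandas as pd

def split_columns(path, n_parts, out_prefix):
    """Write the cell columns of a genes-by-cells CSV into about n_parts smaller files.

    Assumes the first column holds gene names and every other column is one cell.
    """
    # Read only the header row so the full matrix is never loaded at once.
    header = pd.read_csv(path, nrows=0).columns.tolist()
    gene_col, cells = header[0], header[1:]
    if not cells:
        raise ValueError("no cell columns found")
    size = -(-len(cells) // n_parts)  # ceiling division: cells per part
    out_files = []
    for start in range(0, len(cells), size):
        block = cells[start:start + size]
        # Parse only this block of columns (plus the gene names).
        part = pd.read_csv(path, usecols=[gene_col] + block, index_col=gene_col)
        out = f"{out_prefix}_part{start // size:03d}.csv.gz"
        part.to_csv(out)  # gzip compression is inferred from the .gz suffix
        out_files.append(out)
    return out_files
```

Calling something like `split_columns("magic_HMLE/HMLE_TGFb_day_8_10.csv.gz", 100, "magic_HMLE/HMLE_TGFb_day_8_10")` would then give files that can each be passed to `PreprocessingscGNN.py` via `--datasetName`. Each pass re-reads the compressed file, so this trades run time for bounded memory. Note also that steps such as gene selection would run per part, so the parts may not select the same genes.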
