OOM when initializing model #5

viktoriaschuster · 2024-06-25T06:40:07Z

Dear Dr. Zhang,

I am trying to run your model for a benchmark on paired multi-omics data. On both my own data and your example data, I run into out-of-memory errors when initializing the model. I have two GPUs available, each with 24 GB of memory.

The error occurs during initialization of TranslateAE (script train_model.py, line 338: self.sess.run(tf.global_variables_initializer());).

This is the error message:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[93283,15172] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [[node translator_yx_px_r_genebatch/kernel/Adam_1/Assign (defined at bin/train_model_edited.py:348) ]].

The model seems to create a peak-by-gene matrix for the translator. Is this the intended behavior, or might I have missed a step in your data preprocessing before running the model?
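For scale, here is a rough estimate of what the tensor in the error message costs in memory. This is a minimal sketch: the shape and the float type come from the traceback, while the assumption that Adam keeps two slot tensors per variable (suggested by the `Adam_1` node name) and the helper name `tensor_gib` are illustrative and not taken from the repository.

```python
# Back-of-the-envelope memory estimate for the tensor named in the OOM error.
BYTES_PER_FLOAT32 = 4


def tensor_gib(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Size in GiB of a dense tensor with the given shape."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element / 2**30


# Shape reported in the ResourceExhaustedError message.
shape = (93283, 15172)
one_copy = tensor_gib(shape)

# Assumption: the kernel plus two Adam slot tensors (first and second moment),
# i.e. three tensors of the same shape. Gradients and activations are not counted.
with_adam_slots = 3 * one_copy

print(f"one copy: {one_copy:.2f} GiB")
print(f"kernel + Adam slots: {with_adam_slots:.2f} GiB")
```

Under these assumptions a single copy is a little over 5 GiB and the kernel with its optimizer slots is close to 16 GiB, so this one layer alone could plausibly exhaust a 24 GB GPU once the rest of the model is allocated.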

Kind regards,
Viktoria


RanZhang08 · 2024-06-25T20:29:26Z

Dear Viktoria,

The code runs on our local machines with 8 GB of GPU memory and a batch size of 16. Please double-check whether caches or other running processes are exhausting the memory.

The translator connects the RNA and ATAC embedding spaces (i.e., [embed_dim_x, embed_dim_y]) rather than the original feature spaces. Could you please check whether the embed_dim_x and embed_dim_y arguments are passed correctly?
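One way to check which variables are actually being created is to rank them by size. The sketch below is illustrative only: `largest_variables` is a hypothetical helper, the first example entry uses the name and shape from the error message, and the second entry is a made-up 20 x 20 placeholder. In a TF1 graph the input pairs could be collected with something like `[(v.name, v.shape.as_list()) for v in tf.global_variables()]` before the initializer is run.

```python
def largest_variables(named_shapes, top=5, bytes_per_element=4):
    """Return the `top` largest (name, shape, MiB) entries, biggest first."""
    sized = []
    for name, shape in named_shapes:
        n_elements = 1
        for dim in shape:
            n_elements *= dim
        sized.append((name, tuple(shape), n_elements * bytes_per_element / 2**20))
    return sorted(sized, key=lambda entry: entry[2], reverse=True)[:top]


# Example input. The first entry is taken from the error message; the second
# is a hypothetical embedding-sized kernel for comparison.
example = [
    ("hypothetical_embedding/kernel", (20, 20)),
    ("translator_yx_px_r_genebatch/kernel", (93283, 15172)),
]

for name, shape, mib in largest_variables(example):
    print(f"{name}: {shape} ~ {mib:.1f} MiB")
```

If a variable with a feature-by-feature shape appears at the top of such a list, it points to a layer sized by the raw peak and gene counts rather than by the embedding dimensions.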

Please let us know if you have any further questions!

Best,
Ran

viktoriaschuster · 2024-06-27T08:12:19Z

Dear Ran,

No other processes are using GPU memory when I run the model.

I have used the default parameters except for the embedding dimensions, which I set to match the benchmarked models:
embed_dim_x = 20, embed_dim_y = 20

Best,
Viktoria
