This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

'RuntimeError('unable to open shared memory object </torch_175312_2031012121> in read-write mode')' #7

Open
ruxintan opened this issue May 9, 2020 · 1 comment

ruxintan commented May 9, 2020

Hello, when I run train.py, I get the following error: "Reason: 'RuntimeError('unable to open shared memory object </torch_175312_2031012121> in read-write mode')'"

I don't know why. Setting num_worker=0 still does not help. Could you kindly help me?

Thank you

ruxintan commented May 9, 2020

The log is:
Let's use 0 GPUs!
CONFIG:

"CTC": false,
"abspos": false,
"arMode": "LSTM",
"batchSizeGPU": 8,
"beta1": 0.9,
"beta2": 0.999,
"cpc_mode": null,
"debug": false,
"dropout": false,
"encoder_type": "cpc",
"epsilon": 1e-08,
"file_extension": "flac",
"hiddenEncoder": 256,
"hiddenGar": 256,
"ignore_cache": false,
"learningRate": 0.0002,
"load": null,
"loadCriterion": false,
"logging_step": 1000,
"max_size_loaded": 4000000000,
"nEpoch": 200,
"nGPU": 0,
"nLevelsGRU": 1,
"nLevelsPhone": 1,
"nPredicts": 12,
"n_process_loader": 8,
"negativeSamplingExt": 128,
"normMode": "layerNorm",
"onEncoder": false,
"pathCheckpoint": "/data3/trx/CPC_audio/checkpoint",
"pathDB": "/data3/trx/dataset/LibriSpeech/train-clean-100",
"pathPhone": null,
"pathTrain": "/data3/trx/dataset/train-clean-100-set.txt",
"pathVal": "/data3/trx/dataset/dev-clean-set.txt",
"random_seed": 1919925056,
"restart": false,
"rnnMode": "transformer",
"samplingType": "samespeaker",
"save_step": 5,
"schedulerRamp": null,
"schedulerStep": -1,
"sizeWindow": 20480,
"speakerEmbedding": 0,
"supervised": false

Ran in an error while loading /data3/trx/dataset/LibriSpeech/train-clean-100/_seqs_cache.txt: [Errno 2] No such file or directory: '/data3/trx/dataset/LibriSpeech/train-clean-100/_seqs_cache.txt'
Could not load cache, rebuilding
837it [00:00, 6894.27it/s]
Saved cache file at /data3/trx/dataset/LibriSpeech/train-clean-100/_seqs_cache.txt
Found files: 28539 seqs, 251 speakers

Loading audio data at /data3/trx/dataset/LibriSpeech/train-clean-100
Loading the training dataset
Checking length...
28539it [00:00, 1905918.99it/s]
Done, elapsed: 1.102 seconds
Scanned 28539 sequences in 1.10 seconds
2 chunks computed
Checking length...
28539it [00:00, 1902798.40it/s]
Done, elapsed: 0.234 seconds
Scanned 28539 sequences in 0.23 seconds
2 chunks computed
Joining pool
Joined process, elapsed=67.338 secs
Traceback (most recent call last):
  File "cpc/train.py", line 494, in <module>
    main(args)
  File "cpc/train.py", line 287, in main
    MAX_SIZE_LOADED=args.max_size_loaded)
  File "/data3/trx/CPC_audio/cpc/dataset.py", line 65, in __init__
    self.loadNextPack()
  File "/data3/trx/CPC_audio/cpc/dataset.py", line 129, in loadNextPack
    self.nextData = self.r.get()
  File "/data3/trx/anaconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[(242, '7264-92314-0020', tensor([-0.0010, -0.0009, -0.0011, ..., 0.0015, 0.0016, 0.0019])), (230, '211-122442-0031', tensor([0.0061, 0.0146, 0.0129, ..., 0.0163, 0.0156, 0.0262])), (59, '39-121914-0012', tensor([-0.0017, -0.0016, -0.0011, ..., -0.0019, -0.0042, 0.0010])), (218, '7367-86737-0066', tensor([9.1553e-05, 6.4087e-04, 1.2207e-04, ..., 5.1880e-04, 3.6621e-04,
....................................]'. Reason: 'RuntimeError('unable to open shared memory object </torch_175312_2031012121> in read-write mode')'
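
A possible starting point for diagnosis, not a confirmed fix. The traceback shows the failure inside multiprocessing.pool, called from cpc/dataset.py, and the config lists "n_process_loader": 8. That may be why changing num_worker alone makes no difference, though this is an assumption based only on the log above. Errors of this kind are often linked to the open-file limit or to the space available in shared memory, so the sketch below prints both. The helper name shared_memory_report and the default path /dev/shm are illustrative choices, not part of this repository, and the resource module is only available on Unix-like systems.

```python
import os
import resource
import shutil


def shared_memory_report(shm_path="/dev/shm"):
    """Collect limits that often matter when processes exchange data via shared memory.

    shm_path is an assumption: /dev/shm is the usual mount point on Linux.
    """
    # Soft and hard limits on the number of open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    report = {"nofile_soft": soft, "nofile_hard": hard}

    if os.path.isdir(shm_path):
        usage = shutil.disk_usage(shm_path)
        report["shm_total_bytes"] = usage.total
        report["shm_free_bytes"] = usage.free
    else:
        # The mount point is missing, so there is nothing to measure.
        report["shm_total_bytes"] = None
        report["shm_free_bytes"] = None
    return report


if __name__ == "__main__":
    for key, value in shared_memory_report().items():
        print(f"{key}: {value}")
```

If the soft open-file limit looks low, raising it with ulimit -n before launching train.py is one thing to try. Calling torch.multiprocessing.set_sharing_strategy("file_system") near the top of the script is another. Neither has been verified against this code base.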
