Hi there, thanks for your great work! I am trying to fine-tune SAM2 on my own dataset, but first I tried fine-tuning it on MOSE. I set up the environment, downloaded the MOSE dataset, and changed the paths in sam2.1_hiera_b+_MOSE_finetune.yaml. I am using Python 3.12.4 + torch 2.3.1 + CUDA 12.2. I encountered the following error:
[rank0]: Traceback (most recent call last):
[rank0]: File "/scratch/hp2173/sam2/training/train.py", line 270, in <module>
[rank0]: main(args)
[rank0]: File "/scratch/hp2173/sam2/training/train.py", line 240, in main
[rank0]: single_node_runner(cfg, main_port)
[rank0]: File "/scratch/hp2173/sam2/training/train.py", line 53, in single_node_runner
[rank0]: single_proc_run(local_rank=0, main_port=main_port, cfg=cfg, world_size=num_proc)
[rank0]: File "/scratch/hp2173/sam2/training/train.py", line 41, in single_proc_run
[rank0]: trainer.run()
[rank0]: File "/scratch/hp2173/sam2/training/trainer.py", line 515, in run
[rank0]: self.run_train()
[rank0]: File "/scratch/hp2173/sam2/training/trainer.py", line 532, in run_train
[rank0]: outs = self.train_epoch(dataloader)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/scratch/hp2173/sam2/training/trainer.py", line 740, in train_epoch
[rank0]: for data_iter, batch in enumerate(train_loader):
[rank0]: File "/scratch/hp2173/sam2/training/dataset/sam2_datasets.py", line 64, in __next__
[rank0]: raise e
[rank0]: File "/scratch/hp2173/sam2/training/dataset/sam2_datasets.py", line 56, in __next__
[rank0]: item = next(self._iter_dls[dataset_idx])
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/ext3/miniconda3/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
[rank0]: data = self._next_data()
[rank0]: ^^^^^^^^^^^^^^^^^
[rank0]: File "/ext3/miniconda3/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
[rank0]: return self._process_data(data)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/ext3/miniconda3/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
[rank0]: data.reraise()
[rank0]: File "/ext3/miniconda3/lib/python3.12/site-packages/torch/_utils.py", line 706, in reraise
[rank0]: raise exception
[rank0]: UnboundLocalError: Caught UnboundLocalError in DataLoader worker process 0.
[rank0]: Original Traceback (most recent call last):
[rank0]: File "/ext3/miniconda3/lib/python3.12/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
[rank0]: data = fetcher.fetch(index) # type: ignore[possibly-undefined]
[rank0]: ^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/ext3/miniconda3/lib/python3.12/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
[rank0]: data = [self.dataset[idx] for idx in possibly_batched_index]
[rank0]: ~~~~~~~~~~~~^^^^^
[rank0]: File "/scratch/hp2173/sam2/training/dataset/utils.py", line 104, in __getitem__
[rank0]: return self.dataset[self.epoch_ids[idx]]
[rank0]: ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/ext3/miniconda3/lib/python3.12/site-packages/torch/utils/data/dataset.py", line 350, in __getitem__
[rank0]: return self.datasets[dataset_idx][sample_idx]
[rank0]: ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
[rank0]: File "/scratch/hp2173/sam2/training/dataset/vos_dataset.py", line 132, in __getitem__
[rank0]: return self._get_datapoint(idx)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/scratch/hp2173/sam2/training/dataset/vos_dataset.py", line 74, in _get_datapoint
[rank0]: datapoint = self.construct(video, sampled_frms_and_objs, segment_loader)
[rank0]: ^^^^^
[rank0]: UnboundLocalError: cannot access local variable 'video' where it is not associated with a value
There are also some warnings like:
WARNING:root:Loading failed (id=790); Retry 0 with exception: invalid literal for int() with base 10: '._00078'
WARNING:root:Loading failed (id=1097); Retry 1 with exception: invalid literal for int() with base 10: '._00002'
WARNING:root:Loading failed (id=712); Retry 2 with exception: invalid literal for int() with base 10: '._00031'
WARNING:root:Loading failed (id=43); Retry 3 with exception: invalid literal for int() with base 10: '._00050'
WARNING:root:Loading failed (id=941); Retry 4 with exception: invalid literal for int() with base 10: '._00031'
WARNING:root:Loading failed (id=522); Retry 5 with exception: invalid literal for int() with base 10: '._00002'
WARNING:root:Loading failed (id=1237); Retry 6 with exception: invalid literal for int() with base 10: '._00078'
WARNING:root:Loading failed (id=220); Retry 7 with exception: invalid literal for int() with base 10: '._00050'
WARNING:root:Loading failed (id=754); Retry 8 with exception: invalid literal for int() with base 10: '._00002'
WARNING:root:Loading failed (id=266); Retry 9 with exception: invalid literal for int() with base 10: '._00031'
WARNING:root:Loading failed (id=134); Retry 10 with exception: invalid literal for int() with base 10: '._00031'
I think this error occurred during data loading, but I am not sure whether it is related to the dataset I downloaded, even though I fetched MOSE from its official repo. Could you help me out?
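One thing that may be worth checking: the names in the warnings above (`._00078`, `._00002`) begin with `._`, which is the naming pattern of the AppleDouble sidecar files that macOS writes when copying or archiving onto non-Mac file systems. If such files sit next to the frames, a loader that converts file names to integers would fail on them as shown, and the final UnboundLocalError would be consistent with every retry failing before `video` is assigned. This is an assumption, not something confirmed here. A minimal sketch for listing (and optionally deleting) such files, where `dataset_root` and both helper names are hypothetical:

```python
from pathlib import Path


def find_appledouble_files(root):
    """Return sorted paths under `root` whose file name starts with '._'."""
    return sorted(p for p in Path(root).rglob("._*") if p.is_file())


def remove_appledouble_files(root, dry_run=True):
    """List the '._*' files under `root`; delete them only when dry_run is False."""
    stray = find_appledouble_files(root)
    for path in stray:
        if not dry_run:
            path.unlink()
    return stray


if __name__ == "__main__":
    # Hypothetical dataset location; replace with the path set in the YAML config.
    dataset_root = "/path/to/MOSE/train"
    for path in remove_appledouble_files(dataset_root, dry_run=True):
        print(path)
```

Running with `dry_run=True` first only prints the matches, so the list can be checked before anything is deleted.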