normal v2 pretrained model not working #53

Open
yuyingyeh opened this issue Jul 19, 2023 · 6 comments

@yuyingyeh

Hi! I have tried your code to predict surface normals using the latest v2 model checkpoint, but the outputs are all NaN. I have tried both the model downloaded by the script below and the one from Google Drive.
sh ./tools/download_surface_normal_models.sh

The results can be reproduced with this command:
python demo.py --task normal --img_path assets/demo/test1.png --output_path assets/

I have uncommented the lines below to use the v1 model instead and there is no issue. Could you check the released v2 weights? Thank you!

# pretrained_weights_path = root_dir + 'omnidata_unet_normal_v1.pth'
# model = UNet(in_channels=3, out_channels=3)
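
A quick way to tell whether the NaNs come from the released checkpoint itself or only appear during the forward pass is to inspect the downloaded weights directly. A minimal sketch (the checkpoint path is an assumption based on the download script; adjust it to wherever the weights were saved):

import torch

# Assumed location of the v2 normal checkpoint fetched by download_surface_normal_models.sh
ckpt_path = './pretrained_models/omnidata_dpt_normal_v2.ckpt'

# Load on CPU so the check is independent of any GPU or driver issue
checkpoint = torch.load(ckpt_path, map_location='cpu')
state_dict = checkpoint['state_dict'] if isinstance(checkpoint, dict) and 'state_dict' in checkpoint else checkpoint

# If any weight tensor already contains NaN, the released file is broken;
# otherwise the NaNs are produced at inference time
bad = [k for k, v in state_dict.items() if torch.is_tensor(v) and torch.isnan(v).any()]
print('tensors containing NaN:', bad if bad else 'none')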

@jens-nau

jens-nau commented Jul 21, 2023

I have the same problem. Loading the model to the CPU instead of the GPU seems to work, but is very slow.
After some testing, I found that the problem only seems to occur on GPUs with certain, perhaps older, architectures. On my Ampere-based RTX 3080 everything works fine, but running the same code on a Pascal-based GTX 1050 Ti produces NaN predictions.
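
For reference, a minimal sketch of that workaround (illustrative only; the model and input variable names are placeholders, not the actual demo.py code): run the forward pass on the GPU first and fall back to the CPU if the result contains NaN.

import torch

def predict_normals(model, img_tensor):
    # Prefer the GPU when available
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device).eval()
    with torch.no_grad():
        out = model(img_tensor.to(device))
    # On some GPUs the output is all NaN; retry on the CPU (slow but correct)
    if device.type == 'cuda' and torch.isnan(out).any():
        model = model.to('cpu')
        with torch.no_grad():
            out = model(img_tensor.to('cpu'))
    return out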

@alexsax
Collaborator

alexsax commented Jul 21, 2023 via email

@yuyingyeh
Author

> I have the same problem. Loading the model to the CPU instead of the GPU seems to work, but is very slow. After some testing, I found that the problem only seems to occur on GPUs with certain or perhaps old architectures. On my Ampere-based RTX 3080 everything works fine, but when running the same code on a Pascal-based GTX 1050 Ti the model predicts NaN values.

Thanks for tracking down the problem! I have also tested on another machine and it works there!

The command used to test:

cd omnidata/omnidata_tools/torch
python demo.py --task normal --img_path assets/demo/test1.png --output_path assets/

What I have tried:

  1. [Not working] Ubuntu docker + Turing-based RTX 2080 Ti
  2. [Working] Windows 11 + Ada Lovelace-based RTX 4090 + Anaconda

@zzt76

zzt76 commented Nov 1, 2023

I face the same problem when using the normal model:

  1. [WORKING] Win11 + RTX 4060
  2. [NOT WORKING] Ubuntu + V100

@Totoro97

I face the same problem:

[WORKING] Ubuntu + RTX 3090, PyTorch 2.0.1, CUDA 11.8
[NOT WORKING] Ubuntu + V100, PyTorch 2.0.1, CUDA 11.8

@haotongl

[NOT WORKING] Ubuntu + V100, PyTorch 2.0.1, CUDA 11.8
