Need to reorder the flow tensor when transpose_image is true? #57

Open
tomyoung903 opened this issue Nov 20, 2024 · 3 comments

Comments

@tomyoung903

In postprocess_size() in train_stage1.py:

When transpose_img is set to true, we need to reorder the flow channels like this:

flows[:, [0, 1]] = flows[:, [1, 0]]

This makes sure that flows[:, 0] corresponds to width and flows[:, 1] to height.

Right?

@tomyoung903 tomyoung903 changed the title Need to reorder the flow tensor when transpose_image is set to true? Need to reorder the flow tensor when transpose_image is true? Nov 20, 2024
@MyNiuuu
Owner

MyNiuuu commented Nov 20, 2024

Hi! Thank you for your interest in our work!

The function postprocess_size() is used in get_optical_flows() (https://github.com/MyNiuuu/MOFA-Video/blob/main/Training/train_stage1.py#L113), where the Unimatch model is used to estimate optical flows. Since Unimatch was trained on images whose width is greater than their height, we transpose the image in the preprocess_size() function if its height exceeds its width. This ensures that the images fed into the model always have width greater than height, matching the data format used during training and giving accurate flow estimation.

During the post-processing stage, in the postprocess_size() function, if the transpose_img flag is true, it indicates that the image was transposed during preprocessing. Therefore, we need to transpose the flow again to ensure that the flow's orientation matches the original image orientation. Note that this operation has already been done in the postprocess_size() function (https://github.com/MyNiuuu/MOFA-Video/blob/main/Training/train_stage1.py#L106).

Therefore, when we use the optical flow for our model in the subsequent code, we do not need to worry about whether the image was transposed, as the flow directions have already been properly adjusted in the post-processing function.
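
For readers following along, here is a minimal sketch of the logic described above. It is not the exact code from train_stage1.py (the real preprocess_size()/postprocess_size() also handle resizing and padding, which is omitted here); it only illustrates the transpose handling:

```python
import torch

def preprocess_size_sketch(image: torch.Tensor):
    # image: [B, C, H, W]. Transpose portrait frames so Unimatch always
    # sees width >= height, and remember that we did so.
    _, _, h, w = image.shape
    transpose_img = h > w
    if transpose_img:
        image = image.transpose(-2, -1)  # now [B, C, W, H]
    return image, transpose_img

def postprocess_size_sketch(flows: torch.Tensor, transpose_img: bool):
    # flows: [B, 2, H', W'] predicted on the (possibly transposed) frames.
    # If the image was transposed, transpose the flow field back so its
    # spatial layout matches the original image orientation.
    if transpose_img:
        flows = flows.transpose(-2, -1)
    return flows
```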

@tomyoung903
Author

No, it's not about transposing the flow tensor.

It's about reordering the dimension that has size 2 (flows[:, [0, 1]] = flows[:, [1, 0]]) so that the first slice always corresponds to width and the second one corresponds to height.

Right now, if the video is in portrait mode, flows[:, 0] corresponds to height. This cannot be right for later operations.

But I suppose this issue did not hurt your training, because most WebVid-10M videos are landscape.
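
In other words, a hedged sketch of the proposed fix (assuming flows is [B, 2, H, W] with channel 0 = horizontal/u and channel 1 = vertical/v): the post-processing would need both the spatial transpose and a channel swap.

```python
import torch

def postprocess_size_fixed_sketch(flows: torch.Tensor, transpose_img: bool):
    # flows: [B, 2, H', W'] predicted on the transposed frames.
    if transpose_img:
        flows = flows.transpose(-2, -1)  # spatial layout back to the original orientation
        flows = flows[:, [1, 0]]         # swap (u, v): channel 0 is "along width" again
    return flows
```

The in-place form flows[:, [0, 1]] = flows[:, [1, 0]] quoted above performs the same channel swap.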

@MyNiuuu
Owner

MyNiuuu commented Nov 20, 2024

Oh, I understand your point.

I think you are right; we indeed need to perform flows[:, [0, 1]] = flows[:, [1, 0]] when the image is transposed for optical flow prediction.

I originally copied the optical flow prediction code from Unimatch:

https://github.com/autonomousvision/unimatch/blob/master/evaluate_flow.py#L642

from line 714 to line 760.

It is a little weird that the original script does not reorder the predicted flow with flows[:, [0, 1]] = flows[:, [1, 0]], and instead only transposes the flow in line 758.

Maybe I missed something?
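
As a quick sanity check of the point under discussion (a toy example assuming the [B, 2, H, W] layout with channel 0 = u), a spatial transpose alone does not touch the channel order:

```python
import torch

# Flow predicted on a transposed (landscape-ified) frame: put all motion in
# channel 0, i.e. along that frame's width, which is the original frame's height.
flow_t = torch.zeros(1, 2, 4, 6)
flow_t[:, 0] = 1.0

flow_back = flow_t.transpose(-2, -1)  # undo only the spatial transpose
print(flow_back.shape)                # torch.Size([1, 2, 6, 4])
print(flow_back[0, 0].abs().max())    # tensor(1.) -- the motion is still in channel 0,
                                      # which now points along the original height,
                                      # so the channels also need to be swapped.
```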
