Add a transformation for resizing bounding boxes independently of their corresponding images. #1913

Open
vladyskai opened this issue Sep 4, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@vladyskai

Feature description

Currently, Albumentations provides various augmentations that transform images and their associated bounding boxes together. However, there is no option to resize the bounding boxes independently of the rest of the image. For example, I have a dataset with symbols that are all of similar size, and I would like to vary the size of the symbols relative to their images to produce a more robust model.

Motivation and context

I am training a YOLO model to recognize symbols in real engineering diagrams. The same symbol can vary a lot in size between diagrams, especially in very crowded ones where the usual size does not fit. My dataset did not include smaller versions of these symbols, so I tried simply resizing the images in my original dataset. That does not work: YOLO fixes the size of the images it is trained on, so the size of the symbols relative to the size of the images stays the same. To train YOLO properly, I would need a dataset in which the symbols are much smaller relative to their images. This is the case I'm working on, and I would love Albumentations to implement this.

Possible implementation
The feature could be implemented as a new transformation class, such as ResizeBoundingBoxes, that applies resizing operations only to bounding boxes. The class could take parameters for the resizing scale, aspect ratio, and target dimensions.
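
To make the proposal concrete, here is a minimal sketch of the coordinate side of such a transform, assuming normalized YOLO-format boxes `(x_center, y_center, width, height)`. The function name `scale_yolo_bbox` is hypothetical and is not part of the Albumentations API; it only changes the box coordinates and does not touch any pixels.

```python
from typing import Tuple

# Assumed box format: YOLO (x_center, y_center, width, height), normalized to [0, 1].
BBox = Tuple[float, float, float, float]


def scale_yolo_bbox(bbox: BBox, scale: float) -> BBox:
    """Scale a normalized YOLO box about its center, clipping it to the image.

    Hypothetical helper for illustration only; not an Albumentations function.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    xc, yc, w, h = bbox
    new_w, new_h = w * scale, h * scale
    # Work with corners so that clipping to the image bounds is straightforward.
    x_min = max(0.0, xc - new_w / 2)
    y_min = max(0.0, yc - new_h / 2)
    x_max = min(1.0, xc + new_w / 2)
    y_max = min(1.0, yc + new_h / 2)
    # Convert back to (x_center, y_center, width, height).
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min)
```

For example, `scale_yolo_bbox((0.5, 0.5, 0.2, 0.4), 0.5)` halves the width and height while keeping the center at `(0.5, 0.5)`. On its own this leaves the pixels unchanged, so a real transform would presumably also have to resize the symbol's pixels.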

@vladyskai vladyskai added the enhancement New feature or request label Sep 4, 2024
@KDeser

KDeser commented Oct 4, 2024

Would this work like Copy Paste, where the augmentation is shrinking or enlarging the pixels within the original bbox and then pasting the new pixels centered on their original location? How would you propose blending them into the background?

Or are you saying to simply modify the bbox coordinates but not to change the pixels? If so, isn’t that basically just introducing label noise?
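
The first interpretation (shrinking the pixels inside the box and pasting them back centered on the original location) could be sketched roughly as below. This is an illustration under stated assumptions, not an existing Albumentations transform: the function name `rescale_object_in_place` is hypothetical, boxes are assumed to be integer pixel coordinates `(x_min, y_min, x_max, y_max)`, only shrinking is handled, resizing uses plain nearest-neighbour sampling, and the vacated area is filled with a constant `fill_value` instead of being blended, which may only be acceptable for diagrams on a uniform background.

```python
import numpy as np


def rescale_object_in_place(image, bbox, scale, fill_value=255):
    """Shrink the pixels inside `bbox` and paste them back centered.

    Hypothetical helper for illustration only.
    bbox: (x_min, y_min, x_max, y_max) in integer pixel coordinates.
    scale: factor in (0, 1]; enlarging is not handled in this sketch.
    Returns the new image and the new, smaller bounding box.
    """
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    x_min, y_min, x_max, y_max = bbox
    patch = image[y_min:y_max, x_min:x_max]
    h, w = patch.shape[:2]
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))

    # Nearest-neighbour downsampling: pick evenly spaced source rows and columns.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    small = patch[rows[:, None], cols]

    out = image.copy()
    # Assumption: a constant fill stands in for proper background blending.
    out[y_min:y_max, x_min:x_max] = fill_value

    # Paste the shrunken patch centered inside the original box.
    top = y_min + (h - new_h) // 2
    left = x_min + (w - new_w) // 2
    out[top:top + new_h, left:left + new_w] = small
    return out, (left, top, left + new_w, top + new_h)
```

Because the pixels and the returned box change together, the label stays consistent with the image, which avoids the label-noise concern raised for the coordinates-only interpretation. How to blend the pasted patch into a non-uniform background remains an open question.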
