Feature description
Currently, Albumentations provides various augmentations that transform images and their associated bounding boxes simultaneously. However, there is no option to resize the bounding boxes independently of the rest of the image. For example, my dataset contains symbols that are all roughly the same size, but I would like to vary the size of the symbols relative to their images to produce a more robust model.
Motivation and context
I am training a YOLO model to recognize symbols inside real engineering diagrams. The same symbol can vary considerably in size, especially in very crowded diagrams where the usual size does not fit. My dataset did not include smaller versions of these symbols, so I tried simply resizing my original images. That does not work, however, because YOLO rescales all training images to a fixed resolution, so the size of the symbols relative to the size of the images stays the same. To train YOLO properly, I would need a dataset in which the symbols are much smaller relative to their images. This is the use case I'm working on, and I would love to see this feature implemented.
Possible implementation
The feature could be implemented as a new transformation class, such as ResizeBoundingBoxes, that applies resizing operations only to bounding boxes. The class could take parameters for the resizing scale, aspect ratio, and target dimensions.
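As a rough illustration, here is a minimal sketch of such a transform against the classic Albumentations 1.x `DualTransform` interface. The class name `ResizeBoundingBoxes`, the `scale_limit` parameter, and the center-anchored scaling are assumptions, not an existing API; note also that this variant changes only the coordinates, which is exactly the point questioned in the comment below.

```python
import random

import albumentations as A
from albumentations.core.transforms_interface import DualTransform


class ResizeBoundingBoxes(DualTransform):
    """Hypothetical transform: rescale each bbox about its center without
    touching the image pixels. A sketch against the Albumentations 1.x
    interface; newer versions use a different bbox hook."""

    def __init__(self, scale_limit=(0.8, 1.2), always_apply=False, p=0.5):
        super().__init__(always_apply, p)
        self.scale_limit = scale_limit

    def get_params(self):
        # Draw one scale per call, shared by all boxes in the sample.
        return {"scale": random.uniform(*self.scale_limit)}

    def apply(self, img, **params):
        # Image pixels are deliberately left unchanged.
        return img

    def apply_to_bbox(self, bbox, scale=1.0, **params):
        # Albumentations passes a normalized (x_min, y_min, x_max, y_max) tuple.
        x_min, y_min, x_max, y_max = bbox
        cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
        half_w = (x_max - x_min) / 2 * scale
        half_h = (y_max - y_min) / 2 * scale
        # Clamp to the unit square so boxes stay inside the image.
        return (max(cx - half_w, 0.0), max(cy - half_h, 0.0),
                min(cx + half_w, 1.0), min(cy + half_h, 1.0))

    def get_transform_init_args_names(self):
        return ("scale_limit",)


# Hypothetical usage (YOLO-format labels assumed):
transform = A.Compose(
    [ResizeBoundingBoxes(scale_limit=(0.5, 1.0), p=1.0)],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)
```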
Would this work like Copy-Paste, where the augmentation shrinks or enlarges the pixels within the original bbox and then pastes the new pixels centered on their original location? How would you propose blending them into the background?
Or are you saying to simply modify the bbox coordinates but not to change the pixels? If so, isn’t that basically just introducing label noise?
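For the first reading (a Copy-Paste-style operation on the pixels), a self-contained sketch outside Albumentations might look like the following. The helper name `shrink_symbol`, the flat background fill, and the pixel-coordinate bbox format are all assumptions; the fill is a stand-in for whatever blending strategy is chosen (a flat fill is plausible for white engineering diagrams, much less so for natural images).

```python
import cv2
import numpy as np


def shrink_symbol(image: np.ndarray, bbox, scale: float = 0.5, fill=255):
    """Hypothetical helper: resize the pixels inside `bbox`
    (x_min, y_min, x_max, y_max, in pixels), paste them back centered on
    the original location, and return the new image and the new bbox.
    Assumes 0 < scale <= 1; enlarging would additionally need clipping
    against image borders and neighboring symbols."""
    x1, y1, x2, y2 = map(int, bbox)
    patch = image[y1:y2, x1:x2].copy()
    new_w = max(1, int((x2 - x1) * scale))
    new_h = max(1, int((y2 - y1) * scale))
    patch = cv2.resize(patch, (new_w, new_h), interpolation=cv2.INTER_AREA)
    out = image.copy()
    # Erase the original symbol with a flat background value; real blending
    # (inpainting, sampling nearby background, etc.) is the open question.
    out[y1:y2, x1:x2] = fill
    # Paste the shrunken patch centered on the old bbox center.
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    nx1, ny1 = cx - new_w // 2, cy - new_h // 2
    out[ny1:ny1 + new_h, nx1:nx1 + new_w] = patch
    return out, (nx1, ny1, nx1 + new_w, ny1 + new_h)
```

Unlike the coordinate-only variant, this keeps the labels consistent with the pixels, which avoids the label-noise problem.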