Improving object detection models for underwater environments using YOLOv8 and advanced attention mechanisms
This project improves underwater object detection by training state-of-the-art models, YOLOv8 and YOLOv7, on a large corpus of underwater images and integrating an advanced attention mechanism. Adding the Residual Convolutional Block Attention Module (RESCBAM) to the YOLOv8 backbone yielded a 3-5% improvement in detection accuracy over YOLOv7, making YOLOv8 with RESCBAM a new state-of-the-art model for underwater object detection.
- 📊 Performance Improvement: Achieved a 3-5% accuracy improvement over the previous model, YOLOv7.
- 🔍 Advanced Attention Mechanisms: Researched and integrated RESCBAM for better feature extraction and model optimization.
- 🧠 Deep Learning Models: Trained YOLOv8/YOLOv7 on a large-scale underwater dataset, enhancing detection capabilities in complex underwater environments.
- Dataset: Utilized a large corpus of underwater images containing various marine objects and species.
- Model Training: Trained both YOLOv8 and YOLOv7 models, with modifications to the architecture and attention mechanisms.
- Attention Mechanism: Integrated the Residual Convolutional Block Attention Module (RESCBAM) into the YOLOv8 backbone, enhancing the model’s ability to focus on key features in the images.
- Evaluation: Compared the performance of YOLOv8 and YOLOv7, showing that YOLOv8 with RESCBAM outperformed YOLOv7 by 3-5% in accuracy.
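
To illustrate the attention mechanism described above, here is a minimal PyTorch sketch of a residual CBAM block. It is not the project's actual implementation. The class names (`ChannelAttention`, `SpatialAttention`, `ResCBAM`), the reduction ratio of 16, and the 7x7 spatial kernel are assumptions taken from the commonly used CBAM design, and where the block sits in the YOLOv8 backbone is not shown.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Channel attention: a shared MLP over average- and max-pooled features."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)  # assumed reduction ratio
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.ReLU(),
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)  # shape: (N, C, 1, 1)


class SpatialAttention(nn.Module):
    """Spatial attention: a conv over channel-wise average and max maps."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = torch.mean(x, dim=1, keepdim=True)
        mx = torch.amax(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # (N, 1, H, W)


class ResCBAM(nn.Module):
    """CBAM with a residual (skip) connection around the attention path."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.channel_att = ChannelAttention(channels, reduction)
        self.spatial_att = SpatialAttention(kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = x * self.channel_att(x)      # reweight channels
        out = out * self.spatial_att(out)  # reweight spatial positions
        return x + out                     # residual connection keeps the input signal


if __name__ == "__main__":
    block = ResCBAM(channels=64)
    features = torch.randn(1, 64, 40, 40)  # hypothetical backbone feature map
    print(block(features).shape)
```

The block preserves the input shape, so in principle it can be placed after any backbone stage without changing the layers around it.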
- Accuracy: 3-5% improvement in object detection accuracy over YOLOv7.
- State-of-the-Art: Established YOLOv8 with RESCBAM integration as the new state of the art for underwater object detection.
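
The results above report "accuracy" without naming the exact metric. Detection metrics are usually built on intersection over union (IoU) between predicted and ground-truth boxes, so the sketch below shows that building block plus a simple precision/recall count. It is only an illustration and not the project's evaluation script: the function names, the greedy matching, the 0.5 IoU threshold, and the example boxes are all assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_threshold=0.5):
    """Greedily match each prediction to its best unmatched ground-truth box."""
    matched = set()
    true_positives = 0
    for pred in preds:
        best_index, best_iou = -1, 0.0
        for index, truth in enumerate(truths):
            if index in matched:
                continue
            value = iou(pred, truth)
            if value > best_iou:
                best_index, best_iou = index, value
        if best_index >= 0 and best_iou >= iou_threshold:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical boxes for a single image.
    predictions = [(0, 0, 2, 2), (10, 10, 12, 12)]
    ground_truth = [(0, 0, 2, 2)]
    print(precision_recall(predictions, ground_truth))
```

Running the same comparison for both models on one validation split, with the same threshold, is what makes a figure like "3-5%" comparable between YOLOv7 and YOLOv8.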
- YOLOv8/YOLOv7: State-of-the-art object detection models.
- RESCBAM: Residual Convolutional Block Attention Module for feature enhancement.
- Python: Programming language for model training and evaluation.
- PyTorch: Deep learning framework used for model development and training.
- OpenCV: For image preprocessing and augmentation.
- Further fine-tuning of the attention mechanisms for more diverse underwater environments.
- Exploring additional datasets for broader generalization across different underwater conditions.
Feel free to contribute by submitting issues or pull requests. All contributions are welcome!
This project is licensed under the MIT License. See the LICENSE file for more details.