There are over 400 million acres of farmland in the United States, which traditionally must be monitored on foot to detect anomalies, an extremely time-consuming task. A farmland anomaly is any object, region, or event that disrupts the normal growth stages of crops and which, if left unchecked, can drastically decrease the yield of a farm.
Some of the farmland anomalies most harmful to potential crop yield are:
- clusters of weeds, which inhibit crop growth and nutrient gains,
- stagnant water, which can serve as a breeding ground for harmful bacteria and pests,
- unintended waterways, which can destroy plants in their paths, and
- missed or double planting, which prevents maximum planting efficiency.
I have used deep neural networks to conduct semantic image segmentation on aerial images of farmland, classifying and locating anomalies such as weed clusters, skipped plantings, and water damage. Given the expanse of global crop fields, patrolling them on foot is near-impossible, and having humans analyze aerial images with existing technologies is resource-intensive and often prohibitively expensive.
This project simplifies existing approaches and provides an accurate, efficient solution for analyzing agricultural images, a relatively untouched field.
You can clone the repository from the command line:

```shell
git clone https://github.com/amogh7joshi/crop-field-health.git
```
A Makefile is included for Python installation. To use it, run:

```shell
make install
```
Otherwise, in the proper directory, execute the following to install the required packages:

```shell
python3 -m pip install -r requirements.txt
```
From here, the `scripts/expand.sh` script inflates the dataset into its permanent file structure, and the `scripts/preprocess.sh` script processes the dataset into JSON files containing the image paths for each image ID.
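As an illustration of what this preprocessing step produces, the following is a minimal sketch that builds a JSON index mapping image IDs to file paths. The directory layout (`images/rgb`, `images/nir`) follows the Agriculture-Vision challenge convention, and the output filename is hypothetical; the repository's actual scripts may differ.

```python
import json
from pathlib import Path

def build_index(root: str, split: str = "train") -> dict:
    """Map each image ID to the paths of its RGB and NIR images."""
    index = {}
    rgb_dir = Path(root) / split / "images" / "rgb"
    for rgb_path in sorted(rgb_dir.glob("*.jpg")):
        image_id = rgb_path.stem
        index[image_id] = {
            "rgb": str(rgb_path),
            # The NIR image shares the same filename in a sibling directory.
            "nir": str(rgb_path.parents[1] / "nir" / rgb_path.name),
        }
    return index

if __name__ == "__main__":
    # Write the index to a JSON file (paths are illustrative).
    with open("train-paths.json", "w") as f:
        json.dump(build_index("data/Agriculture-Vision"), f, indent=2)
```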
If you want to work with the C++ extensions of the project, which are located in the `cc` directory, first follow the above steps for repository installation and Python setup. From there, you need to build the C++ project. You will need CMake installed, as well as OpenCV (for working with images) and nlohmann-json (for working with JSON files). Once those are installed, execute:

```shell
cmake -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - Unix Makefiles" path/to/farmland-anomalies/cc
```
You can then run the compiled C++ executables: run `make` in the `cc` directory to build, and then:

```shell
./cc
```
The data pipelines and system are developed in multiple stages, owing to the complexity of the Agriculture-Vision dataset (see below).
First, the dataset is inflated from its compressed files using the `scripts/expand.sh` script. Then, the `scripts/generate.sh` script calls the `preprocessing/generate.py` script, which creates JSON files containing all image paths for each unique image ID. (This can also be done using the optimized C++ extension in the `cc` directory; see the C++ installation section above for more information.) Finally, the `preprocessing/dataset.py` file contains the `AgricultureVisionDataset` object, which implementation scripts call to load the training data.
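A hypothetical usage sketch from an implementation script is shown below; the module path and class name come from the repository, but the constructor arguments and attributes are assumptions, not the documented API.

```python
# Hypothetical usage; the constructor and attributes shown are assumptions.
from preprocessing.dataset import AgricultureVisionDataset

dataset = AgricultureVisionDataset()
# An implementation script might then iterate over (image, mask) batches,
# e.g.: for image, mask in dataset.train_data: ...
```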
For data inspection, the `preprocessing/inspect.py` script contains functionality for viewing and saving images, including all images associated with a certain ID or all images belonging to a category. The other files in the `preprocessing` directory contain single-purpose implementations (e.g., `preprocessing/distribution.py` plots the class frequency distribution of the dataset).
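For instance, a minimal sketch in the spirit of `preprocessing/distribution.py` might plot class frequencies with matplotlib; the label names below follow the Agriculture-Vision 2020 categories, and the counts are placeholders to be filled in from the actual label masks.

```python
import matplotlib.pyplot as plt

# Assumed anomaly categories (Agriculture-Vision 2020); counts are placeholders.
classes = ["cloud_shadow", "double_plant", "planter_skip",
           "standing_water", "waterway", "weed_cluster"]
counts = [0, 0, 0, 0, 0, 0]  # fill with real counts from the dataset

plt.bar(classes, counts)
plt.xticks(rotation=45, ha="right")
plt.ylabel("Number of labeled images")
plt.title("Class frequency distribution")
plt.tight_layout()
plt.show()
```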
This project makes use of the Agriculture-Vision dataset, which contains aerial farmland images annotated with one or more anomaly segmentation masks. Information about the dataset and its acquisition can be found at the challenge website; for compatibility, the compressed file should be placed in the `data` directory.
```bibtex
@article{chiu2020agriculture,
  title={Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis},
  author={Mang Tik Chiu and Xingqian Xu and Yunchao Wei and Zilong Huang and Alexander Schwing and Robert Brunner and Hrant Khachatrian and Hovnatan Karapetyan and Ivan Dozier and Greg Rose and David Wilson and Adrian Tudor and Naira Hovakimyan and Thomas S. Huang and Honghui Shi},
  journal={arXiv preprint arXiv:2001.01306},
  year={2020}
}
```
Three main deep neural network models were constructed as part of this project: a semi-shallow single-network model, L1; the deepest model, titled D1; and the successfully implemented model, L2.
Model L2 (diagram generated using Net2Vis) uses Ensemble Learning techniques, with two "sub-networks":
- The top (and shallower) network, L2-1, which learns high-level image features.
- The bottom (and deeper) network, L2-2, which learns deep spatial relations and features.
L2-2 uses strided convolutions to pick up features that are generally lost during downsampling, while L2-1 uses pooling layers to prevent gradient propagation issues that may otherwise arise.
For specific details on the network architectures, see the `model` directory.
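To illustrate the distinction, here is a minimal sketch (not the repository's exact architecture) contrasting the two downsampling strategies, assuming a TensorFlow/Keras API:

```python
from tensorflow.keras import layers

def pooled_block(x, filters):
    # L2-1-style block: convolution at full resolution, then pooling to downsample.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.MaxPooling2D(pool_size=2)(x)

def strided_block(x, filters):
    # L2-2-style block: a strided convolution downsamples while learning its
    # own filter, retaining features that fixed pooling would discard.
    return layers.Conv2D(filters, 3, strides=2, padding="same", activation="relu")(x)

# Example: downsample a 512x512 RGB tensor with each strategy.
inputs = layers.Input(shape=(512, 512, 3))
pooled = pooled_block(inputs, 32)
strided = strided_block(inputs, 32)
```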
To refine segmentation masks, multiple loss functions were used on a single model instance. For example, a model may have been trained on an arbitrary loss A for 20 epochs, then loss B for 20 more epochs, and finally loss C for 20 final epochs.
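A minimal sketch of this schedule, assuming a Keras-style model (the optimizer choice and loss placeholders are illustrative):

```python
import tensorflow as tf

def train_in_stages(model, train_data, losses, epochs_per_stage=20):
    """Recompile the same model with each loss in turn and continue training."""
    for loss_fn in losses:
        model.compile(optimizer="adam", loss=loss_fn)
        model.fit(train_data, epochs=epochs_per_stage)

# e.g., cross-entropy first, then dice loss (dice_loss as sketched below):
# train_in_stages(model, data, [tf.keras.losses.CategoricalCrossentropy(), dice_loss])
```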
Primarily, dice loss and cross-entropy loss were used; however, a third loss function, titled surface-channel loss, was also developed. This function greatly penalizes incorrect predictions and focuses on classifications over individual channels, allowing the class prediction as well as the segmentation region to be refined.
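For reference, here is a minimal sketch of the soft dice loss named above, assuming TensorFlow tensors (the `smooth` term is a common stabilizer, not necessarily the project's exact formulation):

```python
import tensorflow as tf

def dice_loss(y_true, y_pred, smooth=1.0):
    # Soft dice loss: 1 - 2|X ∩ Y| / (|X| + |Y|), summed over pixels and channels.
    y_true = tf.cast(y_true, y_pred.dtype)
    intersection = tf.reduce_sum(y_true * y_pred)
    total = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (total + smooth)
```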
All of the code in this repository is licensed under the MIT License, meaning you are free to work with it as you desire, but this repository must be cited if you want to reuse the code.
Although you are free to work with the project yourself, contributions will not be accepted to this repository. You are, however, welcome to open an issue in the issues tab if you notice something that is broken.