Computer_vision_pneumonia_x_ray

Authors of the project: Kai Yung TAN (Adam) & Jean Christophe Meunier

1. Purpose and project objective

Purpose

Learning how to design and evaluate a custom-made convolutional neural network (CNN) for practical purposes
Using CNN models to analyse X-ray images
Designing a CNN capable of recognising pneumonia in patients' chest X-rays

Objectives

Consolidate knowledge of Python, specifically TensorFlow/Keras, NumPy, Pandas, Matplotlib, ...
Be able to find and use new libraries
Consolidate knowledge of data science and machine/deep learning algorithms for developing an accurate classification model
Be able to perform appropriate hyperparameter tuning

Features

Must-have

A CNN trained on a large X-ray dataset (>5,000 images) that can recognise new images outside the training set
Proper model evaluation (split dataset, confusion matrix, etc.)
Visualisations of model results (properly labelled, titled, ...)

Nice-to-Have

A visualisation of the feature maps of the model
Comparison with other CNN model structures
Assessing and comparing

Context of the project

All the work was carried out during BeCode's AI/data science bootcamp (2020-2021).

2. The project

Working plan and steps

1. Research

Research and understand the terms, concepts and requirements of the project
Discover new libraries that can serve the project's purposes
Develop, use and test machine learning algorithms (e.g. TensorFlow/Keras, ...)
Consolidate knowledge of model building and hyperparameter tuning (e.g. type of layers, pooling, dropout, batch normalization, type of activation functions, ...)
Data augmentation
In addition, we searched online for existing published work and studies on X-ray data manipulation and modelling, for example:
- sibeltan/pneumonia_detection_CNN
- Jain et al., 2020. Pneumonia detection in chest X-ray images using convolutional neural networks and transfer learning. Measurement, 165, 1.

2. Data collection

The dataset is organized into 3 folders (train, test, val) and contains subfolders for each image category (Pneumonia/Normal). There are 5,863 X-Ray images (JPEG) and 2 categories (Pneumonia/Normal).
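A quick sanity check of this layout is to count the images in each split and category. The sketch below is one way to do that with the standard library only; the function name `count_images` and the `chest_xray` folder in the usage comment are assumptions for illustration and are not taken from the project notebook.

```python
from collections import Counter
from pathlib import Path

# File extensions treated as images (the dataset is described as JPEG).
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def count_images(root):
    """Count image files per (split, category) under a dataset root.

    Expects the layout root/<split>/<category>/<image file>.
    """
    counts = Counter()
    for path in Path(root).glob("*/*/*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            split, category = path.parts[-3], path.parts[-2]
            counts[(split, category)] += 1
    return counts


# Hypothetical usage; "chest_xray" is an assumed local folder name.
# for (split, category), n in sorted(count_images("chest_xray").items()):
#     print(f"{split:>5} / {category:<10} {n}")
```

Counting per category also makes any imbalance between the Pneumonia and Normal classes visible before training.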

Chest X-ray images (anterior-posterior) were selected from retrospective cohorts of pediatric patients of one to five years old from Guangzhou Women and Children’s Medical Center, Guangzhou. All chest X-ray imaging was performed as part of patients’ routine clinical care.

For the analysis of chest x-ray images, all chest radiographs were initially screened for quality control by removing all low quality or unreadable scans. The diagnoses for the images were then graded by two expert physicians before being cleared for training the AI system. In order to account for any grading errors, the evaluation set was also checked by a third expert.

https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

Examples of data input

3. Data manipulation

Image size reduction: the original JPEG images were resized to 128 x 128 pixels to speed up data processing during model training
Standardisation of the images
Data augmentation, using the OpenCV (cv2) library and Keras' 'ImageDataGenerator' class, to improve training quality
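The standardisation and flip steps can be illustrated with plain NumPy. This is a minimal sketch, not the notebook's code: it assumes 8-bit greyscale arrays already resized to 128 x 128, it assumes "standardisation" means rescaling to [0, 1] followed by per-image zero mean and unit variance, and the function names are hypothetical.

```python
import numpy as np


def standardise(image):
    """Rescale an 8-bit image to [0, 1], then centre and scale it per image."""
    scaled = image.astype(np.float32) / 255.0
    std = scaled.std()
    # Guard against constant images, whose standard deviation is zero.
    return (scaled - scaled.mean()) / (std if std > 0 else 1.0)


def random_flip(image, rng):
    """Randomly flip an image left-right and/or up-down (each with p = 0.5)."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    if rng.random() < 0.5:
        image = image[::-1, :]
    return image


rng = np.random.default_rng(seed=0)
# Synthetic stand-in for a resized X-ray; real images would be loaded from disk.
fake_xray = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)
prepared = random_flip(standardise(fake_xray), rng)
print(prepared.shape, prepared.dtype)
```

In the project itself the random transformations are delegated to 'ImageDataGenerator'; the hand-written flip is only meant to show what such an augmentation does to the pixel array.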

4. Modelization

In total, 17 models were built, trained and compared, varying the following hyperparameters (see notebook section):

depth of the neural network
type of layers (dense, convolutional,...)
filters (number, size, padding, etc.)
type of activation (i.a. relu, leaky-relu, sigmoid, softmax,...)
dropout
pooling
batch normalization

For each model, the hyperparameters were fine-tuned based on the performance indices on the test data set (624 pictures). When a model reached a satisfactory accuracy, it was finally run on the validation set (16 pictures).

The best model was chosen partly on the basis of good performance on the train and test data sets, but mostly on its performance on the validation data set.

Final best fitting model

1. Model architecture

8 convolution layers (filters=32/32/32/64/64/64/128/128, kernel_size=(3, 3), activation='Leaky-relu')
MaxPool2D((2, 2))
Dropout(0.25) on all layers except the last one
Flatten
1 dense layer (1024, activation='relu')
Output layer: Dense(2, activation='sigmoid')
Dropout(0.5)
loss='binary_crossentropy', optimizer='adam'
shuffle=True
Data augmentation: rotation_range=20, zoom_range=0.2, width_shift_range=0.2, height_shift_range=0.2, horizontal_flip=True, vertical_flip=True
Batch size: 16
Epochs: 100
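
To get a feel for the size of the convolutional stack, the number of trainable weights per layer can be worked out from the filter counts listed above: a 3 x 3 convolution with C_in input channels and C_out filters has (3 * 3 * C_in + 1) * C_out parameters, where the +1 is the bias. The sketch below applies that formula. The 3-channel (RGB) input is an assumption, since the number of input channels is not stated here, and the function name is hypothetical.

```python
def conv_param_counts(filters, kernel_size=(3, 3), in_channels=3):
    """Return the trainable parameter count of each Conv2D layer in a stack."""
    kh, kw = kernel_size
    counts = []
    for out_channels in filters:
        # Each filter has kh * kw * in_channels weights plus one bias term.
        counts.append((kh * kw * in_channels + 1) * out_channels)
        in_channels = out_channels
    return counts


# Filter counts of the eight convolution layers listed above.
FILTERS = [32, 32, 32, 64, 64, 64, 128, 128]

per_layer = conv_param_counts(FILTERS)  # in_channels=3 is an assumption
for index, n_params in enumerate(per_layer, start=1):
    print(f"conv {index}: {n_params:>7,} parameters")
print(f"total : {sum(per_layer):>7,} parameters")
```

Under that assumption the eight convolution layers hold 333,184 parameters in total. Pooling, dropout and the activation add none, and the size of the dense layers depends on the feature-map shape after pooling, which is not specified here.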

2. Performance evaluation

Loss and accuracy

Confusion matrix on test set

Performance indices on test set

Confusion matrix on validation set

Performance indices on validation set
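
The performance indices reported for each set follow directly from the confusion matrix. The sketch below shows the usual definitions for a binary problem; the function names and the toy labels are illustrative assumptions and are not the project's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def performance_indices(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = pneumonia, 0 = normal).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts)
print(performance_indices(*counts))
```

In a screening setting, recall on the pneumonia class is usually the figure to watch, since a false negative corresponds to a missed case.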

3. Further development

Further train the model on additional data
Model optimization: constructing simpler models that reach similar metric performance
Building a RESTful API to be deployed in a web-based environment (e.g. Heroku, Azure, etc.)
Completing the API with a web-based interface (e.g. using Streamlit) for uploading X-ray images and obtaining a pneumonia diagnosis
Extending the model to include other types of pathologies (i.e. multiclass classification including other respiratory diseases)

