Live Colab: https://colab.research.google.com/drive/1gmnwWN0POHAz-zj5zxxbLCiFHuW8ZzJr?usp=sharing
Overleaf:
Meeting: Friday, 11am to 12pm
Overleaf writing:
- Introduction: Anqi (description of the domain problem: climate change and the cloud classification problem)
- EDA: Bogdan
- Feature Selection/Engineering (autoencoder): Finn
- Data Splitting: Bogdan
- Modeling (3 parts): Neural Network & Ensemble (Bogdan), Logistic Regression (Joseph), Random Forest (Anqi)
- Conclusion: Joseph
Bogdan: neural network/ensemble; Joseph: logistic regression; Anqi: random forest; Finn: autoencoder
- train: image1
- validation: image2
- test: image3
- code:
- EDA
- Modeling
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from time import time
import os
from pyreadr import read_r

# The image files are whitespace-delimited with no header row.
# sep=r"\s+" is used because delim_whitespace=True is deprecated in pandas 2.2+.
image_1 = pd.read_csv("../../data/image_data/image1.txt", sep=r"\s+", header=None)
image_2 = pd.read_csv("../../data/image_data/image2.txt", sep=r"\s+", header=None)
image_3 = pd.read_csv("../../data/image_data/image3.txt", sep=r"\s+", header=None)

column_names = ['y_coor', 'x_coor', 'expert_label', 'NDAI', 'SD', 'CORR', 'Radiance_angle_DF', 'Radiance_angle_CF', 'Radiance_angle_BF', 'Radiance_angle_AF', 'Radiance_angle_AN']

# All three images share the same column layout.
image_1.columns = column_names
image_2.columns = column_names
image_3.columns = column_names
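
A rough sketch of how the image-level split noted earlier (image1 for train, image2 for validation, image3 for test) could be turned into feature/label arrays. This is only one possible way to do it, not necessarily what the final notebook does. The helper names `split_by_image` and `_toy_image` are hypothetical, and so is the choice to treat every column after `expert_label` as a feature. The toy frames hold random values with arbitrary placeholder labels so the snippet runs without the data files; in the notebook the loaded `image_1`, `image_2` and `image_3` would be passed instead.

```python
import numpy as np
import pandas as pd

COLUMN_NAMES = [
    "y_coor", "x_coor", "expert_label", "NDAI", "SD", "CORR",
    "Radiance_angle_DF", "Radiance_angle_CF", "Radiance_angle_BF",
    "Radiance_angle_AF", "Radiance_angle_AN",
]
# Assumption: everything except the two coordinates and the label is a feature.
FEATURE_COLUMNS = COLUMN_NAMES[3:]


def split_by_image(train_df, val_df, test_df, label_col="expert_label"):
    """Return {split name: (X, y)}, keeping each image whole within one split."""
    frames = {"train": train_df, "validation": val_df, "test": test_df}
    return {
        name: (df[FEATURE_COLUMNS].to_numpy(), df[label_col].to_numpy())
        for name, df in frames.items()
    }


def _toy_image(n_rows, seed):
    """Hypothetical stand-in for one loaded image table (random values only)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(
        rng.normal(size=(n_rows, len(COLUMN_NAMES))), columns=COLUMN_NAMES
    )
    # Arbitrary placeholder labels, not the real expert labels.
    df["expert_label"] = rng.choice([-1, 1], size=n_rows)
    return df


splits = split_by_image(_toy_image(6, 0), _toy_image(4, 1), _toy_image(5, 2))
for name, (X, y) in splits.items():
    print(name, X.shape, y.shape)
```

Splitting by whole image rather than by random rows keeps pixels from the same image out of both train and test, at the cost of having only one image per split.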