Machine Learning

Linear Regression

Used single-variable linear regression to estimate the profit of a food truck as a function of population.

The hypothesis being fitted is h = theta0 + theta1 * X.

alt text

The program uses gradient descent to choose the theta0 and theta1 that give the minimum cost value.

alt text

Gradient descent finds the global minimum of the cost. The contour plot visualizes this.

alt text
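The gradient descent step described above can be sketched in a few lines. This is a minimal Python illustration, not the repository's own code: the function names, the learning rate `alpha`, the iteration count, and the toy data are all assumptions made for the example.

```python
def compute_cost(xs, ys, theta0, theta1):
    """Cost J = 1/(2m) * sum((h(x) - y)^2) with h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Batch gradient descent; theta0 and theta1 are updated simultaneously."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Made-up data lying exactly on y = 1 + 2x (NOT the food truck dataset).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
theta0, theta1 = gradient_descent(xs, ys)
final_cost = compute_cost(xs, ys, theta0, theta1)
```

Because this cost is convex in theta0 and theta1, any sufficiently small `alpha` leads to the same single minimum, which is why the contour plot shows one basin.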

Logistic Regression

Useful for classification problems. In this exercise, logistic regression is used to predict student acceptance based on the scores of exam 1 and exam 2 (2 features).

alt text

The program can predict with an 89% confidence score. As seen from the picture, a line clearly separates the + and - values.
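As a rough sketch of how such a classifier can be trained, the Python below fits logistic regression with plain gradient descent and measures training accuracy. Everything specific here is an assumption for illustration: the function names, the learning rate, the iteration count, and the eight made-up score pairs (scaled to the 0 to 1 range), which are not the exam dataset used in this exercise.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta, x):
    """theta[0] is the intercept; x holds the features without a bias term."""
    return sigmoid(theta[0] + sum(t * xi for t, xi in zip(theta[1:], x)))


def fit_logistic(X, y, alpha=1.0, iterations=5000):
    """Batch gradient descent on the logistic (cross-entropy) cost."""
    m, n = len(X), len(X[0])
    theta = [0.0] * (n + 1)
    for _ in range(iterations):
        errors = [predict_proba(theta, x) - yi for x, yi in zip(X, y)]
        grad = [sum(errors) / m]
        grad += [sum(e * x[j] for e, x in zip(errors, X)) / m for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


def accuracy(theta, X, y):
    """Fraction of examples whose thresholded prediction matches the label."""
    preds = [1 if predict_proba(theta, x) >= 0.5 else 0 for x in X]
    return sum(p == yi for p, yi in zip(preds, y)) / len(y)


# Hypothetical (exam 1, exam 2) scores divided by 100; label 1 = accepted.
X = [[0.2, 0.3], [0.3, 0.2], [0.4, 0.3], [0.3, 0.45],
     [0.7, 0.8], [0.8, 0.6], [0.6, 0.9], [0.9, 0.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
theta = fit_logistic(X, y)
train_accuracy = accuracy(theta, X, y)
```

The decision boundary is the set of points where `predict_proba` equals 0.5, which for two features is a straight line.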

alt text This data is not linearly separable, so the features need to be expanded before a boundary can be fitted. Used 6 DOF and regularized them so as not to overfit the data. The result looks like the picture below.

alt text
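One common way to get a non-linear boundary out of logistic regression is to map the two features to polynomial terms and add a regularization penalty to the cost. The sketch below assumes that "6 DOF" means polynomial terms up to degree 6; that reading, the function names, and the sample points are assumptions, not taken from the repository.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def map_features(x1, x2, degree=6):
    """All terms x1^(i-j) * x2^j for i = 0..degree, j = 0..i (first term is the bias 1)."""
    return [(x1 ** (i - j)) * (x2 ** j)
            for i in range(degree + 1)
            for j in range(i + 1)]


def regularized_cost(theta, X, y, lam):
    """Cross-entropy cost plus lam/(2m) * sum(theta_j^2); the bias theta[0] is not penalized."""
    m = len(y)
    total = 0.0
    for row, yi in zip(X, y):
        h = sigmoid(sum(t * f for t, f in zip(theta, row)))
        total += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
    penalty = lam / (2 * m) * sum(t ** 2 for t in theta[1:])
    return total / m + penalty


# Made-up sample points and labels, for illustration only.
points = [(0.1, 0.2), (-0.5, 0.7), (0.9, -0.8)]
labels = [1, 1, 0]
X = [map_features(x1, x2) for x1, x2 in points]
zero_theta = [0.0] * len(X[0])
cost_at_zero = regularized_cost(zero_theta, X, labels, lam=1.0)
```

With degree 6 the two inputs become 28 features, so the penalty term matters: it keeps the extra coefficients small enough that the boundary does not chase individual points.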

Learning curves

Used to identify whether the learning algorithm suffers from high bias (underfitting) or high variance (overfitting).

Linear regression is applied to non-linear data. The result is shown below. alt text

The learning curve is generated from the training-set and validation-set cost functions.

alt text From the graph above, the errors increase as the number of training examples increases. This indicates a high-bias problem. To fix it, introduce more features, such as a polynomial fit instead of a linear one.
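The way a learning curve is built can be sketched as follows: train on the first i examples, then record the error on those i examples and on the whole validation set. This Python version is an illustration under stated assumptions: it swaps in a closed-form least-squares line fit instead of gradient descent, and the names and the quadratic toy data are made up for the example.

```python
def fit_line(xs, ys):
    """Closed-form least-squares line; returns (intercept, slope)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # a single point (or identical x values): fall back to a flat line
        return mean_y, 0.0
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def half_mse(theta, xs, ys):
    """Unregularized cost 1/(2m) * sum of squared errors."""
    intercept, slope = theta
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


def learning_curve(x_train, y_train, x_val, y_val):
    """Train on the first i examples for i = 1..m; record train and validation error."""
    train_err, val_err = [], []
    for i in range(1, len(x_train) + 1):
        theta = fit_line(x_train[:i], y_train[:i])
        train_err.append(half_mse(theta, x_train[:i], y_train[:i]))
        val_err.append(half_mse(theta, x_val, y_val))
    return train_err, val_err


# Toy non-linear data (y = x^2), so a straight line is bound to underfit.
x_train = [0.0, 2.0, -1.0, 3.0, -3.0, 1.0, -2.0]
y_train = [x * x for x in x_train]
x_val = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
y_val = [x * x for x in x_val]
train_err, val_err = learning_curve(x_train, y_train, x_val, y_val)
```

On this toy data the training error starts at zero and climbs as examples are added, while the validation error never gets small: both curves settling at a high error is the high-bias signature.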

For polynomial regression with lambda = 0 (no regularization): alt text

alt text

From the graphs above, we can deduce that the learning algorithm overfits the data (high variance). Tuning lambda, by increasing or decreasing it, fixes this problem.

lambda = {0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10}: this array of lambda values is used to plot the errors and select the lambda that best fits the data.

alt text

From the graph, the best lambda is 3.
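The selection procedure can be sketched as a loop over the candidate lambda values: fit with each lambda on the training set, measure the unregularized error on the validation set, and keep the lambda with the lowest validation error. To stay short, this Python sketch uses a single-feature regularized line fit as a stand-in for polynomial regression; the function names and the data are assumptions, and the lambda it picks says nothing about the result reported above.

```python
LAMBDAS = [0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]


def fit_ridge_line(xs, ys, lam):
    """Minimize 1/(2m) * sum(err^2) + lam/(2m) * slope^2; the intercept is not penalized."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam) if (sxx + lam) > 0 else 0.0
    return mean_y - slope * mean_x, slope


def half_mse(theta, xs, ys):
    """Error WITHOUT the regularization term, used for comparing lambdas."""
    intercept, slope = theta
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


def select_lambda(x_train, y_train, x_val, y_val, lambdas):
    """Return the lambda with the lowest validation error, plus all (lambda, error) pairs."""
    results = []
    for lam in lambdas:
        theta = fit_ridge_line(x_train, y_train, lam)
        results.append((lam, half_mse(theta, x_val, y_val)))
    best_lam, best_err = min(results, key=lambda pair: pair[1])
    return best_lam, best_err, results


# Made-up noisy data, for illustration only.
x_train, y_train = [1.0, 2.0, 3.0, 4.0], [1.2, 1.9, 3.4, 3.9]
x_val, y_val = [1.5, 2.5, 3.5], [1.4, 2.4, 3.3]
best_lam, best_err, results = select_lambda(x_train, y_train, x_val, y_val, LAMBDAS)
```

The validation error is computed without the penalty term so that the candidates are compared on fit quality alone.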

Gaussian Kernel (linear and non-linear) and Spam classifier

Plotting data set

alt text

After applying an SVM with a linear kernel alt text

Non-linear Dataset alt text

After applying the Gaussian kernel alt text

Mixed dataset alt text

C and sigma were each assigned a range of 8 values, spaced by a factor of 10. Two nested for loops select the C and sigma values with the minimum error; these are then used in the Gaussian-kernel SVM to obtain the boundary (image shown below).

alt text
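The two-loop search over C and sigma can be sketched as below, together with the Gaussian kernel itself. This is an illustration only: the candidate values (powers of 10 from 1e-4 to 1e3) are a guess at "8 values by a factor of 10", and `toy_validation_error` is a hypothetical stand-in for "train the SVM with these parameters and return its validation error", which is not implemented here.

```python
import math


def gaussian_kernel(x1, x2, sigma):
    """Similarity exp(-||x1 - x2||^2 / (2 * sigma^2)) between two points."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def grid_search(candidates, validation_error):
    """Try every (C, sigma) pair with two nested loops; keep the pair with minimum error."""
    best_c, best_sigma, best_error = None, None, float("inf")
    for c in candidates:
        for sigma in candidates:
            err = validation_error(c, sigma)
            if err < best_error:
                best_c, best_sigma, best_error = c, sigma, err
    return best_c, best_sigma, best_error


def toy_validation_error(c, sigma):
    """HYPOTHETICAL error surface with its minimum at C = 1, sigma = 0.1.

    In a real run this would train a Gaussian-kernel SVM and score it on
    the validation set.
    """
    return math.log10(c) ** 2 + (math.log10(sigma) + 1) ** 2


# Assumed candidate values: 8 powers of 10, from 1e-4 up to 1e3.
candidates = [10 ** k for k in range(-4, 4)]
best_c, best_sigma, best_error = grid_search(candidates, toy_validation_error)
similarity = gaussian_kernel([1, 2, 1], [0, 4, -1], sigma=2)
```

A smaller sigma makes the kernel fall off faster with distance, giving a more flexible boundary, while a larger C penalizes misclassified training points more heavily; searching both together is what lets the validation set pick the trade-off.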

Multivariate Gaussian model

Plotting the data set alt text

After computing mu and the variance and calculating the probability, a Gaussian model is fitted. alt text

Using F scores, a threshold constant c is selected with the help of the ground truth. Points with P < c are circled as anomalies. alt text
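The steps above (estimate mu and the variance, compute the probability, pick c by F score) can be sketched in Python as follows. The sketch assumes independent features, so the density is a product of per-feature Gaussians; that assumption, the function names, the number of threshold steps, and the tiny dataset are all made up for illustration.

```python
import math


def estimate_gaussian(X):
    """Per-feature mean and variance of the training data."""
    m, n = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / m for j in range(n)]
    var = [sum((row[j] - mu[j]) ** 2 for row in X) / m for j in range(n)]
    return mu, var


def density(x, mu, var):
    """Product of univariate Gaussian densities (features assumed independent)."""
    p = 1.0
    for xj, mj, vj in zip(x, mu, var):
        p *= math.exp(-((xj - mj) ** 2) / (2 * vj)) / math.sqrt(2 * math.pi * vj)
    return p


def f1_score(pred, truth):
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_threshold(p_val, y_val, steps=1000):
    """Scan thresholds between min and max density; keep the one with the best F1."""
    best_c, best_f1 = 0.0, 0.0
    lo, hi = min(p_val), max(p_val)
    for k in range(1, steps + 1):
        c = lo + (hi - lo) * k / steps
        f1 = f1_score([p < c for p in p_val], y_val)
        if f1 > best_f1:
            best_c, best_f1 = c, f1
    return best_c, best_f1


# Made-up data: a tight cluster near (5, 5) and one labeled anomaly in validation.
X_train = [[5.0, 5.1], [4.9, 5.0], [5.1, 4.9], [5.0, 4.8], [4.8, 5.2], [5.2, 5.0]]
X_val = [[5.0, 5.0], [4.9, 5.1], [9.0, 1.0], [5.1, 5.0]]
y_val = [0, 0, 1, 0]  # ground truth: 1 marks an anomaly
mu, var = estimate_gaussian(X_train)
p_val = [density(x, mu, var) for x in X_val]
best_c, best_f1 = select_threshold(p_val, y_val)
flagged = [p < best_c for p in p_val]
```

F1 is used rather than plain accuracy because anomalies are rare: a model that never flags anything would score high accuracy but an F1 of zero.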