Wednesday, October 23, 2019

It has been quite difficult to keep up with posts. But it shouldn't matter, as I am doing this as a release valve, a way to journal my thoughts.

So here is what I got to explore with respect to the CNN part using Neural Nets.
Having a basic but decent background in convolutions/image processing from uni, I referred to https://www.datacamp.com/community/tutorials/cnn-tensorflow-python.

TensorFlow:
So from recent podcasts and videos, I realised TensorFlow has this static way of creating neural nets, where I create a computational graph: I define a set of nodes that represent the operations (convolutions, add, matrix multiply, etc.) on the incoming data, with the edges representing the flow of data. (Although I came across TensorFlow 2.0, where eager execution gives one a lot more control to intercept data flow in the NN.) Once I define the network entirely, I use a TensorFlow session to feed data into the defined network. It seems this approach helps a lot in distributing the training process.
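To make the "define the graph first, run it later" idea concrete without pulling in TensorFlow 1.x itself, here is a toy sketch of the same pattern in plain Python (the `Node` class and the tiny graph are my own illustration, not TensorFlow's API):

```python
# Toy illustration of the static-graph idea: nodes describe operations,
# nothing computes until we "run" the graph with concrete data.

class Node:
    def __init__(self, op=None, inputs=()):
        self.op = op          # function to apply at this node
        self.inputs = inputs  # upstream nodes (the graph's edges)

    def run(self, feed):
        # placeholders (op is None) look themselves up in the feed dict
        if self in feed:
            return feed[self]
        return self.op(*(n.run(feed) for n in self.inputs))

# Graph-definition phase: x and w are placeholders, and
# y = x * w + 1 is defined before any data exists.
x = Node()
w = Node()
mul = Node(lambda a, b: a * b, (x, w))
y = Node(lambda a: a + 1, (mul,))

# "Session" phase: feed concrete values into the fixed graph.
print(y.run({x: 3, w: 4}))  # 3 * 4 + 1 = 13
```

In TensorFlow 1.x the placeholders, the op nodes, and the feed dict passed to `session.run` play exactly these roles, and because the whole graph is known up front, it can be split across devices for distributed training.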

But it was a lot to code from scratch, hence I had to refer to that blog, which had ready-made code. I just had to understand the code, which was still decently hard even with my previous knowledge of image processing and CNNs. And it was hard and confusing to run it directly.

Data Cleaning:

Had to realise this the hard way.
--
I got a chance to explore a technique called One-Class SVM for binary classification at my student job. Again, this was for image anomaly detection. Overall, the task was pretty interesting; I explored various approaches that I could have potentially taken. Some of them were:
- Give the image training data to the OneClassSVM API of sklearn, although it was a bit restrictive and not available in all versions.
- Apply PCA to the image data to retain the features/pixel data that explain the maximum variance (shall try and cover such handy concepts around data processing/pre-processing in the next post).
- Use ResNet to get the most relevant features from the image, then apply PCA, after which the features are given to the One-Class SVM. This was pretty interesting, as per https://hackernoon.com/one-class-classification-for-images-with-deep-features-be890c43455d.
- Lastly, the autoencoder approach, which I have never explored, but I'd want to someday on an application that interests me (semi-structured data).
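The PCA-then-One-Class-SVM approach above can be sketched in a few lines of sklearn. This is only a minimal illustration: the random vectors stand in for flattened image pixels (or ResNet features), and the `nu`, `n_components`, and distribution-shift values are made up for the demo.

```python
# Sketch: PCA for variance-preserving reduction, then One-Class SVM
# to learn a boundary around the "normal" data only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(200, 64))  # stand-in for "normal" image features
weird = rng.normal(5, 1, size=(10, 64))    # shifted distribution = anomalies

# Keep the top components, then fit the one-class boundary on normal data only.
model = make_pipeline(PCA(n_components=8), OneClassSVM(nu=0.05, gamma="scale"))
model.fit(normal)

print(model.predict(normal[:5]))  # expect mostly +1 (inliers)
print(model.predict(weird[:5]))   # expect -1 (flagged as anomalies)
```

The key point is that the SVM never sees anomalous examples at training time; anything far from the learned region of "normal" feature space gets predicted as -1.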

But the sad part was, I NEVER really cared about how my data looked. I was in such a hurry to get the insights that I never gave a damn about the data. Unfortunately, it was largely crappy: skewed, uncurated, and not representative enough, so I was asked to drop the whole task itself :( (time constraints)
--

Anyways, continuing with my exploration:
when I was about to drop the idea of using TensorFlow for this (which was pretty confusing for me, but worth the experience), a friend asked me to explore Keras. It is too damn readable,
a highly simplified way to define the network. This is when I realised I NEED to keep the larger picture in mind; there is no point taking a difficult path when either way I reach the same destination.


Built a basic NN using Keras with:
- 32 3x3 filters in the first layer (taking 1 channel from the input image, since it is grayscale)
- 64 3x3 filters in the second layer (depth 32 from the previous layer)
- 128 3x3 filters in the third layer (depth 64 from the previous layer; here I realised that as you go further into the feed-forward network, the depth of the Conv-layer "cube" increases rapidly, leading to a HUGE number of neurons, and in turn weights, to be computed!)
- Flattened the layer (with previous depth 128!)
- A fully connected Dense layer with 256 neurons
- Finally, a fully connected layer with the number of neurons representing the number of classes I want (in my case, the number of fruits)
- Forgot to mention adding a MaxPool layer between each Conv layer, which halves the dimensions while retaining only the "strongest" features from the previous Conv layer.
- So, here each filter corresponds to an activation plane that slides across the image to "filter" out the "relevant" parts of it. In other words, relevant neurons in the activation plane of each layer trigger the neurons in the next layer.

A few key observations:
- Used ReLU as the activation for each neuron, which zeroes out negative values and passes positive values through unchanged, i.e. max(0, x). (There is also Leaky ReLU.)
- Used softmax (which gives a probability for each class) as the activation function of the last layer; this can be used to determine how relevant each category is for the input image.
- Used mini-batch gradient descent, which speeds up convergence (reaching the bottom of the loss "bowl").
- Used Dropout, which randomly switches off some nodes during training to reduce overfitting and make the final result generalise better.
- Used Momentum (which combines the accumulated previous gradients, the velocity/friction of the ball in the bowl, with the current gradient, the ball's acceleration; Andrew Ng's example) / the Adam optimizer / the RMSProp optimizer... and there are many more that can optimise the cost function.
- Like my prof taught in class, any model's objective can ultimately be written as:
objective = loss + (regularisation constant) * (regularizer)
We just need to optimise this objective function so that the loss is minimised (to find the weights and biases), through
backpropagation (partial derivatives), or Lagrange multipliers / KKT conditions, which can find the values of the weights across layers that minimise the cost function.
- Use softmax with cross-entropy loss as the loss function if predicting categorical values.
- Use root mean squared error, mean squared error, the L1 (Manhattan) norm, the L2 norm, etc. as the loss function if doing regression, i.e. predicting values rather than categories. A lot to explore here.
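The observations above mostly correspond to choices made at `model.compile` time in Keras. A minimal sketch (the tiny Dense model and its sizes are just placeholders to keep it self-contained):

```python
# Sketch: wiring activation, dropout, optimizer, and loss choices together.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),    # ReLU: max(0, x) per neuron
    layers.Dropout(0.5),                    # randomly drops nodes while training
    layers.Dense(3, activation="softmax"),  # probabilities over 3 classes
])

# SGD with momentum (the ball-in-a-bowl picture) would be
# keras.optimizers.SGD(learning_rate=0.01, momentum=0.9); Adam is used here.
# Cross-entropy matches the softmax/categorical output; for regression
# you would swap in "mse" or a similar loss instead.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Keras then runs backpropagation for you under the chosen optimizer whenever `model.fit` is called.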

But unfortunately, my PC is running out of juice and is not able to keep the notebooks active. As of now, I am getting a decent accuracy of around 73%. I'll probably either integrate this with the OpenCV work I explored in the beginning, OR explore something else.


Takeaways:
  • Explore a lot more approaches/topics before sticking to one, as immediately narrowing down to a selection can restrict the exploration stage.
  • It's good to stray away from the main goal (if you can afford to do so); on the way you may stumble upon many more exciting things.
  • Realised videos/podcasts/tech talks give you a whole new perspective on things.
    • Came across some really cool channels on YouTube: Lex Fridman, Strange Loop. These give a good perspective on translating ML stuff to industry. (The Netflix one was really cool.)
  • The wave of examinations over the past months provoked me to at least document the topics I came across while studying. Some of the techniques and tools are really handy, and it seems they can be appreciated more if I document some of them here.
  • As a next step, I'll try running the model against new images of fruits to see if it works; after that, I'll try and see if it works on a live image captured from a web camera.
  • Also want to sort of explore semi-structured analysis for textual stuff: NLP, TF * IDF, language models, etc.