Wednesday, March 20, 2019

Convolutional Neural Network:

Ok so here is what I have understood so far. The story might be wrong or vague, in which case, do let me know.

> Of course, the way I have understood this might need a lot of refinement, but it is based on what I have accumulated over the past 2 days from the web.
> I shall try to keep it as simple and intuitive as possible (say, for a person like myself who doesn't have the necessary background).

> So for starters, a Neural Network is a network of neurons spread across multiple layers.
A neuron, just like a brain's neuron, accepts several input signals (input features in our case) and spits out an output signal (output value).
> A simple neural network can have like

 
This is exactly what I am trying to do, but for identifying food ingredients.

> Every neuron accepts the features (in the above case, image pixels), multiplies those features by certain weights, sums them up, and passes the output to the next layer.
> The above steps are done in a very cool way using matrix operations, with libraries like numpy in Python, as that is way faster than having several loops.
> In the last layer (output), there are 2 neurons, each of which holds a probability value that tells us "how sure it thinks the picture is a cat or a dog".
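To make the "multiply by weights and pass it on" idea concrete, here is a minimal sketch of one pass through a tiny network using numpy matrix operations. The sizes (4 pixels, 3 hidden neurons), the random weights, the ReLU activation and the softmax at the end are all my assumptions for illustration, not something a real cat/dog network is required to use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 4-pixel "image", 3 hidden neurons, 2 output neurons (cat, dog).
n_pixels, n_hidden, n_classes = 4, 3, 2

x = rng.random(n_pixels)                     # input features (pixel values in [0, 1))
W1 = rng.normal(size=(n_hidden, n_pixels))   # weights of the hidden layer (random, untrained)
W2 = rng.normal(size=(n_classes, n_hidden))  # weights of the output layer (random, untrained)

def softmax(z):
    # Subtract the max for numerical stability; the result is non-negative and sums to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

hidden = np.maximum(0.0, W1 @ x)   # weighted sums for all hidden neurons at once, then ReLU
probs = softmax(W2 @ hidden)       # two values: "how sure" for cat and for dog

print(probs, probs.sum())
```

The `W1 @ x` line is the part that replaces the loops: one matrix product computes the weighted sum for every neuron in the layer in a single call. Since the weights are random here, the two probabilities mean nothing yet.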

> Now, this probability is what we are trying to optimize for correctness.
> If we backtrack, the thing which is under our control is not the features, as that is our input and we cannot change that, but the weights - those weights were decided upon by our neural network.
> Hence, optimization of the neural network involves something called backpropagation (used together with gradient descent), which "tweaks" or "tunes" the dials to "rectify" those weights so that the cat or dog picture is identified correctly.
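Here is a rough sketch of the "tweak the weights" loop, as I understand it. To keep it short I use a single neuron with a sigmoid output, so the gradient can be written down directly; backpropagation is the chain-rule bookkeeping that extends the same idea to many layers. The toy data, the learning rate and the number of steps are made-up assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: 4 examples with 2 features each, labels 0 (cat) or 1 (dog).
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w = np.zeros(2)       # the weights are the "dials" we are allowed to turn
b = 0.0
learning_rate = 0.5   # assumed step size

def loss(w, b):
    # Cross-entropy between predicted probabilities and the correct labels.
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

loss_before = loss(w, b)
for _ in range(200):
    p = sigmoid(X @ w + b)          # forward pass: predicted probability of "dog"
    error = p - y                   # difference between prediction and correct answer
    grad_w = X.T @ error / len(y)   # gradient of the loss with respect to the weights
    grad_b = error.mean()           # gradient of the loss with respect to the bias
    w -= learning_rate * grad_w     # gradient descent: step against the gradient
    b -= learning_rate * grad_b
loss_after = loss(w, b)

print(loss_before, loss_after)
```

Note that the inputs `X` never change inside the loop; only `w` and `b` do, which is exactly the "features are fixed, weights are ours to tune" point.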

> So the network has to "learn" to do this, and it learns when we give it certain "correct" results. Hence this is a form of supervised learning, where we supply the correct answers and check whether the learning is happening properly, in contrast to unsupervised learning, where the neural net gets no correct answers and has to find patterns in the data on its own.


Relate:
Now, for my requirement, in the end -
- I would like the computer to "see" the food items and try and identify the food item correctly.
- Hence my video feed is the input to the neural net. Video is nothing but a stream of static images, hence image is my input.
- So if I am able to do this first, that is, throw an image at the neural net and have it identify the image, I think I will have gotten closer to the final result.
- But from what I know through OpenCV, an image is a 2D array (not considering channels as of now). So there would be a huge number of neurons if I were to represent each pixel as a neuron, and that is way too costly.
- Hence the smart people have come up with something called a Convolutional Neural Net, where some sort of windowing is done, which I'll try and describe in the next post.
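To put a number on "way too costly", here is a quick back-of-the-envelope sketch. The 640x480 grayscale frame and the 1000 neurons in the first layer are assumed values picked for illustration; the zero-filled numpy array just stands in for a frame that OpenCV would hand me.

```python
import numpy as np

# Hypothetical numbers: a 640x480 grayscale frame and a first layer of 1000 neurons.
height, width = 480, 640
n_hidden = 1000

image = np.zeros((height, width), dtype=np.uint8)  # stand-in for one video frame
pixels = image.reshape(-1)                         # flatten the 2D array into one long vector

n_inputs = pixels.shape[0]         # one input per pixel
n_weights = n_inputs * n_hidden    # every neuron connects to every pixel

print(n_inputs)    # 307200
print(n_weights)   # 307200000
```

So even one modest layer on one small grayscale frame already needs about 307 million weights, and that is before adding color channels.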




TakeAways:
- A Neural Net has several layers of neurons.
- A neuron takes several input features and spits out an output. It does this by multiplying the input features by weights and subjecting the result to some sort of function (activation function) to spit out a value.
- Based on the output I try and predict the result, hence I need to optimize the weights.
- Optimization is done through backpropagation, which involves determining the amount of error (that is, the difference between the correct output and the predicted output) and "tweaking" the weights accordingly.
- But for an image, the normal method of giving image pixels to neurons is costly, hence I turn to a convolutional neural net approach.

- I realised that by taking a top-down approach when we learn something, we appreciate things better. Yes, I know we might not have an in-depth understanding of the topics, but I think it's okay, as eventually we will get there if we pursue it long enough. But initially, if we have some sort of motivation, a real-world use case, when we explore something, we definitely will appreciate it better. (And this has to be the case when one starts a bachelor's course.)