Monday, October 28, 2019

General stuff

At my student job,
  • I started by porting a neural net script from a research paper, implemented in the Caffe library, to TensorFlow. Things were pretty straightforward: I had to get the TensorFlow equivalent of the Caffe weights.
  • Then I constructed the architecture defined in the paper in tf and loaded the weights. This is when I explored sessions in TensorFlow, which let me save the variables with the loaded weights so that I can re-use them for computing the final predictions.
  • This was followed by a bit of image processing, where I had to segment the image semantically using some basic approaches. I used nearest neighbour to assign a semantic label to each pixel.
  • After this I had to explore techniques for anomaly detection. Some of them were OneClassSVM, and feeding the input to a ResNet and using its features (the output of the second-to-last layer) as input to the OneClassSVM, as I mentioned in the previous posts.
  • As of now I am working on reducing the representation of a given image/patch (dimensionality reduction, basically projecting the given data into a lower-dimensional space) and then reconstructing the image/patch from that lower-dimensional representation. After this, I compute the reconstruction error to try and pinpoint the images/patches that were relatively difficult to reconstruct.
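
To make the last point a bit more concrete, here is a minimal sketch of the reconstruction-error idea. It assumes PCA (via NumPy's SVD) as the dimensionality reduction, which is just one possible choice and not necessarily what I use at work. The function name, the synthetic "patches" and the choice of one component are all made up for illustration.

```python
import numpy as np

def pca_reconstruction_errors(patches, n_components):
    """Project flattened patches onto the top principal components,
    reconstruct them, and return the mean squared error per patch."""
    X = np.asarray(patches, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:n_components]
    codes = Xc @ components.T           # lower-dimensional representation
    recon = codes @ components + mean   # back in the original space
    return ((X - recon) ** 2).mean(axis=1)

# Synthetic stand-in for flattened patches (hypothetical data):
# 50 "normal" patches lying close to one direction, plus one odd patch.
rng = np.random.default_rng(0)
t = rng.normal(size=(50, 1))
direction = np.array([[1.0, 2.0, -1.0, 0.5]])
normal = t @ direction + 0.01 * rng.normal(size=(50, 4))
odd = np.array([[3.0, -3.0, 3.0, 3.0]])
patches = np.vstack([normal, odd])

errors = pca_reconstruction_errors(patches, n_components=1)
hardest = int(np.argmax(errors))  # index of the patch that reconstructs worst
```

The patch with the largest error is the one that fits the low-dimensional structure the least, which is what makes it a candidate anomaly.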
Anyways, now I want to explore semi-structured analysis (textual stuff). After a bit of exposure to a few NLP techniques at Uni, I am exploring this a bit more.

I am not sure if I'll work on this, but I have an idea that I'd want to implement.
Recently, I got a chance to explore OCR a bit using the tesseract library (https://github.com/tesseract-ocr/tesseract), which is used for extracting text from images. Some time later, in a video on YouTube, I came across GloVe - Global Vectors for Word Representation (https://nlp.stanford.edu/projects/glove/), which is a research project from Stanford.

- So a text can be represented in several ways for performing analysis.
- One way is to record each word's frequency and use Bag of Words: a vector of word counts, where each position in the vector corresponds to one word of the vocabulary.
- Or I can just use an incidence/boolean matrix to indicate whether each word is present or not.
- But GloVe represents each word in a pretty epic way, by capturing its meaning from the contexts it appears in. Each word is represented as a vector of several dimensions, such that similar words have similar vectors (their cosine similarity tends to be high).
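
The first two representations, plus the cosine similarity mentioned for GloVe, can be sketched in a few lines of plain Python. This is only a toy illustration: the two example sentences, the whitespace tokenizer and the function names are my own assumptions, not part of any library.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()

def bag_of_words(text, vocabulary):
    """Count vector: position i holds the count of vocabulary[i]."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

def incidence(text, vocabulary):
    """Boolean vector: 1 if vocabulary[i] occurs in the text, else 0."""
    present = set(tokenize(text))
    return [1 if word in present else 0 for word in vocabulary]

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for all-zero vectors
    return dot / (norm_u * norm_v)

docs = ["the cat sat on the mat", "the dog sat"]
vocabulary = sorted({word for doc in docs for word in tokenize(doc)})
# vocabulary is ['cat', 'dog', 'mat', 'on', 'sat', 'the']

counts = bag_of_words(docs[0], vocabulary)   # [1, 0, 1, 1, 1, 2]
flags = incidence(docs[0], vocabulary)       # [1, 0, 1, 1, 1, 1]
similarity = cosine_similarity(counts, bag_of_words(docs[1], vocabulary))
```

Note that both vectors only say which words occur (and how often), so "cat" and "dog" are as unrelated as any other two words. That is the gap word embeddings like GloVe are meant to fill.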

So of course, there is a lot more that I need to explore about GloVe and its usage.
But I thought I could combine the two: run OCR, use GloVe to get embeddings for the extracted text, and then summarize/predict/analyse the given text, preferably using an RNN or something similar, as it processes the words in order, one timestep at a time. I am not sure about this last part, but as of now I would consider it progress if I can do the OCR part + the word embeddings with GloVe part.
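
A rough sketch of the "OCR text + GloVe embeddings" part could look like this. It assumes the plain-text format of the pre-trained GloVe files (one word per line, followed by its space-separated vector values). The two toy vectors, the sample string and the function names are hypothetical stand-ins, and averaging the word vectors is just one simple way to get a single vector per text. The OCR step itself is only indicated in a comment, assuming the pytesseract Python wrapper.

```python
import re

def parse_glove_lines(lines):
    """Parse lines in the GloVe text format: a word followed by its
    space-separated vector components."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def embed_text(text, vectors):
    """Average the vectors of all known tokens; None if no token is known."""
    tokens = re.findall(r"[a-z]+", text.lower())
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

# Toy 2-dimensional vectors, NOT real GloVe values (assumption for the demo).
# With a real file this would be: parse_glove_lines(open(path, encoding="utf-8"))
toy_lines = ["invoice 1.0 0.0", "total 0.0 1.0"]
vectors = parse_glove_lines(toy_lines)

# Stand-in for OCR output. With pytesseract installed, this could come from
# pytesseract.image_to_string(PIL.Image.open(image_path)) instead.
ocr_text = "Invoice total: 42"

doc_vector = embed_text(ocr_text, vectors)  # [0.5, 0.5]
```

An RNN would consume the per-word vectors in sequence instead of their average, but that is the part I still need to explore.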


Takeaways:
  • Some insights about my part-time work.
  • Initiate new exploration :)