Friday, March 27, 2020

Knowledge Integration



As part of my seminar at the university, I got an opportunity to explore ideas related to integrating knowledge into a neural network to enhance its performance. I am sharing some pretty interesting ideas here.

Expert systems are systems that provide a clear explanation of their decision making. Incorporating this capability into a neural network could enable it to outperform its baseline model in terms of both accuracy and explainability.

Consider an application of time series forecasting, say stock price prediction, or, more relevantly, forecasting COVID-19 cases over a period in a region. Just feeding the model historical case counts and expecting it to give a valid prediction might, from my perspective, be flawed, as several external factors need to be considered: changes in the government's precautionary measures, the lockdowns that were put in place, the social distancing that took shape over time, the age groups affected over the period, etc.
There has to be a way to embed all this extra relevant information into the neural network to make more realistic, explainable predictions.
The explainable part is analogous to the explainability techniques used in computer vision applications of neural networks, such as visualizing gradients and activation maps.
An overview of how the system might be structured


The first idea, which was pretty cool, is to obtain the knowledge embedding (KB) that is highly relevant to the input. A KB is very similar to a word embedding (e.g. word2vec, GloVe), where knowledge encoded as a triplet (e.g. <Corona, affects, Lungs>) is mapped to an embedding vector. There are a bunch of techniques available to determine the vector corresponding to a triplet; one example is the TransE model. Once the knowledge embedding for an information triplet is known, the next step is to concatenate it with the normal input vector of case counts over a period.
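As a rough sketch of this idea (the embeddings here are random toy vectors and the case counts are made up; in practice the embeddings would be learned from a knowledge graph), a triplet can be scored TransE-style and its embedding concatenated with the usual input window:

```python
import numpy as np

# Toy vocabulary of entities and relations; in a real system these
# embeddings are learned from the knowledge graph, not random.
rng = np.random.default_rng(0)
dim = 8
entities = {"Corona": rng.normal(size=dim), "Lungs": rng.normal(size=dim)}
relations = {"affects": rng.normal(size=dim)}

def transe_score(h, r, t):
    """TransE models a triplet <h, r, t> as h + r ≈ t; a smaller
    distance means a more plausible triplet."""
    return float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

def triplet_embedding(h, r, t):
    """One simple way to get a single vector for a triplet: concatenate
    the three embeddings (other pooling schemes are possible)."""
    return np.concatenate([entities[h], relations[r], entities[t]])

# Concatenate the knowledge embedding with the usual window of case counts.
cases_window = np.array([10.0, 14.0, 21.0, 30.0, 44.0])  # last 5 days (made up)
kb_vec = triplet_embedding("Corona", "affects", "Lungs")
model_input = np.concatenate([cases_window, kb_vec])
print(model_input.shape)  # (29,) = 5 case counts + 3 * 8-dim embeddings
```

The combined vector then goes into the forecasting network in place of the plain case-count window.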
In the case of COVID case prediction, one can feed the model not just the previous cases but also the current state of the region (e.g. in the form of a news article). Even better, one can maintain a knowledge graph, updating its connections as events unfold. Finally, during prediction, extract relevant sub-graphs from the knowledge graph, convert them to KB embeddings, and feed them to the neural network. This allows the network to gain deeper insight into the situation.



There are variations of the above technique, such as incorporating an attention mechanism over the neighbours of a considered KB embedding, which I have touched upon in my survey paper:

https://arxiv.org/abs/2008.05972

The second idea talks about altering the states of a sequence-to-sequence model. A seq-to-seq model predicts an output sequence given an input sequence (e.g. language translation). An encoder-decoder architecture is used for this purpose, where each stage of the model is a recurrent neural network (variants like LSTM or GRU can be used).
The expert knowledge is maintained as a separately trained RNN model. During training, at a given time step (state) of our network, we integrate the hidden state of the to-be-trained model with the corresponding state of the trained RNN via a gating mechanism; this ensures focus at a granular level while predicting the output sequence.
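A minimal sketch of such a gating mechanism (dimensions are made up and the gate parameters are randomly initialized here; in practice they are learned jointly with the rest of the model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

dim = 16
rng = np.random.default_rng(1)
# Gate parameters (learned jointly with the model during training).
W_g = rng.normal(scale=0.1, size=(dim, 2 * dim))
b_g = np.zeros(dim)

def gated_merge(h_model, h_expert):
    """Blend the trainable model's hidden state with the frozen expert
    RNN's state. The gate decides, per dimension, how much to trust
    each source at this time step."""
    g = sigmoid(W_g @ np.concatenate([h_model, h_expert]) + b_g)
    return g * h_model + (1.0 - g) * h_expert

h_model = rng.normal(size=dim)   # hidden state of the model being trained
h_expert = rng.normal(size=dim)  # corresponding state of the trained expert RNN
h_merged = gated_merge(h_model, h_expert)
print(h_merged.shape)  # (16,)
```

Because the gate output is in (0, 1), each dimension of the merged state is a convex combination of the two hidden states.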

The third idea considers the desired and predicted probability distributions of the expert knowledge and of our model. It trains our model by ensuring that the model's probability distribution P(X|Y), where Y is the input sequence and X the output sequence, stays as close as possible (in Kullback-Leibler divergence) to the distribution of the expert knowledge. This desired distribution can be built manually, from another trained neural network, or from some kind of n-gram model.
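As an illustration (the distributions below are made up), the extra training term is just the KL divergence between the expert's desired distribution and the model's predicted one:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), where p and q are
    probability distributions over the output vocabulary; eps guards
    against log(0)."""
    p, q = np.asarray(p), np.asarray(q)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Expert's desired distribution over the next token (e.g. from an n-gram model)
p_expert = np.array([0.7, 0.2, 0.1])
# Our model's predicted distribution P(X|Y) at this step
p_model = np.array([0.5, 0.3, 0.2])

# This term is added to the usual training loss, pulling the model's
# distribution towards the expert's; it is zero when they match.
loss_kl = kl_divergence(p_expert, p_model)
print(round(loss_kl, 4))  # 0.0851
```

Minimizing this term alongside the usual cross-entropy loss nudges the model towards the expert's behaviour without hard-coding it.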

There are several other ideas I came across, such as using a CNN for forecasting, using fuzzy sets, etc., which can be checked in the attached survey paper. I think conveying all of them here might lead to confusion.

All in all, I think providing the neural network with additional information relevant to the input gives it an edge in making more realistic predictions. This enables a kind of synergy between an expert system and a neural network, capturing the best parts of both.

Tuesday, March 24, 2020

COVID hackathon

So, with the onset of COVID-19, all the governments are doing whatever they can to support the fight against the virus.

On those lines, I came across a portal

https://www.bundesregierung.de/breg-de/themen/coronavirus/wir-vs-virus-1731968

where the German government organized a 48-hour remote hackathon, asking the public to come up with possible solutions/prototypes to tackle several challenges. Some of those challenges were: optimal allocation of resources (staff, medical equipment, ...), handling the mental health of people in home isolation, and remote assistance/diagnosis and recommendations for concerned patients. I personally liked the idea of alerting individuals about how risky their current situation is (using GPS, the infection spread at their location, and predicted/estimated spread), allowing them to plan their travel accordingly.

I got a chance to be a part of the event with a team from my workplace. Although everything was in German, it felt really good to be a part of something and contribute.

With a decently large team, after a bunch of brainstorming, we came up with an idea that supported the following functionalities:

  • Diagnose and recommend the next steps of action for patients after they fill out a questionnaire (of course, leveraging an expert system here).
  • Support a dashboard of statistics that deep-dives into the available data from official platforms (via APIs) and provides visualizations of the analyzed data, e.g. temporal evolution of cases per state, cases divided by age group and gender, a geographical heatmap of the severity of infection spread, etc.
  • Pass the results from the questionnaires to a model to predict how "COVID-19" the patient is (on a scale of 0-1).
  • Use the predictions from the model to chart out the potential cases that could be seen in the coming days in a given geographic area.
  • Allow registered hospitals across the country to report encountered cases on a daily basis, which can in turn be used to derive statistics.
  • Expose the derived stats to interested 3rd parties.
  • All the above insights might help the government come up with a decent plan to tackle the situation.
I was initially asked to come up with a very raw wireframe for the app. I suck at UI; still, I thought I'd at least try to have a decent flow for the app:


I tried to have two roles, patient and admin, as the patient does not need to know the statistics (for all we know, they might panic even more!). The admin has access to see the stats, register cases for a hospital, etc.

There were amazing UI, backend and data science experts in the team involved in building the prototype. I tried to squeeze myself into one of the data science tasks. As it was just a prototype, and due to time limitations, we decided to use a snapshot of the actual data and build the above functionalities on top of it (hence, db or no db doesn't make much of a difference here).

I contributed to the tasks of data gathering, data munging, basic EDA on the data, and finally visualising the insights. I used Python for these tasks and Plotly for plotting. Felt good :)
Also, while we were exploring the feasibility of using MongoDB, I set up a basic Node JS server doing basic CRUD on MongoDB, although it didn't make it in, as I had to focus on the data science parts.

Finally, it was amazing to see the coordination within the team and the integration of all the moving parts. Not only did I get to explore technical stuff (web app stuff, data stuff) but also several aspects of managing a team: providing constructive suggestions, organisational tools, discussing possible approaches to achieve a subtask, etc. Would've enjoyed it even more had I known German :-\

And we had the submission:


and the app is running here:


Everyone killed it hard! Kudos to the team.

Improvements from my perspective:
  • Currently the model predicts the degree of sickness using the patient's questionnaire responses as input features. It might be too biased to rely on these features entirely, as I think there is always a degree of uncertainty that should come into the picture. So considering distributions over the model's parameters via Bayesian learning (Bayesian linear regression, a Bayesian neural network, or a Gaussian process, via kernelization) might make sense here, as this approach allows a sense of doubt while making predictions (the distributions get updated via posterior beliefs).
  • Also, I think an online learning system might make sense here, as the model needs to improve in real time as and when it encounters new data.
  • And if data privacy is an issue, there is federated learning, which allows the model to be trained and deployed across local devices without global access to their data. The central server only exchanges model parameters with the hospitals, not their actual data.
So combining these might make the system really powerful. I am not sure about the intricacies; just a thought.
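To sketch the first point, here is a minimal Bayesian linear regression on synthetic data (the features and the "COVID-19 score" targets are entirely made up); the predictive variance is the built-in sense of doubt:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for questionnaire features and a 0-1 sickness score.
X = rng.normal(size=(50, 3))
true_w = np.array([0.4, -0.2, 0.3])
y = X @ true_w + rng.normal(scale=0.1, size=50)

alpha, beta = 1.0, 100.0  # prior precision, noise precision
# Conjugate posterior over the weights:
#   S_N = (alpha*I + beta*X^T X)^-1,  m_N = beta * S_N X^T y
S_N = np.linalg.inv(alpha * np.eye(3) + beta * X.T @ X)
m_N = beta * S_N @ X.T @ y

def predict(x):
    """Predictive mean and variance; the variance grows for inputs
    unlike the training data, i.e. the model knows when to doubt."""
    mean = float(x @ m_N)
    var = float(1.0 / beta + x @ S_N @ x)
    return mean, var

mean, var = predict(np.array([1.0, 0.0, 0.0]))
```

Unusual patient profiles get wide predictive intervals instead of a confidently wrong point estimate.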
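For the second point, a sketch of an online update: instead of retraining from scratch, each newly labelled case nudges the weights one step (the data stream and learning rate are made up):

```python
import numpy as np

w = np.zeros(3)  # linear model weights, updated online
lr = 0.05        # hypothetical learning rate

def online_update(w, x, y, lr):
    """One online SGD step on squared error: the model moves towards
    each new labelled case as it arrives."""
    err = x @ w - y
    return w - lr * err * x

rng = np.random.default_rng(3)
true_w = np.array([0.5, -0.3, 0.2])
for _ in range(2000):          # simulate a stream of incoming cases
    x = rng.normal(size=3)
    y = x @ true_w
    w = online_update(w, x, y, lr)
print(np.round(w, 2))  # converges towards the underlying weights
```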
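And for the third point, a sketch of one round of federated averaging (the hospitals and dataset sizes are hypothetical): only parameter vectors reach the server, never patient records:

```python
import numpy as np

def federated_average(hospital_weights, hospital_sizes):
    """One round of federated averaging: each hospital trains on its own
    patient data locally and sends only its model parameters; the server
    combines them weighted by local dataset size."""
    total = sum(hospital_sizes)
    return sum((n / total) * w
               for w, n in zip(hospital_weights, hospital_sizes))

# Hypothetical parameter vectors from three hospitals after local training.
w_a = np.array([0.2, 0.5])
w_b = np.array([0.4, 0.1])
w_c = np.array([0.3, 0.3])
global_w = federated_average([w_a, w_b, w_c], [100, 300, 100])
print(global_w)  # [0.34 0.22], weighted towards hospital B's larger dataset
```

The averaged model is then broadcast back to the hospitals for the next local training round.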