In the previous post, we talked about RNNs and how performing Backpropagation Through Time (BPTT) on an unrolled RNN with many time steps can lead to vanishing or exploding gradients and to difficulty learning long-term dependencies. In this post, we’re going to look at the LSTM (Long Short-Term Memory) […]

RNN and Vanishing/Exploding Gradients

In this post, we’re going to be looking at: Recurrent Neural Networks (RNN), weight updates in an RNN, unrolling an RNN, and the Vanishing/Exploding Gradient Problem. Recurrent Neural Networks: A Recurrent Neural Network (RNN) is a variant of neural network in which the output of each neuron cycles back into itself, hence the name recurrent. This means that each […]

K-Means Clustering

K-Means Clustering is an unsupervised learning algorithm. It works by grouping similar data points together to try to find underlying patterns. The number of groups is pre-defined by the user as K. How the Algorithm Works: Before the iterative update starts, a random selection of centroid locations is picked on the graph. These centroids act […]
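As a rough illustration of the excerpt above, here is a minimal sketch of K-Means in plain Python. It makes some assumptions: the initial centroids are drawn from the data points themselves, the points are 2-D, and the loop runs for a fixed number of iterations using the standard assign-then-update steps. The function name `kmeans` and its parameters are made up for this example and are not taken from the post.

```python
import random

def kmeans(points, k, iterations=10, seed=0):
    """Group 2-D points into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: pick k of the data points at random as starting centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

# Two visibly separate groups of points, clustered with K = 2.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = kmeans(points, k=2)
```

On this toy data, the first three points should share one label and the last three the other. Which group gets label 0 depends on the random starting centroids.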

Random Forests

A random forest is an ensemble approach that combines multiple decision trees. Ensembling and Decision Trees: we first need to explain what these two things are. Decision Trees: Decision trees try to encode and separate the data into if-else rules. They break the data down into smaller and smaller subsets. Each node poses the question, […]

Branches of Machine Learning

I just finished reading the book “The Master Algorithm”, where the author tries to find the ultimate Machine Learning algorithm that can solve many varieties of problems (text, image, predictive, time series, etc.). In the book, he goes over the 5 main branches (or tribes) of Machine Learning. They are: The Evolutionaries, The Connectionists, The Symbolists, […]


A Generative Adversarial Network (GAN) is a pair of neural network models: a Discriminator and a Generator. The goals of the two models oppose each other. Discriminator: given a set of features, we try to predict the label. Generator: given a label, we try to predict the features that lead to the […]

Visualizing Neural Networks

Neural networks have always been something of a black box when it comes to their implementation and how they produce good results. I came across some material that shows visually how neural networks morph the problem space so that the data becomes separable. Simple Data: Here’s a sample graph that is not linearly separable: When […]

Tips for Kaggling

I’ve been doing Kaggle competitions for a while (although without much success), and I’ve learned quite a few things along the way. One of them is how to properly approach the problem and iterate on it to climb the LB (leaderboard). Setting the baseline: The first thing I would do is to use some […]

Say NO to Overfitting!

Just some experience from working on a very small data set of 1703 training samples and 1705 testing samples. One way to combat overfitting is to use cross-validation. While doing so, it’s important not to just look at the final validation score, but also to observe the training process itself. If […]