Posts

Showing posts from April, 2018

Introducing Convolutional Neural Networks with a simple architecture

Convolutional Neural Networks (CNNs) differ from standard Neural Networks (NNs) in that their input is 2-dimensional, versus 1-dimensional in a standard neural network. While the underlying principles of CNNs and NNs are the same, CNNs do introduce some new concepts. In this article, we take a look at how CNNs operate using the very simple CNN architecture described below. In this simple CNN, there is one 4x4 input matrix, one 2x2 filter matrix (also known as a kernel), a single convolution layer with 1 unit, a single Rectified Linear Unit (ReLU) layer with one unit, a single pooling layer and a single fully connected (FC) layer. The elements of the filter matrix are equivalent to the unit weights in a standard NN and are updated during the back-propagation phase. Real-life CNNs are significantly more complex than this, with several repeating convolutional, ReLU and pooling layers forming a true deep learning network. All the matrix dimensions are also orders of magnitude larger…
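A minimal NumPy sketch of this pipeline, assuming a stride-1 "valid" convolution, a 2x2 max pool with stride 1, and a fully connected layer with random illustrative weights (all values here are made up, not taken from the post):

```python
import numpy as np

# Hypothetical 4x4 input and 2x2 filter (kernel); values are illustrative
x = np.array([[1., 0., 2., 1.],
              [0., 1., 3., 0.],
              [2., 2., 0., 1.],
              [1., 0., 1., 2.]])
k = np.array([[1., -1.],
              [0.,  1.]])

# Convolution layer (as in most CNN libraries, really cross-correlation):
# slide the 2x2 filter over the 4x4 input with stride 1 -> 3x3 feature map
conv = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        conv[i, j] = np.sum(x[i:i+2, j:j+2] * k)

# ReLU layer: clamp negative activations to zero
relu = np.maximum(conv, 0.)

# 2x2 max pooling with stride 1 on the 3x3 map -> 2x2 output
pooled = np.array([[relu[i:i+2, j:j+2].max() for j in range(2)]
                   for i in range(2)])

# Fully connected (FC) layer: flatten and apply weights
w = np.random.randn(pooled.size)   # illustrative random weights
out = w @ pooled.flatten()
print(conv, relu, pooled, out, sep="\n")
```

In training, the elements of `k` and `w` would be updated by back-propagation, just like unit weights in a standard NN.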

Part-II: A gentle introduction to Neural Network Backpropagation

In part-I, we derived the back-propagation formula using a simple neural net architecture that does not use an activation function. In this article, a follow-on to part-I, we look at the same NN architecture with the Sigmoid activation function added to both the hidden and output layers. Note that since we have only 1 weight per unit, we could use standard derivatives rather than partial derivatives; however, to keep the nomenclature consistent with other texts on back-propagation, we will use partial derivatives. This article attempts to explain back-propagation in an accessible fashion, with a simple NN and explanations at each step. After reading this text, readers are encouraged to work through the more complex derivations given in Mitchell's book and elsewhere to fully grasp the concept of back-propagation. You do need a basic understanding of partial derivatives; some of the best explanations are available in videos by Sal Khan at Khan Academy. The neural network, which has 1 input, 1 hidden…
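A minimal numeric sketch of the part-II setup, assuming a 1-1-1 network with weights w1 and w2, no biases, sigmoid on both the hidden and output units, squared-error loss E = 0.5(t - y)^2, and illustrative input, target and learning rate:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, t = 0.5, 0.8        # illustrative input and target
w1, w2 = 0.3, 0.7      # illustrative initial weights
lr = 0.1               # learning rate

for _ in range(1000):
    # Forward pass: sigmoid applied at both hidden and output layers
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    # Backward pass via the chain rule, using sigmoid'(z) = s(z)(1 - s(z))
    dE_dy = -(t - y)                    # from E = 0.5 * (t - y)^2
    dE_dw2 = dE_dy * y * (1 - y) * h    # dE/dy * dy/dw2
    dE_dw1 = dE_dy * y * (1 - y) * w2 * h * (1 - h) * x
    # Gradient-descent updates
    w1 -= lr * dE_dw1
    w2 -= lr * dE_dw2

print(sigmoid(w2 * sigmoid(w1 * x)))   # output approaches the target t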

Deriving Pythagoras' theorem using Machine Learning

In an earlier article introducing machine learning, we saw a comparison of equations derived from scientific principles versus those derived via machine learning. In this article, we will see how Dythagoras, a citizen of the faraway planet Darth in the Dilky Day galaxy, uses machine learning to derive a correlation between the lengths of the 3 sides of a right-angled triangle. (Any resemblance of Dythagoras to Pythagoras is not intentional.) Unlike humans, the citizens of Darth (Darthians) are not very bright mathematically, but they are capable of making observations and tools. They had been handed a computer and the Dasmic machine learning library by the human astronauts of a SpaceX expedition who visited them a few months ago. The citizens of Darth need to shorten the travel time between 2 of their biggest cities, Aville and Cville. Currently, the only way to go from Aville to Cville is via Bville, which takes several hours. The Darthians need to know the direct distance between Aville and Cville…
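The Dasmic library's API isn't shown in this excerpt, so here is a sketch of the same idea in plain NumPy: regress the squared hypotenuse on the squared legs over made-up noisy measurements, and watch both fitted coefficients come out near 1, recovering c^2 = a^2 + b^2:

```python
import numpy as np

# Hypothetical measurements of right-angled triangles: legs a, b, hypotenuse c
rng = np.random.default_rng(0)
a = rng.uniform(1, 10, 100)
b = rng.uniform(1, 10, 100)
c = np.sqrt(a**2 + b**2) + rng.normal(0, 0.01, 100)  # slight measurement noise

# Least-squares fit of c^2 ≈ w1*a^2 + w2*b^2
X = np.column_stack([a**2, b**2])
w, *_ = np.linalg.lstsq(X, c**2, rcond=None)
print(w)  # both coefficients come out close to 1
```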

Part-I: A gentle introduction to Neural Network Backpropagation

Back-propagation is one of the fundamental concepts in neural networks (NNs). The equations of back-propagation are derived using mathematics, specifically partial derivatives. Many good sources that explain the concept of back-propagation, including one of my favorite books by Tom Mitchell, use a neural network architecture with multiple neurons, which throws a curve ball that is hard to navigate, especially for non-experts. In this article, we use a simple neural network that has only one neuron/unit in each of the input, hidden and output layers, and then derive the back-propagation formula using standard partial derivatives. To further simplify the explanation, the article is broken into two parts: the NN in part-I does not use any activation function, while in part-II we use the sigmoid activation function. Note that since we have only 1 weight per unit, we could use standard derivatives rather than partial derivatives; however, to keep the nomenclature consistent with other texts on back-propagation…
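A minimal sketch of the part-I case, assuming a 1-1-1 network with no activation function, squared-error loss E = 0.5(t - y)^2, and illustrative input, target, weights and learning rate:

```python
# 1-1-1 linear network: h = w1*x (hidden), y = w2*h (output), no activations
x, t = 0.5, 0.8        # illustrative input and target
w1, w2 = 0.3, 0.7      # illustrative initial weights
lr = 0.1               # learning rate

for _ in range(1000):
    h = w1 * x
    y = w2 * h
    # Chain rule on E = 0.5 * (t - y)^2 gives the two weight gradients
    dE_dw2 = -(t - y) * h
    dE_dw1 = -(t - y) * w2 * x
    w1 -= lr * dE_dw1
    w2 -= lr * dE_dw2

print(w1 * w2 * x)     # the network output converges toward the target t
```

With the sigmoid added in part-II, each factor in the chain picks up an extra sigmoid-derivative term, but the structure of the derivation stays the same.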

A gentle introduction to Machine Learning and Artificial Intelligence

Artificial Intelligence (AI) deals with the capability of computers to predict the result of some phenomenon based on past or current information. Machine learning (ML) is the group of algorithms and mathematical equations that makes AI possible. The relation between the two can be summed up as: Artificial Intelligence is the practical and useful (to humans) application of Machine Learning. The foundations of ML are set in mathematics, more specifically statistics. Consider this made-up equation:

M = 0.3 x V1 + 0.41 x V2 + 0.35 x V3 + 0.37 x V4 + 0.5 x V5

This equation computes the value of the variable M. Now let's say that this equation predicts your mood tomorrow: if M > 2, your mood will be good; else it will be sour. The variables V1 through V5 are defined as:

V1: A value for the weather today (1 for Rainy, 2 for Cloudy, 3 for Sunny)
V2: Whether you exercised today (0 for no exercise, 1 for exercise)
V3: Whether you watched a good movie today (0 for no good movie…
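A tiny sketch of evaluating the made-up mood equation above; the weights come from the post, while the inputs are illustrative (the definitions of V4 and V5 are truncated in this excerpt, so their values here are placeholders):

```python
# Weights from the made-up equation; V4/V5 inputs are placeholders
weights = [0.3, 0.41, 0.35, 0.37, 0.5]
v = [3, 1, 1, 1, 0]   # sunny, exercised, good movie, V4=1, V5=0 (assumed)

M = sum(w * vi for w, vi in zip(weights, v))
print(M, "-> good mood" if M > 2 else "-> sour mood")
```

In ML terms, "learning" such an equation means estimating the five weights from past observations of the inputs and the resulting mood, rather than writing them down by hand.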