
TensorFlow: Start With Zero, Nothing To Lose



TensorFlow is the machine learning framework that Google created and uses to design, build, and train deep learning models. You can use the TensorFlow library to do numerical computations, which in itself doesn't seem all too special, but these computations are expressed as data flow graphs. In these graphs, nodes represent mathematical operations, while the edges represent the data, usually multidimensional arrays or tensors, that flow between the nodes.

You see? The name "TensorFlow" is derived from the operations that neural networks perform on multidimensional data arrays, or tensors. It's literally a flow of tensors.

To understand tensors well, it helps to have some working knowledge of linear algebra and vector calculus. Tensors are implemented in TensorFlow as multidimensional data arrays, but a bit more introduction is needed to completely grasp tensors and their use in machine learning.
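For example, here is a minimal sketch of tensors of different ranks in TensorFlow (the values are arbitrary, chosen just for illustration):

```python
import tensorflow as tf

scalar = tf.constant(7.0)              # rank 0: a single magnitude
vector = tf.constant([1.0, 2.0, 3.0])  # rank 1: magnitude along one axis
matrix = tf.constant([[1.0, 2.0],
                      [3.0, 4.0]])     # rank 2: values along two axes

print(scalar.shape)  # ()
print(vector.shape)  # (3,)
print(matrix.shape)  # (2, 2)
```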


A tensor, then, is the mathematical representation of a physical entity that may be characterized by magnitude and multiple directions. Before learning TensorFlow, let's start with what a computational graph is.


The idea behind TensorFlow is the ability to create these computational graphs in code, which allows significant performance improvements via parallel operations and other efficiency gains.

Don't worry, I'm going to explain each line of code below. I'd advise you to open this post in two windows, so you can look at the code and the explanations simultaneously. Before that, let's talk about how TensorFlow works. To make our code execute faster and avoid the drawbacks of languages like Python, which inherently trades speed for readability, everything in TensorFlow is based on creating a computational graph. Think of a computational graph as a network of nodes, where each node, known as an operation, runs some function that can be as simple as addition or subtraction or as complex as some multivariate equation. The graph is then executed in a TensorFlow session with a fast C++ backend. Variables in TensorFlow are managed by the session.
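Here is a minimal sketch of that build-then-run workflow; the constants and the add node are just illustrative:

```python
import tensorflow as tf

# Build phase: nothing is computed yet, we only describe the graph.
a = tf.constant(3.0)
b = tf.constant(4.0)
c = tf.add(a, b)  # a node whose operation is addition

# Run phase: the session executes the graph on the fast C++ backend.
with tf.Session() as sess:
    print(sess.run(c))  # 7.0
```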

Now we know enough to dive in and get our hands dirty with code, which is the fastest way to learn.

```python
import tensorflow as tf  # importing the TensorFlow library

# Input node of the computational graph: four records,
# each with three features (a rank-2 tensor of shape [4, 3])
data = tf.constant([
    [1., 0., 0.],
    [1., 0., 1.],
    [1., 1., 1.],
    [0., 1., 1.],
])

# Output node of the computational graph: one target value
# per record (a rank-2 tensor of shape [4, 1])
label = tf.constant([
    [4.],
    [5.],
    [2.],
    [1.],
])

# Weights, initialized to normally-distributed random numbers
w = tf.Variable(tf.random_normal([3, 1]))

# Machine learning model
prediction = tf.matmul(data, w)

# Learner
error = tf.subtract(label, prediction)
mse = tf.reduce_mean(tf.square(error))  # mean squared error
learning_rate = 0.1  # small step size keeps the updates stable
delta = tf.matmul(data, error, transpose_a=True)
train = tf.assign(w, tf.add(w, learning_rate * delta))

# Session
sess = tf.Session()
sess.run(tf.global_variables_initializer())

epoch, max_epochs = 0, 10
while epoch < max_epochs:
    epoch += 1
    err, _ = sess.run([mse, train])
    print('epoch:', epoch, 'mse:', err)
print(sess.run(w))
```

**Line 1:** It simply imports the TensorFlow library, where all the awesomeness resides. **Lines 5–19:** We define the training set as constants: data holds the binary inputs as a rank-2 tensor, four records in a three-dimensional input space, and label holds the corresponding targets as a 4×1 column.
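As a quick sanity check, you can print the static shapes TensorFlow inferred for these two constants:

```python
# Run after the definitions above
print(data.shape)   # (4, 3): 4 records in a 3-dimensional input space
print(label.shape)  # (4, 1): one target value per record
```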

**Line 22:** w holds the weights, a 3×1 matrix, and we let our computational graph decide the values by which each input should be multiplied to get the desired function we intend to approximate. (A bias term is often added as well; to learn more about bias, refer to this Stack Overflow thread.) The values are initialized to normally-distributed random numbers, which is why we use tf.random_normal.
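To see what tf.random_normal produces, here is a tiny standalone sketch, separate from the model above:

```python
# Every evaluation draws a fresh [3, 1] sample from a standard normal
# distribution (mean 0, stddev 1 by default), so different runs start
# the weights at different values.
with tf.Session() as s:
    print(s.run(tf.random_normal([3, 1])))
```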

**Line 25:** The input (data) and output (label) training data remain constant, so our model is going to approximate the function by adjusting the weights (w), in this case 3×1, since 3 inputs correspond to 1 output. The model keeps adjusting the weights to find their optimal values during the training process, hence w is a Variable object. I hope the commonly used equation for neural networks, y = Wx, makes more sense to you now.
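In matrix form (writing X for data, w for the weights, and ŷ for the prediction; the symbols are mine, not the code's), the shapes work out as:

```latex
\underbrace{\hat{y}}_{4 \times 1} \;=\; \underbrace{X}_{4 \times 3}\,\underbrace{w}_{3 \times 1},
\qquad \hat{y}_i = \sum_{j=1}^{3} X_{ij}\, w_j .
```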

**Lines 28–29:** We can define the error and the mean squared error in just two lines. error computes how "off" our model's predictions are from the real outputs of the training set. We then compute the mean squared error. Why? There is a very beautiful piece of math behind it; you'd already know it if you took a probability class in school, but here is a nice short tutorial explaining it.
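For reference, with n = 4 training records, the quantity computed by mse is:

```latex
\mathrm{MSE} \;=\; \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 .
```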

**Lines 30–32:** Evaluating these nodes adjusts the value of our predefined Variable. In this piece of code, we compute the desired adjustment (delta) based on the error computed earlier, scale it by a small learning rate, and then add it to our weights.
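This is plain gradient descent on the squared error: delta is, up to a constant factor, the negative gradient, and the learning rate η scales the step:

```latex
\nabla_{w}\, \lVert y - Xw \rVert^{2} \;=\; -2\, X^{\top} (y - Xw)
\qquad\Longrightarrow\qquad
w \;\leftarrow\; w + \eta\, X^{\top} (y - Xw).
```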

**Lines 35–36:** The model has to be evaluated by a TensorFlow session, which we instantiate before initializing all variables to their specified values. Remember? We get to run our model using TensorFlow's fast C++ backend; the magic is behind sessions.

**Lines 38–43:** We can now run our model through training epochs, adjusting the weights each time by evaluating train. The mean squared error will fall steadily toward its minimum, the least-squares fit for these four records. Note that Session.run returns the evaluated values of the tensors it is asked to run. On each epoch, we evaluate mse to track progress, and train to actually adjust the weights.

Now go ahead and run this code.