Tutorial: Introduction to TensorFlow
Introduction
TensorFlow is a powerful, open-source software library for building deep learning applications. This tutorial will guide you through using TensorFlow to build, train, and evaluate a deep learning model.
Here's what you'll learn:
- The basics of TensorFlow
- How to use its features when developing deep learning applications
- How to implement a predictive model with TensorFlow's Keras API
TensorFlow has become very popular in recent years because it provides a simple-yet-efficient way to build and deploy models, allowing developers to solve complex problems quickly and accurately.
About This Tutorial
This tutorial is based on part of our interactive course on TensorFlow.
It uses the latest major version of the library, TensorFlow 2, an open-source platform developed by Google.
Let's get started!
What Is TensorFlow?
(Figure: the final plot from the Deep Learning Course Guided Project, "Time-Series Forecasting on the S&P 500," shows deep learning in action. Don't you wish you could predict the S&P 500 this accurately?)
TensorFlow is an open-source machine learning platform developed by Google that provides a powerful suite of tools for data scientists and developers to build, train, and deploy machine learning models. It was initially released in 2015 and has continued to evolve over the years, gaining more powerful capabilities with each major release.
The TensorFlow library allows developers to create complex neural networks using a variety of programming languages, such as Python and JavaScript. Additionally, TensorFlow makes it easy to deploy models on mobile devices or cloud platforms like Google Cloud Platform (GCP) and Amazon Web Services (AWS).
TensorFlow is most commonly used for deep learning applications, such as natural language processing (NLP), image recognition, text classification, object detection, recommendation systems, and much more.
What Are Tensors?
A tensor is an array of data that can be processed by TensorFlow. In simple terms, we can think of it as a collection of numbers arranged into a particular shape. A vector or a matrix is just a special case of a tensor; mathematically, tensors generalize these to n dimensions, making them a natural way to represent n-dimensional datasets.
A zero-dimensional tensor is a scalar, which contains a single value and has no axes. A one-dimensional tensor is a vector, which contains a list of values and has one axis. A two-dimensional tensor is a matrix that contains values stored across two axes.
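To make this concrete, here is a minimal sketch (assuming TensorFlow 2 is imported as `tf`, as we do later in this tutorial) that builds tensors of rank 0, 1, and 2 and inspects them:

import tensorflow as tf

scalar = tf.constant(7)                       # rank 0: a single value, no axes
vector = tf.constant([1.0, 2.0, 3.0])         # rank 1: one axis
matrix = tf.constant([[1, 2], [3, 4]])        # rank 2: two axes

print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2
print(matrix.shape)                           # (2, 2)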
What Is TensorFlow Used for?
TensorFlow is used in a variety of applications, ranging from natural language processing (NLP) and image recognition to predictive analytics and autonomous vehicle control. It can be used to train deep neural networks for object detection and classification, generate recommendations, classify images, and build voice-powered applications.
In addition, TensorFlow is used for forecasting, text-based applications, algorithmic trading, and optimization. It is also used in various healthcare applications such as medical diagnosis and drug discovery.
All of these use cases demonstrate the versatility and power that TensorFlow offers to data scientists and developers. With its flexibility and scalability, TensorFlow enables the implementation of complex machine learning tasks in a fraction of the time compared to traditional methods.
How Does TensorFlow Work?
At the core of TensorFlow is a dataflow graph, which describes how data moves through a series of operations or transformations. The basic idea behind the dataflow graph is that operations are expressed as nodes, with each node performing a single operation on its inputs. The inputs and outputs of the operations are passed through edges (tensors). This makes it possible to break down complex computations into smaller, more manageable chunks.
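In TensorFlow 2, you can watch this graph construction happen by decorating ordinary Python code with `tf.function`, which traces the operations into a dataflow graph. A minimal sketch (the function name and values are just for illustration):

import tensorflow as tf

@tf.function
def scale_and_shift(x, w, b):
    # The multiply and add each become a node in the traced graph;
    # the tensors flowing between them are the edges.
    return x * w + b

print(scale_and_shift(tf.constant(2.0), tf.constant(3.0), tf.constant(1.0)))
# tf.Tensor(7.0, shape=(), dtype=float32)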
TensorFlow also provides a number of tools for constructing and training neural networks. One of the most popular is the `tf.keras` API, which allows users to quickly build and train deep learning models without writing low-level code from scratch. TensorFlow also includes powerful visualization tools (such as TensorBoard) to help users understand the data and model parameters.
TensorFlow is also extensible, and it can be used with a variety of programming languages, including Python, C++, JavaScript, and Go. It also has support for running on GPUs (graphics processing units) for maximum performance. With its robust set of features, TensorFlow is one of the most popular frameworks for deep learning, and it continues to evolve as more powerful algorithms are developed.
Using TensorFlow for Building Models
Now that we understand what TensorFlow is, let's look at how to use it in practice. We'll walk through an example of building a regression model with the Keras API in TensorFlow, including defining the model architecture, training it, and evaluating its performance. We'll outline the key elements of each step; if you want to go into greater depth, we recommend checking out the course on TensorFlow.
Step 1: Loading the Required Libraries and Modules
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Set a random seed so results are reproducible across runs
tf.random.set_seed(100)
Step 2: Loading the Data and Performing Basic Data Checks
We'll use the built-in `mpg` dataset that ships with the `seaborn` library. We load it with the following code.
df = sns.load_dataset('mpg')
print(df.shape)
df.describe(include='all')
(398, 9)
| | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
|---|---|---|---|---|---|---|---|---|---|
| count | 398.000000 | 398.000000 | 398.000000 | 392.000000 | 398.000000 | 398.000000 | 398.000000 | 398 | 398 |
| unique | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 | 305 |
| top | NaN | NaN | NaN | NaN | NaN | NaN | NaN | usa | ford pinto |
| freq | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 249 | 6 |
| mean | 23.514573 | 5.454774 | 193.425879 | 104.469388 | 2970.424623 | 15.568090 | 76.010050 | NaN | NaN |
| std | 7.815984 | 1.701004 | 104.269838 | 38.491160 | 846.841774 | 2.757689 | 3.697627 | NaN | NaN |
| min | 9.000000 | 3.000000 | 68.000000 | 46.000000 | 1613.000000 | 8.000000 | 70.000000 | NaN | NaN |
| 25% | 17.500000 | 4.000000 | 104.250000 | 75.000000 | 2223.750000 | 13.825000 | 73.000000 | NaN | NaN |
| 50% | 23.000000 | 4.000000 | 148.500000 | 93.500000 | 2803.500000 | 15.500000 | 76.000000 | NaN | NaN |
| 75% | 29.000000 | 8.000000 | 262.000000 | 126.000000 | 3608.000000 | 17.175000 | 79.000000 | NaN | NaN |
| max | 46.600000 | 8.000000 | 455.000000 | 230.000000 | 5140.000000 | 24.800000 | 82.000000 | NaN | NaN |
If you examine the output, it shows there are 398 rows (or observations) and 9 columns (or variables). The `horsepower` variable has a count of 392, which indicates there are six missing values in that column.
You can also see that three variables, `cylinders`, `origin`, and `model_year`, are categorical, and we'll have to convert them into numeric features for modeling in the next step.
Also, the `name` variable has many unique values, 305 to be precise; hence, it isn't useful for modeling purposes.
Step 3: Data Preprocessing
We'll perform the following three preprocessing steps.
- Fill the missing values in the `horsepower` variable with its median value. You could impute the missing values with the mean instead, but we use the median here because it is less influenced by outliers.
- Drop the `name` variable, since it doesn't add any predictive power to the data.
- Dummy-encode the `cylinders`, `origin`, and `model_year` variables to convert them into numeric features.
# Impute missing horsepower values with the column median
df['horsepower'].fillna(df['horsepower'].median(), inplace=True)

# Drop the high-cardinality name column
df = df.drop(['name'], axis=1)

# Dummy-encode the categorical variables (drop_first avoids a redundant column)
df = pd.get_dummies(df, columns=['cylinders'], drop_first=True, prefix='Cylinder')
df = pd.get_dummies(df, columns=['model_year'], drop_first=True, prefix='Year')
df = pd.get_dummies(df, columns=['origin'], drop_first=True, prefix='Origin')
df.head()
| | mpg | displacement | horsepower | weight | acceleration | Cylinder_4 | Cylinder_5 | Cylinder_6 | Cylinder_8 | Year_71 | ... | Year_75 | Year_76 | Year_77 | Year_78 | Year_79 | Year_80 | Year_81 | Year_82 | Origin_japan | Origin_usa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 307.0 | 130.0 | 3504 | 12.0 | 0 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 1 | 15.0 | 350.0 | 165.0 | 3693 | 11.5 | 0 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 2 | 18.0 | 318.0 | 150.0 | 3436 | 11.0 | 0 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 3 | 16.0 | 304.0 | 150.0 | 3433 | 12.0 | 0 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 4 | 17.0 | 302.0 | 140.0 | 3449 | 10.5 | 0 | 0 | 0 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
5 rows × 23 columns
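As an optional sanity check (not part of the original pipeline), you can confirm that the imputation and encoding behaved as expected:

print(df.isna().sum().sum())  # 0: no missing values remain
print(df.shape)               # (398, 23): name dropped, dummy columns added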
Step 4: Create Features and Response Variables
The code below performs the following tasks:
- Creates an object for the target variable and a list containing all the independent variables.
- Normalizes the predictor variables, since their units differ significantly and that difference could influence the modeling process. To prevent this, we scale each predictor by dividing it by its column maximum, so all values fall between 0 and 1.
target_variable = ['mpg']
predictors = list(set(list(df.columns))-set(target_variable))
df[predictors] = df[predictors]/df[predictors].max()
df.describe()
| | mpg | displacement | horsepower | weight | acceleration | Cylinder_4 | Cylinder_5 | Cylinder_6 | Cylinder_8 | Year_71 | ... | Year_75 | Year_76 | Year_77 | Year_78 | Year_79 | Year_80 | Year_81 | Year_82 | Origin_japan | Origin_usa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | ... | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 | 398.000000 |
| mean | 23.514573 | 0.425112 | 0.453496 | 0.577904 | 0.627746 | 0.512563 | 0.007538 | 0.211055 | 0.258794 | 0.070352 | ... | 0.075377 | 0.085427 | 0.070352 | 0.090452 | 0.072864 | 0.072864 | 0.072864 | 0.077889 | 0.198492 | 0.625628 |
| std | 7.815984 | 0.229164 | 0.166185 | 0.164755 | 0.111197 | 0.500471 | 0.086601 | 0.408571 | 0.438523 | 0.256061 | ... | 0.264331 | 0.279868 | 0.256061 | 0.287190 | 0.260241 | 0.260241 | 0.260241 | 0.268335 | 0.399367 | 0.484569 |
| min | 9.000000 | 0.149451 | 0.200000 | 0.313813 | 0.322581 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 25% | 17.500000 | 0.229121 | 0.330435 | 0.432636 | 0.557460 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 50% | 23.000000 | 0.326374 | 0.406522 | 0.545428 | 0.625000 | 1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
| 75% | 29.000000 | 0.575824 | 0.543478 | 0.701946 | 0.692540 | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
| max | 46.600000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
8 rows × 23 columns
Step 5: Create the Training and Test Datasets
The code below performs the following tasks:
- Creates arrays of independent (X) and dependent (y) variables, respectively
- Splits the data into training and test sets
X = df[predictors].values
y = df[target_variable].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=100)
print(X_train.shape); print(X_test.shape)
(278, 22)
(120, 22)
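One caveat worth noting: in Step 4 we scaled using maxima computed over the full dataset, so information from the test rows leaks into the scaling. A stricter workflow fits a scaler on the training split only and applies it to both splits. Here is a sketch using scikit-learn's `MinMaxScaler` as an alternative (the rest of this tutorial keeps the simpler approach above):

from sklearn.preprocessing import MinMaxScaler

# Learn min/max from the training rows only, then reuse them for the test rows
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)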
Step 6: Building, Training, and Evaluating the Model
The first step here is to define the model architecture using the Sequential class from the Keras API. The code below performs the following tasks:
- Instantiates the Sequential model class from TensorFlow's Keras API
- Uses `relu` as the activation function for the hidden layers
ReLU (Rectified Linear Unit) is a widely used non-linear activation function in deep learning models. It adds non-linearity to a neural network by passing positive inputs through unchanged and mapping everything else to zero, which means it has a non-zero gradient only when the input is greater than 0.
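To see this behavior concretely, here is a quick check using `tf.nn.relu` on a small tensor (the values are just for illustration):

print(tf.nn.relu(tf.constant([-2.0, 0.0, 3.0])))
# tf.Tensor([0. 0. 3.], shape=(3,), dtype=float32)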
Note that there are many possible configurations for the network. This architecture is just one of them, not a recommendation; feel free to experiment with different configurations.
# Define the model: four hidden layers narrowing from 32 to 4 units,
# plus a single-unit output layer for the regression target
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(32, input_shape=(X_train.shape[1],), activation='relu'))
model.add(tf.keras.layers.Dense(16, activation='relu'))
model.add(tf.keras.layers.Dense(8, activation='relu'))
model.add(tf.keras.layers.Dense(4, activation='relu'))
model.add(tf.keras.layers.Dense(1))
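After defining the architecture, it's often helpful to inspect it with `model.summary()`, which prints each layer's output shape and parameter count. Each Dense layer has (inputs × units + units) parameters, so with our 22 input features the counts come to 736 + 528 + 136 + 36 + 5 = 1,441 trainable parameters.

model.summary()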
Next, we need to compile the model with the appropriate loss and optimizer for our task. Mean absolute error (MAE) is our loss measure, and the Adam optimizer is the algorithm used to minimize that loss.
MAE is a metric used in regression models to measure the difference between predicted values and true values. It is the average of the absolute errors across all samples, where each error is the absolute difference between the true value and the predicted value.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
model.compile(loss='mae', metrics=['mae'], optimizer=optimizer)
Next, we'll train the model by passing in the training data and setting the number of epochs to 50; each epoch is one complete pass over the training data.
model.fit(X_train, y_train, epochs=50)
Epoch 1/50
9/9 [==============================] - 1s 2ms/step - loss: 22.8766 - mae: 22.8766
Epoch 2/50
9/9 [==============================] - 0s 2ms/step - loss: 18.9705 - mae: 18.9705
Epoch 3/50
9/9 [==============================] - 0s 2ms/step - loss: 10.0245 - mae: 10.0245
Epoch 4/50
9/9 [==============================] - 0s 2ms/step - loss: 7.5122 - mae: 7.5122
Epoch 5/50
9/9 [==============================] - 0s 2ms/step - loss: 4.7029 - mae: 4.7029
...
Epoch 49/50
9/9 [==============================] - 0s 2ms/step - loss: 1.9770 - mae: 1.9770
Epoch 50/50
9/9 [==============================] - 0s 2ms/step - loss: 1.9518 - mae: 1.9518
<keras.callbacks.History at 0x7f72fbeea040>
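As an aside, `model.fit()` returns a `History` object (the last line of output above). If you assign that return value to a variable when training, you can plot the loss curve to check convergence; a minimal sketch:

# Capture the History object (this would replace the plain fit call above)
history = model.fit(X_train, y_train, epochs=50)

# Plot the per-epoch training loss
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('MAE loss')
plt.show()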
Once the model has been trained, we can evaluate its performance on the training and test datasets.
print(model.evaluate(X_train, y_train))
9/9 [==============================] - 0s 2ms/step - loss: 1.8617 - mae: 1.8617
[1.8616846799850464, 1.8616846799850464]
print(model.evaluate(X_test, y_test))
4/4 [==============================] - 0s 3ms/step - loss: 2.3199 - mae: 2.3199
[2.319859266281128, 2.319859266281128]
The mean absolute error for the training and test datasets comes out to 1.86 and 2.32, respectively. The lower the value, the better the model's performance. Sometimes it's more intuitive to evaluate a regression model's performance using a metric called R-squared.
The R-squared value, also known as the coefficient of determination, measures how well a regression model fits the data: it is the proportion of the variance in the dependent variable that the model explains from the independent variables.
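Concretely, R-squared compares the model's squared errors to those of a baseline that always predicts the mean of the true values. Here is a sketch of the computation that `r2_score` performs for a single output:

import numpy as np

def r_squared(y_true, y_pred):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot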
The code chunks below compute the R-squared value on the training and test datasets.
from sklearn.metrics import r2_score
pred_train = model.predict(X_train)
r2_score(y_train, pred_train)
9/9 [==============================] - 0s 2ms/step
0.8904764861187693
pred_test = model.predict(X_test)
r2_score(y_test, pred_test)
4/4 [==============================] - 0s 2ms/step
0.8281138794449109
The R-squared values for the training and test datasets come out to 0.89 and 0.83, respectively. A value of 1 indicates a perfect fit, while lower values indicate that the model explains less of the variability in the data. Since the R-squared values for both the training and test datasets are high, we can conclude that the model we trained is performing quite well.
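Finally, to generate predictions for new data, pass the trained model an array with the same 22 scaled features. For example, predicting mpg for the first test observation (the exact value will vary from run to run):

sample = X_test[:1]           # shape (1, 22)
print(model.predict(sample))  # e.g. [[15.3]]: the predicted miles per gallon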
Conclusion
By following this tutorial, you should now have a better understanding of how to use TensorFlow and its various components to build deep learning models.
Knowing TensorFlow is an important asset for data scientists and machine learning engineers. By learning the basics of how to use it, you'll be able to quickly create sophisticated deep learning models that solve real-world problems. With more practice and learning through interactive courses, you'll be able to build effective models with TensorFlow that help improve decision-making, process data faster, and produce more accurate predictions.
So, are you looking to gain an in-depth understanding of deep learning and develop powerful models with TensorFlow? If the answer is yes, then you should consider taking our Introduction to TensorFlow course. This course provides a comprehensive introduction to deep learning, starting from basic concepts such as shallow neural networks and progressing to more complex architectures. Through interactive exercises culminating in a guided project, you'll get hands-on experience working with real-world datasets and applying models developed with TensorFlow.