March 9, 2023

Tutorial: Introduction to Keras

What Is Keras? What Is It for?

Keras is a high-level, user-friendly API used for building and training neural networks. It is an open-source library built in Python that runs on top of TensorFlow. It was developed to enable fast experimentation and iteration, and it lowers the barrier to entry for working with deep learning.

In this article, we'll discuss how to install and start using Keras; the Sequential API; and the steps for building, compiling, and training a model. We’ll also cover some common applications of Keras and point to further resources to check out!

Installing Keras

To start with, you’ll want to install TensorFlow. Depending on which operating system you’re using, this might look slightly different, but for the most part, you can use pip, Python’s package manager:

# First upgrade pip
pip install --upgrade pip

# Then install TensorFlow
pip install tensorflow

Once you’ve installed TensorFlow, all you need to do to use Keras is to run the following import statement at the top of your script or notebook:

from tensorflow import keras

Keras’ Sequential API

The Sequential API is the easiest way to use Keras to build a neural network. Using the Sequential class, it's possible to stack a variety of different layer types, one after the other, to produce a neural network. There are many types of layers available in the Keras Sequential API. One of the most common layer types is the Dense layer, a fully connected layer, but there are many others:

  • Convolutional layer: a layer for processing images — used for convolutional neural networks.
  • Recurrent layer: a layer for processing sequences of data — used for recurrent neural networks.
  • MaxPooling layer: a layer for down-sampling feature maps by taking the maximum value in non-overlapping rectangular blocks — used for preserving important features while reducing the chance of overfitting.
  • Flatten layer: a layer that flattens the multi-dimensional input tensors into a single dimension — used as a transition layer between convolutional or recurrent layers and fully connected layers in a neural network.
  • Dropout layer: a layer that randomly sets input units to 0 (using a defined frequency) during training time — used as a regularization technique to prevent overfitting in neural networks.
  • Embedding layer: a layer that represents words or phrases in a high-dimensional vector space — used to map words or phrases to dense vectors for use as input to a neural network.

These are just a few examples of the many types of layers available in the Keras Sequential API. Each layer is designed to perform a specific type of computation on the inputs, and they can be combined to create powerful neural network architectures.
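
To make two of these layer types concrete, here is a minimal NumPy sketch of what max pooling and flattening do to a single feature map. The helper name max_pool_2x2 and the sample values are illustrative assumptions, not part of Keras; the real layers also handle batches and channels.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Down-sample a 2D array by taking the max of each non-overlapping 2x2 block."""
    height, width = feature_map.shape
    # Drop a trailing row/column if the size is odd, so the blocks divide evenly.
    trimmed = feature_map[: height - height % 2, : width - width % 2]
    blocks = trimmed.reshape(height // 2, 2, width // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
])

pooled = max_pool_2x2(feature_map)  # 4x4 input becomes a 2x2 output
flattened = pooled.flatten()        # 2x2 output becomes a vector of length 4

print(pooled)
print(flattened)
```

Each 2x2 block is reduced to its largest value, and flattening then turns the pooled grid into the one-dimensional vector that a Dense layer expects.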

Building a Model with Keras

To build a model with the Keras Sequential API, the first step is to import the required class and instantiate a model using this class:

from tensorflow.keras import Sequential

model = Sequential()

Next, choose the layer types you wish to include, and add them one at a time to the sequential model you’ve instantiated. For example, to add three Dense hidden layers and one output layer to an input layer, your code might look like this:

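from tensorflow.keras.layers import Dense

# Layer sizes follow the description below; activation functions are
# not specified there, so none are set here.
model.add(Dense(8, input_shape=(16,)))
model.add(Dense(4))
model.add(Dense(2))
model.add(Dense(1))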


In this example, we’ve added an implicit input layer with 16 nodes and three hidden layers with 8, 4, and 2 nodes, respectively. Finally, we’ve added an output layer with just one node.
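
For reference, the stack described above can also be written by passing a list of layers to Sequential. This is a minimal sketch rather than the original listing: the explicit keras.Input line and the ReLU activations are assumptions, since the text specifies only the layer sizes.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Assumptions: 16 input features per sample and ReLU in the hidden layers.
model = keras.Sequential([
    keras.Input(shape=(16,)),            # input: 16 features per sample
    layers.Dense(8, activation="relu"),  # hidden layer 1
    layers.Dense(4, activation="relu"),  # hidden layer 2
    layers.Dense(2, activation="relu"),  # hidden layer 3
    layers.Dense(1),                     # output layer: one node
])

model.summary()
```

Calling model.summary() prints each layer's output shape and parameter count; with these sizes the model has 185 trainable parameters.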

Image via our forthcoming "Deep Learning Fundamentals" lesson in the "Introduction to Deep Learning in TensorFlow" course.

Compiling a Model with Keras

Once you’ve built a model using the Keras Sequential API, you’ll need to compile it before it can be used for training. When compiling the model, you’ll need to specify both a loss function and an optimizer. There are many different options to choose from:

Loss functions:

  • Mean Squared Error (MSE): a common loss function for regression problems — measures the average squared difference between the predicted and actual values.
  • Binary Crossentropy: a loss function for binary classification problems — measures the cross-entropy between the predicted and actual binary distributions.
  • Categorical Crossentropy: a loss function for multi-class classification problems — measures the cross-entropy between the predicted and actual categorical distributions.


Optimizers:

  • Stochastic Gradient Descent (SGD): a simple optimization algorithm that computes the gradient of the loss function with respect to the parameters on a batch of training data, then moves the parameters a small step in the opposite direction.
  • Adam: an optimization algorithm that adapts the learning rate for each parameter using running averages of past gradients and squared gradients.
  • RMSprop: an optimization algorithm that uses a moving average of squared gradients to normalize the gradient updates.

These are just a few examples of the many loss functions and optimizers available in Keras. The choice of loss function and optimizer will depend on the specific problem you're trying to solve and the characteristics of your data.
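
As a rough illustration of what these loss functions and optimizers compute, the sketch below implements MSE, binary cross-entropy, and a single SGD-style update in NumPy. The function names, sample values, and learning rate are assumptions chosen for illustration; Keras's own implementations add batching, reduction options, and further numerical safeguards.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for 0/1 targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# One gradient step on a one-parameter model y_pred = w * x, trained with MSE.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
w = 0.0
learning_rate = 0.01

loss_before = mean_squared_error(y, w * x)
gradient = np.mean(2 * (w * x - y) * x)  # d(MSE)/dw on this small batch
w = w - learning_rate * gradient         # step against the gradient
loss_after = mean_squared_error(y, w * x)

print(loss_before, loss_after)
print(binary_crossentropy(np.array([1.0, 0.0]), np.array([0.9, 0.2])))
```

The loss drops after the update because the parameter moved against the gradient. Adam and RMSprop follow the same pattern but rescale each step using running statistics of past gradients.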

For example, to compile a model with Mean Squared Error as the loss function and Adam as the optimizer, we would use the following code:

model.compile(loss='mean_squared_error', optimizer='adam')

Fitting a Model with Keras

Finally, to use Keras for deep learning, the compiled model must be fit to a training dataset. Prepare the training dataset by separating the independent variables, the features (or X variable), from the dependent variable, the target (or y variable). Convert both to NumPy arrays and reshape them as needed, and configure the input_shape parameter of the neural network to match the shape of the training data. Once these preprocessing steps are in place, you can fit the model to the training data like so:

model.fit(x_train, y_train)
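
The preparation steps described above might look like the following minimal sketch. The small table of numbers is hypothetical, and it assumes the target sits in the last column; real data will usually need more cleaning than this.

```python
import numpy as np

# Hypothetical dataset: each row is one sample, the last column is the target.
rows = [
    [0.5, 1.2, 3.1, 10.0],
    [0.9, 0.7, 2.8, 12.5],
    [1.4, 1.1, 3.3, 15.0],
    [2.0, 0.3, 2.5, 18.5],
]
data = np.asarray(rows, dtype="float32")

x_train = data[:, :-1]  # features (X): every column except the last
y_train = data[:, -1]   # target (y): the last column

# Shape of a single sample, which is what input_shape should match.
input_shape = x_train.shape[1:]
print(x_train.shape, y_train.shape, input_shape)
```

Arrays prepared this way can be passed to model.fit, provided the first layer of the model was configured with a matching input shape (here, three features per sample).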

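To evaluate the performance of the model after training, you can use evaluate. It returns the loss, followed by the value of each metric passed to compile, so the unpacking below assumes the model was compiled with metrics=['accuracy']: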

test_loss, test_accuracy = model.evaluate(x_test, y_test)

Putting It All Together

Now that we've seen all of the steps required to build and train a model in Keras, let's put it all together and take a look at an example using the MNIST dataset of handwritten digits, which is available through the keras.datasets module. First, we need to make the necessary imports and load the dataset:

from tensorflow import keras
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

For this dataset, we also need to do some preprocessing and reshaping of the data to prepare for the model:

x_train = x_train.reshape(x_train.shape[0], 28, 28, 1).astype('float32') / 255
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1).astype('float32') / 255

y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
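
To see what these preprocessing lines do without downloading MNIST, here is a small NumPy sketch on stand-in data. The random images and the two labels are assumptions for illustration, and the np.eye indexing trick only mirrors the effect of to_categorical; it is not how Keras implements it.

```python
import numpy as np

# Stand-in for MNIST: two random 28x28 grayscale images with pixel values 0-255.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)
labels = np.array([3, 7])

# Add a trailing channel axis and rescale the pixels to the range [0, 1].
scaled = images.reshape(images.shape[0], 28, 28, 1).astype("float32") / 255

# One-hot encode the labels: row i has a 1 in column labels[i], 0 elsewhere.
one_hot = np.eye(10, dtype="float32")[labels]

print(scaled.shape, scaled.min(), scaled.max())
print(one_hot)
```

The reshape gives each image the (height, width, channels) layout that a 2D convolutional layer expects, and the one-hot labels line up with a 10-node softmax output.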

Then it's time to build your model! Following the steps above, we can build a convolutional neural network using a range of layer types, like so:

model = keras.models.Sequential()
model.add(keras.layers.Conv2D(32, kernel_size=(3, 3),
                              input_shape=(28, 28, 1)))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
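
It can help to work out the tensor shapes by hand before training. The sketch below does that arithmetic in plain Python, assuming the Keras defaults for these layers (no padding, a convolution stride of 1, and a pooling stride equal to the pool size). The helper name conv_output_size is hypothetical.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length along one axis for an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1

# 28x28 input, 3x3 convolution with 32 filters, then 2x2 max pooling.
after_conv = conv_output_size(28, kernel=3)                    # 26
after_pool = conv_output_size(after_conv, kernel=2, stride=2)  # 13

# Number of values per image once the pooled feature maps are flattened.
flattened_features = after_pool * after_pool * 32              # 5408

# Weights plus biases in the convolutional layer (1 input channel).
conv_params = 3 * 3 * 1 * 32 + 32                              # 320

print(after_conv, after_pool, flattened_features, conv_params)
```

If the pooled output is flattened before the fully connected layers, the first Dense layer therefore receives 5408 inputs per image.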

Finally, compile and train the model as directed, and evaluate the results:

model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=128, epochs=12)

test_loss, test_accuracy = model.evaluate(x_test, y_test)
print('Test accuracy:', test_accuracy)
Test accuracy: 0.8313000202178955

And that's it! You have now built and trained a model in Keras. With a test accuracy of around 83%, the model classifies roughly five out of six unseen digits correctly. If you run this code yourself, your results may vary slightly because the weights are initialized randomly, but not by much.
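
To turn the softmax outputs of a model like this into digit predictions, take the index of the largest probability in each row. The sketch below uses hand-written probabilities as a stand-in for the output of model.predict; the values and labels are hypothetical.

```python
import numpy as np

# Hypothetical softmax outputs for three images (each row sums to 1).
probabilities = np.array([
    [0.01, 0.02, 0.80, 0.03, 0.02, 0.02, 0.03, 0.03, 0.02, 0.02],
    [0.10, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.50, 0.05, 0.05],
    [0.30, 0.40, 0.05, 0.05, 0.05, 0.05, 0.025, 0.025, 0.025, 0.025],
])
true_labels = np.array([2, 7, 0])

# The predicted digit is the class with the highest probability.
predicted = probabilities.argmax(axis=1)

# Accuracy is the fraction of predictions that match the true labels.
accuracy = (predicted == true_labels).mean()
print(predicted, accuracy)
```

Here two of the three predictions match the true labels, which is the same calculation behind the accuracy figure reported by evaluate.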

Applications and Next Steps

Now that you’ve learned why and how to use Keras, you might be wondering what sorts of applications you can use it for. Neural networks are used in a wide range of areas. Some popular applications include the following:

  • Computer Vision: image classification, object detection, semantic segmentation, and other computer vision tasks.
  • Natural Language Processing: language translation, text classification, sentiment analysis, and more.
  • Recommender Systems: recommendation systems for products, movies, music, etc.
  • Time Series Prediction: for predicting stock prices, sales, weather patterns, and other time series forecasting.

These are just a few examples of the many applications of neural networks. The ability of neural networks to learn complex relationships in data and make predictions based on that learning makes them a versatile tool for a wide range of problems.

If you’d like to go further with your studies of Keras and TensorFlow, and get some hands-on practice with these tools, you'll want to check out some upcoming Dataquest courses!

  • Introduction to Deep Learning with TensorFlow
  • Convolutional Neural Networks
  • Sequence Models
  • Natural Language Processing

About the author: Eleanor Thomas