While saving checkpoints or just the model weights is useful during development and training, deploying a model often requires a more comprehensive and standardized format. TensorFlow provides the SavedModel format precisely for this purpose. It's the recommended way to save a complete TensorFlow program, including the model architecture, trained weights, and the computation graph itself, in a language-neutral, recoverable format.
Think of SavedModel as a self-contained package for your trained model. It goes beyond simply storing the weights; it captures the actual TensorFlow graph operations needed to perform inference. This makes it ideal for environments where you might not have the original Python code used to define the model, such as TensorFlow Serving, TensorFlow Lite, TensorFlow.js, or runtimes in other languages.
Saving a Keras model in the SavedModel format is straightforward. If you use the `model.save()` method and provide a directory path without a `.h5` or `.weights.h5` extension, TensorFlow defaults to the SavedModel format.
```python
import tensorflow as tf

# Assume 'model' is your trained Keras model
# model = tf.keras.Sequential([...])
# model.compile(...)
# model.fit(...)

# Save the model in SavedModel format
model.save("my_first_savedmodel")
```
This command creates a directory named `my_first_savedmodel` (or whatever path you specify). Unlike saving weights only, which creates one or more files, often with `.weights.h5` or `.keras` extensions, saving in the SavedModel format creates a specific directory structure.

Let's examine the contents of the directory created by `model.save()`:
- `saved_model.pb`: This is the core of the SavedModel. It's a protocol buffer file containing the serialized `MetaGraphDef`s. Each `MetaGraphDef` includes the graph structure (the `GraphDef`), variable information, asset details, and importantly, the model's signatures (we'll discuss these next).
- `variables/`: This subdirectory holds the trained values of your model's variables (weights and biases). The data is often sharded into multiple files (`variables.data-xxxxx-of-yyyyy` and `variables.index`) for efficient loading.
- `assets/`: An optional directory. If your model relies on external files (like vocabulary files for text processing or lookup tables), they are copied here. This ensures the SavedModel is self-contained.
- `fingerprint.pb`: Contains metadata identifying the SavedModel format version and producers.
- `keras_metadata.pb`: (Optional, but typically present when saving from Keras) This file stores Keras-specific information about the model architecture, loss function, optimizer state, and metrics. This allows `tf.keras.models.load_model` to perfectly reconstruct the original Keras model object, enabling further training or modification using the Keras API.

```
my_first_savedmodel/
├── assets/
├── variables/
│   ├── variables.data-00000-of-00001
│   └── variables.index
├── fingerprint.pb
├── keras_metadata.pb    # Keras-specific info
└── saved_model.pb       # The computation graph and signatures
```
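You can confirm the core layout yourself by saving a model and listing the resulting directory. The sketch below uses a plain `tf.Module` with an illustrative path, so `keras_metadata.pb` will be absent; that file only appears when saving through the Keras API:

```python
import os
import tensorflow as tf

# A minimal illustrative module; the class name and save path are hypothetical
class Identity(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x

tf.saved_model.save(Identity(), "layout_demo")

# The core SavedModel layout: saved_model.pb plus the variables/ and assets/
# subdirectories (and, on newer TF versions, fingerprint.pb)
print(sorted(os.listdir("layout_demo")))
```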
A significant feature of SavedModel is the concept of signatures. A signature defines a specific function or computation exported by the model. It specifies the expected input tensors and the resulting output tensors for a particular task, typically inference. Think of them as defined entry points into your model's computation graph.
When you save a Keras model, it usually automatically exports a default signature named `serving_default`. This signature typically corresponds to the model's forward pass (`call` method or `predict` behavior), taking the model's input and producing its output.
For deployment systems like TensorFlow Serving, these signatures are essential. They tell the serving system exactly how to interact with the loaded model to get predictions. You can also define custom signatures for different functionalities by using `tf.function` and specifying input signatures, for example when saving models built outside the standard Keras `Model` class or when exporting specific preprocessing steps alongside the model.
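As a concrete illustration, here is a minimal sketch of exporting a custom signature from a model built outside the Keras `Model` class. The module, method names, and save path are all hypothetical examples:

```python
import tensorflow as tf

# An illustrative non-Keras model: a tf.Module with one trainable variable
class Scaler(tf.Module):
    def __init__(self):
        super().__init__()
        self.factor = tf.Variable(2.0)

    # The input_signature fixes the expected input shape and dtype
    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def scale(self, x):
        # Returning a dict gives the output a stable name in the signature
        return {"scaled": self.factor * x}

scaler = Scaler()

# Export the tf.function explicitly as the serving signature
tf.saved_model.save(
    scaler,
    "custom_signature_model",
    signatures={"serving_default": scaler.scale},
)
```

The `signatures` dictionary can hold several named entry points, so you could, for instance, export a preprocessing function and an inference function side by side.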
You can load a model saved in this format using `tf.keras.models.load_model()`:
```python
# Load the model back from the directory
loaded_model = tf.keras.models.load_model("my_first_savedmodel")

# Verify it's the same type of object
print(type(loaded_model))
# <class 'keras.src.engine.sequential.Sequential'> (or Functional, etc.)

# You can now use it for predictions, evaluation, or even continue training
# predictions = loaded_model.predict(new_data)
```
Because the `keras_metadata.pb` file was included, `load_model` successfully reconstructs the original Keras `Sequential` (or `Functional`) model object, complete with its architecture, weights, and even the optimizer state if it was saved.
Alternatively, you can use the lower-level `tf.saved_model.load()` function:
```python
loaded_generic = tf.saved_model.load("my_first_savedmodel")

# This returns a different type of object
print(type(loaded_generic))
# <class 'tensorflow.python.saved_model.load.Loader._recreate_base_user_object.<locals>._UserObject'>

# Access the default serving signature
inference_func = loaded_generic.signatures["serving_default"]

# Use the signature for inference (requires input in Tensor format)
# output_tensor = inference_func(tf.constant(input_data_as_numpy))['output_layer_name']
```
Using `tf.saved_model.load()` gives you a more generic TensorFlow object. This is useful if you don't need the full Keras model structure, perhaps for integration into non-Keras TensorFlow code or for deployment environments where Keras isn't available. You interact with it primarily through its defined signatures.
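When working through signatures like this, it helps to inspect what a signature expects and returns. The sketch below builds and saves a tiny illustrative module first so it is self-contained; with a real model you would simply load your own SavedModel directory instead:

```python
import tensorflow as tf

# A tiny stand-in model; the class name and path are hypothetical
class Doubler(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32, name="x")])
    def __call__(self, x):
        return {"doubled": 2.0 * x}

doubler = Doubler()
tf.saved_model.save(
    doubler, "inspect_demo", signatures={"serving_default": doubler.__call__}
)

loaded = tf.saved_model.load("inspect_demo")
fn = loaded.signatures["serving_default"]

# The TensorSpecs the signature accepts...
print(fn.structured_input_signature)
# ...and the names and specs of the tensors it returns
print(fn.structured_outputs)
```

This is the same information deployment tools rely on when deciding how to feed inputs to, and read outputs from, a served model.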
In summary, the SavedModel format is TensorFlow's standard for serializing models for deployment and sharing. It packages the graph, weights, and assets into a self-contained, language-neutral format, making it compatible with TensorFlow Serving, TensorFlow Lite, and TensorFlow.js. While saving weights or using checkpoints is convenient during development, SavedModel is the robust choice for putting your models into practice.
© 2025 ApX Machine Learning