After training a model, the next significant step is preparing it for use outside the development environment. Whether deploying to a powerful server cluster or a resource-constrained mobile device, you need a standardized, self-contained way to package your model. TensorFlow provides the SavedModel
format precisely for this purpose. It's more than just saving model weights; it captures the complete TensorFlow program, including the computation graph, variable values, and any necessary assets or metadata, making models portable and servable.
Think of the SavedModel
as a serialized representation of your entire TensorFlow model, ready for deployment. It's the recommended format for production environments and serves as the intermediate format for converting models to TensorFlow Lite or TensorFlow.js.
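Because the SavedModel directory is the conversion input for those tools, a single export can feed several deployment targets. As a brief illustration (a minimal sketch; the directory name is a placeholder for any SavedModel created later in this section), the TensorFlow Lite converter consumes a SavedModel directly:

import tensorflow as tf

# Convert a SavedModel directory into a TensorFlow Lite flatbuffer.
# "my_keras_model" is a placeholder for a directory produced by model.save(...).
converter = tf.lite.TFLiteConverter.from_saved_model("my_keras_model")
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)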
When you save a model in this format, TensorFlow creates a directory containing:
- saved_model.pb or saved_model.pbtxt: This file stores the structure of the computation graph(s) defined by tf.functions and metadata like signatures. It uses the Protocol Buffer format (binary .pb or text .pbtxt).
- variables/ directory: This subdirectory contains the trained values (weights, biases, etc.) of your model's variables, often stored in checkpoint files.
- assets/ directory: An optional subdirectory holding external files needed by the graph, such as vocabulary files for text processing or class mappings.
- assets.extra/ directory: An optional subdirectory where libraries or users can add their own assets, co-located with the model but not directly read by the TensorFlow graph.
- fingerprint.pb: A file containing a unique fingerprint identifying the SavedModel.

A typical directory structure for a TensorFlow SavedModel.
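For instance, saving a model to a directory named my_model typically produces a layout like the following (a sketch: checkpoint shard file names vary, and assets/ or assets.extra/ only hold content when the model actually uses them):

my_model/
├── saved_model.pb
├── fingerprint.pb
├── assets/
└── variables/
    ├── variables.data-00000-of-00001
    └── variables.index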
The simplest way to create a SavedModel
is using the save
method of a tf.keras.Model
instance. Keras handles the details of serialization for standard models.
import tensorflow as tf
# Assume 'model' is a compiled tf.keras.Model instance
# model = tf.keras.Sequential([...])
# model.compile(...)
# model.fit(...)
# Save the model in SavedModel format
model.save("my_keras_model")
# You can also specify signatures explicitly (highly recommended for serving)
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 784], dtype=tf.float32)])
def serving_default(input_tensor):
    return {'output': model(input_tensor)}
model.save("my_keras_model_with_signature", signatures={'serving_default': serving_default})
Calling model.save("directory_path")
creates a directory containing the SavedModel
. By default, Keras attempts to save the model's forward pass (call
method) and potentially its training/evaluation configurations. However, explicitly defining and saving signatures is important for deployment, as we'll see shortly.
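Because Keras stores its own metadata alongside the graph, a model exported this way can usually be restored as a full Keras object for further training or evaluation, not just for inference (a minimal sketch, assuming the "my_keras_model" directory created above):

# Restore the complete Keras model (architecture, weights, and, if it was
# compiled, the training configuration) from the SavedModel directory.
restored_model = tf.keras.models.load_model("my_keras_model")
restored_model.summary()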
tf.saved_model.save
For more fine-grained control, especially when working with custom tf.Module
objects or needing to define specific functions to expose, you can use the lower-level tf.saved_model.save
API. This is what model.save
uses internally for Keras models.
import tensorflow as tf
# Example using a custom tf.Module
class MyCustomModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.my_variable = tf.Variable(5.0, name="my_var")
        self.untracked_list = []  # Plain Python list: won't be saved

    @tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
    def __call__(self, x):
        # This function will be traced and saved as a signature
        print("Tracing MyCustomModule.__call__")
        return x * self.my_variable

    @tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
    def add_amount(self, amount):
        # Another function to expose
        print("Tracing MyCustomModule.add_amount")
        return self.my_variable + amount

module_to_save = MyCustomModule()

# Save the module, exposing its methods as signatures
tf.saved_model.save(module_to_save, "my_module_savedmodel",
                    signatures={
                        'serving_default': module_to_save.__call__,
                        'add': module_to_save.add_amount
                    })

print("SavedModel created at: my_module_savedmodel")
print(f"Available signatures: {list(tf.saved_model.load('my_module_savedmodel').signatures.keys())}")
Here, we explicitly pass the tf.Module
instance and a dictionary defining the signatures we want to make available in the saved model. Only attributes that are tf.Variable
, tf.Module
, or trackable data structures (like lists/dicts containing trackable objects) are saved. Python primitives or non-trackable collections (like self.untracked_list
above) are lost.
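A quick way to check what will be captured is to inspect the module's tracked variables before saving. The sketch below reuses the MyCustomModule class defined above; the extra_vars attribute is purely illustrative:

module = MyCustomModule()

# tf.Module tracks tf.Variable and tf.Module attributes, including ones nested
# inside lists or dicts assigned to the object, so these reach the SavedModel.
print([v.name for v in module.variables])   # ['my_var:0']

# Variables added inside an attribute list are picked up as well.
module.extra_vars = [tf.Variable(1.0), tf.Variable(2.0)]
print(len(module.variables))                # 3

# Plain Python values, such as the contents of untracked_list, are not
# captured by tf.saved_model.save.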
Signatures are the entry points into your SavedModel
for inference. They define the expected input tensors (shape and dtype) and the corresponding output tensors for specific functions within your model. When you deploy a model using TensorFlow Serving or other platforms, these signatures define the API endpoints you can call.
- Defining a signature: decorate a function with @tf.function and provide an input_signature. The input_signature is a list or tuple of tf.TensorSpec objects, specifying the shape and dtype for each input argument.
- serving_default: By convention, many serving systems look for a signature named serving_default. It's good practice to define this signature for the most common inference task of your model.

When saving a Keras model without explicit signatures, Keras often creates a default signature based on the model's call method, but relying on this implicit behavior can sometimes lead to unexpected results, especially with complex input processing. Explicitly defining signatures using @tf.function and input_signature before calling model.save or when using tf.saved_model.save is the recommended approach for robust deployment.
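A related detail worth knowing: if a signature function returns a bare tensor, the loaded signature exposes it under a generic key such as 'output_0'. Returning a dictionary instead gives the outputs stable, meaningful names. A minimal sketch, reusing the Keras model from earlier (the directory name and output keys are illustrative):

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 784], dtype=tf.float32)])
def predict(inputs):
    scores = model(inputs)
    # Returning a dict makes 'scores' and 'top_class' the signature output keys
    return {'scores': scores, 'top_class': tf.argmax(scores, axis=-1)}

model.save("my_keras_model_named_outputs",
           signatures={'serving_default': predict})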
Once saved, you can load a SavedModel
back into a TensorFlow program using tf.saved_model.load
. This function restores the tf.Module
or tf.keras.Model
object, its variables, assets, and crucially, the tf.function
-decorated methods saved as signatures.
import tensorflow as tf
# Load the Keras model saved earlier
loaded_keras_model = tf.saved_model.load("my_keras_model_with_signature")
# Access the signatures dictionary
print(f"Available signatures: {list(loaded_keras_model.signatures.keys())}")
# Get the specific function object for the 'serving_default' signature
inference_func = loaded_keras_model.signatures['serving_default']
# Prepare some dummy input data matching the input_signature
dummy_input = tf.random.uniform([2, 784], dtype=tf.float32) # Batch of 2, 784 features
# Call the function
output_dict = inference_func(dummy_input)
print(f"Output tensor shape: {output_dict['output'].shape}")
# Load the custom module saved earlier
loaded_module = tf.saved_model.load("my_module_savedmodel")
# Access its signatures
add_func = loaded_module.signatures['add']
default_func = loaded_module.signatures['serving_default']
# Call the functions. Because the original methods return bare tensors (not
# dictionaries), the loaded signatures expose them under the default key 'output_0'.
result1 = default_func(tf.constant(10.0))
print(f"Default func output: {result1['output_0'].numpy()}")  # 10.0 * 5.0 = 50.0
result2 = add_func(amount=tf.constant(3.0))
print(f"Add func output: {result2['output_0'].numpy()}")      # 5.0 + 3.0 = 8.0
The loaded object is not exactly the original Python object. It's a specialized internal user object (_UserObject
) that holds the restored state and functions. You interact with it primarily through its signatures
attribute, which is a dictionary mapping signature keys (like 'serving_default'
) to the corresponding concrete TensorFlow functions. Calling these functions executes the restored computation graph. Notice that Python code within the original @tf.function
(like print
statements) only runs during the initial tracing when the function is first called or saved, not when calling the loaded signature.
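For tf.Module-based SavedModels like the one above, the restored object also exposes the @tf.function-decorated methods as callable attributes, so you can invoke them directly and get back the original return values (tensors) rather than a signature-style dictionary. A short sketch reusing loaded_module from the previous example:

# Direct calls to the restored functions, bypassing the signatures dict
direct_result = loaded_module(tf.constant(2.0))             # invokes __call__
print(direct_result.numpy())                                 # 2.0 * 5.0 = 10.0
print(loaded_module.add_amount(tf.constant(4.0)).numpy())    # 5.0 + 4.0 = 9.0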
Before deploying, it's often useful to inspect the contents of a SavedModel
, particularly its available signatures and their input/output specifications. TensorFlow provides the saved_model_cli
command-line tool for this.
# Show all information about the SavedModel
saved_model_cli show --dir my_keras_model_with_signature --all
# Example Output Snippet:
# MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
#
# signature_def['__saved_model_init_op']:
#   The Bypassed SessionInit op.
#
# signature_def['serving_default']:
#   The given SavedModel SignatureDef contains the following input(s):
#     inputs['input_tensor'] tensor_info:
#         dtype: DT_FLOAT
#         shape: (-1, 784)
#         name: serving_default_input_tensor:0
#   The given SavedModel SignatureDef contains the following output(s):
#     outputs['output'] tensor_info:
#         dtype: DT_FLOAT
#         shape: (-1, 10)  # Assuming a 10-class output
#         name: StatefulPartitionedCall:0
#   Method name is: tensorflow/serving/predict
# Show only signatures for a specific tag-set (usually 'serve' for TF Serving)
saved_model_cli show --dir my_module_savedmodel --tag_set serve
# Show specific signature details
saved_model_cli show --dir my_module_savedmodel --tag_set serve --signature_def add
The saved_model_cli show
command reveals the structure, available tag-sets (groups of graphs, typically just serve
for inference), and the detailed input/output tensor information for each signature within a specified tag-set. This allows you to verify that the model was saved correctly and understand how to structure inference requests.
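The CLI can also execute a signature directly, which makes for a quick smoke test before standing up a serving stack. A sketch (the input expression must match the signature's input name, shape, and dtype reported by the show command):

# Run the 'serving_default' signature with synthetic input
saved_model_cli run --dir my_keras_model_with_signature \
    --tag_set serve --signature_def serving_default \
    --input_exprs 'input_tensor=np.zeros((1, 784))'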
If your model uses custom layers, activation functions, loss functions, or other custom Python objects defined by subclassing Keras or TensorFlow base classes, saving and loading require special attention.
- Keras round-trips: when you save with model.save and reload with Keras, custom objects are usually handled if they are registered via tf.keras.utils.register_keras_serializable or passed via the custom_objects argument during loading (tf.keras.models.load_model(..., custom_objects=...)). However, the SavedModel format aims for portability, meaning the loading environment (like TF Serving) might not have access to your custom Python code.
- Prefer graph-captured logic: implement custom behavior inside @tf.functions or custom layer/model methods. This ensures the logic is captured within the TensorFlow graph itself inside the SavedModel. If you must rely on external Python code, you will need to ensure that code is available and correctly registered in the environment where the SavedModel is loaded. For complex dependencies, containerization (e.g., using Docker) becomes essential for deployment.
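As a sketch of the Keras registration path mentioned above (the package name and layer are illustrative, not part of the earlier examples), registering a custom layer lets tf.keras.models.load_model reconstruct it without a custom_objects argument, provided the defining module is importable in the loading environment:

import tensorflow as tf

# Hypothetical custom layer registered under "MyProject>ScaleLayer" in
# Keras's serialization registry.
@tf.keras.utils.register_keras_serializable(package="MyProject")
class ScaleLayer(tf.keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Lets Keras re-create the layer with the same constructor arguments
        config = super().get_config()
        config.update({"factor": self.factor})
        return config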
Saving your model correctly using the SavedModel format, especially with clearly defined signatures, is the foundational step for reliable deployment. It packages your graph, weights, and necessary assets, creating a self-contained artifact ready to be served by TensorFlow Serving, converted for edge devices with TensorFlow Lite, or used in other production pipelines.