After training a model, the next significant step is preparing it for use outside the development environment. Whether deploying to a powerful server cluster or a resource-constrained mobile device, you need a standardized, self-contained way to package your model. TensorFlow provides the SavedModel
format precisely for this purpose. It's more than just saving model weights; it captures the complete TensorFlow program, including the computation graph, variable values, and any necessary assets or metadata, making models portable and servable.
Think of the SavedModel
as a serialized representation of your entire TensorFlow model, ready for deployment. It's the recommended format for production environments and serves as the intermediate format for converting models to TensorFlow Lite or TensorFlow.js.
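Because the SavedModel directory is the conversion input for those tools, a single export can feed several deployment targets. As a brief illustration (a minimal sketch; the directory name is a placeholder for any SavedModel created later in this section), the TensorFlow Lite converter consumes a SavedModel directly:

import tensorflow as tf

# Convert a SavedModel directory into a TensorFlow Lite flatbuffer.
# "my_keras_model" is a placeholder for a directory produced by model.save(...).
converter = tf.lite.TFLiteConverter.from_saved_model("my_keras_model")
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)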
When you save a model in this format, TensorFlow creates a directory containing:
- saved_model.pb or saved_model.pbtxt: This file stores the structure of the computation graph(s) defined by tf.functions and metadata like signatures. It uses the Protocol Buffer format (binary .pb or text .pbtxt).
- variables/ directory: This subdirectory contains the trained values (weights, biases, etc.) of your model's variables, often stored in checkpoint files.
- assets/ directory: An optional subdirectory holding external files needed by the graph, such as vocabulary files for text processing or class mappings.
- assets.extra/ directory: An optional subdirectory where libraries or users can add their own assets, co-located with the model but not directly read by the TensorFlow graph.
- fingerprint.pb: A file containing a unique fingerprint identifying the SavedModel.

A typical directory structure for a TensorFlow SavedModel.
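For instance, saving a model to a directory named my_model typically produces a layout like the following (a sketch: checkpoint shard file names vary, and assets/ or assets.extra/ only hold content when the model actually uses them):

my_model/
├── saved_model.pb
├── fingerprint.pb
├── assets/
└── variables/
    ├── variables.data-00000-of-00001
    └── variables.index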
The simplest way to create a SavedModel
is using the save
method of a tf.keras.Model
instance. Keras handles the details of serialization for standard models.
import tensorflow as tf
# Assume 'model' is a compiled tf.keras.Model instance
# model = tf.keras.Sequential([...])
# model.compile(...)
# model.fit(...)
# Save the model in SavedModel format
model.save("my_keras_model")
# You can also specify signatures explicitly (highly recommended for serving)
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 784], dtype=tf.float32)])
def serving_default(input_tensor):
    return {'output': model(input_tensor)}
model.save("my_keras_model_with_signature", signatures={'serving_default': serving_default})
Calling model.save("directory_path")
creates a directory containing the SavedModel
. By default, Keras attempts to save the model's forward pass (call
method) and potentially its training/evaluation configurations. However, explicitly defining and saving signatures is important for deployment, as we'll see shortly.
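Because Keras stores its own metadata alongside the graph, a model exported this way can usually be restored as a full Keras object for further training or evaluation, not just for inference (a minimal sketch, assuming the "my_keras_model" directory created above):

# Restore the complete Keras model (architecture, weights, and, if it was
# compiled, the training configuration) from the SavedModel directory.
restored_model = tf.keras.models.load_model("my_keras_model")
restored_model.summary()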
tf.saved_model.save
For more fine-grained control, especially when working with custom tf.Module
objects or needing to define specific functions to expose, you can use the lower-level tf.saved_model.save
API. This is what model.save
uses internally for Keras models.
import tensorflow as tf
# Example using a custom tf.Module
class MyCustomModule(tf.Module):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.my_variable = tf.Variable(5.0, name="my_var")
        self.untracked_list = []  # Plain Python list: won't be saved

    @tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
    def __call__(self, x):
        # This function will be traced and saved as a signature
        print("Tracing MyCustomModule.__call__")
        return x * self.my_variable

    @tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
    def add_amount(self, amount):
        # Another function to expose
        print("Tracing MyCustomModule.add_amount")
        return self.my_variable + amount

module_to_save = MyCustomModule()

# Save the module, exposing its methods as signatures
tf.saved_model.save(module_to_save, "my_module_savedmodel",
                    signatures={
                        'serving_default': module_to_save.__call__,
                        'add': module_to_save.add_amount
                    })

print("SavedModel created at: my_module_savedmodel")
print(f"Available signatures: {list(tf.saved_model.load('my_module_savedmodel').signatures.keys())}")
Here, we explicitly pass the tf.Module
instance and a dictionary defining the signatures we want to make available in the saved model. Only attributes that are tf.Variable
, tf.Module
, or trackable data structures (like lists/dicts containing trackable objects) are saved. Python primitives or non-trackable collections (like self.untracked_list
above) are lost.
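A quick way to check what will be captured is to inspect the module's tracked variables before saving. The sketch below reuses the MyCustomModule class defined above; the extra_vars attribute is purely illustrative:

module = MyCustomModule()

# tf.Module tracks tf.Variable and tf.Module attributes, including ones nested
# inside lists or dicts assigned to the object, so these reach the SavedModel.
print([v.name for v in module.variables])   # ['my_var:0']

# Variables added inside an attribute list are picked up as well.
module.extra_vars = [tf.Variable(1.0), tf.Variable(2.0)]
print(len(module.variables))                # 3

# Plain Python values, such as the contents of untracked_list, are not
# captured by tf.saved_model.save.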
Signatures are the entry points into your SavedModel
for inference. They define the expected input tensors (shape and dtype) and the corresponding output tensors for specific functions within your model. When you deploy a model using TensorFlow Serving or other platforms, these signatures define the API endpoints you can call.
- Defining a signature: decorate a function with @tf.function and provide an input_signature. The input_signature is a list or tuple of tf.TensorSpec objects, specifying the shape and dtype for each input argument.
- serving_default: By convention, many serving systems look for a signature named serving_default. It's good practice to define this signature for the most common inference task of your model.

When saving a Keras model without explicit signatures, Keras often creates a default signature based on the model's call method, but relying on this implicit behavior can sometimes lead to unexpected results, especially with complex input processing. Explicitly defining signatures using @tf.function and input_signature before calling model.save or when using tf.saved_model.save is the recommended approach for robust deployment.
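A related detail worth knowing: if a signature function returns a bare tensor, the loaded signature exposes it under a generic key such as 'output_0'. Returning a dictionary instead gives the outputs stable, meaningful names. A minimal sketch, reusing the Keras model from earlier (the directory name and output keys are illustrative):

@tf.function(input_signature=[tf.TensorSpec(shape=[None, 784], dtype=tf.float32)])
def predict(inputs):
    scores = model(inputs)
    # Returning a dict makes 'scores' and 'top_class' the signature output keys
    return {'scores': scores, 'top_class': tf.argmax(scores, axis=-1)}

model.save("my_keras_model_named_outputs",
           signatures={'serving_default': predict})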
Once saved, you can load a SavedModel
back into a TensorFlow program using tf.saved_model.load
. This function restores the tf.Module
or tf.keras.Model
object, its variables, assets, and crucially, the tf.function
-decorated methods saved as signatures.
import tensorflow as tf
# Load the Keras model saved earlier
loaded_keras_model = tf.saved_model.load("my_keras_model_with_signature")
# Access the signatures dictionary
print(f"Available signatures: {list(loaded_keras_model.signatures.keys())}")
# Get the specific function object for the 'serving_default' signature
inference_func = loaded_keras_model.signatures['serving_default']
# Prepare some dummy input data matching the input_signature
dummy_input = tf.random.uniform([2, 784], dtype=tf.float32) # Batch of 2, 784 features
# Call the function
output_dict = inference_func(dummy_input)
print(f"Output tensor shape: {output_dict['output'].shape}")
# Load the custom module saved earlier
loaded_module = tf.saved_model.load("my_module_savedmodel")
# Access its signatures
add_func = loaded_module.signatures['add']
default_func = loaded_module.signatures['serving_default']
# Call the functions. Because the original methods return bare tensors (not
# dictionaries), the loaded signatures expose them under the default key 'output_0'.
result1 = default_func(tf.constant(10.0))
print(f"Default func output: {result1['output_0'].numpy()}")  # 10.0 * 5.0 = 50.0
result2 = add_func(amount=tf.constant(3.0))
print(f"Add func output: {result2['output_0'].numpy()}")      # 5.0 + 3.0 = 8.0
The loaded object is not exactly the original Python object. It's a specialized internal user object (_UserObject
) that holds the restored state and functions. You interact with it primarily through its signatures
attribute, which is a dictionary mapping signature keys (like 'serving_default'
) to the corresponding concrete TensorFlow functions. Calling these functions executes the restored computation graph. Notice that Python code within the original @tf.function
(like print
statements) only runs during the initial tracing when the function is first called or saved, not when calling the loaded signature.
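For tf.Module-based SavedModels like the one above, the restored object also exposes the @tf.function-decorated methods as callable attributes, so you can invoke them directly and get back the original return values (tensors) rather than a signature-style dictionary. A short sketch reusing loaded_module from the previous example:

# Direct calls to the restored functions, bypassing the signatures dict
direct_result = loaded_module(tf.constant(2.0))             # invokes __call__
print(direct_result.numpy())                                 # 2.0 * 5.0 = 10.0
print(loaded_module.add_amount(tf.constant(4.0)).numpy())    # 5.0 + 4.0 = 9.0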
Before deploying, it's often useful to inspect the contents of a SavedModel
, particularly its available signatures and their input/output specifications. TensorFlow provides the saved_model_cli
command-line tool for this.
# Show all information about the SavedModel
saved_model_cli show --dir my_keras_model_with_signature --all
# Example Output Snippet:
# MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
#
# signature_def['__saved_model_init_op']:
#   The Bypassed SessionInit op.
#
# signature_def['serving_default']:
#   The given SavedModel SignatureDef contains the following input(s):
#     inputs['input_tensor'] tensor_info:
#         dtype: DT_FLOAT
#         shape: (-1, 784)
#         name: serving_default_input_tensor:0
#   The given SavedModel SignatureDef contains the following output(s):
#     outputs['output'] tensor_info:
#         dtype: DT_FLOAT
#         shape: (-1, 10)  # Assuming a 10-class output
#         name: StatefulPartitionedCall:0
#   Method name is: tensorflow/serving/predict
# Show only signatures for a specific tag-set (usually 'serve' for TF Serving)
saved_model_cli show --dir my_module_savedmodel --tag_set serve
# Show specific signature details
saved_model_cli show --dir my_module_savedmodel --tag_set serve --signature_def add
The saved_model_cli show
command reveals the structure, available tag-sets (groups of graphs, typically just serve
for inference), and the detailed input/output tensor information for each signature within a specified tag-set. This allows you to verify that the model was saved correctly and understand how to structure inference requests.
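The CLI can also execute a signature directly, which makes for a quick smoke test before standing up a serving stack. A sketch (the input expression must match the signature's input name, shape, and dtype reported by the show command):

# Run the 'serving_default' signature with synthetic input
saved_model_cli run --dir my_keras_model_with_signature \
    --tag_set serve --signature_def serving_default \
    --input_exprs 'input_tensor=np.zeros((1, 784))'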
If your model uses custom layers, activation functions, loss functions, or other custom Python objects defined by subclassing Keras or TensorFlow base classes, saving and loading require special attention.
- Keras round-trips: when you save with model.save and reload with Keras, custom objects are usually handled if they are registered via tf.keras.utils.register_keras_serializable or passed via the custom_objects argument during loading (tf.keras.models.load_model(..., custom_objects=...)). However, the SavedModel format aims for portability, meaning the loading environment (like TF Serving) might not have access to your custom Python code.
- Prefer graph-captured logic: implement custom behavior inside @tf.functions or custom layer/model methods. This ensures the logic is captured within the TensorFlow graph itself inside the SavedModel. If you must rely on external Python code, you will need to ensure that code is available and correctly registered in the environment where the SavedModel is loaded. For complex dependencies, containerization (e.g., using Docker) becomes essential for deployment.
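As a sketch of the Keras registration path mentioned above (the package name and layer are illustrative, not part of the earlier examples), registering a custom layer lets tf.keras.models.load_model reconstruct it without a custom_objects argument, provided the defining module is importable in the loading environment:

import tensorflow as tf

# Hypothetical custom layer registered under "MyProject>ScaleLayer" in
# Keras's serialization registry.
@tf.keras.utils.register_keras_serializable(package="MyProject")
class ScaleLayer(tf.keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Lets Keras re-create the layer with the same constructor arguments
        config = super().get_config()
        config.update({"factor": self.factor})
        return config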
Saving your model correctly using the SavedModel format, especially with clearly defined signatures, is the foundational step for reliable deployment. It packages your graph, weights, and necessary assets, creating a self-contained artifact ready to be served by TensorFlow Serving, converted for edge devices with TensorFlow Lite, or used in other production pipelines.