To make our trained machine learning model usable by other applications or users, we need a way for them to send data to it and receive predictions back. As discussed earlier, doing this typically involves using the Hypertext Transfer Protocol (HTTP), the foundation of data communication for the World Wide Web. Building a service that correctly handles incoming HTTP requests, parses the data, directs the request to the appropriate logic (our model's prediction function), and then formats and sends back an HTTP response involves quite a bit of low-level networking and protocol management.
Imagine having to write code from scratch to listen for network connections, interpret raw HTTP requests (like GET or POST), extract headers and body content, manage different URL paths, and construct valid HTTP responses every time you wanted to build a web-accessible application. This would be repetitive, error-prone, and distract from the main task, which in our case is serving model predictions.
This is where web frameworks come in. A web framework is a software library or collection of tools that provides a standard way to build and deploy web applications, including web services and APIs. It handles much of the underlying complexity and boilerplate code associated with web development, allowing developers to focus on the specific application logic.
Think of a framework as providing a structured skeleton for your web application. It offers pre-built components and conventions for common tasks such as:
/predict
, /status
) to specific functions in your code that should handle requests to those paths. When a request arrives for a particular URL, the framework ensures the correct function is called.200 OK
or 404 Not Found
), headers, and formatting the response body, often converting Python data structures (like dictionaries containing predictions) into JSON.Using a web framework offers significant advantages:
There are many web frameworks available for Python, each with different philosophies and feature sets. Some, like Django, are "full-stack" frameworks offering a wide array of built-in tools for large applications. Others, like Flask, are considered "microframeworks." Flask provides the core essentials for routing and handling requests/responses but keeps things minimal and flexible, allowing developers to choose additional components as needed. Its simplicity makes it an excellent choice for building focused web services and APIs, such as the prediction service we aim to create in this chapter.
In the following sections, we will install Flask and use its features to build a simple web server that loads our saved machine learning model and exposes it through an API endpoint.
© 2025 ApX Machine Learning