By Wei Ming T. on Dec 5, 2024
Hosting machine learning (ML) models on AWS Lambda provides a scalable, serverless solution for real-time inference. By leveraging the Serverless Framework, you can simplify the deployment process, automate infrastructure management, and focus on delivering robust ML services. This guide will show you how to prepare, package, and deploy your model using AWS Lambda and the Serverless Framework.
AWS Lambda offers a range of benefits for deploying ML models:
- Serverless operation: no instances to provision, patch, or scale yourself
- Pay-per-use pricing: you are billed only for the compute time your inferences consume
- Automatic scaling: concurrent requests transparently spin up additional function instances
To follow this guide, you'll need:
- An AWS account with credentials configured locally (for example via the AWS CLI)
- Node.js and npm, which the Serverless Framework runs on
- Python 3.9, matching the Lambda runtime used below
- A trained model loadable with the Hugging Face transformers library
First, install the Serverless Framework globally with npm:
npm install -g serverless
Use the Serverless CLI to scaffold a new project:
serverless create --template aws-python3 --path ml-lambda-service
cd ml-lambda-service
The template generates a basic serverless.yml and a stub handler.py. Next, install the required Python dependencies into the service directory so they are packaged alongside your code:
pip install transformers torch -t ./
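Note that torch and transformers together can easily exceed Lambda's 250 MB unzipped deployment-package limit, so check the size of what pip just produced. One common workaround, sketched here as an option rather than a requirement, is the community serverless-python-requirements plugin, which builds and slims dependencies from a requirements.txt instead of vendoring them by hand. The relevant serverless.yml additions would look roughly like this:

plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    slim: true  # strip tests and .pyc files to reduce package size
    zip: true   # keep dependencies zipped until first import

For models that are still too large, a container-image deployment (up to 10 GB) or loading weights from S3 or EFS at startup is usually the more realistic path.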
Configure the Lambda function and resources in the serverless.yml file:
service: ml-lambda-service

provider:
  name: aws
  runtime: python3.9
  memorySize: 1024 # Adjust based on your model's needs
  timeout: 10 # Extend if your model requires more inference time

functions:
  predict:
    handler: handler.lambda_handler
    events:
      - http:
          path: predict
          method: post
Open handler.py in your service directory (the template created a stub). This script loads the model once at initialization, handles incoming requests, and returns predictions:
import json

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer once, outside the handler, so warm
# invocations reuse them instead of reloading on every request
model = AutoModelForSequenceClassification.from_pretrained("path_to_model")
tokenizer = AutoTokenizer.from_pretrained("path_to_model")
model.eval()

def lambda_handler(event, context):
    try:
        # Parse the JSON request body from API Gateway
        body = json.loads(event['body'])
        input_text = body.get('text', '')

        # Tokenize and run inference without tracking gradients
        inputs = tokenizer(input_text, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        prediction = torch.argmax(outputs.logits, dim=1).item()

        return {
            'statusCode': 200,
            'body': json.dumps({'prediction': prediction})
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({'error': str(e)})
        }
You do not need to build the .zip archive by hand: at deploy time, the Serverless Framework packages the contents of the service directory, including your code, the vendored Python dependencies, and the model files. You can control exactly what is included from serverless.yml, as sketched below.
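As an illustration (assuming Serverless Framework v3 packaging syntax), a package section in serverless.yml can exclude clutter while making sure the model weights ship with the code:

package:
  patterns:
    - '!node_modules/**'  # Node tooling is not needed in the Python bundle
    - '!.git/**'
    - 'path_to_model/**'  # include the model directory alongside the code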
Run the deployment command to upload your Lambda function to AWS:
serverless deploy
After deployment, the Serverless Framework prints an endpoint URL for the predict function (you can retrieve it again later with serverless info). Use this URL to test your API with tools like Postman or cURL:
curl -X POST https://your-api-url/predict \
-H "Content-Type: application/json" \
-d '{"text": "Your input text here"}'
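A successful call returns the JSON body built by the handler, containing the predicted class index, for example:

{"prediction": 1}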
Deploying machine learning models on AWS Lambda with the Serverless Framework yields a serverless, scalable, and cost-effective inference service with minimal operational overhead. By following this guide, you can focus on building intelligent, responsive applications instead of managing infrastructure.