By Wei Ming T. on Dec 9, 2024
Llama 3.3 is a 70B-parameter model tuned for multilingual dialogue that rivals much larger models such as Llama 3.1 405B on many benchmarks. It is compact relative to those models, but it still needs a capable GPU workstation for good performance. This guide covers everything you need to set up and run Llama 3.3 on Ubuntu Linux with Ollama.
| Variant | VRAM | Recommended Hardware | Best Use Case |
|---|---|---|---|
| latest | 43 GB | NVIDIA A5000/A6000 | Production environments requiring latest features |
| 70b | 43 GB | NVIDIA A5000/A6000 | General purpose usage |
| 70b-instruct-fp16 | 141 GB | Multi-GPU with NVLink | Research requiring maximum precision |
| 70b-instruct-q2_K | 26 GB | NVIDIA RTX 3090/4090 | Home users, basic inference |
| 70b-instruct-q3_K_M | 34 GB | NVIDIA A5000/A6000 | Production deployments |
| 70b-instruct-q3_K_S | 31 GB | NVIDIA RTX 4090/A5000 | Balanced performance/quality |
| 70b-instruct-q4_0 | 40 GB | NVIDIA A6000 | Higher quality inference |
| 70b-instruct-q4_1 | 44 GB | NVIDIA A6000 | High-quality inference |
| 70b-instruct-q4_K_M | 43 GB | NVIDIA A6000 | Production quality inference |
| 70b-instruct-q4_K_S | 40 GB | NVIDIA A6000 | Balanced inference speed/quality |
| 70b-instruct-q5_0 | 49 GB | NVIDIA A6000 | Near-FP16 quality |
| 70b-instruct-q5_1 | 53 GB | NVIDIA A100 | High-precision inference |
| 70b-instruct-q5_K_M | 50 GB | NVIDIA A6000/A100 | Production quality inference |
| 70b-instruct-q6_K | 58 GB | NVIDIA A100 | High-precision inference |
| 70b-instruct-q8_0 | 75 GB | Multiple A100s | Maximum quality inference |
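Before choosing a variant, confirm how much VRAM your GPU actually exposes. On NVIDIA systems this is a one-liner (assuming the NVIDIA driver and its nvidia-smi utility are installed):
# Report each GPU's model name and total memory; compare against the VRAM column above
nvidia-smi --query-gpu=name,memory.total --format=csv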
Run the following command to install Ollama:
curl -fsSL https://ollama.com/install.sh | sh
Verify the installation:
ollama --version
For manual installation, download and extract the standalone package:
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
sudo tar -C /usr -xzf ollama-linux-amd64.tgz
Start the server:
ollama serve
Then, in a separate terminal, verify that Ollama is running:
ollama -v
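With a manual install, the convenience script's systemd service is not created for you. Below is a minimal unit file sketch adapted from the Ollama Linux install docs; it assumes the binary sits at /usr/bin/ollama and that a dedicated ollama system user exists (substitute your own user if you prefer). Save it as /etc/systemd/system/ollama.service:
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
# Run the API server; adjust the path if you extracted Ollama elsewhere
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
Then reload systemd and enable the service:
sudo systemctl daemon-reload
sudo systemctl enable --now ollama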
For systems with AMD GPUs, also download and extract the additional ROCm package:
curl -L https://ollama.com/download/ollama-linux-amd64-rocm.tgz -o ollama-linux-amd64-rocm.tgz
sudo tar -C /usr -xzf ollama-linux-amd64-rocm.tgz
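After extracting the ROCm build, you can sanity-check that the ROCm stack sees your GPU. This assumes the ROCm drivers are already installed; rocm-smi ships with them:
# List AMD GPUs visible to ROCm along with utilization and VRAM usage
rocm-smi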
For ARM64 architectures, use this package:
curl -L https://ollama.com/download/ollama-linux-arm64.tgz -o ollama-linux-arm64.tgz
sudo tar -C /usr -xzf ollama-linux-arm64.tgz
Download the Llama 3.3 model using Ollama's pull command:
ollama pull llama3.3
To download a specific variant from the table above, append its tag after a colon:
ollama pull llama3.3:70b-instruct-q3_K_M
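After a pull completes, it is worth checking what is stored locally and how much disk space each variant takes:
# List downloaded models with their sizes and tags
ollama list
# Inspect a model's parameters, quantization, and context length
ollama show llama3.3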
Start using the model interactively:
ollama run llama3.3
Example interaction:
User: What is the capital of Japan?
Assistant: The capital of Japan is Tokyo.
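You can also pass a prompt directly on the command line for one-off, non-interactive use, which is convenient in shell scripts:
# Run a single prompt and print the model's reply to stdout
ollama run llama3.3 "What is the capital of Japan?"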
To run Ollama as a server without the desktop application:
ollama serve
For development builds:
./ollama serve
Then, in a separate shell, run a model:
./ollama run llama3.3
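To confirm the server is up and listening on its default port (11434), hit the version endpoint:
# Returns a small JSON object containing the running Ollama version
curl http://localhost:11434/api/version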
Ollama provides a REST API for running and managing models. The examples below send a text-generation request and a chat request to the local server:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.3",
  "prompt": "Why is the sky blue?"
}'
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.3",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
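Both endpoints stream newline-delimited JSON by default. For scripting it is often easier to request a single, complete response; the request body also accepts an options object for sampling parameters (the values below are only illustrative):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.3",
  "prompt": "Why is the sky blue?",
  "stream": false,
  "options": { "temperature": 0.7, "num_ctx": 4096 }
}'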
For a complete list of API endpoints and options, refer to the Ollama API documentation.
Installation issues: make sure curl is installed before running the install script:
sudo apt update && sudo apt install curl -y
Performance lags: switch to a smaller quantized variant (e.g., 70b-instruct-q3_K_M) or make sure inference is actually running on the GPU.
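A quick way to check where a loaded model is running is ollama ps; the PROCESSOR column should show 100% GPU rather than a CPU/GPU split:
# Show loaded models, their memory footprint, and CPU/GPU placement
ollama ps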
Service logs: when Ollama runs as a systemd service, inspect its logs with:
journalctl -e -u ollama
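If the service misbehaves after a configuration change, restarting it often resolves the issue:
# Restart the systemd-managed Ollama server
sudo systemctl restart ollama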
With these detailed instructions, you can confidently install and customize Llama 3.3 for your hardware and use case. Ollama simplifies deployment, allowing you to focus on leveraging Llama 3.3's powerful capabilities for multilingual dialogue and beyond.