Now that you have successfully downloaded a model using the ollama pull command, you are ready to interact with it directly from your terminal or command prompt. Ollama provides a straightforward way to run a model and start a conversation.
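Before starting a session, you can confirm which models are available locally with the ollama list subcommand, which prints the name, size, and modification time of every model you have pulled (the exact columns may vary slightly between Ollama versions):

ollama list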
To begin an interactive session with your chosen model, you use the ollama run command followed by the model's name. For instance, if you downloaded the llama3:8b model in the previous step, you would type the following command and press Enter:
ollama run llama3:8b
After executing this command, Ollama will load the specified model into your computer's memory (RAM and potentially VRAM if you have a compatible GPU). This loading process might take a few moments, especially the first time you run a particular model or after restarting your computer. You'll know the model is ready when you see a prompt appear, often looking something like this:
>>> Send a message (/? for help):
This prompt signifies that the LLM is waiting for your input. You can now type your questions or instructions directly into the terminal. Let's try asking a simple question:
>>> Send a message (/? for help): What is the primary function of a CPU in a computer?
Press Enter after typing your message. The LLM will process your input and generate a response, streaming the text output back to your terminal. The output might look something like this (the exact wording will vary depending on the model):
The primary function of a Central Processing Unit (CPU) in a computer is to execute instructions from programs. It performs basic arithmetic, logic, controlling, and input/output (I/O) operations specified by the instructions. Essentially, it acts as the "brain" of the computer, carrying out the tasks needed for the system and applications to run.
You can continue the conversation by typing another message at the new prompt. The model generally maintains the context of the current session, allowing you to ask follow-up questions or build upon previous interactions, up to the limits of its context window (which we'll discuss later).
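To see this context retention in action, you could follow up without restating the subject; a question like the one below only makes sense if the model remembers the earlier exchange, and it should interpret "it" as the CPU (the exact reply will vary by model):

>>> Send a message (/? for help): How does it differ from a GPU?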
Inside the interactive ollama run session, you have a few helpful commands available by typing a forward slash (/) followed by the command word. To see the available options, type /? and press Enter:
>>> Send a message (/? for help): /?
Available commands:
  /?, /help    Help for commands
  /bye, /exit  Exit Ollama
  /set         Set session variables
  /show        Show session information
  /save        Save session to file
  /load        Load session from file
  ... and potentially others ...
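The /set command is worth trying early. In recent Ollama versions, for example, it can adjust sampling parameters for the current session, such as the temperature (parameter names and availability can vary between versions; lower values make responses more deterministic, higher values more varied):

>>> Send a message (/? for help): /set parameter temperature 0.3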
The most common commands you might use initially are:

/bye or /exit: Use either of these to stop the current model interaction and exit the Ollama session, returning you to your regular terminal prompt.
/show info: Displays details about the currently loaded model.
/show license: Shows the license information for the model.

When you are finished interacting with the model, simply type /bye or /exit and press Enter.
>>> Send a message (/? for help): /exit
Bye!
$
You'll be returned to your standard command prompt (represented by $, >, or similar, depending on your system). You can also often use the keyboard shortcut Ctrl+D to exit the session.
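Note that exiting the chat does not immediately unload the model: by default, Ollama keeps it in memory for a few minutes so the next request starts quickly. From your regular prompt, the ollama ps subcommand shows which models are currently loaded and whether they are running on the CPU or GPU:

ollama ps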
Sometimes, you might just want to get a quick answer from a model without starting a full interactive chat session. You can do this by providing the prompt directly on the command line after the model name.
For example, to ask the llama3:8b model for a short definition of quantization directly, you could run:
ollama run llama3:8b "Briefly explain model quantization"
Ollama will load the model, process the single prompt ("Briefly explain model quantization"), print the output to the terminal, and then immediately exit back to your command prompt. This is useful for simple, one-off tasks or for integrating LLM responses into scripts.
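As a minimal sketch of the scripting use case, the snippet below captures a one-off answer in a shell variable and also pipes a file's contents to the model as the prompt; quantization.txt and notes.txt are placeholder filenames for this example:

#!/usr/bin/env bash
# Capture a single response in a shell variable for later use.
answer=$(ollama run llama3:8b "Briefly explain model quantization")
echo "$answer" > quantization.txt

# When input is piped in, ollama run treats it as the prompt,
# prints the response, and exits without entering interactive mode.
cat notes.txt | ollama run llama3:8b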
You've now successfully run and interacted with a Large Language Model directly on your computer using Ollama's command-line interface. Feel free to experiment by running different models you've downloaded or trying various types of prompts.