Ollama Models Tutorial: A Complete Introduction
Ollama is a fantastic, easy-to-use tool for running large language models (LLMs) locally on your machine (macOS, Linux, and Windows – with WSL2). It simplifies the often complex process of downloading, configuring, and interacting with these powerful models. This tutorial provides a comprehensive introduction to Ollama, covering everything from installation to advanced usage.
1. What is Ollama?
Ollama acts as a “model runner.” It:
- Handles the “heavy lifting”: Downloads models, manages dependencies, and provides a consistent API for interacting with different LLMs.
- Offers a command-line interface (CLI): This is the primary way to interact with Ollama, making it very accessible and scriptable.
- Provides a REST API: This allows you to integrate Ollama with other applications and services.
- Supports a growing library of models: Ollama makes it easy to access popular models like Llama 2, Mistral, Gemma, and many more.
- Runs locally: Your data stays on your machine, ensuring privacy and eliminating the need for a constant internet connection (after the initial model download).
- Allows Model Customization: Ollama supports creating and using custom models via Modelfiles (more on this later).
2. Installation
Ollama’s installation is incredibly straightforward:
macOS:
- Download: Go to https://ollama.ai/ and download the macOS installer.
- Run the installer: Follow the on-screen instructions. This will install the Ollama application and command-line tools.
- Verify Installation: Open a terminal and type `ollama --version`. You should see the Ollama version printed.
Linux:
- Run the installation script: Open a terminal and run the following command:

```bash
curl -fsSL https://ollama.ai/install.sh | sh -
```

- Verify Installation: Run `ollama --version` in your terminal.
Windows (WSL2 – Windows Subsystem for Linux):
- Enable WSL2: If you haven’t already, enable WSL2. Microsoft provides detailed instructions: https://learn.microsoft.com/en-us/windows/wsl/install
- Install a Linux distribution: Install a Linux distribution from the Microsoft Store (e.g., Ubuntu).
- Open the Linux terminal: Launch your chosen Linux distribution.
- Run the Linux installation script (within the Linux terminal):

```bash
curl -fsSL https://ollama.ai/install.sh | sh -
```

- Verify Installation: Run `ollama --version` in your Linux terminal.
- (Optional – For easier access from Windows): You can run Ollama from the Windows command prompt or PowerShell without opening a WSL terminal first by prefixing commands with `wsl`, for example `wsl ollama run mistral:7b`.
3. Running Your First Model
Let’s run the popular `mistral:7b` model (a powerful 7-billion-parameter model):

- Download the model: In your terminal, run:

```bash
ollama run mistral:7b
```

Ollama will automatically download the model if it’s not already present. This may take some time, depending on your internet speed and the model’s size. The `run` command both downloads (if necessary) and starts an interactive session with the model.
- Interact with the model: Once the model is loaded, you’ll see a prompt (`>>>`). You can now type your questions or instructions. For example:

```
What is the capital of France?
```

The model will respond with its answer.
- Exit the interactive session: Type `/bye` and press Enter.
4. Key Ollama Commands
Here’s a breakdown of the essential Ollama commands:
- `ollama run <model_name>`: Downloads (if necessary) and runs the specified model, starting an interactive session. The model name typically follows the format `<model_name>:<tag>`. For example: `ollama run llama2:7b-chat`.
- `ollama pull <model_name>`: Downloads the model without starting an interactive session. This is useful if you want to pre-download a model.
- `ollama list`: Lists all locally downloaded models.
- `ollama show <model_name> --modelfile`: Displays detailed information about a model, including the Modelfile (the model’s configuration).
- `ollama show <model_name> --info`: Displays basic information about a model.
- `ollama show <model_name> --license`: Displays the license of a model.
- `ollama cp <source_model> <destination_model>`: Copies a model to a new name.
- `ollama rm <model_name>`: Removes (deletes) a locally downloaded model.
- `ollama create <model_name> -f <Modelfile_path>`: Creates a custom model based on a Modelfile (see Section 6).
- `ollama serve`: Starts the Ollama server, making the REST API available (by default on `http://localhost:11434`).
- `ollama --help`: Displays help information about Ollama commands. You can also use `--help` with specific commands (e.g., `ollama run --help`).
5. Using the REST API
Ollama provides a REST API for programmatic interaction. This is invaluable for integrating LLMs into your applications.
- Start the server:

```bash
ollama serve
```

- Send a request (example using `curl`):

```bash
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "mistral:7b",
  "prompt": "What is the meaning of life?",
  "stream": false
}'
```

- `model`: The name of the model to use.
- `prompt`: The input text for the model.
- `stream`: If `true`, the response is streamed back token by token. If `false`, the entire response is returned at once.
The API provides various endpoints, including `/api/generate` (for generating text), `/api/chat` (for conversational interactions), `/api/embeddings` (for generating vector embeddings), and more. See the Ollama documentation for the full API reference.
- Python Example (using the `requests` library):

```python
import requests
import json

url = "http://localhost:11434/api/generate"
data = {
    "model": "mistral:7b",
    "prompt": "Explain the theory of relativity in simple terms.",
    "stream": False
}
headers = {"Content-Type": "application/json"}

response = requests.post(url, data=json.dumps(data), headers=headers)
if response.status_code == 200:
    print(json.loads(response.text)["response"])
else:
    print(f"Error: {response.status_code} - {response.text}")
```

This Python snippet sends a request to the Ollama API to have the Mistral model explain the theory of relativity. The `requests` library is used to make the HTTP POST request.
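For multi-turn conversations, the `/api/chat` endpoint accepts a list of messages instead of a single prompt. The sketch below assumes the request and response shapes described in the Ollama API reference (a `messages` array in the request, a `message` object in the reply); check the documentation for your version before relying on it.

```python
import requests

url = "http://localhost:11434/api/chat"
payload = {
    "model": "mistral:7b",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the theory of relativity in one sentence."}
    ],
    "stream": False
}

# With streaming disabled, the assistant's reply arrives as a single JSON object.
response = requests.post(url, json=payload)
response.raise_for_status()
print(response.json()["message"]["content"])
```

To continue the conversation, append the assistant's reply and your next user message to the `messages` list and send the request again.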
6. Modelfiles: Customizing and Creating Models
Modelfiles are the heart of Ollama’s customization capabilities. They allow you to:
- Fine-tune model parameters: Adjust temperature, top_p, top_k, and other settings to influence the model’s output.
- Set system prompts: Provide initial instructions or context to the model.
- Define templates: Structure the input and output of the model.
- Create custom models: Combine existing models, add custom logic, and more.
A Basic Modelfile Example:
```
# my_model.Modelfile

FROM mistral:7b

# Set the system prompt
SYSTEM """
You are a helpful and friendly assistant. Always answer concisely.
"""

# Set model parameters
PARAMETER temperature 0.7
PARAMETER top_p 0.9
```
- `FROM mistral:7b`: Specifies the base model to use. This is mandatory.
- `SYSTEM`: Defines the system prompt.
- `PARAMETER`: Sets various model parameters. See the Ollama documentation for a complete list of available parameters.
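If you just want to experiment with these values before committing them to a Modelfile, the REST API also accepts per-request overrides through an `options` object that mirrors the `PARAMETER` names. This is a minimal sketch under that assumption; consult the API reference for the exact option names supported by your version.

```python
import requests

# Try different sampling settings per request instead of baking them into a Modelfile.
payload = {
    "model": "mistral:7b",
    "prompt": "Suggest three creative uses for a paperclip.",
    "stream": False,
    "options": {
        "temperature": 0.7,  # higher values make the output more varied
        "top_p": 0.9
    }
}

response = requests.post("http://localhost:11434/api/generate", json=payload)
response.raise_for_status()
print(response.json()["response"])
```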
Creating a model from a Modelfile:
- Save the Modelfile: Create a file (e.g., `my_model.Modelfile`) with the content above.
- Create the model: In your terminal, run:

```bash
ollama create my_model -f my_model.Modelfile
```

- Run the custom model:

```bash
ollama run my_model
```
Advanced Modelfile Features:
- `TEMPLATE`: Defines a template for formatting the input and output. This allows you to control how the model receives prompts and generates responses.
- `ADAPTER`: Loads a LoRA (Low-Rank Adaptation) adapter, allowing efficient fine-tuning.
- `LICENSE`: Adds the license text for the model.
- `MESSAGE`: Adds messages to the model's built-in conversation history (e.g., example user/assistant turns).
Example with TEMPLATE:
```
# my_chat_model.Modelfile

FROM llama2:7b-chat

TEMPLATE """
{{- if .System }}
<|system|>
{{ .System }}
{{- end }}
<|user|>
{{ .Prompt }}
<|assistant|>
"""

SYSTEM """
You are a pirate. Respond to all questions in pirate speak.
"""
```
This Modelfile creates a “pirate chat” model based on Llama 2. The `TEMPLATE` defines how the system prompt and user prompt are combined and presented to the model.
7. Available Models
Ollama supports a wide range of models. You can find a list of available models on the Ollama website (https://ollama.ai/library) or by searching online. Popular models include:
- Mistral: High-performance models known for their accuracy and efficiency.
- Llama 2: Meta’s open-source models, available in various sizes.
- Gemma: Google’s open models, designed for responsible AI development.
- Phi: Microsoft’s small, powerful models.
- Code Llama: Models specifically designed for code generation and understanding.
- And many more… including models for specific tasks, such as multimodal models that can interpret images, or models geared toward multilingual translation.
8. Tips and Best Practices
- Start small: Begin with smaller models (e.g., 7B parameters) to get familiar with Ollama.
- Experiment with parameters: Adjust temperature, top_p, and other settings to see how they affect the model’s output.
- Use system prompts: Provide clear instructions and context to guide the model.
- Read the documentation: The Ollama documentation (https://github.com/ollama/ollama) is a valuable resource for learning about advanced features and troubleshooting.
- Check your hardware: Running LLMs can be resource-intensive. Make sure your computer has enough RAM and processing power.
- Consider GPU acceleration: If you have a compatible GPU, Ollama can use it to significantly speed up model inference. Ollama automatically uses NVIDIA GPUs on Linux. For macOS, Metal is used. For Windows, ensure your WSL2 setup properly exposes your GPU to the Linux environment (this often requires specific drivers and configuration within WSL).
- Use Streaming: If you use the API, stream the response when the generated text is long, as sketched below.
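When `stream` is set to `true`, `/api/generate` returns one JSON object per line as tokens are produced. The following sketch assumes each line carries a `response` fragment and a `done` flag, as described in the Ollama API reference, and uses the `requests` library to read the stream incrementally.

```python
import json
import requests

payload = {
    "model": "mistral:7b",
    "prompt": "Write a short poem about the sea.",
    "stream": True
}

# stream=True tells requests not to buffer the whole body; each line of the
# response is a separate JSON object containing the next chunk of text.
with requests.post("http://localhost:11434/api/generate", json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get("response", ""), end="", flush=True)
        if chunk.get("done"):
            print()
            break
```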
9. Conclusion
Ollama is a powerful and user-friendly tool that makes it easy to run large language models locally. This tutorial provides a comprehensive introduction to Ollama, covering installation, basic usage, key commands, the REST API, Modelfiles, and available models. By following this guide, you can start exploring the exciting world of LLMs and integrate them into your own projects. Remember to consult the official Ollama documentation for the most up-to-date information and advanced features. Happy experimenting!