Ollama API Setup Guide
Setting Up Ollama (Local Installation)
Ollama can run directly on your machine or inside Docker. Running it natively gives the best access to your hardware, including GPU acceleration, while Docker keeps the installation isolated and easy to remove at the cost of a little overhead. Either way, everything runs locally. By default the API listens only on localhost; if you ever expose it to other machines, put it behind an authenticated reverse proxy.
Step 1: Install Ollama
macOS/Linux:
curl -fsSL https://ollama.com/install.sh | sh
Windows: Download the installer from ollama.com 
Docker:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
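If you have an NVIDIA GPU and the NVIDIA Container Toolkit installed, Ollama's Docker instructions add a flag to pass the GPU through to the container:
# Run with NVIDIA GPU passthrough (requires the NVIDIA Container Toolkit)
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama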
Step 2: Start Ollama Service
ollama serve
The API will be available at http://localhost:11434
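On most installs the service starts automatically, so ollama serve is mainly for manual or headless setups. A quick way to confirm the server is up is to hit the tags endpoint, which returns the models installed locally:
# Should return a JSON list of installed models
curl http://localhost:11434/api/tags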
Step 3: Download Models
# Example: Download Llama 3.2
ollama pull llama3.2
# List available models
ollama list
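Once a model is downloaded you can talk to it straight from the terminal; ollama run also pulls the model first if it is not already present:
# Start an interactive chat session
ollama run llama3.2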
Available Models (Popular Options)
Meta Models
- llama3.2:1b - Compact 1B parameter model
- llama3.2:3b - Balanced 3B parameter model
- llama3.1:8b - Standard 8B parameter model
- llama3.1:70b - Large 70B parameter model
(Llama 3.2 itself ships only the 1B and 3B text sizes; the larger models come from the Llama 3.1 line.)
Mistral Models
- mistral:7b - Efficient 7B model
- mistral:latest - Latest Mistral model
- mixtral:8x7b - Mixture of experts model
Code Models
- codellama:7b - Code generation specialist
- codellama:13b - Larger code model
- starcoder2:7b - Alternative code model
Other Popular Models
- gemma2:2b - Google’s compact model
- gemma2:9b - Larger Gemma model
- phi3:3.8b - Microsoft’s efficient model
- qwen2.5:7b - Alibaba’s multilingual model
Key Features
- Free and Local: No API costs, runs entirely locally
- Privacy: Your data never leaves your machine
- Offline: Works without internet connection
- Customizable: Fine-tune and create custom models (see the Modelfile sketch after this list)
- Multiple Formats: Support for various model formats
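As a sketch of the customization workflow: a Modelfile layers parameters and a system prompt on top of a base model, and ollama create builds it into a named local model. The model name my-assistant, the temperature value, and the system prompt below are illustrative.
# Write a Modelfile (base model, temperature, and prompt are illustrative)
cat > Modelfile <<'EOF'
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant that answers in bullet points."""
EOF
# Build and run the custom model
ollama create my-assistant -f Modelfile
ollama run my-assistant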
API Endpoints
- Generate a completion: POST /api/generate
- Chat: POST /api/chat
- List local models: GET /api/tags
- Pull a model: POST /api/pull
- Push a model: POST /api/push
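For example, a non-streaming request to the generate endpoint looks like this (the model name assumes you pulled llama3.2 earlier; "stream": false returns one JSON response instead of a token stream):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
The chat endpoint takes a messages array instead of a bare prompt:
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }],
  "stream": false
}'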
System Requirements
- RAM: 8GB minimum (16GB+ recommended for larger models)
- Storage: roughly 1-50GB per model, depending on parameter count and quantization
- GPU: Optional but recommended (NVIDIA, AMD, or Apple Silicon)
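As a rough rule of thumb, a model needs RAM or VRAM somewhat above its file size on disk. Once the server is running, you can check what a loaded model actually consumes, and whether it is running on CPU or GPU, with:
# Show running models, their memory footprint, and CPU/GPU placement
ollama ps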
Configuration
Models are stored in:
- macOS: ~/.ollama/models
- Linux: ~/.ollama/models (the install script's systemd service stores them under /usr/share/ollama/.ollama/models)
- Windows: %USERPROFILE%\.ollama\models
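If you would rather keep models on a different disk, the documented OLLAMA_MODELS environment variable overrides the storage location; set it in the environment of whatever process runs the server (the path below is illustrative):
# Store models on a different drive (example path)
export OLLAMA_MODELS=/mnt/ssd/ollama-models
ollama serve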
Performance Tips
- Use GPU acceleration when available
- Choose model size based on your RAM
- Consider quantized models for better performance (see the example after this list)
- Use SSD storage for faster model loading
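On the quantization tip: the default tags are typically already 4-bit quantized, and most models publish additional variants at different quantization levels. The exact tag below is only an example; check the model's tag list on ollama.com/library before pulling:
# Pull an explicitly quantized variant (tag names vary per model)
ollama pull llama3.1:8b-instruct-q4_K_M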