Ollama API Setup Guide
Setting Up Ollama (Local Installation)
Ollama can run directly on your machine or inside Docker. Running it natively gives the best access to your hardware, including GPU acceleration, while Docker keeps the installation isolated and easy to remove at the cost of a little overhead. Either way, everything runs locally. By default the API listens only on localhost; if you ever expose it to other machines, put it behind an authenticated reverse proxy.
Step 1: Install Ollama
macOS/Linux:
curl -fsSL https://ollama.com/install.sh | sh
Windows: Download the installer from ollama.com 
Docker:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
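If you have an NVIDIA GPU and the NVIDIA Container Toolkit installed, Ollama's Docker instructions add a flag to pass the GPU through to the container:
# Run with NVIDIA GPU passthrough (requires the NVIDIA Container Toolkit)
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama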
Step 2: Start Ollama Service
ollama serve
The API will be available at http://localhost:11434
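On most installs the service starts automatically, so ollama serve is mainly for manual or headless setups. A quick way to confirm the server is up is to hit the tags endpoint, which returns the models installed locally:
# Should return a JSON list of installed models
curl http://localhost:11434/api/tags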
Step 3: Download Models
# Example: Download Llama 3.2
ollama pull llama3.2
# List available models
ollama list
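Once a model is downloaded you can talk to it straight from the terminal; ollama run also pulls the model first if it is not already present:
# Start an interactive chat session
ollama run llama3.2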
Available Models (Popular Options)
Meta Models
- llama3.2:1b - Compact 1B parameter model
- llama3.2:3b - Balanced 3B parameter model
- llama3.1:8b - Standard 8B parameter model
- llama3.1:70b - Large 70B parameter model
(Llama 3.2 itself ships only the 1B and 3B text sizes; the larger models come from the Llama 3.1 line.)
Mistral Models
- mistral:7b - Efficient 7B model
- mistral:latest - Latest Mistral model
- mixtral:8x7b - Mixture of experts model
Code Models
- codellama:7b - Code generation specialist
- codellama:13b - Larger code model
- starcoder2:7b - Alternative code model
Other Popular Models
- gemma2:2b - Google’s compact model
- gemma2:9b - Larger Gemma model
- phi3:3.8b - Microsoft’s efficient model
- qwen2.5:7b - Alibaba’s multilingual model
Key Features
- Free and Local: No API costs, runs entirely locally
- Privacy: Your data never leaves your machine
- Offline: Works without internet connection
- Customizable: Fine-tune and create custom models (see the Modelfile sketch after this list)
- Multiple Formats: Support for various model formats
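As a sketch of the customization workflow: a Modelfile layers parameters and a system prompt on top of a base model, and ollama create builds it into a named local model. The model name my-assistant, the temperature value, and the system prompt below are illustrative.
# Write a Modelfile (base model, temperature, and prompt are illustrative)
cat > Modelfile <<'EOF'
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant that answers in bullet points."""
EOF
# Build and run the custom model
ollama create my-assistant -f Modelfile
ollama run my-assistant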
API Endpoints
- Generate a completion: POST /api/generate
- Chat: POST /api/chat
- List local models: GET /api/tags
- Pull a model: POST /api/pull
- Push a model: POST /api/push
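For example, a non-streaming request to the generate endpoint looks like this (the model name assumes you pulled llama3.2 earlier; "stream": false returns one JSON response instead of a token stream):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
The chat endpoint takes a messages array instead of a bare prompt:
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }],
  "stream": false
}'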
System Requirements
- RAM: 8GB minimum (16GB+ recommended for larger models)
- Storage: roughly 1-50GB per model, depending on parameter count and quantization
- GPU: Optional but recommended (NVIDIA, AMD, or Apple Silicon)
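As a rough rule of thumb, a model needs RAM or VRAM somewhat above its file size on disk. Once the server is running, you can check what a loaded model actually consumes, and whether it is running on CPU or GPU, with:
# Show running models, their memory footprint, and CPU/GPU placement
ollama ps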
Configuration
Models are stored in:
- macOS: ~/.ollama/models
- Linux: ~/.ollama/models (the install script's systemd service stores them under /usr/share/ollama/.ollama/models)
- Windows: %USERPROFILE%\.ollama\models
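If you would rather keep models on a different disk, the documented OLLAMA_MODELS environment variable overrides the storage location; set it in the environment of whatever process runs the server (the path below is illustrative):
# Store models on a different drive (example path)
export OLLAMA_MODELS=/mnt/ssd/ollama-models
ollama serve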
Performance Tips
- Use GPU acceleration when available
- Choose model size based on your RAM
- Consider quantized models for better performance (see the example after this list)
- Use SSD storage for faster model loading
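On the quantization tip: the default tags are typically already 4-bit quantized, and most models publish additional variants at different quantization levels. The exact tag below is only an example; check the model's tag list on ollama.com/library before pulling:
# Pull an explicitly quantized variant (tag names vary per model)
ollama pull llama3.1:8b-instruct-q4_K_M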