
Ollama API Setup Guide

Setting Up Ollama (Local Installation)

We recommend running Ollama in Docker, which keeps the installation isolated and easy to manage or remove. If your machine has sufficient processing power, you can instead install and run Ollama directly on the host. For enhanced privacy, consider binding the API to a dedicated local endpoint rather than exposing it to your network.

Step 1: Install Ollama

macOS/Linux:

curl -fsSL https://ollama.com/install.sh | sh

Windows: Download the installer from ollama.com 

Docker:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Step 2: Start Ollama Service

ollama serve

The API will be available at http://localhost:11434
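To confirm the server is reachable, you can query the /api/tags endpoint. The sketch below uses only Python's standard library and assumes the default host and port shown above:

```python
import json
import urllib.request
import urllib.error

def server_up(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an Ollama server responds at base_url, else False."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            # /api/tags returns a JSON object containing a "models" list
            return "models" in json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False

if __name__ == "__main__":
    print("Ollama running:", server_up())
```

If this prints `False`, check that `ollama serve` (or the Docker container) is actually running.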

Step 3: Download Models

# Example: Download Llama 3.2
ollama pull llama3.2

# List available models
ollama list

Meta Models

  • llama3.2:1b - Compact 1B parameter model
  • llama3.2:3b - Balanced 3B parameter model
  • llama3.1:8b - Standard 8B parameter model
  • llama3.1:70b - Large 70B parameter model

Mistral Models

  • mistral:7b - Efficient 7B model
  • mistral:latest - Latest Mistral model
  • mixtral:8x7b - Mixture of experts model

Code Models

  • codellama:7b - Code generation specialist
  • codellama:13b - Larger code model
  • starcoder2:7b - Alternative code model

Other Models

  • gemma2:2b - Google’s compact model
  • gemma2:9b - Larger Gemma model
  • phi3:3.8b - Microsoft’s efficient model
  • qwen2.5:7b - Alibaba’s multilingual model

Key Features

  • Free and Local: No API costs, runs entirely locally
  • Privacy: Your data never leaves your machine
  • Offline: Works without internet connection
  • Customizable: Fine-tune and create custom models
  • Multiple Formats: Support for various model formats

API Endpoints

  • Generate: POST /api/generate
  • Chat: POST /api/chat
  • Models: GET /api/tags
  • Pull: POST /api/pull
  • Push: POST /api/push
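As a sketch of the request shape, the generate endpoint accepts a JSON body with at least a model name and a prompt; `"stream": false` asks for a single JSON response instead of a stream of chunks. Actually sending the request requires a running server and a pulled model:

```python
import json
import urllib.request

def build_generate_payload(model, prompt, stream=False):
    """Build the JSON body for POST /api/generate."""
    return {"model": model, "prompt": prompt, "stream": stream}

def generate(prompt, model="llama3.2", base_url="http://localhost:11434"):
    """POST /api/generate and return the model's response text."""
    data = json.dumps(build_generate_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the full text in "response"
        return json.load(resp)["response"]

# With a running server and llama3.2 pulled:
# print(generate("Why is the sky blue?"))
```

The chat endpoint (POST /api/chat) works the same way, but takes a `"messages"` list of `{"role", "content"}` objects instead of a single prompt.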

System Requirements

  • RAM: 8GB minimum (16GB+ recommended for larger models)
  • Storage: 5-50GB per model depending on size
  • GPU: Optional but recommended (NVIDIA, AMD, or Apple Silicon)
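As a rough rule of thumb (an approximation, not a figure from Ollama's documentation), a quantized model's file size is about its parameter count times the bits per parameter, divided by eight; RAM use while running is somewhat higher:

```python
def approx_model_gb(params_billion, bits_per_param=4):
    """Rough on-disk size estimate in GB for a quantized model.

    4-bit quantization (common for default Ollama tags) stores about
    half a byte per parameter; real files also carry embeddings and
    metadata, so treat this as a lower bound.
    """
    return params_billion * bits_per_param / 8

# A 7B model at 4-bit quantization: roughly 3.5 GB on disk.
```

This is why a 7B model fits comfortably in 8GB of RAM, while 70B models call for far more memory or aggressive quantization.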

Configuration

Models are stored in:

  • macOS: ~/.ollama/models
  • Linux: ~/.ollama/models
  • Windows: %USERPROFILE%\.ollama\models
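All three defaults resolve to the same path under the user's home directory, so a small helper can locate it portably (a sketch; Ollama also honors the `OLLAMA_MODELS` environment variable to override the location):

```python
import os

def ollama_models_dir():
    """Default Ollama model storage path, honoring OLLAMA_MODELS if set."""
    override = os.environ.get("OLLAMA_MODELS")
    if override:
        return override
    # ~/.ollama/models on macOS/Linux, %USERPROFILE%\.ollama\models on Windows
    return os.path.join(os.path.expanduser("~"), ".ollama", "models")
```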

Performance Tips

  • Use GPU acceleration when available
  • Choose model size based on your RAM
  • Consider quantized models for better performance
  • Use SSD storage for faster model loading