Skip to Content
Nexalexica Beta 1.0.0 is released 🎉 Try it now
DocumentationAPIGroq API Setup Guide

Groq API Setup Guide

Getting Your Groq API Key

Step 1: Create a Groq Account

  1. Visit console.groq.com 
  2. Click “Sign Up” to create an account
  3. Verify your email and complete registration

Step 2: Access API Keys

  1. Log into your Groq console
  2. Navigate to the “API Keys” section
  3. Click “Create API Key”

Step 3: Generate and Store Key

  1. Name your API key for identification
  2. Copy the generated key (starts with gsk_)
  3. Store it securely - you won’t see it again

Available Groq Models

Llama Models (Meta)

  • llama-3.2-1b-preview - Ultra-fast 1B parameter model
  • llama-3.2-3b-preview - Balanced 3B parameter model
  • llama-3.2-11b-text-preview - Text-only 11B model
  • llama-3.2-90b-text-preview - Large 90B text model
  • llama-3.1-8b-instant - Fast 8B instruction model
  • llama-3.1-70b-versatile - Versatile large model

Mixtral Models (Mistral)

  • mixtral-8x7b-32768 - Mixture of experts model
  • mistral-7b-instruct-v0.1 - Instruction-tuned Mistral

Gemma Models (Google)

  • gemma-7b-it - Instruction-tuned Gemma
  • gemma2-9b-it - Latest Gemma generation

Key Features

  • Ultra-Fast Inference: Hardware-accelerated LPU™ chips
  • High Throughput: Up to 750+ tokens per second
  • Low Latency: Near real-time responses
  • Cost-Effective: Competitive pricing per token
  • OpenAI Compatible: Drop-in API replacement

Speed Performance

Groq’s LPU™ (Language Processing Unit) technology delivers:

  • Llama-3.2-1b: ~750+ tokens/second
  • Llama-3.1-8b: ~500+ tokens/second
  • Llama-3.1-70b: ~250+ tokens/second
  • Mixtral-8x7b: ~350+ tokens/second

Context Windows

  • Llama 3.2 models: 128K tokens
  • Llama 3.1 models: 128K tokens
  • Mixtral models: 32K tokens
  • Gemma models: Varies by version

Pricing

  • Competitive rates: Often lower than major providers
  • Pay per token: Input and output tokens charged separately
  • Free tier: Available for testing and development
  • Volume discounts: Available for high usage

Rate Limits

Free Tier

  • Requests per minute: 30
  • Requests per day: 14,400
  • Tokens per minute: 6,000
  • Higher rate limits based on usage plan
  • Enterprise options available
  • Custom limits for high-volume users

Use Cases

  • Real-time chat applications
  • Interactive AI assistants
  • Live content generation
  • Gaming and entertainment
  • Developer tools and IDEs

Security Tips

  • Never commit keys to version control
  • Rotate keys regularly
  • Monitor usage in Groq console
  • Use least-privilege access principles
Last updated on