Groq API Setup Guide
Getting Your Groq API Key
Step 1: Create a Groq Account
- Visit console.groq.com 
- Click “Sign Up” to create an account
- Verify your email and complete registration
Step 2: Access API Keys
- Log into your Groq console
- Navigate to the “API Keys” section
- Click “Create API Key”
Step 3: Generate and Store Key
- Name your API key for identification
- Copy the generated key (starts with
gsk_
) - Store it securely - you won’t see it again
Available Groq Models
Llama Models (Meta)
- llama-3.2-1b-preview - Ultra-fast 1B parameter model
- llama-3.2-3b-preview - Balanced 3B parameter model
- llama-3.2-11b-text-preview - Text-only 11B model
- llama-3.2-90b-text-preview - Large 90B text model
- llama-3.1-8b-instant - Fast 8B instruction model
- llama-3.1-70b-versatile - Versatile large model
Mixtral Models (Mistral)
- mixtral-8x7b-32768 - Mixture of experts model
- mistral-7b-instruct-v0.1 - Instruction-tuned Mistral
Gemma Models (Google)
- gemma-7b-it - Instruction-tuned Gemma
- gemma2-9b-it - Latest Gemma generation
Key Features
- Ultra-Fast Inference: Hardware-accelerated LPU™ chips
- High Throughput: Up to 750+ tokens per second
- Low Latency: Near real-time responses
- Cost-Effective: Competitive pricing per token
- OpenAI Compatible: Drop-in API replacement
Speed Performance
Groq’s LPU™ (Language Processing Unit) technology delivers:
- Llama-3.2-1b: ~750+ tokens/second
- Llama-3.1-8b: ~500+ tokens/second
- Llama-3.1-70b: ~250+ tokens/second
- Mixtral-8x7b: ~350+ tokens/second
Context Windows
- Llama 3.2 models: 128K tokens
- Llama 3.1 models: 128K tokens
- Mixtral models: 32K tokens
- Gemma models: Varies by version
Pricing
- Competitive rates: Often lower than major providers
- Pay per token: Input and output tokens charged separately
- Free tier: Available for testing and development
- Volume discounts: Available for high usage
Rate Limits
Free Tier
- Requests per minute: 30
- Requests per day: 14,400
- Tokens per minute: 6,000
Paid Tiers
- Higher rate limits based on usage plan
- Enterprise options available
- Custom limits for high-volume users
Use Cases
- Real-time chat applications
- Interactive AI assistants
- Live content generation
- Gaming and entertainment
- Developer tools and IDEs
Security Tips
- Never commit keys to version control
- Rotate keys regularly
- Monitor usage in Groq console
- Use least-privilege access principles
Last updated on