Llama 3.2 1B

Meta
meta/llama-3.2-1b

128K context

Context Window: 128K (128,000 tokens)
Max Output: 4K (4,096 tokens)
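A request's prompt plus its generated output must fit within these limits. A minimal client-side budget check, assuming a rough four-characters-per-token heuristic (the real tokenizer count will differ, so leave headroom):

```python
CONTEXT_WINDOW = 128_000  # total tokens per request
MAX_OUTPUT = 4_096        # tokens the model may generate

def fits_context(prompt: str, max_output: int = MAX_OUTPUT) -> bool:
    """Rough check that prompt + requested output fit the 128K window.

    Uses a crude ~4 characters/token estimate, not the model's
    actual tokenizer, so treat the result as approximate.
    """
    estimated_prompt_tokens = len(prompt) // 4 + 1
    return estimated_prompt_tokens + max_output <= CONTEXT_WINDOW

print(fits_context("Hello!"))  # a short prompt easily fits: True
```

For exact counts, tokenize the prompt with the model's own tokenizer before checking.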

About this model

The smallest model in the Llama 3.2 family: ultra-fast and ultra-cheap.

This model supports up to 128K tokens of context.

Access it through Chuizi.AI with a single ck- API key; no separate Meta account needed.

Highlights

128K context window
4K max output
Multi-cloud failover
Unified billing via Chuizi.AI
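"Multi-cloud failover" refers to Chuizi.AI retrying upstream providers on your behalf; on the client side you may still want a simple retry wrapper for transient network errors. A hypothetical sketch (the backoff values and the bare `Exception` catch are illustrative, not part of any Chuizi.AI SDK):

```python
import time

def with_retries(call, attempts=3, base_delay=0.5):
    """Invoke call() and retry on failure with exponential backoff.

    `call` is any zero-argument function, e.g. a lambda wrapping
    client.chat.completions.create(...). Illustrative only.
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt))

# Example with a flaky stand-in for an API call:
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("transient")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # succeeds on the third attempt: ok
```

In production you would narrow the `except` clause to retryable errors (timeouts, 429s, 5xx) rather than catching everything.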

Best For

Conversational AI
Content generation
Summarization
Q&A applications

Released: 2024-09-25

Capabilities

Chat

Aliases

llama-3.2-1b

Pricing (per 1M tokens)

Input: $0.10
Output: $0.10

Final prices shown
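At $0.10 per 1M tokens for both input and output, per-request cost is simple arithmetic. A quick estimator using the rates from the table above:

```python
INPUT_PER_M = 0.10   # USD per 1M input tokens
OUTPUT_PER_M = 0.10  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply:
print(f"${request_cost(2_000, 500):.6f}")  # $0.000250
```

Even a million such requests would cost about $250, which is the point of running the 1B model.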

Quick Start

main.py
from openai import OpenAI

# Chuizi.AI exposes an OpenAI-compatible endpoint; point the client at it.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",  # your Chuizi.AI key (ck- prefix)
)

response = client.chat.completions.create(
    model="meta/llama-3.2-1b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

