Llama 3.2 1B

Meta
meta/llama-3.2-1b

128K context

Context Window: 128K (128,000 tokens)
Max Output: 4K (4,096 tokens)
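A request's prompt plus its generated output must fit within these limits. A minimal client-side budget check, assuming a rough four-characters-per-token heuristic (the real tokenizer count will differ, so leave headroom):

```python
CONTEXT_WINDOW = 128_000  # total tokens per request
MAX_OUTPUT = 4_096        # tokens the model may generate

def fits_context(prompt: str, max_output: int = MAX_OUTPUT) -> bool:
    """Rough check that prompt + requested output fit the 128K window.

    Uses a crude ~4 characters/token estimate, not the model's
    actual tokenizer, so treat the result as approximate.
    """
    estimated_prompt_tokens = len(prompt) // 4 + 1
    return estimated_prompt_tokens + max_output <= CONTEXT_WINDOW

print(fits_context("Hello!"))  # a short prompt easily fits: True
```

For exact counts, tokenize the prompt with the model's own tokenizer before checking.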

About this model

The smallest model in the Llama 3.2 family: ultra-fast and ultra-cheap.

This model supports up to 128K tokens of context.

Access it through Chuizi.AI with a single ck- API key; no separate Meta account needed.

Highlights

128K context window
4K max output
Multi-cloud failover
Unified billing via Chuizi.AI
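"Multi-cloud failover" refers to Chuizi.AI retrying upstream providers on your behalf; on the client side you may still want a simple retry wrapper for transient network errors. A hypothetical sketch (the backoff values and the bare `Exception` catch are illustrative, not part of any Chuizi.AI SDK):

```python
import time

def with_retries(call, attempts=3, base_delay=0.5):
    """Invoke call() and retry on failure with exponential backoff.

    `call` is any zero-argument function, e.g. a lambda wrapping
    client.chat.completions.create(...). Illustrative only.
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt))

# Example with a flaky stand-in for an API call:
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("transient")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # succeeds on the third attempt: ok
```

In production you would narrow the `except` clause to retryable errors (timeouts, 429s, 5xx) rather than catching everything.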

Best For

Conversational AI
Content generation
Summarization
Q&A applications

Released: 2024-09-25

Capabilities

Chat

Aliases

llama-3.2-1b

Pricing (per 1M tokens)

Input: $0.10
Output: $0.10

Final prices shown
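At $0.10 per 1M tokens for both input and output, per-request cost is simple arithmetic. A quick estimator using the rates from the table above:

```python
INPUT_PER_M = 0.10   # USD per 1M input tokens
OUTPUT_PER_M = 0.10  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply:
print(f"${request_cost(2_000, 500):.6f}")  # $0.000250
```

Even a million such requests would cost about $250, which is the point of running the 1B model.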

Quick Start

main.py
from openai import OpenAI

# Chuizi.AI exposes an OpenAI-compatible endpoint; point the client at it.
client = OpenAI(
    base_url="https://api.chuizi.ai/v1",
    api_key="ck-your-key-here",  # your Chuizi.AI key (ck- prefix)
)

response = client.chat.completions.create(
    model="meta/llama-3.2-1b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

