LLM pricing

Large language model (LLM) API pricing can be hard to calculate, which is why we made this simple calculator to help you estimate your LLM costs.

Google
$1 /mo

Gemini 1.5 flash-8b pricing
Variant: 128k context
Context window size: 1000 k
Description: smallest model for lower intelligence use cases

Per million tokens
Input: $0.0375
Output: $0.15
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$2 /mo

Gemini 2.0 Flash-Lite pricing
Variant: 1m context
Context window size: 1000 k
Description: smallest and most cost effective model

Per million tokens
Input: $0.075
Output: $0.3
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$2 /mo

Gemini 1.5 flash pricing
Variant: 128k context
Context window size: 128 k
Description: fastest model for repetitive tasks

Per million tokens
Input: $0.075
Output: $0.3
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$3 /mo

Gemini 2.0 Flash pricing
Variant: 1m context
Context window size: 1000 k
Description: most capable multi-modal model

Per million tokens
Input: $0.1
Output: $0.4
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Ai21
$3 /mo

Jamba 1.5 pricing
Variant: Mini
Context window size: 256 k
Description: Efficient & lightweight

Per million tokens
Input: $0.2
Output: $0.4
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$4 /mo

Gemini 1.5 flash pricing
Variant: 1 mill context
Context window size: 1000 k
Description: fastest model for repetitive tasks

Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Cohere
$4 /mo

Command R pricing
Variant:
Context window size: 128 k
Description: optimized for long context tasks

Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

OpenAI
$5 /mo

GPT-4o mini pricing
Variant:
Context window size: 128 k
Description: cost-efficient small model

Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0.75
Output Batch: $0.3
Cached input: $0.075
Cached output: $0

OpenAI
$5 /mo

GPT-4o mini pricing
Variant: 2024-07-18
Context window size: 128 k
Description: cost-efficient small model

Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0.75
Output Batch: $0.3
Cached input: $0.075
Cached output: $0

Google
$6 /mo

Gemini 1.5 flash-8b pricing
Variant: 1 mill context
Context window size: 1000 k
Description: smallest model for lower intelligence use cases

Per million tokens
Input: $0.75
Output: $0.3
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Deepseek
$7 /mo

deepseek-chat pricing
Variant: DeepSeek-V3
Context window size: 64 k
Description:

Per million tokens
Input: $0.27
Output: $1.1
Input Batch: $0
Output Batch: $0
Cached input: $0.07
Cached output: $0

Cohere
$8 /mo

Command R pricing
Variant: Fine-tuned
Context window size: 128 k
Description: optimized for long context tasks

Per million tokens
Input: $0.3
Output: $1.2
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Deepseek
$12 /mo

deepseek-reasoner pricing
Variant: DeepSeek-R1
Context window size: 64 k
Description:

Per million tokens
Input: $0.14
Output: $2.19
Input Batch: $0
Output Batch: $0
Cached input: $0.14
Cached output: $0

Anthropic
$25 /mo

Claude 3.5 pricing
Variant: Haiku
Context window size: 200 k
Description: Fastest, most cost-effective model

Per million tokens
Input: $0.8
Output: $4
Input Batch: $0.4
Output Batch: $2
Cached input: $1
Cached output: $0.08

OpenAI
$29 /mo

o3-mini pricing
Variant: o3
Context window size: 200 k
Description: cost-efficient small model

Per million tokens
Input: $1.1
Output: $4.4
Input Batch: $0.55
Output Batch: $2.2
Cached input: $0.55
Cached output: $0

Google
$32 /mo

Gemini 1.5 pro pricing
Variant: 128k context
Context window size: 128 k
Description: next-gen model

Per million tokens
Input: $1.25
Output: $5
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Ai21
$50 /mo

Jamba 1.5 pricing
Variant: Large
Context window size: 256 k
Description: powerful long context model

Per million tokens
Input: $2
Output: $8
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Grok
$60 /mo

Grok 2 pricing
Variant:
Context window size: 128 k
Description:
Cutoff date: 2024-07-17

Per million tokens
Input: $2
Output: $10
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$63 /mo

Gemini 1.5 pro pricing
Variant:
Context window size: 2000 k
Description: next-gen model

Per million tokens
Input: $2.5
Output: $10
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$63 /mo

Gemini 1.0 pro pricing
Variant:
Context window size: 2000 k
Description: first gen model for text and image

Per million tokens
Input: $2.5
Output: $10
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Cohere
$63 /mo

Command R+ pricing
Variant:
Context window size: 128 k
Description: most powerful, scalable large language model

Per million tokens
Input: $2.5
Output: $10
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

OpenAI
$65 /mo

GPT-4o pricing
Variant:
Context window size: 128 k
Description: Their most advanced multimodal model
Cutoff date: 2023-10-01

Per million tokens
Input: $2.5
Output: $10
Input Batch: $1.25
Output Batch: $5
Cached input: $1.25
Cached output: $0

OpenAI
$65 /mo

GPT-4o pricing
Variant: 2024-11-20
Context window size: 128 k
Description: Their most advanced multimodal model
Cutoff date: 2024-10-01

Per million tokens
Input: $2.5
Output: $10
Input Batch: $1.25
Output Batch: $5
Cached input: $1.25
Cached output: $0

OpenAI
$65 /mo

GPT-4o pricing
Variant: 2024-08-06
Context window size: 128 k
Description: GPT-4o with improved reasoning capabilities

Per million tokens
Input: $2.5
Output: $10
Input Batch: $1.25
Output Batch: $5
Cached input: $1.25
Cached output: $0

OpenAI
$75 /mo

OpenAI o1-mini pricing
Variant:
Context window size: 128 k
Description: fast, cost-efficient tailored to coding, math, and science

Per million tokens
Input: $3
Output: $12
Input Batch: $0
Output Batch: $0
Cached input: $1.5
Cached output: $0

OpenAI
$75 /mo

OpenAI o1-mini pricing
Variant: 2024-09-12
Context window size: 128 k
Description: fast, cost-efficient tailored to coding, math, and science

Per million tokens
Input: $3
Output: $12
Input Batch: $0
Output Batch: $0
Cached input: $1.5
Cached output: $0

Anthropic
$93 /mo

Claude 3.7 pricing
Variant: Sonnet
Context window size: 200 k
Description: most intelligent model

Per million tokens
Input: $3
Output: $15
Input Batch: $1.5
Output Batch: $7.5
Cached input: $3.75
Cached output: $0.3

Anthropic
$93 /mo

Claude 3.5 pricing
Variant: Sonnet
Context window size: 200 k
Description: intelligent model

Per million tokens
Input: $3
Output: $15
Input Batch: $1.5
Output Batch: $7.5
Cached input: $3.75
Cached output: $0.3

OpenAI
$103 /mo

GPT-4o pricing
Variant: 2024-05-13
Context window size: 128 k
Description: GPT-4o with improved reasoning capabilities

Per million tokens
Input: $5
Output: $15
Input Batch: $2.5
Output Batch: $7.5
Cached input: $0
Cached output: $0

OpenAI
$375 /mo

OpenAI o1-preview pricing
Variant:
Context window size: 128 k
Description: reasoning model for complex tasks

Per million tokens
Input: $15
Output: $60
Input Batch: $0
Output Batch: $0
Cached input: $7.5
Cached output: $0

OpenAI
$375 /mo

OpenAI o1-preview pricing
Variant: 2024-09-12
Context window size: 128 k
Description: reasoning model for complex tasks

Per million tokens
Input: $15
Output: $60
Input Batch: $0
Output Batch: $0
Cached input: $7.5
Cached output: $0

OpenAI
$385 /mo

o1 pricing
Variant:
Context window size: 200 k
Description: Frontier reasoning model
Cutoff date: 2023-10-01

Per million tokens
Input: $15
Output: $60
Input Batch: $7.5
Output Batch: $30
Cached input: $7.5
Cached output: $0

Anthropic
$462 /mo

Claude 3 pricing
Variant: Opus
Context window size: 200 k
Description: Powerful model for complex tasks

Per million tokens
Input: $15
Output: $75
Input Batch: $7.5
Output Batch: $37.5
Cached input: $18.75
Cached output: $1.5

OpenAI
$1154 /mo

GPT-4.5 pricing
Variant:
Context window size: 128 k
Description: Largest GPT model

Per million tokens
Input: $75
Output: $150
Input Batch: $37.5
Output Batch: $75
Cached input: $37.5
Cached output: $0

OpenAI
$1500 /mo

GPT-4o pricing
Variant: audio-preview
Context window size: 128 k
Description: GPT-4o with audio capabilities

Per million tokens
Input: $100
Output: $200
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Comparing LLM prices is not too hard: the newest models are usually the most expensive, so try a previous-generation model first to see whether you can get away with a cheaper price.
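
One way to make that comparison concrete is to rank a handful of models by what the same workload would cost on each. The sketch below is only an illustration: the names `PRICES`, `monthly_cost`, and `rank_by_cost` are hypothetical, the workload of 2 million input tokens and 1 million output tokens is an assumed example, and the per-million-token prices are copied from the listing above and may be out of date.

```python
# Per-million-token prices (input, output) in USD, copied from the listing above.
# The selection of models is arbitrary and only for illustration.
PRICES = {
    "Gemini 2.0 Flash": (0.1, 0.4),
    "GPT-4o mini": (0.15, 0.6),
    "Claude 3.5 Haiku": (0.8, 4.0),
    "GPT-4o": (2.5, 10.0),
    "GPT-4.5": (75.0, 150.0),
}


def monthly_cost(model, input_tokens, output_tokens):
    """Estimated cost in USD for the given monthly token volume."""
    input_price, output_price = PRICES[model]
    return (input_tokens / 1_000_000) * input_price \
        + (output_tokens / 1_000_000) * output_price


def rank_by_cost(input_tokens, output_tokens):
    """Model names sorted from cheapest to most expensive for this workload."""
    return sorted(
        PRICES,
        key=lambda model: monthly_cost(model, input_tokens, output_tokens),
    )


# Assumed example workload: 2M input tokens and 1M output tokens per month.
ranking = rank_by_cost(2_000_000, 1_000_000)
for model in ranking:
    print(f"{model}: ${monthly_cost(model, 2_000_000, 1_000_000):.2f}")
```

This ignores batch and cached-token pricing, so it should be read as a rough ordering rather than a bill.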

The LLM pricing calculator works with large numbers: the inputs usually need a million or more requests per box before they start showing the real price.
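
To see why small inputs barely register, it may help to work through the arithmetic for a per-million-token price. The following is a minimal sketch, not the calculator's actual implementation: the function name `estimate_cost` and both token volumes are assumptions, and the prices are the GPT-4o mini figures listed above ($0.15 input, $0.6 output per million tokens).

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimated cost in USD from token counts and per-million-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_million \
        + (output_tokens / 1_000_000) * output_price_per_million


# Assumed small workload: 10,000 tokens in and 10,000 tokens out.
small = estimate_cost(10_000, 10_000, 0.15, 0.6)

# Assumed large workload: 5 million tokens in and 5 million tokens out.
large = estimate_cost(5_000_000, 5_000_000, 0.15, 0.6)

print(f"small workload: ${small:.4f}")
print(f"large workload: ${large:.2f}")
```

Under these assumptions the small workload costs well under one cent, while the large one comes to a few dollars, which is why the estimate only becomes meaningful at volumes in the millions.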

Disclaimer

This is an estimation tool only. The actual bill could be very different. SaaS Prices is not responsible for any miscalculations. The prices and terms may be wrong and may change. If no region is specified, you can assume it's Frankfurt, Germany. SaaS Prices is not affiliated with the providers in the calculators, unless explicitly stated.