Large Language Model pricing

Tokens

Context window: 128,000

Google
$1 /mo
Gemini 1.5 flash-8b model
Variant: 128k context
Context window size: 1000 k
Description: smallest model for lower intelligence use cases
Per million tokens
Input: $0.0375
Output: $0.15
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$2 /mo
Gemini 1.5 flash model
Variant: 128k context
Context window size: 128 k
Description: fastest model for repetitive tasks
Per million tokens
Input: $0.075
Output: $0.3
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Ai21
$3 /mo
Jamba 1.5 model
Variant: Mini
Context window size: 256 k
Description: Efficient & lightweight
Per million tokens
Input: $0.2
Output: $0.4
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$4 /mo
Gemini 1.5 flash model
Variant: 1 mill context
Context window size: 1000 k
Description: fastest model for repetitive tasks
Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Cohere
$4 /mo
Command R model
Variant:
Context window size: 128 k
Description: optimized for long context tasks
Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

OpenAI
$4 /mo
GPT-4o mini model
Variant:
Context window size: 128 k
Description: cost-efficient small model
Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0.75
Output Batch: $0.3
Cached input: $0.075
Cached output: $0

OpenAI
$4 /mo
GPT-4o mini model
Variant: 2024-07-18
Context window size: 128 k
Description: cost-efficient small model
Per million tokens
Input: $0.15
Output: $0.6
Input Batch: $0.75
Output Batch: $0.3
Cached input: $0.075
Cached output: $0

Google
$5 /mo
Gemini 1.5 flash-8b model
Variant: 1 mill context
Context window size: 1000 k
Description: smallest model for lower intelligence use cases
Per million tokens
Input: $0.75
Output: $0.3
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Cohere
$8 /mo
Command R model
Variant: Fine-tuned
Context window size: 128 k
Description: optimized for long context tasks
Per million tokens
Input: $0.3
Output: $1.2
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Anthropic
$31 /mo
Claude 3.5 model
Variant: Haiku
Context window size: 200 k
Description: Fastest, most cost-effective model
Per million tokens
Input: $1
Output: $5
Input Batch: $0.625
Output Batch: $2.5
Cached input: $1.25
Cached output: $0.1

Google
$31 /mo
Gemini 1.5 pro model
Variant: 128k context
Context window size: 128 k
Description: next-gen model
Per million tokens
Input: $1.25
Output: $5
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Ai21
$50 /mo
Jamba 1.5 model
Variant: Large
Context window size: 256 k
Description: powerful long context model
Per million tokens
Input: $2
Output: $8
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$63 /mo
Gemini 1.5 pro model
Variant:
Context window size: 2000 k
Description: next-gen model
Per million tokens
Input: $2.5
Output: $10
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Google
$63 /mo
Gemini 1.0 pro model
Variant:
Context window size: 2000 k
Description: first gen model for text and image
Per million tokens
Input: $2.5
Output: $10
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

Cohere
$63 /mo
Command R+ model
Variant:
Context window size: 128 k
Description: most powerful, scalable large language model
Per million tokens
Input: $2.5
Output: $10
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0

OpenAI
$64 /mo
GPT-4o model
Variant:
Context window size: 128 k
Description: Their most advanced multimodal model
Cutoff date: 2023-10-01
Per million tokens
Input: $2.5
Output: $10
Input Batch: $1.25
Output Batch: $5
Cached input: $1.25
Cached output: $0

OpenAI
$64 /mo
GPT-4o model
Variant: 2024-11-20
Context window size: 128 k
Description: Their most advanced multimodal model
Cutoff date: 2024-10-01
Per million tokens
Input: $2.5
Output: $10
Input Batch: $1.25
Output Batch: $5
Cached input: $1.25
Cached output: $0

OpenAI
$64 /mo
GPT-4o model
Variant: 2024-08-06
Context window size: 128 k
Description: GPT-4o with improved reasoning capabilities
Per million tokens
Input: $2.5
Output: $10
Input Batch: $1.25
Output Batch: $5
Cached input: $1.25
Cached output: $0

OpenAI
$75 /mo
OpenAI o1-mini model
Variant:
Context window size: 128 k
Description: fast, cost-efficient tailored to coding, math, and science
Per million tokens
Input: $3
Output: $12
Input Batch: $0
Output Batch: $0
Cached input: $1.5
Cached output: $0

OpenAI
$75 /mo
OpenAI o1-mini model
Variant: 2024-09-12
Context window size: 128 k
Description: fast, cost-efficient tailored to coding, math, and science
Per million tokens
Input: $3
Output: $12
Input Batch: $0
Output Batch: $0
Cached input: $1.5
Cached output: $0

Anthropic
$92 /mo
Claude 3.5 model
Variant: Sonnet
Context window size: 200 k
Description: most intelligent model
Per million tokens
Input: $3
Output: $15
Input Batch: $1.5
Output Batch: $7.5
Cached input: $3.75
Cached output: $0.3

OpenAI
$103 /mo
GPT-4o model
Variant: 2024-05-13
Context window size: 128 k
Description: GPT-4o with improved reasoning capabilities
Per million tokens
Input: $5
Output: $15
Input Batch: $2.5
Output Batch: $7.5
Cached input: $0
Cached output: $0

OpenAI
$375 /mo
OpenAI o1-preview model
Variant:
Context window size: 128 k
Description: reasoning model for complex tasks
Per million tokens
Input: $15
Output: $60
Input Batch: $0
Output Batch: $0
Cached input: $7.5
Cached output: $0

OpenAI
$375 /mo
OpenAI o1-preview model
Variant: 2024-09-12
Context window size: 128 k
Description: reasoning model for complex tasks
Per million tokens
Input: $15
Output: $60
Input Batch: $0
Output Batch: $0
Cached input: $7.5
Cached output: $0

Anthropic
$461 /mo
Claude 3 model
Variant: Opus
Context window size: 200 k
Description: Powerful model for complex tasks
Per million tokens
Input: $15
Output: $75
Input Batch: $7.5
Output Batch: $37.5
Cached input: $18.75
Cached output: $1.5

OpenAI
$1500 /mo
GPT-4o model
Variant: audio-preview
Context window size: 128 k
Description: GPT-4o with audio capabilities
Per million tokens
Input: $100
Output: $200
Input Batch: $0
Output Batch: $0
Cached input: $0
Cached output: $0
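
Because every price in this listing is quoted per million tokens, turning a token count into a dollar figure is a single multiplication and division. The sketch below illustrates that arithmetic for the standard Input and Output rates only. The `Price` class, the `token_cost` function, and the sample workload are hypothetical names and numbers chosen for illustration; they are not part of the pricing tool, and the sketch does not try to reproduce the tool's own monthly estimate.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Price:
    """Per-million-token rates in USD, as shown on a card (hypothetical helper)."""
    input_usd: float
    output_usd: float


def token_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of the given token counts at the listed rates."""
    total = input_tokens * price.input_usd + output_tokens * price.output_usd
    return total / TOKENS_PER_MILLION


# Rates copied from the listing: Input and Output per million tokens.
gpt_4o_mini = Price(input_usd=0.15, output_usd=0.6)
claude_35_sonnet = Price(input_usd=3.0, output_usd=15.0)

if __name__ == "__main__":
    # Assumed workload for illustration: 2M input tokens, 0.5M output tokens.
    for name, price in [("GPT-4o mini", gpt_4o_mini),
                        ("Claude 3.5 Sonnet", claude_35_sonnet)]:
        print(f"{name}: ${token_cost(price, 2_000_000, 500_000):.2f}")
```

The same function could be applied to the Batch or Cached rates by building a `Price` from those columns instead, assuming the provider bills those tokens the same way.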

Disclaimer

This is an estimation tool only. The actual bill could be very different. SaaS Pricing is not responsible for any miscalculations. The prices and terms may be wrong and may change. If no region is specified, you can assume it's Frankfurt, Germany.