LLM Token Pricing, LLM Tokenomics


LLM Hostings

This post summarizes token pricing for both proprietary and open-source models that are offered as a service. The list does not cover every hosting provider; our directory lists many more GPU and serverless LLM hosts. For details, see the "LLM Hostings" section.

 

Proprietary model (LLM) pricing

The most recent LLM token pricing for the proprietary LLMs as of April 20, 2024:

| Platform | Model | Input ($/1M tokens) | Output ($/1M tokens) | Avg ($/1M tokens) |
|---|---|---|---|---|
| OpenAI | GPT 4 Turbo | 10.00 | 30.00 | 20.00 |
| OpenAI | GPT 3.5 Turbo | 0.50 | 1.50 | 1.00 |
| Cohere | Command-R | 0.50 | 1.50 | 1.00 |
| Cohere | Command-Light | 0.30 | 0.60 | 0.45 |
| Anthropic | Claude 3 Opus | 15.00 | 75.00 | 45.00 |
| Anthropic | Claude 3 Sonnet | 3.00 | 15.00 | 9.00 |
| Anthropic | Claude 3 Haiku | 0.25 | 1.25 | 0.75 |
| Anthropic | Claude Instant | 0.80 | 2.40 | 1.60 |

It is hard to name a single best model on price alone, since each has advantages in different use cases, so the choice has to be made in the context of the specific business case. That said, the Claude 3 family is worth considering: Haiku has the second-lowest average price in the table ($0.75 per 1M tokens), and Sonnet ($9.00) costs less than half as much as GPT 4 Turbo ($20.00).
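The Avg column in the table above is the simple mean of the input and output prices, which implicitly assumes equal input and output volume. For a specific business case, a per-workload estimate is usually more informative. The following is a minimal sketch using prices copied from the table; the function name `workload_cost` and the example token counts are illustrative assumptions, not part of any provider's API.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above
# (as of April 20, 2024).
PRICES = {
    "GPT 4 Turbo": (10.00, 30.00),
    "Claude 3 Sonnet": (3.00, 15.00),
    "Claude 3 Haiku": (0.25, 1.25),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload for one model (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 8M input tokens and 2M output tokens (a 4:1 mix).
for name in PRICES:
    print(f"{name}: ${workload_cost(name, 8_000_000, 2_000_000):.2f}")
```

With this input-heavy mix, GPT 4 Turbo costs $140.00 for 10M tokens, an effective blended price of $14.00 per 1M tokens rather than the $20.00 shown in the Avg column.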

 

Open-source serverless model (LLM) pricing 

The most recent LLM token pricing for the open-source LLMs hosted as serverless models as of April 20, 2024:

| Model | Platform | Input ($/1M tokens) | Output ($/1M tokens) | Avg ($/1M tokens) |
|---|---|---|---|---|
| Mistral 7B | DeepInfra | 0.10 | 0.10 | 0.10 |
| Mistral 7B | Anyscale | 0.15 | 0.15 | 0.15 |
| Mistral 7B | OctoAI | 0.10 | 0.25 | 0.18 |
| Mistral 7B | Fireworks.ai | 0.20 | 0.20 | 0.20 |
| Mistral 7B | Together.ai | 0.20 | 0.20 | 0.20 |
| Mistral 7B | Mistral | 0.25 | 0.25 | 0.25 |
| Mixtral 8x7B | DeepInfra | 0.27 | 0.27 | 0.27 |
| Mixtral 8x7B | OctoAI | 0.30 | 0.50 | 0.40 |
| Mixtral 8x7B | Anyscale | 0.50 | 0.50 | 0.50 |
| Mixtral 8x7B | Fireworks.ai | 0.50 | 0.50 | 0.50 |
| Mixtral 8x7B | Together.ai | 0.60 | 0.60 | 0.60 |
| Mixtral 8x7B | Mistral | 0.70 | 0.70 | 0.70 |
| Mixtral 8x22B | DeepInfra | 0.65 | 0.65 | 0.65 |
| Mixtral 8x22B | Together.ai | 1.20 | 1.20 | 1.20 |
| Mixtral 8x22B | Mistral | 2.00 | 6.00 | 4.00 |
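Because some hosts charge different rates for input and output tokens, ranking hosts by the Avg column can differ from ranking them by the cost of an actual workload. The sketch below ranks the Mistral 7B hosts from the table for a given token mix; `rank_hosts` and the example workload are hypothetical names and numbers chosen for illustration.

```python
# (model, platform, input $/1M tokens, output $/1M tokens), copied from the
# table above (as of April 20, 2024).
OFFERS = [
    ("Mistral 7B", "DeepInfra", 0.10, 0.10),
    ("Mistral 7B", "Anyscale", 0.15, 0.15),
    ("Mistral 7B", "OctoAI", 0.10, 0.25),
    ("Mistral 7B", "Fireworks.ai", 0.20, 0.20),
    ("Mistral 7B", "Together.ai", 0.20, 0.20),
    ("Mistral 7B", "Mistral", 0.25, 0.25),
]


def rank_hosts(model, input_tokens, output_tokens):
    """Return (platform, cost_usd) pairs for one model, cheapest first."""
    costs = [
        (platform, (input_tokens * inp + output_tokens * out) / 1_000_000)
        for offer_model, platform, inp, out in OFFERS
        if offer_model == model
    ]
    return sorted(costs, key=lambda pair: pair[1])


# Hypothetical input-heavy workload: 9M input tokens, 1M output tokens.
for platform, cost in rank_hosts("Mistral 7B", 9_000_000, 1_000_000):
    print(f"{platform}: ${cost:.2f}")
```

For this workload OctoAI ($1.15) comes out cheaper than Anyscale ($1.50), even though its Avg price in the table is higher.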

 

Long-context window embeddings

The most recent embedding pricing for the long-context window models as of April 20, 2024:

| Platform | Model | Context window (tokens) | Price ($/1M tokens) |
|---|---|---|---|
| Together.ai | m2-bert-80M-32k-retrieval | 32768 | 0.01 |
| Together.ai | m2-bert-80M-8k-retrieval | 8192 | 0.01 |
| Fireworks.ai | nomic-embed-text-v1.5 | 8192 | 0.01 |
| OpenAI | text-embedding-3-small | 8192 | 0.02 |
| Jina AI | jina-embeddings-v2-base-en | 8192 | 0.02 |
| Mistral | mistral-embed | 8192 | 0.10 |
| Nomic | nomic-embed-text-v1.5 | 8192 | 0.10 |
| Voyage AI | voyage-2 | 4000 | 0.10 |
| Voyage AI | voyage-large-2 | 16000 | 0.12 |
| OpenAI | text-embedding-3-large | 8192 | 0.13 |
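Embedding cost scales with the total number of tokens embedded, while the context window caps how many tokens fit in a single input. The sketch below estimates both for a corpus, using three rows copied from the table. The `EmbeddingModel` class, the helper names, and the 500M-token corpus are assumptions made for illustration; `min_chunks` is only a lower bound, since real chunking rarely fills the whole window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingModel:
    platform: str
    name: str
    context_window: int  # maximum tokens per input
    price_per_1m: float  # USD per 1M tokens


# Rows copied from the table above (as of April 20, 2024).
MODELS = [
    EmbeddingModel("Together.ai", "m2-bert-80M-32k-retrieval", 32768, 0.01),
    EmbeddingModel("OpenAI", "text-embedding-3-small", 8192, 0.02),
    EmbeddingModel("OpenAI", "text-embedding-3-large", 8192, 0.13),
]


def embedding_cost(model: EmbeddingModel, corpus_tokens: int) -> float:
    """USD cost of embedding the whole corpus once."""
    return corpus_tokens / 1_000_000 * model.price_per_1m


def min_chunks(model: EmbeddingModel, corpus_tokens: int) -> int:
    """Lower bound on inputs needed if every chunk fills the context window."""
    return -(-corpus_tokens // model.context_window)  # integer ceiling division


CORPUS_TOKENS = 500_000_000  # hypothetical corpus size

for m in MODELS:
    print(
        f"{m.platform} {m.name}: "
        f"${embedding_cost(m, CORPUS_TOKENS):.2f}, "
        f"at least {min_chunks(m, CORPUS_TOKENS)} inputs"
    )
```

At this corpus size, the spread between the cheapest and most expensive rows in the table is a factor of 13, so the per-token price matters more for one-off bulk indexing than it might appear from the small absolute numbers.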

 

Short-context window embeddings

| Platform | Model | Context window (tokens) | Price ($/1M tokens) |
|---|---|---|---|
| Together.ai | google/bert-base-uncased | 512 | 0.01 |
| DeepInfra | baai/bge-large-en-v1.5 | 512 | 0.01 |
| DeepInfra | thenlper/gte-large | 512 | 0.01 |
| Together.ai | baai/bge-large-en-v1.5 | 512 | 0.02 |
| Fireworks.ai | WhereIsAI/uae-large-v1 | 512 | 0.02 |
| Fireworks.ai | thenlper/gte-large | 512 | 0.02 |
| Anyscale | baai/bge-large-en-v1.5 | 512 | 0.05 |
| Anyscale | thenlper/gte-large | 512 | 0.05 |
| OctoAI | thenlper-gte-large | 512 | 0.05 |

 

Free credits for API use

| Platform | Free credit ($) |
|---|---|
| Cohere | 75 |
| Nomic | 50 |
| Together.ai | 25 |
| Anyscale | 100 |
| OctoAI | 10 |
| OpenAI | 10 |
| Voyage AI | 5 |
| Anthropic | 5 |
| DeepInfra | 1.8 |
| Fireworks.ai | Free for 2 weeks |
Original data from HuggingFace, OpenCompass and various public git repos.
Release v2024072803