LLM Token Pricing, LLM Tokenomics
20/04/2024 09:08:16
In this post, we summarize token pricing for proprietary and open-source models that are offered as a service. The list does not cover every hosting provider: our directory lists many more GPU and serverless LLM hosts. For more details, please refer to the "LLM Hostings" section.
Proprietary model (LLM) pricing
Token pricing for proprietary LLMs as of April 20, 2024. The Avg column is the simple mean of the input and output prices:
Platform | Model | Input: $/1M Tokens | Output: $/1M Tokens | Avg: $/1M Tokens |
---|---|---|---|---|
OpenAI | GPT 4 Turbo | 10.00 | 30.00 | 20.00 |
OpenAI | GPT 3.5 Turbo | 0.50 | 1.50 | 1.00 |
Cohere | Command-R | 0.50 | 1.50 | 1.00 |
Cohere | Command-Light | 0.30 | 0.60 | 0.45 |
Anthropic | Claude 3 Opus | 15.00 | 75.00 | 45.00 |
Anthropic | Claude 3 Sonnet | 3.00 | 15.00 | 9.00 |
Anthropic | Claude 3 Haiku | 0.25 | 1.25 | 0.75 |
Anthropic | Claude Instant | 0.80 | 2.40 | 1.60 |
No single model is the best value for every workload, since each has advantages in different use cases, so the choice has to be made in the context of the specific business case. That said, the Claude 3 family is worth a close look: Claude 3 Haiku undercuts GPT 3.5 Turbo on both input and output price, and Claude 3 Sonnet costs less than half as much as GPT 4 Turbo on average.
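The Avg column assumes equal numbers of input and output tokens, which few real workloads have. The sketch below, which uses prices copied from the table above, shows one way to estimate the cost of a request and the effective price for a given input/output mix. The function names `request_cost` and `blended_price` are illustrative and do not come from any provider SDK.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "GPT 4 Turbo": (10.00, 30.00),
    "GPT 3.5 Turbo": (0.50, 1.50),
    "Claude 3 Opus": (15.00, 75.00),
    "Claude 3 Sonnet": (3.00, 15.00),
    "Claude 3 Haiku": (0.25, 1.25),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request with the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def blended_price(model, input_share):
    """Return the effective $/1M tokens when `input_share` (0..1) of all tokens are input."""
    input_price, output_price = PRICES[model]
    return input_share * input_price + (1 - input_share) * output_price


if __name__ == "__main__":
    # Hypothetical workload: prompts are nine times longer than completions.
    for model in PRICES:
        print(f"{model}: ${blended_price(model, 0.9):.2f} per 1M tokens")
```

With a 90% input share, GPT 4 Turbo works out to roughly $12 per 1M tokens rather than the $20 shown in the Avg column, so a prompt-heavy workload such as retrieval-augmented generation can be noticeably cheaper than the average suggests.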
Open-source serverless model (LLM) pricing
Token pricing for open-source LLMs hosted as serverless models, as of April 20, 2024:
Model | Platform | Input: $/1M Tokens | Output: $/1M Tokens | Avg: $/1M Tokens |
---|---|---|---|---|
Mistral 7B | DeepInfra | 0.10 | 0.10 | 0.10 |
Mistral 7B | Anyscale | 0.15 | 0.15 | 0.15 |
Mistral 7B | OctoAI | 0.10 | 0.25 | 0.18 |
Mistral 7B | Fireworks.ai | 0.20 | 0.20 | 0.20 |
Mistral 7B | Together.ai | 0.20 | 0.20 | 0.20 |
Mistral 7B | Mistral | 0.25 | 0.25 | 0.25 |
Mixtral 8x7B | DeepInfra | 0.27 | 0.27 | 0.27 |
Mixtral 8x7B | OctoAI | 0.30 | 0.50 | 0.40 |
Mixtral 8x7B | Anyscale | 0.50 | 0.50 | 0.50 |
Mixtral 8x7B | Fireworks.ai | 0.50 | 0.50 | 0.50 |
Mixtral 8x7B | Together.ai | 0.60 | 0.60 | 0.60 |
Mixtral 8x7B | Mistral | 0.70 | 0.70 | 0.70 |
Mixtral 8x22B | DeepInfra | 0.65 | 0.65 | 0.65 |
Mixtral 8x22B | Together.ai | 1.20 | 1.20 | 1.20 |
Mixtral 8x22B | Mistral | 2.00 | 6.00 | 4.00 |
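Because the same open-source model is sold by several platforms, it can help to pick out the cheapest host per model programmatically. The following is a minimal sketch over the rows of the table above; `cheapest_by_model` is a hypothetical helper, and it ranks by the simple input/output average, which may not match your actual token mix.

```python
# (model, platform, input $/1M, output $/1M), copied from the table above.
ROWS = [
    ("Mistral 7B", "DeepInfra", 0.10, 0.10),
    ("Mistral 7B", "Anyscale", 0.15, 0.15),
    ("Mistral 7B", "OctoAI", 0.10, 0.25),
    ("Mistral 7B", "Fireworks.ai", 0.20, 0.20),
    ("Mistral 7B", "Together.ai", 0.20, 0.20),
    ("Mistral 7B", "Mistral", 0.25, 0.25),
    ("Mixtral 8x7B", "DeepInfra", 0.27, 0.27),
    ("Mixtral 8x7B", "OctoAI", 0.30, 0.50),
    ("Mixtral 8x7B", "Anyscale", 0.50, 0.50),
    ("Mixtral 8x7B", "Fireworks.ai", 0.50, 0.50),
    ("Mixtral 8x7B", "Together.ai", 0.60, 0.60),
    ("Mixtral 8x7B", "Mistral", 0.70, 0.70),
    ("Mixtral 8x22B", "DeepInfra", 0.65, 0.65),
    ("Mixtral 8x22B", "Together.ai", 1.20, 1.20),
    ("Mixtral 8x22B", "Mistral", 2.00, 6.00),
]


def cheapest_by_model(rows):
    """Map each model to the (platform, average price) with the lowest average."""
    best = {}
    for model, platform, input_price, output_price in rows:
        average = (input_price + output_price) / 2
        # Keep the first platform seen on ties.
        if model not in best or average < best[model][1]:
            best[model] = (platform, average)
    return best


if __name__ == "__main__":
    for model, (platform, average) in cheapest_by_model(ROWS).items():
        print(f"{model}: {platform} at ${average:.2f} per 1M tokens")
```

Price is only one axis: throughput, rate limits, and context length can differ between hosts of the same model, and none of those are captured here.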
Long-context window embeddings
The most recent embedding pricing for the long-context window models as of April 20, 2024:
Platform | Model | Context Window (tokens) | $/1M Tokens |
---|---|---|---|
Together.ai | m2-bert-80M-32k-retrieval | 32768 | 0.01 |
Together.ai | m2-bert-80M-8k-retrieval | 8192 | 0.01 |
Fireworks.ai | nomic-embed-text-v1.5 | 8192 | 0.01 |
OpenAI | text-embedding-3-small | 8192 | 0.02 |
Jina AI | jina-embeddings-v2-base-en | 8192 | 0.02 |
Mistral | mistral-embed | 8192 | 0.10 |
Nomic | nomic-embed-text-v1.5 | 8192 | 0.10 |
Voyage AI | voyage-2 | 4000 | 0.10 |
Voyage AI | voyage-large-2 | 16000 | 0.12 |
OpenAI | text-embedding-3-large | 8192 | 0.13 |
Short-context window embeddings
Platform | Model | Context Window (tokens) | $/1M Tokens |
---|---|---|---|
Together.ai | google/bert-base-uncased | 512 | 0.01 |
DeepInfra | baai/bge-large-en-v1.5 | 512 | 0.01 |
DeepInfra | thenlper/gte-large | 512 | 0.01 |
Together.ai | baai/bge-large-en-v1.5 | 512 | 0.02 |
Fireworks.ai | WhereIsAI/uae-large-v1 | 512 | 0.02 |
Fireworks.ai | thenlper/gte-large | 512 | 0.02 |
Anyscale | baai/bge-large-en-v1.5 | 512 | 0.05 |
Anyscale | thenlper/gte-large | 512 | 0.05 |
OctoAI | thenlper-gte-large | 512 | 0.05 |
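For embeddings, the context window determines how many chunks a document has to be split into, while the per-token price determines what the whole corpus costs. The sketch below illustrates both calculations. The helper names and the corpus figures are assumptions chosen for illustration, and the chunk count ignores overlap between chunks.

```python
import math


def embedding_cost(total_tokens, price_per_million):
    """Return the USD cost of embedding `total_tokens` at the given $/1M price."""
    return total_tokens / 1_000_000 * price_per_million


def chunks_needed(doc_tokens, context_window):
    """Return how many non-overlapping chunks a document needs to fit the window."""
    return math.ceil(doc_tokens / context_window)


if __name__ == "__main__":
    # Hypothetical corpus: 5,000 documents of 20,000 tokens each.
    docs, doc_tokens = 5_000, 20_000
    total = docs * doc_tokens

    # (label, context window in tokens, $/1M tokens), taken from the tables above.
    options = [
        ("m2-bert-80M-32k-retrieval", 32768, 0.01),
        ("text-embedding-3-small", 8192, 0.02),
        ("baai/bge-large-en-v1.5 on DeepInfra", 512, 0.01),
    ]
    for label, window, price in options:
        chunks = docs * chunks_needed(doc_tokens, window)
        print(f"{label}: {chunks} chunks, ${embedding_cost(total, price):.2f}")
```

The token cost is the same regardless of chunking, but a 512-token window produces many more vectors to store and search than a 32k window, which affects downstream vector database cost.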
Free credits for API use
Platform | Free Credit ($) |
---|---|
Cohere | 75 |
Nomic | 50 |
Together.ai | 25 |
Anyscale | 100 |
OctoAI | 10 |
OpenAI | 10 |
Voyage AI | 5 |
Anthropic | 5 |
DeepInfra | 1.8 |
Fireworks.ai | Free for 2 weeks |