Which AI model to use? (here's a list and recommendations)
Last updated: 1.1.2025
Are you still using the web version of ChatGPT? You’re missing out on a lot! In AI usage and application, the question has shifted from “Will it do what I want?” to “Which model gives me consistent quality at the best price?” This is because costs for API access are determined by token usage rather than a flat rate.
When comparing AI models from various providers, it helps to adjust your expectations. Models are becoming faster, cheaper, and more powerful, but a larger, more expensive model doesn’t always perform better. For some tasks, a smaller and less expensive model may actually be more effective. Here’s an updated overview of current models, their pricing, and performance.
As the table below shows, prices differ considerably between models, so choose (and test) your model wisely.
Comparison of Large Language Model specifications, pricing, and recommended usage (sorted by cheapest)
Model | Parameters (unconfirmed) | Inference Speed | Reasoning Capability | Pricing | Recommended Usage | Source |
---|---|---|---|---|---|---|
Amazon Nova Micro | 2B | Fast | Moderate | Input: $0.04, Output: $0.14 per 1M tokens | Lightweight applications, mobile devices, cost-sensitive use cases | [1] |
Gemini 1.5 Flash-8B ≤128k | 8B | Very Fast | Moderate | Input: $0.04, Output: $0.15 per 1M tokens | Lightweight applications, mobile devices, cost-sensitive use cases | [2] |
Amazon Nova Lite | 6B | Fast | Moderate | Input: $0.06, Output: $0.24 per 1M tokens | Lightweight applications, mobile devices, cost-sensitive use cases | [1] |
Gemini 1.5 Flash ≤128k | 125B | Fast | Moderate | Input: $0.07, Output: $0.30 per 1M tokens | Rapid response applications, simple language tasks, chatbots | [2] |
Gemini 1.5 Flash-8B >128k | 8B | Very Fast | Moderate | Input: $0.07, Output: $0.30 per 1M tokens | Lightweight applications, mobile devices, cost-sensitive use cases | [2] |
GPT-4o Mini | 6B | Fast | Moderate | Input: $0.15, Output: $0.60 per 1M tokens | Lightweight applications, cost-sensitive use cases | [3] |
Gemini 1.5 Flash >128k | 125B | Fast | Moderate | Input: $0.15, Output: $0.60 per 1M tokens | Rapid response applications, simple language tasks, chatbots | [2] |
Amazon Nova Pro | 30B | Moderate | High | Input: $0.80, Output: $3.20 per 1M tokens | General-purpose language tasks, complex reasoning | [1] |
Claude 3.5 Haiku | 175B | Fast | Moderate | Input: $0.80, Output: $4 per 1M tokens | Concise, creative writing | [4] |
Gemini 1.5 Pro ≤128k | 175B | Moderate | High | Input: $1.25, Output: $5 per 1M tokens | General-purpose language tasks, complex reasoning, long-form content generation | [2] |
GPT-4o | 175B | Moderate | High | Input: $2.50, Output: $10 per 1M tokens | General-purpose language tasks, complex reasoning | [3] |
Gemini 1.5 Pro >128k | 175B | Moderate | High | Input: $2.50, Output: $10 per 1M tokens | General-purpose language tasks, complex reasoning, long-form content generation | [2] |
o1-mini | 6B | Fast | Moderate | Input: $3, Output: $12 per 1M tokens | Lightweight applications, cost-sensitive use cases | [3] |
Claude 3.5 Sonnet | 175B | Moderate | High | Input: $3, Output: $15 per 1M tokens | Creative writing, poetry generation | [4] |
o1-preview | 175B | Moderate | High | Input: $15, Output: $60 per 1M tokens | Cutting-edge research, advanced language tasks | [3] |
Claude 3 Opus | 175B | Moderate | High | Input: $15, Output: $75 per 1M tokens | Comprehensive language tasks, long-form content generation | [4] |
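
To make the per-token pricing concrete, here is a minimal Python sketch that estimates monthly spend for a few models. The prices are copied from the table above; the workload (2,000 input tokens and 500 output tokens per request, 10,000 requests per month) and the function names are assumptions chosen for illustration, so swap in your own numbers.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Amazon Nova Micro": (0.04, 0.14),
    "GPT-4o Mini": (0.15, 0.60),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "o1-preview": (15.00, 60.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in USD of a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def monthly_costs(input_tokens, output_tokens, requests_per_month):
    """Return {model: monthly cost in USD}, sorted from cheapest to most expensive."""
    costs = {
        model: request_cost(model, input_tokens, output_tokens) * requests_per_month
        for model in PRICES
    }
    return dict(sorted(costs.items(), key=lambda item: item[1]))


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input + 500 output tokens, 10,000 requests/month.
    for model, cost in monthly_costs(2_000, 500, 10_000).items():
        print(f"{model:<20} ${cost:>8.2f}")
```

For this assumed workload, the same traffic costs a few dollars a month on the smallest models and several hundred on the most expensive one, which is why it pays to test whether a cheaper model is good enough for your task.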
Sources
1. Amazon AWS Pricing: https://aws.amazon.com/sagemaker/pricing/
2. Google Gemini Pricing: https://cloud.google.com/gemini/pricing
3. OpenAI Pricing: https://openai.com/pricing
4. Anthropic Pricing: https://www.anthropic.com/pricing