Which AI model to use? (here's a list and recommendations)

Last updated: 1.1.2025

Are you still using the web version of ChatGPT? You’re missing out on a lot! The question is no longer simply “Will it do what I want?” but “Which model gives me consistent quality at the best price?” That matters because costs are now determined by token usage rather than a flat rate, so every input and output token shows up on the bill.

When comparing AI models from various providers, it’s all about adjusting expectations. We know that models are becoming faster, cheaper, and more powerful. However, larger and more expensive doesn’t always mean better performance. In some cases, a smaller, cheaper model is actually the more effective choice for the task. Here’s an updated overview of current models, their pricing, and performance.

Prices differ considerably from model to model, so choose (and test) your model wisely.


| Model | Parameters | Inference Speed | Reasoning Capability | Input (USD per 1M tokens) | Output (USD per 1M tokens) | Recommended Usage | Source |
|---|---|---|---|---|---|---|---|
| Amazon Nova Micro | 2B | Fast | Moderate | $0.04 | $0.14 | Lightweight applications, mobile devices, cost-sensitive use cases | [1] |
| Gemini 1.5 Flash-8B (≤128k) | 8B | Very Fast | Moderate | $0.04 | $0.15 | Lightweight applications, mobile devices, cost-sensitive use cases | [2] |
| Amazon Nova Lite | 6B | Fast | Moderate | $0.06 | $0.24 | Lightweight applications, mobile devices, cost-sensitive use cases | [1] |
| Gemini 1.5 Flash (≤128k) | 125B | Fast | Moderate | $0.07 | $0.30 | Rapid response applications, simple language tasks, chatbots | [2] |
| Gemini 1.5 Flash-8B (>128k) | 8B | Very Fast | Moderate | $0.07 | $0.30 | Lightweight applications, mobile devices, cost-sensitive use cases | [2] |
| GPT-4o Mini | 6B | Fast | Moderate | $0.15 | $0.60 | Lightweight applications, cost-sensitive use cases | [3] |
| Gemini 1.5 Flash (>128k) | 125B | Fast | Moderate | $0.15 | $0.60 | Rapid response applications, simple language tasks, chatbots | [2] |
| Amazon Nova Pro | 30B | Moderate | High | $0.80 | $3.20 | General-purpose language tasks, complex reasoning | [1] |
| Claude 3.5 Haiku | 175B | Fast | Moderate | $0.80 | $4 | Concise, creative writing | [4] |
| o1-mini | 6B | Fast | Moderate | $3 | $12 | Lightweight applications, cost-sensitive use cases | [3] |
| Gemini 1.5 Pro (≤128k) | 175B | Moderate | High | $1.25 | $5 | General-purpose language tasks, complex reasoning, long-form content generation | [2] |
| GPT-4o | 175B | Moderate | High | $2.50 | $10 | General-purpose language tasks, complex reasoning | [3] |
| Gemini 1.5 Pro (>128k) | 175B | Moderate | High | $2.50 | $10 | General-purpose language tasks, complex reasoning, long-form content generation | [2] |
| Claude 3.5 Sonnet | 175B | Moderate | High | $3 | $15 | Creative writing, poetry generation | [4] |
| o1-preview | 175B | Moderate | High | $15 | $60 | Cutting-edge research, advanced language tasks | [3] |
| Claude 3 Opus | 175B | Moderate | High | $15 | $75 | Comprehensive language tasks, long-form content generation | [4] |

1. Amazon AWS Pricing: https://aws.amazon.com/sagemaker/pricing/
2. Google Gemini Pricing: https://cloud.google.com/gemini/pricing
3. OpenAI Pricing: https://openai.com/pricing
4. Anthropic Pricing: https://www.anthropic.com/pricing
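
Because billing is per token, it helps to turn the per-million prices above into a monthly figure for your own workload. The following is a minimal sketch, not an official calculator: the prices are copied from the table for a handful of models, while the names `Price`, `request_cost`, `gemini_pro_price`, and `monthly_costs` are made up for this illustration. It also assumes that the "≤128k" and ">128k" rows refer to the prompt length in tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices copied from the table above (USD per 1M tokens).
PRICES = {
    "Amazon Nova Micro": Price(0.04, 0.14),
    "GPT-4o Mini": Price(0.15, 0.60),
    "GPT-4o": Price(2.50, 10.00),
    "Claude 3.5 Sonnet": Price(3.00, 15.00),
}

# Gemini 1.5 Pro is listed twice in the table, once per tier.
# Assumption: the tier is selected by prompt length in tokens.
GEMINI_PRO_SHORT = Price(1.25, 5.00)   # prompts of up to 128k tokens
GEMINI_PRO_LONG = Price(2.50, 10.00)   # prompts above 128k tokens
TIER_LIMIT = 128_000


def request_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given per-million prices."""
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


def gemini_pro_price(input_tokens: int) -> Price:
    """Pick the Gemini 1.5 Pro tier for a prompt of this length."""
    return GEMINI_PRO_SHORT if input_tokens <= TIER_LIMIT else GEMINI_PRO_LONG


def monthly_costs(requests: int, input_tokens: int, output_tokens: int) -> dict:
    """Monthly cost per model, cheapest first, for identical requests."""
    costs = {
        name: requests * request_cost(price, input_tokens, output_tokens)
        for name, price in PRICES.items()
    }
    costs["Gemini 1.5 Pro"] = requests * request_cost(
        gemini_pro_price(input_tokens), input_tokens, output_tokens
    )
    return dict(sorted(costs.items(), key=lambda item: item[1]))


if __name__ == "__main__":
    # Hypothetical workload: 100,000 requests per month,
    # each with 2,000 input tokens and 500 output tokens.
    for name, cost in monthly_costs(100_000, 2_000, 500).items():
        print(f"{name:<20} ${cost:>10,.2f}")
```

With the prices listed, that hypothetical workload comes to roughly $15 per month on Amazon Nova Micro and roughly $1,000 on GPT-4o. That gap is why it pays to test whether the smaller model is good enough before defaulting to the larger one.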