LLM Articles
Keep up to date with the latest techniques, tools, and research in Large Language Models. Our blog covers data science, use cases, and responsible AI practices.
Claude Fable 5 vs GPT-5.5: Benchmarks, Pricing, and Which to Choose
Claude Fable 5 leads on raw capability benchmarks, but GPT-5.5 wins on access, pricing, and fewer classifier interruptions. Here's how to choose.
Tom Farnschläder
June 10, 2026
Claude Mythos 5: Features, Benchmarks, and What It Can Do
Anthropic's most capable model yet, Claude Mythos 5 brings Mythos-class AI to cybersecurity, drug design, and scientific research with the safeguards lifted for trusted partners.
Tom Farnschläder
June 9, 2026
Claude Opus 4.8 vs Gemini 3.5 Flash: Benchmarks and Use Cases Compared
Compare Claude Opus 4.8 and Gemini 3.5 Flash on MCP Atlas, SWE-bench Pro, and GDPval benchmarks, plus pricing and speed, to find the right model for your work.
Derrick Mwiti
June 9, 2026
Codex vs Cursor: Delegate or Collaborate?
Codex runs fire-and-forget agents in cloud sandboxes; Cursor gives you real-time control in a VS Code-based IDE. Compare agents, models, pricing, and workflows.
Srujana Maddula
June 1, 2026
Claude Opus 4.8 vs GPT-5.5: Benchmarks, Tests, and Which to Choose
A head-to-head comparison of Anthropic's Claude Opus 4.8 and OpenAI's GPT-5.5 across coding, reasoning, agentic tasks, and pricing.
Tom Farnschläder
June 1, 2026
Gemini 3.5 Flash vs GPT-5.5: The Multitool and the Sledgehammer
One model is built for versatile tool-calling at scale; the other brute-forces the hardest reasoning problems. Compare Google's Gemini 3.5 Flash and OpenAI's GPT-5.5 across coding, agentic workflows, multimodal tasks, and pricing.
Tom Farnschläder
May 26, 2026
Gemini 3.5 Flash vs Claude Opus 4.7: The Sprinter and the Surgeon
Google's speed-optimized Flash model takes on Anthropic's deep-coding flagship across agentic workflows, reasoning, multimodal tasks, and pricing.
Tom Farnschläder
May 25, 2026
Composer 2.5: Benchmarks, Pricing, and How It Compares
Cursor's latest proprietary model, Composer 2.5, adds targeted RL feedback, more synthetic training tasks, and lower token pricing than frontier models.
Khalid Abdelaty
May 22, 2026
AI Learning Roadmap 2026: The Best Resources for Beginners
A structured AI learning roadmap with the best courses and resources for learning AI from scratch, from Python basics to LLMs and agentic AI.
Matt Crabtree
May 13, 2026
Interaction Models: What TML-Interaction-Small Gets Right
Mira Murati's Thinking Machines Lab built a model that listens and talks at the same time. We break down the features and benchmark it against GPT-Realtime-2.
Tom Farnschläder
May 13, 2026
SubQ AI Explained: How Good Is the 12M Context Window LLM?
Subquadratic's SubQ model claims a 12M-token context window, 52x efficiency, and frontier performance. Here's how its SSA architecture works and what the benchmarks actually say.
Srujana Maddula
May 12, 2026
GPT-5.5 vs Gemini 3.1 Pro: Which Frontier Model Should You Use?
Compare OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro on coding, reasoning, agentic benchmarks, pricing, and context limits to help choose the right model.
Derrick Mwiti
May 11, 2026