What's My AI runtime fit page, built from tracked setup paths and model-fit coverage.

best model

Best local models for Ollama

Best for the quickest path from a benchmark result to a real local run. This page ranks the cleanest starting points first, then links into the model pages when you need exact memory and hardware guidance.

starter pick: Phi-4-reasoning
tracked models: 11
tier span: 3B to Frontier MoE
runtime tradeoff: You get less low-level tuning freedom than llama.cpp and fewer curation cues than LM Studio.

benchmark first

Benchmark before you commit to an Ollama download.

Ollama trades away low-level tuning freedom (llama.cpp's strength) and curation cues (LM Studio's strength) in exchange for the fastest first success. The benchmark is still the quickest way to confirm whether this machine sits in the size band that makes Ollama feel worth using.
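
If you want a rough offline sanity check before opening the benchmark, comparing this machine's RAM to the minimums listed on this page catches the obvious mismatches. A minimal Python sketch, assuming the third-party psutil package is installed; the short model labels are hypothetical shorthand, the GB figures are the minimums from the catalog below, and the 2 GB headroom is a rough allowance rather than a measured number:

    # Rough fit check: compare detected RAM to this page's "GB minimum" figures.
    # Requires the third-party psutil package (pip install psutil).
    import psutil

    MINIMUM_GB = {                # minimums from the catalog cards on this page
        "granite4-micro": 2.5,
        "olmo3-7b": 5.0,
        "llama3.1-8b": 6.5,
        "gemma3-12b": 11.0,
        "gpt-oss-20b": 15.5,
        "olmo3.1-32b": 19.5,
    }

    total_gb = psutil.virtual_memory().total / 1e9
    for name, need in sorted(MINIMUM_GB.items(), key=lambda kv: kv[1]):
        # ~2 GB headroom is an assumed allowance for the OS and KV cache
        verdict = "fits" if total_gb >= need + 2 else "tight / skip"
        print(f"{name:<16} needs {need:>5.1f} GB -> {verdict}")
    print(f"detected RAM: {total_gb:.1f} GB")

The benchmark remains the authoritative check; this only rules out models that cannot plausibly fit.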

start here

Best first models for Ollama

gpt-oss-20b

34B class • 15.5 GB minimum

Official Ollama library entry with a native tag and published downloads.
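
Once the tag is pulled (ollama pull gpt-oss:20b in a terminal, per the library entry above), one non-streaming request against Ollama's local HTTP API proves the first run. A minimal stdlib-only Python sketch; gpt-oss:20b is assumed to be the native tag the card refers to, and 11434 is Ollama's default local port:

    # One proven local inference against the default Ollama endpoint.
    # Assumes the Ollama daemon is running and gpt-oss:20b is already pulled.
    import json
    import urllib.request

    payload = json.dumps({
        "model": "gpt-oss:20b",
        "prompt": "Reply with the single word: ready",
        "stream": False,          # one JSON object back instead of a stream
    }).encode("utf-8")

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)

    print(body["response"])       # the model's completion text

If this prints a reply, the benchmark-to-run path is confirmed end to end on your machine.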

why this runtime

What you are choosing with Ollama

  • Best for: People who want the fastest first success with a tracked local model tag.
  • Tradeoff: You get less low-level tuning freedom than llama.cpp and fewer curation cues than LM Studio.
  • Benchmark flow: Use the benchmark first when the question is about your machine, then use this page to choose the cleanest first pull inside Ollama (a local-inventory check is sketched just below this list).
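
For that last step, a quick inventory of what is already on the machine helps you pick the cleanest first tag rather than re-pulling something you have. A minimal stdlib-only Python sketch against Ollama's local tag listing (GET /api/tags on the default port):

    # List the model tags already pulled locally, with their on-disk size.
    import json
    import urllib.request

    with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
        tags = json.load(resp)

    for model in tags.get("models", []):
        size_gb = model["size"] / 1e9
        print(f'{model["name"]:<28} {size_gb:5.1f} GB on disk')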

broader catalog

More tracked models for Ollama

Granite 4.0 Micro

3B class • 2.5 GB minimum

IBM's smallest Granite 4.0 instruct release is a pragmatic US-origin starter for local chat, extraction, and agent scaffolding.

Open model page

OLMo 3 Instruct 7B

7B class • 5.0 GB minimum

Ai2's 7B instruct release is the clearest Apache-licensed American alternative to Llama when you want a smaller fully open local model.

Open model page

Llama 3.1 8B

7B class • 6.5 GB minimum

Meta's 8B instruct release remains the safest broad-compatibility US local model when you want maximum runtime coverage.

Open model page

Gemma 3 12B

13B class • 11.0 GB minimum

Gemma 3 12B stays interesting when you want a smaller multimodal American model, but it is less turnkey than Phi-4-reasoning for plain text work.

Open model page

OLMo 3.1 Instruct 32B

34B class • 19.5 GB minimum

OLMo 3.1 32B is the strongest Apache-licensed American 32B-class option, but it asks for more memory than gpt-oss-20b to reach a clean first run.

Open model page

Granite 4.0 H-Small

34B class • 19.5 GB minimum

Granite 4.0 H-Small is a credible American midrange choice for RAG-heavy work, but it is more specialized than the general-purpose winners above it.

Open model page

runtime smoke

Monthly runtime smoke matrix

Each row installs or updates the tracked runtime, downloads the starter model, and proves one local inference with the pinned prompt bundle.

These rows use hosted CPU runners so stale guidance is visible before the public install copy drifts too far from reality.
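
As a sketch of what one row does: record the runtime version, pull the starter model, and prove a single inference. A minimal Python version under stated assumptions: granite4:micro is an assumed library tag for Granite 4.0 Micro, and the prompt is a stand-in for the pinned bundle, whose contents are not reproduced here:

    # Minimal analogue of one smoke row: version check, model pull, one inference.
    import json
    import subprocess
    import urllib.request

    MODEL = "granite4:micro"     # ASSUMED library tag for Granite 4.0 Micro
    PROMPT = "Reply with the single word: ready"   # stand-in for the pinned bundle

    # Record the runtime version the row would report as "Tested version".
    version = subprocess.run(
        ["ollama", "--version"], capture_output=True, text=True, check=True
    ).stdout.strip()

    # Download (or refresh) the starter model.
    subprocess.run(["ollama", "pull", MODEL], check=True)

    # Prove one local inference against the default endpoint.
    payload = json.dumps({"model": MODEL, "prompt": PROMPT, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)["response"]

    print(f"tested version: {version}")
    print(f"smoke reply: {reply.strip()}")

A row passes when the pull succeeds and the reply comes back non-empty; anything else marks the guidance stale.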

Ollama

Runtime guidance currently needs review

Last verified: Not yet verified

Tested runtime version: Not yet verified

Monthly smoke cadence (31-day review window)

Prompt bundle: 2026.03-reference-lm-prompts-v1

Linux

GitHub-hosted Ubuntu x64 CPU runner

Install recipe: Install the latest standalone Ollama CLI asset for Linux before each run.

Last verified: Not yet verified

Tested version: Not yet verified

Model pull: Granite 4.0 Micro

Stale: No successful monthly smoke run recorded yet.

macOS

GitHub-hosted macOS CPU runner

Install recipe: Install the latest standalone Ollama CLI asset for macOS before each run.

Last verified: Not yet verified

Tested version: Not yet verified

Model pull: Granite 4.0 Micro

Stale: No successful monthly smoke run recorded yet.

Windows

GitHub-hosted Windows x64 CPU runner

Install recipe: Install the latest standalone Ollama CLI asset for Windows before each run.

Last verified: Not yet verified

Tested version: Not yet verified

Model pull: Granite 4.0 Micro

Stale: No successful monthly smoke run recorded yet.