start here
Best first models for this runtime decision
Granite 4.0 Micro
3B class • 2.5 GB minimum
IBM's smallest Granite 4.0 instruct release is a pragmatic US-origin starter for local chat, extraction, and agent scaffolding.
OLMo 3 Instruct 7B
7B class • 5.0 GB minimum
Ai2's 7B instruct release is the clearest Apache-licensed American alternative to Llama when you want a smaller fully open local model.
Llama 3.1 8B
7B class • 6.5 GB minimum
Meta's 8B instruct release remains the safest US-origin pick when you want a local model that the widest range of runtimes supports.
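As a rough illustration of how the memory minimums above might drive a first pick, here is a minimal Python sketch. It assumes the "GB minimum" figures are the memory each model needs to load, and it uses a simple heuristic (take the model with the largest minimum that still fits) that is an assumption of this example, not a rule stated by the cards. The names `Model`, `STARTERS`, and `largest_fit` are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Model(NamedTuple):
    name: str
    min_gb: float  # "GB minimum" figure copied from the cards above


# The three starter models listed above, with their stated minimums.
STARTERS = [
    Model("Granite 4.0 Micro", 2.5),
    Model("OLMo 3 Instruct 7B", 5.0),
    Model("Llama 3.1 8B", 6.5),
]


def largest_fit(free_gb: float, models: Sequence[Model] = STARTERS) -> Optional[Model]:
    """Return the model with the largest stated minimum that fits in free_gb.

    Heuristic only (an assumption of this sketch): a higher minimum is
    treated as a proxy for a bigger model. Returns None if nothing fits.
    """
    fitting = [m for m in models if m.min_gb <= free_gb]
    return max(fitting, key=lambda m: m.min_gb) if fitting else None


if __name__ == "__main__":
    for budget in (2.0, 4.0, 6.0, 8.0):
        pick = largest_fit(budget)
        print(budget, "GB:", pick.name if pick else "nothing fits")
```

Memory is only one input to the decision: licensing and runtime support, which the card descriptions emphasize, may matter more than size for a given setup.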