Which model should the 12 GB desktop VRAM class start with?

Run the benchmark first, then start with the lightest model that still matches the resulting hardware band.

Why benchmark if this page already gives a tier?

Because reference device, GPU, and laptop pages still compress a wide range of real machines. The benchmark keeps the answer specific before you download the wrong model.

GPU cohort

What local AI can I run on an RTX 4070 Super desktop GPU?


This GPU cohort currently maps to a 70B local-model target. The benchmark is still the fastest way to verify whether your exact machine clears that band.

This page is for desktop builders who want a practical large-model answer before choosing the rest of the machine. It gives the short local-AI answer for the search query, then routes uncertain cases into the benchmark before the recommendation turns into guesswork.

GPU cohort: 12 GB desktop VRAM class

Starter model: Benchmark first

Tier guide: 70B

Publish wave: Week 1

benchmark first

Verify this class before you trust the reference answer.

The RTX 4070 Super is an attractive local-AI card, but a parts list is only an assumption. The benchmark turns this cohort answer into a browser-tested, machine-specific decision before you spend time downloading models that do not fit.
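A rough way to see why fit matters before you download anything is to estimate the size of the weights from the parameter count and the quantization width. The sketch below is an illustration, not the benchmark's method: the function names, the 4.5 bits-per-weight figure (an assumed average for a 4-bit quantization including its metadata), and the 1.5 GB overhead allowance for context cache and runtime buffers are all assumptions chosen for the example.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes).

    params_billions * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9.
    """
    return params_billions * bits_per_weight / 8


def fits_in_vram(
    params_billions: float,
    bits_per_weight: float,
    vram_gb: float,
    overhead_gb: float = 1.5,  # assumed allowance, not a measured value
) -> bool:
    """True if the weights plus the assumed overhead fit entirely in VRAM."""
    needed = estimate_weight_gb(params_billions, bits_per_weight) + overhead_gb
    return needed <= vram_gb


# Compare a few common size classes against a 12 GB card.
# 4.5 bits per weight is an assumed figure for a 4-bit quantization.
for size in (7, 13, 34, 70):
    weights = estimate_weight_gb(size, 4.5)
    verdict = "fits" if fits_in_vram(size, 4.5, 12.0) else "does not fit"
    print(f"{size}B: ~{weights:.1f} GB of weights, {verdict} fully in 12 GB VRAM")
```

Under these assumptions, a model that does not fit fully in VRAM may still run with part of it held in system memory, usually at lower speed. That trade-off depends on the rest of the machine, which is why the estimate is only a starting point and the benchmark is still the check that counts.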


starter models

Best first models for this cohort

why this page ranks

Query evidence and benchmark path

Organic traffic signal: High demand for the query cluster around “rtx 4070 super local ai”.
Search intent review: GPU cohort searches are high intent when the page explains model size, memory headroom, and when to benchmark anyway.
Benchmark completion potential: High. These searches are close enough to a hardware decision that a benchmark CTA consistently belongs above the fold.

runtime paths

Pick the runtime after you confirm the hardware band

The benchmark decides the size band first. The runtime pages then tell you which download path is the cleanest first move inside Ollama, LM Studio, or llama.cpp.

P0 • Static

Runtime page

Best local models for Ollama

Search intent: ollama best model

Best for the quickest path from benchmark result to a real local run.

Runtime guide + catalog coverage


P1 • Static

Runtime page

Best local models for llama.cpp

Search intent: llama.cpp best model

Best for people who care about low-level control, serving flags, and GGUF tuning.

Runtime guide + catalog coverage


P1 • Static

Runtime page

Best local models for LM Studio

Search intent: lm studio best model

Best for people who want a graphical model browser and easy GGUF pulls.

Runtime guide + catalog coverage


nearby pages

Adjacent queries to open next

P0 • Static

GPU cohort

What local AI can I run on an Apple M2 GPU class machine?

Search intent: apple m2 gpu local ai

13B-class Apple GPU cohort page for searchers who start with the chip family, not the laptop model.

GPU cohort • launch week 1


P0 • Static

GPU cohort

What local AI can I run on an Apple M4 Max GPU class machine?

Search intent: apple m4 max gpu local ai

120B-class Apple GPU cohort page for people who search the chip family before they choose a studio desktop or pro laptop.

GPU cohort • launch week 1


P0 • Static

GPU cohort

What local AI can I run on an RTX 4060 laptop GPU?

Search intent: rtx 4060 laptop local ai

34B-class NVIDIA laptop-GPU page for the most common midrange Windows local-AI search cluster.

GPU cohort • launch week 1
