What's My AILanding page built from search-intent evidence, reference assumptions, and a benchmark-first path.

GPU cohort

What local AI can I run on an Apple M2 GPU-class machine?

This page is for people who know which Apple silicon generation they have but still need a practical local-model ceiling. It gives the short local-AI answer for the search query, then routes uncertain cases into the benchmark before the recommendation turns into guesswork.

gpu cohort: Apple integrated GPU + 16 GB class memory
starter model: Phi-4-reasoning
tier guide: 13B
publish wave: Week 1

benchmark first

Verify this class before you trust the reference answer.

Apple GPU names simplify the search, but the benchmark still verifies whether the exact browser-visible machine clears the target band. The benchmark is what turns this cohort answer into a machine-specific decision before you spend time downloading models that do not fit.
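As a rough illustration of what the benchmark checks, here is a minimal Python sketch of the memory arithmetic. Every constant is an illustrative assumption, not a measured value: a quantized model needs roughly params × bits ÷ 8 bytes for weights, plus KV-cache and runtime overhead, and the total has to fit inside whatever unified memory the OS leaves free.

# Rough memory-fit arithmetic; all constants are illustrative assumptions.
def estimated_model_gb(params_billions: float, quant_bits: float = 4.5,
                       overhead_gb: float = 1.5) -> float:
    """Approximate resident size: quantized weights plus KV cache and runtime overhead."""
    weights_gb = params_billions * quant_bits / 8  # bytes per parameter at this quant level
    return weights_gb + overhead_gb

def fits(total_memory_gb: float, model_gb: float, os_headroom_gb: float = 6.0) -> bool:
    """Unified memory is shared with the OS, so keep headroom for it and your apps."""
    return model_gb <= total_memory_gb - os_headroom_gb

# A 13B-class model at ~4.5 bits per weight on a 16 GB M2 machine:
model_gb = estimated_model_gb(13)  # ~8.8 GB
print(fits(16.0, model_gb))        # True, but with little slack left

This is exactly the kind of call the benchmark replaces with a measurement instead of an estimate.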

starter models

Best first models for this cohort

Phi-4-reasoning

13B class • 8.5 GB minimum

Phi-4-reasoning is the clearest text-first American recommendation in the 13B class when you care about reasoning quality more than multimodal extras.

Open model page

Gemma 3 12B

13B class • 11.0 GB minimum

Gemma 3 12B is worth considering when you want a smaller multimodal American model, but it is less turnkey than Phi-4-reasoning for plain-text work.

Open model page

OLMo 3 Instruct 7B

7B class • 5.0 GB minimum

Ai2's 7B instruct release is the clearest Apache-licensed American alternative to Llama when you want a smaller fully open local model.

Open model page
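To make the cohort answer concrete, here is a quick Python check of the three quoted minimums against a 16 GB machine. The per-model figures are the ones listed above; the 6 GB OS headroom budget is an assumption.

# Minimums quoted on this page; the OS headroom figure is an assumed budget.
STARTERS = {
    "Phi-4-reasoning": 8.5,
    "Gemma 3 12B": 11.0,
    "OLMo 3 Instruct 7B": 5.0,
}

TOTAL_GB, OS_HEADROOM_GB = 16.0, 6.0
budget_gb = TOTAL_GB - OS_HEADROOM_GB  # ~10 GB left for the model

for name, minimum_gb in STARTERS.items():
    verdict = "fits" if minimum_gb <= budget_gb else "run the benchmark first"
    print(f"{name}: {minimum_gb} GB minimum -> {verdict}")

Under these assumptions, Gemma 3 12B is the one that lands outside the budget, which is exactly the case the benchmark exists to settle.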

why this page ranks

Query evidence and benchmark path

  • Organic traffic signal: High demand for the query cluster around “apple m2 gpu local ai”.
  • Search intent review: GPU cohort searches are high intent when the page explains model size, memory headroom, and when to benchmark anyway.
  • Benchmark completion potential: High. These searches are close enough to a hardware decision that a benchmark CTA consistently belongs above the fold.

runtime paths

Pick the runtime after you confirm the hardware band

The benchmark decides the size band first. The runtime pages then tell you which download path is the cleanest first move inside Ollama, LM Studio, or llama.cpp.

P0 • Static

Runtime page

Best local models for Ollama

Search intent: “ollama best model”

Best for the quickest path from benchmark result to a real local run.

Runtime guide + catalog coverage

Open page
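Once the benchmark clears you for the band, the path from result to a real run can be short. Here is a minimal sketch with the official ollama Python client, assuming Ollama is installed and running; the phi4-reasoning tag is an assumption, so confirm it against the catalog before pulling.

# Sketch only: assumes `pip install ollama` and a running Ollama install.
import ollama

response = ollama.chat(
    model="phi4-reasoning",  # assumed tag; confirm it in the Ollama catalog
    messages=[{"role": "user",
               "content": "In two sentences, why does unified memory matter for local LLMs?"}],
)
print(response["message"]["content"])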
P1 • Static

Runtime page

Best local models for llama.cpp

Search intent: “llama.cpp best model”

Best for people who care about low-level control, serving flags, and GGUF tuning.

Runtime guide + catalog coverage

Open page
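For the llama.cpp path, the equivalent minimal sketch uses the llama-cpp-python bindings, assuming a Metal-enabled build and a GGUF file you have already downloaded; the model path below is a placeholder.

# Sketch only: assumes `pip install llama-cpp-python` built with Metal support.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/your-model.gguf",  # placeholder; point at your GGUF download
    n_ctx=4096,       # context window; larger windows cost more memory
    n_gpu_layers=-1,  # offload every layer to the Apple GPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello from llama.cpp."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])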

nearby pages

Adjacent queries to open next