Skip to content

Curated Ollama Models

This note records local model passes used for the common-sense guardrails walkthrough. The first pass on 2026-05-07 combined ollama-cache list SSD residency with ollama list installed sizes and filtered to models under 8 GB. Follow-up passes on 2026-05-11 and 2026-05-12 added the car-wash fail/recover smoke shape and structured-output checks.

These are dated smoke-test starting points, not model rankings or product guarantees.

  • Live structured fact extraction prefers pure JSON. If the model wraps a valid JSON object in prose, the Python and Bash runners extract it and mark the structured-facts stage as warn. If no valid JSON object is recoverable after retry, the stage is marked fail and the run falls back to checked-in fact fixtures so later guardrail stages can still demonstrate their behavior.
  • Ollama structured outputs are most reliable when the API format field carries JSON or a JSON schema, and Ollama recommends grounding the prompt with the schema and using low temperature. Provider-native structured-output controls are still the robust path for future hardening, but this walkthrough no longer treats structured-facts failure as the expected result.
  • Local car-wash smokes found qwen3.5:4b to be the best primary walkthrough candidate: small enough for local proof, stronger than the 2B option, and tested against the desired naive walk failure plus enhanced drive recovery.
  • Local car-wash smokes found qwen3.5:2b useful as the low-resource option with the same fail/recover shape.
  • Small-model smokes also found gemma3:1b and erukude/omni-json:1b useful for very small demos. nemotron-3-nano:4b is useful as a comparison point because separate tool-intent smokes showed native strict behavior.
  • llama3.2 remains a historical target candidate from earlier sweeps, but it is no longer the default recommendation for this walkthrough.

References:

Use these first because they are small local smoke-test candidates that match the guardrail demo shape:

RoleModelInstalled sizeWhy it is on the listWalkthrough note
Primary live walkthroughqwen3.5:4b3.4 GBStronger small Qwen 3.5 option from the 2026-05-11/12 local smokes.Desired car-wash shape: naive answer fails as walk, constrained output is parseable, and enhanced object-presence prompting recovers to drive. Use this first when available.
Low-resource walkthroughqwen3.5:2b2.7 GBSmaller Qwen 3.5 option from the 2026-05-12 local smoke.Same fail/recover car-wash shape at a smaller footprint. Use when local resource constraints matter more than maximum tool-intent strength.
Very small demo candidatesgemma3:1b or erukude/omni-json:1b815 MB / 1.4 GBVery small models that reproduced the demo failure and recovery shape in local smokes.Useful for constrained machines, but keep the primary docs and walkthrough centered on qwen3.5:4b.
Comparison candidatenemotron-3-nano:4b2.8 GBCar-wash fail/recover target plus separate native strict tool-call smoke evidence.Interesting for comparison, but adding it to the main walkthrough can dilute the guardrails story.

Other observed models:

ModelInstalled sizeUseWalkthrough note
llama3.22.0 GBHistorical target candidate.Earlier local sweeps showed a valid fail/recover shape, but newer release-surface guidance prefers qwen3.5:4b primary and qwen3.5:2b low-resource.
phi4-mini-reasoning:3.8b3.2 GBAvoid as default for this scenario.Answered drive on the naive prompt, which removes the intended baseline failure.
granite4:350m-h366 MBAvoid as default for this scenario.Failed to recover under the enhanced prompt in local smokes.
qwen3:4b2.5 GBHistorical extraction experiment.Direct JSON probes were useful, but newer Qwen 3.5 smokes are the better starting point for the full walkthrough.

Prefer qwen3.5:4b for the interactive walkthrough when it is available. It is the primary v1.0.x local proof candidate, remains small enough for a developer laptop, and has current local smoke evidence for the car-wash fail/recover shape.

Terminal window
export NXUSKIT_PROVIDER=ollama
export NXUSKIT_MODEL=qwen3.5:4b
export OLLAMA_HOST=http://127.0.0.1:11434

For a lower-resource run, use the 2B Qwen 3.5 variant:

Terminal window
export NXUSKIT_PROVIDER=ollama
export NXUSKIT_MODEL=qwen3.5:2b
export OLLAMA_HOST=http://127.0.0.1:11434

For phase-specific experiments, keep the stronger model on extraction and repair while trying a smaller baseline:

Terminal window
export NXUSKIT_PROVIDER=ollama
export NXUSKIT_MODEL=qwen3.5:2b
export NXUSKIT_COMMON_SENSE_FACTS_MODEL=qwen3.5:4b
export NXUSKIT_COMMON_SENSE_REPAIR_MODEL=qwen3.5:4b
export OLLAMA_HOST=http://127.0.0.1:11434

Provider-native structured-output controls remain the preferred hardening path, especially Ollama JSON/schema formatting and thinking-mode controls when exposed through the installed SDK/CLI surface. In the v1.0.x examples line, the Bash/CLI runner disables thinking for short guardrail calls and requests JSON schema output for fact extraction. The example itself should be described more narrowly: live structured facts can pass with pure JSON, warn when valid JSON is recovered from prose, or fail after retry and fall back to fixtures. Failure is a fallback state, not the expected walkthrough result.