Arbiter - Auto-Retry LLM with CLIPS Validation
A command-line tool demonstrating the Solver pattern: using CLIPS rules to evaluate LLM output quality and automatically retry with adjusted parameters when validation fails.
Stop accepting unreliable LLM output — validate every response against CLIPS rules and retry automatically until your standards are met.
Scenarios: classification · extraction · reasoning
Edition
Pro: requires a Pro (or trial) entitlement.
What this demonstrates
Difficulty: Advanced ♦🏁 · LLM · CLIPS
- Summary: CLIPS-validated LLM retry app with rule-based answer verification
- Scenario: Submit questions to an LLM and validate answers against CLIPS rules, retrying on validation failure
`tech_tags` in manifest: `CLIPS`, `LLM`. Example id `arbiter` in `conformance/examples_manifest.json`.
Prerequisites
- SDK: Use an installed SDK tree (`NXUSKIT_SDK_DIR`, `NXUSKIT_LIB_PATH` as needed); `test-examples.sh` resolves Go/Rust/Python deps from that tree only. See `README.md`, `scripts/setup-sdk.sh`, and `scripts/test-examples.sh`.
- Languages in this example: go, rust (paths under this directory; Python may live under a sibling `python/` or shared reference per Language Implementations).
Real-World Application
Operations research tool, constraint modeling workbench.
Requirements
Edition: nxusKit Pro
This example requires the Pro edition of nxusKit. Purchase Pro or start a free 30-day trial (automatic on first Pro feature call).
Technologies
Solver
CLIPS integration path (validation)
Go CLIPS validation uses provider chat with local `go/clips_wire.go` types mirroring `ClipsInput` / `ClipsOutput`. For Session API access, use `nxuskit.ClipsSession`. Reference: `conformance/clips-json-contract.json`; nxusKit SDK `sdk-packaging/docs/rule-authoring.md`, ClipsInput JSON Reference (`#clipsinput-json-reference`; bundle: `docs/rule-authoring.md`).
Language Implementations
| Language | Path | Status |
|---|---|---|
| Rust | rust/ | Available |
| Go | go/ | Available |
Attach an installed SDK (NXUSKIT_SDK_DIR). See the repository README.md and scripts/test-examples.sh.

    # From `/examples/apps/arbiter`:
    cd rust && cargo build
    cd go && make build

Features
- LLM Output Validation - Use CLIPS rules to evaluate response quality
- Automatic Retry - Retry with adjusted parameters on validation failure
- Configurable Strategies - Define retry strategies for different failure modes
- Multiple Conclusion Types - Classification, extraction, and reasoning tasks
- Scoring System - Score attempts based on validation results
Installation

    # Rust
    cd examples/apps/arbiter/rust
    cargo build --release

    # Go
    cd examples/apps/arbiter/go
    make build

Rust usage:

    cd examples/apps/arbiter/rust

    # Show help
    cargo run -- --help

    # Run classification task
    cargo run -- -t classification -i "This product is amazing!" --categories "positive,negative,neutral"

    # Run with custom config
    cargo run -- -c config.yaml -i "Analyze this text"

    # Run extraction task
    cargo run -- -t extraction -i "John works at Acme Corp in New York"

    # Verbose mode
    cargo run -- -v -t reasoning -i "If A implies B and B implies C, what can we conclude about A and C?"

Go usage:

    cd examples/apps/arbiter/go

    # Show help
    ./bin/arbiter --help

    # Run classification
    ./bin/arbiter -t classification -i "Great service!" --categories "positive,negative,neutral"

Command Line Options

    USAGE:
        arbiter [OPTIONS] -i <INPUT>

    OPTIONS:
        -c, --config <FILE>      Configuration file path
        -i, --input <TEXT>       Input text to process
        -t, --type <TYPE>        Conclusion type: classification, extraction, reasoning
            --categories <LIST>  Comma-separated categories for classification
            --max-retries <N>    Maximum retry attempts (default: 3)
        -v, --verbose            Show detailed progress
        -h, --help               Show help message

Interactive Modes
All examples support debugging flags:

    # Verbose mode - show raw HTTP request/response data
    cargo run -- --verbose   # Rust
    ./bin/arbiter --verbose  # Go

    # Step mode - pause at each step with explanations
    cargo run -- --step      # Rust
    ./bin/arbiter --step     # Go

    # Combined mode
    cargo run -- --verbose --step

Or use environment variables:

    export NXUSKIT_VERBOSE=1
    export NXUSKIT_STEP=1

Conclusion Types
Classification
Categorize input into predefined categories.

    arbiter -t classification -i "I love this!" --categories "positive,negative,neutral"

Extraction
Extract structured information from text.

    arbiter -t extraction -i "Contact John at john@example.com or 555-1234"

Reasoning
Perform logical reasoning and inference.

    arbiter -t reasoning -i "All cats are mammals. Whiskers is a cat. What is Whiskers?"

Configuration
Create a YAML configuration file for custom settings:

    max_retries: 5
    strategies:
      - failure_type: invalid_category
        adjustments:
          temperature: -0.2
          prompt_suffix: "Choose only from the given categories."
      - failure_type: incomplete_extraction
        adjustments:
          temperature: 0
          prompt_suffix: "Extract all mentioned entities."
      - failure_type: invalid_reasoning
        adjustments:
          temperature: -0.1
          prompt_suffix: "Show your reasoning step by step."

Example Output
Classification Result

    === Solver: Classification Task ===

    Input: "This product exceeded my expectations!"
    Categories: positive, negative, neutral

    Attempt 1:
      LLM Response: "positive"
      Validation: PASSED
      Score: 100

    Result: positive (1 attempt, 45ms)

With Retry

    === Solver: Classification Task ===

    Input: "It's okay I guess"
    Categories: positive, negative, neutral

    Attempt 1:
      LLM Response: "somewhat positive"
      Validation: FAILED (invalid category)
      Strategy: Reduce temperature, clarify categories

    Attempt 2:
      LLM Response: "neutral"
      Validation: PASSED
      Score: 95

    Result: neutral (2 attempts, 1.2s)

Retry Strategies
The arbiter includes default strategies for common failure modes:
| Failure Type | Adjustment |
|---|---|
| Invalid category | Lower temperature, clarify options |
| Incomplete extraction | Reset temperature, request all entities |
| Invalid reasoning | Lower temperature, request step-by-step |
| Confidence too low | Increase temperature slightly |
| Format error | Add format examples to prompt |
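
A strategy table like the one above can be modeled as a map from failure type to an adjustment. The sketch below is illustrative only and is not the code in `strategies.go`: the type names (`Adjustment`, `Request`), the `Apply` helper, the `low_confidence` and `format_error` keys, the numeric deltas for those two entries, the format-error suffix, and the clamp to the range 0 to 1 are all assumptions. The two entries taken from the configuration example use the values shown there.

```go
package main

import (
	"fmt"
	"math"
)

// Adjustment describes how to change the next request after a failed
// validation. Field names are illustrative, not the example's real types.
type Adjustment struct {
	TemperatureDelta float64 // added to the current temperature
	PromptSuffix     string  // appended to the prompt when non-empty
}

// Request holds the tunable parts of an LLM call (hypothetical type).
type Request struct {
	Prompt      string
	Temperature float64
}

// defaultStrategies maps a failure type to its adjustment. The first two
// entries mirror the YAML configuration example; the last two keys and
// their values are assumptions that follow the table's direction.
var defaultStrategies = map[string]Adjustment{
	"invalid_category":  {TemperatureDelta: -0.2, PromptSuffix: "Choose only from the given categories."},
	"invalid_reasoning": {TemperatureDelta: -0.1, PromptSuffix: "Show your reasoning step by step."},
	"low_confidence":    {TemperatureDelta: 0.1},
	"format_error":      {PromptSuffix: "Answer with the category name only."},
}

// Apply returns a copy of req adjusted for the given failure type.
// Unknown failure types leave the request unchanged and report false.
func Apply(req Request, failureType string) (Request, bool) {
	adj, ok := defaultStrategies[failureType]
	if !ok {
		return req, false
	}
	// Clamp to [0, 1]; the valid range depends on the provider.
	req.Temperature = math.Min(1, math.Max(0, req.Temperature+adj.TemperatureDelta))
	if adj.PromptSuffix != "" {
		req.Prompt += "\n" + adj.PromptSuffix
	}
	return req, true
}

func main() {
	req := Request{Prompt: "Classify: It's okay I guess", Temperature: 0.7}
	next, ok := Apply(req, "invalid_category")
	fmt.Printf("matched=%v temperature=%.1f\n%s\n", ok, next.Temperature, next.Prompt)
}
```

Treating the temperature as a delta rather than an absolute value keeps repeated retries cumulative, which is one plausible reading of the negative values in the configuration example.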
Architecture

    arbiter/
    ├── rust/
    │   ├── src/main.rs      # CLI entry point
    │   └── Cargo.toml
    ├── go/
    │   ├── cmd/main.go      # CLI entry point
    │   ├── solver.go        # Core solver logic
    │   ├── strategies.go    # Retry strategies
    │   ├── validator.go     # CLIPS validation
    │   └── go.mod
    └── shared/
        ├── rules/           # CLIPS validation rules
        └── configs/         # Example configurations

How It Works
- Initial Request: Send input to LLM with configured prompt
- Validation: Pass LLM response through CLIPS validation rules
- Evaluation: Score the response based on validation results
- Retry Decision: If validation fails, find matching retry strategy
- Adjustment: Apply strategy adjustments (temperature, prompt, etc.)
- Repeat: Retry up to max_retries times
- Result: Return best successful attempt or highest-scoring failure

Notes:
- CLIPS validation rules are in `shared/rules/classification-eval.clp`
- Build Go with `-tags=clips` for real ClipsProvider integration
- The LLM approach requires an API key (`ANTHROPIC_API_KEY` or `OPENAI_API_KEY`)
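
The loop described under How It Works can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code in `solver.go`: the LLM is a stub function, `validate` is a plain membership check standing in for the CLIPS rules, `score` is a placeholder scoring scheme, and the starting temperature of 0.7 is arbitrary. It also assumes `max_retries` counts retries after the initial attempt.

```go
package main

import (
	"fmt"
	"strings"
)

// Attempt records one LLM call and how it fared in validation.
type Attempt struct {
	Response string
	Valid    bool
	Score    int
}

// LLMFunc stands in for the provider call (hypothetical signature).
// attempt is 1-based so a stub can vary its answer per attempt.
type LLMFunc func(prompt string, temperature float64, attempt int) string

// validate stands in for the CLIPS rules: the normalized response must be
// exactly one of the allowed (lower-case) categories.
func validate(response string, categories []string) bool {
	r := strings.ToLower(strings.TrimSpace(response))
	for _, c := range categories {
		if r == c {
			return true
		}
	}
	return false
}

// score is a placeholder: 100 for a valid answer, 50 when the answer at
// least mentions a category, 0 otherwise.
func score(response string, categories []string) int {
	if validate(response, categories) {
		return 100
	}
	r := strings.ToLower(response)
	for _, c := range categories {
		if strings.Contains(r, c) {
			return 50
		}
	}
	return 0
}

// Solve makes one initial attempt plus up to maxRetries retries and returns
// the highest-scoring attempt and the number of attempts made.
func Solve(llm LLMFunc, input string, categories []string, maxRetries int) (Attempt, int) {
	base := fmt.Sprintf("Classify as one of [%s]: %s", strings.Join(categories, ", "), input)
	prompt, temperature := base, 0.7
	var best Attempt
	for n := 1; n <= maxRetries+1; n++ {
		resp := llm(prompt, temperature, n)
		a := Attempt{Response: resp, Valid: validate(resp, categories), Score: score(resp, categories)}
		if n == 1 || a.Score > best.Score {
			best = a
		}
		if a.Valid {
			return best, n
		}
		// Adjustment for an invalid category: cool down and restate the options.
		temperature -= 0.2
		if temperature < 0 {
			temperature = 0
		}
		prompt = base + "\nChoose only from the given categories."
	}
	return best, maxRetries + 1
}

func main() {
	// Stub that fails once, then answers with a valid category.
	stub := func(prompt string, temperature float64, attempt int) string {
		if attempt == 1 {
			return "somewhat positive"
		}
		return "neutral"
	}
	best, n := Solve(stub, "It's okay I guess", []string{"positive", "negative", "neutral"}, 3)
	fmt.Printf("Result: %s (%d attempts)\n", best.Response, n)
}
```

Because the function tracks the best attempt rather than the last one, a run that never passes validation still returns its highest-scoring failure, matching the final step of the list.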
Testing

    # Rust
    cd examples/apps/arbiter/rust
    cargo test

    # Go
    cd examples/apps/arbiter/go
    go test -v ./...

License
Part of the nxusKit project.