Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Public SDK release tags begin at sdk-v0.9.0. Earlier entries preserve
pre-public development history with normalized pre-public version numbers after
historical version resets.
[Unreleased]
[1.0.1] - 2026-06-08
Patch release for Pro SDK bundle composition. No API or C ABI signature changes from v1.0.0 are introduced.
- Pro SDK bundles now compile `nxuskit-cli solver` and `nxuskit-cli zen` with the real Pro engine command modules instead of CE-safe stubs.
- Pro CLI builds now enable solver/ZEN capability flags in both `nxuskit-engine` and `nxuskit-core`, so valid Pro tokens can execute the bundled commands.
- SDK bundle verification now fails if a Pro CLI is compiled with CE stubs or OSS edition metadata.
[1.0.0] - 2026-05-28
General Availability release for the v0.9.4-stabilized SDK API surface. This release is intentionally narrow: version metadata moves to `1.0.0`, public documentation now reflects GA posture, and SDK bundle packaging guards include the Rust benchmark targets required by the wrapper manifest.
Changed
- Promoted nxusKit SDK to GA in public README/support copy.
- Updated Rust workspace, C ABI, Go SDK, Python SDK, and package metadata to lockstep version `1.0.0`.
- Promoted Python package classifier to `Development Status :: 5 - Production/Stable`.
- Refreshed public architecture and README links to the hosted docs site.
- Clarified loopback CLI examples so local copy-paste commands use explicit loopback model names.
- SDK release workflows now copy `packages/nxuskit/benches` into the bundled Rust wrapper so the declared benchmark target is present in release archives.
- Packaging verification now fails if the bundled Rust wrapper omits the declared `logprobs_serialization` benchmark target.
Compatibility
- No API or C ABI signature changes from v0.9.4 are introduced in this GA cut.
- Pro installation packages continue to be published on the public release page; Pro capabilities still require a valid license key.
[0.9.4] - 2026-05-11
v0.9.4 release candidate. Provider-capability modernization and release hardening, consolidating sprints S1 (streaming logprobs, branch 098), S2/S3 (provider capability modernization + Capability Manifest v2 decision, branch 099), and S4-S6 (CLI Level 2 completion, examples & bundle alignment, docs & release candidate, branch 100). Lockstep version bump `0.9.3 -> 0.9.4` across all components. No C ABI signature changes in this release.
Added - S1: Streaming Logprobs + Capability Metadata (branch 098)
- `StreamLogprobsDelta` type (Rust engine + wrapper, Go, Python) carrying per-chunk `TokenLogprob` entries on streaming responses.
- `StreamChunk.logprobs: Option<StreamLogprobsDelta>` (Rust), `StreamChunk.Logprobs *StreamLogprobsDelta` (Go), and `StreamChunk.logprobs: Optional[StreamLogprobsDelta]` (Python) - additive, defaults to `None`/`nil` for non-supporting providers.
- `ProviderCapabilities.supports_streaming_logprobs: bool` flag with `debug_assert!` enforcing `supports_streaming_logprobs => supports_logprobs`.
- GPT-5.4 reasoning-compat warn-and-drop guard: when `reasoning.effort != "none"`, `temperature`, `top_p`, and `logprobs` are dropped with a warning rather than passed through.
- CLI `provider info` exposes the `streaming_logprobs` row (human + JSON).
- Cross-language parity harness at `internal/tests/parity/stream_logprobs/run_parity.sh`.
- OpenAI: `supports_streaming_logprobs = true` (only supporting provider per fixture evidence); all other providers `false` per the evidence-first rule.
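The reasoning-compat guard described above can be pictured as a small filter over request parameters. The following is a minimal sketch, not the SDK's implementation: the function name `apply_reasoning_compat`, the dict-shaped request, and treating a missing `reasoning` block as `"none"` are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("reasoning_compat_sketch")

# Parameter names come from the changelog entry; everything else is hypothetical.
DROPPED_WHEN_REASONING = ("temperature", "top_p", "logprobs")


def apply_reasoning_compat(request: dict) -> tuple[dict, list[str]]:
    """Return a filtered copy of `request` and the names of dropped parameters."""
    # Assumption: a request without a `reasoning` block behaves like effort "none".
    effort = request.get("reasoning", {}).get("effort", "none")
    cleaned = dict(request)  # shallow copy so the caller's request is untouched
    dropped: list[str] = []
    if effort == "none":
        return cleaned, dropped
    for name in DROPPED_WHEN_REASONING:
        if name in cleaned:
            del cleaned[name]
            dropped.append(name)
            # Warn and drop instead of forwarding a parameter the model rejects.
            logger.warning("dropping %r: not accepted with reasoning.effort=%r", name, effort)
    return cleaned, dropped
```

Returning the dropped names alongside the cleaned request lets a caller surface the same warnings in a structured response as well as in logs.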
Added - S2/S3: Provider Capability Modernization + Manifest v2 (branch 099)
- Provider capability surface modernized; `CapabilityProvider` / “capability provider” vocabulary introduced (no breaking `LLMProvider` rename).
- xAI Grok runtime provider support under canonical provider id `xai` (`XAI_API_KEY`, default base URL `https://api.x.ai/v1`); `groq` remains Groq, Inc. and no confusing `grok` alias is registered.
- `CapabilityManifest` v2 concept with a public preview subset for provider/model capability discovery (full internal manifest unchanged); the publication decision is recorded in the 099 artifacts.
- OpenAI remains Chat-Completions-first (no full Responses API migration).
Added - S4: CLI Level 2 completion & stabilization (branch 100)
- `nxuskit-cli zen validate` (Pro) - structural validation of a ZEN JSON Decision Model (JDM): rejects `functionNode` (JavaScript), checks decision table node shape, attempts expression compilation, and reports node/decision-table/rule counts. Backed by a new pure `nxuskit_engine::providers::zen::validate(model_json) -> Result<ZenValidationReport>` engine entry point. Exit 0 = valid; exit 5 = `parse_error` (unparseable input) or `zen_validate_error` (structurally invalid, with a `problems[]` report); exit 4 = `entitlement_denied`.
- `nxuskit-cli zen test` (Pro) - run a ZEN decision table against a fixture set `{table, cases: [{name, input, expected}]}` and compare each actual output to `expected`; on mismatch emits a structured diff report (exit 5, `zen_test_mismatch`), a per-case eval error is `zen_test_eval_error`, a fixture parse error is `parse_error`.
- `nxuskit-cli bn learn` - parameter learning (MLE / Bayesian) of a Bayesian network’s CPDs from a CSV dataset given the network skeleton; output is the learned network, BIF-exportable.
- `nxuskit-cli bn evidence` - validate/normalize an observations map against a network. (Community edition.)
- `solver what-if --compare` and the unsatisfiable-assumptions path are now covered by non-`#[ignore]`d, entitlement-aware tests (skip-with-reason in CE, assertions run in the Pro CI lane `.github/workflows/ci-pro.yml`).
- `CliError::CommandValidation { code, message, details }` - exit 5 with a command-specific `code` string + structured `details` (used by the new ZEN commands; exit-code set unchanged, FR-001 / Article IV).
- Shell support policy documented (`completions`: bash, zsh, fish supported; PowerShell not generated in v0.9.4; helper snippets + schema bundle locations).
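To make the `zen test` fixture shape concrete, here is a rough sketch of a case-by-case comparison over `{table, cases: [{name, input, expected}]}`. It is not the CLI's code: `run_fixture`, the injected `evaluate` callable, and the layout of the returned report are hypothetical; only the fixture keys, exit code 5, and the `zen_test_mismatch` code string come from the entry above.

```python
from typing import Any, Callable


def run_fixture(fixture: dict, evaluate: Callable[[Any, dict], Any]) -> dict:
    """Evaluate every case and collect the ones whose output differs from `expected`."""
    mismatches = []
    for case in fixture["cases"]:
        actual = evaluate(fixture["table"], case["input"])
        if actual != case["expected"]:
            mismatches.append(
                {"name": case["name"], "expected": case["expected"], "actual": actual}
            )
    if mismatches:
        # Hypothetical report layout; the exit code and code string follow the changelog.
        return {"exit_code": 5, "code": "zen_test_mismatch", "mismatches": mismatches}
    return {"exit_code": 0, "mismatches": []}


def toy_evaluate(table: dict, data: dict) -> dict:
    """Stand-in evaluator: a single threshold rule instead of a real decision table."""
    return {"tier": "gold" if data["spend"] >= table["gold_threshold"] else "basic"}


fixture = {
    "table": {"gold_threshold": 100},
    "cases": [
        {"name": "big spender", "input": {"spend": 250}, "expected": {"tier": "gold"}},
        {"name": "edge case", "input": {"spend": 100}, "expected": {"tier": "basic"}},
    ],
}
report = run_fixture(fixture, toy_evaluate)
```

In this toy run the second case is reported as a mismatch, because the stand-in rule treats a spend of exactly 100 as gold.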
Added - S5: Examples repo & bundle alignment (branch 100)
- Examples portfolio bundle-instruction refs bumped to v0.9.4; `PYTHON_EXAMPLES_STATUS.md` records the v0.9.4 Python-parity scope (minimal slice: the SDK-side `packages/nxuskit-py` FFI version-guard alignment is in scope; the 17 already-passing Python examples stay; new examples remain examples-team backlog). Rust vision example confirmed using the v0.9.2 multimodal wrapper API with no text-only caveat.
Changed
- `nxuskit-py` `_ffi.py` `EXPECTED_VERSION` aligned to the package version (`0.9.1` -> `0.9.4`) - the cffi loader requires the linked library’s `nxuskit_version()` to match; this unblocks the Python FFI examples that were previously `broken-upstream` against the v0.9.3 mismatch.
- Lockstep version bump `0.9.3 -> 0.9.4`: Rust workspace + `nxuskit` crate, C ABI version constant (`nxuskit-core`), Go `nxuskit-go` (`Version` + `ExpectedNxuskitVersion`), Python `nxuskit-py` (`__version__` + pyproject).
Compatibility
- No C ABI signature changes in v0.9.4 - only the ABI version constant moves (`0.9.3 -> 0.9.4`); function signatures and struct layouts are frozen (Article XIV).
- All v0.9.4 additions are additive. The CLI exit-code set (0/1/2/3/4/5/130) is unchanged; the new ZEN commands introduce new `code` strings within exit 5.
- S1/S2/S3 baseline behavior (streaming logprobs, provider capability metadata) is preserved.
[0.9.3] - 2026-04-29
Published SDK release `sdk-v0.9.3`. Production licensing real-purchase activation/recovery, PR readiness, and supported-platform SDK build checks passed before release publication.
- Production licensing cutover (Phase 4):
  - Release builds embed the production ES256 public key with `kid: es256-v1`.
  - Release default endpoint is `https://nxus.systems/licensing-api/v1`.
  - `nxuskit-cli license status --json` exposes endpoint, environment, and signing-key diagnostics.
  - SDK accepts `real_purchase` and `leased` token kinds even while `nxus-licensing-client` lags on its enum (ES256 fallback verifier).
  - Stable licensing errors include `authentication_required`, `environment_mismatch`, `wrong_product_identifier`, etc.
  - Activation timeouts extended to 30s on both client and proxy (`EXTENDED_TIMEOUT_SECS = 30`) for cold-start activation paths.
- First-class unary chat logprobs (Phase 5, US2):
  - Rust wrapper (`nxuskit`): `ChatRequest::with_logprobs(bool)` and `with_top_logprobs(u8)`; typed `ChatResponse.logprobs: LogprobsData` with `TokenLogprob` (selected token + bytes) and `TopLogprob` (alternative + bytes). Doctests on `with_logprobs` and `LogprobsData`.
  - Python SDK (`nxuskit-py`): new `ChatRequest` dataclass with `logprobs`/`top_logprobs` kwargs; `LogprobsData`, `TokenLogprob`, `TopLogprob` exported from top-level `nxuskit`. FFI response decode populates typed logprobs.
  - C ABI: round-trip preserves `logprobs.content[]`, alternative tokens, and UTF-8 bytes. Wire path is `logprobs.content[]` (matches OpenAI; pinned by `tests/logprobs_abi_passthrough_test.rs::response_envelope_uses_content_field_for_logprobs_token_array`).
  - Engine: `parameter_adapter.rs::adapt_logprobs` performs warn-and-drop when a provider lacks `supports_logprobs`, with structured Info warning. `provider_options` does not tunnel logprobs.
  - Migration guide: The logprobs migration guide covers Rust + Python + C ABI before/after with capability-gating rationale.
- ABI / version consistency (Phase 3):
  - Workspace, Rust wrapper, Python package, Go markers, C ABI version constant, capabilities ABI JSON, current SDK docs, and `Cargo.lock` all bumped to v0.9.3.
  - Pre-logprobs v0.9.2 fixture compatibility tests (Rust + Python) prove requests without logprobs serialize byte-identically to v0.9.2.
  - Stale-version guard: `scripts/check-version-inventory.sh`.
Changed
- `data-model.md` corrected: logprobs response token array is `content` (matches Rust/Python/C ABI implementations and OpenAI wire format), not the earlier draft’s `tokens`. Pinned by ABI passthrough test.
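For readers consuming the raw envelope, the sketch below decodes a `logprobs.content[]` array into typed records. It is illustrative only: the dataclasses are local stand-ins rather than the classes exported by `nxuskit`, and the per-entry keys (`token`, `logprob`, `bytes`, `top_logprobs`) are assumed to follow OpenAI's wire format, which the entries above say the SDK matches.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TopLogprobSketch:
    token: str
    logprob: float
    utf8_bytes: Optional[list[int]] = None


@dataclass
class TokenLogprobSketch:
    token: str
    logprob: float
    utf8_bytes: Optional[list[int]] = None
    top_logprobs: list[TopLogprobSketch] = field(default_factory=list)


def decode_logprobs(response: dict) -> Optional[list[TokenLogprobSketch]]:
    """Read the token array from `logprobs.content`, not from a `tokens` key."""
    logprobs = response.get("logprobs")
    if not logprobs:
        return None  # the provider returned no logprobs for this response
    decoded = []
    for entry in logprobs["content"]:
        alternatives = [
            TopLogprobSketch(alt["token"], alt["logprob"], alt.get("bytes"))
            for alt in entry.get("top_logprobs", [])
        ]
        decoded.append(
            TokenLogprobSketch(entry["token"], entry["logprob"], entry.get("bytes"), alternatives)
        )
    return decoded
```

A response without a `logprobs` member decodes to `None`, which mirrors the optional typing described for the Rust and Python surfaces.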
Test counts (logprobs surface, cumulative)
- 7 Rust wrapper (mock_provider) + 6 ABI passthrough + 6 engine warn-drop + 3 streaming-scope + 18 Python = 40 logprobs tests green across all SDK surfaces.
[0.9.2] - 2026-04-13
- CLI Level 2 request surface: richer JSON request construction across chat/call flows, including multimodal image input handling where supported.
- Provider diagnostics: provider ping/status improvements for checking SDK environment readiness from the CLI.
- Python SDK hardening: runtime library discovery deprecation warning and SecurityValidator coverage for common unsafe input patterns.
- Release confidence checks: conformance, parity, performance, and packaging checks expanded for the supported SDK surfaces.
Changed
- CLI and runtime-loading documentation refreshed for v0.9.2 behavior.
- Test fixtures and CI checks hardened so SDK builds can validate without relying on local native-library side effects.
- Workspace and lockfile versions bumped to v0.9.2.
[0.9.1] - 2026-04-05
CLI Level 1 Semantic Remediation
- Real Engine Integration: `zen eval`, `solver solve`, `clips eval`, and `bn infer` now execute real engine logic - no more placeholder/stub responses
- Pipeline Execution: `pipeline run` dispatches all stage types (LLM, CLIPS, ZEN, solver, BN) through real engines with output handoff and partial results on failure
- Call Envelope: `call` propagates tool definitions and includes `tool_calls` and `inference_metadata` in responses
- Artifact Deep Merge: `artifact merge` performs recursive deep merge with dot-notation conflict paths
- Models Capabilities: `models --supports` filter uses real capability inference from model metadata
- Provider Auth: `provider status` uses structured auth subsystem; `provider logout` is provider-scoped
- Judge/Branch: `judge select` returns structured errors; `branch compare` produces field-level diffs
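The idea of a recursive deep merge that reports conflicts as dot-notation paths can be sketched as follows. This is a generic illustration and not the `artifact merge` implementation: the function name, the "overlay wins" policy, and the definition of a conflict (two different non-mergeable values at the same path) are assumptions.

```python
def deep_merge(base: dict, overlay: dict, path: str = "") -> tuple[dict, list[str]]:
    """Merge `overlay` into a copy of `base`; return the result and conflict paths."""
    merged = dict(base)
    conflicts: list[str] = []
    for key, value in overlay.items():
        key_path = f"{path}.{key}" if path else key
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            # Both sides are objects: recurse and carry the dotted prefix along.
            merged[key], nested = deep_merge(merged[key], value, key_path)
            conflicts.extend(nested)
        else:
            if key in merged and merged[key] != value:
                conflicts.append(key_path)  # e.g. "model.params.temperature"
            merged[key] = value  # assumption: the overlay value wins
    return merged, conflicts


left = {"model": {"name": "a", "params": {"temperature": 0.2}}, "tags": ["x"]}
right = {"model": {"params": {"temperature": 0.7, "top_p": 0.9}}, "tags": ["x"]}
merged, conflicts = deep_merge(left, right)
```

Here `conflicts` contains the single path `model.params.temperature`, while `top_p` is added without a conflict because only one side defines it.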
CLI Documentation and Solver Format Compatibility
- CLI Input Reference: New `docs/user/cli-input-reference.md` covering all 13 Level 1 commands with JSON schemas, working examples, and common errors
- Enhanced Help Text: Every engine command’s `--help` now shows input format structure
- Solver Format Compatibility: `solver solve` accepts library API format (ConstraintInput with `var_type`, structured constraints, domains, objectives) directly from shared `problem.json` scenario files - no format conversion needed
- SMT-LIB Support: `solver solve` accepts SMT-LIB 2 format (S-expressions) as convenience input for Z3 experts
- Auto-Detection: Solver input format (simplified CLI, library API, SMT-LIB) is auto-detected from content
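Content-based format detection of the kind listed above might look roughly like this. The heuristic is a guess for illustration, not the CLI's actual logic: the function name, the returned labels, and the `variables` key are hypothetical, and only the `var_type` field and the S-expression nature of SMT-LIB come from the entries above.

```python
import json


def detect_solver_format(text: str) -> str:
    """Classify solver input as 'smt-lib', 'library-api', or 'simplified-cli'."""
    stripped = text.lstrip()
    # SMT-LIB 2 scripts are S-expressions; ';' starts a comment line.
    if stripped.startswith("(") or stripped.startswith(";"):
        return "smt-lib"
    doc = json.loads(stripped)  # raises ValueError on input that is not JSON
    # Assumption: variables live under a "variables" key in both JSON formats.
    variables = doc.get("variables", []) if isinstance(doc, dict) else []
    if any(isinstance(v, dict) and "var_type" in v for v in variables):
        return "library-api"
    return "simplified-cli"
```

Checking the cheapest signal first (the leading character) means SMT-LIB input is never handed to the JSON parser.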
Positioning
- CLI Description: Updated from “CLI for interacting with multiple LLM providers” to “JSON-first control plane for shell automation, CI, and multi-engine reasoning workflows”
- README: Added CLI / Shell Automation section with examples
- Naming: Fixed `nxuskit-engine-cli` -> `nxuskit-cli` naming drift across all docs and scripts
Compliance
- NOTICE: Regenerated with zen-engine and z3-sys entries; Python section reformatted to remove excessive whitespace padding
- Constitution v2.4.0: Added semantic test assertions, stub prohibition, and task verification criteria (Articles II and III)
- Acceptance Fixtures: Three PoR 4.1 acceptance workflow scripts (intake-routing, generator-validator-retry, typed-artifact-handoff)
[0.9.0] - 2026-03-13
Initial public release of the nxusKit SDK.
Highlights
- Polyglot SDK: Unified LLM interfaces across Rust, Go, and Python
- 14 LLM Providers: Claude, OpenAI, Ollama, LM Studio, Mistral, OpenRouter, Together, Groq, Fireworks, Perplexity, MCP, CLIPS, Mock, Loopback
- CLIPS Expert System: Rule-based inference via embedded CLIPS 6.4.2 engine with FFI bindings
- Bayesian Network Inference: Full-featured BN provider with Variable Elimination, Junction Tree, Loopy BP, NUTS/HMC, and structure/parameter learning
- Z3 Constraint Solver: Stateful solver sessions with multi-objective optimization, soft constraints, push/pop scoping, and UNSAT core extraction
- ZEN Decision Tables: JSON Decision Model evaluation via zen-engine
- Plugin Architecture: Signed plugin loading with Ed25519 verification and capability-based sandboxing
- SDK CLI: Command-line tool for all providers (`nxuskit-cli`)
- SDK Installer: Cross-platform SDK manager (`install.sh`) with version management
- Cross-Language Conformance: Shared test vectors ensuring API parity across Rust, Go, and Python
Platform Support
| Platform | Architecture | Status |
|---|---|---|
| Linux | x86_64 | Supported |
| macOS | ARM64 (Apple Silicon) | Supported |
| macOS | x86_64 | Supported |
| Windows | x86_64 | Supported |
Language SDKs
| Language | Package | Description |
|---|---|---|
| Rust | nxuskit | FFI wrapper with safe Rust API |
| Go | nxuskit-go | Idiomatic Go with context support |
| Python | nxuskit-py | Pure Python with requests HTTP client |
Getting Started
See sdk-packaging/docs/getting-started.md for installation and usage instructions.
For runnable examples, see the nxusKit-examples repository.
[0.8.23] - 2026-02-24
- Solver Progress Streaming (US1): Real-time progress events during optimization solves
  - `nxuskit_solver_solve_stream` C ABI function with `on_chunk`/`on_done` callbacks
  - `SolverProgressEvent` struct: iteration, status, elapsed_ms, objective_value, bound_gap, is_final
  - nxuskit-rs `SolverStreamReceiver` with sync Iterator and async `futures_core::Stream` interfaces
  - `solve_stream_async()` convenience method for tokio-based async consumption
- Z3 Context Pooling (US2): Reusable Z3 context pool for reduced solver startup overhead
  - Pool checkout/return benchmarked at 290us per 100-variable FFI round-trip
  - Configurable via `SolverConfig.pool_size` and `pool_max_idle_ms`
- ZEN Decision Table Evaluation (US3): JSON Decision Model (JDM) evaluation via zen-engine
  - `nxuskit_zen_evaluate` / `nxuskit_zen_free_result` C ABI functions (stateless, no session)
  - Supports decision table nodes, expression nodes, and switch nodes
  - Function nodes rejected with clear error (no QuickJS dependency)
  - Benchmarked at 0.39ms average for 100-row decision tables (well under 1ms target)
  - Go wrapper: `gollyllm.ZenEvaluate()` with automatic memory management
  - Rust provider: `zen::evaluate()` async function with pre-compilation optimization
- BN Min-Weight Variable Elimination (US4): Alternative elimination heuristic
  - `EliminationHeuristic::MinWeight` option for VE inference
  - Identical posteriors to MinFill (verified on Asia and Alarm networks, max diff 2.22e-16)
  - Configurable via `{"elimination_heuristic": "min_weight"}` in inference config JSON
- clips-sys Windows Stub Parity (US5): Windows compilation stubs for CLIPS FFI functions
- 7 Code Examples: Full Rust + Go implementations across patterns and integrations
  - E1 Constraint Solver (`patterns/constraint-solver`): Basic Z3 solver session with 3 scenarios
  - E2 Bayesian Inference (`patterns/bayesian-inference`): BN loading, evidence, multi-algorithm inference with 3 scenarios
  - E3 Multi-Provider Pipeline (`integrations/solver-bn-pipeline`): 3-stage BN→Solver→CLIPS pipeline with 3 scenarios (festival, rescue, bakery)
  - E4 LLM-Solver Hybrid (`integrations/llm-solver-hybrid`): LLM constraint extraction + Z3 solving with mock/live modes, 3 scenarios (seating, dungeon, road-trip)
  - E5 Solver What-If (`patterns/solver-what-if`): Push/pop what-if analysis with UNSAT detection, 3 scenarios (wedding, mars, recipe)
  - E6 BN Structure Learning (`integrations/bn-structure-learning`): Hill-Climb+BIC, K2, MLE parameter learning, log-likelihood scoring with 3 scenarios (golf, bmx, sourdough)
  - E7 ZEN Decision Tables (`integrations/zen-decisions`): JDM evaluation with first/collect hit policies, expression nodes, 3 scenarios (maze-rat, potion, food-truck)
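As background for the min-weight elimination heuristic listed above, the sketch below computes a greedy elimination order on an undirected (moralized) graph. It shows the textbook idea and is not the engine's code: here a variable's weight is taken to be the product of the cardinalities of the variable and its current neighbors, one common variant, and ties are broken alphabetically as an arbitrary choice.

```python
def min_weight_order(neighbors: dict[str, set[str]], cardinality: dict[str, int]) -> list[str]:
    """Greedy min-weight elimination order for variable elimination."""
    graph = {v: set(ns) for v, ns in neighbors.items()}  # work on a copy

    def weight(v: str) -> int:
        # Size of the factor created by eliminating v from the current graph.
        w = cardinality[v]
        for n in graph[v]:
            w *= cardinality[n]
        return w

    order: list[str] = []
    while graph:
        best = min(sorted(graph), key=weight)  # sorted() makes tie-breaking deterministic
        nbrs = graph.pop(best)
        for a in nbrs:
            graph[a].discard(best)
            graph[a] |= nbrs - {a}  # connect the remaining neighbors (fill-in edges)
        order.append(best)
    return order


# Star-shaped toy graph: hub H with leaves X, Y, Z of different cardinalities.
star = {"H": {"X", "Y", "Z"}, "X": {"H"}, "Y": {"H"}, "Z": {"H"}}
cards = {"H": 2, "X": 5, "Y": 2, "Z": 3}
order = min_weight_order(star, cards)
```

Any valid elimination order yields the same exact posteriors, which is consistent with the MinFill comparison reported above; the heuristic only changes the size of the intermediate factors.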
Changed
- Go BN FFI (`ffi_bn.go`): Added `SearchStructure()`, `LearnMLE()`, `LogLikelihood()` wrappers with `BnSearchStructureConfig`, `BnStructureResult`, `BnEdge` types
- Go header (`nxuskit.h`): Added `nxuskit_zen_evaluate` and `nxuskit_zen_free_result` declarations
- `log::debug!` instrumentation added to solver streaming and ZEN evaluate hot paths for timing observability
Performance
- Solver FFI overhead: 290us per 100-variable add_variables call
- Solver first-chunk latency: 21ms (well under 400ms Doherty Threshold)
- ZEN 100-row evaluation: 0.39ms avg, 0.55ms worst-case (under 1ms target)
- Z3 pool bench: >=50% improvement documented (PR-009)
[0.8.22] - 2026-02-23
- Solver Part 2 (044-solver-perf-audit): Multi-objective optimization, soft constraints, and explainability
  - Multi-objective optimization: Weighted sum and lexicographic modes via Z3 Optimize API
  - Soft constraints: Penalty-weighted constraints that can be violated, with violated constraint tracking
  - Constraint labels & explainability: Human-readable labels on constraints/variables/objectives, UNSAT core with label mapping, binding constraints and slack values
  - Assumption-based solving: Push/pop scoping with retractable assumptions
  - Go solver wrapper: 21 C ABI solver functions wrapped via CGo FFI with full type parity
  - Python solver wrapper: 21 C ABI solver functions wrapped via CFFI with context managers and dataclass types
  - Performance audit: Criterion benchmarks for all SDK providers (Z3, CLIPS, Chat, FFI overhead); SC-007 (≤1ms FFI) and SC-008 (≤200ms composite) verified passing
  - Constitution v2.3.0: PR-007 (cumulative overhead), PR-008 (platform-specific optimization), PR-009 (benchmark platform representativeness)
- BN Part 2 (043-bn-part2-advanced): Advanced inference algorithms and cross-language parity
  - Loopy Belief Propagation: Message-passing inference for cyclic/large networks with configurable damping and convergence diagnostics
  - Linear Gaussian Bayesian Networks: Gaussian variable types with moment-matching exact inference and 95% credible intervals
  - NUTS/HMC Sampling: Gradient-based MCMC via `nuts-rs` crate for continuous variables with ESS/R-hat diagnostics
  - BIF Export: Round-trip BIF file export with 15-digit precision and CPT completeness validation
  - Parallel Junction Tree: Rayon-based parallel collect/distribute with deterministic results and auto-fallback
  - Go BN wrapper: Full C ABI coverage for Part 2: BnNetwork, BnEvidence, BnResult with goroutine-safe inference and streaming via channels
  - Python BN wrapper: Full C ABI coverage for Part 2: context managers, async inference, generator-based streaming
  - C ABI extensions: 8 new exported functions for Gaussian variables, config-based inference, streaming, and BIF save
Changed
- Dependency Audit & Update (043-deps-audit-update): Comprehensive dependency updates across Rust, Go, and Python
  - Rust - Breaking upgrades:
    - `thiserror`: 1.0 → 2.0 (unified across workspace, eliminates dual compilation)
    - `reqwest`: 0.12 → 0.13 (rustls TLS default, improved security)
    - `serde_yaml` → `serde_yaml_ng`: 0.9 → 0.10 (replaces archived/deprecated crate via Cargo rename)
    - `rand`: 0.9 → 0.10, `rand_chacha`: 0.9 → 0.10 (trait renames: Rng→RngExt, from_os_rng removed)
    - `quick-xml`: 0.37 → 0.39 (drop-in for our usage)
    - `rmcp`: 0.8 → 0.16 (not yet used in source; pure Cargo.toml bump)
    - `libloading`: 0.8 → 0.9 (in nxuskit-rs, AsFilename trait)
    - `infer`: 0.16 → 0.19 (additive changes only)
    - `criterion`: 0.5 → 0.8 (dev-only)
    - `wiremock`: 0.5 → 0.6 (dev-only, hyper 1.0 migration)
    - `mockito`: 1.2 → 1.7 (dev-only, semver-compatible)
  - Rust - Semver-compatible floor bumps: regex 1.12, uuid 1.21, csv 1.4, cc 1.2, tempfile 3.20, openapiv3 2.2
  - Go: go-edlib 1.7.0, pflag 1.0.10, plus transitive updates in all examples
  - Python: cffi ≥2.0.0, pytest ≥9.0.0, pytest-cov ≥7.0.0, ruff ≥0.15.0
  - Toolchain: Rust MSRV 1.92 → 1.93 (Go 1.26 bump deferred pending CI toolchain update)
[0.8.21] - 2026-02-21
- Bayesian Network Inference Engine (040-bayesian-network-inference): Full-featured Bayesian network provider
  - BIF parser: Reads standard Bayesian Interchange Format network files (Asia, Cancer, Alarm, Survey, Earthquake)
  - 4 inference algorithms: Variable Elimination (exact), Likelihood Weighted Sampling, Gibbs Sampling (MCMC), Junction Tree
  - 3 structure learning algorithms: K2, Hill Climbing (BIC-scored), Bayesian structure learning
  - 2 parameter learning algorithms: Maximum Likelihood Estimation, Bayesian parameter estimation
  - Streaming support: Real-time probability updates during sampling-based inference
  - C ABI integration: 14 BN SDK functions with opaque handle pattern for cross-language access
  - nxuskit-rs BN wrapper: RAII wrapper with dual-dispatch (static-link + dynamic-link)
  - Reference test data: Python-generated reference marginals for deterministic validation
  - Benchmarks: Performance benchmarks for all inference algorithms
- Solver Tier 2 Session API (042-solver-tier2-api): Stateful Z3 solver sessions with incremental model building
  - Typed solver domain types (`solver_types.rs`): `VariableDef`, `ConstraintDef`, `ObjectiveDef`, `SolveResult`, `SolverStats`, `SolverConfig`, `SolverCapabilities`, `SessionStatus`, with 20 constraint type variants and full serde round-trip fidelity
  - Mock solver backend (`mock_solver.rs`): Deterministic contract testing without Z3 runtime - pre-configured responses, operation recording, atomic response cycling
  - 17 C ABI solver session functions (`solver_sdk.rs`): Opaque handle pattern for cross-language access - create/destroy, add variables/constraints, set objective, retract, push/pop, solve, reset, introspection (variables, constraints, status, capabilities, counts)
  - nxuskit-rs SolverSession wrapper (`solver.rs`): RAII wrapper with dual-dispatch (static-link + dynamic-link), typed methods, automatic cleanup on drop
  - Internal solver session engine (`solver_session.rs`): Accumulates state as plain Rust data, rebuilds Z3 solver on each solve via `with_z3_config` closure re-entry
  - Push/pop scoping: Checkpoint/restore model state for what-if analysis (5 nested levels tested)
  - Unsat core extraction: Named constraint labels propagated through Z3 assertion tracking, conflict identification on UNSAT
  - Solver configuration: Timeout, random seed, and max-conflicts controls for deterministic, bounded-resource solving
  - Backend capability introspection: Query Z3 or mock feature flags (incremental, unsat core, push/pop, multi-objective)
- New Providers (041-new-providers): Local LLM and Z3 constraint solver providers
  - Local LLM provider: In-process inference via llama.cpp (feature-gated `provider-local-llama`) and mistral.rs (feature-gated `provider-local-mistralrs`)
  - Z3 constraint solver provider: Constraint satisfaction and optimization via Z3 (feature-gated `provider-z3`)
  - Go FFI constructors: `NewLocalProvider()` and `NewZ3Provider()` in gollyllm
  - ModelLister implementations: For Local and Z3 providers
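The push/pop scoping in the Solver Tier 2 Session API entry above (state accumulated as plain data, with checkpoint and restore) can be illustrated with a very small model. This is a conceptual sketch under assumed names: `ScopedModel` and its methods are hypothetical, constraints are plain strings, and no solver is involved.

```python
class ScopedModel:
    """Accumulates constraints as plain data and supports nested checkpoints."""

    def __init__(self) -> None:
        self.constraints: list[str] = []
        self._checkpoints: list[int] = []  # constraint count recorded at each push

    def add(self, constraint: str) -> None:
        self.constraints.append(constraint)

    def push(self) -> None:
        # Remember how many constraints existed when the scope was opened.
        self._checkpoints.append(len(self.constraints))

    def pop(self) -> None:
        if not self._checkpoints:
            raise RuntimeError("pop without a matching push")
        # Discard everything added since the most recent checkpoint.
        del self.constraints[self._checkpoints.pop():]


model = ScopedModel()
model.add("guests <= 120")
model.push()                  # start a what-if branch
model.add("budget <= 5000")
model.pop()                   # back to the baseline model
```

Because only a length is stored per checkpoint, nesting several levels costs almost nothing, and a backend can be rebuilt from `constraints` whenever a solve is requested.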
Changed
- Tier 1 Z3 refactor: `Z3Provider::chat()` now delegates to `session::solve_ephemeral()` for shared validation and dispatch logic - no external behavior change
- Constitution v2.1.0: Extended lockstep versioning to all nxusKit components (rustyllm, gollyllm, pythicllm), unified under single `sdk-v*` tag line
- Go toolchain: Bumped to 1.24.13 for crypto/tls security fix (GO-2026-4337)
- gollyllm LKS Parity (018-gollyllm-lks-parity): Go library now has API parity with rustyllm
  - `InferenceMetadata` and `InferenceStep` types: Structured metadata for inference results
    - `InferenceMetadata` with `IsComplete`, `FinishReason`, `TokenUsage`, `ThinkingTrace`, `InferenceSteps`
    - `InferenceStep` for capturing tool calls, thinking traces, and custom steps
    - Builder pattern with fluent methods (`Completed()`, `WithTokenUsage()`, `AddInferenceStep()`, etc.)
    - `ChatResponse.InferenceMetadata` field populated by all 12 providers
  - `SessionResetter` interface: Deterministic testing support
    - `FreshSession() (LLMProvider, error)` method on all 12 providers
    - Stateless providers return self, MockProvider creates new instance
    - Enables reproducible test results in CI/CD pipelines
  - `ModelLister` interface: Dynamic model discovery
    - `ListAvailableModels(ctx) ([]ModelInfo, error)` for local providers
    - Implemented by: Ollama, LM Studio, Mock, Loopback
    - Cloud providers don’t implement (API doesn’t support dynamic listing)
  - Backward Compatibility: `ChatResponse.Metadata` field preserved with deprecation notice
  - 85.9% Test Coverage: Exceeds target of ≥85%
Changed
- Project Naming Migration: Unified naming conventions across all language implementations
  - Umbrella project renamed from “RustyLLM/LLMKit” to “nxusKit”
  - Go library: `go/llmkit-go/` → `gollyllm/`, package `llmkit` → `gollyllm`
  - Go module path: `github.com/llmkit/llmkit-go` → `github.com/nxus-SYSTEMS/nxusKit/gollyllm`
  - Go CLI: `gollm` → `gollyllm`
  - Python library: `rustyllm-py/` → `pythicllm/`, package `rustyllm` → `pythicllm`
  - Python tools: `python-tools/` → `nxusKit-tools/`
  - Rust library: `rustyllm` (unchanged)
  - Updated all documentation, examples, and configuration files
[0.8.20] - 2026-01-29
- `ModelLister` Trait: New trait for polymorphic model discovery
  - Enables `Box<dyn ModelLister>` for provider registries
  - Implemented for Ollama, LmStudio, CLIPS, Mock, and Loopback providers
  - Correct vtable dispatch through trait objects
- `InferenceMetadata` and `InferenceStep` Types: Unified response metadata
  - `InferenceMetadata` provides consistent access to execution details across all providers
  - `InferenceStep` captures inference traces (rule firings for CLIPS, tool calls, etc.)
  - All providers now populate `response.inference_metadata`
- `fresh_session()` Method: Per-provider method for deterministic CI/testing
  - Returns a fresh provider instance with clean state
  - Implemented for all 13 providers
  - Enables reproducible test results
- `BlockingProvider<P>` Wrapper: Synchronous API for non-async contexts
  - Feature-gated under `blocking-api`
  - Uses internal tokio Runtime
  - Supports `chat()` and `list_models()` (when wrapped provider implements ModelLister)
- `full` Feature Flag: Convenience feature combining `clips` + `blocking-api`
- CLIPS Ordering Guarantees: Deterministic output ordering
  - Conclusions sorted by fact_index
  - Rules fired sorted by name for determinism
  - Conflict strategy recorded in provider_metadata
- Documentation: New integration guides
  - `docs/INTEGRATION_PATTERNS.md` - Polymorphic providers, deterministic evaluation, sync API
  - `docs/MINIMAL_BUILD.md` - Feature flag reference and build configurations
- CI Improvements: Feature flag verification steps
  - Tests `--no-default-features`, `--features blocking-api`, `--features full` builds
- Examples:
  - `polymorphic_example.rs` - Provider registry pattern with ModelLister
  - `blocking_example.rs` - Synchronous API usage patterns
- `as_clips_output()` Method: Typed accessor for CLIPS results (feature = “clips”)
  - Avoids manual JSON parsing for CLIPS inference results
  - Returns `Option<ClipsOutput>` with typed access to conclusions, traces, stats
  - Returns `None` for non-CLIPS response content
- `all-providers` Feature Flag: Enables all provider features (`pro` + `mcp`)
- Documentation Enhancements:
  - Error handling patterns in INTEGRATION_PATTERNS.md
  - CLIPS ordering guarantees documentation
  - WASM compatibility notes in MINIMAL_BUILD.md
  - CI optimization tips for feature matrix testing
Changed
- `ChatResponse` now includes `inference_metadata: InferenceMetadata` field (backward compatible with `#[serde(default)]`)
[0.8.19] - 2026-01-28
- CLIPS Provider Options: New configuration options for expert system inference
  - `strategy` option for conflict resolution strategy selection
  - `allow_duplicate_facts` option for fact assertion behavior control
- CLI Support for All 14 Providers: Command-line interface now supports all providers
  - Claude, OpenAI, Ollama, LM Studio, Mistral, OpenRouter, Together, Groq, Fireworks, Perplexity, MCP, CLIPS, Mock, Loopback
- Stop Patterns: Conditional inference halting based on output patterns
  - Enables early termination when specific patterns are detected in responses
- CLIPS Expert System Enhancements:
  - Binary rule loading support for pre-compiled rule bases
  - Search paths for rule file discovery
  - Schema conversion utilities for fact/rule translation
  - Help commands for CLIPS debugging and introspection
[0.8.18] - 2026-01-23
Breaking Changes
- `ThinkingMode::Auto` behavior changed: Now intelligently enables thinking for thinking-capable models (qwen3, deepseek-r1, etc.) instead of omitting the parameter. Use `ThinkingMode::Omit` for the old behavior.
- `ThinkingMode::Omit` variant: Explicitly omit the `think` parameter from Ollama requests, letting the model use its raw default behavior. Use this if you need the pre-0.8.18 `Auto` behavior.
- Smart `Auto` mode for Ollama thinking models: `ThinkingMode::Auto` now detects thinking-capable models and enables thinking automatically, preventing empty response issues with models like qwen3-vl.
- Native Ollama Structured Output Support: Full JSON mode and JSON schema support for Ollama provider
  - JSON Mode: Use `ResponseFormat::Json` to get structured JSON responses from Ollama models
  - JSON Schema Mode: Use `ResponseFormat::JsonSchema { schema }` for schema-validated responses (Ollama 0.5.0+)
  - Native API integration - sends `format: "json"` or `format: { schema }` directly to Ollama
  - No more prompt-based fallback needed for JSON mode with Ollama
  - Updated `supports_json_schema: true` in Ollama provider capabilities
- Helper methods on `ThinkingMode`:
  - `is_auto()` - Check if mode requires smart automatic behavior
  - `is_omit()` - Check if mode explicitly omits the thinking parameter
- Ollama JSON Mode Gap: Previously, Ollama declared `supports_json_mode: true` but didn’t actually send the `format` field to the API. Now properly implemented with native support.
- Empty responses with qwen3-vl models: Models like qwen3-vl would return empty content when the `think` parameter was omitted. The new smart `Auto` behavior prevents this by enabling thinking for known thinking models.
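The 0.8.18 mode semantics can be summarized as a mapping from mode and model name to the optional `think` value. The sketch below is an illustration under stated assumptions, not the crate's implementation: the enum is a local mirror of `ThinkingMode`, `resolve_think` is a hypothetical function name, the detection heuristic is a guess based on the model families named in this changelog, and leaving `think` unset for non-thinking models under `Auto` is also an assumption.

```rust
/// Local mirror of the modes named in this changelog (the real enum may differ).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ThinkingMode {
    Auto,
    Enabled,
    Disabled,
    Omit,
}

impl ThinkingMode {
    fn is_auto(self) -> bool {
        matches!(self, ThinkingMode::Auto)
    }
    fn is_omit(self) -> bool {
        matches!(self, ThinkingMode::Omit)
    }
}

/// Assumed heuristic: match the model families listed in the 0.8.17 entry.
fn is_thinking_model(model: &str) -> bool {
    let name = model.to_ascii_lowercase();
    name.starts_with("qwen3")
        || name.starts_with("deepseek-r1")
        || name.starts_with("deepseek-v3")
        || name.contains(":thinking")
}

/// Hypothetical resolver: `None` means the `think` field is left out of the request.
fn resolve_think(mode: ThinkingMode, model: &str) -> Option<bool> {
    match mode {
        ThinkingMode::Enabled => Some(true),
        ThinkingMode::Disabled => Some(false),
        ThinkingMode::Omit => None,
        // Assumption: Auto only sets the field for detected thinking models.
        ThinkingMode::Auto => {
            if is_thinking_model(model) {
                Some(true)
            } else {
                None
            }
        }
    }
}

fn main() {
    for model in ["qwen3-vl:8b", "llama3.1:8b"] {
        let think = resolve_think(ThinkingMode::Auto, model);
        println!("{model}: think = {think:?}");
    }
    assert!(ThinkingMode::Auto.is_auto());
    assert!(ThinkingMode::Omit.is_omit());
}
```

Code that relied on the pre-0.8.18 `Auto` behavior corresponds to the `Omit` arm here, which never sets the field.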
[0.8.17] - 2026-01-20
- `ThinkingMode` Enum: Provider-agnostic control over chain-of-thought reasoning
  - `ThinkingMode::Auto` - Use model’s default behavior (recommended)
  - `ThinkingMode::Enabled` - Force thinking mode on
  - `ThinkingMode::Disabled` - Force thinking mode off for faster responses
  - New `ChatRequest.with_thinking_mode()` builder method
  - Automatically translated to provider-specific parameters (e.g., `think` for Ollama)
- Ollama `think` Parameter Support: Direct API integration for thinking control
  - Added `think: Option<bool>` to internal `OllamaRequest` struct
  - Maps from `ChatRequest.thinking_mode` automatically
- Thinking Model Detection: Auto-detection with warnings
  - `OllamaProvider::is_thinking_model()` detects qwen3*, deepseek-r1/v3, :thinking variants
  - Debug warnings when `max_tokens < 200` for thinking models (token budget may be exhausted)
- StreamChunk Helper Methods: Easier access to combined content
  - `all_content()` - Returns `(Option<&str>, Option<&str>)` tuple of (thinking, content)
  - `combined_text(separator)` - Combines thinking + content into single string
- Empty Response Bug: Resolved issue where thinking models returned empty responses
  - Root cause: `max_tokens` setting too low, causing thinking tokens to exhaust budget
  - Solution: Added `ThinkingMode` control and detection warnings
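To make the `StreamChunk` helper semantics from this release concrete, here is a rough sketch. The struct is a local stand-in with assumed field names (`delta`, `thinking`), and treating empty strings as absent is an assumption; only the method names and the `(thinking, content)` tuple shape come from the entry above.

```rust
/// Stand-in for the SDK's chunk type; field names are assumptions.
#[derive(Debug, Default, Clone)]
struct StreamChunk {
    delta: String,
    thinking: Option<String>,
}

impl StreamChunk {
    /// Returns (thinking, content); empty strings are reported as `None` (assumed).
    fn all_content(&self) -> (Option<&str>, Option<&str>) {
        let thinking = self.thinking.as_deref().filter(|t| !t.is_empty());
        let content = Some(self.delta.as_str()).filter(|c| !c.is_empty());
        (thinking, content)
    }

    /// Joins thinking and content with `separator` when both are present.
    fn combined_text(&self, separator: &str) -> String {
        match self.all_content() {
            (Some(t), Some(c)) => format!("{t}{separator}{c}"),
            (Some(t), None) => t.to_string(),
            (None, Some(c)) => c.to_string(),
            (None, None) => String::new(),
        }
    }
}

fn main() {
    let chunk = StreamChunk {
        delta: "Paris.".to_string(),
        thinking: Some("The user asks for the capital of France.".to_string()),
    };
    println!("{}", chunk.combined_text("\n---\n"));
}
```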
[0.8.16] - 2026-01-20
- Ollama Thinking Mode Support: Reliable streaming for models with chain-of-thought reasoning (e.g., Qwen3)
  - Bug Fix: Streaming no longer returns empty responses when using thinking-enabled models
  - Previously, Qwen3 models would intermittently return empty responses because thinking chunks were dropped
  - Stream now correctly stays active during thinking phase and delivers all content
  - Important: Thinking tokens count toward `max_tokens` limit - avoid setting low `max_tokens` values with thinking models or you may get empty responses
  - New `StreamChunk.thinking` field exposes model reasoning content (when available)
  - New helper methods on `StreamChunk`:
    - `thinking(String)` - Create a thinking-only chunk
    - `with_thinking(String, String)` - Create a chunk with both content and thinking
    - `has_thinking()` - Check if chunk contains thinking content
    - `has_content()` - Check if chunk has any content (delta or thinking)
  - Token counting now includes thinking tokens in completion count
  - `StreamingTokenAccumulator` updated with `add_thinking_chunk()` method
  - Backward compatible: existing code works unchanged, `thinking` field is `Option<String>`
  - Affected models: `qwen3:*`, `qwen3-vl:*`, and any future models using Ollama’s thinking field
  - See the Ollama provider documentation for usage details
- Custom `Debug` implementation for `TokenEstimator` to avoid derive issues with `tiktoken_rs::CoreBPE`
[0.8.15] - 2026-01-19
- Streaming Token Usage Tracking: Real-time token consumption monitoring across all 13 LLM providers
  - Every streaming chunk now includes token usage information (both actual and estimated)
  - Dual accuracy: 100% actual counts from providers + 95-99% estimated counts via client-side tokenization
  - New `TokenEstimator` for client-side token counting with tiktoken-rs support (95-99% accuracy)
  - New `StreamingTokenAccumulator` for real-time token aggregation during streaming
  - New `TokenUsage` structure with dual count support (actual + estimated)
  - New `stream_with_usage()` convenience method on `LLMProvider` trait
  - Feature-gated `stream-token-estimation` enables high-accuracy token counting (~50KB binary size)
  - Provider Support:
    - Tier 1 (100% Actual): Claude, OpenAI, Ollama
    - Tier 2 (95-99% Estimated): Groq, Mistral, Fireworks, Together, OpenRouter, Perplexity, LM Studio
    - Tier 3 (70-90% Heuristic): MCP
    - Tier 4 (100% Test): Mock, Loopback providers
  - All 13 providers updated with streaming token tracking implementation
  - 6 new integration tests covering token tracking across providers
  - See: `docs/PROVIDER_TOKEN_ACCURACY.md`
- Comprehensive Documentation
  - Enhanced README.md with new “Streaming Token Usage Tracking” section
  - New PROVIDER_TOKEN_ACCURACY.md with detailed provider breakdown and cost tracking guidance
  - Updated quickstart.md with 5 practical examples and migration guide
  - New DOCUMENTATION_GUIDE.md for navigation and reader journeys
  - 1,110+ lines of documentation, 30+ code examples
Changed
- TokenUsage structure now includes both actual and estimated counts for dual accuracy
- All streaming chunks now include token usage information (previously only some providers)
- Updated exports in lib.rs to include TokenEstimator and EstimationMethod
- Fixed unit tests to use new TokenUsage structure (estimated_only, best_available methods)
- Addressed clippy warning about unnecessary clone of Copy type in lmstudio.rs
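The dual-count idea behind `TokenUsage` in 0.8.15 can be pictured as two optional counts with a preference for provider-reported numbers. This is a sketch only: the struct layout, the `TokenCount` helper type, and the method signatures are assumptions; just the method names `estimated_only` and `best_available` appear in the entry above.

```rust
/// Hypothetical prompt/completion pair; not the SDK's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TokenCount {
    prompt: u32,
    completion: u32,
}

impl TokenCount {
    fn total(self) -> u32 {
        self.prompt + self.completion
    }
}

/// Assumed shape: provider-reported counts alongside client-side estimates.
#[derive(Debug, Clone, Copy, Default)]
struct TokenUsage {
    actual: Option<TokenCount>,
    estimated: Option<TokenCount>,
}

impl TokenUsage {
    /// Usage known only from client-side estimation (e.g. mid-stream).
    fn estimated_only(estimated: TokenCount) -> Self {
        Self { actual: None, estimated: Some(estimated) }
    }

    /// Prefer provider-reported counts and fall back to the estimate.
    fn best_available(&self) -> Option<TokenCount> {
        self.actual.or(self.estimated)
    }
}

fn main() {
    let mut usage = TokenUsage::estimated_only(TokenCount { prompt: 12, completion: 40 });
    println!("mid-stream: {:?}", usage.best_available());

    // A final chunk from a provider that reports real counts replaces the estimate.
    usage.actual = Some(TokenCount { prompt: 11, completion: 42 });
    let total = usage.best_available().map_or(0, TokenCount::total);
    println!("final: {:?} total={total}", usage.best_available());
}
```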
[0.8.14] - 2025-12-08
- Retry-After Header Parsing: All providers now parse the `Retry-After` header from HTTP 429 responses
  - When rate limited, `LlmError::RateLimit { retry_after }` now contains the duration
  - Enables intelligent retry logic: clients can wait exactly as long as needed
  - Supports delay-seconds format (e.g., “30” for 30 seconds)
  - Added `parse_retry_after()` helper function in `providers/mod.rs`
  - All 9 providers updated: Claude, OpenAI, Ollama, Groq, Together, Fireworks, Mistral, Perplexity, OpenRouter
  - Implements Retry-After header parsing per HTTP/1.1 specification
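For readers wondering what delay-seconds parsing involves, a minimal sketch follows. It reuses the helper name `parse_retry_after` from the entry above, but the signature and body are assumptions rather than the SDK's code; the HTTP-date form of the header is deliberately not handled and yields `None`.

```rust
use std::time::Duration;

/// Parse the delay-seconds form of a `Retry-After` header value.
/// Returns `None` for the HTTP-date form or malformed input (assumed behavior).
fn parse_retry_after(value: &str) -> Option<Duration> {
    let trimmed = value.trim();
    // delay-seconds is a non-negative decimal integer; reject signs and other
    // characters that `u64::from_str` might otherwise accept (e.g. "+30").
    if trimmed.is_empty() || !trimmed.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    trimmed.parse::<u64>().ok().map(Duration::from_secs)
}

fn main() {
    match parse_retry_after("30") {
        Some(wait) => println!("rate limited; retry in {}s", wait.as_secs()),
        None => println!("no usable Retry-After; fall back to backoff"),
    }
}
```

A caller would typically sleep for the returned duration before retrying and fall back to its own backoff policy when the result is `None`.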
[0.8.13] - 2025-12-08
- Critical Bug Fix: Streaming requests now use `total_timeout` instead of `connection_timeout`
  - The `chat_stream()` methods were incorrectly using `.timeout(self.connection_timeout)` on the request builder
  - This overrode the client’s `total_timeout` (default 600s) with `connection_timeout` (default 10s)
  - Caused streaming to fail after ~10 seconds even with longer timeout configurations
  - Affects: `ClaudeProvider`, `OpenAIProvider`, `OllamaProvider`
  - Added regression tests to prevent this issue from recurring
- All providers now use centralized `build_http_client()` helper
  - Previously, only Claude, OpenAI, and Ollama used the helper
  - Now all providers use consistent timeout configuration with `read_timeout` for streaming
  - Fixed providers: `FireworksProvider`, `GroqProvider`, `PerplexityProvider`, `TogetherProvider`, `LmStudioProvider`, `MistralProvider`, `OpenRouterProvider`
  - `MistralProvider` and `OpenRouterProvider` were using `Client::new()` which ignored ALL timeouts
  - Other providers were missing `read_timeout` which is critical for streaming
[0.8.12] - 2025-12-08
- Critical Bug Fix: Timeout configurations are now properly applied to HTTP clients
  - Previously, `connection_timeout`, `stream_read_timeout`, and `total_timeout` values were stored but never applied to the underlying `reqwest::Client`
  - This caused requests to use reqwest’s default timeouts instead of user-configured values
  - Streaming requests would fail prematurely (~10s) regardless of timeout configuration
  - Affects: `ClaudeProvider`, `OpenAIProvider`, `OllamaProvider`
  - Root cause: timeout was set on the request builder instead of the HTTP client builder
- `build_http_client()` helper function: Centralized HTTP client creation with proper timeout application
  - Ensures all providers use consistent timeout configuration
  - Prevents this class of bug in future provider implementations
  - See: `docs/PROVIDER_IMPLEMENTATION.md`
- Timeout configuration regression tests: Comprehensive test suite to catch timeout misconfigurations
  - Tests for all three timeout types across all providers
  - `verify_provider_respects_timeout()` helper for testing new providers
  - Behavioral tests with mock servers and configurable delays
Changed
- Upgraded `reqwest` from 0.11 to 0.12: Enables `read_timeout` support for streaming responses
  - `read_timeout` applies per-chunk during response body reading
  - Critical for LLM streaming where there are pauses between tokens
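The difference between a per-chunk read timeout and a whole-request timeout is easier to see with numbers. The following is a network-free simulation, not the SDK's client code: `simulate_stream` and `Outcome` are hypothetical names, and the model simply assumes the read timer restarts after every chunk while the total timer runs from the start of the request.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Completed,
    /// No data arrived within `read_timeout` while waiting for this chunk index.
    ReadTimeout { chunk: usize },
    /// The whole-request deadline passed while waiting for this chunk index.
    TotalTimeout { chunk: usize },
}

/// Simulate a stream whose chunks arrive after the given gaps.
fn simulate_stream(gaps: &[Duration], read_timeout: Duration, total_timeout: Duration) -> Outcome {
    let mut elapsed = Duration::ZERO;
    for (chunk, gap) in gaps.iter().enumerate() {
        let arrival = elapsed + *gap;
        let read_deadline = elapsed + read_timeout;
        // Whichever deadline comes first ends the request.
        if arrival > total_timeout && total_timeout <= read_deadline {
            return Outcome::TotalTimeout { chunk };
        }
        if arrival > read_deadline {
            return Outcome::ReadTimeout { chunk };
        }
        elapsed = arrival;
    }
    Outcome::Completed
}

fn main() {
    // Thirty chunks, two seconds apart: a healthy 60-second stream.
    let steady = vec![Duration::from_secs(2); 30];
    let read = Duration::from_secs(30);

    // A 10s value applied as the whole-request timeout cuts the stream short.
    println!("{:?}", simulate_stream(&steady, read, Duration::from_secs(10)));
    // A 600s total with a 30s per-chunk read timeout lets it finish.
    println!("{:?}", simulate_stream(&steady, read, Duration::from_secs(600)));
}
```

Under this model, a short whole-request timeout fails even a healthy stream, while the per-chunk timer only fires when the stream actually stalls.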
[0.8.11] - 2025-12-07
- Loopback Provider: Test/development provider that echoes back user messages
  - Useful for testing without API calls
  - Configurable response delays for timeout testing
[0.8.10] - 2025-11-25
Breaking Changes
- LLMProvider Trait: Added `get_capabilities()` method to the trait
  - All custom provider implementations must now implement this method
  - Returns `ProviderCapabilities` struct describing what the provider supports
- ChatResponse: Added new fields
  - `warnings: Vec<ParameterWarning>` - Parameter adaptation warnings
  - `logprobs: Option<LogprobsData>` - Token probability data (when requested)
- New Parameters in ChatRequest:
  - `stop: Option<Vec<String>>` - Stop sequences for generation
  - `presence_penalty: Option<f32>` - Repetition penalty (-2.0 to 2.0)
  - `frequency_penalty: Option<f32>` - Frequency penalty (-2.0 to 2.0)
  - `seed: Option<u64>` - Deterministic generation seed
  - `logprobs: Option<bool>` - Enable token probabilities
  - `top_logprobs: Option<u8>` - Number of top alternatives (0-20)
  - `response_format: Option<ResponseFormat>` - JSON mode control
  - `provider_options: Option<ProviderOptions>` - Provider-specific parameters
- Parameter Adapter: Graceful degradation system
  - Automatically adapts parameters to provider capabilities
  - Truncates stop sequences to provider limits with warnings
  - Ignores unsupported parameters with info-level warnings
  - Falls back to prompt-based JSON mode when native not supported
- Provider Capabilities System:
  - `ProviderCapabilities` struct for querying provider support
  - Runtime capability detection for all parameters
  - Enables write-once code that adapts to any provider
- New Providers (6 total):
  - `MistralProvider` - Mistral AI API
  - `OpenRouterProvider` - OpenRouter unified API (100+ models)
  - `TogetherProvider` - Together AI (open source models)
  - `GroqProvider` - Groq (ultra-fast inference)
  - `FireworksProvider` - Fireworks AI (optimized inference)
  - `PerplexityProvider` - Perplexity AI (search-augmented)
- MCP Auto-Discovery:
  - Automatic server discovery from `~/.config/mcp/servers.json`
  - `RUSTYLLM_MCP_CONFIG` environment variable override
  - `McpProvider::discover_servers()` API
  - `McpProvider::builder().discover_server("name")` pattern
- Supporting Types:
  - `ResponseFormat` enum (Text, Json, JsonSchema)
  - `ProviderOptions` enum with `OllamaOptions` variant
  - `ParameterWarning` and `WarningSeverity` for warnings
  - `LogprobsData`, `TokenLogprob`, `TopLogprob` for token probabilities
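The graceful-degradation behavior of the parameter adapter listed in this release can be sketched as a function from requested parameters and provider capabilities to adapted parameters plus warnings. Everything below is illustrative: the type names echo this changelog, but their fields (`max_stop_sequences`, `supports_seed`, `parameter`, `message`), the `ChatParams` struct, and the `adapt` function are assumptions, not the SDK's definitions.

```rust
/// Assumed subset of provider capabilities.
#[derive(Debug, Clone, Copy)]
struct ProviderCapabilities {
    max_stop_sequences: usize,
    supports_seed: bool,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum WarningSeverity {
    Info,
    Warning,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct ParameterWarning {
    parameter: &'static str,
    severity: WarningSeverity,
    message: String,
}

/// Hypothetical slice of the request parameters relevant to this sketch.
#[derive(Debug, Clone, Default)]
struct ChatParams {
    stop: Option<Vec<String>>,
    seed: Option<u64>,
}

/// Adapt parameters to what the provider supports, recording what changed.
fn adapt(mut params: ChatParams, caps: ProviderCapabilities) -> (ChatParams, Vec<ParameterWarning>) {
    let mut warnings = Vec::new();

    // Too many stop sequences: keep the first N and warn.
    if let Some(stop) = params.stop.as_mut() {
        if stop.len() > caps.max_stop_sequences {
            warnings.push(ParameterWarning {
                parameter: "stop",
                severity: WarningSeverity::Warning,
                message: format!(
                    "truncated {} stop sequences to {}",
                    stop.len(),
                    caps.max_stop_sequences
                ),
            });
            stop.truncate(caps.max_stop_sequences);
        }
    }

    // Unsupported parameter: drop it with an info-level note.
    if params.seed.is_some() && !caps.supports_seed {
        params.seed = None;
        warnings.push(ParameterWarning {
            parameter: "seed",
            severity: WarningSeverity::Info,
            message: "seed is not supported by this provider and was ignored".to_string(),
        });
    }

    (params, warnings)
}

fn main() {
    let params = ChatParams {
        stop: Some(vec!["\n\n".into(), "END".into(), "###".into()]),
        seed: Some(7),
    };
    let caps = ProviderCapabilities { max_stop_sequences: 2, supports_seed: false };
    let (adapted, warnings) = adapt(params, caps);
    println!("{adapted:?}");
    for w in &warnings {
        println!("[{:?}] {}: {}", w.severity, w.parameter, w.message);
    }
}
```

Returning warnings alongside the adapted request, rather than failing, is what allows the same calling code to run against providers with different limits.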
Changed
- All existing providers updated to implement `get_capabilities()`
- All providers now integrate with ParameterAdapter for graceful degradation
- MockProvider enhanced with full capability support for testing
- Test race conditions in env_detector and MCP tests (mutex serialization)
- Test assertions for provider metadata flexibility
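For the MCP auto-discovery added in this release, the config-file lookup order can be sketched as follows. The environment variable name and default path come from the entries above; the function `resolve_mcp_config`, its signature, the use of `HOME`, and the treatment of an empty override as unset are assumptions for illustration, and the SDK's actual resolution logic may differ.

```rust
use std::env;
use std::path::PathBuf;

/// Pick the MCP server config path: an explicit override wins, otherwise
/// fall back to `<home>/.config/mcp/servers.json`. (Hypothetical helper.)
fn resolve_mcp_config(env_override: Option<String>, home: Option<String>) -> Option<PathBuf> {
    // Assumption: an empty override is treated the same as an unset one.
    if let Some(path) = env_override.filter(|p| !p.is_empty()) {
        return Some(PathBuf::from(path));
    }
    home.map(|h| {
        PathBuf::from(h)
            .join(".config")
            .join("mcp")
            .join("servers.json")
    })
}

fn main() {
    // Inputs are read here and passed in, which keeps the resolver easy to test.
    let path = resolve_mcp_config(env::var("RUSTYLLM_MCP_CONFIG").ok(), env::var("HOME").ok());
    match path {
        Some(p) => println!("would read MCP servers from {}", p.display()),
        None => println!("no MCP config location could be determined"),
    }
}
```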
[0.8.9] - 2025-11-25
Changed
- Updated plans for new versioning with open source beta target (pre-v1.x.x)
- Cleaned up and tightened various compiler warnings
[0.8.8] - 2025-11-25
Changed
- Version bump for release preparation
[0.8.7] - 2025-11-25
Changed
- Release preparation updates
[0.8.6] - 2025-11-25
Changed
- Minor fixes and updates
[0.8.5] - 2025-11-25
Changed
- Release updates
[0.8.4] - 2025-11-24
Changed
- Post-reset stabilization
[0.8.3] - 2025-11-24
- Initial Tier-1 provider implementations (Mistral, OpenRouter, Together, Groq, Fireworks, Perplexity)
- MCP auto-discovery groundwork
[0.8.2] - 2025-11-24
- LiteLLM-style convenience API with automatic provider detection
- Graceful parameter degradation foundation
[0.8.1] - 2025-11-24
- Provider expansion preparation
- API refinements
[0.8.0] - 2025-11-24
Changed
- Version Reset: Reset version numbering from the earlier 2.x line into the pre-public development line
  - Pre-1.0 versioning (0.x.y) signals API is not yet stable per semantic versioning
  - Allows breaking changes in minor versions during development
  - Functionality from the earlier 2.x line carried forward unchanged
  - Historical 2.x changelog preserved in the internal archive
  - Previous 2.x release artifacts archived outside the public SDK release line
This was a version numbering reset only - no code functionality changed. Features, fixes, and improvements from the earlier 2.x line carried forward into the pre-public development line. The reset better reflected the library’s development stage and followed Rust ecosystem conventions for pre-release crates.
The older 2.x history is intentionally kept out of the public SDK changelog.