Competence Gate: gating tool-use on a small model's internal confidence signal instead of its verbalised one — Qwen3.5-4B, open weights [P]

Summary

I made a 10 MB LoRA adapter for Qwen3.5-4B plus a small orchestration layer. For each query, it decides whether to answer directly, search the web, or retrieve from your own local documents, and it refuses to make things up when it can't verify an answer. It runs locally (Apple Silicon / MLX, with a GGUF build for llama.cpp/Ollama). The motivation: small instruct models are poor at telling users how confident they really are. They can't verbalise their confidence and tend to say they are confident about everything.
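To make the routing idea concrete, here is a minimal sketch of a per-query gate. It is not the project's actual code: the post does not say how the internal confidence signal is extracted or how local-document queries are detected, so `confidence`, `about_local_docs`, the `0.7` threshold, and every function name below are hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Route(Enum):
    ANSWER = "answer_directly"
    WEB = "search_web"
    LOCAL = "retrieve_local_docs"


@dataclass
class GateInput:
    # Hypothetical: a score in [0, 1] derived from the model's internal state,
    # not from what the model says about its own confidence.
    confidence: float
    # Hypothetical: True when the query refers to the user's own documents.
    about_local_docs: bool


def choose_route(gate: GateInput, threshold: float = 0.7) -> Route:
    """Pick one of the three routes for a single query (assumed policy)."""
    if gate.about_local_docs:
        return Route.LOCAL
    if gate.confidence >= threshold:
        return Route.ANSWER
    return Route.WEB


REFUSAL = "I could not verify an answer to this."


def finalize(route: Route, draft: str, evidence: List[str]) -> str:
    """Refuse instead of guessing when a tool route returned no evidence."""
    if route is not Route.ANSWER and not evidence:
        return REFUSAL
    return draft


if __name__ == "__main__":
    route = choose_route(GateInput(confidence=0.35, about_local_docs=False))
    print(route.value)
    print(finalize(route, draft="Some drafted answer", evidence=[]))
```

In this sketch, the low-confidence query is routed to web search, and because the search produced no evidence the gate returns the refusal string rather than the unverified draft.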
