Three steps to trustworthy AI
Wrap any LLM
One line of code attaches the Verace metacognitive layer. The base model stays completely frozen; the layer adds nothing to behavior until it is trained.
Works with Llama, Mistral, Qwen, and any open-weight model.
Train the adapter
5,000 steps of lightweight training teach the nervous system to monitor the model's internal states.
< 7% parameter overhead. 15–25% inference overhead.
Deploy with confidence
Every response now carries calibrated uncertainty. Route high-confidence responses directly, hedge on medium, escalate low-confidence ones to a human.
Zero-latency detection: hallucinations are caught during generation, not after.
```python
from verace import enhance

model = enhance('meta-llama/Llama-3-70B')
response = model.generate(
    "Delete user account #4523"
)

if response.confidence < 0.5:
    ask_human()          # Agent pauses
else:
    execute(response)    # Confident
```

Know when to ask for help vs. proceed. That's the entire product.
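The two-way check above generalizes to the three-tier policy from step 3 (route, hedge, escalate). A minimal sketch, with assumed threshold values that would be tuned per deployment:

```python
# Hypothetical cutoffs for a three-tier routing policy over the
# calibrated confidence score; 0.8 and 0.5 are illustrative, not
# values prescribed by Verace.
HIGH, LOW = 0.8, 0.5

def route(confidence):
    """Map a calibrated confidence score to an action."""
    if confidence >= HIGH:
        return "execute"   # high confidence: respond directly
    if confidence >= LOW:
        return "hedge"     # medium: respond with an explicit caveat
    return "escalate"      # low: hand off to a human

print(route(0.92), route(0.65), route(0.31))
```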
Numbers that speak
Tested on TinyLlama 1.1B • 5,000 training steps • <7% parameter overhead
Head-to-Head Comparison
Base LLM vs Verace-Enhanced
| Metric | Base LLM | Verace | Change |
|---|---|---|---|
| Detection AUROC | 0.500 | 0.917 | +83% |
| F1 Score | 0.000 | 0.823 | from zero |
| Uncertainty Gap | 0.000 | 1.878 | from zero |
| Selective Acc @30% | 0.430 | 0.588 | +37% |
| NTP Accuracy | 47.7% | 51.2% | +3.5% |
| Perplexity | 10.40 | 9.80 | -6% |
The model doesn't just detect hallucinations; it actually gets better. When allowed to abstain on the 30% of queries where it is least confident, selective accuracy reaches 58.8%, versus a 43.0% baseline.
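For readers unfamiliar with the metric, here is how "selective accuracy @ 30%" is typically computed: sort answers by confidence, abstain on the lowest 30%, and score accuracy only on the answered rest. A minimal sketch with made-up toy data (not Verace's evaluation code):

```python
def selective_accuracy(confidences, correct, abstain_rate=0.30):
    """Accuracy over the most confident (1 - abstain_rate) fraction of answers."""
    ranked = sorted(zip(confidences, correct), key=lambda p: p[0], reverse=True)
    keep = int(len(ranked) * (1 - abstain_rate))  # number of queries answered
    answered = ranked[:keep]
    return sum(ok for _, ok in answered) / len(answered)

# Toy example: 10 queries with confidence scores and correctness flags.
conf = [0.95, 0.91, 0.88, 0.84, 0.80, 0.71, 0.62, 0.45, 0.33, 0.20]
ok   = [1,    1,    1,    0,    1,    1,    0,    0,    0,    0]
print(selective_accuracy(conf, ok))  # answers 7 of 10, abstains on 3
```

Because the abstained queries are exactly the ones where the model is least sure, accuracy on the answered set rises above the unconditional accuracy.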