Hi everyone, I’m an independent researcher working on a reproducible, open-source framework that measures how large language models express certainty — not just what they say.
After a year of cross-model experiments (GPT-5, Claude 4.5, Gemini, Grok), I found that AI systems naturally organize their answers into four distinct “epistemic types,” defined by measurable confidence ratios and convergence width. We call this framework IRIS Gate — it’s basically a map of knowledge reliability for AI output.
Core Idea
Every multi-model run produces a numerical confidence pattern that separates cleanly into:
Type | Confidence Ratio | What It Means | Human Action
0 – Crisis / Conditional | ≈ 1.26 | Known emergency logic; activates only with triggers | Trust if trigger
1 – Facts | ≈ 1.27 | Established knowledge | Trust
2 – Exploration | ≈ 0.49 | Emerging hypotheses | Verify
3 – Speculation | ≈ 0.11 | Unverifiable / future claims | Override
It’s like a real-time reliability gauge for AI reasoning. The system self-labels outputs as “Trust / Verify / Override,” which makes scientific and technical research auditable instead of opaque.
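As a rough illustration of what such a gate could look like in code (the thresholds are illustrative midpoints between the ratios in the table above, and the function and class names are hypothetical, not the actual iris_orchestrator.py API):

```python
# Hypothetical sketch of a confidence-ratio gate. Thresholds are illustrative
# midpoints between the reported ratios, not values from IRIS Gate itself.
from dataclasses import dataclass


@dataclass
class EpistemicLabel:
    type_id: int
    name: str
    action: str


def classify_confidence_ratio(ratio: float, triggered: bool = False) -> EpistemicLabel:
    """Map a multi-model confidence ratio to an epistemic type and human action."""
    if ratio >= 0.88:
        # Types 0 and 1 sit close together (~1.26 vs ~1.27); the reported
        # distinction is whether emergency triggers are active.
        if triggered:
            return EpistemicLabel(0, "Crisis / Conditional", "Trust if trigger")
        return EpistemicLabel(1, "Facts", "Trust")
    if ratio >= 0.30:
        return EpistemicLabel(2, "Exploration", "Verify")
    return EpistemicLabel(3, "Speculation", "Override")


print(classify_confidence_ratio(1.27))  # -> Facts / Trust
print(classify_confidence_ratio(0.11))  # -> Speculation / Override
```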
What's Been Done

• 7 full multi-model runs (49 convergence chambers, ~1,200 data points)
• Reproducibility bundle with SHA-256 checksums and citation metadata (a verification sketch follows below)
• Validation against a real biomedical use case (CBD mechanism discovery, including the VDAC1 paradox)
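For anyone picking up the bundle, verifying the checksums is standard SHA-256 checking. A minimal sketch is below; the manifest name "checksums.sha256" and the "<hash>  <path>" line format are my assumptions, not confirmed details of the repo:

```python
# Minimal SHA-256 verification sketch. "checksums.sha256" is an assumed
# manifest name with "<hash>  <relative path>" lines, as sha256sum emits.
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_bundle(manifest: Path) -> bool:
    """Compare every file listed in the manifest against its recorded hash."""
    ok = True
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        actual = sha256_of(manifest.parent / name.strip())
        if actual != expected:
            print(f"MISMATCH: {name}")
            ok = False
    return ok


# verify_bundle(Path("iris_bundle/checksums.sha256"))
```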
What I Need Help With
I'd love collaborators who can help with:

1. Independent replication of the confidence-ratio experiment on other models
2. Code review and architecture feedback on the iris_orchestrator.py classifier (Python)
3. Statistical validation: is the 4-type separation significant under bootstrapping? (A rough sketch of what I mean follows below.)
4. Open-science publication / peer-review advice
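On point 3, here is a minimal sketch of the kind of bootstrap check I have in mind. The synthetic data and the choice of silhouette score as the separation statistic are my assumptions for illustration, not the project's existing method:

```python
# Bootstrap sketch: resample (ratio, width, label) rows with replacement and
# check whether the 4-type separation statistic stays clearly above zero.
# Features and labels here are synthetic stand-ins, not real IRIS Gate data.
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Stand-in for per-output (confidence_ratio, convergence_width) features and
# an assigned epistemic type 0-3. Real rows would come from the 7 runs.
features = rng.normal(size=(1200, 2))
labels = rng.integers(0, 4, size=1200)

n_boot = 1000
scores = np.full(n_boot, np.nan)
for i in range(n_boot):
    idx = rng.choice(len(labels), size=len(labels), replace=True)
    if len(np.unique(labels[idx])) < 2:
        continue  # silhouette needs at least two distinct labels
    scores[i] = silhouette_score(features[idx], labels[idx])

lo, hi = np.nanpercentile(scores, [2.5, 97.5])
print(f"95% bootstrap CI for type separation: [{lo:.3f}, {hi:.3f}]")
```

If the real data's confidence interval sits well above zero (and above what shuffled labels would give), that would support the claim that the four types are genuinely separated rather than an artifact of binning.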
Everything is MIT-licensed and reproducible; repo + data bundles are ready.
Links

• GitHub: (add your repo link here)
• Full documentation: EPISTEMIC_MAP_COMPLETE.md
• Paper draft (preprint): link if available
Why It Matters
LLMs are powerful but opaque. This project aims to give them an epistemic dashboard — a way to quantify what kind of knowing each answer represents. If it works, it could improve safety, reproducibility, and trust in scientific AI systems.
⸻
If you’re into interpretability, epistemology, or reproducible AI science — come help kick the tires.