Understanding the AI Output
How to read and interpret the results of an AI prediction.
When the AI pipeline completes, you get a detailed breakdown of the prediction. Here's what each component means.
Probability
The headline result: a calibrated probability from 0–100% representing how likely the AI thinks the outcome is.
This is not a binary YES/NO — it's a nuanced estimate. A probability of 72% means the AI believes there's roughly a 72% chance the event will happen.
| Range | Interpretation |
|---|---|
| 80–100% | Strong conviction — clear evidence points one way |
| 60–79% | Likely but uncertain — evidence favours one side |
| 40–59% | Toss-up — evidence is mixed or insufficient |
| 20–39% | Unlikely — evidence points the other way |
| 0–19% | Strong conviction against |
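The bands in the table above can be expressed as a small lookup. This is an illustrative sketch; the function name `interpret_probability` and the exact boundary handling are assumptions, not part of the product's API.

```python
def interpret_probability(p: float) -> str:
    """Map a probability (0-100) to the interpretation bands above.

    Hypothetical helper -- the labels mirror the table; the thresholds
    assume the bands are inclusive at their lower edge.
    """
    if not 0 <= p <= 100:
        raise ValueError("probability must be between 0 and 100")
    if p >= 80:
        return "Strong conviction"
    if p >= 60:
        return "Likely but uncertain"
    if p >= 40:
        return "Toss-up"
    if p >= 20:
        return "Unlikely"
    return "Strong conviction against"

print(interpret_probability(72))  # falls in the 60-79% band
```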
Confidence Interval
A range around the probability that captures uncertainty. The default interval is ±15 points around the stated probability.
For example, a probability of 70% with a confidence interval of [55%, 85%] means the AI is fairly confident the true probability falls in that range.
A wider interval means more uncertainty. A narrower interval means the agents are in closer agreement.
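The default interval can be sketched as follows. The docs only state the ±15-point default width; clamping the bounds to the 0–100 scale is an assumption.

```python
def default_interval(probability: float, width: float = 15.0) -> tuple[float, float]:
    """Default +/-15-point confidence interval around a probability.

    Clamping to [0, 100] is an assumption -- a probability can't
    sensibly have an interval bound outside that scale.
    """
    low = max(0.0, probability - width)
    high = min(100.0, probability + width)
    return (low, high)

print(default_interval(70))  # (55.0, 85.0) -- matches the example above
```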
Disagreement Score (Eroteme only)
A value from 0 to 1 measuring how much the 8 agents diverge in their probability estimates.
- Calculated as the standard deviation of all agent probabilities, divided by 50
- 0.0–0.2 — Strong consensus, agents broadly agree
- 0.2–0.4 — Moderate disagreement, some agents see it differently
- 0.4+ — High disagreement, agents are significantly split
When the disagreement score is high, pay attention to the contrarian and bear case agents — they may have identified risks the majority missed.
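The calculation above can be sketched in a few lines. Whether Eroteme uses the population or sample form of the standard deviation is an assumption; the population form is shown here because it maps cleanly onto the 0-to-1 range (the maximum possible spread of values in [0, 100] has a population standard deviation of 50).

```python
import statistics

def disagreement_score(agent_probs: list[float]) -> float:
    """Standard deviation of the agents' probabilities, divided by 50.

    Population standard deviation is an assumption -- the docs only
    state "standard deviation of all agent probabilities".
    """
    return statistics.pstdev(agent_probs) / 50

# Eight agents broadly agreeing -> low score (strong consensus)
print(disagreement_score([68, 70, 72, 71, 69, 70, 73, 67]))
```

An evenly split panel (half at 0%, half at 100%) yields the maximum score of 1.0, which is why dividing by 50 bounds the score to [0, 1].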
Per-Agent Breakdown (Eroteme only)
The individual results of all 8 agents are shown:
- Role — The agent's function (e.g. Bull Case, Bear Case, Contrarian, News Researcher)
- Provider — Which AI model (Claude, ChatGPT, Gemini, Grok, Perplexity)
- Probability — That agent's individual probability estimate (0–100%)
- Confidence — How confident the agent is in its own estimate
- Reasoning — The key arguments and evidence that agent considered
This transparency lets you:
- See which agents are bullish vs bearish
- Read the contrarian perspective on why consensus might be wrong
- Judge the quality of evidence each agent surfaced
- Understand what the research agents found vs how the analysts interpreted it
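The per-agent fields listed above can be modelled as a simple record. The field names and types here are illustrative, not the product's actual schema.

```python
from dataclasses import dataclass

@dataclass
class AgentResult:
    """One agent's entry in the per-agent breakdown (illustrative schema)."""
    role: str           # e.g. "Bull Case", "Bear Case", "Contrarian"
    provider: str       # e.g. "Claude", "Gemini", "Perplexity"
    probability: float  # that agent's estimate, 0-100%
    confidence: float   # the agent's self-reported confidence
    reasoning: str      # key arguments and evidence considered

bear = AgentResult("Bear Case", "Claude", 35.0, 0.7,
                   "Downside risks outweigh the bullish evidence.")
print(bear.role, bear.probability)
```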
Key Factors
A list of the primary drivers behind the prediction, synthesised by the meta-judge from all 8 agents' outputs. These are the most important facts and trends that inform the probability estimate.
What Could Be Wrong
The meta-judge explicitly identifies the biggest risk to its estimate — the scenario or factor most likely to make the prediction incorrect. This is not a disclaimer; it's a genuine assessment of where the analysis might fail.
Standard Tier Output
For Standard tier predictions (single model), the output is simpler:
- Probability (0–100%)
- Reasoning — The model's analysis
- Key factors — Main drivers
There is no disagreement score, confidence interval, or per-agent breakdown, since only one model was used.
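For comparison, a Standard tier result might look like the shape below. The field names are assumptions for illustration; only the three components listed above are present.

```python
# Illustrative shape of a Standard tier result -- field names are assumptions.
standard_result = {
    "probability": 62.0,              # 0-100%
    "reasoning": "The model's analysis goes here.",
    "key_factors": ["main driver 1", "main driver 2"],
}
# Note: no disagreement score, confidence interval, or per-agent breakdown.
print(sorted(standard_result))
```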