
Understanding the AI Output

How to read and interpret the results of an AI prediction.

When the AI pipeline completes, you get a detailed breakdown of the prediction. Here's what each component means.

Probability

The headline result: a calibrated probability from 0–100% representing how likely the AI thinks the outcome is.

This is not a binary YES/NO — it's a nuanced estimate. A probability of 72% means the AI believes there's roughly a 72% chance the event will happen.

Range      Interpretation
80–100%    Strong conviction — clear evidence points one way
60–79%     Likely but uncertain — evidence favours one side
40–59%     Toss-up — evidence is mixed or insufficient
20–39%     Unlikely — evidence points the other way
0–19%      Strong conviction against
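
If you want to bucket probabilities programmatically, a minimal sketch that mirrors the bands above might look like this (a hypothetical helper, not part of the Eroteme API):

```python
def interpret_probability(probability: float) -> str:
    """Map a 0–100 probability to the interpretation bands above.

    Hypothetical helper for illustration; not part of the Eroteme API.
    """
    if probability >= 80:
        return "Strong conviction"
    if probability >= 60:
        return "Likely but uncertain"
    if probability >= 40:
        return "Toss-up"
    if probability >= 20:
        return "Unlikely"
    return "Strong conviction against"


print(interpret_probability(72))  # "Likely but uncertain"
```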

Confidence Interval

A range around the probability that captures uncertainty. The default interval is ±15 points around the stated probability.

For example, a probability of 70% with a confidence interval of [55%, 85%] means the AI is fairly confident the true probability falls in that range.

A wider interval means more uncertainty. A narrower interval means the agents are in closer agreement.
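
The interval is included in the output itself, but if you wanted to reproduce the default width, a minimal sketch might look like this (clamping the bounds to the 0–100 range is an assumption, not documented behaviour):

```python
def default_confidence_interval(probability: float, width: float = 15.0) -> tuple[float, float]:
    """Return the default ±15-point interval around a stated probability.

    Illustrative only; clamping to 0–100 is an assumption.
    """
    return (max(0.0, probability - width), min(100.0, probability + width))


print(default_confidence_interval(70))  # (55.0, 85.0)
```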

Disagreement Score (Eroteme only)

A value from 0 to 1 measuring how much the 8 agents diverge in their probability estimates.

  • Calculated as the standard deviation of all agent probabilities, divided by 50 (see the sketch after this list)
  • 0.0–0.2 — Strong consensus, agents broadly agree
  • 0.2–0.4 — Moderate disagreement, some agents see it differently
  • 0.4+ — High disagreement, agents are significantly split
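
A minimal sketch of that calculation, assuming the population standard deviation is used (the docs don't specify sample vs. population):

```python
import statistics


def disagreement_score(agent_probabilities: list[float]) -> float:
    """Standard deviation of the agent probabilities (0–100), divided by 50.

    Illustrative reconstruction of the documented formula; population
    standard deviation is an assumption.
    """
    return statistics.pstdev(agent_probabilities) / 50


# Eight agent estimates, one of which is a strong outlier:
print(disagreement_score([70, 72, 68, 75, 40, 65, 71, 69]))  # ~0.21, moderate disagreement
```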

When the disagreement score is high, pay attention to the contrarian and bear case agents — they may have identified risks the majority missed.

Per-Agent Breakdown (Eroteme only)

Each of the 8 agents' results is shown individually (a code sketch of this shape follows the list):

  • Role — The agent's function (e.g. Bull Case, Bear Case, Contrarian, News Researcher)
  • Provider — Which AI model (Claude, ChatGPT, Gemini, Grok, Perplexity)
  • Probability — That agent's individual probability estimate (0–100%)
  • Confidence — How confident the agent is in its own estimate
  • Reasoning — The key arguments and evidence that agent considered
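
A hypothetical sketch of one per-agent entry as a data structure (the field names are assumptions, not the actual response schema):

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    """Illustrative shape of a per-agent entry; field names are assumptions."""
    role: str           # e.g. "Bull Case", "Bear Case", "Contrarian", "News Researcher"
    provider: str       # e.g. "Claude", "ChatGPT", "Gemini", "Grok", "Perplexity"
    probability: float  # that agent's individual estimate, 0–100
    confidence: float   # how confident the agent is in its own estimate
    reasoning: str      # key arguments and evidence the agent considered
```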

This transparency lets you:

  • See which agents are bullish vs bearish
  • Read the contrarian perspective on why consensus might be wrong
  • Judge the quality of evidence each agent surfaced
  • Understand what the research agents found vs how the analysts interpreted it

Key Factors

A list of the primary drivers behind the prediction, synthesised by the meta-judge from all 8 agents' outputs. These are the most important facts and trends that inform the probability estimate.

What Could Be Wrong

The meta-judge explicitly identifies the biggest risk to its estimate — the scenario or factor most likely to make the prediction incorrect. This is not a disclaimer; it's a genuine assessment of where the analysis might fail.

Standard Tier Output

For Standard tier predictions (single model), the output is simpler:

  • Probability (0–100%)
  • Reasoning — The model's analysis
  • Key factors — Main drivers

There is no disagreement score, confidence interval, or per-agent breakdown, since only one model was used.
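
For comparison, the Standard tier result could be sketched with just three fields (again hypothetical names, not the actual schema):

```python
from dataclasses import dataclass


@dataclass
class StandardPrediction:
    """Illustrative Standard tier shape; field names are assumptions."""
    probability: float      # 0–100
    reasoning: str          # the model's analysis
    key_factors: list[str]  # main drivers behind the estimate
```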
