Methodology

How the validation score is computed

The number on every sighting page isn't a probability of anything exotic. It's a measure of how well-evidenced the report is — here's exactly how we get there.

June 2, 2026 · By Vera Halliday · 5 min read

Every sighting on Sighted carries a validation score between 0 and 100. It sits in the top-right of the page, next to the status badge, and it's the single most-asked-about number in our product. People want to know what it means, where it comes from, and whether they can argue with it.

Short answer to the last one: yes, please. The whole point of publishing the formula is to invite the argument. Here's how it works.

What the score is actually measuring

The validation score does not tell you a sighting was extraterrestrial. It doesn't tell you it was a drone, or a Starlink train, or weather, either. It tells you one thing: how confident we are that the event happened as the witness described it.

That's a narrower question than most readers expect, and it's narrower on purpose. We don't have a way to score "exotic origin" — nobody honestly does. We do have a way to score evidence quality, and that's where we keep the conversation.

A 95 on a clearly identifiable Starlink train is still a 95. The witnesses, the timestamps, and the trajectory all line up, so the score is telling you the report is well-evidenced. Whether the explanation is interesting is a separate column.

The four inputs

The score is a weighted blend of four signals. None of them are magic, and none of them are weighted to favour any particular conclusion.
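As a rough illustration of what a weighted blend like this can look like, here is a minimal Python sketch. It is not the production formula. The weights, the `blend` function, and the signal ranges are all assumptions made up for the example; the only thing taken from this post is that corroboration is the heaviest input, that diversity scales corroboration, and that bad media can count against a report.

```python
# Hypothetical weights. The post only says corroboration is the heaviest input;
# the actual numbers are not given here and these are placeholders.
WEIGHTS = {"corroboration": 0.5, "media": 0.25, "investigator": 0.25}


def blend(corroboration: float, media: float, investigator: float,
          diversity_multiplier: float = 1.0) -> float:
    """Blend the signals into a 0-100 score.

    corroboration and investigator are assumed to lie in [0, 1].
    media is assumed to lie in [-1, 1], because compromised media
    can pull a report below where it would sit with no media at all.
    """
    # Diversity is modelled as a multiplier on corroboration, not as a separate term.
    adjusted = min(1.0, corroboration * diversity_multiplier)
    raw = (WEIGHTS["corroboration"] * adjusted
           + WEIGHTS["media"] * media
           + WEIGHTS["investigator"] * investigator)
    # Clamp to [0, 1] before scaling so the result stays on the 0-100 scale.
    return round(100 * max(0.0, min(1.0, raw)), 1)
```

With these made-up weights, `blend(0.8, 0.0, 0.0)` gives 40.0, and the same report with compromised media, `blend(0.8, -1.0, 0.0)`, drops to 15.0.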

1. Witness corroboration

This is the heaviest of the four by a wide margin. We're asking: how many independent observers filed reports of the same event inside a tight spatial and temporal window?

"Independent" is doing work in that sentence. A friend group posting the same video to a Discord and then individually filing reports doesn't count as independent corroboration — they saw the same source, not the same event. We look at submission timing, geographic spread, and source diversity to score this honestly.
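To make the independence idea concrete, here is a minimal Python sketch under stated assumptions. The `Report` fields, the `source_group` label, and the 50 km / 30 minute window are hypothetical, chosen only for illustration. The sketch counts distinct source groups inside the window, so a friend group that all watched the same clip collapses to a single observer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Report:
    witness_id: str
    source_group: str  # hypothetical label: the community or shared clip a witness came from
    lat: float
    lon: float
    seen_at: datetime


def distance_km(a: Report, b: Report) -> float:
    """Great-circle distance between two reports (haversine formula)."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def independent_observers(anchor: Report, reports: list[Report],
                          max_km: float = 50.0,
                          max_gap: timedelta = timedelta(minutes=30)) -> int:
    """Count distinct source groups reporting near the anchor in space and time."""
    groups = {
        r.source_group
        for r in reports
        if distance_km(anchor, r) <= max_km
        and abs(r.seen_at - anchor.seen_at) <= max_gap
    }
    return len(groups)
```

Three reports from one Discord plus one from a nearby pilot would count as two independent observers here, not four. A real check would also need the submission-timing and geographic-spread signals mentioned above, which this sketch leaves out.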

2. Media weight

Does the report come with photo or video? What did the AI pre-screen say? Was the file's metadata consistent with the timestamp on the report, or did someone upload a five-year-old clip claiming it happened last Tuesday?

Media is a strong positive signal when it's clean. It's a strong negative signal when it isn't. A report whose attached video has been visibly cropped, resized, or stripped of EXIF lands below where it would have without the media at all.
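The signed behaviour described above can be sketched in a few lines of Python. The `MediaCheck` fields and the +1 / 0 / -1 magnitudes are assumptions for illustration, and the AI pre-screen is left out entirely. What the sketch preserves is the ordering: clean media beats no media, and compromised media lands below no media.

```python
from dataclasses import dataclass


@dataclass
class MediaCheck:
    has_media: bool
    metadata_matches_report: bool = False  # file timestamp agrees with the report
    visibly_altered: bool = False          # cropped, resized, or stripped of EXIF


def media_signal(check: MediaCheck) -> float:
    """Return a signal in [-1, 1]; 0.0 means no media was attached."""
    if not check.has_media:
        return 0.0
    if check.visibly_altered or not check.metadata_matches_report:
        # Hypothetical magnitude: the point is only that it is below zero.
        return -1.0
    return 1.0
```

A five-year-old clip uploaded as last Tuesday's sighting would fail the metadata check and come back negative, which is worse than attaching nothing.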

3. Investigator review

Verified investigators — usually with backgrounds in aviation, atmospheric science, or law enforcement — can sign off on a report after their own checks. Investigator review is the only input that can push a score above the high 80s on its own, and even then it caps before 100. We reserve 100 for cases that have all four signals locked: corroboration, clean media, investigator review, and source diversity. Those are rare.
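One way to read the caps described above, sketched in Python. The thresholds 89 and 99 are hypothetical stand-ins for "the high 80s" and "caps before 100"; the exact values are not given in this post, and `apply_caps` is an illustrative name, not a real function in the product.

```python
NO_REVIEW_CAP = 89.0  # hypothetical stand-in for "the high 80s"
REVIEWED_CAP = 99.0   # hypothetical stand-in for "caps before 100"


def apply_caps(raw_score: float, investigator_reviewed: bool,
               all_four_locked: bool) -> float:
    """Clamp a raw score according to which signals are present.

    all_four_locked implies investigator review, so it is checked first.
    """
    if all_four_locked:
        cap = 100.0
    elif investigator_reviewed:
        cap = REVIEWED_CAP
    else:
        cap = NO_REVIEW_CAP
    return max(0.0, min(raw_score, cap))
```

Under this reading, a report with strong corroboration and clean media but no investigator sign-off tops out at the lower cap no matter how high its raw score runs.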

4. Source diversity

Were the witnesses pulled from one online community, or do they span unrelated ones? A single subreddit producing twelve reports of the same orb is weighted very differently from a pilot, a hiker, and an off-duty police officer producing three. Diversity is a multiplier on corroboration, not a separate score, but it can swing the corroboration weight by a factor of two.
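Here is a minimal Python sketch of a diversity multiplier with a factor-of-two swing. The 0.5 to 1.0 range and the "share of distinct communities" measure are assumptions; the post only states the size of the swing, not how diversity is measured.

```python
def diversity_multiplier(communities: list[str]) -> float:
    """Map the witnesses' communities to a multiplier in [0.5, 1.0].

    communities holds one entry per report, naming where that witness
    came from. The highest multiplier is twice the lowest.
    """
    if not communities:
        return 0.5
    # Fraction of reports that come from a distinct community.
    share_distinct = len(set(communities)) / len(communities)
    return 0.5 + 0.5 * share_distinct
```

Twelve reports from one community score about 0.54 here, while a pilot, a hiker, and an off-duty police officer score the full 1.0. A lone report also scores 1.0 under this measure, which is harmless only because its corroboration is already minimal.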

What the score deliberately doesn't do

A few things readers expect that we left out on purpose:

  • It doesn't decay with age. A well-evidenced sighting from 2014 keeps its score. Memories fade, but corroborating media doesn't.
  • It doesn't track community sentiment. Validations and downvotes are visible on the page, but they don't feed into the score. We've watched what voting-driven scores do to other platforms and we'd rather not.
  • It isn't normalised against the category. A 70 on a triangle case is the same evidentiary bar as a 70 on a lights case. We don't grade on a curve.

When the score is wrong

It will be. The formula is a model, and like any model it sometimes catches up to reality late. When that happens, the right move is usually to add more evidence to the report — an investigator note, an additional witness, a corroborating media file — and let the score recompute. The recompute is incremental: once a report is published, the score is recalculated on every new witness or material addition, and the change is logged so you can see how it moved.
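The recompute-and-log behaviour can be pictured with a short Python sketch. `PublishedReport`, `ScoreChange`, and the `scorer` callback are hypothetical names for illustration; the scoring function itself is passed in, so nothing here assumes what the real formula is.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ScoreChange:
    reason: str  # what was added, e.g. "investigator note"
    old: float
    new: float


@dataclass
class PublishedReport:
    evidence: list[str] = field(default_factory=list)
    score: float = 0.0
    history: list[ScoreChange] = field(default_factory=list)

    def add_evidence(self, item: str, scorer: Callable[[list[str]], float]) -> None:
        """Attach evidence, recompute the score, and log how it moved."""
        self.evidence.append(item)
        new_score = scorer(self.evidence)
        self.history.append(ScoreChange(item, self.score, new_score))
        self.score = new_score
```

Each call appends one entry to `history`, so a reader can trace every movement of the score back to the specific addition that caused it.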

If you think a specific score is wrong, the report's flag menu has a "score disputed" option. That goes to the investigations queue, not to a public comment thread. We'd rather quietly fix it than argue in front of an audience.

Why we show the number at all

Because the alternative is people fighting about credibility in the comments without a shared anchor. Surfacing the score puts a number in the room. You can disagree with the weighting — we publish it precisely so the community can pull on it — but you can't claim we hid the methodology behind a black box. The whole formula fits in this post.

Filed under: Methodology
About the author
Vera Halliday · Editor-in-chief, Field Investigations

Vera runs editorial and field investigations for Sighted. Her background is investigative journalism: before Sighted, she covered defence, aviation incidents, and the long, unglamorous work of cross-checking witness statements. She's a skeptic by trade and a writer by choice. Most of what she publishes is about how to be wrong less often, not about what's out there.