
how AI image detection works: probability, not proof

by Tuan Hoang · detection lead · last reviewed 2026-07-03
a probability, never a proof.

an AI image detector doesn’t ‘see’ a fake — it measures how strongly an image’s statistical patterns resemble the output of known generators, and reports a probability. anyone promising a clean yes-or-no is selling the part the math can’t deliver.

the certainty trap

binary certainty is the first thing to unlearn. a detector that returns a flat “real” or “fake” hasn't solved the problem; it has hidden the interesting part: how sure, based on what, and wrong how often. the Reuters Institute's guidance for newsrooms (April 2024) is blunt about it: detection tools return a probability, their signal is degraded by cropping and re-compression, and a clean result does not prove a file is genuine. and when a vendor does promise certainty, regulators have started checking. in 2025 the U.S. FTC ordered Workado to stop advertising its AI Content Detector as roughly 98% accurate, after the agency's complaint alleged that independent testing put it at 53% on general-purpose content.

the asymmetry matters too. a missed AI image is bad; a real photographer's work flagged as machine-made is worse, because that's an accusation. honest detection tunes against the false positive, which means accepting more “uncertain” verdicts, not fewer.
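one way to make “tuning against the false positive” concrete is to pick the flagging threshold from scores on known-real images, so that at most a chosen share of them would ever be flagged. the sketch below is a minimal illustration under stated assumptions: the function name, the scores and the 10% target are all made up, and nothing here describes a real detector's calibration.

```python
def threshold_for_fpr(real_scores, max_fpr):
    """Return a threshold such that the share of known-real images
    scoring strictly above it is at most max_fpr."""
    ranked = sorted(real_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # how many real images may be flagged
    if allowed >= len(ranked):
        return 0.0
    return ranked[allowed]  # only scores strictly above this get flagged


# hypothetical validation scores (probability of "AI") for illustration only
real_scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.41, 0.58, 0.72]
ai_scores = [0.40, 0.55, 0.61, 0.80, 0.93]

threshold = threshold_for_fpr(real_scores, max_fpr=0.10)
flagged_real = sum(s > threshold for s in real_scores)
missed_ai = sum(s <= threshold for s in ai_scores)
print(threshold, flagged_real, missed_ai)
```

with these made-up numbers the threshold lands at 0.58: one real image in ten is still flagged, and two of the five generated images fall below the line. that is the trade the paragraph describes, fewer false accusations in exchange for more reads that can't be called.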

[figure: the verdict spectrum. likely human (more evidence of a human hand) · uncertain · likely AI (more evidence of a generator). never 0%, never 100%. “uncertain” is a real answer, not a failure.]
every verdict is a point on a spectrum, and the ends are off-limits

what a classifier actually reads

the era of counting fingers is mostly over. as generators improved, the visible glitches (warped hands, melted text, impossible fences) got rarer, and the reliable signal moved somewhere your eye can't follow: statistical regularities in how a generator renders noise, texture, lighting and detail. classifiers are trained on enormous piles of known-real and known-generated images until they learn those regularities well enough to score a new image against them.

two honest caveats come with that. first, the traces are fragile, not indestructible: screenshots, crops, filters and every re-compression pass erode them, which is why a heavily reshared meme is genuinely harder to read than a fresh upload. second, the patterns are generator-shaped. a Midjourney image and a Stable Diffusion image tend to leave different statistical residue, which is what makes model attribution possible at all. attribution is always a resemblance claim (“this looks most like Midjourney”), never an identification, because a detector can only name models it was trained to recognize.
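the second caveat can be shown in a few lines. the sketch below assumes a classifier that outputs one raw score per known generator; the label set, the scores and the function name are hypothetical. because the shares are normalised over the known labels only, an image from a model outside that set is still assigned to whichever known name it resembles most.

```python
import math

# hypothetical label set: a detector can only name what it was trained on
KNOWN_GENERATORS = ["Midjourney", "Stable Diffusion", "generator C"]


def attribute(raw_scores):
    """Softmax over the known labels, then report the closest match."""
    peak = max(raw_scores)
    exps = [math.exp(x - peak) for x in raw_scores]  # shift for numeric stability
    total = sum(exps)
    shares = [e / total for e in exps]
    best = max(range(len(shares)), key=shares.__getitem__)
    return KNOWN_GENERATORS[best], shares[best]


name, share = attribute([2.0, 0.5, 0.1])  # made-up scores
print(f"this looks most like {name} ({share:.0%} of the resemblance)")
```

the output is worded as a resemblance because that is all the arithmetic supports: the shares always sum to one across the known names, whether or not the true source is among them.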

why a panel beats a single model

a single classifier is a single point of failure: one training set, so one blind spot. the sturdier shape is an ensemble — several independent detectors that fail on different inputs, so one model's blind spot gets caught by another's read.

that's the shape amige. runs. before any scoring, a trained routing layer points the scan toward the detectors most likely to read it well; the routing hint never enters the verdict itself. the panel of independent detectors then scores the image, and their reads are weighed by each detector's track record, fused into one estimate, and capped so a verdict never reads 0% or 100%. when the panel splits, amige. says uncertain rather than guess. disagreement is reported, not averaged away: you can see every detector's read on the result page. attribution coverage spans 90+ generative models.

[figure: one image, four independent reads. detector A: 0.81, leans AI · detector B: 0.74, leans AI · detector C: 0.38, disagrees · detector D: 0.69, leans AI. weighed by track record → fused → capped. one estimate: likely AI, 74%, with all four reads still visible. the dissenter stays on the record.]
how a panel verdict is made · illustrative example, not a real scan
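the weigh, fuse and cap steps can be sketched as follows. this is a minimal illustration, not amige.'s actual fusion: it assumes a plain weighted average, and the weights, the 1%/99% caps, the agreement rule and the middle band are all invented for the example. with these made-up weights the result does not reproduce the figure's 74%.

```python
def fuse(reads, weights, floor=0.01, ceiling=0.99):
    """Weighted mean of detector scores, clamped away from 0 and 1."""
    total = sum(weights[name] for name in reads)
    estimate = sum(reads[name] * weights[name] for name in reads) / total
    return min(max(estimate, floor), ceiling)


def verdict(reads, weights, min_agreement=0.7, band=(0.4, 0.6)):
    """Return (label, fused estimate, every individual read)."""
    estimate = fuse(reads, weights)
    leans_ai = estimate >= 0.5
    total = sum(weights[name] for name in reads)
    # share of track-record weight that sits on the same side as the estimate
    agreeing = sum(weights[name] for name, score in reads.items()
                   if (score >= 0.5) == leans_ai)
    if agreeing / total < min_agreement or band[0] < estimate < band[1]:
        label = "uncertain"  # a split panel is reported, not guessed through
    else:
        label = "likely AI" if leans_ai else "likely human"
    return label, estimate, dict(reads)


# the four reads from the illustrative figure; the weights are hypothetical
reads = {"A": 0.81, "B": 0.74, "C": 0.38, "D": 0.69}
weights = {"A": 0.9, "B": 0.8, "C": 0.5, "D": 0.8}

label, estimate, all_reads = verdict(reads, weights)
print(label, round(estimate, 3), all_reads)
```

returning the individual reads alongside the fused number is the point of the design: detector C's 0.38 lowers the estimate and stays visible, rather than disappearing into an average.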

the scan modes, honestly

  • auto. the fast consensus: a couple of detectors, and a verdict only when they agree. built to keep a single scan quick. it is not a bulk-moderation pipeline, and we won’t pretend otherwise.
  • deep. the full panel. more detectors, stronger consensus required before the verdict lands. use it when the answer matters more than the wait.
  • custom. you pick which detectors run. useful when you already know which ones you trust for a given kind of image.
  • prioritize deepfake. a separate toggle, not a mode: it switches the panel to deepfake-capable detectors and asks a different question, ‘is this a doctored real person?’ rather than ‘is this generated?’
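read as configuration, the three modes and the toggle amount to different ways of choosing which detectors run. the sketch below is one hypothetical way to express that: the detector names, which of them count as deepfake-capable, and the reading of “a couple” as exactly two are all assumptions for illustration.

```python
from dataclasses import dataclass

# hypothetical panel: names and deepfake capability are made up
PANEL = {
    "det_a": {"deepfake": False},
    "det_b": {"deepfake": True},
    "det_c": {"deepfake": False},
    "det_d": {"deepfake": True},
}


@dataclass(frozen=True)
class ScanRequest:
    mode: str = "auto"                 # "auto", "deep" or "custom"
    picked: tuple = ()                 # detector names, used only by "custom"
    prioritize_deepfake: bool = False  # a toggle, independent of the mode


def select_detectors(req):
    """Return the detectors a scan would run, in panel order."""
    pool = [name for name, caps in PANEL.items()
            if caps["deepfake"] or not req.prioritize_deepfake]
    if req.mode == "auto":
        return pool[:2]  # assumption: "a couple" means the first two
    if req.mode == "deep":
        return pool      # the full panel (deepfake-capable only if toggled)
    if req.mode == "custom":
        return [name for name in pool if name in req.picked]
    raise ValueError(f"unknown mode: {req.mode!r}")


print(select_detectors(ScanRequest()))
print(select_detectors(ScanRequest(mode="deep", prioritize_deepfake=True)))
```

keeping the toggle as a separate field rather than a fourth mode mirrors the list above: it changes which detectors are eligible, and the mode still decides how many of them run.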

video gets the same panel logic plus one extra problem: time. generators still find it hard to keep a scene coherent frame-to-frame, so video detection leans on temporal consistency as well as per-frame signal. and for any file, the strongest evidence isn't statistical at all: if an image carries signed Content Credentials, that provenance outranks any classifier's guess. most files in the wild don't, which is why the probabilistic layer exists.
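the precedence described here, provenance first and statistics second, can be sketched as a simple decision order. the field names below are hypothetical, and the sketch assumes signature verification has already been done elsewhere by a proper Content Credentials validator; it does not perform any verification itself.

```python
def assess(file_info, classifier_estimate):
    """Prefer verified provenance; fall back to the probabilistic read.

    file_info is a hypothetical dict produced by an upstream step that has
    already checked any Content Credentials attached to the file.
    """
    if file_info.get("credentials_verified"):
        return {
            "basis": "provenance",
            "detail": file_info.get("credentials_summary", ""),
        }
    # most files in the wild end up here: no signed credentials to rely on
    return {"basis": "classifier", "estimate": classifier_estimate}


signed = {"credentials_verified": True, "credentials_summary": "signed at capture"}
print(assess(signed, classifier_estimate=0.91))
print(assess({}, classifier_estimate=0.91))
```

in the first call the classifier's 0.91 is never consulted, which is what “outranks” means in practice.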

what to do with a probability

treat it like a weather forecast, not a court ruling. an 80% likely-AI read on a stranger's profile photo is a reason to check further (reverse-search it, ask for another shot, look for provenance), not a conviction. an “uncertain” is the tool telling you the truth about its own limits, which is more useful than a confident number that folds under one crop. that's the standard we hold our own machine to, and the standard worth demanding from anyone else's.

questions

because the detectors disagreed, or the image didn’t carry enough signal for a confident read. small files, heavy compression, and repeated re-saves all erode the statistical traces classifiers rely on. amige. treats ‘uncertain’ as a real verdict rather than rounding it to a guess. the academic name for this is selective abstention: a classifier that can decline to answer makes fewer costly mistakes than one forced to pick a side every time.
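the effect of abstention is easy to check on a toy set. the sketch below assumes a hypothetical list of (score, ground truth) pairs and a made-up confidence cutoff; it compares a classifier forced to answer every time with one allowed to say “uncertain”.

```python
def selective_report(scored, abstain_below):
    """scored: list of (p_ai, is_ai) pairs.

    Returns (coverage, error rate among the answers actually given).
    """
    answered = wrong = 0
    for p_ai, is_ai in scored:
        confidence = max(p_ai, 1 - p_ai)
        if confidence < abstain_below:
            continue  # "uncertain": no call is made, so no error is possible
        answered += 1
        wrong += (p_ai >= 0.5) != is_ai
    coverage = answered / len(scored)
    error = wrong / answered if answered else 0.0
    return coverage, error


# hypothetical scores and ground truth, for illustration only
scored = [(0.95, True), (0.88, True), (0.62, False), (0.55, True),
          (0.45, True), (0.20, False), (0.10, False), (0.58, False)]

print(selective_report(scored, abstain_below=0.5))   # forced to answer
print(selective_report(scored, abstain_below=0.75))  # allowed to abstain
```

on this made-up set the forced classifier answers all eight and gets three wrong; the abstaining one answers four and gets none wrong. the cost is coverage, and real data will rarely split this cleanly, but the direction of the trade is the point.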

both layers exist, honestly divided: a panel of independent, specialist detectors does the classification, and amige.’s own trained layer decides which of them a scan should go to, weighs their reads by track record, fuses the result, and caps the confidence. the routing hint never enters the verdict itself. that division is on purpose. no single model, ours included, deserves the final word.

sources.

  1. Reuters Institute · Spotting the deepfakes in this year of elections: how AI detection tools work and where they fail
    Anlen & Vázquez Llorente, Apr 15, 2024. detection returns a probability, is degraded by cropping and re-compression, and a negative result does not prove a file is genuine.
  2. Dugan et al. · RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors (ACL 2024)
    the largest public benchmark for AI-text detection; cited here for the accuracy-claims context: detectors advertising 99%+ drop sharply out-of-domain.
  3. FTC · Order Requires Workado to Back Up AI Detection Claims (April 2025)
    ~98% advertised vs an alleged 53% on general-purpose content; the regulatory precedent against certainty marketing.
  4. Content Authenticity Initiative · How it works (Content Credentials / C2PA)
    the signed provenance ‘nutrition label’: the strongest signal when it survives, absent from most files in the wild.