what is model drift?

by Tuan Hoang · detection lead · last reviewed 2026-06-26

⤹ why no detector is set-and-forget.

model drift is when the data a model meets in the real world moves away from the data it was trained on, so its accuracy quietly decays. that is what happens to an AI-content detector each time a new generator ships.

model drift, the umbrella term for concept drift and dataset shift, is something the machine-learning literature has studied for years: a model learns a relationship from training data, the world moves, and the relationship it learned stops matching what it now sees. Gama et al. (ACM Computing Surveys, 2014) split this into two cases. real drift is when the input-to-label relationship itself changes, so the old model is simply wrong. virtual drift is when only the mix of inputs shifts while that relationship holds, so the model meets inputs it saw rarely or never in training. for a detector, both arrive the moment a generator it never trained on starts producing content.
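
the real-versus-virtual split is easier to see when written against the joint distribution of inputs and labels. the symbols below (X for inputs, y for labels, t for time) are standard shorthand in the concept-drift literature rather than notation used elsewhere on this page:

```latex
% joint distribution of inputs X and labels y at time t
P_t(X, y) = P_t(y \mid X)\, P_t(X)

% drift between time t and time t+1: the joint distribution changes
P_t(X, y) \neq P_{t+1}(X, y)

% real drift: the input-to-label relationship changes
P_t(y \mid X) \neq P_{t+1}(y \mid X)

% virtual drift: only the input mix changes, the relationship holds
P_t(X) \neq P_{t+1}(X), \qquad P_t(y \mid X) = P_{t+1}(y \mid X)
```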

why a detector drifts

new generators keep shipping, and that is the whole problem. a detector trained before a model existed has never seen that model's fingerprints, so from its point of view every output is out-of-distribution. RAID (Dugan et al., ACL 2024), the largest public benchmark for machine-generated-text detection (over 6 million generations across 11 models, 8 domains, and 11 adversarial attacks), found that detectors regularly struggle to generalize to unseen generators and are easily fooled by simple attacks. the same paper noted that many commercial detectors advertise very high accuracy yet are rarely tested on hard, shared benchmarks.

the same pattern shows up on the image side. research preprints in 2025-26 report that image detectors strong on older GAN-era pictures weaken on newer diffusion models, with accuracy falling and uncertainty rising as the gap between training data and live data widens. that rising uncertainty is the upside: it is one of the signals that drift is underway. the textbook symptoms are a detector that was strong at training time decaying on the newest generators, a widening gap between performance on seen versus unseen models, and verdicts that cluster at high confidence while turning out wrong more often.
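
one way to watch for those symptoms without waiting for labels is to compare the detector's score distribution at training time against a recent window of live scores. the sketch below uses the population stability index, a common monitoring statistic that the sources here do not prescribe. the function names, the ten-bin layout, and the 0.25 alert threshold are all assumptions for illustration, and scores are assumed to be probabilities in [0, 1].

```python
import math
from typing import List, Sequence

ALERT_THRESHOLD = 0.25  # hypothetical cut-off; tune it on your own traffic


def score_histogram(scores: Sequence[float], bins: int, floor: float) -> List[float]:
    """Share of scores falling in each equal-width bin over [0, 1]."""
    if not scores:
        raise ValueError("need at least one score")
    counts = [0] * bins
    for s in scores:
        # Clamp so that a score of exactly 1.0 lands in the last bin.
        index = min(max(int(s * bins), 0), bins - 1)
        counts[index] += 1
    # Floor empty bins so the log ratio below stays finite.
    return [max(c / len(scores), floor) for c in counts]


def population_stability_index(
    reference: Sequence[float],
    live: Sequence[float],
    bins: int = 10,
    floor: float = 1e-6,
) -> float:
    """PSI between a reference sample and a live sample of detector scores."""
    ref = score_histogram(reference, bins, floor)
    cur = score_histogram(live, bins, floor)
    # Each term is non-negative, so the index grows as the two histograms diverge.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def drift_alert(reference: Sequence[float], live: Sequence[float]) -> bool:
    """True when the live score distribution has moved past the alert threshold."""
    return population_stability_index(reference, live) > ALERT_THRESHOLD
```

a shift in scores is only a prompt to investigate. it does not say whether accuracy has dropped, so it complements labeled spot checks on the newest generators and does not replace them.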

monitor, recalibrate, retrain

the canonical response has three parts. Lu et al. (IEEE TKDE, 2019) frame learning under drift as detection, understanding, and adaptation: watch for the shift, diagnose what moved, then update. updating itself splits into two very different moves. retraining teaches the model new generators on fresh data. recalibration is the lighter one: it corrects how confident the stated probabilities are without changing which way the model leans.

recalibration matters because modern neural networks tend to be overconfident. Guo et al. (ICML 2017) showed exactly that, and showed that temperature scaling, a single-parameter variant of Platt scaling, largely corrects it: the model's logits are divided by one temperature fitted on held-out data, which softens the reported confidence without flipping the decision. this is why every verdict here reads as a probabilistic estimate and never as proof. the regulatory backdrop is the FTC's 2025 order against Workado: the FTC challenged a claim of 98 percent accuracy because independent testing put the tool at roughly 53 percent on general content, and the order bars effectiveness claims unless they are backed by competent and reliable evidence held at the time the claim is made. an accuracy number is only as good as the data and the date behind it.
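
the temperature-scaling step described above can be sketched in a few lines. this is a minimal illustration, not amige.'s implementation: the function names are hypothetical, and a plain grid search stands in for the gradient-based optimizer that is normally used to fit the temperature.

```python
import math
from typing import List, Optional, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax of logits divided by a positive temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(
    logit_rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    temperature: float,
) -> float:
    """Mean negative log-likelihood of the true labels at a given temperature."""
    total = 0.0
    for row, label in zip(logit_rows, labels):
        prob = max(softmax(row, temperature)[label], 1e-12)  # avoid log(0)
        total -= math.log(prob)
    return total / len(labels)


def fit_temperature(
    logit_rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    grid: Optional[Sequence[float]] = None,
) -> float:
    """Pick the temperature on a grid that minimizes held-out NLL."""
    if grid is None:
        grid = [0.5 + 0.05 * i for i in range(91)]  # 0.5 to 5.0
    return min(grid, key=lambda t: negative_log_likelihood(logit_rows, labels, t))


# Toy held-out set (made up): the model is very sure of class 0 every time,
# but class 0 is correct only 7 times out of 10.
held_out_logits = [[4.0, 0.0]] * 10
held_out_labels = [0] * 7 + [1] * 3

temperature = fit_temperature(held_out_logits, held_out_labels)
before = softmax(held_out_logits[0])
after = softmax(held_out_logits[0], temperature)
print(f"T={temperature:.2f} confidence {max(before):.2f} -> {max(after):.2f}")
```

on this toy set the fitted temperature comes out above 1, so the reported confidence drops toward the observed hit rate while the predicted class stays the same. dividing every logit by the same positive number never changes which one is largest.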

why amige. is built to expect drift

the responsible design is self-correcting, not set-and-forget. amige. routes each scan to the detectors strongest for that kind of content, runs a panel of independent detectors built by different teams, and names the likely generator only as a best guess, since the resemblance learned for one version of a model can drift as that model is updated. when the reads conflict it returns “uncertain” instead of forcing a call, and it recalibrates as the generator landscape moves. this does not make drift disappear. no detector stays permanently current. it means the system is built to notice when it is falling behind and to correct for it. see how the whole thing fits together in the machine.
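
the idea of returning “uncertain” when a panel disagrees can be illustrated with a toy rule. this is not amige.'s actual logic: the function name, the verdict strings other than “uncertain”, and both thresholds are hypothetical, and each detector is assumed to report a probability in [0, 1] that the content is AI-generated.

```python
from statistics import mean
from typing import Sequence


def panel_verdict(
    probabilities: Sequence[float],
    max_spread: float = 0.4,       # hypothetical: how far apart reads may sit
    decision_margin: float = 0.2,  # hypothetical: how far from 0.5 the mean must be
) -> str:
    """Combine independent detector reads, abstaining when they conflict."""
    if not probabilities:
        raise ValueError("need at least one detector read")
    spread = max(probabilities) - min(probabilities)
    average = mean(probabilities)
    # Abstain if the detectors disagree, or if they agree on a near coin flip.
    if spread > max_spread or abs(average - 0.5) < decision_margin:
        return "uncertain"
    return "likely AI" if average > 0.5 else "likely human"


print(panel_verdict([0.90, 0.85, 0.95]))  # reads agree and sit far from 0.5
print(panel_verdict([0.90, 0.20, 0.60]))  # reads conflict
```

the design choice being illustrated is abstention: a forced call on conflicting reads hides exactly the disagreement that signals drift.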

questions

what is model drift?

model drift (the umbrella for concept drift and dataset shift) is when the data a model sees in the real world moves away from the data it was trained on, so its accuracy quietly degrades. the machine-learning literature formalizes it as a change over time in the distribution relating inputs to the correct label (Gama et al., ACM Computing Surveys, 2014; Lu et al., IEEE TKDE, 2019).

why does an AI-content detector get worse over time?

new generators keep appearing. a detector trained before a model existed has never seen that model's fingerprints, so it is operating under distribution shift. the RAID benchmark (ACL 2024) found detectors regularly struggle to generalize to unseen generators, and parallel image-detection research reports the same degradation on newer generators, which is why detection has to be maintained, not set-and-forget.

what is recalibration, and how is it different from retraining?

retraining updates the model on fresh data so it learns new generators. recalibration is lighter: it adjusts how confident the stated probabilities are without changing which way the model leans. Guo et al. (ICML 2017) showed modern neural nets tend to be overconfident and that temperature scaling, a single-parameter variant of Platt scaling, corrects this on held-out data.

what are Platt scaling and temperature scaling?

both are post-hoc calibration methods: they map raw model scores to better-behaved probabilities. Platt scaling fits a logistic transform to the scores; temperature scaling is its single-parameter special case that divides the logits by one learned temperature to soften overconfident outputs (Guo et al., ICML 2017).
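
written out, the two transforms look like this. the notation follows the usual presentation of these methods (z for a raw score, bold z for a logit vector, sigma for the logistic function) and is not used elsewhere on this page:

```latex
% Platt scaling (binary case): two learned parameters a and b applied to the raw score z
\hat{q} = \sigma(a z + b)

% temperature scaling: one learned parameter T > 0 applied to the whole logit vector
\hat{q} = \max_k \, \mathrm{softmax}(\mathbf{z} / T)_k
```

a temperature above 1 flattens the softmax and lowers the reported confidence. because every logit is divided by the same positive T, the top-scoring class does not change.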

does this mean detection accuracy numbers cannot be trusted?

it means any accuracy number is only as good as the data and the date behind it. the FTC's 2025 Workado order made this concrete: a 98-percent-accurate claim was challenged because independent testing showed roughly 53 percent on general content, and the order requires competent and reliable evidence held at the time a claim is made. treat every detection verdict as a probabilistic estimate, not proof.

sources.

01
A Survey on Concept Drift Adaptation — Gama, Žliobaitė, Bifet, Pechenizkiy, Bouchachia (ACM Computing Surveys, 2014)
The canonical concept-drift reference. Real vs. virtual drift; sudden / gradual / incremental / recurring patterns.
02
Learning under Concept Drift: A Review — Lu, Liu, Dong, Gu, Gama, Zhang (IEEE TKDE, 2019)
The three-part framework: detection, understanding, adaptation. Monitor, diagnose, then update.
03
On Calibration of Modern Neural Networks — Guo, Pleiss, Sun, Weinberger (ICML 2017)
Modern neural nets are typically overconfident; temperature scaling is a single-parameter variant of Platt scaling.
04
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors — Dugan et al. (ACL 2024)
Detectors regularly struggle to generalize to unseen generators. Direct evidence of drift in detection.
05
Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes — arXiv preprint (2025)
Research preprint (not yet peer-reviewed). Image detectors degrade on unseen/newer generators under distribution shift.
06
FTC Order Requires Workado to Back Up Artificial Intelligence Detection Claims — Federal Trade Commission (2025)
98% claim vs. ~53% on independent testing; the competent-and-reliable-evidence standard. Final order approved Aug 2025.
