OpenAI

ChatGPT 5.2 Pro

Daily drift snapshot against a 21-day baseline with auto + human signals.

Last run Jan 13, 2026 (2h ago)

AUTO 48 (+3)
HUMAN 35 (-2)

7-day drift

AUTO DUMB INDEX

Scale: 0 to 100
OK

48

Normal

vs baseline +3

Why it moved

Today's drivers

Latency up (med): TTFT slower. Delta +4

Instruction slips (low): Format misses. Delta +2

Accuracy steady (low): No major drift. Delta -1

Baseline window: 21 days
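
The "vs baseline" deltas above can be read as today's score minus an aggregate over the 21-day window. The dashboard does not say how the baseline is aggregated, so the sketch below assumes a simple mean of the most recent 21 daily scores; the function name `drift_vs_baseline` and the sample history are hypothetical.

```python
from statistics import mean

BASELINE_DAYS = 21  # baseline window stated on the dashboard


def drift_vs_baseline(history, today, window=BASELINE_DAYS):
    """Return today's score minus the mean of the last `window` daily scores.

    Assumption: the baseline is a plain mean; the dashboard does not
    document its actual aggregation method.
    """
    if not history:
        raise ValueError("history must contain at least one baseline score")
    baseline = mean(history[-window:])
    return round(today - baseline)


# Hypothetical history, not dashboard data: 21 days at 45, today at 48.
history = [45] * 21
print(f"{drift_vs_baseline(history, 48):+d}")  # prints +3
```

A median or a trimmed mean would be a reasonable alternative if single-day outliers should not move the baseline.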

Auto score breakdown

Accuracy: 38% (+1 vs baseline). Objective tasks solved correctly.

Reasoning robustness: 34% (0 vs baseline). Consistency across prompt variations.

Instruction following: 31% (+2 vs baseline). Format and constraint compliance.

Hallucination risk: 40% (+1 vs baseline). Confident wrong answers on known items.

Refusal anomaly: 29% (-1 vs baseline). Unexpected refusals on safe prompts.

Latency: 44% (+4 vs baseline). p50/p95 response time drift.

Variance: 28% (-2 vs baseline). Run-to-run stability.
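
The Latency component is described as p50/p95 response time drift. As a rough illustration only, the sketch below computes both percentiles with the standard library and compares a baseline sample with today's sample. The helper name `p50_p95`, the sample values, and the choice of the "inclusive" quantile method are assumptions, and how the dashboard maps this drift onto its percentage score is not documented.

```python
from statistics import quantiles


def p50_p95(latencies_s):
    """Return (p50, p95) for a list of response times in seconds."""
    if len(latencies_s) < 2:
        raise ValueError("need at least two samples")
    # n=100 yields 99 cut points; index 49 is p50 and index 94 is p95.
    cuts = quantiles(latencies_s, n=100, method="inclusive")
    return cuts[49], cuts[94]


# Hypothetical samples, not dashboard data.
baseline = [1.0, 1.2, 1.1, 1.3, 0.9, 1.4, 1.0, 1.2, 1.1, 2.0]
today = [1.1, 1.3, 1.2, 1.5, 1.0, 1.6, 1.1, 1.3, 1.2, 2.8]

b50, b95 = p50_p95(baseline)
t50, t95 = p50_p95(today)
print(f"p50 drift: {t50 - b50:+.2f}s, p95 drift: {t95 - b95:+.2f}s")
```

Tracking p95 alongside p50 separates a general slowdown from a tail-latency problem, which is the pattern described in the latency report under Community.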

Eval suite

Task tier performance

Tier 0, Sanity checks: 78 (+1 today), 12 tasks

Tier 1, Factual QA: 73 (-1 today), 20 tasks

Tier 2, Reasoning + math: 69 (-2 today), 18 tasks

Tier 3, Coding: 71 (0 today), 12 tasks

Tier 4, Instruction stress: 66 (-1 today), 10 tasks

Community

Human reports

Top categories today

Latency: 7
Instruction: 5
Reasoning: 3
Hallucination: 2
Refusal: 2

ChatGPT 5.2 Pro

1h ago

Latency, Severity 3

p95 latency jumped to ~14s for short prompts.

"Short prompts felt sluggish in the last hour."