Metacognitive Control · Report v3.2

Metacognitive Coding Safety Benchmark (MCSB)

The MCSB v2 is a 1,030-trial diagnostic suite that isolates an AI model's self-monitoring (metacognition) from its raw accuracy. We quantify cross-tier degradation (Δ accuracy): the gap between baseline competence and adversarial robustness.
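
A minimal sketch of the headline metric, assuming hypothetical trial records (the real trial schema is not specified here): cross-tier degradation is the drop in accuracy between the baseline tier and the adversarial tier.

```python
# Hypothetical trial records; Δ accuracy = baseline accuracy minus
# adversarial accuracy (illustrative data, not benchmark results).

def accuracy(trials):
    """Fraction of trials answered correctly."""
    return sum(t["correct"] for t in trials) / len(trials)

baseline = [{"correct": True}] * 9 + [{"correct": False}]          # 0.90
adversarial = [{"correct": True}] * 6 + [{"correct": False}] * 4   # 0.60

delta_accuracy = accuracy(baseline) - accuracy(adversarial)
print(f"Δ accuracy = {delta_accuracy:.2f}")  # → Δ accuracy = 0.30
```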

Section A: General Metacognition

Baseline capability on standard logic and multi-turn reasoning tasks.

THROUGH THE LOOKING GLASS

Turn 1: Observed Accuracy

The Capability Chasm

[Figure: per-model observed scores plotted against the target line at 1.0.]

Calibration Depth

Static vs. Dynamic monitoring efficiency.

Reliability Diagram
Sensitivity vs Accuracy
M-Ratio Shift
Evidence-driven change in monitoring efficiency across models.

Section B: Code-Security Trustworthiness

MCSB v2 Adversarial Result

This section tests whether belief updates remain directionally correct in a high-stakes domain (code security). Significant cross-tier degradation (Δ accuracy) appears under adversarial evidence pressure.

Sensitivity vs. Adversarial Resilience
X-Axis: Tier 3 Alignment (%) · Y-Axis: Tier 2 Foundational Sensitivity (M-Ratio)
Empirical Comparative Summary
| Model | T2 Sensitivity | T3 Alignment | Trust Score (v2) |
| --- | --- | --- | --- |
| GPT-5.4 | 0.393 | 0.442 | 0.733 |
| Gemini 3.1 Pro | 0.385 | 0.364 | 0.710 |
| Gemini 3 Flash | 1.018 | 0.378 | 0.671 |
| Claude Opus 4.6 | 1.793 | 0.122 | 0.647 |
| Claude Opus 4.7 | 0.858 | 0.264 | 0.640 |
| Gemini 3.1 Flash-Lite | 0.010 | 0.254 | 0.630 |
| Claude Sonnet 4.6 | 1.814 | 0.480 | 0.621 |
| Gemini 2.5 Flash | 0.516 | 0.366 | 0.620 |
| DeepSeek V3.1 | 0.435 | 0.392 | 0.567 |
| DeepSeek V3.2 | 0.000 | 0.418 | 0.543 |

Section C: Adversarial Stress Test (Meta-Evaluation Framework)

Inspired by cognitive evaluation frameworks for measuring robust generalization under distribution shift.

High-fidelity diagnostics revealing the internal representational stability of models. Patterns highlight the sharp transition from foundational logic to adversarial security scenarios.

Section D: Economic Efficiency & Token Economics

Metric Framework v1.0

Metacognition changes the optimal spending policy. High sensitivity enables agents to abort failed reasoning paths early, drastically reducing the Cost of Verified Truth (CVT). This section quantifies the "Metacognitive Dividend."
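
A sketch of the CVT computation under stated assumptions: CVT is the total run cost, converted to cents, divided by the number of verified-correct trials. The figures below are illustrative, not the benchmark's published numbers.

```python
# Cost of Verified Truth: cents spent per correct adversarial trial.
# Inputs are hypothetical (an $0.08 run yielding 615 correct trials).

def cvt_cents(total_cost_usd: float, n_correct: int) -> float:
    """Total run cost in cents, divided by verified-correct trials."""
    return (total_cost_usd * 100) / n_correct

print(f"{cvt_cents(0.08, 615):.3f} ¢ per verified-correct trial")
# → 0.013 ¢ per verified-correct trial
```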

Cost of Verified Truth (CVT)
Expected cost (¢) required to produce one correct adversarial coding trial.
The Efficiency Frontier
X: Log Cost ($/1k trials) | Y: Weighted Trust Score (MCSB v2)
The Monologue Tax & Metacognitive Dividend
Breakdown of token expenditure: Base Prompting vs. Reasoning vs. Correction Overhead.
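
The token-expenditure breakdown above can be sketched as follows; the token counts and the per-1k rate are hypothetical, and the three-way split (base / reasoning / correction) follows the panel's own categories.

```python
# Illustrative "Monologue Tax" breakdown: total token spend split into base
# prompting, chain-of-thought reasoning, and metacognitive correction.

RATE_PER_1K = 0.003  # hypothetical $ per 1k output tokens

def token_cost(base: int, reasoning: int, correction: int) -> dict:
    """Dollar cost per component at a flat per-1k-token rate."""
    price = lambda toks: toks / 1000 * RATE_PER_1K
    return {
        "base": price(base),
        "reasoning (CoT)": price(reasoning),   # the "monologue tax"
        "correction": price(correction),       # metacognitive overhead
        "total": price(base + reasoning + correction),
    }

print(token_cost(400, 2600, 300))
```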

Empirical Cost Summary (1,030 trials)

| Model | Total Cost | CVT (¢ per correct trial) |
| --- | --- | --- |
| DeepSeek V3.2 | $0.08 | 0.013 |
| DeepSeek V3.1 | $0.08 | 0.013 |
| Gemini 3 Flash | $0.18 | 0.026 |
| GPT-5.4 | $1.58 | 0.210 |
| Claude Opus 4.7 | $9.49 | 1.439 |

Audit Registry: pricing_archive.json
*Rates as of April 2026
How to Read the Benchmark
A compact guide to the plots, metrics, and methodology.

This benchmark isolates metacognitive control from raw accuracy by measuring whether a model can calibrate confidence, detect errors, and update beliefs under evidence pressure. We compute signal‑detection metrics (meta‑d′, m‑ratio) alongside multi‑turn resilience scores to separate competence from self‑monitoring.
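
A minimal sketch of the signal-detection quantities named above, using only the standard library. Note that a proper meta-d′ estimate comes from a model fit (Maniscalco & Lau's procedure), so this sketch takes meta-d′ as a given input; the hit/false-alarm rates are illustrative.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # probit (inverse-normal) transform

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Type-1 sensitivity: z(hit rate) - z(false-alarm rate)."""
    return z(hit_rate) - z(fa_rate)

def m_ratio(meta_d: float, hit_rate: float, fa_rate: float) -> float:
    """Metacognitive efficiency: meta-d' / d'. Values near 1.0 mean the
    model's confidence carries nearly all the signal its accuracy does."""
    return meta_d / d_prime(hit_rate, fa_rate)

print(round(d_prime(0.8, 0.2), 3), round(m_ratio(1.2, 0.8, 0.2), 3))
# → 1.683 0.713
```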

The reliability diagram uses trial‑level confidence bins reconstructed from `kbench` logs, ensuring that each plotted point corresponds to a real correctness rate for a given confidence bin. This yields a true calibration curve against the perfect‑calibration diagonal.
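
The binning step can be sketched as below; the field names (`confidence`, `correct`) are illustrative stand-ins, not the actual `kbench` log schema.

```python
# Reliability-diagram computation: bin trials by stated confidence, then
# compare mean confidence to observed accuracy within each bin.

def calibration_bins(trials, n_bins=10):
    bins = [[] for _ in range(n_bins)]
    for t in trials:
        # clamp confidence 1.0 into the top bin
        i = min(int(t["confidence"] * n_bins), n_bins - 1)
        bins[i].append(t)
    curve = []
    for b in bins:
        if b:  # skip empty confidence bins
            conf = sum(t["confidence"] for t in b) / len(b)
            acc = sum(t["correct"] for t in b) / len(b)
            curve.append((conf, acc))  # one point on the calibration curve
    return curve

trials = [{"confidence": 0.95, "correct": True},
          {"confidence": 0.92, "correct": False},
          {"confidence": 0.55, "correct": True}]
print(calibration_bins(trials))
```

Points below the diagonal (accuracy lower than mean confidence, as in the top bin here) indicate overconfidence.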

Reading the Plots

  • Accuracy vs. M‑Ratio: High accuracy + high m‑ratio indicates a competent model with efficient self‑monitoring.
  • Calibration Curve: Deviations below the diagonal indicate overconfidence (miscalibration).
  • M‑Ratio Shift: Large negative deltas signal susceptibility to evidence pressure.
  • Quadrant Chart: Resilience vs. sensitivity separates stable leaders from brittle models.
  • Degradation Gap (Panel A): Quantifies cross-tier degradation (Δ accuracy): the gap between baseline competence and adversarial robustness.
  • Alignment Failure (Panel C): Alignment is quantified via response consistency under perturbation, decomposed into underreaction (invariance to critical changes) and overreaction (sensitivity to irrelevant perturbations).
  • Confidence Shift (Δ): Positive Δ indicates increased confidence under adversarial perturbation, suggesting miscalibrated belief updates or unstable internal representations.
  • CVT Comparison (Panel D): Measures the expected cost required to extract one point of verified truth under adversarial conditions. Scaled in cents (¢) for intuitive economic evaluation.
  • Efficiency Frontier (Panel D): A Pareto analysis of Weighted Trust Score vs. Log‑Cost ($ per 1k trials) identifying optimal ROI leaders.
  • Monologue Tax (Panel D): Breaks down token costs into base, reasoning (CoT), and metacognitive correction components.
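
The Panel-D frontier reduces to a Pareto check, sketched here with illustrative scores: a model sits on the frontier when no other model is at least as cheap and at least as trusted, and strictly better on one axis.

```python
# Pareto frontier over (cost, trust) points; data is hypothetical.

def pareto_frontier(models):
    """models: list of (name, cost_per_1k_usd, trust_score) tuples."""
    frontier = []
    for name, cost, trust in models:
        dominated = any(c <= cost and t >= trust and (c < cost or t > trust)
                        for _, c, t in models)
        if not dominated:
            frontier.append(name)
    return frontier

models = [("A", 0.08, 0.54), ("B", 1.58, 0.73), ("C", 9.49, 0.64)]
print(pareto_frontier(models))  # → ['A', 'B']  (C pays more for less trust)
```
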
References: Burnell et al. (2026); Fleming & Lau (2014).
Read full Kaggle write‑up

Frequently Asked Questions

Quick reference for AI agents and research auditors.