An autonomous agent for auditing and improving the reliability of clinical AI models — now published.
We introduce ModelAuditor, a self-reflective agent that simulates clinically relevant distribution shifts and produces interpretable reports on likely failure modes. Across multiple medical imaging domains, it recovers up to 25% of performance lost under shift while providing actionable deployment insights.