The enterprise case against expanding AI budgets rests on a consistent gap: benchmark scores do not predict reliable task completion. OpenAI released GPT-5.5 on April 23 as an agentic runtime, not a conversational model, and it reclaimed the top position on the Artificial Analysis Intelligence Index with a score of 60.
Three artefacts need revision: vendor evaluation criteria should add task-completion rates alongside benchmark scores; the Master Services Agreement with OpenAI should be reviewed for conversational-scope clauses, including the data-processing addendum and the liability cap on automated decisions; and the internal eval harness needs an agentic task-completion suite. Ask your AI platform lead: which contracted use cases can now be re-priced as agentic subscriptions, and does the current contract expose the organisation to unexpected costs from autonomous loops?
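For teams building the task-completion suite, the core measurement can be sketched in a few lines of Python. This is a minimal illustration, not a reference harness: `AgentTask`, `completion_rate`, and the `trials` parameter are hypothetical names, and it assumes each task can be run end to end and checked by a pass/fail function supplied by the team.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AgentTask:
    """One end-to-end task (hypothetical structure for illustration)."""
    name: str
    run: Callable[[], object]        # runs the agent and returns its final output
    check: Callable[[object], bool]  # True only if the end state is correct


def completion_rate(tasks: List[AgentTask], trials: int = 5) -> Dict[str, float]:
    """Return the fraction of trials in which each task completed correctly."""
    rates: Dict[str, float] = {}
    for task in tasks:
        passed = 0
        for _ in range(trials):
            try:
                if task.check(task.run()):
                    passed += 1
            except Exception:
                # A crash or aborted loop counts as a failed trial.
                pass
        rates[task.name] = passed / trials
    return rates
```

Running each task several times matters because agentic runs are not deterministic; a per-task rate reported next to the vendor's benchmark score makes the gap between the two visible.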
State-level AI safety mandates are creating indirect compliance obligations for enterprise buyers, not just vendors. New York Governor Kathy Hochul signed a revised RAISE Act in April 2026, establishing safety obligations for frontier model developers effective January 1, 2027.
Three artefacts need updating before year-end: the vendor risk review checklist must add a RAISE Act compliance tier for New York-exposed frontier model providers; the Master Services Agreement with those vendors should require a regulatory compliance warranty and a right-to-audit on safety documentation; and the AI governance policy must identify which deployed models fall within the new reporting regime. Ask your procurement lead and General Counsel: do existing contracts require AI vendors to notify the organisation when they become subject to new state safety mandates?
Most enterprise AI compliance programmes focus on model-developer obligations. Colorado's AI Act puts equivalent accountability on the deployer, with enforcement starting June 30, 2026. The Act requires companies deploying high-risk AI systems to take reasonable care against algorithmic discrimination and to conduct documented impact assessments.
Three artefacts need attention before June 30: the compliance review must screen every production AI-driven decision in hiring, lending, insurance, or benefits against the Colorado high-risk definition; the AI governance policy needs a Colorado-specific algorithmic-discrimination section; and each Master Services Agreement with a downstream deploying entity should carry a compliance warranty. Ask your Chief Data Officer and General Counsel: which production AI systems qualify as high-risk under Colorado's definition, and do current impact assessments meet the evidentiary standard a state regulator would accept?
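The screening step can start as a first-pass triage over the AI system inventory, sketched below in Python. This is an illustration under stated assumptions, not a legal test: the four domains are the ones named above, the statutory high-risk definition governs and should be applied by counsel, and `AISystem`, `SCREENED_DOMAINS`, and `screen` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Domains named in this briefing; an assumption for triage only.
# The statute's own definition controls what counts as high-risk.
SCREENED_DOMAINS = {"hiring", "lending", "insurance", "benefits"}


@dataclass
class AISystem:
    """One inventory entry (hypothetical structure for illustration)."""
    name: str
    decision_domain: str
    influences_decision: bool    # output makes or materially shapes the decision
    has_impact_assessment: bool  # a documented assessment is on file


def screen(systems: List[AISystem]) -> List[Tuple[str, str]]:
    """Flag systems for legal review and note whether an assessment exists."""
    flagged = []
    for system in systems:
        if system.decision_domain in SCREENED_DOMAINS and system.influences_decision:
            status = ("assessment on file" if system.has_impact_assessment
                      else "assessment missing")
            flagged.append((system.name, status))
    return flagged
```

The output is a review queue for the Chief Data Officer and General Counsel, not a compliance determination; systems that are not flagged may still fall within the statutory definition.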