This RFP supports work that addresses major challenges in the field of AI capability evaluation by building global catastrophic risk (GCR)-relevant benchmarks, advancing the science of evaluations, or improving third-party access infrastructure.
Capability evaluations are not on track for the role they are expected to play
Global Catastrophic Risk (GCR)-relevant capability benchmarks for AI agents
Advancing the science of evaluations and capabilities development
Improving third-party access and evaluations infrastructure