Date: 2025-02-11

Scale AI, a company offering labeled data for training artificial intelligence applications, and the United States AI Safety Institute, or AISI, have partnered to develop testing methods for frontier AI models.

The San Francisco-based company said Monday that the partnership aims to advance AI science by giving model builders access to efficient and reliable ways to test their models before deployment. By working with an independent third-party company like Scale AI, the government can create benchmarks for AI models and remove barriers that keep model builders from running pre-deployment tests.

Developing Third-Party AI Model Testing Methods

Scale AI’s research arm, the Safety, Evaluation and Alignment Lab, or SEAL, will collaborate with AISI to develop novel evaluation methods that give companies of all sizes access to reliable forms of testing through Scale. Model builders can assess their models and have the option to share the results with AI safety institutes worldwide. The tests cover model performance in domains such as math, reasoning and AI coding.

Model builders are encouraged to test voluntarily so they can evaluate the capabilities of their AI models and make any necessary changes before deployment. Third-party pre-deployment testing also spares the government from having to build in-house testing infrastructure, which might not be enough to meet future demand.

“SEAL’s rigorous evaluations set the standard for how cutting-edge AI systems meet the highest standards,” said Summer Yue, director of research at Scale AI. “This agreement with the U.S. AISI is a landmark step, providing model builders an efficient way to vet the technology before reaching the real world.”

