API Testing Report - Search News

DeepSeek's new V3.2-Exp model cuts API pricing in half to less than 3 cents per 1M input tokens

MMLU-Pro holds steady at 85.0, AIME 2025 slightly improves to 89.3, while GPQA-Diamond dips from 80.7 to 79.9. Coding and agent benchmarks tell a similar story, with Codeforces ratings rising from ...

Decrypt

New Study Shows AI Outpaces Humans in Game Testing

NetEase-backed study shows language model agents may detect bugs faster and with greater coverage than existing tools.
