AI Lab Notes | ALTOS LAB Journal

AI Evaluation: How we build AI systems, product by product.

Model Quality Usually Fades Before Teams Notice

OpenAI Evals, Anthropic research, Hugging Face leaderboards and arXiv evaluation work all point to the same risk: model quality drifts as data, tasks and user behavior change.

Column | Market Column | 8 min read