Open Model Leaderboards and Ecosystem

This page tracks open-weight model rankings, state-of-AI usage data, and model-provider ecosystem changes.

Sources in this batch

Onyx’s self-hosted LLM leaderboard and LM Council benchmarks rank open and proprietary systems.
OpenRouter’s State of AI provides usage-scale perspective.
GPT-OSS evaluation, Mistral 3, Arcee, OLMo, GLM-4.6-GGUF, ModernVBERT, and Qwen updates show where the model-provider ecosystem is moving.

Research interest

The interesting research problem is measurement: open-model progress is multi-dimensional, and leaderboards often collapse it into a single score. For local/self-hosted work, deployment constraints, context length, tool use, multimodality, license, and quantization support may matter more than benchmark rank.
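The point about collapsing several dimensions into one score can be made concrete with a small sketch. Everything in it is hypothetical: the model names, the per-dimension values, and the weights are invented for illustration and are not taken from any of the leaderboards listed above. It only shows that the same two profiles can swap places once deployment criteria get weight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-dimension scores in [0, 1]; all values are made up."""
    name: str
    benchmark: float      # aggregate leaderboard-style score
    context_fit: float    # how well the context length fits the workload
    tool_use: float       # reliability of tool calling
    quantized_fit: float  # quality retained when quantized for local hardware


def weighted_score(model, weights):
    """Collapse a profile into one number using the given dimension weights."""
    return sum(w * getattr(model, dim) for dim, w in weights.items())


def rank(models, weights):
    """Return model names ordered from best to worst under the given weights."""
    ordered = sorted(models, key=lambda m: weighted_score(m, weights), reverse=True)
    return [m.name for m in ordered]


# Illustrative profiles, not measurements.
MODELS = [
    ModelProfile("model-a", benchmark=0.90, context_fit=0.40, tool_use=0.50, quantized_fit=0.30),
    ModelProfile("model-b", benchmark=0.75, context_fit=0.80, tool_use=0.85, quantized_fit=0.90),
]

# A leaderboard-style view looks at the benchmark score only.
LEADERBOARD_ONLY = {"benchmark": 1.0}

# An assumed local-deployment view weights all four dimensions equally.
LOCAL_DEPLOYMENT = {
    "benchmark": 0.25,
    "context_fit": 0.25,
    "tool_use": 0.25,
    "quantized_fit": 0.25,
}

if __name__ == "__main__":
    print(rank(MODELS, LEADERBOARD_ONLY))
    print(rank(MODELS, LOCAL_DEPLOYMENT))
```

Under the benchmark-only weights the hypothetical model-a ranks first; under the equal-weight local profile model-b does. The weights themselves are a judgment call, which is part of the measurement problem.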

Open questions:

What benchmark mix best predicts usefulness for agentic research workflows?
How do open-weight models compare after controlling for tool scaffolding and inference budget?
Are usage studies a better signal of practical value than leaderboards?
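
One possible way to start on the last question is to check how far a leaderboard ordering and a usage ordering agree, for example with Spearman rank correlation. The sketch below assumes both sources rank the same set of models with no ties. The model names and ranks are placeholders, not data from OpenRouter or any leaderboard.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation between two rankings of the same items.

    Both arguments map item name -> rank (1 = best). Assumes no tied ranks.
    """
    if rank_x.keys() != rank_y.keys():
        raise ValueError("both rankings must cover the same items")
    n = len(rank_x)
    if n < 2:
        raise ValueError("need at least two items")
    # Sum of squared rank differences across items.
    d_squared = sum((rank_x[k] - rank_y[k]) ** 2 for k in rank_x)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Placeholder ranks for illustration only.
LEADERBOARD_RANK = {"model-a": 1, "model-b": 2, "model-c": 3, "model-d": 4}
USAGE_RANK = {"model-a": 2, "model-b": 1, "model-c": 4, "model-d": 3}

if __name__ == "__main__":
    print(spearman_rho(LEADERBOARD_RANK, USAGE_RANK))
```

A value near 1 would suggest the two signals mostly agree, so neither adds much over the other. A low or negative value would mean they capture different things. Agreement alone does not say which one tracks practical value.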
