Launching today

SchemaFit
CI linter for LLM structured-output schemas
11 followers
SchemaFit is an MIT-licensed CI linter for LLM structured-output schemas. It checks JSON Schema, tool definitions, and response_format schemas against provider-specific constraints before runtime. It catches unsupported keywords, nesting issues, required/optional mismatches, and portability problems across OpenAI, Anthropic, Gemini, Mistral, and Cohere, so teams can fail PRs instead of production calls.

To see how bad the problem actually is, I ran SchemaFit over 50 real, public schemas from OpenAI/Anthropic cookbooks, agent frameworks, and official MCP servers, each provenance-linked to its source. 44 of 50 (88%) would be rejected by at least one major provider. Only 3 of 50 were clean across all five. The single biggest culprit was additionalProperties: 134 flags, more than every other keyword combined.
One real example: an MCP get_channel_history tool with just two properties breaks OpenAI strict mode three ways at once (no additionalProperties:false, limit not in required, and limit has a default), yet passes Anthropic, Gemini, and Cohere. That's the portability problem in one schema.
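To make those three failures concrete, here is a minimal sketch of the same checks in plain Python. It is not SchemaFit's implementation or API: the function name strict_mode_issues is hypothetical, and the schema is a hand-written approximation of the tool's input (the channel_id property name and the default value are assumptions; only limit comes from the example above). It checks only the three rules named above, on a top-level object.

```python
def strict_mode_issues(schema):
    """Check a top-level object schema against the three rules named above.

    Hypothetical helper for illustration only; not part of SchemaFit.
    """
    issues = []
    props = schema.get("properties", {})
    required = set(schema.get("required", []))

    # Rule 1: the object must set additionalProperties to false explicitly.
    if schema.get("additionalProperties") is not False:
        issues.append("additionalProperties must be false")

    for name, spec in props.items():
        # Rule 2: every declared property must appear in required.
        if name not in required:
            issues.append(f"{name}: not listed in required")
        # Rule 3: properties may not carry a default.
        if "default" in spec:
            issues.append(f"{name}: default is not allowed")

    return issues


# Assumed shape of the two-property tool input (illustrative, not the real file).
channel_history = {
    "type": "object",
    "properties": {
        "channel_id": {"type": "string"},
        "limit": {"type": "number", "default": 10},
    },
    "required": ["channel_id"],
}

for issue in strict_mode_issues(channel_history):
    print(issue)
```

Under these assumptions the sketch reports three issues for the schema, one per rule. Adding additionalProperties: false, listing limit in required, and dropping the default clears all three.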
Honest caveat: the rule packs vary in firmness. OpenAI's constraints are documented and firm; Mistral's are a conservative reading of its docs; Gemini's are warnings, not hard errors. The corpus is provider-mixed by design, so 88% is a floor, not a cherry-pick.
Corpus, repro script, and provenance: https://github.com/OrionArchitekton/schemafit/tree/main/benchmarks
Repo: https://github.com/OrionArchitekton/schemafit
It's MIT-licensed, pure Python, static, and offline: pip install schemafit, no API key, zero runtime deps. I'd genuinely like your eyes on the rule packs. Tell me where they're too strict, too loose, or just wrong for a provider you use.