All activity
Oskar Kocol left a comment
If going with tracing, Narev integrates with @Helicone AI, @Langfuse, @LangSmith, and @W&B Weave by Weights & Biases. More on tracing in our docs.

Narev: Find a faster and cheaper LLM in minutes
Oskar Kocol left a comment
If going for the API endpoint, here are some guides:
- How to choose an LLM, with a Kaggle and Google Colab notebook
- How to choose an LLM (if your data is labelled), with a Kaggle and Google Colab notebook

```python
import requests

NAREV_API_KEY = ""
NAREV_BASE_URL = "https://www.narev.ai/api/applica... ID>/v1"
STATE_OF_ART_MODEL = "openrouter:google/gemini-3-pro-preview"
messages = []
response = ...
```
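A slightly fuller sketch of hitting an OpenAI-compatible chat endpoint like the one above. The base URL shape, key format, and `build_chat_request` helper are placeholders for illustration, not Narev's actual values:

```python
# Minimal sketch of an OpenAI-compatible chat completion call. The key and the
# <APPLICATION ID> path segment below are hypothetical placeholders.
NAREV_API_KEY = "sk-..."  # assumed key format
NAREV_BASE_URL = "https://www.narev.ai/api/<APPLICATION ID>/v1"  # assumed path shape

def build_chat_request(messages, model="openrouter:google/gemini-3-pro-preview"):
    """Assemble the request pieces without sending anything over the network."""
    return {
        "url": NAREV_BASE_URL + "/chat/completions",
        "headers": {
            "Authorization": "Bearer " + NAREV_API_KEY,
            "Content-Type": "application/json",
        },
        "json": {"model": model, "messages": messages},
    }

req = build_chat_request([{"role": "user", "content": "Summarize this ticket."}])
# To actually send it:
#   requests.post(req["url"], headers=req["headers"], json=req["json"])
print(req["url"])
```

Keeping request construction separate from sending makes it easy to swap models per A/B arm while reusing the same messages.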

Oskar Kocol left a comment
To get started you integrate through:
- import of your traces (we support pretty much every major provider)
- calling our API (OpenAI-compatible API endpoint)
- file import (JSON, JSONL, CSV)
- manual entry (yes, just type)

Here is our blog post on GPT-3.5 beating GPT-5 in structured extraction to get you inspired.
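The file-import path can be as simple as writing prompt/response pairs to JSONL, one JSON object per line. The field names here (`prompt`, `response`) are an assumed schema for illustration; check the docs for the real import format:

```python
import json
import os
import tempfile

# Hypothetical record shape for file import.
rows = [
    {"prompt": "Classify sentiment: 'great service'", "response": "positive"},
    {"prompt": "Classify sentiment: 'slow shipping'", "response": "negative"},
]

path = os.path.join(tempfile.gettempdir(), "traces.jsonl")
with open(path, "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")  # one JSON object per line = JSONL

# Reading it back: each line parses independently.
with open(path) as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded))  # → 2
```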

Oskar Kocol left a comment
What is Narev? A BYO benchmark for LLMs (a replacement for evals).
What does Narev do? Answers "What's the best model for my XYZ use case?" (spoiler alert: no one knows, not even the LLM leaderboards).
How does Narev do it? We help you set up a quick benchmark on YOUR OWN data. We already do A/B testing for product (LaunchDarkly) and marketing (Mailchimp, HubSpot), so let's do it for LLMs.

Oskar Kocol left a comment
For quick experiments, just enter the prompts manually through the UI:

Our objective was to set up an A/B test and see the results in 5 minutes.
We hit it.
We worked hard to make setting up a benchmark easier by:
1. improving model search
2. adding a smart model recommendation engine
3. adding support for publicly sharing the results

Run a series of A/B tests for your LLM setup in 15 minutes. Define the model, parameters, and system prompt, and see the impact on latency, cost, and quality. Call it a bespoke benchmark.

Narev: Rapid A/B testing for LLMs
Oskar Kocol left a comment
Hey, we've built this to allow quick iteration and finding the best quality for the dollar spent. Here is a short Loom video showing how it looks: https://www.loom.com/share/39b7cd24166c4e1fafd5e7fbd12a9d4d?sid=55659f54-ba68-4be2-be53-15057001a1c7 Open access at https://narev.ai. Would love to get your feedback!

Oskar Kocol left a comment
Hey everyone! I built Narev to solve a problem I was facing myself: scattered SaaS billing data across AWS, Azure, GCP, OpenAI, and other services. Getting a unified view was a nightmare. Narev ingests all your billing data and normalizes it using FOCUS 1.2 (the FinOps Foundation standard), giving you one clean dashboard for everything. Since it's self-hosted, your financial data never leaves...

Narev: Open source FinOps for AI
Narev ingests SaaS billing data and lets you export it in FOCUS 1.2 format. It comes with one dashboard for everything. Self-hosted and open source, so your data stays private.
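Normalizing billing data to a common schema is what makes the single dashboard possible. A toy export sketch using a few FOCUS-style column names (`ProviderName`, `ServiceName`, `BilledCost`); the real FOCUS 1.2 specification defines many more columns, and the rows here are made up:

```python
import csv
import io

# Illustrative normalized billing rows; values are invented, not real charges.
rows = [
    {"ProviderName": "AWS", "ServiceName": "Amazon S3", "BilledCost": "12.50"},
    {"ProviderName": "OpenAI", "ServiceName": "GPT-4o API", "BilledCost": "87.10"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["ProviderName", "ServiceName", "BilledCost"])
writer.writeheader()
writer.writerows(rows)

export = buf.getvalue()
total = sum(float(r["BilledCost"]) for r in rows)
print(export.splitlines()[0])  # → ProviderName,ServiceName,BilledCost
```

Because every provider lands in the same columns, cross-provider totals become a one-line aggregation instead of per-vendor parsing code.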

NarevOpen source FinOps for AI
