Papr Graph - Upgrade to graph-native vector embeddings

Papr Graph transforms semantic embeddings into graph-native embeddings with one API call. It encodes temporal, topical, and other dimensions within any embedding, helping agents retrieve answers based on correctness, not just semantic closeness.

Hello everyone. I'm Amir, founder of Papr.

We built Papr Graph after seeing AI agents fail in production. The model wasn't the problem — retrieval was. Multi-hop questions, versioned policies, relational data — flat vector search breaks on all of it.

Vector search ranks by semantic closeness. But closeness ≠ correctness. A doc saying "aspirin reduces heart attack risk" and one saying "aspirin causes stomach bleeding" rank nearly identically, because they're both about aspirin. For an agent making a recommendation, that's the difference between helpful and harmful.

Papr Graph is a graph-native embedding that sits between your existing embeddings and your agent. It encodes structured signals (topic, time, intent, entities, anything you define) directly into your embedding, so ranking reflects meaning in context, not just surface similarity. It's model-agnostic and works with whatever embeddings you're already using.

We saw Papr Graph improve existing embeddings on MTEB (coding, scifact, finance tasks) by 5-20%. On Stanford STaRK (MAG synthesized 10% dataset), Papr Graph leads retrieval models with 92% hit@5 accuracy.
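For readers unfamiliar with the metric, here is a minimal sketch of how hit@k is commonly computed: the fraction of queries for which at least one relevant document appears in the top k results. This assumes the usual definition and is not the benchmark's own evaluation code; the function name and the toy document IDs are hypothetical.

```python
from typing import Sequence, Set


def hit_at_k(
    ranked_ids: Sequence[Sequence[str]],
    relevant_ids: Sequence[Set[str]],
    k: int = 5,
) -> float:
    """Fraction of queries with at least one relevant document in the top k."""
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("need one ranking and one relevant set per query")
    if not ranked_ids:
        return 0.0
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids, relevant_ids)
        if any(doc_id in relevant for doc_id in ranked[:k])
    )
    return hits / len(ranked_ids)


# Hypothetical toy data: three queries, document IDs invented for illustration.
rankings = [["d1", "d7", "d3"], ["d9", "d2", "d4"], ["d5", "d6", "d8"]]
relevant = [{"d3"}, {"d0"}, {"d5", "d6"}]

score_at_2 = hit_at_k(rankings, relevant, k=2)  # only the third query hits
score_at_3 = hit_at_k(rankings, relevant, k=3)  # first and third queries hit
```

A hit@5 of 92% under this definition would mean that 92 out of every 100 queries had a correct document somewhere in the first five results.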

Getting started is free. Keep your existing stack. Add our plugin. Drop graph-native ranking into your current retrieval flow with one API call.

Hi Amir, congrats on the launch. Isn't this more an issue of poor semantic structuring at the embed stage? Also, how do you automate the process of understanding the context of each vertical?

The issue happens because of semantic embeddings. If you train a custom model on your exact domain with rich contrastive signals, you can work around the purely 'semantic' nature of the embedding and close some of this gap for queries that need contrastive signals.

But two problems remain:

1. Embeddings collapse everything into one similarity score. Topic, recency, intent, entity overlap — they all get squashed into a single number. You can't say "prefer recent docs" or "prefer same-entity matches" at query time without re-embedding the whole corpus.

2. Re-training isn't realistic for most teams. Production stacks run on off-the-shelf embeddings (OpenAI, Cohere, Voyage) because custom training is expensive and brittle. Every time your schema changes — new policy version, new entity type — you'd re-train. Papr Graph adds those structured signals as a layer on top of whatever embeddings you already use, so you get the benefit without owning the model.
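To make point 1 concrete, here is a rough sketch of what keeping signals separate buys you. It is not Papr Graph's implementation or API. Every name here (`Doc`, `signal_scores`, `rank`), the half-life recency decay, the Jaccard entity overlap, and the toy vectors are assumptions chosen for illustration. The point is only that per-signal scores can be reweighted at query time, while a single cosine score cannot.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Doc:
    doc_id: str
    embedding: List[float]
    age_days: float
    entities: Set[str]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def signal_scores(query_vec: List[float], query_entities: Set[str], doc: Doc,
                  half_life_days: float = 30.0) -> Dict[str, float]:
    """Return one score per signal instead of a single collapsed number."""
    # Assumed recency model: the score halves every `half_life_days`.
    recency = 0.5 ** (doc.age_days / half_life_days)
    # Assumed entity model: Jaccard overlap between query and doc entities.
    union = query_entities | doc.entities
    entity = len(query_entities & doc.entities) / len(union) if union else 0.0
    return {
        "semantic": cosine(query_vec, doc.embedding),
        "recency": recency,
        "entity": entity,
    }


def rank(query_vec: List[float], query_entities: Set[str], docs: List[Doc],
         weights: Dict[str, float]) -> List[str]:
    """Rank docs by a weighted sum of signals; weights are chosen per query."""
    def total(doc: Doc) -> float:
        scores = signal_scores(query_vec, query_entities, doc)
        return sum(weights.get(name, 0.0) * value for name, value in scores.items())
    return [d.doc_id for d in sorted(docs, key=total, reverse=True)]


# Hypothetical toy corpus: two versions of the same policy.
docs = [
    Doc("policy_v1", [1.0, 0.0], age_days=400, entities={"refund_policy"}),
    Doc("policy_v2", [0.9, 0.1], age_days=10, entities={"refund_policy"}),
]
query_vec, query_entities = [1.0, 0.0], {"refund_policy"}

semantic_only = rank(query_vec, query_entities, docs, {"semantic": 1.0})
prefer_recent = rank(query_vec, query_entities, docs,
                     {"semantic": 1.0, "recency": 0.5})
```

In this toy setup the semantic-only ranking puts the outdated `policy_v1` first, and adding a recency weight at query time flips the order without re-embedding anything.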

On the vertical question, we ship cross-domain defaults and have an agent to help edit or generate new schemas.

I like the idea of retrieving based on correctness, not just similarity. How do you evaluate that? Do you have benchmarks showing fewer hallucinated citations or better-grounded answers?

The aspirin example nails the core problem: semantic similarity has always confused proximity with correctness. Curious about how the structured signals get defined. For "topic, time, intent, entities, anything you define", is that schema defined manually per use case, or does Papr Graph infer the right signals from the corpus automatically? That decision feels like the difference between a tool that takes a week to configure and one that works in an afternoon.