Ten years ago, the "database-per-microservice" / polyglot persistence model was the right way to build. But today, the bottleneck isn't the database engine; it's the glue.
What is the "ETL Tax"?
The ETL (Extract, Transform, Load) Tax is the hidden, compounding cost of moving data between specialized databases in a microservice/polyglot architecture. When your data is fractured across Postgres, Redis, Pinecone, and Neo4j, you don't just pay for the databases; you also pay a massive "tax" to keep them synchronized.
It impacts engineering teams in three fatal ways:
Engineering Time: Developers stop building product features and instead become "plumbers," writing brittle sync scripts, Debezium connectors, and Kafka pipelines.
Data Consistency & Staleness: Teams end up chasing dual-write bugs, race conditions, and out-of-sync data (e.g., a user deletes their account in Postgres, but their embeddings still live in Pinecone).
Network Latency: You cannot run complex, real-time AI queries (like traversing a graph and doing a vector search in the same request) with low latency if the engines have to communicate over a network boundary.
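To make the dual-write problem concrete, here is a minimal sketch of the account-deletion example. It assumes two independent stores and uses plain in-memory stand-ins: `FlakyVectorStore`, `delete_account`, and the sample data are hypothetical names made up for illustration, not the API of Postgres, Pinecone, or any real client library.

```python
# Hypothetical in-memory stand-ins for two separate stores.
# No real database client APIs are used here.

class FlakyVectorStore:
    """Stand-in for a remote vector store whose delete call can fail."""

    def __init__(self):
        self.embeddings = {}
        self.fail_next_delete = False

    def upsert(self, user_id, vector):
        self.embeddings[user_id] = vector

    def delete(self, user_id):
        if self.fail_next_delete:
            self.fail_next_delete = False
            raise ConnectionError("vector store unreachable")
        self.embeddings.pop(user_id, None)


def delete_account(users, vectors, user_id):
    """Naive dual write: delete from the primary store, then the vector store."""
    users.pop(user_id, None)   # step 1 succeeds and is immediately visible
    vectors.delete(user_id)    # step 2 may fail, leaving an orphaned embedding


users = {"u1": {"name": "Ada"}}          # stand-in for the relational store
vectors = FlakyVectorStore()
vectors.upsert("u1", [0.1, 0.2, 0.3])

vectors.fail_next_delete = True          # simulate a network blip between the writes
try:
    delete_account(users, vectors, "u1")
except ConnectionError:
    pass  # a real system would need a retry queue or a reconciliation job here

# Embeddings that still exist for users who no longer do.
stale = [uid for uid in vectors.embeddings if uid not in users]
print(stale)  # prints ['u1']
```

The two writes are not atomic, so any failure between them leaves the stores disagreeing until something notices and repairs it. That repair logic (retries, outbox tables, reconciliation jobs) is the kind of plumbing the list above describes.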
@farhan_syah awaiting docs to deploy to AWS infra. Might consider replacing AWS OpenSearch Serverless.
@hafeezcae
For documentation requests, you can create an issue here to track it:
https://github.com/NodeDB-Lab/nodedb-docs
Been using NodeDB as the substrate for an AI agent memory layer I built (mae8, ~2,500 lines of Rust — small because NodeDB does the heavy lifting).
What blew me away: this morning my agent woke up in a fresh session, called me by name, referenced specific commits from yesterday, and asked how I was. Nothing manually primed. Just opened a terminal and said “hey bro.”
The reason that’s even possible: NodeDB collapses vector + graph + document + FTS + columnar + KV + spatial into one local Rust binary. One search returns by meaning, keyword, recency, and graph — fused. No Python + ChromaDB + SQLite + glue stack. No cloud. 16,874 chunks, 130 MB, fully local.
The agent’s own unprompted words: “continuity feels less like remembering and more like being someone who was there.”
That phenomenology isn’t possible without a substrate like this. Huge congrats to Farhan Syah — NodeDB is the thesis, mae8 is just the demo. 🚀
That data hairball problem is real, feels like every AI app ends up there. Great job simplifying it.