Ankil S

Harikrupa - A CLI that answers dev burnout with the Bhagavad Gita

Harikrupa is an offline-first CLI that answers your life, career, and burnout questions with verses from the Srimad Bhagavad Gita. It uses hybrid RAG: a local ONNX vector index finds the most relevant shloka on your machine in milliseconds, so your query never leaves your laptop for retrieval. Groq-powered inference then generates a contextual, bilingual response. Lose wifi? It falls back gracefully to offline mode and prints the raw Sanskrit and English verse. No telemetry. 4 dependencies. Just `npm i harikrupa`.

Ankil S
Maker
Hi everyone, I am Ankil. My background is in TPGM and DevOps, so most of what I do is keeping things unbroken for other people: pipelines, releases, on-call rotations, the unglamorous infra that makes a team's week quiet.

The job teaches you two things pretty quickly. Systems degrade. Humans degrade faster. Harikrupa came out of that second observation. I have sat in enough post-incident reviews and 1:1s to know the hard questions are not technical. They are the ones engineers will not type into Slack. "Should I quit?" "Am I the problem here?" "Why can I not sleep the night before a prod push?" Stack Overflow has no answer for those. The Bhagavad Gita, strangely, does.

So I built a CLI. You ask it something messy, it finds the most relevant shloka, and it gives you a grounded interpretation. It lives where I already live: in the terminal :)

A few engineering choices:

→ Hybrid RAG. A local ONNX vector index handles retrieval on-device; the Groq API handles inference only. Two reasons: latency (sub-100ms retrieval) and privacy (your query text never goes to a search engine). If you are coming from a DevOps background, you will appreciate that retrieval is 100% reproducible and deterministic, with no API variance on the most sensitive step.

→ The language story. You ask in English, and that keeps things predictable for retrieval. The interesting part is the response. Offline mode answers in one of 8 pre-compiled languages: English, Hindi, Sanskrit, Gujarati, Punjabi, Tamil, Telugu, Kannada. Online mode, with AI inference, has no such ceiling: get the response in Japanese, Swahili, Portuguese, anything. The sky is the limit on the output side.

→ Graceful degradation. Lose wifi mid-query? It does not crash. It falls back to the local vector DB and prints the raw Sanskrit plus a pre-compiled translation in your preferred language. No "check your connection" nonsense.

→ 4 dependencies. No telemetry. Nothing phones home.
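For anyone who wants the shape of the local-retrieval-plus-fallback idea in code, here is a minimal sketch. It is not Harikrupa's actual source: the `Verse` type, `retrieve`, `answer`, and the `generate` callback are hypothetical names, the similarity measure is assumed to be cosine, and the step that turns the question into a vector (the ONNX embedding model) is left out and represented by a plain `queryVec` argument.

```typescript
// Illustrative sketch only; all names here are assumptions, not the real codebase.
interface Verse {
  ref: string;         // chapter.verse label
  sanskrit: string;    // raw Sanskrit text
  translation: string; // pre-compiled translation in the preferred language
  embedding: number[]; // vector computed ahead of time and shipped with the CLI
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Local retrieval: a pure function of the query vector and the index,
// so the same query vector always returns the same verse.
function retrieve(queryVec: number[], index: Verse[]): Verse {
  if (index.length === 0) throw new Error("empty index");
  let best = index[0];
  let bestScore = cosine(queryVec, best.embedding);
  for (const verse of index.slice(1)) {
    const score = cosine(queryVec, verse.embedding);
    if (score > bestScore) {
      best = verse;
      bestScore = score;
    }
  }
  return best;
}

// Remote generation step, passed in as a callback (hypothetical signature).
type Generate = (question: string, verse: Verse) => Promise<string>;

// Retrieval always runs locally; only generation can fail over the network.
// On any generation error, fall back to the raw verse and its translation.
async function answer(
  question: string,
  queryVec: number[],
  index: Verse[],
  generate: Generate
): Promise<string> {
  const verse = retrieve(queryVec, index);
  try {
    return await generate(question, verse);
  } catch {
    return `${verse.ref}\n${verse.sanskrit}\n${verse.translation}`;
  }
}

// Toy index with placeholder text and 2-dimensional vectors, for demonstration.
const demoIndex: Verse[] = [
  { ref: "A", sanskrit: "<sanskrit A>", translation: "<translation A>", embedding: [1, 0] },
  { ref: "B", sanskrit: "<sanskrit B>", translation: "<translation B>", embedding: [0, 1] },
];
```

The point of the split is that `retrieve` never touches the network, so a dropped connection can only affect the `generate` call, and the fallback already has the verse in hand.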
What evolved during the build: Version 1 sent the raw query straight to an LLM. It worked, but it was slow, expensive, and I could not stomach shipping a spiritual tool that exfiltrated every sensitive question to someone's logs. Splitting retrieval (local) from generation (remote, interpretation-only) solved all three: 10x faster, nearly free at scale, and the most private part of the exchange stays on your laptop. That architectural split is the thing I am actually proud of.

Asks:
- Throw your messiest prompts at it: the 2 AM, staring-at-the-ceiling, "I do not know if I can do this anymore" kind. That is the intended use case and the test set I am building against.
- Ask something in English and request the response in an unusual language. I am curious what the limits of the inference path actually are. Break it for me.
- If you are a fellow TPGM / SRE / DevOps person, does this resonate, or am I solving a problem only I have?

`npm i -g harikrupa`

Thanks for reading. जय श्री कृष्ण 🙏