I wanted to share an open-source project called ProxyFace. If you're interacting with LLMs and want a more engaging experience, it adds a real-time pixel-art avatar that reacts to the AI's output with visible emotions, and it runs entirely on your own machine.
Most of us have hit that wall where a 6GB or 8GB GPU just gives up the ghost as soon as you feed it a long PDF. Quansloth (Apache 2.0 licensed, on GitHub) is an implementation of TurboQuant (ICLR 2026), and having one this early is a game-changer for the local LLM scene. Why this is a big deal:
Built on an implementation of Google's TurboQuant (ICLR 2026), Quansloth brings KV cache compression to local LLM inference. It is a fully private, air-gapped AI server that runs massive-context models natively on consumer hardware. Repo: PacifAIst/Quansloth
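To put rough numbers on why long contexts sink small GPUs, here is a back-of-the-envelope sketch of KV cache size. It uses the standard estimate (keys plus values, per layer, per KV head, per token) and is not taken from Quansloth or TurboQuant. The model dimensions below are hypothetical (roughly an 8B-class model with grouped-query attention), the bit widths are illustrative only, and the estimate ignores quantization metadata such as scales.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bits_per_value: int) -> int:
    """Approximate KV cache size in bytes for a single sequence.

    The factor of 2 accounts for storing both keys and values.
    Quantization overhead (scales, zero points) is ignored.
    """
    values = 2 * layers * kv_heads * head_dim * seq_len
    return values * bits_per_value // 8


if __name__ == "__main__":
    # Hypothetical model shape and context length, chosen for illustration.
    layers, kv_heads, head_dim, seq_len = 32, 8, 128, 32_768
    for bits in (16, 8, 4):
        size = kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bits)
        print(f"{bits:>2}-bit KV cache: {size / 1024**3:.2f} GiB")
```

With these assumed dimensions, a 32K-token context needs about 4 GiB of cache at 16 bits per value, on top of the model weights, and about 1 GiB at 4 bits. That gap is the kind of headroom KV cache compression is after on a 6GB or 8GB card.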