AI Safety starts before the Model
Governance, risk allocation, and system design are doing more work than your benchmark score.

There is a particular kind of confidence that comes from a good safety benchmark score. Your model refused harmful prompts. It passed the red-team suite. The evaluation report is clean. You ship. [7] Then something goes