The Architecture of a Chrome Extension: A Backend Engineer's Perspective
I'm equal parts ML Engineer and backend engineer. I think in models, services, queues, and databases. The browser has always been someone else's problem.
So when I decided to build PromptMask — a privacy-first Chrome extension powered by on-device AI — I expected the frontend world to feel foreign. And some of it did. What surprised me was how clearly the underlying architecture mapped to patterns I already think in.
It's Not a Web App. It's a Distributed System.
The first thing to understand about a Chrome extension is that it is not one thing running in one place. It is multiple isolated components, each with its own execution context, its own memory, and its own lifetime. They don't share variables. They talk to each other by passing messages.
If that sounds like microservices, that's because it basically is.
Once I stopped thinking "web app" and started thinking "small distributed system with strict boundaries," everything clicked.
manifest.json — Your Service Registration
Every extension starts with a manifest.json. This is the file Chrome reads to understand what your extension is, what components it has, and what permissions it needs.
{
  "manifest_version": 3,
  "name": "PromptMask",
  "version": "1.0.0",
  "permissions": ["storage", "offscreen"],
  "host_permissions": ["https://chat.openai.com/*", "https://claude.ai/*"],
  "background": {
    "service_worker": "service_worker.js",
    "type": "module"
  },
  "content_scripts": [...]
}
The principle is minimal permissions: only ask for what you actually need. Same discipline you'd apply to IAM roles or database access controls.
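To make the minimal-permissions habit concrete, here is a small sketch of a check you could run over a manifest object, in the spirit of an IAM policy linter. The helper name findBroadHostPermissions and the list of "too broad" patterns are my own assumptions for illustration, not something Chrome or PromptMask ships.

```javascript
// Hypothetical lint helper: flag host permissions that match far more
// sites than a narrowly scoped extension should need.
const BROAD_PATTERNS = new Set([
  "<all_urls>",
  "*://*/*",
  "http://*/*",
  "https://*/*",
]);

function findBroadHostPermissions(manifest) {
  // A manifest without host_permissions simply has nothing to flag.
  const hosts = manifest.host_permissions ?? [];
  return hosts.filter((pattern) => BROAD_PATTERNS.has(pattern));
}

// Trimmed-down copy of the manifest fields shown above.
const manifest = {
  manifest_version: 3,
  permissions: ["storage", "offscreen"],
  host_permissions: ["https://chat.openai.com/*", "https://claude.ai/*"],
};

console.log(findBroadHostPermissions(manifest)); // prints []
```

The two host patterns in the manifest are scoped to specific sites, so nothing is flagged. A pattern like "<all_urls>" would be returned as something to justify or remove.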
Contexts — Each Component Has Its Own Process
In a Chrome extension, different components run in completely isolated JavaScript environments called contexts. They do not share memory. A variable you set in one context is invisible to another.
For a backend engineer, the easiest mental model is separate processes. You wouldn't expect one service to read another service's heap directly. You'd call its API. Same idea here.
The main contexts you'll work with are the service worker, the content script, and (if you need it) the offscreen document. Let's go through each.
Service Worker — The Daemon Process
The service worker is the background brain of your extension. It has no UI. It doesn't touch any webpage. It runs in the background, has privileged access to Chrome's extension APIs, and is the component everything else calls into when it needs something done.
Think of it as a daemon process. It's not attached to any user-facing session — other components call into it when they need something from that privileged layer. In PromptMask, the service worker manages the lifecycle of the offscreen document (where the AI model runs) and routes inference requests between components.
One important gotcha: unlike a real daemon, the service worker is not persistent. Chrome can shut it down when it is idle and restart it on demand, so it wakes up, handles an event, and may go back to sleep. Any state that needs to survive that cycle has to live somewhere durable, not in a JavaScript variable.
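Here is a minimal sketch of what "somewhere durable" can look like for a simple counter. The function name recordRedaction and the redactionCount key are hypothetical. The storageArea argument is assumed to be something like chrome.storage.local, which exposes promise-based get and set in Manifest V3.

```javascript
// Anti-pattern: a module-level variable is reset every time Chrome
// restarts the service worker.
// let redactionCount = 0;

// Sketch: read-modify-write against durable storage instead.
// `storageArea` is assumed to behave like chrome.storage.local.
async function recordRedaction(storageArea) {
  // Passing an object to get() supplies defaults for missing keys.
  const { redactionCount } = await storageArea.get({ redactionCount: 0 });
  const next = redactionCount + 1;
  await storageArea.set({ redactionCount: next });
  return next;
}

// In the service worker this might be called as:
//   await recordRedaction(chrome.storage.local);
```

Because the count is re-read from storage on every call, it does not matter whether the worker was freshly started or has been alive for a while.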
Content Script — Request Middleware
A content script is code that runs inside a web page. It can read the page's DOM, modify it, and intercept user interactions. This is the component that lets your extension actually do something on sites like ChatGPT or Gemini.
The backend analogy here is request middleware or an interceptor. It sits in the path of something that's about to happen — say, a user hitting "send" on a form — and it can inspect or modify that data before it continues.
One thing that confused me at first: the content script runs inside the page, but it's still isolated from the page's own JavaScript. It can see and manipulate the DOM, but it doesn't share variables with the site's code (Chrome calls this an isolated world). It's embedded, but sandboxed. Think of it as a sidecar: it runs alongside the page and shares its DOM, but it is its own isolated runtime handling a cross-cutting concern the host page knows nothing about.
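The middleware idea can be sketched in a few lines. This is not PromptMask's actual code: the regex is a naive stand-in for the real redaction step, the names redactEmails and attachRedactor are hypothetical, and real chat sites may not use a plain form submit at all.

```javascript
// Naive stand-in for the redaction step (an assumption for illustration,
// not the on-device model).
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function redactEmails(text) {
  return text.replace(EMAIL_PATTERN, "[EMAIL]");
}

// Hypothetical wiring: rewrite an input's value just before its form submits.
// `doc` is the page's document; `inputSelector` picks the prompt field.
function attachRedactor(doc, inputSelector) {
  const onSubmit = (event) => {
    // For a submit event, event.target is the form being submitted.
    const input = event.target.querySelector(inputSelector);
    if (input) input.value = redactEmails(input.value);
  };
  // Capture phase on the document, so this runs before listeners
  // registered on the form itself.
  doc.addEventListener("submit", onSubmit, true);
  return onSubmit;
}

// In a content script this might be called as:
//   attachRedactor(document, "textarea");
```

The listener only touches the DOM, which is exactly the part of the page a content script is allowed to share.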
Offscreen Document — The Sandbox Process
This one is less well known, and it was the most interesting architectural decision in PromptMask.
An offscreen document is a hidden page your extension can create and run in the background. You can't see it — there's no tab, no UI — but it's running, and critically, it has a full DOM context.
Think of it as a sandbox process. When your main process lacks a capability it needs, you spin up an isolated child process to handle it. That's exactly what's happening here — the service worker has no window, no document, no page context at all. But some browser APIs were designed assuming they run inside a page. WebGPU is one of them. So rather than running the AI model in the service worker (which can't), you spin up a hidden page that can.
In PromptMask, the offscreen document is where the entire inference pipeline lives. The service worker delegates to it, it runs the model, and it passes the result back.
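A rough sketch of how a service worker might manage that lifecycle, using chrome.runtime.getContexts and chrome.offscreen.createDocument. The file name offscreen.html, the "WORKERS" reason, and the justification string are assumptions for illustration. I'm not claiming these are the values PromptMask uses.

```javascript
// Hypothetical path to the hidden page bundled with the extension.
const OFFSCREEN_PATH = "offscreen.html";

// Creates the offscreen document only if one is not already running.
// `api` defaults to the global chrome object; passing it in keeps the
// function easy to exercise outside the browser.
async function ensureOffscreenDocument(api = globalThis.chrome) {
  const existing = await api.runtime.getContexts({
    contextTypes: ["OFFSCREEN_DOCUMENT"],
    documentUrls: [api.runtime.getURL(OFFSCREEN_PATH)],
  });
  if (existing.length > 0) return false; // already alive, nothing to do

  await api.offscreen.createDocument({
    url: OFFSCREEN_PATH,
    reasons: ["WORKERS"], // assumed reason; pick the one matching your use
    justification: "Run an on-device model that needs a page context.",
  });
  return true; // a new document was created
}
```

Calling this before every inference request makes the spawn step idempotent, much like checking for a running child process before forking another. A production version would also want to guard against two callers racing to create the document at the same time.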
Message Passing — Internal API Calls
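Since all these contexts are isolated, they communicate by sending messages. Chrome gives you chrome.runtime.sendMessage (to reach the extension's own contexts, such as the service worker) and chrome.tabs.sendMessage (to reach a content script in a specific tab) for this.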
It maps directly to IPC (inter-process communication), or you can think of it as internal API calls between services. One component sends a message with a type and a payload; another component listens and responds. The service worker usually sits in the middle: it's the component that can talk to everyone else, so it becomes the natural coordinator.
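As a sketch of that type-plus-payload convention, here is a tiny router that turns a table of handlers into a listener with the signature chrome.runtime.onMessage expects. The name createMessageListener, the PING message type, and the { ok, result } reply envelope are hypothetical choices of mine.

```javascript
// Builds an onMessage-style listener from a { TYPE: handler } table.
function createMessageListener(handlers) {
  return (message, sender, sendResponse) => {
    const type = message?.type;
    // Unknown type: not ours, so let another listener answer.
    if (!Object.hasOwn(handlers, type)) return false;

    Promise.resolve()
      .then(() => handlers[type](message.payload, sender))
      .then((result) => sendResponse({ ok: true, result }))
      .catch((error) => sendResponse({ ok: false, error: String(error) }));

    // Returning true tells Chrome the response will arrive asynchronously.
    return true;
  };
}

// Hypothetical handler table for illustration.
const listener = createMessageListener({
  PING: () => "pong",
});

// In the service worker this would be registered with:
//   chrome.runtime.onMessage.addListener(listener);
// and called from another context with:
//   const reply = await chrome.runtime.sendMessage({ type: "PING", payload: {} });
```

Wrapping every reply in the same envelope plays the role of a status code: the caller can tell a handler error apart from a legitimate result without parsing strings.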
chrome.storage — Your Config Store
Since the service worker can go to sleep and lose its in-memory state, you need a place to durably store things like user settings and preferences. That's chrome.storage.
Think of it as a SQLite config table: a flat key-value store with get(keys) and set({ key: value }), no schema, no relations, no query language. It persists across restarts and is used purely for config and user preferences.
The rule is simple: if a value needs to exist after the popup closes or the service worker restarts, put it in chrome.storage, not in a JavaScript variable.
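A small sketch of that rule applied to user settings. The setting names (enabled, maskStyle) and the helper names are made up for illustration. The storageArea argument is assumed to be chrome.storage.local or chrome.storage.sync.

```javascript
// Hypothetical defaults; the real extension's settings may differ.
const DEFAULT_SETTINGS = { enabled: true, maskStyle: "placeholder" };

// Returns stored settings, falling back to defaults for missing keys.
async function loadSettings(storageArea) {
  // Passing an object to get() uses its values as defaults.
  return storageArea.get(DEFAULT_SETTINGS);
}

// Persists only known keys, so stray fields never reach storage.
async function saveSettings(storageArea, changes) {
  const known = {};
  for (const key of Object.keys(DEFAULT_SETTINGS)) {
    if (key in changes) known[key] = changes[key];
  }
  await storageArea.set(known);
  return known;
}

// From a popup or the service worker this might look like:
//   await saveSettings(chrome.storage.local, { enabled: false });
//   const settings = await loadSettings(chrome.storage.local);
```

Every context reads through loadSettings rather than caching a copy, so a change made in the popup is visible to the service worker the next time it wakes up.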
It's a Genuinely Good Architecture
What I came to appreciate is that Manifest V3 — the current Chrome extension platform — has enforced a clean separation of concerns. Each component has a clearly defined role, a defined lifetime, and a defined way to communicate. There's no shared mutable state bleeding across boundaries.
For a backend engineer, that constraint is actually comfortable. It's the same discipline you apply when designing services: clear interfaces, no side channels, explicit message passing.
The browser just wraps it in a slightly different vocabulary.
I built this architecture for PromptMask, a Chrome extension that automatically redacts PII from your AI prompts before they reach the cloud — using a local AI model that runs entirely in your browser. If you use ChatGPT, Claude, or Gemini at work and ever paste anything you probably shouldn't, it's worth a look.