Building a journal the server can't read

June 19, 2026

I set out to build a journaling app where the server genuinely cannot read your entries, got the encryption wrong on the first attempt, and rebuilt it the right way.

I wanted to build a journaling platform where the server genuinely couldn't read the user's entries. Not "we promise not to look," not "encrypted at rest with a key we also hold." I meant it literally. I run the server, I own the database, and even I cannot read someone's journal. That property has a name, zero-knowledge, and the reason I wanted it is simple. No one else had built it, and it's the only way you get real privacy out of a piece of software you don't host yourself.

Fair warning: this is not necessarily a guide on how to build a zero-knowledge system. It's an overview of how I built one, including how I got it wrong at first and had to throw the whole design out.

What zero-knowledge promises

Zero-knowledge sets the bar higher than simply "encrypted." Plenty of apps encrypt your data with a key that lives on their server, which means the lock and the key are in the same building. Zero-knowledge promises that the server stores your data and fundamentally cannot decrypt it, because the key never existed anywhere the server could see. Basically, the server has the lock and only you have the key.

Why my first attempt was wrong

The first version felt responsible, which turns out to be the most dangerous kind of wrong. An account was an email and a password. The journal had a second secret, a passphrase, that never went to the server. On the user's device, I ran the passphrase through PBKDF2 to derive a key, encrypted every entry with it, and stored a hash so the client could tell whether the passphrase came back correct. The server only ever saw a salt, that verifier hash, and the ciphertext. By the strict definition at the top of this post, it worked. The key never existed anywhere the server could see, so the server genuinely could not read the entries.

And yet being zero-knowledge barely bought anything. The entries were locked, but everything keeping them locked rested on the passphrase, and almost nothing in the design made that passphrase hard to guess. The weaknesses were small, and each came from a choice that looked reasonable when I made it. Together they meant a stolen copy of the database presented far less of an obstacle than it should have.

The salt

I used the user's email as the PBKDF2 salt, because it was unique and already sitting in the database as their account identifier. Unique is the one property a salt must have, so it looked fine. What a salt does not have to be is secret. It sits in the clear next to the ciphertext, so the email being readable was never the problem. The problem is when it can be read. A random salt is unknowable until the database is already stolen, so not one guess can begin before the breach. An email is knowable today, which lets an attacker grind a dictionary of likely passphrases offline, on their own schedule, long before they touch my server. By the time they have the database, opening the journal is a lookup, not an attack.

And none of it was hidden. The key-derivation code runs in the browser, so "the salt is the email" was never a secret an attacker had to uncover. That is Kerckhoffs's principle, and a zero-knowledge web app is its purest case, because I ship my whole cryptographic design to the attacker's machine. Algorithm, parameters, salt, all of it is theirs to read, and the passphrase is the only secret left. A salt they can know in advance throws away the one thing it was there to buy, which is keeping the clock from starting until after the break-in.
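To make the head start concrete, here is a minimal sketch of the precomputation, not the code from the app. The email address, the guess list, the SHA-256 choice, and the low iteration count are all assumptions made so the example runs instantly. Comparing derived keys stands in for whatever check the attacker would really run against the stolen verifier or ciphertext.

```python
import hashlib
import secrets

# Demo value so this runs instantly; the first design used 600,000 iterations.
ITERATIONS = 1_000


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 (hash choice is an assumption)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, ITERATIONS, dklen=32)


# Hypothetical victim and guess list, for illustration only.
email = "alice@example.com"
guesses = ["correct horse battery", "purple monkey dishwasher", "winter coffee river"]

# BEFORE the breach: the salt is the email, so the slow work can start today.
precomputed = {derive_key(guess, email.encode()): guess for guess in guesses}

# AFTER the breach: matching the victim's key is a dictionary lookup.
victim_key = derive_key("winter coffee river", email.encode())
recovered = precomputed.get(victim_key)

# With a random salt, the same table matches nothing. The grind has to be
# redone from scratch, and only once the salt has been stolen.
random_salt = secrets.token_bytes(16)
victim_key_random = derive_key("winter coffee river", random_salt)
recovered_random = precomputed.get(victim_key_random)

print(recovered)
print(recovered_random)
```

The table is built once per email and costs the attacker nothing at breach time, which is the whole problem with a salt that can be known in advance.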

[Diagram: the passphrase (the only real secret) and the user's email (used as the salt) go into PBKDF2 (not memory-hard), which yields one key (no master key). That key encrypts the entries and is stored as a verifier hash (a target for guesses).]
The first design, and the shaded box is the mistake. The user's passphrase and email both feed the key, with the email serving as the salt. Half of what locks the journal is public, so an attacker can do the slow part offline before ever reaching the server.

The smaller mistakes

PBKDF2 at 600k iterations isn't cheap, but it isn't memory-hard, which is what matters once an attacker brings a GPU and runs thousands of guesses in parallel for almost nothing each. The verifier hash I was so proud of only made it easier. The authenticated ciphertext was already an oracle, since the wrong key fails to decrypt, so the verifier just handed over a second, tidier target, and maybe a cheaper one.

What I needed instead

So it was zero-knowledge on paper and openable in practice. Underneath the salt and the KDF, the deeper mistake was that the passphrase was the key, with a verifier bolted on to check it like a password. So I threw the design out, read how this is actually supposed to be done, and rebuilt it. The passphrase shouldn't be the key, it should unlock one, and then there's nothing left to verify, because a key that unlocks is its own proof. Hand back a locked key and let the passphrase either open it or not.

The design that works: a wrapped master key

When you first set a passphrase, the client generates a random 32-byte master key. Every journal entry is encrypted with that master key using AES-256-GCM. The master key is the only thing that can read your data, and it never leaves your device in a form the server can use.

Of course, the master key has to survive you closing the tab, so it has to be stored somewhere. It gets stored wrapped. Your passphrase is run through Argon2id (256 MiB of memory, 4 iterations) to derive a key-encryption key, the KEK. The KEK encrypts the master key using AES-GCM-SIV, and that wrapped blob is what goes to the server. I reached for SIV here rather than the plain AES-256-GCM the entries use because it's nonce-misuse resistant. If a nonce ever gets reused under the same key, plain GCM fails catastrophically, while SIV only leaks whether two plaintexts were equal. For the one secret I least want to get wrong, that safety margin is worth the slightly heavier mode. Entries, written constantly with a fresh random nonce each time, are fine on ordinary GCM. The passphrase derives the KEK, the KEK unwraps the master key, the master key reads your entries. Three steps, all happening on the user's device, and only the last one ever touches the user's plaintext.

So what does the server actually hold? Three things: the KDF salt, the KDF parameters, and the wrapped master key. No passphrase. No password hash. Nothing it can turn into your data. It can hand you back the wrapped blob, but it cannot unwrap it, because the only key that would is one it has never seen.
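As a rough sketch of that server-side record and the client-side derivation it supports, the snippet below uses only the Python standard library. Two loud assumptions: the standard library has no Argon2id, so scrypt stands in as the memory-hard KDF, and it has no AEAD cipher, so the wrapped master key is shown as an opaque placeholder rather than a real AES-GCM-SIV blob. The class and function names are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class KdfParams:
    # scrypt cost settings, standing in for Argon2id's memory and pass counts.
    n: int = 2**14
    r: int = 8
    p: int = 1


@dataclass(frozen=True)
class ServerRecord:
    """Everything the server holds for one journal. None of it is a key."""
    kdf_salt: bytes
    kdf_params: KdfParams
    wrapped_master_key: bytes  # opaque ciphertext as far as the server knows


def derive_kek(passphrase: str, record: ServerRecord) -> bytes:
    """Re-derive the 32-byte key-encryption key on the client."""
    params = record.kdf_params
    return hashlib.scrypt(
        passphrase.encode(),
        salt=record.kdf_salt,
        n=params.n,
        r=params.r,
        p=params.p,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )


record = ServerRecord(
    kdf_salt=secrets.token_bytes(16),            # random, generated client-side
    kdf_params=KdfParams(),
    wrapped_master_key=b"<opaque placeholder>",  # a real client would store the AEAD output
)

kek = derive_kek("winter coffee river", record)
kek_again = derive_kek("winter coffee river", record)
kek_wrong = derive_kek("winter coffee rivers", record)
```

Storing the parameters next to the salt is what lets any of the user's devices re-derive the same KEK later, and it also leaves room to raise the cost settings without changing the record's shape.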

[Diagram: On your device: passphrase (typed, never sent), derived key (KEK, via Argon2id), master key (decrypts entries), your entries (plaintext). On the server, across the trust boundary: KDF salt, KDF params, wrapped master key, encrypted entries.]
The server stores everything on the right and can decrypt none of it. The only key that would unwrap the master key never leaves your device.

That reframes what "unlock" even means. There's no password check anymore. You type your passphrase, the client derives the KEK, and it tries to unwrap the master key. If the passphrase was right, the unwrap succeeds and you're in. If it was wrong, the unwrap fails, and there's nothing else to ask. No comparison runs on the server and no separate verifier is stored, so there's no extra oracle to attack, because verification is the decryption. The unwrap is also bound to your identity through the cipher's associated data, so a blob wrapped for you can't be unwrapped under someone else's account even if the bytes get swapped.

The brute-force math

None of this stops someone from guessing the passphrase. Zero-knowledge just moves the whole fight onto that one secret, so it's worth being honest about how that fight goes and how much the rebuild actually changed it.

Three things set the cost of a guessing attack. How expensive each guess is, whether the attacker can start before they break in, and how strong the passphrase was to begin with.

The first is the key-derivation function, and it's the part people expect to matter most. The first design ran PBKDF2 at 600k iterations. The rebuild uses Argon2id at 256 MiB and four passes. The difference that matters isn't the iteration count, it's that Argon2id is memory-hard. A GPU will run thousands of PBKDF2 guesses at once because each one is just arithmetic, but it chokes on Argon2id, because every guess demands 256 MiB of its own fast memory and the card runs out of memory long before it runs out of cores. On the same hardware that's worth roughly a hundredfold fewer guesses per second. Real, but not the headline.
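The memory ceiling is easy to put a number on. The sketch below assumes a hypothetical card with 24 GiB of memory. The 256 MiB per guess comes from the parameters above. It gives only an upper bound on how many guesses can be in flight at once, not a throughput figure.

```python
MIB = 1024**2
GIB = 1024**3

ARGON2_MEMORY_PER_GUESS = 256 * MIB  # from the rebuild's Argon2id settings
GPU_MEMORY = 24 * GIB                # assumption: a hypothetical 24 GiB card


def max_parallel_guesses(gpu_memory: int, memory_per_guess: int) -> int:
    """Upper bound on concurrent guesses when each needs its own memory block."""
    return gpu_memory // memory_per_guess


parallel = max_parallel_guesses(GPU_MEMORY, ARGON2_MEMORY_PER_GUESS)
print(parallel)  # 96
```

Fewer than a hundred guesses at a time, on a card with thousands of cores that PBKDF2 would keep fully busy.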

The headline is the salt, and this is the part I got wrong the first time. Using the email as the salt didn't make any single guess cheaper. It let the attacker do the guessing in advance. With a salt they already know, the expensive grind can run on their own schedule, months before they ever touch the database. A random salt takes that away completely. Not one cycle of guessing can start until after they've broken in and read the salt out of the database, which is exactly when a defender has the best chance of noticing. The KDF sets what each guess costs. The salt decides whether the clock is even allowed to start early.

[Chart: Time to crack a 3-word passphrase (one high-end GPU, half the keyspace). First design: 0.67 yr, crackable in months. Rebuilt: 67 yr, out of reach. Ratio: 100×.]
The memory-hard rebuild turns roughly eight months of grinding into about seventy years. A better GPU slides both bars down together and leaves the ratio untouched.

And then there's the part no design controls, which is the passphrase itself. The hundredfold and the lost head start are a fixed advantage that holds at every passphrase length. What changes across the range is whether it tips the verdict. Drop to two words and the hundredfold still applies, it just drags the cracking time from hours to days. Climb to five and both are out of reach for centuries, hundredfold or not. Only in between, at a merely decent passphrase, is that multiplier the whole distance between "crackable by someone determined" and "not worth anyone's time." The rebuild did not make a bad passphrase safe. For a journal with no password reset, winning that middle case is the whole game.
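The arithmetic behind those ranges can be sketched in a few lines. The inputs here are assumptions, not measurements: a 7,776-word list (the size of a Diceware list) and a PBKDF2 guess rate of 11,000 per second, picked because it lands near the 0.67-year figure for three words. The hundredfold slowdown for Argon2id is the rough ratio quoted earlier.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

WORDLIST_SIZE = 7776                     # assumption: Diceware-sized word list
PBKDF2_GUESSES_PER_SEC = 11_000          # assumption: chosen to land near 0.67 yr
ARGON2_GUESSES_PER_SEC = PBKDF2_GUESSES_PER_SEC / 100  # "roughly a hundredfold"


def years_to_crack(words: int, guesses_per_sec: float) -> float:
    """Expected time to search half the keyspace of a random word passphrase."""
    keyspace = WORDLIST_SIZE ** words
    return (keyspace / 2) / guesses_per_sec / SECONDS_PER_YEAR


for words in (2, 3, 5):
    first = years_to_crack(words, PBKDF2_GUESSES_PER_SEC)
    rebuilt = years_to_crack(words, ARGON2_GUESSES_PER_SEC)
    print(f"{words} words: first design {first:.3g} yr, rebuilt {rebuilt:.3g} yr")
```

Under these assumptions, two words falls in under an hour on the first design and in a few days on the rebuild, three words moves from months to decades, and five words is millions of years either way. The ratio stays fixed while the verdict changes.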

The honest cost

True zero-knowledge architecture has a price. There is no password reset. If you forget your passphrase, your journal is gone. I can't email you a link, because the link would have to lead to your master key, and I don't have your master key. The instant a server can recover your data, it can read your data, and then it was never zero-knowledge. Recovery and zero-knowledge are the same property pointed in opposite directions. You get to pick one. And I won't pretend that's costless in practice. A less technical user will eventually lose years of entries to a forgotten passphrase and be rightly upset, and "the math made me do it" is cold comfort when it's your journal. The honest answer is that the alternative is worse, not that the edge isn't there.

Summary

It comes down to a single chain. Every entry is locked with a random master key, that master key is locked with your passphrase, and the master key only ever reaches the server in wrapped form. The passphrase that opens it never leaves your device, so the server can store the whole journal and read none of it.

If you'd like to check it out, visit https://commajournal.com. Just don't forget the passphrase, because I meant the part where there's no getting it back.