Threat model

We design sync against a specific set of threats. We are explicit about what is in scope and what is not.

What we defend against:

  • Full compromise of our relay server and its database.
  • Passive observation of network traffic (TLS assumed intact).
  • Account-level takeover (no accounts exist to take over).
  • A rogue internal actor at Refit attempting to read user health data. No such path exists; the data is encrypted with a key we do not hold.
  • Offline brute-force of stolen ciphertext against a reasonably strong passphrase.

What is out of scope today:

  • Device-level malware on the user's phone or laptop.
  • Loss or theft of a paired device. We cannot yet remotely revoke a single device.
  • Traffic analysis based on upload size and frequency (the content is encrypted; the size and timing of each push are not).

Architecture, in one diagram

A sync round trip, from the perspective of a single device:

1. User picks a passphrase on the device (12 characters minimum).
2. Device derives two values from the passphrase:
     syncId      (an opaque hash used to look up the blob)
     encryptKey  (AES-256 key, stays on the device only)
3. Device pulls the latest ciphertext for this syncId from the relay.
4. Device decrypts locally with encryptKey.
5. Device merges local state with remote state using a CRDT.
6. Device encrypts the merged state with a fresh 12-byte nonce (the iv in step 7).
7. Device pushes (syncId, ciphertext, iv, version).
8. Relay stores the blob. It never sees the passphrase or the plaintext.
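
The ordering of steps 3 to 7 can be sketched as a single function. This is a minimal illustration, not the code in sync.js: `syncOnce`, `relay.pull`, `relay.push`, `cryptoBox`, and `merge` are hypothetical names, and computing the pushed version as "last seen version plus one" is an assumption.

```javascript
// Hypothetical orchestration of one sync round trip (steps 3 to 7).
// `relay`, `cryptoBox`, and `merge` are injected stand-ins, not Refit's real modules.
async function syncOnce({ relay, cryptoBox, merge, syncId, localState }) {
  // Step 3: pull the latest blob (null when the sync group is new).
  const remote = await relay.pull(syncId);

  // Step 4: decrypt on the device; the relay never sees plaintext.
  const remoteState = remote
    ? await cryptoBox.decrypt(remote.ciphertext, remote.iv)
    : {};

  // Step 5: merge local and remote state on the device.
  const merged = merge(localState, remoteState);

  // Step 6: encrypt the merged state (encrypt is expected to pick a fresh nonce).
  const { ciphertext, iv } = await cryptoBox.encrypt(merged);

  // Step 7: push. Assumption: the version is the last one we saw, plus one.
  const version = (remote ? remote.version : 0) + 1;
  await relay.push({ syncId, ciphertext, iv, version });

  return merged;
}
```

Passing the relay, the crypto helpers, and the merge function in as arguments keeps the ordering testable without a network or real keys.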

Key derivation

A passphrase becomes two independent values, a lookup id and an encryption key, through PBKDF2 with two different salts:

passphrase + salt="refit-sync-id"   -> PBKDF2(600K, SHA-256) -> syncId      (hex, public)
passphrase + salt="refit-encrypt"   -> PBKDF2(600K, SHA-256) -> encryptKey  (AES-256, secret)

600,000 iterations matches the OWASP 2023 guidance for PBKDF2-HMAC-SHA256. The relay uses the syncId to look up a blob. Because the syncId is public, anyone who holds it can test passphrase guesses offline, but each guess costs 600,000 iterations, so recovering a strong passphrase from the hash is computationally infeasible.

The encryptKey never leaves the device. It exists only in memory and as a locally wrapped copy in your browser storage. The relay has never seen it and neither have we. If you forget your passphrase, your encrypted data is unrecoverable by design.

What the relay actually stores

One row per sync group, in Postgres:

sync_store (
  sync_id     TEXT PRIMARY KEY,   -- opaque 256-bit hash
  data        BYTEA NOT NULL,     -- AES-256-GCM ciphertext
  iv          BYTEA NOT NULL,     -- 12-byte GCM nonce
  version     INTEGER DEFAULT 1,  -- monotonic counter for conflict detection
  updated_at  TIMESTAMPTZ
)

No user table. No email column. No IP log tied to accounts. No device fingerprint. The relay is a typed key-value store that accepts (syncId, ciphertext, iv, version) and returns (ciphertext, iv, version). It has no concept of "user" and no way to form one.
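
To make the "key-value store with a version counter" idea concrete, here is an in-memory stand-in for the table. It is a sketch, not the relay's code: `MemoryRelay` is a hypothetical name, and the rule that a push must advance the stored version by exactly one is an assumption about how the counter detects conflicts.

```javascript
// Hypothetical in-memory model of sync_store. Not the real relay.
class MemoryRelay {
  constructor() {
    this.rows = new Map(); // syncId -> { data, iv, version, updatedAt }
  }

  // Return the stored blob for a syncId, or null if none exists.
  pull(syncId) {
    const row = this.rows.get(syncId);
    return row ? { data: row.data, iv: row.iv, version: row.version } : null;
  }

  // Assumed conflict rule: accept only a push that is exactly one
  // version ahead of what is stored; otherwise report the stored version.
  push(syncId, data, iv, version) {
    const current = this.rows.get(syncId);
    const expected = current ? current.version + 1 : 1;
    if (version !== expected) {
      return { ok: false, version: current ? current.version : 0 };
    }
    this.rows.set(syncId, { data, iv, version, updatedAt: new Date() });
    return { ok: true, version };
  }
}
```

Under this assumed rule, a device that loses the race would pull again, merge, and retry with the next version. Nothing in the store refers to a person, only to an opaque id.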

Why AES-256-GCM

Galois/Counter Mode gives us two properties at once:

  1. Confidentiality. Without the key, the ciphertext is indistinguishable from random bytes.
  2. Integrity. Tampering with a single bit of the stored ciphertext causes decryption to fail loudly on the device, rather than silently produce corrupted plaintext.

If an attacker modifies the blob on the relay, you see a decryption error on next pull, not a corrupted entry written back into localStorage. That is the part of "hack proof" that encryption alone does not handle.

The 12-byte nonce is regenerated for every push. Nonces are public by design in GCM; their job is to be unique per encryption, not to be secret.

Protections against brute force

Four layers working together.

Passphrase strength. We enforce a twelve-character minimum at setup. Short passphrases are rejected before we ever derive a key. We also offer a generator that produces a six-word plus four-digit passphrase (~59 bits of entropy).

Iteration count. 600,000 PBKDF2 iterations per guess. Every attempt costs real CPU, making offline attacks on stolen ciphertext slow and expensive.

Relay hardening. The relay rate-limits repeated requests from the same sync group, validates request origin against our known clients, and expires inactive sync groups. Online enumeration against a single syncId is not a productive attack.
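
As an illustration of per-sync-group rate limiting, here is a fixed-window counter keyed by syncId. It is a hypothetical sketch, not the contents of _rate-limit.js: the function name, the limit of 30 requests, and the 60-second window are made-up values.

```javascript
// Hypothetical fixed-window limiter keyed by syncId.
// `limit` and `windowMs` are illustrative defaults, not Refit's settings.
function createRateLimiter({ limit = 30, windowMs = 60_000, now = Date.now } = {}) {
  const windows = new Map(); // syncId -> { start, count }

  // Returns true if this request is allowed, false if it is over the limit.
  return function allow(syncId) {
    const t = now();
    const w = windows.get(syncId);
    if (!w || t - w.start >= windowMs) {
      // First request, or the previous window has expired: start a new one.
      windows.set(syncId, { start: t, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= limit;
  };
}
```

Because the limiter is keyed by the opaque syncId, it needs no notion of a user to slow down repeated requests against one sync group.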

User choice. A long, high-entropy passphrase pushes any realistic offline attack into centuries of GPU time. A weak passphrase does not. The strength meter at setup is not cosmetic.
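
The relationship between entropy and attack time is simple arithmetic. The sketch below is illustrative only: `expectedYears` is a hypothetical helper, and the attacker rate of 100,000 guesses per second (each guess being one full 600,000-iteration derivation) is an assumed figure, not a measurement.

```javascript
// Expected years to find a passphrase by exhaustive search.
// On average an attacker must try half of the 2^entropyBits candidates.
function expectedYears(entropyBits, guessesPerSecond) {
  const secondsPerYear = 365.25 * 24 * 3600;
  const expectedGuesses = Math.pow(2, entropyBits - 1);
  return expectedGuesses / guessesPerSecond / secondsPerYear;
}

// Assumed attacker speed; real hardware may be faster or slower.
const ASSUMED_GUESSES_PER_SECOND = 1e5;

// A generated ~59-bit passphrase: about 91,000 years at the assumed rate.
const strongYears = expectedYears(59, ASSUMED_GUESSES_PER_SECOND);

// A 40-bit passphrase: about two months at the same rate.
const weakYears = expectedYears(40, ASSUMED_GUESSES_PER_SECOND);
```

Every additional bit of entropy doubles the expected time, so passphrase strength changes the outcome far more than any fixed iteration count does.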

CRDT convergence

Classic client-server sync lets the server arbitrate merges. That requires the server to read the data. We cannot read the data, so the merge happens on-device. The merge algorithm has to be one where any two devices always converge to the same result without talking to each other mid-merge.

We use a Last-Writer-Wins Element Set, a well-known conflict-free replicated data type:

  • Every entry carries a _modified timestamp and a _device id.
  • Every deletion becomes a tombstone with a _deletedAt timestamp.
  • When two replicas merge, the newer timestamp wins. Ties break on device id for determinism.
  • An entry is alive if its modification is newer than its deletion, dead otherwise.

This gives three mathematical guarantees:

  • Commutative: A merge B = B merge A. Order does not matter.
  • Associative: (A merge B) merge C = A merge (B merge C). Grouping does not matter.
  • Idempotent: A merge A = A. Running the merge again changes nothing.
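
The merge rules and the three properties can be sketched as follows. This is a minimal illustration, not the code in sync.js: the state layout (an object keyed by entry id), the `value` field, numeric timestamps, and treating a missing _deletedAt as 0 are all assumptions.

```javascript
// Sketch of an LWW merge over { [entryId]: { value, _modified, _device, _deletedAt } }.

// Pick the later write; ties on timestamp break on device id so every replica agrees.
function laterWrite(a, b) {
  if (a._modified !== b._modified) return a._modified > b._modified ? a : b;
  return a._device >= b._device ? a : b;
}

// Merge two versions of the same entry.
function mergeEntry(a, b) {
  const winner = laterWrite(a, b);
  return {
    value: winner.value,
    _modified: winner._modified,
    _device: winner._device,
    // Tombstones only grow: keep the latest deletion seen by either replica.
    _deletedAt: Math.max(a._deletedAt ?? 0, b._deletedAt ?? 0),
  };
}

// Merge two replicas entry by entry.
function merge(stateA, stateB) {
  const merged = {};
  for (const id of new Set([...Object.keys(stateA), ...Object.keys(stateB)])) {
    const a = stateA[id];
    const b = stateB[id];
    // An entry present on only one side is merged with itself.
    merged[id] = mergeEntry(a ?? b, b ?? a);
  }
  return merged;
}

// An entry is alive if its last modification is newer than its last deletion.
function isAlive(entry) {
  return entry._modified > (entry._deletedAt ?? 0);
}
```

Each field is combined with a "take the larger" rule, which is why the result does not depend on the order or grouping of merges. The sketch assumes a single device never produces two different values with the same timestamp.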

Your phone edits offline on a plane. Your laptop edits in a cafe. When both reconnect and push, they converge to the same state. No "which device is the source of truth" dialog. The math resolves it.

Trade-offs we accept

  • No passphrase recovery. If you forget your passphrase, the ciphertext is math noise. That is intentional; a recovery path would undermine the zero-knowledge property.
  • Whole-blob pushes. Current pushes encrypt and upload a full snapshot rather than a delta. We plan to revisit this as sync groups grow.
  • Opt-in only. Refit works perfectly without sync. You only pair a sync group when you need multi-device.

How this differs from "encrypted cloud storage"

Many competitors advertise "encrypted cloud sync." Usually one of three things is meant:

  • TLS-in-transit only. Plaintext on the server. The only guarantee is "our employees haven't looked."
  • Server-managed keys. The server holds the key and encrypts your data before storing. A rogue or breached admin can read everything.
  • Account-scoped encryption. The key is derived from your password, but the server sees your password on login. Anyone who can read login traffic or server memory can derive the key.

None of those three survives a complete server compromise. Refit is designed so that a complete server compromise yields nothing readable. That is the specific property we wanted, and the architecture is built around keeping it.

Known gaps and roadmap

We maintain an internal security roadmap covering the next round of hardening work. Planned items include a better key-rotation story (revoking a single paired device without rotating the whole group) and, in time, an independent external review. We will update this page as each item lands, and we deliberately do not publish a prioritized gap list in marketing copy.

Where the code is

The client-side logic described on this page is implemented in the JavaScript bundle that ships to your browser; the relay endpoints live under app/api. The files that matter most:

  • app/src/js/sync.js - key derivation, encryption, CRDT merge, pair and migrate flows.
  • app/api/_rate-limit.js and app/api/sync-*.js - the relay endpoints and rate limiting.

If you find something that does not match this page, please email us. The code is the ground truth; this page stays in sync with it.

Closing

The honest version of security writing is: here is the threat model, here is what we defend today, here is what we plan to improve. Anyone who tells you their sync is perfectly unbreakable is selling you something. We are telling you our sync is built so we cannot read your data. That is a narrow, precise, testable claim, and it is the one that should actually matter.