The Reversible Hash

P2 · easy · 10 min · Incident Response

We share a user_hash column with an analytics partner so they can join on users without seeing identities. This morning the partner sent back a spreadsheet mapping our user_hash values straight to real email addresses. We used SHA-256 - it's a one-way function. So how did they reverse it?

Reversible identities: 2,400,000 → 0 users
Unsalted-hash columns in pipeline: 4 → 0 columns

The incident

It's 11:00 and our analytics partner's security lead just emailed the DPO something that made her go pale: a spreadsheet that maps our supposedly-anonymous user_hash column back to thousands of real email addresses. We ship that column to the partner precisely so they can join on a stable per-user key without ever seeing who the user is - it's meant to be pseudonymized. The hashing is real: every email is run through SHA-256, which is a one-way function, and the live SELECT shows neat 64-character digests with no plaintext anywhere near the extract. Transport is locked down too. And yet the partner reproduced the mapping in an afternoon on a laptop, with no access to our systems and no secret from us. The data is leaving us as a hash and arriving as an identity. Legal has a disclosure-assessment call in two hours and needs to know how a one-way hash became a lookup table.

Symptoms on the table

  • the partner mapped ~2.4M user_hash values to real email addresses with no access to our systems
  • the extract contains only 64-char SHA-256 digests - no plaintext email anywhere in it
  • the same email always produces the same user_hash, in every extract we've ever sent
  • SFTP transport, keys and TLS are all clean - nothing was intercepted
  • no alert fired - to every check, we are shipping hashes, not PII
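The third symptom is the one to linger on. As a minimal sketch, assuming the Pseudonymizer computes a plain, unkeyed SHA-256 over the email string (the page says only "SHA-256 of email"), anyone holding a list of candidate addresses can hash each one and match the results against the shipped digests. The function name `user_hash` and the sample addresses are illustrative, not taken from the real pipeline.

```python
import hashlib


def user_hash(email: str) -> str:
    # Assumption: the pipeline applies plain SHA-256 with no salt or secret key.
    return hashlib.sha256(email.encode("utf-8")).hexdigest()


# What the partner receives: digests only, no plaintext.
shipped = {user_hash(e) for e in ["alice@example.com", "bob@example.com"]}

# What anyone can bring: a list of candidate emails (a CRM export,
# a marketing list, a breach dump). Purely illustrative values here.
candidates = ["carol@example.com", "alice@example.com", "bob@example.com"]

# Hash every candidate once, then join on the digest.
lookup = {user_hash(e): e for e in candidates}
recovered = {h: lookup[h] for h in shipped if h in lookup}

for digest, email in sorted(recovered.items(), key=lambda kv: kv[1]):
    print(digest[:12], "->", email)
```

Nothing is inverted here: the function stays one-way, but the input space (real email addresses) is small and guessable enough that recomputing it forward is cheap.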

Systems on the board

The real components in play for this incident — the surface you investigate when the clock starts.

  • Iceberg `customers`: raw email source
  • Pseudonymizer: SHA-256 of email
  • Partner Extract: user_hash shipped to partner
  • SFTP Delivery: extract transport
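One direction the Pseudonymizer could take, sketched under assumptions rather than given as the challenge's official fix: a keyed hash (HMAC-SHA-256) still yields a stable 64-character value per user for joins, but it cannot be recomputed by anyone who lacks the secret. The name `pseudonymize`, the demo key, and the way the key is stored are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(email: str, key: bytes) -> str:
    # HMAC-SHA-256: deterministic for a fixed key, so joins still work,
    # but the digest cannot be reproduced without knowing `key`.
    return hmac.new(key, email.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; a real key would live in a
# secrets manager and never be shipped alongside the extract.
demo_key = b"not-a-real-key-for-illustration"

keyed = pseudonymize("alice@example.com", demo_key)
plain = hashlib.sha256(b"alice@example.com").hexdigest()

print(len(keyed))      # same 64-hex-character shape as before
print(keyed == plain)  # an attacker's plain SHA-256 no longer matches
```

The trade-off is that the key becomes the thing to protect: if it leaks, the candidate-list lookup works again, and rotating it changes every user_hash the partner has already stored.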

What you'll practice

This is a timed, hands-on incident in the Incident Response track. You diagnose the symptom, trace it to a root cause across real components, and ship a fix before the clock runs out: the same loop you run on call, without the production blast radius.

Interactive challenge

Solve it in the Simulation Arcade.

The interactive workspace — live metrics, the component map, and the fix you ship — runs inside Petascale Labs. Sign in to start the clock.
