Our Cold Email System Was Losing Replies. A Dashboard Caught It.

We built a cold email outbound system, thought it was working, then built a dashboard — and discovered we were silently dropping replies from leads who didn't exist in our CRM. Here's what we found and how we fixed it.

We thought our cold email system was working.

Emails going out, replies coming in, scorecard links getting sent. Nothing looked broken. Nobody was complaining. So we assumed it was fine.

Then we built a dashboard, and the dashboard disagreed.


The Setup

Here's how the outbound sequence works for one of our clients:

  1. We pull leads (RIAs — Registered Investment Advisors) from a data source and load them into Instantly, our cold email tool.
  2. Instantly runs the sequence, tracks opens, and catches replies.
  3. When someone replies, Instantly fires a webhook to our reply router.
  4. The router finds the lead in our CRM, bumps their stage, and sends a scorecard link.
  5. Scorecard comes back, sales team gets notified, call gets booked.

Clean enough. The problem was hiding in step 4.


The Silent Failure

Our reply router does a simple lookup: find the lead by email address. If it finds a match, it moves them through the pipeline. If it doesn't find a match…

Nothing happens. No error. No log entry worth noticing. The reply just falls into a hole.

Here's what we hadn't thought through: the RIA leads we were emailing only existed in Instantly. We'd imported them into the email tool to run the sequence, but we'd never created matching records in our CRM database. So when Instantly fired the webhook and our router went looking, there was nothing there. The router exited quietly, the lead never got a scorecard link, and nobody on our side was any the wiser.

The system "worked." It just worked by doing nothing.
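To make the failure mode concrete, here is a minimal sketch of the pattern. It is not our actual router: the `CRM_LEADS` dict, the `find_lead_by_email` helper, and the `email` payload field are all hypothetical stand-ins, and the payload shape is not Instantly's real webhook schema.

```python
from typing import Optional

# Hypothetical stand-in for the CRM: leads keyed by email address.
CRM_LEADS: dict[str, dict] = {
    "known@example.com": {"stage": "contacted"},
}


def find_lead_by_email(email: str) -> Optional[dict]:
    """Return the CRM record for this email, or None if there is no match."""
    return CRM_LEADS.get(email.lower())


def handle_reply_webhook(payload: dict) -> None:
    """Simplified version of the original (buggy) router logic."""
    lead = find_lead_by_email(payload["email"])
    if lead is None:
        # The bug: no error, no alert, no record that a reply ever arrived.
        return
    lead["stage"] = "replied"


# A reply from a lead that only exists in the email tool simply vanishes.
handle_reply_webhook({"email": "ria-only-in-instantly@example.com"})
# A reply from a lead the CRM knows about moves through the pipeline.
handle_reply_webhook({"email": "known@example.com"})
```

Both calls return normally, which is exactly why nothing looked broken from the outside.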

The worst kind of bug is the one that doesn't tell you it happened.


How We Found It

We built a simple outbound analytics dashboard — really just a handful of queries against our database. We wanted to see:

  • Active campaigns
  • Emails sent
  • Replies received (pulled from Instantly's API)
  • Replies actually processed in our system

That last one is what caught it. The dashboard shows it as "Processed X of Y replies."

First time we ran it, X was meaningfully smaller than Y. We were getting replies we weren't doing anything with. The gap was the leak.
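The comparison itself is small. A rough sketch, assuming you can get two sets of email addresses: the replies your email tool reports and the replies your own system handled. The `reply_gap` function and the sample addresses are illustrative, not our real queries or numbers.

```python
def reply_gap(received: set[str], processed: set[str]) -> dict:
    """Compare replies reported by the email tool with replies the CRM handled."""
    missing = received - processed
    return {
        "received": len(received),
        "processed": len(received & processed),
        "missing": sorted(missing),
    }


# Illustrative data only.
received = {"a@example.com", "b@example.com", "c@example.com"}
processed = {"a@example.com"}

report = reply_gap(received, processed)
print(f"Processed {report['processed']} of {report['received']} replies")
print("Unprocessed:", report["missing"])
```

Keeping the list of missing addresses, not just the two counts, is what makes the gap traceable: you can pick one unprocessed reply and follow it through the router.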

Took about twenty minutes to trace. Leads in Instantly, no matching records in the CRM, router bails silently. That was it.


The Fix

Straightforward once we knew what was broken.

When the reply router gets a webhook and can't find a matching lead, instead of bailing out, it now creates the lead on the spot from the reply payload:

  • Client ID — resolved from a campaign mapping table (every Instantly campaign maps back to a client in our system)
  • Name — from the payload if it's there, otherwise parsed from the email address, otherwise "Unknown"
  • Email — the replying address
  • Source — tagged as instantly-reply so we always know how it got in

The upsert is idempotent — partial unique index on (client_id, email) means a second reply from the same person doesn't create a duplicate. Lead gets created or found, pipeline kicks in, scorecard goes out.
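Here is one way the create-or-find step could look. This is a sketch under assumptions, not our production code: SQLite stands in for the real database, and the table layout, the `CAMPAIGN_TO_CLIENT` mapping, the payload fields, and the `name_from_payload` and `upsert_lead` helpers are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (
        id        INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL,
        email     TEXT,
        name      TEXT NOT NULL,
        source    TEXT NOT NULL
    );
    -- Partial unique index: uniqueness applies only to rows that have an email.
    CREATE UNIQUE INDEX leads_client_email
        ON leads (client_id, email) WHERE email IS NOT NULL;
""")


def name_from_payload(payload: dict) -> str:
    """Payload name if present, else a guess from the address, else 'Unknown'."""
    name = (payload.get("name") or "").strip()
    if name:
        return name
    local = payload["email"].split("@")[0]
    parts = [p for p in local.replace("_", ".").replace("-", ".").split(".") if p.isalpha()]
    return " ".join(p.capitalize() for p in parts) or "Unknown"


def upsert_lead(client_id: int, payload: dict) -> int:
    """Create the lead if it is missing and return its id; safe to call repeatedly."""
    email = payload["email"].strip().lower()
    with conn:  # commit on success, roll back on error
        conn.execute(
            """
            INSERT INTO leads (client_id, email, name, source)
            VALUES (?, ?, ?, 'instantly-reply')
            ON CONFLICT (client_id, email) WHERE email IS NOT NULL DO NOTHING
            """,
            (client_id, email, name_from_payload(payload)),
        )
    row = conn.execute(
        "SELECT id FROM leads WHERE client_id = ? AND email = ?",
        (client_id, email),
    ).fetchone()
    return row[0]


# Hypothetical campaign mapping and reply payload.
CAMPAIGN_TO_CLIENT = {"campaign-123": 1}
payload = {"campaign_id": "campaign-123", "email": "Jane.Doe@example.com"}
client_id = CAMPAIGN_TO_CLIENT[payload["campaign_id"]]

first_id = upsert_lead(client_id, payload)
second_id = upsert_lead(client_id, payload)  # second reply from the same person
```

The second call hits the unique index, does nothing, and returns the existing row's id, so the rest of the pipeline can treat "created" and "found" the same way.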

Dashboard now shows X ≈ Y. Leak closed.


What This Is Actually About

It's not really a cold email story. It's a visibility story.

We built the automation first. The observability came later. That meant we ran blind for a while — and the blind spot happened to be exactly where a reply goes when the system can't handle it.

The processed / received metric forced us to define what "working" actually meant. Before we had that number, "working" meant "no one is complaining." Turns out that's not the same thing.

A few things we've taken from this:

Build the "received vs. processed" check before you trust the pipeline. Any webhook-driven flow should have a counter on both ends. If those numbers drift apart, something is silently failing.

A silent no-op is worse than a loud error. An error logs, fires an alert, makes someone grumpy. A no-op just evaporates. You find it three weeks later when you're wondering why a lead went cold.
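A small sketch of what "loud" can mean in practice: count every reply on the way in, and log a warning whenever the lookup misses. The `record_reply` function, the metric names, and the in-memory `Counter` are assumptions for illustration; a real setup would push these to whatever metrics and alerting system you already run.

```python
import logging
from collections import Counter

logger = logging.getLogger("reply_router")
metrics = Counter()  # stand-in for a real metrics client


def record_reply(email: str, lead_found: bool) -> None:
    """Count every reply on arrival and make a lookup miss visible."""
    metrics["replies_received"] += 1
    if not lead_found:
        metrics["replies_unmatched"] += 1
        logger.warning("Reply from %s has no matching lead", email)
        return
    metrics["replies_processed"] += 1


record_reply("a@example.com", lead_found=True)
record_reply("b@example.com", lead_found=False)
```

With both counters in place, a growing `replies_unmatched` value is something you can alert on instead of something you discover weeks later.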

Your CRM is only as useful as what's actually in it. If you're running outreach through tools that don't sync back to your main database, you've got gaps you're not seeing. The tools don't coordinate on their own — that's your job.


The Bigger Pattern

We run into this pretty regularly with clients who've stitched together a sales stack — Instantly for email, some CRM for tracking, Apollo or Clay for enrichment, a webhook here, a Zap there. Each tool does its thing. Nobody's tested the seams.

The question isn't whether each tool works. It's what happens at every handoff.

The Instantly → CRM handoff was our gap. We found it because we measured it. Most teams find it later, usually after a deal slips through and someone goes hunting for what went wrong.


What We'd Do Differently

Honestly, not much. Finding and fixing this in an afternoon is the best-case scenario for this kind of bug. The lesson is just: build the visibility metric at the same time as the automation, not after you realize you need it.

If you're running cold email and you don't have a "replies received vs. replies actioned" number somewhere, you probably have some version of this leak. The replies came in. The pipeline didn't fire. You just haven't looked at the gap yet.

Add the counter. Then sleep better.


Raz Mihalyi runs azlabs.io, where we build AI-powered systems for SMBs. If you're running outbound and want to talk through the plumbing, reach out.