What you will take away from this

  • How a hybrid MTA stack (Mautic + KumoMTA + SparkPost + Tableau) works in production at scale
  • Why the 5M euro Hotmail reputation collapse happened and what the real cause was
  • The two-layer spam filter problem that most senders never address separately
  • Which metrics replaced open rates after Apple MPP changed the measurement landscape
  • What an AI analysis of 750,000 pages of deliverability data actually found

Sophie, the host at Open Rate Club, put together a conversation that covers the operational side of email marketing at a level of detail you rarely find published anywhere. Most interviews stay at the strategy layer. This one goes into the actual stack, the specific failures, the real data, and the decisions that practitioners make but rarely document.

It runs 68 minutes. If you manage email at volume, it is worth your full attention. Below are the five topics that carry the most practical weight.

The sending stack: why three layers hold up better than one

Most senders choose a single commercial ESP and accept whatever deliverability behavior comes with the contract. The hybrid approach described in this interview uses three distinct layers:

  • Mautic handles CRM, segmentation, and contact lifecycle logic. It runs on dedicated servers and keeps all contact data under direct ownership rather than inside a vendor’s database.
  • KumoMTA is the open-source mail transfer agent that handles SMTP delivery. This layer gives full visibility into bounce processing, feedback loops, and connection-level delivery decisions that commercial ESPs typically abstract away.
  • SparkPost operates as the commercial relay for sends where reputation guarantees and shared infrastructure matter more than granular control.

Running your own MTA adds operational complexity. For senders below a few million emails per month, the added cost rarely pays off. Above that threshold the calculation changes: commercial relay fees become significant, and the opacity around deliverability decisions becomes genuinely costly when something breaks.

The analytics layer across this stack runs through Tableau, tracking inbox placement rates, engagement signals, and list quality metrics at the domain and segment level. Without that visibility, routing and list decisions would be made blind.
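To make the split between the self-hosted MTA and the commercial relay concrete, here is a minimal routing sketch. It is not the configuration used in the interview: the `Send` fields, the `prefer_relay` flag, and the health check are all hypothetical names invented for illustration, and a real setup would hand the chosen route to whatever injection mechanism each layer exposes.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SELF_HOSTED_MTA = "kumomta"     # own MTA: full bounce and feedback-loop visibility
    COMMERCIAL_RELAY = "sparkpost"  # commercial relay


@dataclass
class Send:
    campaign_id: str
    recipient_domain: str
    # Hypothetical flag set upstream (for example by the CRM layer) for sends
    # where relay reputation matters more than granular control.
    prefer_relay: bool = False


def choose_route(send: Send, mta_healthy: bool = True) -> Route:
    """Pick a delivery path for one send. The rules are illustrative only."""
    if not mta_healthy:
        # Failover: do not queue on an MTA that is reported as down.
        return Route.COMMERCIAL_RELAY
    if send.prefer_relay:
        return Route.COMMERCIAL_RELAY
    return Route.SELF_HOSTED_MTA


if __name__ == "__main__":
    newsletter = Send("weekly-42", "gmail.com")
    receipt = Send("receipt-7", "outlook.com", prefer_relay=True)
    print(choose_route(newsletter).value)
    print(choose_route(receipt).value)
    print(choose_route(newsletter, mta_healthy=False).value)
```

Keeping the decision in one small function means campaigns never name a sending layer directly, so the rules can change without touching campaign definitions.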

The 5M euro Hotmail failure: what the root cause actually was

One story in the interview deserves particular attention. A large campaign to a Hotmail-heavy list triggered a sender reputation collapse. Recovery took months. The revenue loss from the email channel during that period reached 5 million euros.

The campaign content was not the problem. The list was. Contacts had aged without proper cleaning. Engagement signals had deteriorated. But the list was still large, the aggregate open rate still looked adequate on the surface, and the send was approved.

Microsoft’s filtering at Hotmail has always been more opaque than Gmail’s. What makes a reputation incident at Hotmail especially damaging is that it propagates: shared blocklists carry the reputation signal across ISPs beyond just Hotmail, which means the damage does not stay contained.

The lesson that matters here is not “clean your lists more often.” It is that the metrics that actually predict a deliverability problem are engagement rates tracked per mailbox provider domain, not aggregate averages. A 25% overall open rate can mask a 4% rate at Hotmail. That 4% is the number that matters, and it will trigger a reputation event before your dashboard shows you anything is wrong.
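The 25% versus 4% example is easy to reproduce. The sketch below uses made-up volumes, chosen only so the arithmetic matches the figures above, and a hypothetical 10% alert floor; neither comes from the interview. The same grouping works for any engagement metric, not just opens.

```python
from collections import defaultdict

# Illustrative numbers only, chosen to reproduce the 25% / 4% example.
# Each row: (recipient domain, delivered, opened)
events = [
    ("gmail.com", 60_000, 19_200),
    ("hotmail.com", 20_000, 800),
    ("yahoo.com", 20_000, 5_000),
]


def aggregate_open_rate(rows):
    """Open rate across the whole list, the number most dashboards show."""
    total_delivered = sum(delivered for _, delivered, _ in rows)
    total_opened = sum(opened for _, _, opened in rows)
    return total_opened / total_delivered


def open_rates_by_domain(rows):
    """Open rate per mailbox provider domain."""
    delivered = defaultdict(int)
    opened = defaultdict(int)
    for domain, d, o in rows:
        delivered[domain] += d
        opened[domain] += o
    return {domain: opened[domain] / delivered[domain] for domain in delivered}


def flag_weak_domains(rates, floor=0.10):
    """Domains whose rate falls below a floor (0.10 is an assumed threshold)."""
    return sorted(domain for domain, rate in rates.items() if rate < floor)


if __name__ == "__main__":
    rates = open_rates_by_domain(events)
    print(f"aggregate: {aggregate_open_rate(events):.0%}")
    for domain, rate in sorted(rates.items()):
        print(f"{domain}: {rate:.0%}")
    print("below floor:", flag_weak_domains(rates))
```

With these volumes the aggregate comes out at 25% while hotmail.com sits at 4%, so a per-domain check flags the problem that the list-wide average hides.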

Two layers of spam filters that require separate optimization

The interview covers a distinction that most email content skips entirely: ISP filters and corporate email filters are not the same system and do not respond to the same signals.

ISP filters at consumer providers like Gmail, Outlook.com, and Yahoo weight engagement signals heavily. Reply rates, “this is not spam” clicks, move-to-folder behavior, and unsubscribe patterns all feed into reputation scoring. Content and authentication matter, but engagement history is the dominant signal.

Corporate filters (Proofpoint, Mimecast, Microsoft Defender for Office 365) work differently. They weight authentication alignment (SPF, DKIM, DMARC), link reputation, domain age, and IP history more heavily than engagement data. They also run URL reputation checks that consumer ISP filters do not run at the same depth or frequency.
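Authentication alignment is the part of this that is easiest to check yourself. Under DMARC, a message passes when SPF or DKIM passes and the authenticated domain aligns with the domain in the visible From header. The sketch below shows that rule in relaxed and strict mode. It makes one loud simplification: the organizational domain is taken as the last two labels, whereas real implementations consult the Public Suffix List. It says nothing about how any particular vendor scores mail.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    Assumption for illustration only. Real DMARC implementations use the
    Public Suffix List, so this is wrong for suffixes such as co.uk.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def aligned(from_domain: str, auth_domain: str, strict: bool = False) -> bool:
    """Strict mode needs an exact match; relaxed mode compares org domains."""
    if strict:
        return from_domain.lower() == auth_domain.lower()
    return org_domain(from_domain) == org_domain(auth_domain)


def dmarc_passes(from_domain: str,
                 spf_pass: bool, spf_domain: str,
                 dkim_pass: bool, dkim_domain: str,
                 strict: bool = False) -> bool:
    """DMARC passes if SPF or DKIM both passes and aligns with the From domain."""
    spf_ok = spf_pass and aligned(from_domain, spf_domain, strict)
    dkim_ok = dkim_pass and aligned(from_domain, dkim_domain, strict)
    return spf_ok or dkim_ok


if __name__ == "__main__":
    # SPF passes on the relay's bounce domain (not aligned), DKIM is signed
    # with the sender's own domain (aligned in relaxed mode).
    print(dmarc_passes("news.example.com",
                       True, "bounce.relay-provider.net",
                       True, "example.com"))
    # Both checks pass, but neither domain aligns with the From header.
    print(dmarc_passes("news.example.com",
                       True, "bounce.relay-provider.net",
                       True, "relay-provider.net"))
```

The second case is the one that catches senders out: SPF and DKIM each report a pass, yet the message still fails DMARC because neither authenticated domain matches the From domain.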

A campaign that delivers cleanly to Gmail can fail at Proofpoint. If your list includes a significant share of corporate addresses, which is almost universal in B2B sending, optimizing only for consumer ISP behavior will leave measurable deliverability on the table.

What replaced open rates after Apple MPP

Apple’s Mail Privacy Protection changed what open rate data means. The interview is direct: raw open rates are no longer a reliable behavioral signal. Automated prefetching inflates the numbers and tells you less about what recipients actually did with your email.

The metrics the conversation highlights as more reliable replacements:

  • Reply rate. A reply requires an explicit human decision. It is the cleanest engagement signal available, and Gmail uses it as a positive sender reputation input.
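  • Click-to-open rate. If you still track opens for some segments, normalizing clicks against opens gives a ratio that is more stable than raw click rate. Because prefetched opens inflate the denominator, the ratio is most meaningful on segments where opens are still generated by people rather than by MPP.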
  • Unsubscribe rate per campaign. Track this as a leading indicator. A rising trend on an otherwise healthy list signals a content or frequency problem before engagement collapses.
  • Inbox placement rate. The most direct deliverability signal. Gartner covers several tools for measuring this at scale with seed account networks across major ISPs.
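For reference, the four metrics above reduce to simple ratios. The sketch below is one way to compute them, assuming delivered volume as the denominator for reply and unsubscribe rates and seed-account counts for inbox placement. Those denominators, the field names, and the three-campaign trend window are assumptions, not definitions given in the interview.

```python
from dataclasses import dataclass


@dataclass
class CampaignStats:
    delivered: int
    opens: int         # unique opens
    clicks: int        # unique clicks
    replies: int
    unsubscribes: int


def safe_ratio(numerator: int, denominator: int) -> float:
    """Return 0.0 instead of dividing by zero on empty campaigns."""
    return numerator / denominator if denominator else 0.0


def reply_rate(s: CampaignStats) -> float:
    return safe_ratio(s.replies, s.delivered)


def click_to_open_rate(s: CampaignStats) -> float:
    return safe_ratio(s.clicks, s.opens)


def unsubscribe_rate(s: CampaignStats) -> float:
    return safe_ratio(s.unsubscribes, s.delivered)


def inbox_placement_rate(seeds_in_inbox: int, seeds_total: int) -> float:
    """Share of seed accounts where the message landed in the inbox."""
    return safe_ratio(seeds_in_inbox, seeds_total)


def rising_trend(rates, window: int = 3) -> bool:
    """True if the last `window` values are strictly increasing."""
    recent = list(rates)[-window:]
    if len(recent) < window:
        return False
    return all(a < b for a, b in zip(recent, recent[1:]))


if __name__ == "__main__":
    stats = CampaignStats(delivered=10_000, opens=2_000, clicks=300,
                          replies=50, unsubscribes=20)
    print(f"reply rate: {reply_rate(stats):.2%}")
    print(f"click-to-open: {click_to_open_rate(stats):.2%}")
    print(f"unsubscribe rate: {unsubscribe_rate(stats):.2%}")
    print("unsubscribes rising:", rising_trend([0.0010, 0.0012, 0.0015, 0.0020]))
```

Whatever denominators you pick, keep them fixed across campaigns. The trend check is only meaningful if every point in the series was computed the same way.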

The AI deliverability experiment: 750,000 pages of data

The interview describes an experiment that is not common: feeding 750,000 pages of deliverability data into an AI system to find patterns that human analysts would not surface through manual review.

The most actionable finding was a correlation between domain age and inbox placement rate that held across multiple sender categories. Younger sending domains consistently showed lower inbox placement even when technical authentication was correctly configured. This suggests that ISP reputation scoring includes a time-based component that technical setup cannot accelerate. You have to earn the sending history, and that takes calendar time.
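If you want to check the same relationship on your own data, a plain Pearson correlation is a reasonable first pass. The sketch below is not the analysis from the interview, and the sample figures are invented purely to show the call. A real test would use one row per sending domain with its measured inbox placement.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented sample data, for illustration only.
domain_age_days = [30, 90, 180, 365, 730, 1460]
inbox_placement = [0.62, 0.70, 0.78, 0.85, 0.90, 0.93]

if __name__ == "__main__":
    r = pearson(domain_age_days, inbox_placement)
    print(f"correlation between domain age and inbox placement: {r:.2f}")
```

A positive coefficient only says the two move together. Older domains also tend to have more sending history and cleaner lists, so separating age from those factors takes a larger dataset than a single coefficient can summarize.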

This kind of analysis is now accessible because tools like Google Gemini and ChatGPT can process and cross-reference large datasets at a cost that was not feasible even three years ago. The limiting factor today is not computational. It is having the right data collected in a structured form to begin with.

Geopatriation: does routing email through local servers actually work?

One of the more technically specific topics in the interview is geopatriation: routing email sends through servers physically located in the recipient’s country. The hypothesis is that some ISPs apply lighter filtering to email arriving from in-country infrastructure, partly because bulk spam campaigns are disproportionately international in origin.

The results from testing this are not uniform. European markets showed a measurable inbox placement improvement in the data described. North American markets showed less response to the variable. Gartner and other analyst firms treat geopatriation as an emerging optimization rather than a proven standard practice.

Whether it is worth the operational complexity of routing by geography depends on your volume of sends to specific markets and the baseline deliverability performance you are already achieving.
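A small test is the cheapest way to answer that for your own list. The sketch below assumes the CRM stores a two-letter country code per contact, picks an in-country relay when one exists, and reports the placement difference between a control arm and a geo-routed arm. The relay hostnames are placeholders and the function names are hypothetical.

```python
from typing import Optional

# Placeholder hostnames; none of these appear in the interview.
IN_COUNTRY_RELAYS = {
    "DE": "smtp-de.example.net",
    "FR": "smtp-fr.example.net",
    "ES": "smtp-es.example.net",
}
DEFAULT_RELAY = "smtp-global.example.net"


def relay_for(country_code: Optional[str]) -> str:
    """Return the in-country relay if one is configured, else the default."""
    if not country_code:
        return DEFAULT_RELAY
    return IN_COUNTRY_RELAYS.get(country_code.strip().upper(), DEFAULT_RELAY)


def placement_lift(control_inbox: int, control_total: int,
                   test_inbox: int, test_total: int) -> float:
    """Inbox placement of the test arm minus the control arm, in percentage points."""
    if control_total <= 0 or test_total <= 0:
        raise ValueError("both arms need at least one seed message")
    return 100.0 * (test_inbox / test_total - control_inbox / control_total)


if __name__ == "__main__":
    print(relay_for("de"))
    print(relay_for("US"))
    print(f"lift: {placement_lift(80, 100, 85, 100):.1f} points")
```

Run the comparison per market rather than globally, since the interview reports different results for Europe and North America.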

Key points, compressed

The Open Rate Club interview goes into operational detail that does not appear in typical email marketing content. The useful points to extract: the hybrid MTA stack is real infrastructure that can be replicated at scale; list hygiene at the mailbox provider level matters more than aggregate list quality metrics; corporate filters require different treatment than ISP filters; reply rate is the most reliable post-MPP engagement signal; and AI analysis of large deliverability datasets is now within reach for teams that have the data.

The full conversation is available on YouTube.

Live session — Deliverability Summit, Barcelona

Tuesday April 21, 2026 | 17:35 CEST | Casa Milà – La Pedrera, Barcelona (also online)

If the hybrid MTA stack described here is something you want to implement, there is a live technical session on exactly this at Deliverability Summit this Tuesday. The session is called “The Publisher’s Blueprint: Hybrid Multi-MTA Orchestration Layer,” and it covers how to build an orchestration layer that decouples routing logic from sending, so your infrastructure can handle failover and cost optimization without rewriting campaigns.
