How We Built ChurnZilla: Automated SaaS Payment Recovery in 30 Days
The full story of building dual-Stripe architecture, a 4-step retention workflow, and multi-tenant payment security — in 30 days.
Hugo Millington-Drake was losing customers to failed payments and had no way to intercept cancellations. Expired credit cards, insufficient funds, processing errors — the kind of silent revenue leaks that compound monthly. On the other side, customers who wanted to leave had no alternative to clicking "cancel."
The problem isn't unique to Hugo. SaaS businesses lose 3–5% of monthly recurring revenue to failed payments. And voluntary churn — customers actively choosing to leave — accounts for 20–40% of total churn. Most businesses have no automated system for either.
We built ChurnZilla in 30 days. Here's how.
The two types of churn nobody talks about separately
Most churn discussions treat all lost customers the same. They're not.
Involuntary churn happens without the customer choosing to leave. Their credit card expired. The bank declined a charge. A processing error occurred. The customer still wants the product — they just can't pay. These are the easiest customers to save, because they haven't decided to leave. They just need a frictionless way to fix the problem.
Voluntary churn happens when customers actively choose to cancel. But "cancel" isn't always "I hate your product." Sometimes it's "I need to pause for a month." Sometimes it's "this is too expensive at this tier." Sometimes it's "I didn't know about this feature." These customers can be saved — but only if you intervene at the right moment with the right offer.
ChurnZilla addresses both with completely separate systems, because they're fundamentally different problems requiring different solutions.
What we built: the architecture of saving revenue
3-tier automated payment recovery. When a payment fails, ChurnZilla detects it instantly via Stripe webhooks and triggers a smart email sequence: immediate notification, +3 day reminder, +7 day reminder, and final notice. Each email includes a magic link that lets customers update their payment method with one click — no login required.
The magic link is critical. Asking a customer to remember their password, navigate to billing settings, and update their card is a conversion killer. A single link that goes directly to payment update recovers more failed payments than any amount of email copy.
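One way a no-login link like this can work is with a signed, expiring token embedded in the URL. The sketch below is an illustration under assumptions, not ChurnZilla's actual implementation: the claim fields, the function names, and the choice of HMAC-SHA256 are all hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claims: which customer the link is for and when it stops working.
interface MagicLinkClaims {
  customerId: string;
  expiresAt: number; // Unix epoch milliseconds
}

function sign(body: string, secret: string): string {
  return createHmac("sha256", secret).update(body).digest("base64url");
}

// Builds a token of the form "<base64url claims>.<signature>".
function createMagicToken(claims: MagicLinkClaims, secret: string): string {
  const body = Buffer.from(JSON.stringify(claims)).toString("base64url");
  return `${body}.${sign(body, secret)}`;
}

// Returns the claims if the signature matches and the link has not expired, else null.
function verifyMagicToken(
  token: string,
  secret: string,
  now: number = Date.now(),
): MagicLinkClaims | null {
  const [body, signature] = token.split(".");
  if (!body || !signature) return null;
  const expected = Buffer.from(sign(body, secret));
  const received = Buffer.from(signature);
  // Compare in constant time; a length mismatch is rejected up front.
  if (expected.length !== received.length || !timingSafeEqual(expected, received)) {
    return null;
  }
  const claims = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as MagicLinkClaims;
  return claims.expiresAt > now ? claims : null;
}
```

Because the token carries its own expiry and signature, the server can validate it without a session, which is what removes the login step from the recovery path.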
4-step cancellation prevention workflow. When a customer clicks "cancel," they don't go straight to a confirmation screen. Instead, they enter a structured retention flow:
Step 1: Initial offer. Pause your subscription instead of cancelling.
Step 2: Customer survey. Tell us why you're considering leaving.
Step 3: Feedback form. Detailed input that shapes the final offer.
Step 4: Final offer. A discount, downgrade option, or extended pause based on their specific feedback.
This isn't a dark pattern. Every step gives the customer a clear "cancel anyway" option. But by presenting alternatives, it catches the people who would stay if they knew they could pause, downgrade, or get a discount. The workflow generates real Stripe coupons on the customer's connected account — not mock offers, actual billing system changes.
Proactive card expiration alerts. Rather than waiting for payments to fail, ChurnZilla sends notifications 30 days and 7 days before a card expires. Preventing failures before they happen is cheaper and more effective than recovering them after.
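The alert dates can be derived from a card's expiry month and year. This is a minimal sketch with hypothetical function names; it assumes a card stays valid through the last day of its expiry month and does all arithmetic in UTC.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumption: a card marked 08/2025 is valid through the end of August, so the
// first instant of the following month (UTC) is treated as the expiry moment.
function cardExpiry(expMonth: number, expYear: number): Date {
  // Date.UTC months are zero-based, so passing expMonth (1-12) lands on the next month.
  return new Date(Date.UTC(expYear, expMonth, 1));
}

// Returns one alert date per lead time, by default 30 and 7 days before expiry.
function expirationAlertDates(
  expMonth: number,
  expYear: number,
  leadDays: number[] = [30, 7],
): Date[] {
  const expiry = cardExpiry(expMonth, expYear).getTime();
  return leadDays.map((days) => new Date(expiry - days * DAY_MS));
}
```

For a card expiring 08/2025, this yields alerts on 2 August and 25 August 2025 (UTC).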
Recovery analytics dashboard. Real-time metrics showing recovered revenue, recovery rate, failure reason breakdown, recovery method performance, and time-based filtering across 7-day, 30-day, and all-time windows. You can't improve what you can't measure.
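As an illustration of the windowed metrics, here is a small sketch of how recovered revenue and recovery rate could be computed from a list of failures. The record shape and names are assumptions; the source does not describe how the dashboard queries its data.

```typescript
// Hypothetical record of one failed payment and whether it was later recovered.
interface PaymentFailure {
  amountCents: number;
  failedAt: Date;
  recoveredAt: Date | null;
}

type TimeWindow = "7d" | "30d" | "all";

function recoveryMetrics(failures: PaymentFailure[], window: TimeWindow, now: Date = new Date()) {
  const days = window === "7d" ? 7 : window === "30d" ? 30 : null;
  const cutoff = days === null ? -Infinity : now.getTime() - days * 86_400_000;
  // Only failures that occurred inside the window count towards its metrics.
  const inWindow = failures.filter((f) => f.failedAt.getTime() >= cutoff);
  const recovered = inWindow.filter((f) => f.recoveredAt !== null);
  return {
    failedCount: inWindow.length,
    recoveredCount: recovered.length,
    recoveredCents: recovered.reduce((sum, f) => sum + f.amountCents, 0),
    // Avoid dividing by zero when nothing failed in the window.
    recoveryRate: inWindow.length === 0 ? 0 : recovered.length / inWindow.length,
  };
}
```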
Slack integration. OAuth-based workspace connection with channel selection and message type filtering. Failed payments, successful recoveries, and weekly summaries pushed to the channels Hugo monitors — so he stays informed without checking a dashboard.
The hardest technical challenge: dual-Stripe architecture
ChurnZilla has two completely separate Stripe integrations, and getting this right was the most architecturally complex part of the build.
Platform Stripe handles ChurnZilla's own billing — subscriptions, checkout, invoices, and the customer portal for ChurnZilla users. This is standard SaaS billing.
Connect Stripe handles payment recovery on the customer's Stripe account via OAuth. When Hugo connects his Stripe account, ChurnZilla gets limited permissions to manage coupons, modify subscriptions, and process recovery actions on Hugo's behalf.
Many platforms accidentally use their own Stripe credentials for customer operations. This creates security issues, liability problems, and audit nightmares. Separating them at the architectural level — different route files, different environment variables, different code paths — makes credential mixing impossible.
The file structure tells the story: billing.ts handles platform operations, stripe-connect.ts handles connected account operations, and pause-stripe-service.ts handles the cancellation prevention workflow. No shared code between them. Zero credential confusion incidents.
This pattern — separating platform billing from marketplace operations — is the same architecture used by platforms like Shopify. It's notoriously difficult to implement correctly, and it's why agencies often quote separately for "marketplace payments."
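The separation described above is enforced through file and environment-variable boundaries. A complementary technique, shown here as a sketch and not something the source says ChurnZilla uses, is to give the two credential kinds distinct TypeScript types so that passing one where the other is expected fails at compile time. The environment variable name and all function names are hypothetical, and no real Stripe calls are made.

```typescript
// Branded types: both are strings at runtime, but the compiler treats them as distinct.
type PlatformSecretKey = string & { readonly __kind: "platform" };
type ConnectedAccountToken = string & { readonly __kind: "connect" };

type Env = Record<string, string | undefined>;

// Hypothetical loader: the platform key comes only from its own environment variable.
function loadPlatformKey(env: Env): PlatformSecretKey {
  const value = env["PLATFORM_STRIPE_SECRET_KEY"];
  if (!value) throw new Error("PLATFORM_STRIPE_SECRET_KEY is not set");
  return value as PlatformSecretKey;
}

// Hypothetical wrapper for a per-customer OAuth token after decryption.
function toConnectedToken(decrypted: string): ConnectedAccountToken {
  if (!decrypted) throw new Error("Connected account token is empty");
  return decrypted as ConnectedAccountToken;
}

interface PlannedCall {
  credential: "platform" | "connect";
  action: string;
}

// Platform billing code accepts only the platform key.
function planPlatformCall(_key: PlatformSecretKey, action: string): PlannedCall {
  return { credential: "platform", action };
}

// Recovery code accepts only a connected account token.
function planConnectedCall(_token: ConnectedAccountToken, action: string): PlannedCall {
  return { credential: "connect", action };
}

// planPlatformCall(toConnectedToken("tok"), "create_checkout") would be rejected
// by the type checker, because a ConnectedAccountToken is not a PlatformSecretKey.
```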
Multi-tenant security that can't be bypassed
The second architectural challenge was multi-tenant data isolation. Every ChurnZilla user manages their own customers' data. That data must be completely separated — not just by convention, but by design.
We built a centralised storage layer (storage.ts) that wraps every database query and automatically scopes it to the authenticated user's ID. There are 100+ storage methods in the codebase. Every single one filters by userId automatically.
This is the difference between security by discipline ("remember to add the WHERE clause") and security by architecture ("the WHERE clause is baked into the system"). Direct database queries outside the storage layer don't exist. Cross-tenant data access is structurally impossible without modifying the storage layer itself.
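A tenant-scoped storage layer of this kind can be sketched as a class that is constructed with the authenticated user's ID and never exposes an unscoped query. The example uses an in-memory array as a stand-in for the database, and the class, method, and field names are assumptions, not the contents of storage.ts.

```typescript
interface CustomerRecord {
  id: string;
  userId: string; // owning tenant
  email: string;
}

class TenantStorage {
  private readonly userId: string;
  private readonly table: CustomerRecord[];

  constructor(userId: string, table: CustomerRecord[]) {
    this.userId = userId;
    this.table = table;
  }

  // Every read starts from this tenant-scoped view, never from the raw table.
  private scoped(): CustomerRecord[] {
    return this.table.filter((row) => row.userId === this.userId);
  }

  listCustomers(): CustomerRecord[] {
    return this.scoped();
  }

  getCustomer(id: string): CustomerRecord | undefined {
    return this.scoped().find((row) => row.id === id);
  }

  // Writes stamp the tenant ID themselves, so callers cannot supply someone else's.
  addCustomer(id: string, email: string): CustomerRecord {
    const row: CustomerRecord = { id, userId: this.userId, email };
    this.table.push(row);
    return row;
  }
}
```

Callers never pass a userId to individual methods, so there is no filter to forget: a request handler builds one TenantStorage per authenticated user and can only see that user's rows.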
For the cancellation prevention workflow, we implemented a formal state machine with validated transitions: idle → survey → offer → complete/aborted. The workflow must proceed through all steps in order — skipping steps defeats the purpose. And because the state machine is explicit, analytics show exactly where customers drop off at each step, giving Hugo data to optimise the flow.
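The states named above can be expressed as an explicit transition table. This is a sketch, not the production state machine: the source gives only the happy-path order, so the exits to "aborted" from the survey and offer steps are an assumption based on the "cancel anyway" option being available at every step.

```typescript
type FlowState = "idle" | "survey" | "offer" | "complete" | "aborted";

// Allowed next states for each state. Terminal states have no exits.
const TRANSITIONS: Record<FlowState, readonly FlowState[]> = {
  idle: ["survey"],
  survey: ["offer", "aborted"], // assumption: the flow can be left mid-way
  offer: ["complete", "aborted"],
  complete: [],
  aborted: [],
};

function canTransition(current: FlowState, next: FlowState): boolean {
  return TRANSITIONS[current].includes(next);
}

// Returns the new state, or throws if the move would skip or reverse a step.
function transition(current: FlowState, next: FlowState): FlowState {
  if (!canTransition(current, next)) {
    throw new Error(`Invalid transition: ${current} -> ${next}`);
  }
  return next;
}
```

Because every state change goes through `transition`, recording the state at each call is enough to count how many customers reached each step.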
The production details that separate tools from toys
Encrypted OAuth tokens. All Stripe Connect and Slack OAuth tokens are encrypted at rest with AES and decrypted only when needed for API calls. As long as the encryption key is kept outside the database, a database breach alone doesn't expose usable tokens for connected accounts.
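Token encryption at rest can be sketched with Node's built-in crypto module. The source says only "AES", so the choice of AES-256-GCM, the storage format, and the function names here are assumptions.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypts a token with AES-256-GCM. The key must be 32 bytes and should live
// outside the database (for example in an environment variable or secrets manager).
function encryptToken(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh nonce for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Stored as "iv.tag.ciphertext", each part base64-encoded.
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Decrypts a stored value. Throws if the key is wrong or the data was modified.
function decryptToken(stored: string, key: Buffer): string {
  const [iv, tag, ciphertext] = stored.split(".").map((part) => Buffer.from(part, "base64"));
  if (!iv || !tag || !ciphertext) throw new Error("Malformed encrypted token");
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM is an authenticated mode, so tampering with the stored value is detected at decryption time instead of silently producing a corrupted token.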
Email rate limiting. An in-memory queue enforces four rate limits: 10/second, 100/minute, 1,000/hour, 10,000/day. Without this, a surge of failed payments could trigger enough emails to get the Resend account suspended.
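The four limits can be checked together with a sliding-window counter. The sketch below covers only the admission check; the class and method names are hypothetical, and a real queue would hold messages back and retry them once `tryAcquire` succeeds.

```typescript
interface RateLimit {
  max: number;
  windowMs: number;
}

// The four limits from the text: per second, minute, hour, and day.
const EMAIL_LIMITS: RateLimit[] = [
  { max: 10, windowMs: 1_000 },
  { max: 100, windowMs: 60_000 },
  { max: 1_000, windowMs: 3_600_000 },
  { max: 10_000, windowMs: 86_400_000 },
];

class SlidingWindowLimiter {
  private readonly limits: RateLimit[];
  private sentAt: number[] = []; // timestamps of accepted sends

  constructor(limits: RateLimit[]) {
    this.limits = limits;
  }

  // Records the send and returns true only if every window still has capacity.
  tryAcquire(now: number = Date.now()): boolean {
    const longest = Math.max(...this.limits.map((limit) => limit.windowMs));
    // Drop timestamps too old to matter for any window.
    this.sentAt = this.sentAt.filter((t) => now - t < longest);
    const allowed = this.limits.every(
      (limit) => this.sentAt.filter((t) => now - t < limit.windowMs).length < limit.max,
    );
    if (allowed) this.sentAt.push(now);
    return allowed;
  }
}
```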
Webhook idempotency. A processed_events table ensures every Stripe webhook is processed exactly once. Stripe retries failed deliveries, so without deduplication, the same failed payment could trigger multiple email sequences.
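The deduplication step can be sketched as a "claim before handling" check keyed on the event ID. Here an in-memory Set stands in for the processed_events table, and the names are hypothetical; a database version would typically rely on a unique constraint on the event ID and run the claim and the handler in one transaction.

```typescript
interface WebhookEvent {
  id: string;   // for example "evt_..." as sent by Stripe
  type: string; // for example "invoice.payment_failed"
}

// In-memory stand-in for the processed_events table.
class ProcessedEvents {
  private readonly seen = new Set<string>();

  // Returns true if this call claimed the event, false if it was already recorded.
  claim(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    return true;
  }
}

// Runs the handler at most once per event ID, however many times it is delivered.
function handleOnce(
  incoming: WebhookEvent,
  store: ProcessedEvents,
  handler: (e: WebhookEvent) => void,
): "processed" | "duplicate" {
  if (!store.claim(incoming.id)) return "duplicate";
  handler(incoming);
  return "processed";
}
```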
HMAC signature validation. Both Stripe and Resend webhooks are validated using HMAC signatures before processing. This prevents anyone from sending fake webhook events to trigger email sequences or modify subscription states.
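To make the check concrete, here is a sketch of timestamped HMAC-SHA256 verification in the style of Stripe's signature header, where the signed payload is the timestamp, a dot, and the raw request body. The function names and the single-signature parsing are simplifications for illustration; in production the Stripe SDK's `stripe.webhooks.constructEvent` performs this verification and is the safer choice.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

function computeSignature(timestamp: number, rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest("hex");
}

// Verifies a header of the form "t=<unix seconds>,v1=<hex signature>".
function isValidSignature(
  rawBody: string,
  header: string,
  secret: string,
  nowSeconds: number,
  toleranceSeconds = 300,
): boolean {
  let timestamp = NaN;
  let signature = "";
  for (const pair of header.split(",")) {
    const [key, value] = pair.split("=");
    if (key === "t") timestamp = Number(value);
    if (key === "v1" && value) signature = value;
  }
  // Reject missing or stale timestamps to limit replay of captured requests.
  if (!Number.isFinite(timestamp) || Math.abs(nowSeconds - timestamp) > toleranceSeconds) {
    return false;
  }
  const expected = Buffer.from(computeSignature(timestamp, rawBody, secret), "hex");
  const received = Buffer.from(signature, "hex");
  // Constant-time comparison; lengths must match before calling timingSafeEqual.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The signature must be computed over the raw request body exactly as received, so the check has to run before any JSON parsing middleware rewrites the payload.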
The total technical footprint: 11,800+ lines of production TypeScript, 55+ database tables, 20+ frontend pages, and 25+ API endpoints, shipped in 1,097 commits over 4.5 months of total development (the 30-day core build plus ongoing iteration).
What Hugo got: revenue recovery while he sleeps
ChurnZilla now runs 24/7 on Hugo's behalf. Failed payments trigger recovery sequences automatically. Card expiration alerts go out on schedule. Cancellation attempts enter the retention workflow. Slack notifications keep Hugo informed. Weekly summary emails report on recovery metrics.
The platform Hugo was quoted months of development for — complete with dual-Stripe architecture, state machine workflows, encrypted credential storage, and rate-limited email delivery — shipped in 30 days.
As Hugo put it: "We were losing customers to failed payments and had no way to intercept cancellations. Tom built Churnzilla in 30 days and now we're recovering revenue automatically while I sleep."
Lessons for SaaS founders building retention tools
Separate involuntary from voluntary churn. They're different problems with different solutions. Failed payment recovery is a technical problem (frictionless payment update). Cancellation prevention is a product problem (presenting the right alternative at the right moment).
State machines prevent bad UX. Multi-step workflows need validated transitions. Without them, customers can end up in impossible states, and your analytics become meaningless.
Encrypt everything you're trusted with. OAuth tokens are equivalent to passwords. If a customer trusts you with Stripe access, that access must be encrypted at rest, regardless of how secure your database is.
Architecture beats discipline. Multi-tenant security through a storage layer is more reliable than remembering to add userId filters to every query. Build systems that are correct by design, not by diligence.
One person can build what teams quoted months for. The dual-Stripe architecture, state machine workflow, encrypted credential storage, rate-limited email delivery — all of it shipped in 30 days with AI-accelerated development. AI handled the boilerplate. Human judgment handled the architecture.
---
Frequently asked questions
How long did ChurnZilla take to build?
30 days for the core platform including dual-Stripe architecture, payment recovery sequences, cancellation prevention workflow, analytics dashboard, and Slack integration. Ongoing iteration continued over 4.5 months.
What does dual-Stripe architecture mean?
ChurnZilla uses two completely separate Stripe integrations: one for its own billing (platform Stripe) and one for operating on customers' Stripe accounts (Connect Stripe). This prevents credential mixing — a common security problem in marketplace platforms.
Can ChurnZilla work with any SaaS business?
Any SaaS business using Stripe for subscription billing. The OAuth connection grants ChurnZilla limited permissions to manage coupons, modify subscriptions, and process recovery actions on the connected account.
How does the 4-step cancellation workflow differ from dark patterns?
Every step presents a clear "cancel anyway" option. The workflow doesn't prevent cancellation — it presents alternatives (pause, discount, downgrade) that some customers prefer. The survey data also helps the business understand why customers leave.
---
Tom Crossman builds production-ready software at Hello Crossman. 18 years in product development. 100+ products shipped.