Error Handling and Edge Cases: The Invisible Work That Makes Software Professional
AI tools build the happy path. Production is everything else — network failures, invalid input, timeout errors, and the hundred ways real users break things that work perfectly in demos.
The user fills in the form. Clicks submit. Sees a success message. Feature complete.
That's the happy path, and AI tools build it beautifully. In demos, the happy path is all you need. In production, it's roughly 40% of the user experience. The other 60% is everything that goes wrong — and in AI-generated code, almost none of it is handled.
Here's what happens when things go wrong in a typical vibe-coded app: the user sees a blank white screen, a raw JavaScript error, a database query in plain text, or nothing at all. The app silently fails and the user has no idea what happened. They try again, get the same result, and leave.
Error handling is invisible when it works. When it doesn't work, it's the first thing users notice — and the primary reason they decide your product isn't serious.
The Two Anti-Patterns
AI-generated code handles errors in one of two ways, both wrong.
Anti-pattern 1: Exposing raw errors. The app catches the error and displays it directly to the user. "TypeError: Cannot read properties of undefined (reading 'email')." "PostgreSQL error: relation 'users' does not exist." "Stripe API error: No such customer: cus_LqJX3m7nNb."
These messages are meaningless to users and actively dangerous — they reveal internal implementation details (database engine, table names, API providers, customer IDs) that help attackers understand and exploit your system.
Anti-pattern 2: Silent failure. The app catches the error and does nothing. The user clicks submit and nothing happens. The page loads but data is missing. A background process fails and nobody — not the user, not the developer — knows about it.
Silent failures are worse than exposed errors because they're invisible. At least raw errors tell you something went wrong. Silent failures create ghost problems that accumulate until the application is in a broken state nobody can diagnose.
The Four Categories of Failure
Most production failures fall into one of four categories. AI tools rarely handle any of them by default.
External Service Failures
Your app depends on services you don't control: Stripe for payments, SendGrid for email, Supabase for the database, external APIs for data. Each of these can fail, time out, return unexpected data, or go completely offline.
AI-generated code assumes these services always work. There's no retry logic for transient failures, no timeout configuration for slow responses, no fallback for service outages, and no circuit breakers to prevent cascading failures when one service goes down.
What production code does: wraps every external call in try-catch with specific error handling for each failure mode. Implements retry logic with exponential backoff for transient errors (network blips, rate limits). Sets reasonable timeouts so a slow service doesn't hang the entire request. Falls back gracefully when non-critical services are down — if the analytics service is offline, the app should still work.
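As a rough illustration of the retry, timeout, and fallback behaviour described above, here is a minimal TypeScript sketch. The names withTimeout, withRetry, bestEffort, and RetryOptions are hypothetical helpers written for this example, not part of any library, and deciding which errors count as transient is left to the caller.

```typescript
// Hypothetical helpers for illustration; none of these names come from a library.
type RetryOptions = {
  retries: number; // retries allowed after the first attempt
  baseDelayMs: number; // wait before the first retry; doubles on each later retry
  timeoutMs: number; // time limit for each individual attempt
  isTransient: (err: unknown) => boolean; // true if the error is worth retrying
};

class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Operation timed out after ${ms}ms`);
    this.name = "TimeoutError";
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Rejects with TimeoutError if `operation` has not settled within `ms`.
// This only stops waiting; it does not cancel the underlying work.
async function withTimeout<T>(operation: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  try {
    return await Promise.race([operation, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Runs `operation`, retrying transient failures with exponential backoff.
async function withRetry<T>(operation: () => Promise<T>, opts: RetryOptions): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await withTimeout(operation(), opts.timeoutMs);
    } catch (err) {
      if (attempt >= opts.retries || !opts.isTransient(err)) throw err;
      // Backoff schedule: base, 2x base, 4x base, ...
      await sleep(opts.baseDelayMs * 2 ** attempt);
    }
  }
}

// For non-critical services (analytics, for example): never let a failure propagate.
async function bestEffort<T>(operation: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await operation();
  } catch {
    return fallback; // a real app would log the error here before falling back
  }
}
```

A payment call might go through withRetry with two or three retries, while an analytics call would go through bestEffort so an outage there never blocks the request. Production setups often add random jitter to the backoff delay; this sketch leaves it out for brevity.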
User Input Errors
Users submit empty forms. They enter email addresses without @ signs. They paste phone numbers with country codes your validation doesn't expect. They type negative numbers for quantities. They submit forms in languages with characters your database doesn't support.
AI-generated code validates only happy-path input. It might check that an email field isn't empty, but it won't validate the email format. It might check that a number is present, but it won't check that it's positive. It almost never handles character encoding issues, extremely long input, or deliberately malicious input.
What production code does: validates every input at two levels. Client-side validation for immediate user feedback (before the form submits). Server-side validation for security (the client-side validation can be bypassed). Error messages explain what's wrong and how to fix it: "Please enter a valid email address" not "Validation error."
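To make the server-side half concrete, here is a small dependency-free TypeScript sketch. The OrderInput shape, the validateOrderInput name, the length cap, and the exact messages are assumptions chosen for the example; the email pattern is a deliberately simple format check, not a complete validator.

```typescript
// Hypothetical input shape for illustration.
type OrderInput = { email: string; quantity: number };

type ValidationResult =
  | { ok: true; value: OrderInput }
  | { ok: false; errors: Record<string, string> };

const MAX_EMAIL_LENGTH = 254; // assumed cap to reject extremely long input
// Simple "something@something.tld" check; intentionally not a full RFC parser.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Validates an untrusted request body and returns one message per bad field.
function validateOrderInput(body: unknown): ValidationResult {
  const errors: Record<string, string> = {};
  const data = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;

  const rawEmail = data["email"];
  const email = typeof rawEmail === "string" ? rawEmail.trim() : "";
  if (email === "") {
    errors["email"] = "Please enter your email address.";
  } else if (email.length > MAX_EMAIL_LENGTH) {
    errors["email"] = "That email address is too long.";
  } else if (!EMAIL_PATTERN.test(email)) {
    errors["email"] = "Please enter a valid email address.";
  }

  const quantity = data["quantity"];
  if (typeof quantity !== "number" || !Number.isInteger(quantity)) {
    errors["quantity"] = "Quantity must be a whole number.";
  } else if (quantity < 1) {
    errors["quantity"] = "Quantity must be at least 1.";
  }

  if (Object.keys(errors).length > 0) return { ok: false, errors };
  return { ok: true, value: { email, quantity: quantity as number } };
}
```

Because the function accepts unknown rather than a typed object, it still behaves sensibly when someone bypasses the form and posts arbitrary JSON. The same messages can be reused by the client-side check so users see consistent wording.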
System Resource Limits
The database connection pool is exhausted. The server runs out of memory. The disk fills up with logs. The file upload exceeds the server's limit. The API hits the rate limit for a third-party service.
These failures don't happen during development because development runs with a single user and minimal data. They appear in production under real load, real data volumes, and real usage patterns.
What production code does: configures resource limits explicitly rather than relying on defaults. Implements connection pooling for database access. Sets file size limits on uploads. Monitors resource usage and alerts before exhaustion. Handles resource exhaustion gracefully — returning a "server is busy" message rather than crashing.
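One way to picture explicit limits and graceful exhaustion is the sketch below. ConcurrencyLimiter, checkUploadSize, the 5 MiB cap, and the status codes chosen are all assumptions for illustration; a real deployment would normally also set limits in the database pool, the web server, and the hosting platform.

```typescript
// Result type: either the work ran, or we refused with a friendly message.
type Limited<T> =
  | { ok: true; value: T }
  | { ok: false; status: number; message: string };

// Caps how many tasks run at once. When full, it refuses immediately
// instead of queueing without bound and eventually falling over.
class ConcurrencyLimiter {
  private active = 0;
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  async run<T>(task: () => Promise<T>): Promise<Limited<T>> {
    if (this.active >= this.maxConcurrent) {
      return { ok: false, status: 503, message: "The server is busy. Please try again in a moment." };
    }
    this.active++;
    try {
      return { ok: true, value: await task() };
    } finally {
      this.active--; // always release the slot, even if the task throws
    }
  }
}

const MAX_UPLOAD_BYTES = 5 * 1024 * 1024; // hypothetical 5 MiB cap

// Rejects oversized uploads before any expensive processing starts.
function checkUploadSize(sizeBytes: number): Limited<number> {
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return { ok: false, status: 413, message: "This file is too large. The maximum size is 5 MB." };
  }
  return { ok: true, value: sizeBytes };
}
```

The design choice worth noting is that the limiter fails fast. A request that is told to retry in a moment is a better outcome than one that hangs until the process runs out of memory.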
Business Logic Edge Cases
What happens when a user cancels a subscription mid-billing cycle? When two users book the same appointment at the same time? When a daylight saving change means the same hour happens twice? When a user's session expires while they're filling a long form? When a payment webhook arrives before the checkout redirect?
These aren't technical failures — they're business scenarios that the happy path doesn't cover. AI tools can't anticipate them because they require understanding how the business operates, how users behave, and what the correct resolution is for each scenario.
What production code does: identifies edge cases during specification (before building), handles each one explicitly with business-appropriate logic, and defaults to safe behaviour when an unhandled case appears (log the issue and show a helpful error rather than crashing or corrupting data).
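Two of these scenarios can be sketched in a few lines. The AppointmentBook class and resolveCancellation function below are hypothetical, the in-memory Map only stands in for a real database (where a unique constraint on the slot would typically do this job), and the rule that access continues until the paid period ends is an assumed business decision, not a universal one.

```typescript
type BookingResult =
  | { ok: true; slotId: string; userId: string }
  | { ok: false; userMessage: string };

// In-memory stand-in for a bookings table keyed by slot.
class AppointmentBook {
  private readonly bookedBy = new Map<string, string>(); // slotId -> userId

  // The check and the write happen in one synchronous step, so two
  // requests for the same slot cannot both succeed in this process.
  book(slotId: string, userId: string): BookingResult {
    const holder = this.bookedBy.get(slotId);
    if (holder !== undefined && holder !== userId) {
      return {
        ok: false,
        userMessage: "This appointment slot was just booked by someone else. Please choose another time.",
      };
    }
    this.bookedBy.set(slotId, userId); // repeating the same booking is harmless
    return { ok: true, slotId, userId };
  }
}

type CancellationOutcome =
  | { ok: true; accessEndsAt: Date }
  | { ok: false; userMessage: string };

// Handles known subscription states explicitly and defaults to safe behaviour.
function resolveCancellation(
  status: string,
  currentPeriodEnd: Date,
  log: (message: string) => void
): CancellationOutcome {
  switch (status) {
    case "active":
    case "cancelled": // cancelling twice gives the same answer
      // Assumed business rule: access continues until the paid period ends.
      return { ok: true, accessEndsAt: currentPeriodEnd };
    default:
      // Unknown state: change nothing, record it, and give the user a next step.
      log(`Unhandled subscription status on cancel: ${status}`);
      return {
        ok: false,
        userMessage: "We couldn't cancel your subscription automatically. Please contact support.",
      };
  }
}
```

The default branch is the important part. When the code meets a state nobody planned for, it logs the case and leaves the data untouched rather than guessing.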
The Three Audiences for Error Handling
Production error handling serves three audiences simultaneously.
Users need friendly, actionable messages. "Something went wrong. Please try again in a few minutes." "Your payment couldn't be processed. Please check your card details." "This appointment slot was just booked by someone else. Please choose another time." No technical jargon. No blame. Clear next steps.
Developers need detailed diagnostic information — but only in the right place. Full error messages, stack traces, request data, and user context should go to Sentry or your logging system, never to the user's screen. When you get a Sentry alert, you should be able to reproduce and fix the issue from the information in the alert alone.
Monitoring systems need structured, machine-readable data. Error types, status codes, response times, and failure rates that can be aggregated into dashboards and trigger alerts. Monitoring tells you "the payment endpoint failed 15 times in the last hour" before users tell you "I can't pay."
The Error Handling Implementation
Here's the pattern I implement across every build.
Global error boundary (frontend). A React error boundary that catches any unhandled error in the component tree and displays a friendly fallback UI instead of a white screen. This is the safety net — it catches everything that more specific handlers miss.
API error middleware (backend). A middleware function that wraps every API endpoint. If the endpoint throws an error, the middleware catches it, logs the full error to Sentry, and returns a standardised error response to the client with a user-friendly message and an appropriate HTTP status code.
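A framework-agnostic sketch of that middleware might look like the following. AppError, withErrorHandling, and the report callback are hypothetical names for this example; report stands in for whatever error tracker is in use, and the response shape is just one reasonable convention.

```typescript
type ApiResponse = { status: number; body: { error?: { message: string }; data?: unknown } };
type Handler<Req> = (req: Req) => Promise<ApiResponse>;
// Stand-in for an error tracker such as Sentry.
type ErrorReporter = (err: unknown, context: { endpoint: string }) => void;

// An error that is safe to show: it carries a status and a user-facing message.
class AppError extends Error {
  readonly status: number;
  readonly userMessage: string;

  constructor(status: number, userMessage: string, internalMessage?: string) {
    super(internalMessage ?? userMessage);
    // Keeps `instanceof AppError` working if the code is compiled to ES5.
    Object.setPrototypeOf(this, AppError.prototype);
    this.name = "AppError";
    this.status = status;
    this.userMessage = userMessage;
  }
}

const GENERIC_MESSAGE = "Something went wrong. Please try again in a few minutes.";

// Wraps an endpoint so that every thrown error is reported in full,
// while the client only ever sees a standardised, friendly response.
function withErrorHandling<Req>(
  endpoint: string,
  handler: Handler<Req>,
  report: ErrorReporter
): Handler<Req> {
  return async (req: Req): Promise<ApiResponse> => {
    try {
      return await handler(req);
    } catch (err) {
      report(err, { endpoint }); // full details go to the tracker, never to the user
      if (err instanceof AppError) {
        return { status: err.status, body: { error: { message: err.userMessage } } };
      }
      // Anything unexpected becomes a generic 500 with no internals exposed.
      return { status: 500, body: { error: { message: GENERIC_MESSAGE } } };
    }
  };
}
```

Endpoints then throw AppError for failures they understand and let everything else bubble up. The raw error text, such as a database message naming a table, only ever reaches the reporter.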
Service-specific error handling. Each external service integration (Stripe, Supabase, email provider) has its own error handling that translates service-specific errors into application-level responses. A Stripe card decline becomes a user-friendly "payment declined" message, not a Stripe API error.
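The translation step can be sketched as a single mapping function. The ProviderError shape and its type values below are a simplified, hypothetical stand-in and do not reproduce any real SDK's error classes; with a real provider, the cases would follow the error types documented for that SDK.

```typescript
// Hypothetical, simplified provider error. Real SDK errors carry more fields.
type ProviderError = { type: string; code?: string; message: string };

// What the rest of the application works with.
type AppFailure = { status: number; userMessage: string; retryable: boolean };

// Maps provider-specific failures to application-level responses.
// The provider's own message is never passed through to the user.
function translatePaymentError(err: ProviderError): AppFailure {
  switch (err.type) {
    case "card_error":
      return {
        status: 402,
        userMessage: "Your payment couldn't be processed. Please check your card details.",
        retryable: false,
      };
    case "rate_limit":
    case "connection_error":
      // Transient problems on the provider side: safe to retry later.
      return {
        status: 503,
        userMessage: "Payments are temporarily unavailable. Please try again in a few minutes.",
        retryable: true,
      };
    default:
      // Unknown provider error: treat as our problem, not the user's.
      return {
        status: 500,
        userMessage: "Something went wrong. Please try again in a few minutes.",
        retryable: false,
      };
  }
}
```

Keeping this mapping in one place per service means the rest of the codebase never needs to know which payment or email provider is behind it.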
Input validation layer. A validation library (Zod, Yup, or similar) that validates every API input against a schema before the business logic runs. Invalid input returns specific, helpful error messages: "Email must be a valid email address" not "Validation failed."
Structured logging. Every error, warning, and significant event is logged with context: timestamp, user ID, endpoint, input data (sanitised), and error details. Logs are persistent (not console.log) and searchable.
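A minimal sketch of structured logging with sanitised input follows. The field names, the SENSITIVE_KEYS list, and the LogSink type are assumptions for illustration; the sink is where a real setup would write to a persistent, searchable log store, and the redaction here only covers top-level keys.

```typescript
type LogLevel = "info" | "warn" | "error";
type LogContext = {
  userId?: string;
  endpoint?: string;
  input?: Record<string, unknown>;
  error?: unknown;
};
// Where finished log lines go. Hypothetical: a file, a log service, and so on.
type LogSink = (line: string) => void;

// Lowercase names of fields that must never be written to logs.
const SENSITIVE_KEYS = ["password", "token", "cardnumber", "authorization"];

// Replaces sensitive top-level values. Nested objects are not inspected.
function sanitise(input: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const key of Object.keys(input)) {
    const isSensitive = SENSITIVE_KEYS.indexOf(key.toLowerCase()) !== -1;
    clean[key] = isSensitive ? "[REDACTED]" : input[key];
  }
  return clean;
}

// Builds one machine-readable JSON line per event.
function formatLogLine(level: LogLevel, message: string, ctx: LogContext, now: Date = new Date()): string {
  const err = ctx.error;
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    message,
    userId: ctx.userId ?? null,
    endpoint: ctx.endpoint ?? null,
    input: ctx.input ? sanitise(ctx.input) : null,
    error: err instanceof Error
      ? { name: err.name, message: err.message, stack: err.stack }
      : (err ?? null),
  });
}

function createLogger(sink: LogSink) {
  return {
    info: (message: string, ctx: LogContext = {}) => sink(formatLogLine("info", message, ctx)),
    warn: (message: string, ctx: LogContext = {}) => sink(formatLogLine("warn", message, ctx)),
    error: (message: string, ctx: LogContext = {}) => sink(formatLogLine("error", message, ctx)),
  };
}
```

Because every line is a JSON object with the same keys, questions like "all errors for this user on this endpoint in the last hour" become simple queries rather than text searches.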
This entire pattern takes 3-5 days to implement in an existing codebase. It transforms the user experience from "this app feels broken" to "this app feels professional" — because professional software handles failure gracefully.
Frequently Asked Questions
Can I prompt AI tools to add error handling?
For specific, well-defined cases: "add error handling to this Stripe webhook endpoint" — yes, and the result is usually decent. For comprehensive error handling across an entire application: no. The AI doesn't have the context to know which failures are possible at each point, which errors are critical vs informational, or what the correct user-facing message should be.
What's the minimum error handling for launch?
At minimum: a global error boundary in the frontend (prevents white screens), API error middleware that returns user-friendly messages (prevents exposed internals), Sentry or equivalent for error tracking (prevents invisible failures), and input validation on every form. This takes 1-2 days and catches 80% of error handling issues.
How do I prioritise which errors to handle first?
Start with the money path: authentication, payment processing, and core feature flows. Then handle external service failures (database, API, email). Then user input validation. Then edge cases. This ordering matches the business impact — payment errors cost revenue, auth errors cost security, and input errors cost user trust.
Is Sentry worth it for a small app?
Yes. Sentry's free tier tracks 5,000 errors per month, which is more than enough for most apps pre-scale. The alternative — discovering errors through user complaints — is slower, misses most issues, and makes you look unprofessional. Set up Sentry before launch. It takes 15 minutes.