Performance Budgets: Latency, Throughput, and Cost

    You can’t optimize what you don’t measure. And most teams don’t measure the right things. They watch CPU graphs, average response times, and the green dots in their APM dashboard — and then act surprised when checkout falls over on Black Friday or the AWS bill triples after a “small refactor”.

    Performance is not a vibe. It’s a budget. And like any budget, if you don’t set it explicitly, someone else will set it for you — usually a payment provider timing out at 10 seconds, or a finance director asking why your infra spend grew 40% year over year.

    Why Most Commerce Teams Get This Wrong

    The default measurement culture in Magento shops is built around two questions: is the site up? and is the page fast? Both are necessary. Neither is sufficient.

    The reasons are structural, not personal:

    • APM tools default to averages. The mean can look great while P99 is on fire. The people losing checkouts are in the tail.
    • Magento performance work historically meant “faster TTFB on category pages”. That mental model survives into Go microservice projects where it doesn’t apply.
    • Cost lives in a different dashboard than latency. Engineers optimize one, finance complains about the other, and nobody connects them.
    • Throughput rarely gets a number. “It scales” is a feeling, not a budget. Until a campaign hits and you find out it doesn’t.

    A performance budget reframes all of this. You declare upfront: this service responds within 200ms at P99, sustains 500 requests per second, and costs no more than €400/month at steady state. Now you have something to defend, to test against, and to refuse.

    The Three Numbers That Matter

    Every meaningful commerce service has three budgets. Not one. Three.

    1. Latency Budget (with a Percentile)

    Forget averages. Pick percentiles that match what your users actually feel:

    • P50 — typical experience, useful for trend monitoring
    • P95 — the bar you’d accept in front of a CTO
    • P99 — where conversion starts to suffer
    • P99.9 — where payment retries and timeouts hide

    Budget example for a pricing service backing the cart:

    Endpoint            P50     P95     P99
    GET /price/{sku}    15ms    40ms    80ms
    POST /price/bulk    60ms    150ms   300ms

    These are not hopes. They are commitments. If a deploy breaches P99, you roll back. If a new feature needs more headroom, you negotiate the budget before you ship.
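
    To make the commitment checkable, here is a minimal sketch of a budget check, assuming you already have raw latency samples for the window. The names (LatencyBudget, priceBySKU, demoSamples) and the sample data are illustrative, not taken from any real service.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// LatencyBudget holds the percentile thresholds for one endpoint.
type LatencyBudget struct {
	P50, P95, P99 time.Duration
}

// priceBySKU mirrors the GET /price/{sku} row of the budget table.
var priceBySKU = LatencyBudget{
	P50: 15 * time.Millisecond,
	P95: 40 * time.Millisecond,
	P99: 80 * time.Millisecond,
}

// percentile returns the nearest-rank percentile (0 < p <= 100).
// It sorts a copy so the caller's slice is left untouched.
func percentile(samples []time.Duration, p float64) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// Breaches lists the percentiles whose observed value exceeds the budget.
func (b LatencyBudget) Breaches(samples []time.Duration) []string {
	checks := []struct {
		name  string
		p     float64
		limit time.Duration
	}{{"P50", 50, b.P50}, {"P95", 95, b.P95}, {"P99", 99, b.P99}}
	var out []string
	for _, c := range checks {
		if percentile(samples, c.p) > c.limit {
			out = append(out, c.name)
		}
	}
	return out
}

// mean is included only to show how an average hides the tail.
func mean(samples []time.Duration) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	var sum time.Duration
	for _, s := range samples {
		sum += s
	}
	return sum / time.Duration(len(samples))
}

// demoSamples is made-up data: 98 fast requests and 2 slow ones.
func demoSamples() []time.Duration {
	s := make([]time.Duration, 0, 100)
	for i := 0; i < 98; i++ {
		s = append(s, 10*time.Millisecond)
	}
	return append(s, 200*time.Millisecond, 200*time.Millisecond)
}

func main() {
	s := demoSamples()
	fmt.Println("mean:", mean(s))                    // mean: 13.8ms
	fmt.Println("breached:", priceBySKU.Breaches(s)) // breached: [P99]
}
```

    The mean sits comfortably under the P50 budget while P99 is breached, which is the failure an average-based dashboard would miss.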

    2. Throughput Budget

    How many requests per second can the service sustain while staying inside the latency budget? This is the question that breaks “it scales” hand-waving.

    Two thresholds worth measuring:

    • Steady-state throughput — what a single instance handles 24/7 without degradation
    • Burst throughput — what the fleet can absorb for a 5-minute Black Friday spike

    A Go service handling shipment tracking events at JTI-scale might budget 2,000 events/sec steady-state per pod, with the fleet sized for 8,000/sec burst. If a code change drops that to 1,200/sec steady-state, the cost model breaks even if latency is unchanged. You’d need more pods, more nodes, more money.
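
    The capacity arithmetic behind that example is small enough to write down. This is a sketch that assumes per-pod throughput scales linearly across the fleet, which real systems only approximate; podsNeeded is a hypothetical helper.

```go
package main

import "fmt"

// podsNeeded returns how many pods are required to absorb burstRPS when each
// pod sustains perPodRPS inside the latency budget (ceiling division).
func podsNeeded(burstRPS, perPodRPS int) int {
	if perPodRPS <= 0 {
		return 0
	}
	return (burstRPS + perPodRPS - 1) / perPodRPS
}

func main() {
	const burst = 8000 // events/sec the fleet must absorb

	before := podsNeeded(burst, 2000) // budgeted per-pod throughput
	after := podsNeeded(burst, 1200)  // after the regression

	fmt.Printf("pods for %d events/sec: %d before, %d after\n", burst, before, after)
	// pods for 8000 events/sec: 4 before, 7 after
}
```

    Going from four pods to seven for the same burst is a 75% increase in fleet size with no visible change on the latency dashboard.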

    3. Cost Budget

    Everyone has a latency dashboard. Almost nobody has a cost-per-request number. They should.

    Useful cost metrics:

    • Cost per 1,000 requests — direct unit economics
    • Cost per processed order — translates infrastructure into commerce language
    • Steady-state monthly cost at expected load — the number your CFO actually reads

    The discipline is to compute cost honestly: compute + memory + egress + managed services (Redis, RDS, MSK) + observability overhead. Not just EC2. The observability bill alone can quietly grow to 20% of compute if you don’t watch it.
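
    A rough sketch of the unit-economics calculation follows. Every figure in it (the component costs, 50 million requests, 20,000 orders) is a hypothetical placeholder chosen to land on a €400/month total; substitute numbers from your own bill.

```go
package main

import "fmt"

// MonthlyCost lists the cost components named above, in EUR per month.
type MonthlyCost struct {
	Compute, Memory, Egress, ManagedServices, Observability float64
}

// Total sums every component, not just compute.
func (c MonthlyCost) Total() float64 {
	return c.Compute + c.Memory + c.Egress + c.ManagedServices + c.Observability
}

// ObservabilityShare returns observability spend as a fraction of compute.
func (c MonthlyCost) ObservabilityShare() float64 {
	if c.Compute == 0 {
		return 0
	}
	return c.Observability / c.Compute
}

// CostPerUnit divides a monthly total by a monthly volume (requests, orders).
func CostPerUnit(totalEUR float64, units int64) float64 {
	if units <= 0 {
		return 0
	}
	return totalEUR / float64(units)
}

// demoCost is hypothetical data for illustration only.
var demoCost = MonthlyCost{
	Compute:         200,
	Memory:          30,
	Egress:          20,
	ManagedServices: 110,
	Observability:   40,
}

func main() {
	total := demoCost.Total()
	fmt.Printf("total: EUR %.2f/month\n", total)                                        // 400.00
	fmt.Printf("per 1,000 requests: EUR %.4f\n", CostPerUnit(total, 50_000_000)*1000)   // 0.0080
	fmt.Printf("per order: EUR %.4f\n", CostPerUnit(total, 20_000))                     // 0.0200
	fmt.Printf("observability vs compute: %.0f%%\n", demoCost.ObservabilityShare()*100) // 20%
}
```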

    SLOs Are the Contract, Error Budgets Are the Currency

    A latency target without a service-level objective is a wish. A wish is not a budget.

    A useful SLO for a Go pricing service might be:

    99.9% of /price/{sku} requests return within 80ms over a rolling 28-day window.

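    That gives you an error budget of 0.1%: one request in a thousand may miss the 80ms target. With evenly spread traffic, that is about 40 minutes of “out of budget” time per 28-day window (roughly 43 minutes over a 30-day month). You can spend that budget on: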

    • Deploys that briefly degrade performance
    • Backend dependency hiccups (the ERP that nobody tells you went into batch processing)
    • Cache misses during warmup after a restart

    When the error budget burns too fast, you stop shipping features and fix reliability. When you finish a month with budget unused, you can spend it: ship riskier changes, run chaos tests, push the cost lower at the expense of some headroom.
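
    “Burns too fast” can be given a number. The sketch below shows the usual arithmetic: the time equivalent of an error budget, and a burn rate that compares the observed share of slow requests with the share the SLO allows. The function names and the traffic figures are illustrative assumptions.

```go
package main

import "fmt"

// budgetMinutes converts an SLO target (e.g. 0.999) into minutes of allowed
// out-of-budget time over windowDays, assuming evenly spread traffic.
func budgetMinutes(slo float64, windowDays int) float64 {
	return (1 - slo) * float64(windowDays) * 24 * 60
}

// burnRate compares the observed bad-request ratio with the ratio the SLO
// allows. 1.0 spends the budget exactly over the window; 2.0 spends it in
// half the window.
func burnRate(bad, total int64, slo float64) float64 {
	if total == 0 {
		return 0
	}
	return (float64(bad) / float64(total)) / (1 - slo)
}

func main() {
	fmt.Printf("28-day budget: %.1f min\n", budgetMinutes(0.999, 28)) // 40.3 min
	fmt.Printf("30-day budget: %.1f min\n", budgetMinutes(0.999, 30)) // 43.2 min

	// Hypothetical hour: 1,200,000 requests, 3,000 of them slower than 80ms.
	fmt.Printf("burn rate: %.1fx\n", burnRate(3000, 1_200_000, 0.999)) // 2.5x
}
```

    A sustained burn rate above 1.0 means the budget runs out before the window does, which is the signal to pause feature work.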

    This is the part most commerce teams skip. They have alerts. They don’t have a budget. So every incident feels equally urgent, every deploy feels equally safe, and the team burns out on noise.

    Measuring the Right Things in Go

    Go gives you decent primitives for this, but only if you wire them in deliberately.

    What I actually instrument in commerce Go services:

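    • Request latency histograms with percentile-friendly buckets: a Prometheus Histogram with explicit buckets matching your SLO thresholds, not the defaults. The default buckets have no boundary at 80ms (the nearest are 50ms and 100ms), so a P99 near that threshold is an interpolated estimate rather than a measurement.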
    • Queue lag and consumer offset for any worker pool consuming Magento events, ERP outbox messages, or PIM updates. Lag is the early warning that throughput is breaking down before latency goes red.
    • Retry and DLQ counters — every retry is latency the user doesn’t see, but cost you absolutely pay. A spike in retries is often the first signal of a downstream regression.
    • In-flight goroutine count for backpressure visibility. A worker pool quietly growing past its semaphore limit is throughput collapse in slow motion.
    • Cost-allocation labels on every metric and log line: service, environment, tenant, region. Without these, your cost dashboard is unreadable when you need it most.
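
    To illustrate the histogram point without pulling in a dependency, here is a standard-library stand-in for a bucketed histogram. It is not the Prometheus client API; the Histogram type, FractionWithin, and sloAligned are hypothetical names, and defaultLike copies the client library's default bucket list.

```go
package main

import (
	"fmt"
	"sort"
)

// Histogram counts observations per upper bound, like "le" buckets.
// Not safe for concurrent use; it only illustrates the idea.
type Histogram struct {
	bounds []float64 // ascending upper bounds, in seconds
	counts []uint64  // counts[i] covers (bounds[i-1], bounds[i]]; last is overflow
	total  uint64
}

// NewHistogram copies and sorts the bounds and allocates one extra
// overflow bucket for observations above the largest bound.
func NewHistogram(bounds ...float64) *Histogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &Histogram{bounds: b, counts: make([]uint64, len(b)+1)}
}

// Observe records one latency in the first bucket whose bound is >= seconds.
func (h *Histogram) Observe(seconds float64) {
	i := sort.SearchFloat64s(h.bounds, seconds)
	h.counts[i]++
	h.total++
}

// FractionWithin reports the share of observations <= bound. ok is false when
// bound is not a bucket boundary, because the histogram cannot answer exactly.
func (h *Histogram) FractionWithin(bound float64) (frac float64, ok bool) {
	i := sort.SearchFloat64s(h.bounds, bound)
	if h.total == 0 || i == len(h.bounds) || h.bounds[i] != bound {
		return 0, false
	}
	var cum uint64
	for j := 0; j <= i; j++ {
		cum += h.counts[j]
	}
	return float64(cum) / float64(h.total), true
}

// defaultLike copies the Prometheus client's default buckets (seconds).
var defaultLike = []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10}

// sloAligned puts a boundary on each threshold of the pricing budget.
var sloAligned = []float64{0.005, 0.015, 0.04, 0.08, 0.15, 0.3}

func main() {
	a, b := NewHistogram(defaultLike...), NewHistogram(sloAligned...)
	for i := 0; i < 1000; i++ {
		d := 0.06 // 990 requests at 60ms
		if i >= 990 {
			d = 0.09 // 10 requests at 90ms
		}
		a.Observe(d)
		b.Observe(d)
	}
	_, ok := a.FractionWithin(0.08)
	fmt.Println("default buckets can answer 80ms:", ok) // false
	frac, ok := b.FractionWithin(0.08)
	fmt.Printf("aligned buckets: %.1f%% within 80ms (ok=%v)\n", frac*100, ok)
	// aligned buckets: 99.0% within 80ms (ok=true)
}
```

    With the default-style bounds, all 1,000 observations land in the same 50ms to 100ms bucket, so a 99.9% target at 80ms cannot be evaluated. With a boundary at 80ms the miss is visible.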

    The pattern in our open-source tracking-updater service is a useful reference: bounded worker pool, exponential backoff, file lifecycle states, and metrics that map directly to a throughput budget. You can tell at a glance whether the service is healthy, slow, or quietly losing work.
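
    As a generic illustration of the bounded-worker-pool idea with an in-flight gauge, here is a minimal sketch. It is not code from the tracking-updater service, it omits backoff and file lifecycle handling, and the Pool type and its methods are hypothetical names. It assumes Go 1.19 or later for the atomic.Int64 type.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Pool runs jobs with at most cap(sem) of them in flight and tracks the
// counters you would export as metrics.
type Pool struct {
	sem       chan struct{}
	wg        sync.WaitGroup
	inFlight  atomic.Int64
	peak      atomic.Int64
	completed atomic.Int64
}

func NewPool(limit int) *Pool {
	return &Pool{sem: make(chan struct{}, limit)}
}

// Submit blocks while the pool is full; that blocking is the backpressure.
func (p *Pool) Submit(job func()) {
	p.sem <- struct{}{}
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		defer func() { <-p.sem }() // release the slot after the gauge drops
		n := p.inFlight.Add(1)
		for { // record the highest in-flight value seen so far
			old := p.peak.Load()
			if n <= old || p.peak.CompareAndSwap(old, n) {
				break
			}
		}
		job()
		p.completed.Add(1)
		p.inFlight.Add(-1)
	}()
}

// Wait blocks until every submitted job has finished.
func (p *Pool) Wait() { p.wg.Wait() }

func (p *Pool) InFlight() int64  { return p.inFlight.Load() }
func (p *Pool) Peak() int64      { return p.peak.Load() }
func (p *Pool) Completed() int64 { return p.completed.Load() }

func main() {
	p := NewPool(4)
	var sum atomic.Int64
	for i := 1; i <= 100; i++ {
		v := int64(i) // copy so each job sees its own value
		p.Submit(func() { sum.Add(v) })
	}
	p.Wait()
	fmt.Printf("completed=%d in-flight=%d sum=%d\n", p.Completed(), p.InFlight(), sum.Load())
	// completed=100 in-flight=0 sum=5050
	fmt.Println("peak within limit:", p.Peak() <= 4) // true
}
```

    Because the in-flight gauge is incremented after a slot is acquired and decremented before it is released, a reading above the limit would indicate work being started outside the pool.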

    The Magento Side of This

    Magento should not be where you measure your microservices. But it does play a role in the budget.

    In a typical headless setup:

    • Magento emits the event (order placed, product saved, stock updated) and gets out of the way.
    • The Go service consumes from a queue, does the heavy work, and reports its own SLOs.
    • The Magento application itself has a separate budget — page TTFB, GraphQL response time, admin responsiveness — measured by APM and tied to the storefront experience.

    The mistake is conflating them. “Order sync is slow” can mean Magento took 2 seconds to write the order, or the Go consumer is 30 minutes behind on its queue, or the ERP rejected the payload three times before accepting. Three different budgets, three different owners, three different fixes.

    Boundary clarity here is not academic. It determines whether your on-call rotation knows where to look at 2am.

    Decision Framework: Setting a Performance Budget

    Use this checklist when you’re standing up a new service or auditing an existing one.

    Apply this when:

    • The service sits on the critical path of checkout, pricing, or fulfillment
    • Downstream systems have hard timeouts (payment providers, ERPs, tax engines)
    • You have more than one team contributing changes to the service
    • Infrastructure cost has grown faster than transaction volume for two quarters running
    • “It scales” has been said in a planning meeting in the last 90 days

    Don’t bother (yet) when:

    • The service is genuinely internal, low-traffic, and not on a critical path
    • You have a team of two and no production load
    • You’re prototyping and the numbers would be lies anyway

    For each budget, define:

    • The metric (P99 latency, requests per second sustained, monthly cost at steady state)
    • The threshold (80ms, 2,000 RPS, €400/month)
    • The window (rolling 28 days, peak hour, monthly bill)
    • The action when breached (rollback, capacity review, architecture review)

    A budget without an action is decoration.
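
    One way to keep all four parts together is to treat a budget as data that can be validated. The sketch below is an assumption about how that might look, not a prescribed schema; the Budget type, its fields, and the observed values in main are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Budget captures the four parts of a performance budget.
type Budget struct {
	Metric         string
	Threshold      float64
	Unit           string
	Window         string
	Action         string
	HigherIsBetter bool // true for throughput, false for latency and cost
}

// Validate rejects budgets that are missing a part, including the action.
func (b Budget) Validate() error {
	switch {
	case b.Metric == "":
		return errors.New("budget has no metric")
	case b.Window == "":
		return errors.New("budget has no window")
	case b.Action == "":
		return errors.New("budget has no action on breach")
	}
	return nil
}

// Breached reports whether the observed value violates the threshold.
func (b Budget) Breached(observed float64) bool {
	if b.HigherIsBetter {
		return observed < b.Threshold
	}
	return observed > b.Threshold
}

// budgets uses the example thresholds from the list above.
var budgets = []Budget{
	{Metric: "P99 latency", Threshold: 80, Unit: "ms", Window: "rolling 28 days", Action: "rollback"},
	{Metric: "sustained throughput", Threshold: 2000, Unit: "RPS", Window: "peak hour", Action: "capacity review", HigherIsBetter: true},
	{Metric: "steady-state cost", Threshold: 400, Unit: "EUR/month", Window: "monthly bill", Action: "architecture review"},
}

func main() {
	observed := []float64{95, 2100, 380} // hypothetical measurements
	for i, b := range budgets {
		if err := b.Validate(); err != nil {
			fmt.Println("invalid budget:", err)
			continue
		}
		if b.Breached(observed[i]) {
			fmt.Printf("%s: %.0f %s breaches %.0f %s -> %s\n",
				b.Metric, observed[i], b.Unit, b.Threshold, b.Unit, b.Action)
		}
	}
	// P99 latency: 95 ms breaches 80 ms -> rollback
}
```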

    The Leadership Angle

    Performance budgets are an executive tool, not an engineering toy. They change the conversations you can have with the rest of the business.

    • With product: “This feature is a 15ms increase to P99. We have 20ms of budget left this quarter. Yes, or do we deprioritize the recommendations widget?”
    • With finance: “Cost per order dropped from €0.04 to €0.025 after the Go migration. Here’s the data, here’s the trend.”
    • With your CTO: “We have three months of error budget data. The pricing service is healthy. The order export service is burning budget — that’s where I want to invest engineering time.”

    Without budgets, all of these conversations become opinion against opinion. With them, they become trade-off conversations grounded in numbers. That is the difference between an engineering team that gets resources and one that gets squeezed.

    The other quiet benefit: budgets let you say no. “We can’t ship that without raising the latency budget — here’s what raising it would cost.” Engineers who can frame trade-offs in business language stop being treated as cost centers.

    Conclusion

    The best commerce teams I’ve worked with don’t have faster code than everyone else. They have clearer numbers. They know what their pricing service should cost. They know when their order export queue is breathing wrong. They know the exact moment to stop shipping features and fix reliability — because the budget tells them, not the loudest voice in the room.

    Set the three budgets. Latency at a real percentile. Throughput at a real load. Cost at a real bill. Defend them like you defend uptime. Everything else — the autoscaling rules, the cache layers, the Go rewrites — becomes a means to a number, instead of a religion.

    Performance is a budget. Spend it on purpose.
