{"id":318,"date":"2026-05-15T15:40:39","date_gmt":"2026-05-15T15:40:39","guid":{"rendered":"https:\/\/magendoo.ro\/insights\/performance-budgets-latency-throughput-and-cost\/"},"modified":"2026-05-15T15:40:39","modified_gmt":"2026-05-15T15:40:39","slug":"performance-budgets-latency-throughput-and-cost","status":"publish","type":"post","link":"https:\/\/magendoo.ro\/insights\/performance-budgets-latency-throughput-and-cost\/","title":{"rendered":"Performance Budgets: Latency, Throughput, and Cost"},"content":{"rendered":"<p>You can\u2019t optimize what you don\u2019t measure. And most teams don\u2019t measure the right things. They watch CPU graphs, average response times, and the green dots in their APM dashboard \u2014 and then act surprised when checkout falls over on Black Friday or the AWS bill triples after a \u201csmall refactor\u201d.<\/p>\n<p>Performance is not a vibe. It\u2019s a budget. And like any budget, if you don\u2019t set it explicitly, someone else will set it for you \u2014 usually a payment provider timing out at 10 seconds, or a finance director asking why your infra spend grew 40% year over year.<\/p>\n<h2 class=\"wp-block-heading\">Why Most Commerce Teams Get This Wrong<\/h2>\n<p>The default measurement culture in Magento shops is built around two questions: <em>is the site up?<\/em> and <em>is the page fast?<\/em> Both are necessary. Neither is sufficient.<\/p>\n<p>The reasons are structural, not personal:<\/p>\n<ul>\n<li><strong>APM tools default to averages.<\/strong> P50 looks great when P99 is on fire. 
The people losing checkouts are in the tail.<\/li>\n<li><strong>Magento performance work historically meant \u201cfaster TTFB on category pages\u201d.<\/strong> That mental model survives into Go microservice projects where it doesn\u2019t apply.<\/li>\n<li><strong>Cost lives in a different dashboard than latency.<\/strong> Engineers optimize one, finance complains about the other, and nobody connects them.<\/li>\n<li><strong>Throughput rarely gets a number.<\/strong> \u201cIt scales\u201d is a feeling, not a budget. Until a campaign hits and you find out it doesn\u2019t.<\/li>\n<\/ul>\n<p>A performance budget reframes all of this. You declare upfront: this service has 200ms at P99, sustains 500 requests per second, and costs no more than \u20ac400\/month at steady state. Now you have something to defend, to test against, and to refuse.<\/p>\n<h2 class=\"wp-block-heading\">The Three Numbers That Matter<\/h2>\n<p>Every meaningful commerce service has three budgets. Not one. Three.<\/p>\n<h3 class=\"wp-block-heading\">1. Latency Budget (with a Percentile)<\/h3>\n<p>Forget averages. Pick percentiles that match what your users actually feel:<\/p>\n<ul>\n<li><strong>P50<\/strong> \u2014 typical experience, useful for trend monitoring<\/li>\n<li><strong>P95<\/strong> \u2014 the bar you\u2019d accept in front of a CTO<\/li>\n<li><strong>P99<\/strong> \u2014 where conversion starts to suffer<\/li>\n<li><strong>P99.9<\/strong> \u2014 where payment retries and timeouts hide<\/li>\n<\/ul>\n<p>Budget example for a pricing service backing the cart:<\/p>\n<table>\n<thead>\n<tr>\n<th>Endpoint<\/th>\n<th>P50<\/th>\n<th>P95<\/th>\n<th>P99<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><code>GET \/price\/{sku}<\/code><\/td>\n<td>15ms<\/td>\n<td>40ms<\/td>\n<td>80ms<\/td>\n<\/tr>\n<tr>\n<td><code>POST \/price\/bulk<\/code><\/td>\n<td>60ms<\/td>\n<td>150ms<\/td>\n<td>300ms<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>These are not hopes. They are commitments. 
If a deploy breaches P99, you roll back. If a new feature needs more headroom, you negotiate the budget before you ship.<\/p>\n<h3 class=\"wp-block-heading\">2. Throughput Budget<\/h3>\n<p>How many requests per second can the service sustain <em>while staying inside the latency budget<\/em>? This is the question that breaks \u201cit scales\u201d hand-waving.<\/p>\n<p>Two thresholds worth measuring:<\/p>\n<ul>\n<li><strong>Steady-state throughput<\/strong> \u2014 what a single instance handles 24\/7 without degradation<\/li>\n<li><strong>Burst throughput<\/strong> \u2014 what the fleet can absorb for a 5-minute Black Friday spike<\/li>\n<\/ul>\n<p>A Go service handling shipment tracking events at JTI-scale might budget 2,000 events\/sec steady-state per pod, with the fleet sized for 8,000\/sec burst. If a code change drops that to 1,200\/sec steady-state, the cost model breaks even if latency is unchanged. You\u2019d need more pods, more nodes, more money.<\/p>\n<h3 class=\"wp-block-heading\">3. Cost Budget<\/h3>\n<p>Everyone has a latency dashboard. Almost nobody has a cost-per-request number. They should.<\/p>\n<p>Useful cost metrics:<\/p>\n<ul>\n<li><strong>Cost per 1,000 requests<\/strong> \u2014 direct unit economics<\/li>\n<li><strong>Cost per processed order<\/strong> \u2014 translates infrastructure into commerce language<\/li>\n<li><strong>Steady-state monthly cost at expected load<\/strong> \u2014 the number your CFO actually reads<\/li>\n<\/ul>\n<p>The discipline is to compute cost honestly: compute + memory + egress + managed services (Redis, RDS, MSK) + observability overhead. Not just EC2. The observability bill alone can quietly grow to 20% of compute if you don\u2019t watch it.<\/p>\n<h2 class=\"wp-block-heading\">SLOs Are the Contract, Error Budgets Are the Currency<\/h2>\n<p>A latency target without a service-level objective is a wish. 
A wish is not a budget.<\/p>\n<p>A useful SLO for a Go pricing service might be:<\/p>\n<blockquote>\n<p>99.9% of <code>\/price\/{sku}<\/code> requests return within 80ms over a rolling 28-day window.<\/p>\n<\/blockquote>\n<p>That gives you an error budget of 0.1% of requests. Expressed as time, that is about 40 minutes of \u201cout of budget\u201d time per 28-day window. You can spend that budget on:<\/p>\n<ul>\n<li>Deploys that briefly degrade performance<\/li>\n<li>Backend dependency hiccups (the ERP that nobody tells you went into batch processing)<\/li>\n<li>Cache misses during warmup after a restart<\/li>\n<\/ul>\n<p>When the error budget burns too fast, you stop shipping features and fix reliability. When you finish a window with budget unused, you can spend it: ship riskier changes, run chaos tests, push the cost lower at the expense of some headroom.<\/p>\n<p>This is the part most commerce teams skip. They have alerts. They don\u2019t have a budget. So every incident feels equally urgent, every deploy feels equally safe, and the team burns out on noise.<\/p>\n<h2 class=\"wp-block-heading\">Measuring the Right Things in Go<\/h2>\n<p>Go gives you decent primitives for this, but only if you wire them in deliberately.<\/p>\n<p>What I actually instrument in commerce Go services:<\/p>\n<ul>\n<li><strong>Request latency histograms<\/strong> with percentile-friendly buckets (Prometheus <code>Histogram<\/code> with explicit buckets matching your SLO thresholds \u2014 not the defaults). The default buckets will lie to you about P99: quantiles are interpolated within a bucket, and the defaults jump from 50ms to 100ms with no boundary at 80ms.<\/li>\n<li><strong>Queue lag and consumer offset<\/strong> for any worker pool consuming Magento events, ERP outbox messages, or PIM updates. Lag is the early warning that throughput is breaking down before latency goes red.<\/li>\n<li><strong>Retry and DLQ counters<\/strong> \u2014 every retry is latency the user doesn\u2019t see, but cost you absolutely pay. 
A spike in retries is often the first signal of a downstream regression.<\/li>\n<li><strong>In-flight goroutine count<\/strong> for backpressure visibility. A worker pool quietly growing past its semaphore limit is throughput collapse in slow motion.<\/li>\n<li><strong>Cost-allocation labels<\/strong> on every metric and log line: service, environment, tenant, region. Without these, your cost dashboard is unreadable when you need it most.<\/li>\n<\/ul>\n<p>The pattern in our open-source <a href=\"https:\/\/github.com\/florinel-chis\/tracking-updater\">tracking-updater<\/a> service is a useful reference: bounded worker pool, exponential backoff, file lifecycle states, and metrics that map directly to a throughput budget. You can tell at a glance whether the service is healthy, slow, or quietly losing work.<\/p>\n<h2 class=\"wp-block-heading\">The Magento Side of This<\/h2>\n<p>Magento should not be where you measure your microservices. But it does play a role in the budget.<\/p>\n<p>In a typical headless setup:<\/p>\n<ul>\n<li>Magento emits the event (order placed, product saved, stock updated) and gets out of the way.<\/li>\n<li>The Go service consumes from a queue, does the heavy work, and reports its own SLOs.<\/li>\n<li>The Magento application itself has a separate budget \u2014 page TTFB, GraphQL response time, admin responsiveness \u2014 measured by APM and tied to the storefront experience.<\/li>\n<\/ul>\n<p>The mistake is conflating them. \u201cOrder sync is slow\u201d can mean Magento took 2 seconds to write the order, or the Go consumer is 30 minutes behind on its queue, or the ERP rejected the payload three times before accepting. Three different budgets, three different owners, three different fixes.<\/p>\n<p>Boundary clarity here is not academic. 
It determines whether your on-call rotation knows where to look at 2am.<\/p>\n<h2 class=\"wp-block-heading\">Decision Framework: Setting a Performance Budget<\/h2>\n<p>Use this checklist when you\u2019re standing up a new service or auditing an existing one:<\/p>\n<p><strong>Apply this when:<\/strong><\/p>\n<ul>\n<li>The service sits on the critical path of checkout, pricing, or fulfillment<\/li>\n<li>Downstream systems have hard timeouts (payment providers, ERPs, tax engines)<\/li>\n<li>You have more than one team contributing changes to the service<\/li>\n<li>Infrastructure cost has grown faster than transaction volume for two quarters running<\/li>\n<li>\u201cIt scales\u201d has been said in a planning meeting in the last 90 days<\/li>\n<\/ul>\n<p><strong>Don\u2019t bother (yet) when:<\/strong><\/p>\n<ul>\n<li>The service is genuinely internal, low-traffic, and not on a critical path<\/li>\n<li>You have a team of two and no production load<\/li>\n<li>You\u2019re prototyping and the numbers would be lies anyway<\/li>\n<\/ul>\n<p><strong>For each budget, define:<\/strong><\/p>\n<ul>\n<li>The metric (P99 latency, requests per second sustained, monthly cost at steady state)<\/li>\n<li>The threshold (80ms, 2,000 RPS, \u20ac400\/month)<\/li>\n<li>The window (rolling 28 days, peak hour, monthly bill)<\/li>\n<li>The action when breached (rollback, capacity review, architecture review)<\/li>\n<\/ul>\n<p>A budget without an action is decoration.<\/p>\n<h2 class=\"wp-block-heading\">The Leadership Angle<\/h2>\n<p>Performance budgets are an executive tool, not an engineering toy. They change the conversations you can have with the rest of the business.<\/p>\n<ul>\n<li><strong>With product:<\/strong> \u201cThis feature is a 15ms increase to P99. We have 20ms of budget left this quarter. Yes, or do we deprioritize the recommendations widget?\u201d<\/li>\n<li><strong>With finance:<\/strong> \u201cCost per order dropped from \u20ac0.04 to \u20ac0.025 after the Go migration. 
Here\u2019s the data, here\u2019s the trend.\u201d<\/li>\n<li><strong>With your CTO:<\/strong> \u201cWe have three months of error budget data. The pricing service is healthy. The order export service is burning budget \u2014 that\u2019s where I want to invest engineering time.\u201d<\/li>\n<\/ul>\n<p>Without budgets, all of these conversations become opinion against opinion. With them, they become trade-off conversations grounded in numbers. That is the difference between an engineering team that gets resources and one that gets squeezed.<\/p>\n<p>The other quiet benefit: budgets let you say no. \u201cWe can\u2019t ship that without raising the latency budget \u2014 here\u2019s what raising it would cost.\u201d Engineers who can frame trade-offs in business language stop being treated as cost centers.<\/p>\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n<p>The best commerce teams I\u2019ve worked with don\u2019t have faster code than everyone else. They have clearer numbers. They know what their pricing service should cost. They know when their order export queue is breathing wrong. They know the exact moment to stop shipping features and fix reliability \u2014 because the budget tells them, not the loudest voice in the room.<\/p>\n<p>Set the three budgets. Latency at a real percentile. Throughput at a real load. Cost at a real bill. Defend them like you defend uptime. Everything else \u2014 the autoscaling rules, the cache layers, the Go rewrites \u2014 becomes a means to a number, instead of a religion.<\/p>\n<p>Performance is a budget. Spend it on purpose.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You can\u2019t optimize what you don\u2019t measure. And most teams don\u2019t measure the right things. They watch CPU graphs, average response times, and the green dots in their APM dashboard \u2014 and then act surprised when checkout falls over on Black Friday or the AWS bill triples after a \u201csmall refactor\u201d. 
Performance is not a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":317,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-container-style":"default","site-container-layout":"default","site-sidebar-layout":"default","disable-article-header":"default","disable-site-header":"default","disable-site-footer":"default","disable-content-area-spacing":"default","footnotes":""},"categories":[1],"tags":[],"class_list":["post-318","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-general"],"_links":{"self":[{"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/posts\/318","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/comments?post=318"}],"version-history":[{"count":0,"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/posts\/318\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/media\/317"}],"wp:attachment":[{"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/media?parent=318"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/categories?post=318"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/magendoo.ro\/insights\/wp-json\/wp\/v2\/tags?post=318"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}