Saturday, June 28, 2014

#LatencyTipOfTheDay: MOST page loads will experience the 99%'lie server response

Yes. MOST of the page view attempts will experience the 99%'lie server response time in modern web applications. You didn't read that wrong.
This simple fact seems to surprise many people. Especially people who spend much of their time looking at pretty lines depicting averages, 50%'lie, 90%'lie or 95%'lies of server response time in feel-good monitoring charts. I am constantly amazed by how little attention is paid to the "higher end" of the percentile spectrum in most application monitoring, benchmarking, and tuning environments. Given the fact that most user interactions will experience those numbers, the adjacent picture comes to mind.

Oh, and in case the message isn't clear, I am also saying that:

- MOST of your users will experience the 99.9%'lie once in ten page view attempts

- MOST of your users will experience the 99.99%'lie once in 100 page view attempts

- More than 5% of your shoppers/visitors/customers will experience the 99.99%'lie once in 10 page view attempts.

So, how does this work? Simple: it's math.

For most (>50%) web pages to possibly avoid experiencing the 99%'ile of server response time, the number of resource requests per page would need to be smaller than 69.

Why 69?

Here is the math:

- The chance of a single resource request avoiding the 99%'lie is 99%. [Duh.]

- The chance of all N resource requests in a page avoiding the 99%'lie is (0.99 ^ N) * 100%.

- (0.99 ^ 69) * 100%  = 49.9%

So with 69 resource requests or more per page, MOST (> 50% of) page loads are going to fail to avoid the 99%'lie. And the users waiting for those pages to fill will experience the 99%'ile for at least some portion of the web page. This is true even if you assume perfect parallelism for all resource requests within a page (none of the requests issued depend on previous requests being answered). Reality is obviously much worse than that, since requests in pages do depend on previous responses, but I'll stick with what we can claim for sure.
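The 69 figure is just the arithmetic from the bullets above; a couple of lines of Python confirm it:

```python
# The chance that all N requests in a page avoid the 99%'ile is 0.99**N.
# Find the smallest N at which most (>50%) page loads fail to avoid it.
n = 1
while 0.99 ** n > 0.5:
    n += 1

print(n)                    # 69
print((0.99 ** 69) * 100)   # ~49.98% of page loads still escape the 99%'ile
```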

The percentage of page view attempts that will experience your 99%'lie server response time (even assuming perfect parallelism in all requests) will be bounded from below by:

% of page view attempts experiencing 99%'ile >= (1 - (0.99 ^ N)) * 100%

Where N is the number of [resource requests / objects / HTTP GETs] per page.

So, how many server requests are involved in loading a web page?

The total number of server requests issued by a single page load obviously varies by application, but it appears to be a continually growing number in modern web applications. So to back my claims I went off to the trusty web and looked for data.

According to some older stats collected for a sample of several billion pages processed as part of Google's crawl and indexing pipeline, the number of HTTP GET requests per page on "Top sites" hit the obvious right answer (42, Duh!) in mid 2010 (see [1]). For "All sites" it was 44. But those tiny numbers are so 2010...

According to other sources, the number of objects per page has been growing steadily, with ~49 around the 2009-2010 timeframe (similar to but larger than Google's estimates), and crossed 100 GETs per page in late 2012 (see [2]). But that was almost two years ago.

And according to a very simple and subjective measurement done with my browser just now, loading this blog's web page (before this posting) involved 119 individual resource requests. So nearly 70% of the page views of this blog are experiencing Blogger's 99%'lie.
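Treating the bound above as a function makes that "nearly 70%" easy to check (a sketch; the function name is mine, not anything standard):

```python
def pct_seeing_percentile(n_requests, percentile=0.99):
    """Lower bound on the share of page loads that will experience at
    least one response at or above the given server-side percentile."""
    return (1 - percentile ** n_requests) * 100

# 119 resource requests on this blog's page:
print(round(pct_seeing_percentile(119), 1))   # 69.8 -- "nearly 70%"
```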

To further make sure that I'm not smoking something, I hand-checked a few common web sites I happened to think of, and none of the request counts came in at 42:

Site                                             # of requests   page loads that would experience
                                                                 the 99%'lie [(1 - (.99 ^ N)) * 100%]
--                                               190             85.2%
--                                               204             87.1%
--                                               112             67.6%
--                                               109             66.5%
--                                               173             82.4%
--                                               279             93.9%
--                                               87              58.3%
--                                               84              57.0%
--                                               178             83.3%
google.com (yes, that simple noise-free page)    31              26.7%
google.com search for "http requests per page"   76              53.4%

So yes. There is one web page on this list for which most page loads will not experience the 99%'lie. "Only" about 1/4 of visits to google.com's clean and plain home page will see that percentile. But if you actually use google search for something, you are back on my side of the boat...
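The right-hand column in the table is just the same bound applied to each request count; re-running it reproduces the figures (to within 0.1 of a percentage point, which is rounding in the originals):

```python
def pct(n):
    # Share of page loads that would experience the 99%'lie, given n requests
    return round((1 - 0.99 ** n) * 100, 1)

for n in [190, 204, 112, 109, 173, 279, 87, 84, 178, 31, 76]:
    print(n, pct(n))
```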

What the ^%&*! are people spending their time looking at?

Given these simple facts, I am constantly amazed by the number of people I meet who never look at numbers above the 95%'ile, and spend most of their attention on medians or averages. Even if we temporarily ignore the fact that the 95%'lie is irrelevant (as in too optimistic) for more than half of your page views, there is less than a 3% chance of a modern web app page view avoiding the 95%'ile of server response time. This means that the 90%'lie, 75%'lie, median, and [usually] the average are completely irrelevant to 97% of your page views.
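The "less than 3%" figure is the same arithmetic applied to the 95%'ile, even at the modest 69-requests-per-page mark:

```python
# Chance that an entire 69-request page load avoids the 95%'ile:
print((0.95 ** 69) * 100)   # ~2.9% -- so the 95%'ile (and everything
                            # below it) is invisible to ~97% of page views
```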

So please, Wake Up! Start looking at the right end of the percentile spectrum...


[1] Sreeram Ramachandran: "Web metrics: Size and number of resources", May 2010.
[2] "Average Number of Web Page Objects Breaks 100", Nov 2012

Discussion Note: 

It's been noted by a few people that these calculations assume that there is no strong time-correlation of bad or good results. This is absolutely true. The calculations I use are valid if every request has the same chance of experiencing a larger-than-percentile-X result regardless of what previous results have seen. A strong time correlation would decrease the number of pages that would see worse-than-percentile-X results (down to a theoretical (100% - X) in theoretically perfect "all responses in a given page are either above or below the X%'lie" situations). Similarly, a strong time anti-correlation (e.g. a repeated pattern going through the full range of response time values every 100 responses) will increase the number of pages that would see a worse-than-percentile-X result, up to a theoretical 100%.

So in reality, my statement of "most" (and the 99.3% computed above) may be slightly exaggerated. Maybe instead of >50% of your page views seeing the 99%'lie, it's "only" 20% of page views that really are that bad... ;-)

Without time-correlation (or anti-correlation) information, the best you can do is act on the basic information at hand. And the only thing we know about a given X%'ile in most systems (on its own, with no other measured information about correlative behavior) is that the chance of seeing a number above it is (100% - X).
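The correlation effect described above shows up clearly in a toy simulation (a sketch; the 1% "bad response" model and the page size of 100 are assumptions for illustration, not measurements):

```python
import random

random.seed(42)
PAGE = 100        # resource requests per page load (assumed)
PAGES = 10_000
P_BAD = 0.01      # a response lands above the 99%'ile 1% of the time

# No time correlation: each response is independently "bad" with p = 1%.
uncorrelated = sum(
    any(random.random() < P_BAD for _ in range(PAGE))
    for _ in range(PAGES)
) / PAGES

# Perfect time correlation: bad responses arrive in page-sized bursts,
# so a page is either entirely good or entirely bad.
correlated = sum(random.random() < P_BAD for _ in range(PAGES)) / PAGES

print(uncorrelated)   # close to 1 - 0.99**100 ~= 0.63
print(correlated)     # close to the theoretical floor of 0.01
```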


  1. I strongly agree that when testing you should look at the 99%tile even to the point of ignoring the 50%. The 50%tile can give you an idea of how much you could improve e.g. is it likely to be quick wins (if 99%tile >> 4x 50%tile) or do you have to speed up all the code.

    99%tile is just the start; you have to have a realistic peak load and look at the 99.9% and even worst-case timings after this first step.

  2. This compounds when you look at user activity over an entire session -- with for example 5/10/20 page impressions in a session -- it becomes even worse - the odds are very high that while a user is browsing your site they will hit the 99.9.

    I think to give a fairer assessment you need to factor in that not all of your resources come from the same location, assuming you are talking about a website. For example, static content will often come from a CDN which hopefully has a much lower 100% than your app server. This pushes out the likelihood of a user hitting a 99.9% load time quite a bit.

    Nonetheless the problem still exists and I find it a very important point when talking to customers about the *real experience* versus *meeting the requirements*.

    1. Yup. the session or daily experience is compounded. And the fact that most people don't look at numbers above the 95% or 99% literally means that they are looking at the data of their best 3%-5% user experience, and ignoring the rest.

      Yes. A page is served by many servers, and the 99%'ile (across all of them) is what should/would be used in this math. The "CDNs are good" arguments come up a lot, but they basically cancel out. CDNs are not special, and their 99.99%'lie stinks just as bad as everyone else's. Try to get a CDN to report on something other than median, 95%'lie, and maybe 99%'lie, and you'll find out very quickly that they live in the same world as the rest of us... Mostly because they'll either push back or admit to not even looking at the numbers.

      As you may have noticed, I'm trying to discredit [one by one] the commonly used metrics people watch on feel-good "monitoring" dashboards. Average is meaningless. Medians are practically never experienced by anyone. The 95%'lie covers only the best 3%-5%, and the 99%'lie describes only the better half.

      Why am I doing this? Because I truly believe that you need to actually measure and closely watch numbers with several 9s in them. 99.9% is an ok entry-level starting point, but 99.99% and higher are absolutely needed for anyone who cares about users that actually use their application for more than a couple of minutes per day.

      The math is simple: Each actual user will be regularly experiencing the worst of 10s of thousands of "server response times" each day. Ignoring (as in not even bothering to measure) the 99.99%'lie in that reality, and not monitoring the max times (for which there is no technical excuse, since they are trivial to measure) is what the picture at the top of this blog is all about.

  3. Hi.

    Are you trying to say "ninety nine percent lie" or "ninety nine percentile" ? because most of your spellings say "lie" and I have never heard of a "ninety nine percent lie" when it comes to web stuff...

  4. Hello Gil,

    On the 100-200 resources of these sites, you will typically find that 95-99% are static resources (images, css, js, fonts, ...) and a handful of real "server responses" (main document, a few ajax calls). Static resources should be cached in the CDN, and therefore the server time will only be for origin requests.

    Assuming 1 main document and a couple of ajax requests, using your math, it gives "only" ~5% (1 - 0.99 ^ 5) of users affected by the 99th percentile on the server.

    You still need to actively monitor your CDN 99th percentile though ;)

    In short, I think the 99th percentile on the server is more than a "feel good" indicator (unless you are monitoring requests that should never have reached the origin server). Make sense?

  6. I recommend a small insertion making it clear that this is true because the requests are being made in parallel so your total time is a function of the max of service times.

    Because I just had a discussion with someone who tried to apply your results to a system that hands off processing from service to service in series.

    And when you add a collection of random variables, e.g. prices of the various stocks in your portfolio, you get a "Portfolio Effect" -- total values are closer to the mean, not more extreme.

    The probability that the other n-1 services are NOT all hitting their 99%-ile time is 1 - (.01 ^ [n-1]); for 69 services, that's a near certainty. And each of those services is saving time compared to its own 99%-ile, offsetting the total. So there is a portfolio effect.

    The central idea that makes this post basically true even when in series is that the top 1%-ile performance is typically so much higher than norm that it is unlikely to be measurably offset by savings elsewhere. But I am unsure.

    In any case, a couple of words about the requests being simultaneous would have saved my co worker some confusion.
