>> "It is OK if a query is slow as long as it is always slow"
I'm having trouble understanding the motivation. If a slow query is always slow, then I'm always going to be kept waiting for that page/data. It seems logical to worry about the queries that keep users waiting 100% of the time rather than the queries that keep users waiting <100% of the time.
Does anyone care to explain why this is a good idea (for Facebook at least)?
I'm sure the rationale here is that if a given query takes 100ms, they don't prioritize getting it down to 50ms, even if 100ms is several times the average, because they already know they can. Certainly someone should focus on making that query faster, but it's a more straightforward problem.
The harder problem is figuring out why that 20ms query suddenly balloons to 200ms. You can say, "no big deal, it only happens 1% of the time," but if you don't know why, you could make changes to the system that cause it to happen much more frequently and eventually bring the whole system down.
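This is part of why averages alone can hide the problem. As a rough sketch (with made-up latency numbers matching the comment above), a query that takes 20ms 99% of the time and 200ms the other 1% looks healthy in its mean and median; only a high percentile exposes the spike:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest value >= p% of the samples."""
    s = sorted(samples)
    k = math.ceil(p / 100 * len(s)) - 1
    return s[max(0, k)]

# Hypothetical latencies: 20ms 99% of the time, 200ms the other 1%.
latencies = [20] * 990 + [200] * 10

print(sum(latencies) / len(latencies))  # mean 21.8ms: looks fine
print(percentile(latencies, 50))        # median 20ms: looks fine
print(percentile(latencies, 99.9))      # 200ms: the spike finally shows up
```

If you only watch the mean, the ballooning query is invisible until whatever triggers it starts happening more often.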
Also, there's a bit of UX here. People are much more frustrated by things they don't understand and/or aren't used to. There are parts of GMail that are always slow (archiving a lot of messages). I know this, so I know I have to wait 5-10 seconds. What if sometimes it took 1 second and sometimes it took 20 seconds? What if it took 20 seconds 5% of the time? I'd probably click again and think something was broken. If it's always slow, I want it to be faster, but at least I know what to expect.
It's all about user experience. If a feature loads for you in 300ms, then it will feel unbearably slow if it later takes 700ms. If the feature always loaded in 700ms, your expectation of 700ms would never be disappointed. The theory is that a consistent 700ms is more appealing than an inconsistent 300-700ms, even though the latter's median is better in terms of raw performance.
The parent hints at this by talking about minimizing unpredictability and describing Facebook as "people oriented" - it ultimately boils down to user experience.
Think of it this way: if you know a certain function will reliably take a little while to complete, you can justify the effort of adding progress indicators and other feedback to let the user know.
But if query performance is unpredictable, even planning the UI design becomes difficult - not to mention the end user's experience.