
The other response you'll see all the time? "Measure it!"

Low-level optimization is not a task to be undertaken casually. If you don't know where to look, you can end up just wasting a lot of programmer time for zero real-world benefit. If you don't measure before and after, you may even make things worse...

The SQLite folks didn't make these changes with a hope and a prayer; they carefully profiled, changed, and profiled again... And they invested years in building a test suite to ensure that such micro-optimizations don't inadvertently break the logic. The results are impressive, but so was the time and effort invested in achieving them. If you're not willing to be so diligent, "premature optimization" is exactly what you're doing.



> "Measure it!"

And for the love of all that you care about, measure it with realistic data and use cases.

I've seen people optimise code while working with small test databases, making changes that are more efficient over the smaller data-sets but hideously bad on larger ones (usually due to memory-use growth, for instance hitting the limits where SQL Server starts spooling temporary data to disk instead of just keeping it in RAM for the duration).
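
To make the point concrete with a toy Python sketch (the function names are invented for illustration): an approach that looks fine on a small test set can fall off a cliff at realistic sizes, so time it at both scales.

```python
import timeit
from collections import deque

def prepend_list(n):
    buf = []
    for i in range(n):
        buf.insert(0, i)   # O(n) shift of everything already in the list
    return buf

def prepend_deque(n):
    buf = deque()
    for i in range(n):
        buf.appendleft(i)  # O(1)
    return buf

# Timing only the small case hides the quadratic blow-up entirely.
for n in (100, 20_000):
    t_list = timeit.timeit(lambda: prepend_list(n), number=1)
    t_deque = timeit.timeit(lambda: prepend_deque(n), number=1)
    print(f"n={n:>6}: list.insert(0)={t_list:.4f}s  deque.appendleft={t_deque:.4f}s")
```

At n=100 the two are indistinguishable; only the larger run exposes the difference.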

Or just as bad: working with data that is balanced differently from production and (to give another SQL example) adding index-use "hints" that shave a little off the run time over the test data but hamper the query planner on real data by making it ignore what might be better options.


Can you expand on what you mean by index use "hints"? Or is this a feature I just haven't stumbled on yet?


As lmz mentioned, they tell the query planner that you'd prefer it to take a particular route when executing your query where it might have a number of choices given your current table+index structure and data patterns (and has chosen a bad one of those options in the past).

I put quotes around the word "hint" because it implies a fairly informal pointer that the planner might choose to ignore, when in reality planners usually take it as an order and go that way without question (the assumption being that you know better than it would otherwise choose, or you would not have given the hint).


Telling / forcing the query planner to use a particular index in a query e.g.:

SQL Server: http://msdn.microsoft.com/en-us/library/ms187373.aspx

MySQL: http://dev.mysql.com/doc/refman/5.6/en/index-hints.html
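
Since the article is about SQLite: it has an analogous (and stricter) mechanism, INDEXED BY, which you can poke at from Python's sqlite3 module. A toy sketch (table and index names made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
con.execute("CREATE INDEX idx_a ON t(a)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(i, i * 2) for i in range(100)])

# INDEXED BY is stricter than a "hint": SQLite raises an error if the
# named index cannot be used for the query at all.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT b FROM t INDEXED BY idx_a WHERE a = 42"
).fetchall()
print(plan)  # the plan row should mention idx_a
```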


Oh yeah, definitely. In some situations the performance implications are obvious. E.g. adding to a collection of unique values in a loop? Use a hash, not a list. That's the kind of thing you do up-front; it sounds obvious, but you would be amazed at the bad responses you get on the internet for pointing this kind of stuff out.
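
A minimal Python sketch of that point, with a set standing in for the "hash" (function names are mine):

```python
def unique_via_list(items):
    out = []
    for x in items:
        if x not in out:   # O(n) scan on every element -> O(n^2) overall
            out.append(x)
    return out

def unique_via_set(items):
    seen = set()
    out = []
    for x in items:
        if x not in seen:  # O(1) average hash lookup
            seen.add(x)
            out.append(x)
    return out
```

Both preserve first-seen order; only the membership test changes.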

What I guess I'm trying to say is that you can go in and write obviously slow code "which is fine because you shouldn't prematurely optimize," or you can be informed and, sometimes with no extra development time, write something that's at least competent from the get-go.

For everything else? Profile, profile, profile, profile! And make sure you are using a competent profiler. I've wasted days because of inferior tools, switched to better ones (cough VTune cough) and had major improvements in mere hours.
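
Even the stdlib profiler will surface the obvious hot spot; a toy sketch using cProfile (the workload functions are invented for illustration):

```python
import cProfile
import io
import pstats

def slow_part():
    return sum(i * i for i in range(200_000))

def fast_part():
    return sum(range(1_000))

def work():
    slow_part()
    fast_part()

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

# Sort by cumulative time; slow_part should dominate the listing.
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```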


This is where knowing the full Knuth quote helps...

> Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

Another common mistake relates to the phrase "penny wise and pound foolish", which, applied to programming, helps us identify scenarios where we might waste time micro-optimizing array searches rather than choosing a hashtable (or even pre-sorting the array). Time spent optimizing your architecture will nearly always offer a bigger payoff than micro-optimizations - but once that's done, Knuth's critical 3% is worth remembering.
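
A rough Python sketch of that contrast: micro-tuning the linear scan versus simply choosing a better structure, here bisect over a pre-sorted array, or a set (names are mine):

```python
import bisect

data = list(range(0, 1_000_000, 2))      # pre-sorted even numbers

def linear_contains(xs, target):
    for x in xs:                         # the loop people micro-optimize
        if x == target:
            return True
    return False

def sorted_contains(xs, target):
    i = bisect.bisect_left(xs, target)   # exploit the ordering: O(log n)
    return i < len(xs) and xs[i] == target

lookup = set(data)                       # or a hash table: O(1) average
```

No amount of tweaking the first function beats switching to either of the other two.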


It goes beyond the full quote --- the whole paper is great, showing Knuth's stature as both a thinker and a writer. If you've ever quoted the "premature optimization" line to someone, or nodded when hearing it, you'd do yourself a service by understanding the context of the quote.

http://cs.sjsu.edu/~mak/CS185C/KnuthStructuredProgrammingGoT...

Here's some more from it to entice you:

  My own programming style has of course changed during the 
  last decade, according to the trends of the times (e.g., 
  I'm not quite so tricky anymore, and I use fewer go to's), 
  but the major change in my style has been due to this inner 
  loop phenomenon. I now look with an extremely jaundiced eye 
  at every operation in a critical inner loop, seeking to 
  modify my program and data structure (as in the change from 
  Example 1 to Example 2) so that some of the operations can 
  be eliminated. The reasons for this approach are that: a) 
  it doesn't take long, since the inner loop is short; b) the 
  payoff is real; and c) I can then afford to be less 
  efficient in the other parts of my programs, which 
  therefore are more readable and more easily written and 
  debugged.
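
In Python terms, that inner-loop discipline often amounts to hoisting loop-invariant work; a toy sketch where the gain factor (invented for illustration) is eliminated from the loop:

```python
import math

def rms_naive(rows, gain_db):
    out = []
    for row in rows:
        # loop-invariant: recomputed on every pass for no reason
        factor = 10 ** (gain_db / 20) * math.sqrt(2)
        out.append(math.sqrt(sum(x * x for x in row) / len(row)) * factor)
    return out

def rms_hoisted(rows, gain_db):
    factor = 10 ** (gain_db / 20) * math.sqrt(2)   # computed once
    return [math.sqrt(sum(x * x for x in row) / len(row)) * factor
            for row in rows]
```

Same results, one fewer operation per iteration - and as Knuth says, it doesn't take long, since the inner loop is short.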


For real, the SQLite codebase is some of the most well-tested open-source software out there.
