Hacker News

Thanks, this is a pet peeve of mine. Theoretical bounds are all nice and such, and it's cool if you can prove a better running time for some algorithm, but if it doesn't actually perform better in practice most of the time then it's not really worth much. I blame computer scientists who are in a rush to publish the best theoretical solution but never bother to consider whether it has practical impact. The same goes for textbooks that blithely claim you can do better with such and such a data structure or algorithm.

See this StackOverflow question for some nice insights: https://stackoverflow.com/questions/504823/has-anyone-actual...



Why should everything have an immediate practical impact? "Theoretical" research often becomes useful in the future as new developments happen. E.g. as computers get faster it becomes more and more feasible to tackle huge problems, i.e. problems so large that the asymptotically faster algorithms are faster in practice (that is, the problem size overcomes the large constant factor).


Please don't misunderstand my post as an attack on research that has a more theoretical impact. What I'm saying is that if you claim to have found a faster version of some algorithm, then it should actually be true (preferably with working code). I have no problem with people showing a theoretically better version as long as they explain whether or not it's useful in practice.

Also I would point out that as computers get faster things that were suboptimal before become feasible and begin to outperform the tricks people used before. A good example would be spellchecking, which had to be done using a wide array of cleverness but now we can just throw tons of computing power at it and get better results.


I believe the mistake to be yours. Algorithms research is a largely theoretical field - when researchers come up with an algorithm with a lower asymptotic bound, that's all they're claiming - they're making no statements about its practical utility or the performance of an implementation. An implementation will almost certainly be provided, but that's beside the point.

An explanation of why it's useful in practice is again totally beside the point - and really would be taken as rather insulting by many researchers - the motivations are often largely theoretical.

As a case in point, the Fibonacci heap noted by OP was motivated by reducing the theoretical worst-case asymptotic complexity of Dijkstra's algorithm, and does that very well - no benefit intended at all for normal programmers.

If people do not know what big-O means, that's their own problem. The algorithms people do their research, and their papers are published in the context of an informed reader - if someone does not know the definition of big-O and makes mistakes based on it, the time wasted is probably their own fault.

In the case of this article, it seems as though the code originally ran in 200ms. This seems like a very acceptable amount of time - and one that would be improved more by using an algorithm like A* with decent heuristics than by simply trying to do the same work in less time.
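For what it's worth, A* isn't much code either. A sketch on a toy grid - the grid encoding, the names, and the Manhattan heuristic are all illustrative assumptions here, not anything from the article:

```python
import heapq

def astar(grid, start, goal):
    """A* over a 4-connected grid of 0 (free) / 1 (wall) cells, using
    Manhattan distance as an admissible heuristic. Returns the path
    length in steps, or None if the goal is unreachable."""
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]   # (f = g + h, g, cell)
    best = {start: 0}                    # cheapest known g per cell
    while open_heap:
        f, g, (r, c) = heapq.heappop(open_heap)
        if (r, c) == goal:
            return g
        if g > best.get((r, c), float("inf")):
            continue                     # stale heap entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))  # 6
```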


It is worth pointing out here that Quicksort has complexity of Ɵ(n²) in the worst case. Yet it has worked out as the best one in practice in many applications.
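For anyone curious, that worst case is easy to demonstrate: a naive quicksort with a first-element pivot does a quadratic number of comparisons on already-sorted input. A sketch (the pivot rule and the comparison counting are purely for illustration):

```python
import random

def counting_quicksort(xs):
    """Naive quicksort with a first-element pivot; returns the sorted
    list and the number of comparisons. Illustrative only."""
    comparisons = 0
    def qs(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        comparisons += len(rest)  # one comparison per remaining element
        return (qs([x for x in rest if x < pivot]) + [pivot]
                + qs([x for x in rest if x >= pivot]))
    return qs(xs), comparisons

n = 300
_, worst = counting_quicksort(list(range(n)))              # already sorted
_, typical = counting_quicksort(random.sample(range(n), n))  # shuffled
print(worst)    # n*(n-1)/2 = 44850 comparisons: quadratic
print(typical)  # usually a few thousand: roughly n log n
```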


That's true, but it's also worth pointing out that the worst case can be trivially prevented by shuffling the array before you Quicksort it. Which is maybe a confirmation of your larger point.


The principled way to avoid bad worst-case performance for quicksort is to select the pivot using a median of medians approach. That gives you good asymptotic performance even for adversarial inputs, but in practice it tends to be slower. Which fits well with the theme of this article.
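For reference, a sketch of that approach in Python (illustrative, not tuned - the list copies alone are why it tends to lose in practice):

```python
def select(xs, k):
    """Return the k-th smallest element (0-indexed) using the
    median-of-medians pivot rule: linear time even in the worst case."""
    if len(xs) <= 5:
        return sorted(xs)[k]
    groups = [xs[i:i + 5] for i in range(0, len(xs), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    pivot = select(medians, len(medians) // 2)  # median of the medians
    lo = [x for x in xs if x < pivot]
    hi = [x for x in xs if x > pivot]
    n_eq = len(xs) - len(lo) - len(hi)
    if k < len(lo):
        return select(lo, k)
    if k < len(lo) + n_eq:
        return pivot
    return select(hi, k - len(lo) - n_eq)

def mm_quicksort(xs):
    """Quicksort whose pivot is the exact median found via select():
    worst-case O(n log n), but with a hefty constant factor."""
    if len(xs) <= 1:
        return xs
    pivot = select(xs, len(xs) // 2)
    return (mm_quicksort([x for x in xs if x < pivot])
            + [x for x in xs if x == pivot]
            + mm_quicksort([x for x in xs if x > pivot]))
```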


You can trivially achieve a bounded worst case with minimal overhead by doing median of medians only on, say, every fifth level of recursion.

In the average case you do the expensive median of medians on a small fraction of the data and so only pay a few percent average penalty. And in the worst case you still get n log(n) performance.
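Something like the scheme described, sketched in Python - the depth bookkeeping is illustrative, and the exact-median fallback here just sorts for brevity, where a real version would use a linear-time select:

```python
def hybrid_quicksort(xs, depth=0):
    """Quicksort that uses a cheap first-element pivot, except on every
    fifth recursion level, where it uses the exact median. Degenerate
    splits can persist for at most four levels in a row, which bounds
    the recursion depth and hence the worst case."""
    if len(xs) <= 1:
        return xs
    if depth % 5 == 4:
        # Exact median via sorting for brevity; use a linear-time
        # select() here to keep the worst case at O(n log n).
        pivot = sorted(xs)[len(xs) // 2]
    else:
        pivot = xs[0]
    return (hybrid_quicksort([x for x in xs if x < pivot], depth + 1)
            + [x for x in xs if x == pivot]
            + hybrid_quicksort([x for x in xs if x > pivot], depth + 1))
```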


How is the worst case not n^2 in the cases where you don't use median of medians?


Every fifth pivot is chosen to be an actual median.


The median-of-medians approach is also somewhat tricky to implement (avoiding array copies and so on), although it admittedly looks neat on the slides of the algorithms lectures.


Strictly speaking, it is not true that shuffling first avoids the worst case. Given a deterministic shuffling function, some permutation of input needs to shuffle to sorted order, which will again trigger the worst case.

Of course, shuffling is still potentially helpful because ascending/descending order is common in many applications.


Randomization of the pivots produces O(n log n) average running time, with a vanishingly small probability of anything worse for any value of n at which the worst-case running time would matter.


Even without randomization of pivots expected run time is O(n log n). If the order of the input data is random, I don't believe randomizing pivots changes the distribution of running times.

What changes is that one very common permutation of data (ascending data) does not have O(n^2) performance.


There's little justification for expecting data to behave randomly unless our algorithm introduces it.


Correct. In general, real data will seldom, if ever, look like random data.


Shuffling with a too-small seed size will render the shuffle susceptible to brute-force attacks; but that in no way invalidates its use as a dodge for pathological (pre-sorted) arrays submitted to Quicksort in the course of normal usage. There's a better chance of getting eaten by a dinosaur than finding O(n^2) quicksort performance on shuffled data that comes from anything other than an attacking adversary.


You're right that shuffling doesn't avoid the worst case, but your explanation isn't quite correct. See my other comment for my explanation.

I'm not sure what you mean by deterministic shuffling. Shuffling is supposed to be random, and therefore non-deterministic. In practice, you'd probably use a pseudorandom function to implement shuffling, but the goal of pseudorandom functions is to be indistinguishable from random functions, so that detail shouldn't matter.


Shuffling can never prevent the worst case of any algorithm. "Worst case" means the worst out of all possible cases. Shuffling doesn't change the set of cases, and it doesn't change anything about any individual case.

Probably what you meant was that shuffling makes the worst case very unlikely.

On a practical level, if you have to shuffle the data before sorting to make your sort algorithm work better, you've most likely already blown any performance advantage. So you might as well choose a different sort algorithm.


I believe Big Theta means bounded above and below by `n^2`, where the bounds differ by a constant factor. Saying that Quicksort is Ɵ(n^2) means that it has worst case `n^2` AND best case `n^2`. But the best case of Quicksort is `n`.

So Quicksort is O(n^2); the worst case is <= cn^2, for a constant c.

Sorry to bring it up. I was wondering if I wanted to be the annoying guy who nitpicks on that; but then I thought you might be interested in the difference.

   [edit1: was wrong, don't want to mislead]
   [edit2: not sure I was wrong anymore, this is confusing]


No, it means the worst case is bounded above and below by n^2. That is, the worst-case scales no worse or better than n^2, but approximately proportional, asymptotically. (Whereas, for example, every O(n^2) algorithm also happens to be O(2^n))

The best case is something entirely separate, and can itself be bounded from above or below.


It doesn't make sense to say there's an upper and lower bound on the "worst" case. If that were true, the upper bound would be the new worst case, and the lower bound would be some middle case. A best or worst case is inherently always going to be big-theta.


The Ɵ here talks about our ability to pin down the worst case: if we say the worst case is Ɵ(n^2), we're saying that we can tell it is at least quadratic, and also no worse than quadratic. We might not be so good at the math and have only discovered that the worst case was at least linear and no worse than exponential (which is also true of quicksort).


To help you wrt your edit 2, you seem to be confusing two things:

- the Ɵ() and O() notations can be used on mathematical functions. For example 5n^2 + 3n = Ɵ(n^2) [^1] or 3n^2 = O(n^3).

- algorithms have different complexities: for the best case, the worst case, or the average case. Each corresponds to its own mathematical function.

Each of these functions can be given appropriate Ɵ() or O() bounds.

[^1]: really meaning λn.5n^2 + 3n ∈ Ɵ(λn.n^2), it's a relation between functions.


Except you are wrong. O(x) is an upper bound, Omega(x) is a lower bound, and something is Theta(x) iff it is bounded by Omega(x) and O(x) where (x) is the same[1].

And if a stackoverflow link isn't good enough, then go read CLRS, Dasgupta, or Skiena.

Quicksort has no Theta(x): it is O(n^2) and Omega(n log n), with an expected runtime of n log n. The reason Quicksort performs so well is that it is only in very specialized cases that it devolves to its worst case.

[1] http://stackoverflow.com/questions/10376740/big-theta-notati...
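Spelled out, the usual textbook definitions are:

```latex
f(n) \in O(g(n))      \iff \exists\, c > 0,\ n_0 :\ 0 \le f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0 \\
f(n) \in \Omega(g(n)) \iff \exists\, c > 0,\ n_0 :\ 0 \le c \cdot g(n) \le f(n) \ \text{for all } n \ge n_0 \\
f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \ \text{and}\ f(n) \in \Omega(g(n))
```

Note that each of an algorithm's running-time functions (best case, worst case, average case) gets its own such bound.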

Edit: Replied to the wrong commenter, so the replied to post isn't wrong, the grandparent post is.


After reviewing, I've understood two meanings people give to Theta(n):

    1. Theta(n) means that it's O(n), O(n^2), O(...), but 
       Theta(n) is a proper, tighter definition of the upper
       bound.  It doesn't say anything about the lower bound.

    2. Theta(n) means that it's tightly bounded up and down 
       by n: it is both Omega(n) and O(n), and thus 
       Theta(n).
I think 2 is the proper definition (thus my 2nd edit), but I'm not totally sure. I've left my original comment there with the edit chain in the hope that discussion would bring up the proper definition. Unfortunately I'm getting downvoted for that, which I find odd.


I was confused at first :)


Saying it is bounded by O(n²) is too vague, because it's the same as being bounded by O(n³) and so on. A linear algorithm is also O(n²) and O(n³).

That is why I emphasized that the worst case in particular has its bound within a constant factor of n², hence Ɵ.


I think the general point is that problems just aren't that interesting for small n.


Well the even-more-general point is that these problems usually aren't that interesting in real life, then.



