Wavelets (1994) [pdf] (cybertester.com)
98 points by rubenbe on Aug 14, 2019 | 46 comments


Something that blew my mind the first time I learned it:

You can think of a function f(x) as the limit of an infinitely big vector where the entries index the infinitesimal. Eg f(x) = [...f(-2dx), f(-dx), f(0), f(dx), f(2dx)...]. The dot product of two functions (f,g) is still the usual sum[f(x)*g(x)] but the sum is replaced with an integral. Sin and Cos of integer frequencies happen to have a dot product of 0 (check it) which means doing a change of basis into those functions (aka the Fourier transform) happens to work out really nicely. Other than that Sin and Cos are not really privileged. For example you can do a transform into the basis of polynomial functions if you wanted to (aka the Taylor series). Any basis you can cook up would do just as well so long as its a complete basis. Just like in linear algebra, you have a complete basis if you can use linear combinations to construct the vectors ... [...1,0,0...], [...0,1,0...], [...0,0,1...] ... (aka the dirac delta functions). Differentiation is a matrix with dx on the off diagonal and -dx on the diagonal.
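The orthogonality claim is easy to check numerically: approximate the integral inner product by a Riemann sum on a fine grid over one period (a sketch; `inner` is a made-up helper name):

```python
import numpy as np

# Approximate the "dot product" of two functions on [0, 2*pi]
# as a Riemann sum: (f, g) ~ sum(f(x) * g(x)) * dx.
x = np.linspace(0, 2 * np.pi, 100_000, endpoint=False)
dx = x[1] - x[0]

def inner(f, g):
    return np.sum(f(x) * g(x)) * dx

# sin at different integer frequencies, and sin vs cos, are orthogonal;
# sin against itself is not (its squared norm is pi).
print(inner(np.sin, lambda t: np.sin(2 * t)))  # ~0
print(inner(np.sin, np.cos))                   # ~0
print(inner(np.sin, np.sin))                   # ~pi
```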


It's a bit weirder than this suggests. You're right that with a complete basis you can encode your function as a weighted sum of basis vectors, where the weights are inner products of the function against the dual vectors of that basis, and the basis vectors form a complete orthonormal basis (we can relax that last bit).

So that's great, but what do the basis vectors look like? You need a family of functions such that scalings and translations of them have inner product 0 against each other. This is easy if the shared support is 0, or if the +ve and -ve parts cancel out. It doesn't take too much playing about to see how this works with the Haar basis, or sin/cos (i.e. FFT), so it looks pretty intuitive.
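Both mechanisms show up in a small discrete Haar sketch (the `haar` helper is hypothetical): translates at the same scale have disjoint support, while a finer wavelet sitting inside a coarser one is killed by the +ve/-ve cancellation:

```python
import numpy as np

def haar(n, scale, shift):
    """Discrete Haar wavelet on n samples at a dyadic scale/shift."""
    w = np.zeros(n)
    width = n // 2**scale
    half = width // 2
    start = shift * width
    w[start:start + half] = 1.0          # +ve half
    w[start + half:start + width] = -1.0  # -ve half
    return w / np.sqrt(width)             # unit norm

a = haar(8, 1, 0)   # coarse wavelet on the first half
b = haar(8, 1, 1)   # same scale, shifted: disjoint support
c = haar(8, 2, 0)   # finer wavelet inside a's +ve half: parts cancel
print(a @ b, a @ c)  # both exactly 0
```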

It also seems kind of intuitive that these might be the only ways to do this... but that's wrong.

It turns out you can construct other families that work, but unlike the nicely behaved sin/cos or step functions, they are not smooth and not symmetric (at least for orthonormal bases) - quite odd. They don't have closed forms, so if you want to see what one looks like you'll have to realize it as a fixed point or by some other approximation method: e.g. https://en.wikipedia.org/wiki/Daubechies_wavelet#/media/File...


> For example you can do a transform into the basis of polynomial functions if you wanted to (aka the Taylor series). Any basis you can cook up would do just as well so long as its a complete basis.

Just be warned that the polynomial you get depends on where you center your Taylor series. There are multiple different Taylor-series approximations for the same function, and each one only converges within a disk in the complex plane which does not include any poles.

If you try to fit the Runge function 1 / (1 + 25x^2) with a Taylor series centered at 0 you’re going to have a bad time.

https://www.desmos.com/calculator/uyjyjug07m
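The failure is easy to check numerically: the Taylor series of 1/(1 + 25x^2) about 0 is a geometric series in -25x^2, and the poles at x = ±i/5 limit convergence to |x| < 0.2 (a sketch; `taylor_runge` is a made-up name):

```python
# Taylor series of the Runge function 1/(1 + 25x^2) about 0:
# sum over k of (-25)^k * x^(2k), a geometric series in -25x^2.
def taylor_runge(x, terms):
    return sum((-25.0) ** k * x ** (2 * k) for k in range(terms))

def runge(x):
    return 1.0 / (1.0 + 25.0 * x ** 2)

# Inside the radius of convergence the partial sums match closely;
# outside it they blow up no matter how many terms you take.
print(abs(taylor_runge(0.1, 50) - runge(0.1)))  # tiny
print(abs(taylor_runge(0.3, 50) - runge(0.3)))  # astronomically large
```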


What should I google to read more about this kind of thing?


These kinds of heuristics are supposedly a prerequisite to the formal study, but sadly most textbooks don't spend time on them (since logically they're not necessary). Instead of reading up on functional analysis (Banach spaces and Hilbert spaces), read bits of its history to get a taste. Here's one that's not too long: http://courses.mai.liu.se/GU/TATM85/FA-history.pdf



Wavelets were supposed to be a big thing like 15-20 years ago ... what happened? Do any mainstream modern codecs use wavelets? I'm not aware of any.

Edit: looks like the Dirac video codec is based on wavelets. Good to know all of that research wasn’t for nothing!


I attended a talk by Stephane Mallat a few months back about the work his lab is doing on wavelet scattering transforms. It seems they're making great progress in developing deep neural networks based on them that don't need to be learned.

It's not as hyped as mainstream deep learning methods, but I think it holds a lot of promise since it cuts down on learning time, is mostly unsupervised, gives you control over the network's features, and is a more theoretically principled way to build networks than just defining the loss and crossing your fingers while it optimizes, as we're doing right now (a bit of hyperbole).

I could go on, it's absolutely fascinating. More info (in English & French): https://edouardoyallon.github.io/thesis.pdf


I'm not familiar with this particular research direction.

On the more general area Mallat has a good and approachable (not very technical) book on the area: "A Wavelet Tour of Signal Processing". Worth a look for anyone intrigued by wavelets.


My old research group was working on stuff relating to scattering network theory and application. The tl;dr is it's got nice math but isn't practically useful for much. The network construction also isn't as similar to, e.g., convnets as some of the scattering literature suggested.


I'm partly speaking out of my ass from second-hand experience 10 years ago but...

Wavelet-based codecs turned out to be much harder to implement in hardware than Fourier-transform-based ones (for which we have the FFT!). That disincentivized development, which is why you only have Dirac (an open-source, non-HW-industry-backed effort) that uses it.


Not a codec, but still image-related: one of the most powerful modules in Darktable is the equalizer, which is based on wavelets[0][1].

[0] http://www.darktable.org/2011/11/darktable-and-research/

[1] https://www.roberthutton.net/archives/203


The problem is that the economics of wavelets, or the TCO of a wavelet codec, don't work out well. And like everything in our world, it is only the cost and value on offer that matter. The total energy (CPU cycles, or hardware implementation) required to encode and decode far outweighs the benefits on offer. Not to mention the current block-based approach still has many tricks left. VVC (H.266) seems to be doing well (patent issues aside).


Do you happen to know what's the encoding time for a DWT based encoder vs H.264? Assuming the same hardware, of course...

Are we talking about roughly the same encoding time, or the DWT is slower by an order of magnitude?


According to the x264 developers (can't find the reference anymore, sorry), wavelet based lossy compression is good for optimizing for PSNR, not for perceptual image/audio quality, being prone to loss of detail. x264 beats Dirac and modern image compression formats like HEIF (based on HEVC) easily beat JPEG2000. Modern codecs usually use transforms derived from the classic DCT and DST.


There was even a whole pop-sci book, Hubbard's "The World According to Wavelets" (1996), that was a bit like Mandelbrot's "The Fractal Geometry of Nature" in that it predicted an amazing future for the technique and claimed that science would be revolutionized by it. But it is nice to see that wavelets, like fractals, do have their uses even if the revolution never came.


The DjVu ebook format uses wavelet compression. From my experience it's painfully slow.


Wavelets are the basis of modern video codecs. People also use them to watermark videos. I never found much advanced application in image processing which was my field of research.


There's no wavelet-based video codec in widespread use (unless one counts DCP, a file format used for physical deliveries of movies to theaters, where the picture track is encoded as JPEG2000 frames).


H.264 is pretty widespread.


H.264 is not a wavelet-based video codec.


JPEG2000, I believe.


Blocked by Patent-bullshit


I believe either the patents have expired or the basic functionality was always royalty-free; but what really killed JPEG2000 was the immense complexity and increased processing power required. It still has applications in some niche areas, and probably most people's contact with it will be in the form of images embedded in PDFs (many of archive.org's ebooks use JPEG2000), where the noticeably slower rendering of pages is a big turn-off.


Also digital cinema is pretty much all JPEG2000 (every frame is individually compressed).


ImMix Cube products maybe? My memory fades.


Gilbert Strang is a delightful writer.

"What do you say to a thesis student you don't remember? In that position I suggest something very short: 'Tell me more.' The most amazing part was his thesis topic. 'I am designing the filter bank for MIT's entry in the HDTV competition.' Some days you can't lose, even if you deserve to."


I wrote my master's thesis on wavelets! A few years back they morphed into shearlets and contourlets.

https://en.wikipedia.org/wiki/Contourlet

https://en.wikipedia.org/wiki/Shearlet


They didn't really morph; these were approaches to deal with a fundamental issue in signal representation in higher dimensions that comes up when you use tensor-product representations. Simply put, wavelets are not good at capturing edges in signals of dimension > 1 (although they are great at it in dimension 1).

The fundamental (functional analysis) research in this area split in a few directions. One, as you mention, was mostly driven by imaging research, but there were also frame theory, non-tensor-product bases, etc.


My thesis defense was about wavelets, and it was in 1995! Had to implement my own FFT and render 3D charts in Excel...



I love how you can decompose one image in wavelets and edit just some layers of it (in GIMP), and those retouches end up as subtle improvements of the image.


The 2017 Abel Prize was awarded to Yves Meyer (no relation) for his contribution to the theory of wavelets. This is worth a read: https://www.quantamagazine.org/yves-meyer-wavelet-expert-win...


We explored using fast wavelet transforms to compress EKG samples for an ambulatory cardiac monitoring device, but the risks of regulators not understanding the math behind the compression meant we walked away from a ~3:1 compression of the sampled waveforms and attendant bandwidth savings.


I'm surprised bandwidth was a consideration at all given the slow signals involved. What's a typical Nyquist rate used when sampling an EKG?


Probably 500 Hz. Haha


Worth reading, but boy is that font painful. Looks like one of the older bitmapped LaTeX CM fonts?


Implemented FDWT as part of a coursework project many many years ago when it was hip :)


Who gets taught wavelets now?

Are they supposed to displace Fourier decompositions, or did that fizzle?


They accomplish different things. Fourier decompositions break signals down into a frequency based representation, whereas wavelet decomposition yields a sort of hybrid frequency/temporal representation. Wavelets have their uses, but I don't know that anyone studying them thought they would replace Fourier decompositions.


It's a pretty standard signal processing topic. It doesn't replace the FFT, but is used similarly to a windowed FFT.

Definitely has its uses. Definitely isn't a panacea.


how does this compare to Fast Fourier transform ?

https://en.wikipedia.org/wiki/Fast_Fourier_transform

Or the Discrete Cosine Transform ?

https://en.wikipedia.org/wiki/Discrete_cosine_transform

Or the Gabor Transform ?

https://en.wikipedia.org/wiki/Gabor_transform


A wavelet transform naturally gives a spectrogram. The Fourier transform of a signal gives you only its frequency content, with no idea of where in time each frequency occurred. A wavelet transform naturally gives both time and frequency information. Of course, if you take the Fourier transform in blocks at discrete time intervals, you can also construct a spectrogram.
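A minimal sketch of why wavelet coefficients are time-localized, using one level of a Haar DWT (the `haar_step` helper is hypothetical, just normalized averages and differences of adjacent sample pairs):

```python
import numpy as np

def haar_step(x):
    """One level of a Haar DWT: pairwise averages (approximation)
    and pairwise differences (detail), each scaled by 1/sqrt(2)."""
    x = np.asarray(x, dtype=float)
    avg = (x[0::2] + x[1::2]) / np.sqrt(2)
    det = (x[0::2] - x[1::2]) / np.sqrt(2)
    return avg, det

signal = np.zeros(16)
signal[10] = 1.0           # a transient at sample 10
avg, det = haar_step(signal)
# Only the detail coefficient covering samples 10-11 is nonzero,
# so the transform tells you *where* the transient happened.
print(np.nonzero(det)[0])  # [5]
```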


In what sense?


Anyone know good sources on shift invariant wavelets?


The dual tree complex wavelet transform is approximately shift invariant https://en.wikipedia.org/wiki/Complex_wavelet_transform#Dual...



