Something that blew my mind the first time I learned it:
You can think of a function f(x) as the limit of an infinitely big vector whose entries are indexed by infinitesimal steps. Eg f(x) = [...f(-2dx), f(-dx), f(0), f(dx), f(2dx)...]. The dot product of two functions (f,g) is still the usual sum[f(x)*g(x)], but the sum is replaced with an integral. Sin and cos of distinct integer frequencies happen to have a dot product of 0 (check it), which means doing a change of basis into those functions (aka the Fourier transform) happens to work out really nicely. Other than that sin and cos are not really privileged. For example you can do a transform into the basis of polynomial functions if you wanted to (aka the Taylor series). Any basis you can cook up would do just as well so long as it's a complete basis. Just like in linear algebra, you have a complete basis if you can use linear combinations to construct the vectors ... [...1,0,0...], [...0,1,0...], [...0,0,1...] ... (aka the Dirac delta functions). Differentiation is a matrix with 1/dx on the off diagonal and -1/dx on the diagonal (a forward difference, in the dx -> 0 limit).
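A quick numerical sanity check of both claims (a sketch in NumPy; the grid of 2000 samples over one period is an arbitrary choice): the discretized inner product really does send sin/cos of distinct integer frequencies to roughly zero, and the forward-difference matrix really does act like d/dx.

```python
import numpy as np

# Discretize one period [0, 2*pi) into N samples (N is arbitrary); the "dot
# product" of two functions becomes sum(f * g) * dx, approximating the integral.
N = 2000
dx = 2 * np.pi / N
x = np.arange(N) * dx

def inner(f, g):
    return np.sum(f * g) * dx

# sin/cos of distinct integer frequencies are orthogonal:
print(inner(np.sin(3 * x), np.sin(5 * x)))  # ~0
print(inner(np.sin(3 * x), np.cos(3 * x)))  # ~0
print(inner(np.sin(3 * x), np.sin(3 * x)))  # ~pi: a vector isn't orthogonal to itself

# Differentiation as a matrix: -1/dx on the diagonal, +1/dx on the superdiagonal
# (a forward difference). Applied to sin it gives approximately cos.
D = (np.diag(np.full(N, -1.0)) + np.diag(np.ones(N - 1), k=1)) / dx
approx_cos = D @ np.sin(x)
print(np.max(np.abs(approx_cos[:-1] - np.cos(x[:-1]))))  # O(dx), i.e. small
```

(The last sample is excluded from the comparison because the forward difference has nothing to the right of it.)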
It's a bit weirder than this suggests. You're right that as long as you have a complete basis you can encode your function as a weighted sum of basis vectors, where the weights are inner products of the function against the dual vectors of that basis, and where the basis vectors form a complete orthonormal basis (we can relax that last bit).
So that's great, but what do the basis vectors look like? You need a family of functions such that scalings and translations of them have inner product 0 against each other. This is easy if the shared support is 0, or if the +ve and -ve parts cancel out. It doesn't take too much playing about to see how this works with the Haar basis, or with sin/cos (i.e. the FFT), so it looks pretty intuitive.
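A small sketch checking this for the Haar family (the grid size is arbitrary): scaled and translated copies psi_{j,k} are orthonormal via exactly the two mechanisms mentioned, either disjoint support or +/- parts cancelling inside the shared support.

```python
import numpy as np

# Sample the Haar mother wavelet (+1 on [0,1/2), -1 on [1/2,1)) on a fine grid,
# then check that scaled/translated copies
# psi_{j,k}(x) = 2^(j/2) * psi(2^j * x - k) are orthonormal.
N = 2**12
dx = 1.0 / N
x = np.arange(N) * dx

def psi(t):
    return np.where((t >= 0) & (t < 0.5), 1.0, 0.0) - np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0)

def haar(j, k):
    return 2.0 ** (j / 2) * psi(2**j * x - k)

def inner(f, g):
    return np.sum(f * g) * dx

print(inner(haar(0, 0), haar(1, 0)))  # 0: shared support, but +/- parts cancel
print(inner(haar(1, 0), haar(1, 1)))  # 0: disjoint support
print(inner(haar(2, 1), haar(2, 1)))  # 1: each psi_{j,k} has unit norm
```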
It also seems kind of intuitive that these might be the only ways to do this... but that's wrong.
It turns out you can construct other families that work. Unlike the nicely behaved sin/cos or step functions, they are not smooth and not symmetric (at least for an orthonormal basis) - quite odd. They don't have closed forms, so if you want to see what one looks like you'll have to realize it as a fixed point or by some other approximation method: e.g. https://en.wikipedia.org/wiki/Daubechies_wavelet#/media/File...
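To make the "fixed point" remark concrete, here's a minimal sketch of the cascade algorithm for the Daubechies-2 (D4) scaling function. The four filter coefficients are the standard published ones; everything else (refinement depth, variable names) is an arbitrary choice.

```python
import numpy as np

# The D4 (Daubechies-2) scaling function phi satisfies the refinement equation
# phi(x) = sqrt(2) * sum_k h[k] * phi(2x - k), so it can be approximated as the
# fixed point of "upsample, then convolve with sqrt(2)*h" (the cascade algorithm).
s = np.sqrt(3.0)
h = np.array([1 + s, 3 + s, 3 - s, 1 - s]) / (4 * np.sqrt(2))  # sums to sqrt(2)

phi = np.array([1.0])
levels = 10                         # arbitrary refinement depth
for _ in range(levels):
    up = np.zeros(2 * len(phi))
    up[::2] = phi                   # upsample by inserting zeros
    phi = np.sqrt(2.0) * np.convolve(up, h)

dx = 2.0 ** -levels                 # grid spacing after `levels` refinements
print(phi.sum() * dx)               # the scaling function integrates to 1
# Plotting phi reveals a continuous but jagged, asymmetric curve with no
# closed form, matching the picture linked above.
```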
> For example you can do a transform into the basis of polynomial functions if you wanted to (aka the Taylor series). Any basis you can cook up would do just as well so long as its a complete basis.
Just be warned that the polynomial you get depends on where you center your Taylor series. There are multiple different Taylor-series approximations for the same function, and each one only converges within a disk in the complex plane which does not include any poles.
If you try to fit the Runge function 1 / (1 + 25x^2) with a Taylor series centered at 0 you’re going to have a bad time.
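A quick illustration of that bad time (plain Python; the sample points 0.1 and 0.3 are arbitrary): inside the radius of convergence |x| < 1/5 the partial sums converge fast, while just outside it they blow up.

```python
# Taylor series of the Runge function 1/(1 + 25x^2) about 0:
# sum over n of (-25)^n * x^(2n), which converges only for |x| < 1/5
# (the poles sit at x = +/- i/5 in the complex plane).
def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

def taylor_partial_sum(x, terms):
    return sum((-25.0) ** n * x ** (2 * n) for n in range(terms))

# At x = 0.1 the 30-term partial sum matches runge(0.1) closely;
# at x = 0.3 it has blown up to an enormous value.
for x in (0.1, 0.3):
    print(x, runge(x), taylor_partial_sum(x, 30))
```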
These kinds of heuristics are supposedly a prerequisite to the formal study, but sadly most textbooks don't spend time on them (since logically they're not necessary). Instead of reading up on functional analysis (Banach spaces and Hilbert spaces), read bits of its history to get a taste. Here's one that's not too long: http://courses.mai.liu.se/GU/TATM85/FA-history.pdf
I attended a talk by Stephane Mallat a few months back about the work his lab is doing on wavelet scattering transforms.
It seems they're making great progress in developing deep neural networks based on them that don't need to be learned.
It's not as hyped as mainstream deep learning methods, but I think it holds a lot of promise: it cuts down on learning time, is mostly unsupervised, gives you control over the network's features, and is a more theoretically principled way to build networks than just defining the loss and crossing your fingers while it optimizes, as we're doing right now (a bit of hyperbole).
I'm not familiar with this particular research direction.
On the more general area, Mallat has a good and approachable (not very technical) book: "A Wavelet Tour of Signal Processing". Worth a look for anyone intrigued by wavelets.
My old research group was working on stuff relating to scattering network theory and application. The tl;dr is it's got nice math but isn't practically useful for much. The network construction also isn't as similar to, e.g., convnets as some of the scattering literature suggested.
I'm partly speaking out of my ass from second-hand experience 10 years ago but...
Wavelet-based codecs turned out to be much harder to implement in hardware than Fourier-transform-based ones (for which we have the FFT!). That disincentivized development, which is why you only have Dirac (an open-source effort without hardware-industry backing) that uses them.
The problem is that the economics of wavelets, i.e. the TCO of a wavelet codec, don't work out. And like everything in our world, it is only the costs and the value on offer that matter.
The total energy (CPU cycles, or hardware implementation) it requires to encode and decode far outweighs the benefits it offers. Not to mention the current block-based approach still has many tricks left. VVC (H.266) seems to be doing well (patent issues aside).
According to the x264 developers (can't find the reference anymore, sorry), wavelet based lossy compression is good for optimizing for PSNR, not for perceptual image/audio quality, being prone to loss of detail.
x264 beats Dirac, and modern image compression formats like HEIF (based on HEVC) easily beat JPEG2000.
Modern codecs usually use transforms derived from the classic DCT and DST.
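To make the transform-coding idea concrete, here's a toy sketch: a DCT-II written straight from its definition, applied to a made-up 8-sample block (real codecs use fast, normalized implementations, so this is only illustrative).

```python
import numpy as np

# DCT-II straight from its definition: X[k] = sum_n x[n] * cos(pi*(n+0.5)*k/N).
# Transform coding works because smooth blocks concentrate their energy in a
# few low-frequency coefficients, which can then be quantized coarsely.
def dct2(x):
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.cos(np.pi * (n + 0.5) * k / N)) for k in range(N)])

block = np.linspace(0.0, 1.0, 8)        # a smooth made-up 8-sample "image row"
X = dct2(block)
energy = X**2
print(energy[:2].sum() / energy.sum())  # nearly all energy in the 2 lowest coefficients
```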
There was even a whole pop-sci book, Hubbard's "The World According to Wavelets" (1996), that was a bit like Mandelbrot's "The Fractal Geometry of Nature" in that it predicted an amazing future for the technique and claimed that science would be revolutionized by it. But it is nice to see that wavelets, like fractals, do have their uses even if the revolution never came.
Wavelets are the basis of modern video codecs. People also use them to watermark videos. I never found much advanced application in image processing which was my field of research.
There's no wavelet-based video codec in widespread use (unless one counts DCP, a file format used for physical deliveries of movies to theaters, where the picture track is encoded as JPEG2000 frames).
I believe either the patents have expired or the basic functionality was always royalty-free; but what really killed JPEG2000 was the immense complexity and increased processing power required. It still has applications in some niche areas, and probably most people's contact with it will be in the form of images embedded in PDFs (many of archive.org's ebooks use JPEG2000), where the noticeably slower rendering of pages is a big turn-off.
"What do you say to a thesis student you don't remember? In that position I suggest something
very short: 'Tell me more.' The most amazing part was his thesis topic. 'I am designing the filter
bank for MIT's entry in the HDTV competition.' Some days you can't lose, even if you deserve to."
They didn't really morph; these were approaches to deal with a fundamental issue in signal representation in higher dimensions that comes up when you use tensor-product representations. Simply put, wavelets are not good at capturing edges in signals of dimension > 1 (although they are great at it in dimension 1).
The fundamental (functional analysis) research in this area split in a few directions: one, as you mention, mostly driven by imaging research, but also frame theory, non-tensor-product bases, etc.
I love how you can decompose one image in wavelets and edit just some layers of it (in GIMP), and those retouches end up as subtle improvements of the image.
We explored using fast wavelet transforms to compress EKG samples for an ambulatory cardiac monitoring device, but the risks of regulators not understanding the math behind the compression meant we walked away from a ~3:1 compression of the sampled waveforms and attendant bandwidth savings.
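For the curious, here's a generic sketch of the idea (a multilevel Haar transform plus simple coefficient thresholding; this is not the actual device pipeline, and the test waveform is made up):

```python
import numpy as np

# Wavelet compression in miniature: a multilevel Haar transform (averages and
# differences at each level), keep only the largest coefficients, then invert.
def haar_fwd(x):
    out = x.copy()
    n = len(x)
    while n > 1:
        a = (out[:n:2] + out[1:n:2]) / np.sqrt(2)   # averages
        d = (out[:n:2] - out[1:n:2]) / np.sqrt(2)   # details
        out[: n // 2], out[n // 2 : n] = a, d
        n //= 2
    return out

def haar_inv(c):
    out = c.copy()
    n = 2
    while n <= len(c):
        a, d = out[: n // 2].copy(), out[n // 2 : n].copy()
        out[:n:2] = (a + d) / np.sqrt(2)
        out[1:n:2] = (a - d) / np.sqrt(2)
        n *= 2
    return out

# A smooth-ish made-up waveform standing in for a sampled signal.
t = np.linspace(0, 1, 256)
x = np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 17 * t)

c = haar_fwd(x)
kept = int(len(c) / 3)                  # ~3:1 "compression": keep a third
drop = np.argsort(np.abs(c))[:-kept]    # zero out the smallest coefficients
c2 = c.copy()
c2[drop] = 0.0
err = np.max(np.abs(haar_inv(c2) - x))
print(err)                              # modest reconstruction error for ~3:1
```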
They accomplish different things. Fourier decompositions break signals down into a frequency based representation, whereas wavelet decomposition yields a sort of hybrid frequency/temporal representation. Wavelets have their uses, but I don't know that anyone studying them thought they would replace Fourier decompositions.
A wavelet transform naturally gives a spectrogram. The Fourier transform of a signal gives you only its frequency content, with no idea of where in time each frequency occurred. A wavelet transform gives both time and frequency information at once. Of course, if you take the Fourier transform over discrete blocks of time, you can also construct a spectrogram that way.
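A toy illustration of that last point (made-up signal; sample rate, frequencies, and block length are arbitrary round numbers): one FFT over the whole signal finds both tones but not their timing, while block-wise FFTs recover it.

```python
import numpy as np

# Two tones in sequence: 4 Hz during the first half-second, 20 Hz during the second.
fs = 1000
t = np.arange(fs) / fs
x = np.where(t < 0.5, np.sin(2 * np.pi * 4 * t), np.sin(2 * np.pi * 20 * t))

# One FFT over the whole signal shows energy at both 4 and 20 Hz (bin spacing
# is 1 Hz here, so bin index == frequency), but says nothing about *when*...
spectrum = np.abs(np.fft.rfft(x))
top2 = sorted(int(b) for b in np.argsort(spectrum)[-2:])
print(top2)  # [4, 20]

# ...whereas transforming each half separately recovers the timing. This is the
# crudest possible spectrogram; wavelets do the same thing multi-scale.
first = np.abs(np.fft.rfft(x[:500]))
second = np.abs(np.fft.rfft(x[500:]))
print(np.argmax(first) * fs / 500)   # 4.0 Hz in the first block
print(np.argmax(second) * fs / 500)  # 20.0 Hz in the second block
```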