Very cool! If anyone who comes across this thread knows of papers or research on this topic, I'd be very interested to learn more.
There is an award-winning compressor, cmix, that combines many statistical models (and a 3-layer dense neural network) to compress data losslessly: http://www.byronknoll.com/cmix.html
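To give a rough feel for the general idea (several statistical models whose predictions are combined by a small learned mixer), here is a toy sketch in Python. It is not taken from cmix and does not reflect its actual architecture: the `CountModel` class, the `mixed_code_length` function, the context orders, and the learning rate are all hypothetical choices made for illustration. Each model predicts the probability that the next bit is 1, a single-layer logistic mixer combines those predictions, and the ideal cost an arithmetic coder would pay, -log2(p), is accumulated.

```python
import math


def stretch(p):
    """Map a probability in (0, 1) to the logit domain."""
    return math.log(p / (1.0 - p))


def squash(x):
    """Inverse of stretch: map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


class CountModel:
    """Hypothetical toy model: predicts P(next bit = 1) from bit counts
    observed after the last `order` bits of history."""

    def __init__(self, order):
        self.order = order
        self.counts = {}  # context tuple -> (zeros seen, ones seen)

    def _context(self, history):
        return tuple(history[-self.order:]) if self.order else ()

    def predict(self, history):
        # Counts start at (1, 1) so the prediction is never exactly 0 or 1.
        n0, n1 = self.counts.get(self._context(history), (1, 1))
        return n1 / (n0 + n1)

    def update(self, history, bit):
        ctx = self._context(history)
        n0, n1 = self.counts.get(ctx, (1, 1))
        self.counts[ctx] = (n0 + (bit == 0), n1 + (bit == 1))


def mixed_code_length(bits, orders=(0, 1, 2), lr=0.02):
    """Return the ideal code length (in bits) for `bits` when the models'
    predictions are combined by an online-trained logistic mixer."""
    models = [CountModel(k) for k in orders]
    weights = [0.3] * len(models)  # assumed initial mixer weights
    history = []
    total_bits = 0.0
    for bit in bits:
        inputs = [stretch(m.predict(history)) for m in models]
        p = squash(sum(w * x for w, x in zip(weights, inputs)))
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the log finite
        total_bits += -math.log2(p if bit else 1.0 - p)
        # Gradient step on coding cost: move weights toward the observed bit.
        err = bit - p
        weights = [w + lr * err * x for w, x in zip(weights, inputs)]
        for m in models:
            m.update(history, bit)
        history.append(bit)
    return total_bits


if __name__ == "__main__":
    data = [0, 1] * 200  # highly predictable input
    print(f"{len(data)} input bits -> {mixed_code_length(data):.1f} coded bits")
```

A real compressor would feed these probabilities into an arithmetic coder and use far richer models. The sketch only shows why better predictions translate directly into fewer output bits.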
Overall, I think we have yet to see the full potential of deep learning unleashed on data compression. For example, the neural network in the cmix compressor is quite primitive compared to modern architectures. Someone will certainly find a way to do better than that!