My understanding is that the Winograd minimal filtering algorithms used in cuDNN are different from the O(n^2.37) Coppersmith-Winograd-descended matrix multiplication algorithms. But I acknowledge that these can be considered cousins, produced by the same line of research.
I'm pretty sure there's no difference. It does seem to be pretty hard to turn the theoretical win into a practical one: the GPU kernel needs to be coded extremely efficiently to match the underlying hardware. AFAIK it's only a win for 3x3 filters, and maybe one other size (5x5?). Originally cuDNN's Winograd kernels ran on CUDA cores rather than on NVIDIA's Tensor Cores (the matmul-specific hardware on more recent GPUs), but a Google search suggests the two can be combined - not sure whether that has actually landed in cuDNN, though.
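For anyone curious where the 3x3 win comes from: here's a sketch of the 1-D building block, Winograd's F(2,3) minimal filtering algorithm, which computes two outputs of a 3-tap filter with 4 multiplications instead of the direct method's 6. The 2-D F(2x2,3x3) version that convolution libraries use is the nested form of this (function names here are mine, not from any library):

```python
# Winograd F(2,3): two outputs of a 3-tap FIR filter using
# 4 multiplications instead of the 6 a direct computation needs.
# This is the 1-D building block behind 2-D Winograd convolution.

def winograd_f23(d, g):
    """d: 4 input samples, g: 3 filter taps -> 2 outputs."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (in a conv layer this is precomputed once
    # per filter, so its cost amortizes away).
    G0 = g0
    G1 = (g0 + g1 + g2) / 2
    G2 = (g0 - g1 + g2) / 2
    G3 = g2
    # The 4 multiplications in the transform domain.
    m1 = (d0 - d2) * G0
    m2 = (d1 + d2) * G1
    m3 = (d2 - d1) * G2
    m4 = (d1 - d3) * G3
    # Inverse transform: additions only.
    return (m1 + m2 + m3, m2 - m3 - m4)

def direct(d, g):
    """Reference: direct sliding window, 6 multiplications."""
    return (d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
            d[1]*g[0] + d[2]*g[1] + d[3]*g[2])

d = [1.0, 2.0, -1.0, 3.0]
g = [0.5, -1.0, 2.0]
assert winograd_f23(d, g) == direct(d, g)  # both give (-3.5, 8.0)
```

Note the catch the parent alludes to: the savings come from trading multiplications for additions and transforms, so you only win when the multiply count dominates and the transforms can be fused efficiently - which is why it pays off for small filters like 3x3 and quickly stops being worth it for larger ones.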