On the Relationship Between Self-Attention and Convolutional Layers - This work provides evidence that attention layers can perform convolution and, indeed, they often learn to do so in practice.
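
To make the claim that attention layers can perform convolution more concrete, here is a minimal sketch under simplifying assumptions. It is not the construction from the work itself. It uses a 1D sequence instead of an image, one attention head per kernel position, and hard one-hot attention (each head looks only at the position a fixed offset away), which can be read as the limit of a very sharp softmax. At the sequence boundaries a head whose target position does not exist attends to nothing, which mimics zero padding. The function names `conv1d` and `attention_as_conv1d` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def conv1d(X, W, b):
    """Reference zero-padded 1D convolution.

    X: (T, D_in) input sequence
    W: (K, D_in, D_out) kernel, one (D_in, D_out) matrix per kernel position
    b: (D_out,) bias
    """
    T, K = X.shape[0], W.shape[0]
    out = np.tile(b, (T, 1)).astype(float)
    for k in range(K):
        shift = k - K // 2  # relative offset covered by kernel position k
        for t in range(T):
            s = t + shift
            if 0 <= s < T:  # positions outside the sequence count as zeros
                out[t] += X[s] @ W[k]
    return out


def attention_as_conv1d(X, W, b):
    """Same computation written as K attention heads with hard attention.

    Head k uses the attention matrix A_k with A_k[t, s] = 1 iff s == t + shift_k,
    an identity value projection, and W[k] as its slice of the output projection.
    """
    T, K = X.shape[0], W.shape[0]
    out = np.tile(b, (T, 1)).astype(float)
    for k in range(K):
        shift = k - K // 2
        A = np.eye(T, k=shift)   # one-hot attention to the shifted position
        head = A @ X             # each query copies the value it attends to
        out += head @ W[k]       # per-head output projection, summed over heads
    return out


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))     # sequence length 6, 3 input channels
W = rng.normal(size=(3, 3, 4))  # kernel size 3, 4 output channels
b = rng.normal(size=4)
same = np.allclose(conv1d(X, W, b), attention_as_conv1d(X, W, b))
```

In this sketch `same` should evaluate to `True`, because each head contributes exactly one kernel position and the sum over heads matches the sum over the kernel. Whether a trained attention layer actually arrives at such patterns is the empirical part of the claim and is not shown by this example.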