swin_transformer
Date: 2023-10-28 12:47:57
Swin Transformer is a transformer-based neural network architecture designed as a general-purpose backbone for computer vision tasks such as image classification, object detection, and segmentation. It was introduced in the paper "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" by Microsoft Research Asia in 2021.
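Swin Transformer improves upon the original Vision Transformer (ViT) architecture by introducing a hierarchical structure that handles large images more efficiently. Like ViT, it first splits the image into small patches. Unlike ViT, which applies self-attention globally across all patches at a single resolution, Swin Transformer computes self-attention only within local, non-overlapping windows and merges neighboring patches in deeper stages, so the feature maps shrink in resolution and grow in channel width as the network gets deeper. Because each window contains a fixed number of patches, the attention cost grows linearly with image size rather than quadratically, which reduces compute and memory requirements while also producing multi-scale features that are useful for dense prediction tasks.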
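To make the cost argument concrete, here is a minimal sketch in plain Python. It uses the usual complexity estimates for global attention versus window attention over an h x w grid of patches with C channels and window size M, and lists the feature-map shape at each stage. The function names are hypothetical, and the defaults (224-pixel input, 4-pixel patches, 96 channels, 7 x 7 windows) are illustrative assumptions rather than values taken from this page.

```python
def global_msa_flops(h, w, c):
    """Rough cost of self-attention over all h*w tokens at once."""
    n = h * w
    # Linear projections + attention matrix, which is quadratic in n.
    return 4 * n * c**2 + 2 * n**2 * c


def window_msa_flops(h, w, c, m):
    """Rough cost when attention is restricted to m x m windows."""
    n = h * w
    # Each token attends to only m*m tokens, so the cost is linear in n.
    return 4 * n * c**2 + 2 * m**2 * n * c


def stage_shapes(image_size=224, patch_size=4, embed_dim=96, num_stages=4):
    """Feature-map (height, width, channels) at each hierarchical stage."""
    shapes = []
    side = image_size // patch_size
    dim = embed_dim
    for _ in range(num_stages):
        shapes.append((side, side, dim))
        side //= 2  # patch merging halves the spatial resolution
        dim *= 2    # and doubles the channel width
    return shapes


if __name__ == "__main__":
    print(stage_shapes())
    print(global_msa_flops(56, 56, 96) / window_msa_flops(56, 56, 96, 7))
```

Doubling the image side quadruples the token count: the window-attention estimate then grows by exactly 4x, while the global-attention estimate grows faster because of its quadratic term.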
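Another key feature of Swin Transformer is the use of shifted windows. If every layer used the same window partition, patches in different windows could never exchange information. Swin Transformer therefore alternates between a regular window partition and a partition shifted by half a window in consecutive layers, which creates connections across window boundaries while keeping the cost of local attention.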
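The effect of shifting can be illustrated with a small sketch in plain Python. It models the shift as a cyclic roll of the patch grid toward the top-left and reports which window each patch falls into. The function name `window_index` and the grid and window sizes are hypothetical choices for illustration; the attention mask that a full implementation needs for windows wrapping around the border is deliberately left out.

```python
def window_index(row, col, window_size, shift=0, height=8, width=8):
    """Return the (window_row, window_col) that a patch falls into.

    shift=0 gives the regular partition. A positive shift models the
    shifted partition as a cyclic roll of the grid toward the top-left.
    """
    rolled_row = (row - shift) % height
    rolled_col = (col - shift) % width
    return (rolled_row // window_size, rolled_col // window_size)


if __name__ == "__main__":
    m = 4           # window size (assumed for illustration)
    s = m // 2      # shift by half a window

    a, b = (3, 3), (4, 4)  # diagonal neighbors across a window corner

    # Regular partition: the two patches sit in different windows.
    print(window_index(*a, m), window_index(*b, m))

    # Shifted partition: the same two patches now share a window,
    # so attention in this layer can mix their information.
    print(window_index(*a, m, shift=s), window_index(*b, m, shift=s))
```

Stacking a regular layer and a shifted layer in alternation is what lets information spread beyond a single window without ever computing attention over the whole image.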
At the time of its publication, Swin Transformer reported state-of-the-art or highly competitive results on several standard vision benchmarks, including image classification on ImageNet and object detection on COCO.