Axial Attention in Multidimensional Transformers
Axial Attention in Multidimensional Transformers is an attention mechanism for multidimensional Transformer models. In a traditional Transformer, attention is computed over the entire flattened sequence, which becomes expensive for images and other multidimensional data; axial attention instead applies self-attention along a single axis of the input tensor at a time. This Transformer variant allows the vast majority of the context to be computed in parallel during decoding without introducing any independence assumptions, and its layer structure aligns naturally with the multiple dimensions of the tensors in both the encoder and decoder settings. Typical usage with the `axial_attention` PyTorch package looks like this:
```
import torch
from axial_attention import AxialAttention

img = torch.randn(1, 3, 256, 256)  # (batch, channels, height, width)

attn = AxialAttention(
    dim = 3,              # embedding dimension
    dim_index = 1,        # index of the embedding dimension in the input tensor
    dim_heads = 32,       # dimension of each head, defaults to dim // heads if not supplied
    heads = 1,            # number of heads for multi-head attention
    num_dimensions = 2,   # number of axial dimensions (2 for images, 3 for video)
    sum_axial_out = True  # sum the attention outputs from each axis rather than applying them sequentially
)

out = attn(img)           # (1, 3, 256, 256), same shape as the input
```
Note that positional information is not an argument of `AxialAttention` itself; the package provides a separate `AxialPositionalEmbedding` module for that.
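For intuition, here is a minimal from-scratch sketch of the axial idea, separate from the library above: fold all but one spatial axis into the batch dimension, run ordinary self-attention along the remaining axis, and sum the per-axis outputs (mirroring `sum_axial_out=True`). The function name `axial_attention_2d` and the use of `nn.MultiheadAttention` are illustrative assumptions, not the package's internals:
```
# Minimal sketch of 2-D axial attention (illustrative helper, not the library's code).
import torch
import torch.nn as nn

def axial_attention_2d(x, attn_h, attn_w):
    # x: (batch, height, width, dim), channels-last for convenience
    b, h, w, d = x.shape

    # Attend along the height axis: treat every column as an independent sequence.
    cols = x.permute(0, 2, 1, 3).reshape(b * w, h, d)
    out_h, _ = attn_h(cols, cols, cols)                    # self-attention over h
    out_h = out_h.reshape(b, w, h, d).permute(0, 2, 1, 3)  # back to (b, h, w, d)

    # Attend along the width axis: treat every row as an independent sequence.
    rows = x.reshape(b * h, w, d)
    out_w, _ = attn_w(rows, rows, rows)                    # self-attention over w
    out_w = out_w.reshape(b, h, w, d)

    # Summing the two axial outputs mirrors sum_axial_out=True above.
    return out_h + out_w

dim = 32
attn_h = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
attn_w = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
x = torch.randn(2, 16, 16, dim)  # (batch, height, width, dim)
print(axial_attention_2d(x, attn_h, attn_w).shape)  # torch.Size([2, 16, 16, 32])
```
For an h × w feature map, the two axial passes cost on the order of h·w·(h + w) attention comparisons instead of the (h·w)² required by full self-attention over the flattened image, which is where the compute and memory savings come from.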
Related questions:
1. What is a Transformer?
2. In which domains is axial attention applied?