# Flatten Transformer
Flatten Transformer is a semantic-segmentation model built on the Transformer architecture. Its core idea is to downsample the input image, divide it into small patches, and then flatten those patches so they can be fed to the Transformer encoder. This reduces the amount of computation while allowing the image's global information to be encoded into each small patch.
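To make the patchify-and-flatten step concrete, here is a minimal PyTorch sketch (not the model's actual code); the module name `PatchFlatten` and the default `img_size`, `patch_size`, and `embed_dim` values are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PatchFlatten(nn.Module):
    """Split an image into non-overlapping patches, project and flatten them into tokens."""
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution extracts each patch and linearly projects it in one step.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                    # x: (B, 3, H, W)
        x = self.proj(x)                     # (B, embed_dim, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)     # (B, num_patches, embed_dim)
        return x
```

With a 224×224 input and 16×16 patches, this produces 196 patch tokens of dimension 768 per image, which form the sequence seen by the encoder.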
Flatten Transformer also inserts a class token that represents the semantic information of the whole image. In addition, positional information for each patch is added to the input by a simple element-wise addition; these position embeddings can also be treated as trainable parameters. [1][2][3]
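The following sketch shows one common way to implement the class token and the additive position embeddings described above, again in PyTorch and building on the patch tokens produced by the previous sketch; the module name `TokenPreparation` is hypothetical, not taken from the model's code.

```python
import torch
import torch.nn as nn

class TokenPreparation(nn.Module):
    """Prepend a learnable class token and add learnable position embeddings."""
    def __init__(self, num_patches, embed_dim=768):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        # One trainable position vector per token (all patches plus the class token).
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, embed_dim))

    def forward(self, patch_tokens):                    # (B, N, D)
        B = patch_tokens.shape[0]
        cls = self.cls_token.expand(B, -1, -1)          # (B, 1, D)
        tokens = torch.cat([cls, patch_tokens], dim=1)  # (B, N+1, D)
        return tokens + self.pos_embed                  # positions added by simple addition
```

Because `pos_embed` is an `nn.Parameter`, the position information is learned jointly with the rest of the network, matching the description that the positions can be treated as trainable parameters.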
#### References
- *1* [Dual-Flattening Transformers through Decomposed Row and Column Queries for Semantic Segmentation论文...](https://blog.csdn.net/qq_42476803/article/details/128570281)
- *2* *3* [Transformer计算量和过程统计](https://blog.csdn.net/MicroE_Lv/article/details/122855331)