Conversation

akionux

[2209.07484] Hydra Attention: Efficient Attention with Many Heads - https://arxiv.org/abs/2209.07484

Friday, 16-Sep-22 08:52:10 UTC from status.akionux.net
1. akionux
  
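  @akionux Pure attention costs O(T^2 D), which scales poorly with the number of tokens T. Turning it into linear attention with the kernel trick gives O(T D^2 / H), which is inversely proportional to the number of heads H, so raising the number of heads to the feature dimension D (the Hydra trick) apparently brings it down to O(TD). The computation also reduces to element-wise operations, which seems to make it very clean.
  It's a very elegant idea that sheds some light on the chaotic Transformer scene, and it's the kind of work I'd want to do if I were doing research.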
  
  Tuesday, 20-Sep-22 23:27:33 UTC from status.akionux.net
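
To make the complexity argument concrete, here is a rough NumPy sketch of the idea. It assumes a cosine-similarity kernel (L2-normalizing each token's query and key) and a single unbatched sequence. The function names `l2_normalize`, `multihead_linear_attention`, and `hydra_attention` are made up for illustration and are not taken from the paper's code.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Assumed kernel feature map: scale each token vector to unit L2 norm.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def multihead_linear_attention(q, k, v, num_heads):
    # q, k, v: (T, D). Each head owns a slice of d = D / num_heads features.
    T, D = q.shape
    d = D // num_heads
    q = l2_normalize(q).reshape(T, num_heads, d)
    k = l2_normalize(k).reshape(T, num_heads, d)
    v = v.reshape(T, num_heads, d)
    # Per head: K^T V is (d, d), so the total cost is O(T * d^2 * H) = O(T D^2 / H).
    kv = np.einsum("thi,thj->hij", k, v)
    out = np.einsum("thi,hij->thj", q, kv)
    return out.reshape(T, D)

def hydra_attention(q, k, v):
    # Special case H = D: every head is one feature wide, so the (d, d)
    # matrices collapse to scalars and everything is element-wise, O(T D).
    q = l2_normalize(q)
    k = l2_normalize(k)
    kv = (k * v).sum(axis=0, keepdims=True)  # (1, D): one global summary
    return q * kv                            # (T, D): gate it per token

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(3, 5, 8))  # T = 5 tokens, D = 8 features
out = hydra_attention(q, k, v)
print(out.shape)
print(np.allclose(out, multihead_linear_attention(q, k, v, num_heads=8)))
```

In this sketch, setting `num_heads` equal to D in the general multi-head version gives the same result as the element-wise version, which is the point of the trick: no T x T attention matrix and no per-head d x d matrix is ever formed.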
