New paper: "Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize" #101

## Detailed Description
Very interesting new paper from Google:

* [Blog: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize](https://ai.googleblog.com/2021/12/improving-vision-transformer-efficiency.html)
* [Paper: TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?](https://arxiv.org/abs/2106.11297)

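Basically: a way to dynamically pick which "patches" of images / videos to attend to, condensing the input into a small number of learned tokens (8 in the paper's title) rather than attending over every patch.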
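To make the idea concrete, here is a rough NumPy sketch of my reading of the blog post: compute one spatial attention map per output token, weight the feature map with it, and spatially pool the result. The function name `token_learner`, the single linear projection `proj`, and the sigmoid gate are simplifications I have assumed for illustration; the paper's attention function is more elaborate, so treat this as a sketch and not the authors' implementation.

```python
import numpy as np


def token_learner(feature_map, proj):
    """Condense an (H, W, C) feature map into S tokens of size C.

    `proj` is a hypothetical (C, S) weight matrix; in a real model it
    would be learned. Each of its S columns produces one attention map.
    """
    height, width, channels = feature_map.shape
    flat = feature_map.reshape(height * width, channels)  # (N, C), N = H * W
    logits = flat @ proj                                   # (N, S)
    attn = 1.0 / (1.0 + np.exp(-logits))                   # sigmoid gate in (0, 1)
    # Weight every spatial position by each attention map, then average
    # over all positions: one pooled C-dimensional vector per map.
    tokens = attn.T @ flat / (height * width)              # (S, C)
    return tokens


rng = np.random.default_rng(0)
feature_map = rng.normal(size=(14, 14, 32))  # e.g. a 14x14 grid of patch features
proj = rng.normal(size=(32, 8)) * 0.1        # 8 attention maps -> 8 tokens
tokens = token_learner(feature_map, proj)
print(tokens.shape)  # (8, 32)
```

Whatever the spatial size of the input, the layers that follow only see `S` tokens (8 here).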

This could be a really nice way to feed in much larger input images (which is important for longer-time-horizon nowcasting).
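Some back-of-the-envelope arithmetic on why this matters for larger inputs. The patch size of 16 and the image sizes below are illustrative assumptions, not numbers from the paper, and `num_patch_tokens` is just a helper made up for this comment.

```python
def num_patch_tokens(image_size, patch_size=16):
    """Number of non-overlapping square patches in a square image."""
    return (image_size // patch_size) ** 2


LEARNED_TOKENS = 8  # token count from the paper's title

for image_size in (224, 512, 1024):
    n = num_patch_tokens(image_size)
    # Self-attention compares every token with every other token,
    # so the number of pairwise interactions grows as n ** 2.
    print(image_size, n, n ** 2, LEARNED_TOKENS ** 2)
```

Going from 224 to 1024 pixels per side takes the patch count from 196 to 4096, and the pairwise attention interactions from roughly 38 thousand to roughly 16.8 million, whereas attention over 8 learned tokens stays at 64 interactions. Any layers placed before the token-learning step would still process the full patch grid, so the saving applies only to the layers after it.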



