visarga t1_itgqoj0 wrote

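There are workarounds for long input. One is the linear transformer family (Linformer, Longformer, Big Bird, Performer, etc.), which brings attention cost down from quadratic to roughly linear in sequence length. The other is the Perceiver, which can reference a long input sequence using a fixed-size transformer: a small latent array cross-attends to the input, so the expensive self-attention layers only ever see the latents.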
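To make the Perceiver idea concrete, here is a rough NumPy sketch of a single cross-attention read. It is not the actual Perceiver implementation. The names (`cross_attend`, `w_q`, `w_k`, `w_v`) and the sizes are made up for illustration, and it assumes one head with no layer norm or MLP. The point is only that the output stays at the latent size no matter how long the input is.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, inputs, w_q, w_k, w_v):
    # Hypothetical single-head cross-attention: queries come from the
    # fixed-size latents, keys and values come from the long input.
    q = latents @ w_q                          # (n_latents, d)
    k = inputs @ w_k                           # (seq_len, d)
    v = inputs @ w_v                           # (seq_len, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (n_latents, seq_len)
    return softmax(scores, axis=-1) @ v        # (n_latents, d)

rng = np.random.default_rng(0)
d, n_latents = 16, 8                           # illustrative sizes, not from any paper
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
latents = rng.normal(size=(n_latents, d))

for seq_len in (100, 10_000):
    inputs = rng.normal(size=(seq_len, d))
    out = cross_attend(latents, inputs, w_q, w_k, w_v)
    print(seq_len, out.shape)                  # output shape does not depend on seq_len
```

The score matrix here is n_latents x seq_len, so the read costs time linear in the input length, and anything stacked on top of `out` works on a fixed number of rows.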
