Comments


_d0s_ t1_ivxk6y9 wrote

It's also used for the spatial embedding of patches in an image.

Besides the positional embedding, transformers also use the attention mechanism, which can be beneficial for some problems on its own.
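
To make that concrete, here's a minimal sketch (the `PatchEmbedding` name and sizes are made up for illustration, not taken from any specific library) of ViT-style patch embedding, where a learned positional embedding is added to each patch token before it goes into attention:

```python
# Minimal sketch: project image patches to tokens and add a learned positional embedding.
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_chans=3, dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # Each non-overlapping patch is projected to a `dim`-dimensional token.
        self.proj = nn.Conv2d(in_chans, dim, kernel_size=patch_size, stride=patch_size)
        # One learned positional vector per patch location (spatial information).
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches, dim))

    def forward(self, x):                           # x: (batch, channels, H, W)
        tokens = self.proj(x)                       # (batch, dim, H/ps, W/ps)
        tokens = tokens.flatten(2).transpose(1, 2)  # (batch, num_patches, dim)
        return tokens + self.pos_embed              # inject patch position info

patches = PatchEmbedding()(torch.randn(1, 3, 224, 224))
print(patches.shape)  # torch.Size([1, 196, 768])
```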

1

Hrant_Davtyan t1_ivxmph7 wrote

In table Q&A tasks, people are using models like BERT to represent a table (where the data can be non-sequential, i.e. you can rearrange the table columns and still have the same information).

They do that to represent a structured table as unstructured text and perform question answering over it, using the table as the context. But to be frank, it is not a fully non-sequential data representation, as the user's question is, in the end, text, which is a sequential data source.
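
As a toy illustration of that linearization step (no particular table-QA model is assumed, and `linearize` is just a hypothetical helper):

```python
# Toy sketch: flatten a structured table into text so a BERT-style QA model
# can use it as context. Column order doesn't change the information, but the
# linearized string (and thus the model input) is an ordered token sequence.
table = {
    "country": ["France", "Japan", "Brazil"],
    "capital": ["Paris", "Tokyo", "Brasilia"],
}

def linearize(table):
    headers = list(table.keys())
    rows = zip(*table.values())
    lines = []
    for row in rows:
        cells = [f"{h} is {v}" for h, v in zip(headers, row)]
        lines.append(", ".join(cells) + ".")
    return " ".join(lines)

context = linearize(table)
question = "What is the capital of Japan?"
# `context` and `question` would then be fed to an extractive QA model,
# which reads them as one sequential input.
print(context)
```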

2

IglooAustralia88 t1_ivxpafz wrote

Transformers have no spatial/positional knowledge unless you add it explicitly (e.g. positional embeddings). Otherwise they're a feed-forward network with input-specific weights.
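
A quick sketch of what that means in practice: plain self-attention with no positional encoding is permutation-equivariant, so shuffling the input tokens just shuffles the outputs (sizes here are arbitrary, chosen for the example):

```python
# Sketch: self-attention without positional information has no notion of order.
import torch
import torch.nn as nn

torch.manual_seed(0)
attn = nn.MultiheadAttention(embed_dim=16, num_heads=2, batch_first=True)
x = torch.randn(1, 5, 16)          # 5 tokens, no positional embedding added
perm = torch.randperm(5)

out, _ = attn(x, x, x)
out_perm, _ = attn(x[:, perm], x[:, perm], x[:, perm])

# Permuting the input tokens produces the same outputs in permuted order,
# i.e. the layer itself has no idea where each token "sits".
print(torch.allclose(out[:, perm], out_perm, atol=1e-5))  # True
```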

2

eigenham t1_ivxwhgg wrote

The attention head is a set-to-set mapping. It takes the input set, compares each input element to a context set (which can be the input set itself, or another set), and based on those comparisons outputs a new set of the same size as the input set.
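
A rough sketch of a single head as a set-to-set map (the projection matrices here are random stand-ins for learned weights):

```python
# Sketch: one attention head maps an input set of N elements to an output set
# of N elements, comparing each input against M context elements.
import torch
import torch.nn.functional as F

def attention_head(inputs, context, d_k=32):
    d_in, d_ctx = inputs.shape[-1], context.shape[-1]
    W_q = torch.randn(d_in, d_k)   # in a real layer these are learned
    W_k = torch.randn(d_ctx, d_k)
    W_v = torch.randn(d_ctx, d_k)

    Q = inputs @ W_q               # (N, d_k) — one query per input element
    K = context @ W_k              # (M, d_k)
    V = context @ W_v              # (M, d_k)
    scores = Q @ K.T / d_k**0.5    # (N, M) pairwise comparisons
    weights = F.softmax(scores, dim=-1)
    return weights @ V             # (N, d_k): same number of elements as the input set

inputs = torch.randn(7, 64)        # input set of 7 elements
context = torch.randn(12, 64)      # context set (could also be `inputs` itself)
print(attention_head(inputs, context).shape)  # torch.Size([7, 32])
```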

Out of curiosity, how were you thinking of using that for gene expression?

2

No_Captain_856 OP t1_ivxwq1s wrote

I wasn’t; it didn’t seem like the best model to me, at least conceptually. Anyway, my thesis supervisor asked me to, and I wasn’t so sure about its applicability to that kind of data, or about what using it would even mean in that context 🤷🏻‍♀️

2

eigenham t1_ivy0qhf wrote

I mean, it definitely captures relationships between parts of the input data in ways that many other models cannot. It also cannot do everything.

Like most real-world problems, there's a question of how you will represent the relevant information in data structures that best suit the ML methods you intend to use. Similarly, there's a question of whether the ML methods will do what you want, given the data in that form.

Despite the fact that transformers are killing it for ordered data, I'd say their flexibility in dealing with unordered data is definitely of interest for real-world problems where the representations are tricky.

2