hellrail t1_iqsvjt7 wrote
Transformers are graph networks applied to graph data; CNNs do not operate on graph data.
029187 OP t1_iqsz9t7 wrote
Yeah, I get why the non-locality is useful: CNNs group data locally, which doesn't make sense for graph data (the relevant word could be very far away in the sentence).
But a densely connected deep neural network should already have what it needs to map out any arbitrary function relating nodes on a graph.
RobKnight_ t1_iqu7c54 wrote
Deeper layers in CNNs are not constrained to locality
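
To put rough numbers on this, here is a minimal sketch of how the receptive field grows with depth in a plain stack of 1D convolutions. The function name `receptive_field` and the defaults (kernel size 3, stride 1, no dilation) are assumptions for illustration, not taken from any particular model.

```python
def receptive_field(num_layers, kernel_size=3, stride=1):
    """Receptive field (in input positions) of one output unit after
    num_layers identical conv layers, assuming no dilation."""
    rf = 1    # a single input position before any conv layer
    jump = 1  # spacing, in input positions, between adjacent units
    for _ in range(num_layers):
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


for depth in (1, 2, 5, 10):
    print(depth, receptive_field(depth))
```

With stride 1 the growth is only linear in depth (1 + 2 * depth for kernel size 3), so deeper layers do see far beyond their immediate neighbours, but it takes many layers to span a long input.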
029187 OP t1_iqud0ju wrote
True, but attention layers overcome locality immediately, in the very first layer.
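
As a rough illustration, here is a toy single-head self-attention layer in plain Python. It assumes identity query/key/value projections and made-up 2-dimensional token vectors (both are simplifications for the sketch, not how a real transformer is parameterised). Changing only the last token changes the output at the first position after a single layer.

```python
import math


def self_attention(x):
    """Single-head scaled dot-product self-attention.
    Assumption: Q, K and V are the inputs themselves (identity projections)."""
    d = len(x[0])
    out = []
    for q in x:
        # Score the query against every position, near or far.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is a weighted average of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


seq = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]  # hypothetical tokens
changed = [row[:] for row in seq]
changed[-1] = [-1.0, 2.0]  # modify only the last token

first_before = self_attention(seq)[0]
first_after = self_attention(changed)[0]
influenced = first_before != first_after
print(influenced)
```

Every position attends to every other position in one step, so the distance between the first and last token does not matter here.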