CKtalon t1_j9r2k9j wrote

Probably FasterTransformer with Triton Inference Server.

3

whata_wonderful_day t1_ja3kh4d wrote

Yeah, this is what the big bois use. It'll give you max performance, but it isn't exactly user-friendly.

1