[P] What are the latest "out-of-the-box" solutions for deploying very large LLMs as API endpoints?
Submitted by johnhopiler (t3_11a8tru) on February 23, 2023 at 9:09 PM in MachineLearning · 11 comments · 10 points
CKtalon (t1_j9r2k9j) wrote on February 23, 2023 at 11:19 PM · 3 points
Probably FasterTransformer with Triton Inference Server.

whata_wonderful_day (t1_ja3kh4d) replied on February 26, 2023 at 4:17 PM · 1 point
Yeah, this is what the big bois use. It'll give you max performance, but it isn't exactly user friendly.
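For context on what querying such a deployment looks like: Triton exposes models over the standard KServe v2 HTTP protocol, so a client just POSTs a JSON inference request to the server. A minimal sketch of building that request body follows; the model name ("gpt"), the input tensor names ("text_input", "max_tokens"), and the output name ("text_output") are assumptions here and depend entirely on your model's Triton config, so match them to your deployment.

```python
import json

def build_infer_request(prompt: str, max_new_tokens: int = 64) -> str:
    """Build a KServe v2 inference request body for a Triton endpoint.

    Tensor names and shapes below are illustrative; they must match
    the input/output names declared in the model's config.pbtxt.
    """
    request = {
        "inputs": [
            {
                "name": "text_input",   # assumed input name
                "shape": [1, 1],
                "datatype": "BYTES",
                "data": [prompt],
            },
            {
                "name": "max_tokens",   # assumed input name
                "shape": [1, 1],
                "datatype": "INT32",
                "data": [max_new_tokens],
            },
        ],
        "outputs": [{"name": "text_output"}],  # assumed output name
    }
    return json.dumps(request)

# This body would be POSTed to the server's v2 infer endpoint, e.g.:
#   http://localhost:8000/v2/models/gpt/infer
payload = build_infer_request("What is FasterTransformer?")
```

In practice you would send this with any HTTP client, or skip manual request construction entirely and use NVIDIA's `tritonclient` Python package, which wraps the same protocol.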