Submitted by harishprab t3_yga0s1 in MachineLearning

We have recently open sourced our inference acceleration library, voltaML.

⚡VoltaML is a lightweight library to convert and run your ML/DL models in high-performance inference runtimes like TensorRT, TorchScript, ONNX and TVM.

We would love for the Reddit and open-source communities to use it, give feedback, and help us improve the library.

https://github.com/VoltaML/voltaML


Comments


PlayOffQuinnCook t1_iu7ualp wrote

Congrats on open sourcing! Quick question on fusion: how do you fuse layers like conv, bn, relu, etc., if they are not named conv1, bn1, relu in the nn.Module?


_Arsenie_Boca_ t1_iu7yz56 wrote

Looks promising. A comparison with other competitors (hf accelerate, neuralmagic, nebullvm, ...) would be great


limpbizkit4prez t1_iu965oz wrote

Do you have any benchmarks against other frameworks? And have you benchmarked other types of models, or are you doing something specific for NLP?


LetterRip t1_iu9be41 wrote

Have you tried it with diffusers/stable diffusion?


pommedeterresautee t1_iu9vc6f wrote

Hi, I am one of the authors of transformer deploy. I see you have copied most of the files for the transformer part. That's really cool, and I really appreciate that you kept the licenses. May I ask you to cite our work in the README?

Moreover, if I may, why did you copy the files instead of just importing the project as a dependency? You would get the maintenance for free :-)


harishprab OP t1_iucy9k9 wrote

Thanks. HF Accelerate is basically doing it for Intel chips; I haven't seen them support TensorRT, though I could be wrong. Neural Magic is mostly about quantisation-aware training and pruning techniques, whereas we focus on post-training techniques. We should try Nebullvm; they're a great library too.
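To illustrate the post-training side of that distinction, here is a minimal sketch of post-training dynamic quantization using plain PyTorch APIs (toy model; no voltaML APIs are assumed):

```python
# Sketch: post-training dynamic quantization -- weights are quantized
# to int8 after training, with no retraining, unlike QAT.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

# Quantize all Linear layers' weights to int8 post hoc.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
fp32_out = model(x)
int8_out = qmodel(x)
```

The quantized model's outputs stay close to the fp32 model's, which is why no retraining is needed.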


PlayOffQuinnCook t1_iueq6l4 wrote

I understand that. But let's say I have these operators named c1, b1, r1 instead of what it expects; the fusion logic won't work. So my question was whether this library works only on a fixed set of models defined in the library itself, or whether it can work on arbitrary models users write.


harishprab OP t1_iugkqy9 wrote

Right now it only supports the models that are supported by these libraries. We tried fusion manually earlier but ran into many issues given the diversity of models, so we stuck with torch.fx and TensorRT. Maybe in the future we can make it modular so that it can work on any model.
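For reference, here is a minimal sketch of what name-independent fusion via torch.fx looks like (toy model; plain PyTorch, not voltaML's internals). The fuser matches op patterns in the traced graph, not attribute names, which addresses the question above:

```python
# Sketch: conv+bn fusion via torch.fx pattern matching.
# Module names are deliberately non-standard to show names don't matter.
import torch
import torch.nn as nn
from torch.fx.experimental.optimization import fuse

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.c1 = nn.Conv2d(3, 8, 3)   # not "conv1"
        self.b1 = nn.BatchNorm2d(8)    # not "bn1"
        self.r1 = nn.ReLU()            # not "relu"

    def forward(self, x):
        return self.r1(self.b1(self.c1(x)))

model = Net().eval()          # fusion requires eval mode
fused = fuse(model)           # folds BatchNorm into the preceding Conv

x = torch.randn(1, 3, 16, 16)
assert torch.allclose(model(x), fused(x), atol=1e-5)
```

Because the matching is done on the traced graph's op sequence (Conv2d followed by BatchNorm2d), the `c1`/`b1` naming is irrelevant.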
