Submitted by Realistic-Bed2658 t3_zzhg84 in MachineLearning
miss_minutes t1_j2bph94 wrote
https://stackoverflow.com/a/41568439
https://scikit-learn.org/stable/faq.html#will-you-add-gpu-support
Scikit-learn isn't meant to be used with the GPU. Machine learning that isn't deep learning doesn't benefit as much from GPU compute. Sklearn also uses optimised native libraries where possible (e.g. libsvm and liblinear), so if you were to implement the sklearn API on top of PyTorch, it's very unlikely you'd beat sklearn's performance.
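To illustrate the "optimised native libraries" point, a minimal timing sketch (the dataset size and hyperparameters are arbitrary; LinearSVC really is backed by the compiled liblinear solver per the sklearn docs):

```python
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC  # dispatches to the compiled liblinear solver

# Synthetic binary classification problem
X, y = make_classification(n_samples=100_000, n_features=50, random_state=0)

clf = LinearSVC(dual=False)  # primal formulation scales better when n_samples >> n_features
start = time.perf_counter()
clf.fit(X, y)
print(f"liblinear fit on CPU: {time.perf_counter() - start:.2f}s")
```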
jeanfeydy t1_j2cnd36 wrote
From direct discussions with the sklearn team, note that this may change relatively soon: a GPU engineer funded by Intel was recently added to the core development team. Last time I met with the team in person (6 months ago), the plan was to factor some of the most GPU-friendly computations out of the sklearn code base, such as K-Nearest Neighbor search or kernel-related computations, and to document an internal API that lets external developers easily build accelerated backends. As shown by, e.g., our KeOps library, GPUs are extremely well suited to classical ML, and sklearn is the perfect platform to let users fully take advantage of their hardware. Let's hope that OP's question will become moot at some point in 2023-24 :-)
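As a side note, here is a minimal sketch of the kind of GPU-friendly computation mentioned above: a brute-force K-nearest-neighbor search with pykeops' LazyTensor API (point counts and K are arbitrary; the calls follow the public KeOps tutorials, not any sklearn-internal API):

```python
import torch
from pykeops.torch import LazyTensor

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(100_000, 3, device=device)  # query points
y = torch.randn(100_000, 3, device=device)  # reference points

x_i = LazyTensor(x[:, None, :])  # (N, 1, D) symbolic tensor
y_j = LazyTensor(y[None, :, :])  # (1, M, D) symbolic tensor

# Symbolic (N, M) matrix of squared distances; never materialized in memory
D_ij = ((x_i - y_j) ** 2).sum(-1)

# Indices of the 10 nearest reference points for each query point
knn_idx = D_ij.argKmin(K=10, dim=1)  # (N, 10) LongTensor
```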
jpbklyn t1_j2ecs6g wrote
Correct. And since every GPU is attached to a CPU, there's no reason not to run scikit-learn on the local CPU, which also benefits from direct access to system memory.
Realistic-Bed2658 OP t1_j2by65z wrote
Thanks for the links, but I disagree for the most part.
DBSCAN and LOF would most likely benefit. Even their own MLP model would inherently benefit from it (though I do believe anyone seriously training a neural network would most likely use PyTorch or TF).
Also, the fact that non-DL ML is mainly CPU-based today doesn't mean this won't change five years from now. Personal opinion here, though.
AerysSk t1_j2cgmo4 wrote
If you are looking for a GPU version of scikit-learn, I think Nvidia is making one: cuML, part of their RAPIDS suite. Note that not all algorithms are implemented and some functions are still missing; see the sketch below for what the API looks like.
However, a note about the Apple and AMD GPU thing: they are on the rise, but it will be a few years before they become usable. My lab has only Nvidia GPUs and we already have plenty of headaches dealing with Nvidia drivers and libraries. At least for the next few years, we have no plans to switch to AMD or Apple.
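A minimal sketch of cuML's sklearn-mirroring interface, using DBSCAN as an example (eps/min_samples are placeholder values; the `cuml.cluster.DBSCAN` / `fit_predict` pattern follows the RAPIDS docs, but check the version you install):

```python
import cupy as cp
from cuml.cluster import DBSCAN  # GPU DBSCAN with a scikit-learn-like interface

# Random 2-D points living in GPU memory
X = cp.random.rand(100_000, 2).astype(cp.float32)

# eps / min_samples chosen arbitrarily for the sketch
db = DBSCAN(eps=0.05, min_samples=10)
labels = db.fit_predict(X)  # cluster label per point, -1 for noise
print(labels[:20])
```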
JocialSusticeWarrior t1_j2ed75n wrote
unfortunate name "cuml"
Realistic-Bed2658 OP t1_j2cld4m wrote
Totally understandable. I only use nvidia at work too.
Thanks for the info about the nvidia package!
AmbitiousTour t1_j2c637s wrote
Gradient boosting hugely benefits from a GPU.
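For example, a minimal sketch with XGBoost's GPU histogram tree method (the synthetic data and hyperparameters are placeholders; `tree_method="gpu_hist"` is the documented switch in XGBoost 1.x, and recent releases prefer `device="cuda"` with `tree_method="hist"`):

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200_000, n_features=100, random_state=0)

# Train boosted trees on the GPU; drop to tree_method="hist" on CPU-only machines
clf = xgb.XGBClassifier(
    n_estimators=500,
    max_depth=8,
    tree_method="gpu_hist",  # deprecated in favor of device="cuda" in newer XGBoost releases
)
clf.fit(X, y)
```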