Submitted by Realistic-Bed2658 t3_zzhg84 in MachineLearning
miss_minutes t1_j2bph94 wrote
https://stackoverflow.com/a/41568439
https://scikit-learn.org/stable/faq.html#will-you-add-gpu-support
Scikit-learn isn't meant to be used with the GPU. Machine learning that isn't deep learning doesn't benefit as much from GPU compute. Sklearn also uses optimised native libraries where possible (e.g. libsvm and liblinear), so if you were to implement the sklearn API on top of PyTorch, it's very unlikely you'd beat sklearn's performance.
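To illustrate the "optimised native libraries" point, a minimal timing sketch (the dataset size and hyperparameters are arbitrary; LinearSVC really is backed by the compiled liblinear solver per the sklearn docs):

```python
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC  # dispatches to the compiled liblinear solver

# Synthetic binary classification problem
X, y = make_classification(n_samples=100_000, n_features=50, random_state=0)

clf = LinearSVC(dual=False)  # primal formulation scales better when n_samples >> n_features
start = time.perf_counter()
clf.fit(X, y)
print(f"liblinear fit on CPU: {time.perf_counter() - start:.2f}s")
```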
jeanfeydy t1_j2cnd36 wrote
From direct discussions with the sklearn team, note that this may change relatively soon: a GPU engineer funded by Intel was recently added to the core development team. Last time I met with the team in person (6 months ago), the plan was to factor some of the most GPU-friendly computations out of the sklearn code base, such as K-Nearest Neighbor search or kernel-related computations, and to document an internal API that lets external developers easily build accelerated backends. As shown by, e.g., our KeOps library, GPUs are extremely well suited to classical ML, and sklearn is the perfect platform to let users fully take advantage of their hardware. Let's hope that OP's question will become moot at some point in 2023-24 :-)
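As a side note, here is a minimal sketch of the kind of GPU-friendly computation mentioned above: a brute-force K-nearest-neighbor search with pykeops' LazyTensor API (point counts and K are arbitrary; the calls follow the public KeOps tutorials, not any sklearn-internal API):

```python
import torch
from pykeops.torch import LazyTensor

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(100_000, 3, device=device)  # query points
y = torch.randn(100_000, 3, device=device)  # reference points

x_i = LazyTensor(x[:, None, :])  # (N, 1, D) symbolic tensor
y_j = LazyTensor(y[None, :, :])  # (1, M, D) symbolic tensor

# Symbolic (N, M) matrix of squared distances; never materialized in memory
D_ij = ((x_i - y_j) ** 2).sum(-1)

# Indices of the 10 nearest reference points for each query point
knn_idx = D_ij.argKmin(K=10, dim=1)  # (N, 10) LongTensor
```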
jpbklyn t1_j2ecs6g wrote
Correct. And since every GPU is attached to a CPU, there's no reason not to run scikit-learn on the local CPU, which also benefits from direct access to system memory.
Realistic-Bed2658 OP t1_j2by65z wrote
Thanks for the links, but I disagree for the most part.
DBSCAN and LOF would most likely benefit. Even their own MLP model would inherently benefit from it (though I do believe anyone seriously training a neural network would most likely use PyTorch or TF).
Also, the fact that non-DL ML is mainly CPU-based today doesn't mean this won't change five years from now. Personal opinion here, though.
AerysSk t1_j2cgmo4 wrote
If you are looking for a GPU version of scikit-learn, I think Nvidia is making one: cuML, part of their RAPIDS suite. Note that not all algorithms are implemented and some functions are still missing; see the sketch below for what the API looks like.
However, a note about the Apple and AMD GPU thing: they are on the rise, but it will be a few years before they become usable. My lab has only Nvidia GPUs and we already have plenty of headaches dealing with Nvidia drivers and libraries. At least for the next few years, we have no plans to switch to AMD or Apple.
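A minimal sketch of cuML's sklearn-mirroring interface, using DBSCAN as an example (eps/min_samples are placeholder values; the `cuml.cluster.DBSCAN` / `fit_predict` pattern follows the RAPIDS docs, but check the version you install):

```python
import cupy as cp
from cuml.cluster import DBSCAN  # GPU DBSCAN with a scikit-learn-like interface

# Random 2-D points living in GPU memory
X = cp.random.rand(100_000, 2).astype(cp.float32)

# eps / min_samples chosen arbitrarily for the sketch
db = DBSCAN(eps=0.05, min_samples=10)
labels = db.fit_predict(X)  # cluster label per point, -1 for noise
print(labels[:20])
```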
JocialSusticeWarrior t1_j2ed75n wrote
unfortunate name "cuml"
Realistic-Bed2658 OP t1_j2cld4m wrote
Totally understandable. I only use nvidia at work too.
Thanks for the info about the nvidia package!
AmbitiousTour t1_j2c637s wrote
Gradient boosting hugely benefits from a GPU.
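For example, a minimal sketch with XGBoost's GPU histogram tree method (the synthetic data and hyperparameters are placeholders; `tree_method="gpu_hist"` is the documented switch in XGBoost 1.x, and recent releases prefer `device="cuda"` with `tree_method="hist"`):

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200_000, n_features=100, random_state=0)

# Train boosted trees on the GPU; drop to tree_method="hist" on CPU-only machines
clf = xgb.XGBClassifier(
    n_estimators=500,
    max_depth=8,
    tree_method="gpu_hist",  # deprecated in favor of device="cuda" in newer XGBoost releases
)
clf.fit(X, y)
```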