Submitted by jesusfbes t3_yexifs in MachineLearning
Is there any computationally efficient implementation of clustering methods, beyond vanilla kmeans, for lot of data? Let's say 1M data points of 100s dimensions. I know about sci-kit learn and a couple others but I wonder if the will suit that data sizes.
sapnupuasop t1_iu0pqof wrote
Does clustering on 100s of features make sense? Maybe spark could solve your problems but you must look if there are spark implementations of other algorithms