Gere1

Gere1 t1_ithw25k wrote

I agree. If anything, there are "academic" arguments for ROC curves being rather good than bad for imbalanced data.

I think there are a lot of low-quality data science blog posts out there. In the end, only something with measurable success (like an ML competition win) indicates something worth looking into.

1

Gere1 t1_itbcbfq wrote

Machinelearningmastery is rather shallow, but its author spreads and monetizes it aggressively, and search engines pick up on that.

But if you believe P-R curves are bad for imbalanced data, then you are just as mistaken. For example, precision and recall are exactly what you need for fraud detection. ML isn't about opinions and hand-wavy reasoning about math, but about getting results that work in the real world.
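
For concreteness, here's a minimal sketch (synthetic data and a placeholder model, nothing from this thread) of why you'd look at both metrics on an imbalanced, fraud-like problem:

```python
# Minimal sketch: ROC-AUC vs. average precision (PR-AUC) on a heavily
# imbalanced, fraud-like dataset. Dataset and model are placeholders.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, average_precision_score

# ~1% positives, roughly the regime of fraud detection
X, y = make_classification(n_samples=50_000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

# ROC-AUC can look comfortable while average precision stays low, because
# precision directly reflects the cost of false alarms on the rare class.
print("ROC-AUC:          ", roc_auc_score(y_te, scores))
print("Average precision:", average_precision_score(y_te, scores))
```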

Now, you are asking for blogs specifically. What type of information would you like to see? Learning the basics like ROC curves is probably better done from books or practice than by waiting for blog posts. For more research-level information there are many blogs, but it depends on the field (CV, NLP, ...). For a regular overview of what's happening you could look into https://jack-clark.net/ or https://datamachina.substack.com/

3

Gere1 t1_it1qipu wrote

Don't miss tuned ARIMA and ETS (e.g. statsmodels). Include a library that has N-BEATS and N-HiTS (darts, gluonts). Tbh, darts seems to cover all of them. Maybe DeepAR (gluonts). Most models don't do real multivariate forecasts, though.

Set up an honest evaluation and test all models. You can do some light pre-processing of the data, but don't spend too much time on it.

There aren't any magic tricks. Most methods won't beat a trivial baseline. Predicting the future usually does not work because the data lacks a predictive signal. How is the model going to know what Musk will tweet tomorrow? The only thing that works is fitting boring seasonality and fitting the effect of holidays etc. You see that neither of these is actually about the future.
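
To make "honest evaluation" and "trivial baseline" concrete, here's a minimal sketch (synthetic data, untuned placeholder orders, statsmodels only) of the kind of comparison I mean; the darts/gluonts models would slot into the same loop:

```python
# Minimal sketch of an honest comparison: classical statsmodels models vs. a
# trivial seasonal-naive baseline on a strictly chronological hold-out.
# The synthetic series and the model orders are illustrative, not tuned.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic monthly series: trend + yearly seasonality + noise.
n, period, h = 96, 12, 12
idx = pd.date_range("2015-01-01", periods=n, freq="MS")
rng = np.random.default_rng(0)
y = pd.Series(0.5 * np.arange(n)
              + 10 * np.sin(2 * np.pi * np.arange(n) / period)
              + rng.normal(0, 2, n), index=idx)

train, test = y.iloc[:-h], y.iloc[-h:]

def mae(forecast):
    return float(np.mean(np.abs(test.to_numpy() - np.asarray(forecast))))

# Trivial baseline: repeat the last observed seasonal cycle.
seasonal_naive = np.tile(train.iloc[-period:].to_numpy(),
                         int(np.ceil(h / period)))[:h]

# Classical models fitted on the training window only.
arima_fc = ARIMA(train, order=(1, 1, 1),
                 seasonal_order=(1, 1, 1, period)).fit().forecast(h)
ets_fc = ExponentialSmoothing(train, trend="add", seasonal="add",
                              seasonal_periods=period).fit().forecast(h)

print("seasonal naive MAE:", mae(seasonal_naive))
print("ARIMA MAE:         ", mae(arima_fc))
print("ETS MAE:           ", mae(ets_fc))
# Anything that does not clearly beat the seasonal-naive number isn't adding value.
```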

Let us know what worked in a critical evaluation in the end!

Here is a write-up (https://www.sciencedirect.com/journal/international-journal-of-forecasting/vol/38/issue/4) of the M5 competition (https://www.kaggle.com/competitions/m5-forecasting-accuracy). But note that in that competition it was more about fitting holidays and other tricks. There were a lot of zeros in the target. To predict the trend, many people used a guessed fudge factor. Also look at the difference between the public and private leaderboards to convince yourself that the predictions of the top Kagglers in the world seemed to be a noisy mess for that data set. I'm afraid predicting the future isn't solved yet.

6

Gere1 t1_iswqj2w wrote

I could chime in with the usual "these big, wealthy companies are all stupid, they don't appreciate brilliant work and you should be glad they did not hire you", but then you might just run into the same situation again next time. Your choice is to either acknowledge another weakness or repeat the situation. Here is an alternative view.

I do find that 30 minutes is far too short to get a noiseless evaluation of candidates. Little things can trip up even the most experienced coder, and the pressure makes things worse.

However, Leetcode exercises general coding, not ML coding in particular. If you had been interested in practicing on Kaggle, you would undoubtedly have known the Pandas shortcuts. They correctly concluded that you had coding experience, but as a specialist for NLP rather than as an all-rounder for ML. They may even have thought that you had not previously shown interest in ML beyond your assigned work.

The attitude of doing "projects from scratch" and finding Pandas "shitty" hints that you may be over-engineering opinionated code, and no company wants to pay coders who waste time and are opinionated about their work. Maybe the interviewer sensed that. Companies don't necessarily tell you all the reasons why they rejected someone.

You may find everything I wrote BS, but if you don't consider it even the slightest bit, that only confirms what I wrote, and the next interviewer may sense it again.

2

Gere1 t1_irj4sg3 wrote

Is it useful, though? I had the impression that the Strassen algorithm is already an optimization, and yet I'm not aware of it being used in practice on GPUs. Am I wrong and it is used on NVIDIA GPUs, or is it a gimmick not worth building hardware for? Maybe it's easier to do conventional matrix multiplication on the hardware and parallelize that?
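
For reference, here's a minimal numpy sketch (purely illustrative) of one level of Strassen's recursion: seven block multiplications instead of eight, paid for with a pile of extra additions and less regular memory access, which may be part of why GPUs mostly stick with conventional tiled multiplication.

```python
# One level of Strassen's recursion: 7 block multiplications instead of 8,
# at the cost of 18 block additions/subtractions. Assumes even dimensions.
import numpy as np

def strassen_one_level(A, B):
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]

    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)

    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

A, B = np.random.rand(256, 256), np.random.rand(256, 256)
assert np.allclose(strassen_one_level(A, B), A @ B)
```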

1