Submitted by fujidaiti t3_10pu9eh in MachineLearning
qalis t1_j6mczg1 wrote
Absolutely not! There is still a lot of research going into traditional ML methods. For tabular data, traditional ML is typically vastly superior to deep learning. Boosting models in particular receive a lot of attention, thanks to the very good implementations available. See for example:
- SketchBoost, CuPy-based boosting from NeurIPS 2022, aimed at incredibly fast multioutput classification
- A Short Chronology Of Deep Learning For Tabular Data by Sebastian Raschka, a great literature overview of deep learning on tabular data; spoiler: it does not work, and XGBoost or similar models are just better
- in time series forecasting, LightGBM-based ensembles typically beat all deep learning methods while being much faster to train; see e.g. this paper, and you can also see it in Kaggle competitions and other papers. My friend works in this area at NVIDIA, and their internal benchmarks (soon to be published) show that the top 8 models in a large-scale comparison are in fact various LightGBM ensemble variants, not deep learning models (which, in fact, kinda disappointed them, since it's, you know, NVIDIA)
- all domains requiring high interpretability largely ignore deep learning and put their research effort into traditional ML; see e.g. counterfactual examples, an important interpretability method in finance, or rule-based learning, important in medical or legal applications
silentsnake t1_j6mk21a wrote
To add on, most real-world business data is tabular.
qalis t1_j6mmvwg wrote
Absolutely. OP's question was about research, so I did not include this, but it's absolutely true. It also makes sense: everyone has relational DBs, they are cheap and scalable, so chances are a business already has quite reasonable data for ML just waiting in its tabular database. This, of course, means money, which means money for research, even in-company research that may never be published but is research nonetheless.
MrAcurite t1_j6msvtn wrote
The customers I build models for insist on interpretability and robustness, which deep learning just doesn't give them right now. Actually just got a conference paper out of a modification to a classical method, which was kinda fun.
aschroeder91 t1_j6mys1h wrote
Good to hear! Do you know what the space of hybrid models looks like? Specifically, using deep learning to turn raw input signals into features, and classical machine learning algorithms (e.g. gradient-boosted trees) for the downstream prediction.
My intuition says that hybrid models definitely have a role in general problem solving machines. I've tried searching this topic and the space is muddy at best.
beanhead0321 t1_j6nbq1j wrote
I remember sitting in on a talk from a large insurance company who did this a number of years back. They used DL for feature engineering, but used traditional ML for the predictive model itself. This had to do with satisfying some regulatory requirements around model interpretability.
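A minimal sketch of that pattern, using only scikit-learn: a small MLP is trained on the raw features, its hidden layer is reused as a learned feature extractor, and an interpretable logistic regression is fit on those features. The insurance company's actual stack is not described in the comment, so everything below (dataset, layer sizes, choice of final model) is illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# Step 1: train a small neural net on the raw features.
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
mlp.fit(X, y)

# Step 2: reuse its hidden layer as a feature extractor
# (ReLU activations computed manually from the fitted weights).
def hidden_features(X):
    return np.maximum(0.0, X @ mlp.coefs_[0] + mlp.intercepts_[0])

# Step 3: fit an interpretable model on the learned features.
clf = LogisticRegression(max_iter=1000).fit(hidden_features(X), y)
acc = clf.score(hidden_features(X), y)
print("train accuracy:", acc)
```

The final model's coefficients are now attached to learned features rather than raw inputs, which is exactly the interpretability caveat raised in this thread.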
aschroeder91 t1_j6nliim wrote
the real innovation will be comfortably backpropagating through the hybrid model as a whole
ktpr t1_j6nmsol wrote
Did they claim traditional ML explained the features engineered by the DL? If so, how did they explain the units of feature variables?
qalis t1_j6o6jjv wrote
The simplest way to do this is combining autoencoders (e.g. VAEs) and boosting, I have seen this multiple times on Kaggle.
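A sketch of that combination, substituting a plain autoencoder for a VAE to keep it short (the learned latent codes are concatenated with the original features before boosting, a pattern seen in Kaggle solutions; the dataset and dimensions here are made up):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.neural_network import MLPRegressor

X, y = make_classification(n_samples=500, n_features=20, random_state=1)

# Plain autoencoder: train the net to reconstruct its own input
# through an 8-unit bottleneck.
ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=800, random_state=1)
ae.fit(X, X)

# Latent codes = bottleneck activations (ReLU, from the fitted weights).
Z = np.maximum(0.0, X @ ae.coefs_[0] + ae.intercepts_[0])

# Boosting model on [original features + latent codes].
X_aug = np.hstack([X, Z])
gbm = GradientBoostingClassifier(random_state=1).fit(X_aug, y)
print("train accuracy:", gbm.score(X_aug, y))
```

A VAE variant would replace the bottleneck with a sampled latent distribution, but the downstream boosting step is identical.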
coffeecoffeecoffeee t1_j6o260e wrote
For interpretable ML, I really like what Cynthia Rudin's lab at Duke has been putting out. They have a great paper on building ML models that generate rules with integer scores for classification, like what doctors typically use (Arxiv).
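The flavor of those models can be shown with a hand-written risk scorecard; the rules, point values, and threshold below are invented for illustration, not taken from the paper. Each satisfied rule contributes a small integer, and the total maps to a predicted class, which is what makes such models auditable by hand.

```python
# Hypothetical integer scorecard in the style of clinical risk scores;
# rules and points are made up for illustration.
RULES = [
    ("age > 60",   lambda p: p["age"] > 60,   2),
    ("bp >= 140",  lambda p: p["bp"] >= 140,  3),
    ("smoker",     lambda p: p["smoker"],     2),
    ("bmi >= 30",  lambda p: p["bmi"] >= 30,  1),
]
THRESHOLD = 4  # total score >= threshold -> high risk

def score(patient):
    return sum(points for _, rule, points in RULES if rule(patient))

def classify(patient):
    return "high risk" if score(patient) >= THRESHOLD else "low risk"

patient = {"age": 67, "bp": 150, "smoker": False, "bmi": 27}
print(score(patient), classify(patient))  # score 5 -> high risk
```

The research contribution in the cited work is learning such rules and integer weights directly from data under sparsity constraints, rather than writing them by hand.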
qalis t1_j6o5zha wrote
Yeah, I like her work. The iModels library (linked in my comment under the "rule-based learning" link) is also written by her coworkers IIRC, or at least implements a lot of models from her work. Although I disagree with her arguments in "Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead", the paper she is arguably best known for.
data_wizard_1867 t1_j6n6m7q wrote
Another addendum to this fantastic answer: lots of work in uplift modelling also uses traditional ML methods (related to your counterfactual point) and will likely continue to do so.
ApprehensiveNature69 t1_j6nmxbc wrote
Tangential, but I am so happy cupy is still being developed even though Chainer died.
nucLeaRStarcraft t1_j6nunti wrote
There's also this survey of DL vs traditional methods for tabular data: https://arxiv.org/pdf/2110.01889.pdf
qalis t1_j6o6cou wrote
That's a nice paper. There is also an interesting but very niche line of work using gradient boosting as a classification head for neural networks. The gradient flows through it normally, after all; trees are just added instead of taking gradient descent steps. Sadly, I could not find any trustworthy open-source implementation of this approach. If it works, it could bridge the gap between deep learning and boosting models.