Submitted by BadKarma-18 t3_z8fdoh in MachineLearning
Also, how are you solving data availability problems in your projects or at work?
Data scarcity is a problem of methods, not data.
Starting about a decade ago, cheap hardware made it possible to process vast datasets, allowing for models with more degrees of freedom. These models in turn created demand for massive amounts of human-labeled data. It's questionable whether all this crunching has led to an improved understanding of the world, although we now have machines that can mimic humans a lot better than they used to. The whole exercise of iterating over ever-bigger models and ever-bigger data, without any increase in fundamental scientific understanding, feels as pointless as bitcoin mining.
What is holding back AI/ML is that we continue to define intelligence the way Turing did back in 1950 (making machines that can pass as human) and keep chasing big data, especially human-labeled data with its attendant subjectivity and pointlessness. Essentially, we are getting hung up on local minima in the search for intelligence.
Powerful processors on the go: serverless GPUs. Lots of startups are growing in this dimension, though only linearly.
> hung up on local minima
We are the local minima that we seek.
I do agree that current ML systems require much larger datasets than we would like. I doubt the typical human hears more than a million words of English in their childhood, yet they know the language much better than GPT-3 does after reading billions of pages of it.
> What is holding back AI/ML is that we continue to define intelligence the way Turing did back in 1950 (making machines that can pass as human)
But I don't agree with this. Nobody is seriously using the Turing test anymore, these days AI/ML is about concrete problems and specific tasks. The goal isn't to pass as human, it's to solve whatever problem is in front of you.
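For a sense of scale on the data-efficiency point above, here is a rough back-of-envelope comparison; every number is a loose assumption for illustration, not a measurement:

```python
# Rough back-of-envelope: childhood language exposure vs. GPT-3's
# training data. All figures are loose assumptions.
words_per_day = 10_000           # assumed words a child hears per day
childhood_years = 13             # assumed span of "childhood"
human_words = words_per_day * 365 * childhood_years

gpt3_tokens = 300_000_000_000    # ~300B training tokens, per the GPT-3 paper

print(f"human exposure: ~{human_words:,} words")    # ~47 million
print(f"GPT-3 exposure: ~{gpt3_tokens:,} tokens")
print(f"ratio: ~{gpt3_tokens // human_words:,}x")   # thousands of times more
```

Even with a generous estimate of what a child hears, the gap is three to four orders of magnitude.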
I like how you think.
Though we are very far off from understanding consciousness.
I feel like what Roger Penrose is doing is closer to what you are describing.
Data science cares more about output than about the science behind the human brain. Though I do think neural networks are very interesting.
From a theoretical perspective: our continued lack of understanding of what consciousness or intelligence is. Right now our models are nothing more than fancy correlation machines. You never had to shuffle through a million images of dogs or cats to know which is which; the first few times you saw a cat or a dog, you understood what it was and how to identify it. Maybe we are thinking about intelligence all wrong, that is, if you're interested in machines doing tasks the way humans do, instead of just doing tasks that humans do.
We don't even understand what our own consciousness is yet. There is no grand unified theory in cognitive science to assess whether a machine can even become conscious. Until then we are stuck doing more mathematical sleight-of-hand to squeeze another 0.2% of accuracy out of the SotA model.
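One hint that the field knows this is a problem: few-shot approaches that lean on a frozen pretrained backbone instead of a million labeled images. A minimal sketch of a nearest-class-mean (prototype) classifier, where the file names and the 5-shot setup are invented for illustration:

```python
# Few-shot cats vs. dogs: frozen pretrained features + nearest-class-mean.
# File names and the 5-shot setup are illustrative assumptions.
import torch
from torchvision import models
from PIL import Image

weights = models.ResNet18_Weights.DEFAULT
backbone = models.resnet18(weights=weights)
backbone.fc = torch.nn.Identity()   # strip the classifier, keep embeddings
backbone.eval()
preprocess = weights.transforms()

def embed(path: str) -> torch.Tensor:
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(img).squeeze(0)

# Five labeled examples per class instead of a million.
support = {
    "cat": [embed(f"cat_{i}.jpg") for i in range(5)],
    "dog": [embed(f"dog_{i}.jpg") for i in range(5)],
}
prototypes = {label: torch.stack(vecs).mean(0) for label, vecs in support.items()}

def classify(path: str) -> str:
    query = embed(path)
    return min(prototypes, key=lambda label: torch.dist(query, prototypes[label]))

print(classify("mystery_pet.jpg"))
```

The caveat is that the backbone was itself pretrained on millions of images, so the "few-shot" learning rides on a huge prior, much as a child rides on evolutionary ones.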
Actually having a use case
In my opinion that is a very good answer, not only from a philosophical point of view but also from a mathematical vantage point. I think cognitive scientists understand how far we are from human-like machines. Tech communities, however, stuck with the term "intelligence" because it is fancy and because it sounds very nice when you want to promote your research or your product. What in the world makes a machine driving a car "intelligent" when we would not call a human intelligent merely for driving one? There are a lot of fundamental differences between the human brain and machine "intelligence".

In the end, machine learning and AI are constrained and tethered within a mathematical framework, and you can't bypass that. And it turns out we don't even understand the mathematics of these models: we don't know what class of functions a neural network can approximate in practice (in theory it can approximate any continuous function, but that theorem is extremely broad), what kinds of regularities the target function should have, how these models exploit the symmetries of a problem, or what principles and strategies a model pursues to solve its task; and, if we had a deep understanding of these topics, could we then solve the problem without learning at all? So I guess our lack of mathematical understanding of these models is the real obstacle. Understanding intelligence and cognition matters if you want to build human-like machines, but nowadays I don't think we have those; what we have are human-task-imitating machines.
We simply don't really understand the math behind it. (And please, I'm not talking about matrix multiplication and taking derivatives.) https://arxiv.org/abs/2105.04026
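To make the universal approximation point above concrete, here is a tiny numpy sketch: a single hidden layer of random ReLU features, with only the output weights fit by least squares, approximating a continuous function. The target function and the width are arbitrary choices. It shows that such a network exists; it says nothing about whether gradient descent would find it, which is exactly the kind of gap being described:

```python
# Universal approximation, illustrated: random ReLU features + least squares.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 400).reshape(-1, 1)
y = np.sin(2 * x) + 0.3 * x**2                # an arbitrary continuous target

width = 200                                   # hidden units
W = rng.normal(size=(1, width))               # random input weights, kept fixed
b = rng.normal(size=width)
H = np.maximum(x @ W + b, 0.0)                # hidden-layer ReLU activations

coef, *_ = np.linalg.lstsq(H, y, rcond=None)  # fit only the output layer
y_hat = H @ coef

print("max abs error:", float(np.max(np.abs(y - y_hat))))
```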
>The goal isn't to pass as human, it's to solve whatever problem is in front of you.
It's worth disambiguating between solving specific business problems, and creating intelligent (meaning broadly generalizing) programs that can solve problems. For the former, what Francois Chollet calls cognitive automation is often sufficient, if you can get enough data, and we're making great progress. For the latter, we haven't made much progress, and few people are even working on it. Lots of people are working on the former, and deluding themselves that one day it will magically become the latter.
One major problem is the lack of diversity in ML/DL research, and I don't mean this only in a social sense. Most major development is led by FAANG or by research labs doing FAANG-ish work, but real-world ML/DL work doesn't come with trillion-token datasets or fuss-free GPU budgets. Industrial AI, for example, is underperforming relative to the advances in certain sciences and in general B2C areas like NLP or recsys.
Even computer vision, long thought to be solved, still struggles to provide great solutions in many applications, for example in segmenting artifacts in used catalysts.
We need more folks from industrial and real-life areas working with ML on small data, extremely sparse phenomena, and complex natural-science systems, in an interdisciplinary sense.
On a completely different but related note: if you look at ML for climate change, it's far from what's required to actually make a difference. Stuff like using NLP for catalysts, or ConvLSTM for weather like Google's MetNet, makes for great PR, but it's useless in the greater scheme of things. None of those ideas get us to developing and shipping climate-tech solutions in the short term. Perhaps if we had more multidisciplinary teams, both in research and in management (since decisions are generally made by non-tech folks), we might have much better outcomes.
Narrow AI still has tremendous potential to change our world for the better. I feel we are in the early stages of a Cambrian-explosion era of narrow AI.
The hottest problems in NLP, computer vision, even self-driving cars, are almost solely defined in terms of how well a machine can mimic a human.
> I doubt the typical human hears more than a million words of English in their childhood, yet they know the language much better than GPT-3 does after reading billions of pages of it.
But is this a fair comparison? I am far from an expert in evolution, but I assume we have some evolutionarily encoded bias that makes language easier to learn, whereas ML systems have to start from zero.
Well, fair or not, it's a real challenge for ML since large datasets are hard to collect and expensive to train on.
It would be really nice to be able to learn generalizable ideas from small datasets.
That's correct. But to define the bare minimum, you need a baseline, and I just wanted to say that humans are a bad baseline because we have "training data" encoded in our DNA. Furthermore, on tabular data, ML systems often outperform humans without needing that much training data.
But of course, needing less data to get good training results is always better. I would not argue with that.
Edit: Typos
I believe the lack of privacy-preserving models and systems in machine learning is also a cause for concern. In many applications we collect user data (the biggest companies in the world do this); if we were able to train models in a more secure manner (not just federated learning), people might be more inclined to share more information, provided they can control it and we can guarantee that no party apart from the user gets a peek at the raw data.
This is my two cents on the topic.
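For the record, one building block beyond federated learning is differential privacy. A very rough sketch of a DP-SGD-style step, clipping per-example gradients and adding Gaussian noise so no single user's data dominates the update; the clip norm and noise multiplier below are illustrative and not calibrated to any privacy budget:

```python
# DP-SGD-style gradient aggregation: clip each user's gradient, add noise.
# Hyperparameters are illustrative, not calibrated to a privacy budget.
import numpy as np

rng = np.random.default_rng(0)

def dp_average_gradient(per_example_grads, clip_norm=1.0, noise_mult=1.1):
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    noise = rng.normal(scale=noise_mult * clip_norm,
                       size=per_example_grads[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(per_example_grads)

grads = [rng.normal(size=10) for _ in range(32)]  # fake per-user gradients
print(dp_average_gradient(grads))
```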
There's too much awful AI. An endless number of AI applications foster discrimination, disinformation, surveillance, scams, weapons, and more. We need to turn the tables and apply AI to problems that matter: health, poverty, climate change, biodiversity, etc.
More awful AI: https://github.com/daviddao/awful-ai
What are you talking about? ML has been used in real-world use cases for ages: speech-to-text, machine translation, OCR/handwriting recognition, image generation, and more.
Chill out. Obviously ML is useful. The biggest companies in the world are only this big because of ML. I’m just saying that in my experience, many companies think they need it when they really just need data engineers and data analysts
Interpretability, causal ML, cost of training, out-of-distribution detection.
Also inherits every other problem that can plague statistical modeling.
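To pick one concrete item off that list, here is the classic max-softmax baseline for out-of-distribution detection (Hendrycks & Gimpel, 2017), sketched with made-up logits and an uncalibrated threshold:

```python
# Max-softmax OOD baseline: flag inputs on which the model is unconfident.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def is_ood(logits, threshold=0.7):
    # A low maximum class probability is treated as "not from the training
    # distribution"; the threshold would be tuned on held-out data.
    return softmax(np.asarray(logits, dtype=float)).max() < threshold

print(is_ood([4.0, 0.1, -1.2]))  # confident -> in-distribution (False)
print(is_ood([0.4, 0.3, 0.2]))   # diffuse   -> flagged as OOD (True)
```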