BellyDancerUrgot t1_j6zyiqm wrote

I’ll be honest, I don’t really know what FPGAs do or how they do it (I reckon they’re something like an ASIC for matrix operations?), but tensor cores already provide optimized matrix/tensor operations, and fp16 and mixed precision have been available for quite a few years now. Ada and Hopper even enable insane performance improvements for fp8 operations. Is there any real, verifiable benchmark that compares training and inference time of the two?
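For what it’s worth, the reason mixed precision needs care is easy to show: fp16 only has ~10 bits of mantissa, so small updates vanish, which is why frameworks keep fp32 master weights. A toy NumPy sketch (illustrative only, no tensor cores involved):

```python
import numpy as np

# fp16 has ~10 mantissa bits, so a small gradient update can vanish entirely
w16 = np.float16(1.0)
print(w16 + np.float16(1e-4))   # still 1.0: the update is below fp16 resolution

# mixed precision sidesteps this by accumulating in fp32 ("master weights")
w32 = np.float32(1.0)
print(w32 + np.float32(1e-4))   # 1.0001: the update survives
```

This is why "just cast everything to fp16" loses accuracy while mixed precision (fp16 compute, fp32 accumulation) usually doesn’t.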

On top of that there’s the obvious CUDA monopoly that NVIDIA keeps a tight leash on. Without software even the best hardware is useless, and almost everything is optimized to run on a CUDA backend.

0

BellyDancerUrgot t1_j6v9w0p wrote

Reply to comment by FastestLearner in Using Jupyter via GPU by AbCi16

Last I checked, a conda install of tensorflow-gpu didn’t pull the correct CUDA version for some reason, and it was annoying to roll back and then reinstall the correct CUDA and cuDNN versions. PyTorch is fking clean tho.
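If anyone hits the same thing: pinning the CUDA toolkit explicitly avoids most of the mismatch pain. Version numbers below are just examples, check the official install matrix for current ones:

```shell
# PyTorch with an explicitly pinned CUDA toolkit (per the official conda recipe)
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia

# TensorFlow's conda packages tend to lag; pip inside the env is often saner
pip install "tensorflow==2.11.*"

# sanity check that the GPU is actually visible
python -c "import torch; print(torch.cuda.is_available())"
```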

0

BellyDancerUrgot t1_j6bzv82 wrote

Reply to comment by gantork in Google not releasing MusicLM by Sieventer

Using scraped data for research does not violate copyright law. Monetizing it as a product for the public does. Most of the work done by Meta, Google, NVIDIA, and other big tech isn’t even available for public use, let alone monetized. But yeah, sure, whatever you say! I’ve realized the people on this sub with no real know-how about ML/DL or about laws and legal consequences are the ones that are the loudest.

Have a good day.

1

BellyDancerUrgot t1_j67sn0r wrote

Reply to comment by CypherLH in Google not releasing MusicLM by Sieventer

It isn’t at all, lol.

What I understand from our brief exchange:

-You have no idea about fair use or Creative Commons licensing. TDM exceptions apply to non-commercial uses, which is not the case here; scraping copyright-protected content is an infringement if it’s used for commercial purposes or to generate profit.

-You make dumb analogies because you don’t understand that representations in DL are effectively a photocopy of your data. You can’t remove an artist’s watermark and use their IP to generate revenue.

“Oh, but I can look at someone’s work and modify it a bit, and that’s fair use”: yes, except that’s not what’s happening here. Stop throwing random analogies around trying to connect the two. Your AI-generated art will have the same distribution as whatever input data it sourced during inference, which is the entire foundation of digital watermarking against generative diffusion and GAN models, an area that has picked up in popularity.

−3

BellyDancerUrgot t1_j67qd3w wrote

Reply to comment by CypherLH in Google not releasing MusicLM by Sieventer

Totally wrong. A neural network learns a representation from the data; it literally scans your work. The whole analogy of it “just looking” at your data is wrong. There’s a reason artists put watermarks and signatures on artwork hosted on various websites. Circumventing measures put in place to prevent misuse doesn’t make it legal; it just means the existing laws were inadequate.

Edit: FYI, there’s already work being done to trace back the datasets AI art models were trained on. It’s quite easy to do, since most GAN and diffusion models have distributions that get replicated in the output (because the outputs are derived from the representations learnt from the training dataset), making them easy to trace back.
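The “outputs follow the training distribution” point can be shown with a deliberately naive toy: fit a Gaussian to one “artist’s” data, sample from it, and the samples’ statistics give the source away. Purely illustrative NumPy, not an actual attribution method:

```python
import numpy as np

rng = np.random.default_rng(0)

# two "artists" with distinguishable styles (here: just different means)
artist_a = rng.normal(loc=0.0, scale=1.0, size=5000)
artist_b = rng.normal(loc=5.0, scale=1.0, size=5000)

# a maximally naive "generative model": fit a Gaussian to artist A's work
mu, sigma = artist_a.mean(), artist_a.std()
generated = rng.normal(loc=mu, scale=sigma, size=5000)

# the generated samples sit on A's distribution, not B's
print(abs(generated.mean() - artist_a.mean()))  # small
print(abs(generated.mean() - artist_b.mean()))  # large (~5)
```

Real attribution work uses far richer statistics than the mean, but the principle is the same: the generator inherits its training distribution.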

−1

BellyDancerUrgot t1_j67p0e9 wrote

A degree, maybe a part-time professional master’s from a school where the teaching faculty does active research. Or just read papers; Two Minute Papers is a good channel to start off with.

In ML/DL, one year is already ancient and four years is prehistoric lol. For context, pick a topic, say 2D-to-3D translation: since the advent of NeRFs a couple of years back, we have a stupid number of papers trying all sorts of novel approaches, everything from new ways of storing geometry in voxels, to geometry-aware GANs, to multi-view compression using ViTs, etc.

So choose a topic and focus on that otherwise it’s a lot.

4

BellyDancerUrgot t1_j67oidu wrote

Reply to comment by CypherLH in Google not releasing MusicLM by Sieventer

Yes, but these models were trained on publicly available data without consent. That’s the big legal problem, and imo an entirely fair one. Fair use falls flat in this argument lol.

Edit: for the people replying to my last comment:

The first mistake is comparing neural networks to the brain in this context.

And no, their output is not unique, because it follows the same distribution the representation was learnt from. Humans don’t do that. You can’t find a human analogy, because humans do not learn things the same way neural nets do.

Neural networks can’t actually extrapolate, because they have no physical intuition, just a large associative memory. You only think they can because you’re uneducated on the topic.

−2

BellyDancerUrgot t1_j3nq5pn wrote

There would be a 5-8% overhead for the same GPU in a bare VM vs. bare metal. An A100 is significantly faster for ML workloads than a 3090 iirc, so it’s probably something related to how it’s set up in your case. Also, try using a single GPU instead of distributed training if you are; MPI might be adding overhead in your compute node.
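That 5-8% figure is easy to sanity-check yourself: time the same fixed workload in both environments and compute the relative slowdown. A stdlib-only harness sketch (the workload below is a stand-in; swap in one training step of your actual model):

```python
import time

def time_workload(fn, repeats=5):
    """Return the best-of-N wall time for fn(), minimizing scheduler noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

def overhead_pct(bare_metal_s, vm_s):
    """Relative slowdown of the VM run vs the bare-metal run, in percent."""
    return (vm_s - bare_metal_s) / bare_metal_s * 100.0

# placeholder CPU workload; replace with one training step on your model
workload = lambda: sum(i * i for i in range(100_000))
t = time_workload(workload)
print(f"workload took {t:.4f}s; e.g. 1.00s bare vs 1.06s VM -> "
      f"{overhead_pct(1.00, 1.06):.1f}% overhead")
```

If the measured gap is well above ~8%, the VM/driver setup (passthrough mode, NUMA pinning, MPI launch config) is the first place to look.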

2

BellyDancerUrgot t1_j3efn9s wrote

I don’t have a PhD either lol. Your beliefs aren’t meaningless either; nobody actually knows what breakthrough we might have next. I do consider ChatGPT a breakthrough tbh (using RL to fine-tune an LLM). VQA was a breakthrough imo. GANs were also a breakthrough. All of these came about the way the post suggests, but without hardware or funding you would never see it all come together.

There are people like Blake Richards working at the boundary of neuroscience and AI, but it’s hard to work in any of those fields without math as the underlying structure. Even if you want to approach it in an entirely new way, it’s hard to do that without knowing the approaches that already exist, which requires a lot of math regardless. You can do that without a degree for sure tho; that wasn’t my point. It’s just super hard without guidance, and the primary topic of this post is working on smaller problems without any funding. I don’t see how that works, and I don’t see any actual pragmatic answers here from OP either.

1

BellyDancerUrgot t1_j3dtr99 wrote

Do share what your beliefs are, then. What exactly is AI without math? Just curious, since you have the tag of a researcher. What field are you working in? I’m not suggesting that a PhD is necessary; a degree is just an indicator of your work. But PhD-level work is necessary to achieve anything meaningful in this field. The post is very hand-wavy and aimless. Andrew Ng and Khan Academy are not enough to invent the next big thing, however small it is. Read up on the Mish activation: the guy who did that did so before even getting a master’s degree, but only because he’s a genius who could understand grad-level math barely out of high school.
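For reference, Mish itself is a one-liner, `mish(x) = x * tanh(softplus(x))`. A stdlib sketch:

```python
import math

def softplus(x: float) -> float:
    # log(1 + e^x), the smooth approximation of ReLU
    return math.log1p(math.exp(x))

def mish(x: float) -> float:
    # Mish activation: smooth, non-monotonic, unbounded above
    return x * math.tanh(softplus(x))

print(mish(0.0))   # 0.0
print(mish(10.0))  # ~10: behaves like the identity for large positive x
```

The point stands either way: coming up with it, and arguing why it trains well, took real mathematical maturity, not just a tutorial.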

1

BellyDancerUrgot t1_j311o8o wrote

No, because humans do not hallucinate information and can derive conclusions from cause and effect on subjects they haven’t seen before. LLMs can’t even differentiate between cause and effect without memorizing patterns, something humans do naturally.

And no, human beings in fact do not parrot information. I can reason about subjects I have never studied, because humans actually understand words rather than memorizing spatial context. It’s like we’re back at the stage when people thought we had finally developed AGI, back when Goodfellow’s GAN paper was published in 2014.

If you actually get off the hype train, you’ll realize most major industries use gradient boosting and achieve almost the same generalization performance for their needs as an LLM trained on giga-fking-tons of data. Because LLMs can’t generalize well at all.
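On the gradient boosting point, the whole technique fits in a few lines: repeatedly fit a weak learner (here, a depth-1 decision stump) to the current residuals and add a damped copy of it to the ensemble. A pure-Python toy regressor for squared loss, not a production implementation:

```python
def fit_stump(xs, residuals):
    """Best single-threshold split (a depth-1 tree) by squared error; xs sorted."""
    best = None
    for i in range(len(xs) - 1):
        thr = (xs[i] + xs[i + 1]) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, lmean, rmean)
    return best[1:]

def fit_gbm(xs, ys, rounds=50, lr=0.1):
    """Gradient boosting: for squared loss the residual IS the negative gradient."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, preds)]
        thr, lmean, rmean = fit_stump(xs, resid)
        stumps.append((thr, lmean, rmean))
        preds = [p + lr * (lmean if x <= thr else rmean)
                 for x, p in zip(xs, preds)]
    return base, lr, stumps

def predict(model, x):
    base, lr, stumps = model
    return base + sum(lr * (lmean if x <= thr else rmean)
                      for thr, lmean, rmean in stumps)

# toy step-function data: the ensemble recovers it from residuals alone
model = fit_gbm([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 1.0])
print(round(predict(model, 0.5), 2), round(predict(model, 2.5), 2))
```

Libraries like XGBoost and LightGBM add deeper trees, regularization, and histogram tricks, but this residual-fitting loop is the core of what most tabular-data industries actually run.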

1