BoiElroy
BoiElroy t1_j9ipbtg wrote
This is not the answer to your question, but one intuition I like about the universal approximation theorem, and thought I'd share, is a comparison to a digital image. You use a finite set of pixels, each of which can take on a certain set of discrete values. With a 10 x 10 grid of pixels you can draw a crude approximation of a stick figure. With 1000 x 1000 you can capture a blurry but recognizable selfie. Within those finite pixels and the discrete values they can take, you can essentially capture anything you can dream of: every frame of every movie ever made. Obviously there are other issues later, like whether your model's operational design domain matches the distribution of the training data, or whether you just wasted a lot of GPU hours lol
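A rough numpy sketch of the analogy (my own construction, not from the original comment): approximate a smooth 2-D "scene" with an n x n pixel grid and watch the approximation error shrink as resolution grows, much like capacity in a network.

```python
import numpy as np

def scene(x, y):
    # a smooth, hypothetical target "image" on [0, 1]^2
    return np.sin(6 * x) * np.cos(4 * y)

def pixelate_error(n):
    # sample the scene at the centers of an n x n pixel grid
    c = (np.arange(n) + 0.5) / n
    xs, ys = np.meshgrid(c, c)
    coarse = scene(xs, ys)
    # evaluate the piecewise-constant "image" on a fine reference grid
    f = (np.arange(1000) + 0.5) / 1000
    fx, fy = np.meshgrid(f, f)
    approx = coarse[np.minimum((fy * n).astype(int), n - 1),
                    np.minimum((fx * n).astype(int), n - 1)]
    return np.abs(scene(fx, fy) - approx).mean()

# mean error drops as the grid gets finer: stick figure -> selfie
errs = [pixelate_error(n) for n in (10, 100, 1000)]
```

Same spirit as the theorem: a finite, discrete representation can get arbitrarily close to the target if you let the resolution (capacity) grow.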
BoiElroy t1_j9ioqcz wrote
Reply to comment by relevantmeemayhere in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Yeah, you should always exhaust existing classical methods before reaching for deep learning.
BoiElroy t1_j991yix wrote
If you haven't already started, I'd begin with the standard engineering statistical quality control (SQC) stuff. AI/ML is great, but honestly only once the existing classical techniques are no longer sufficient.
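As a concrete starting point, here's a minimal SQC sketch (my own illustration, not from the comment): a Shewhart-style chart that flags points outside 3-sigma control limits, using the sample standard deviation for simplicity.

```python
import numpy as np

def control_limits(samples):
    # 3-sigma limits estimated from an in-control baseline
    mean = np.mean(samples)
    sigma = np.std(samples, ddof=1)
    return mean - 3 * sigma, mean + 3 * sigma

def out_of_control(samples, lcl, ucl):
    # indices of measurements outside the control limits
    samples = np.asarray(samples)
    return np.flatnonzero((samples < lcl) | (samples > ucl))

rng = np.random.default_rng(0)
baseline = rng.normal(loc=10.0, scale=0.5, size=200)  # in-control process
lcl, ucl = control_limits(baseline)
shifted = np.append(baseline[:50], 14.0)  # inject an obvious defect
flagged = out_of_control(shifted, lcl, ucl)  # includes index 50
```

Classical charts like this are cheap, interpretable, and often enough — exactly the kind of thing to exhaust before training a model.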
BoiElroy t1_j12qq57 wrote
Reply to [D] Techniques to optimize a model when the loss over the training dataset has a Power Law type curve. by Dartagnjan
!RemindMe 7 days
BoiElroy t1_ixzz4z3 wrote
Reply to Deep Learning for Computer Vision: Workstation or some service like AWS? by Character-Ad9862
Honestly, look up Paperspace Gradient and consider their monthly service. They have a tier where you can quite routinely get decent free GPUs, which is perfect when you're just working up code, refactoring, and making sure a training run will actually run. Then, when you're ready to let something run overnight, you select an A6000 or whatever, and it's reasonably priced.
BoiElroy t1_jecx0mw wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Is it "open source" though? ...
If anyone knows, I'd also be curious: if you took a model that was not open source and fine-tuned it, unfreezing the weights of some intermediate layers, would it always remain not open source because of its initial state?
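Setting the licensing question aside, the fine-tuning setup described is straightforward mechanically. A hypothetical PyTorch sketch (toy model and layer choice are mine, purely illustrative): freeze everything, then unfreeze just an intermediate block.

```python
import torch
import torch.nn as nn

# toy stand-in for a pretrained model (not LLaMA; just for illustration)
model = nn.Sequential(
    nn.Linear(16, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),  # say we unfreeze this intermediate layer
    nn.Linear(32, 4),
)

# freeze all weights, then selectively unfreeze the chosen layer
for p in model.parameters():
    p.requires_grad = False
for p in model[2].parameters():
    p.requires_grad = True

# only the unfrozen parameters are handed to the optimizer
trainable = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.Adam(trainable, lr=1e-4)
```

The rest of the network still carries the original pretrained weights verbatim, which is presumably why the licensing question bites regardless of how much you unfreeze.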