Real_Revenue_4741 t1_itddwyd wrote on October 22, 2022 at 8:01 PM

Reply to comment by hellrail in [D] What things did you learn in ML theory that are, in practice, different? by 4bedoe

I believe you are looking at the wrong slides. Reddit did something weird with the hyperlink.

hellrail t1_itde28m wrote on October 22, 2022 at 8:02 PM

Then please point me to the right slide by giving the slide number.

Real_Revenue_4741 t1_itdehtz wrote on October 22, 2022 at 8:05 PM

It should be from MIT (try copying and pasting the address linked above).

hellrail t1_itewyaw wrote on October 23, 2022 at 3:06 AM

One thing I must add regarding the topic of presentation as "established knowledge".

The lecture you quoted is lecture number 12. It is embedded in a course. There are, of course, lectures 11, 10, 9, etc. If you check these, which are also accessible by slightly modifying the given link, you see the context of this lecture. Specifically, a bunch of classifiers are explicitly introduced, and the VC-dimension theory in lecture 12 is still valid for these. The course does not address deep networks yet.

So it's a bit unfair to say this lecture teaches you a theory that deviates from practice. It does not deviate for the classifiers introduced there.

hellrail t1_itdlgvb wrote on October 22, 2022 at 8:54 PM

OK, found the right one.

Well, generally I must say: good example. I accept it at least as a very interesting example to talk about, worth mentioning in this context.

Nevertheless, it's still valid for all models that are NOT CNNs, ResNets, or transformers.

Given that it's based on an old theory (prior to 1990), from a time when these deep networks did not exist yet, one might take its limitations into account (it doesn't try to model effects that take place during the training of such complex deep models, which wasn't a topic back then).

So if I were really mean, I would say you can't expect a theory to make predictions about entities (in this case modern deep networks) that had not been invented yet. One could say that the assumptions of VC-dimension theory include a "perfect" learning procedure (and therefore exclude any dynamic effects of the learning procedure), which is still valid for decision trees, random forests, SVMs, etc., which remain relevant for many problems.

But since I'm not that mean, I admit that these observations in modern networks do undermine the practicability of the VC-dimension view for modern deep networks of the mentioned types, and that this must have been a moderate surprise when people first tried out whether VC dimensions work for CNNs/ResNets/transformers; therefore, good example.
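
For reference, the classical bound this discussion revolves around is usually stated roughly as follows. This is a sketch in the common textbook (Vapnik) form, not copied from the linked slides; the symbols $R$, $\hat{R}_n$, $d$, $n$ and $\delta$ are notation assumed here, and the exact constants differ between sources.

```latex
% Assumed notation: R(h) = true risk, \hat{R}_n(h) = training error (0-1 loss),
% d = VC dimension of the hypothesis class H, n = number of i.i.d. samples (n > d).
% With probability at least 1 - \delta, simultaneously for every h in H:
R(h) \;\le\; \hat{R}_n(h) + \sqrt{\frac{d\left(\ln\frac{2n}{d} + 1\right) + \ln\frac{4}{\delta}}{n}}
```

The right-hand side depends only on $d$ and $n$, not on how $h$ was found, which is the sense in which the learning procedure plays no role. Once $d$ is comparable to $n$, the square-root term exceeds 1 and the bound becomes vacuous for a 0-1 loss, which is the regime heavily overparameterized deep networks are usually in.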