Submitted by CosmicTardigrades t3_10d3t41 in MachineLearning
[deleted] t1_j4jhufr wrote
[deleted]
CosmicTardigrades OP t1_j4jlzsm wrote
I know linear algebra, calculus, and probability well. And yes, I'm treating ChatGPT as a black box: not the training algorithm, but the billions of parameters it learned from the corpus. As far as I know, most AI researchers treat those as a black box too; if you follow AI research, the interpretability of DL models is a long-standing open problem. In short, they are hard to understand. We do have some intuition about DL models: a CNN's filters capture image features at different levels, and a transformer's Q-K-V matrices implement attention. What I'm asking is why such designs outperform traditional NLP methods by so much.
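To make the Q-K-V remark concrete, here is a minimal NumPy sketch of scaled dot-product attention, the operation those matrices feed into (the function name and toy shapes are my own illustration, not from the papers in the post):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how strongly each query matches each key
    # numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output is a weighted mix of the value vectors

# toy example: 3 tokens, head dimension d_k = 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one mixed value vector per token
```

The interpretability question stands even with this in hand: the formula is simple, but nothing here tells you *why* stacking many such layers with learned Q/K/V projections beats classical NLP pipelines.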
BTW, I'm a bit infuriated when you say I "have to read some papers." My Zotero library contains about a hundred AI papers I've read, and more importantly, I posted two papers I have read right in this post. They give a direct explanation of why ChatGPT fails at some regex and CFG tasks. My question is just one step beyond those two papers.
The tone in the images is just for fun: I originally posted this as a joke to my personal circle on social media. I do have at least CS-grad-level knowledge of how DL models work.