Submitted by AutoModerator t3_122oxap in MachineLearning
Ricenaros t1_jeawpf3 wrote
Reply to comment by sparkpuppy in [D] Simple Questions Thread by AutoModerator
It refers to the number of scalars needed to specify the model. At the heart of machine learning is matrix multiplication. Consider an input vector x of size (n x 1) and a linear transformation y = Wx + b. Here the (m x n) matrix W (the weights) and the (m x 1) vector b (the bias) are the model parameters. Learning consists of tweaking W and b in a way that lowers the loss function. For this simple linear layer there are m*n + m scalar parameters (the elements of W plus the elements of b).
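To make the count concrete, here is a minimal NumPy sketch (the sizes n = 4 and m = 3 are arbitrary, just for illustration):

```python
import numpy as np

n, m = 4, 3                      # input size, output size (illustrative)

rng = np.random.default_rng(0)
W = rng.normal(size=(m, n))      # weight matrix: m*n scalar parameters
b = np.zeros(m)                  # bias vector:   m scalar parameters

x = np.ones(n)                   # example (n,) input
y = W @ x + b                    # the linear transformation y = Wx + b

num_params = W.size + b.size     # m*n + m
print(num_params)                # 3*4 + 3 = 15
```

The same counting rule applies layer by layer in a deep network; the reported "number of parameters" is just the sum over all layers.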
Hyperparameters on the other hand are things like learning rate, batch size, number of epochs, etc.
Hope this helps.
sparkpuppy t1_jee9qj3 wrote
Hello, thank you so much for the detailed explanation! Yes, it definitely helps me have a clearer vision of the meaning of that expression. Have a nice day!