Featureless_Bug t1_j69mojw wrote

I mean, it is kind of a very basic question and it takes like 15 minutes at most if you understand what you are doing. It is similar to leetcode-style questions for SE: it is not something that you will do on the job, but if you are smart, you will pass easily, and if you are not, you will struggle, so it is a great interview task.

−6

OkAssociation8879 OP t1_j69n96y wrote

It would definitely be an easy question if it were a common one and hence featured on leetcode, where candidates would practice it before the interview.

Someone with 2 years of experience doesn't remember the nitty-gritty maths needed to implement a NN from scratch. This question is more suited to someone fresh out of college, in my opinion.

3

Featureless_Bug t1_j69nsjq wrote

>It would definitely be an easy question if it were a common one and hence featured on leetcode, where candidates would practice it before the interview.

I mean, if it were on leetcode, it wouldn't make sense to ask it in the interview, because then you would get prepared answers.

>Someone with 2 years of experience doesn't remember the nitty-gritty maths needed to implement a NN from scratch

If you cannot apply the chain rule, your math is very weak. If your math is very weak, you probably won't be a great ML engineer. It's not that you need a lot of math, but quite often you do need a broad general understanding of what can work and what can't.

−4

xorbinant_ranchu t1_j69tasv wrote

Would be interested to know what kind of experience you have?

I think literally none of the ML engineers I work with (myself very much included) could pull a chain rule implementation out in 10 mins.

90% of this job is just finding an existing implementation of something to make work.

2

OkAssociation8879 OP t1_j69oa79 wrote

You are right. Interviewers generally ask about backpropagation. They should definitely test me on neural network concepts. But don't you think asking me to code an entire neural network from scratch was overdoing it for an interview?

1

pandasiloc t1_j69uj8v wrote

The human brain doesn’t work like this. It’s not a question about “being smart” or simply having learned something previously. In order to perform an implementation of this on the spot in a stressful situation, the relevant theory needs to be very fresh in your memory.

I highly doubt you would be able to reproduce a proof of the Fundamental Theorem of Algebra on the spot, even though it’s a simple concept that many people learn in middle school.

I would probably fail this question because I haven’t worked with deep learning much since I graduated 4 years ago. I majored in math at an Ivy League school and graduated with a pretty good GPA, so I don’t think my math is ‘weak’, either.

This kind of question does not make sense to ask on a live call unless someone claims to be working with deep learning architectures as part of their daily work.

3

Featureless_Bug t1_j69xpo0 wrote

Oh, a fellow mathematician. Look, I graduated from Cambridge 6 years ago, but I could still prove the fundamental theorem of algebra analytically or with Galois theory (I still remember the general ideas of both proofs, I think), so I guess it depends on the person. But FTA is also a much more complicated thing to prove than the chain rule, and you don't even need to prove the chain rule to know how to use it. And sorry, if you don't remember how to differentiate multivariable functions, then you are an extraordinarily lousy mathematician. And if you know how to differentiate multivariable functions and if you are smart, you should be able to quickly come up with an implementation for backprop even if you don't remember anything else.

0

pandasiloc t1_j6a4a2n wrote

I never said I didn’t remember how to differentiate multivariate functions - my point was that equating conceptual mathematical knowledge and the ability to implement a specific application of such concepts in a time-constrained and stressful situation is inappropriate.

A lot of things need to come together in answering a question like this: remembering that the chain rule is the key concept in backprop in the first place, knowledge of how to implement matrix algebra in code, knowing the commonly used loss functions, how to compute their derivatives, how to represent the differentiation in code, etc. None of these things is complicated on its own; the difficulty arises in bringing everything together in a small amount of time. It’s fair to expect people in the field to intuitively remember what is going on, but on-the-spot implementation in under 30 minutes requires a level of rigor that is unrealistic for even a competent person who does not have the theory fresh in their memory.

You keep using the term ‘smart’ and I don’t know what you mean by this. Your last statement is just an assertion without argument, one you’ve repeated throughout your comments but that I see no reason to believe, given the above.

2
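
For reference, here is a minimal sketch of the pieces listed in the comment above: a forward pass, a loss with its derivative, and gradients obtained with the chain rule. Nobody in the thread says what the interviewer actually asked for, so everything concrete here is an assumption made for illustration: the network shape (one tanh hidden layer, one linear output), the squared-error loss, plain Python lists instead of a matrix library, and all function names.

```python
import math
import random

def init_params(n_in, n_hidden, seed=0):
    """Small random weights, zero biases (hypothetical initialisation)."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    return [W1, b1, W2, b2]

def forward(params, x):
    """Return (hidden activations, prediction) for one input vector x."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y_hat = sum(w * hj for w, hj in zip(W2, h)) + b2
    return h, y_hat

def loss(params, x, y):
    """Squared-error loss 0.5 * (y_hat - y)^2 for one example."""
    _, y_hat = forward(params, x)
    return 0.5 * (y_hat - y) ** 2

def backward(params, x, y):
    """Gradients of the loss with respect to every parameter (chain rule)."""
    W1, b1, W2, b2 = params
    h, y_hat = forward(params, x)
    d_out = y_hat - y                      # dL/dy_hat
    gW2 = [d_out * hj for hj in h]         # dL/dW2_j = dL/dy_hat * h_j
    gb2 = d_out
    # dL/dz_j = dL/dy_hat * W2_j * tanh'(z_j), with tanh'(z) = 1 - tanh(z)^2
    d_hid = [d_out * w * (1.0 - hj * hj) for w, hj in zip(W2, h)]
    gW1 = [[dj * xi for xi in x] for dj in d_hid]
    gb1 = list(d_hid)
    return gW1, gb1, gW2, gb2

def sgd_step(params, x, y, lr=0.1):
    """One gradient-descent update; returns new parameters."""
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = backward(params, x, y)
    W1 = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    W2 = [w - lr * g for w, g in zip(W2, gW2)]
    b2 = b2 - lr * gb2
    return [W1, b1, W2, b2]
```

A quick way to check a hand-written `backward` like this one is to compare a few of its entries against finite differences of `loss`. That kind of check is also a realistic way to catch sign or indexing mistakes under time pressure.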