canbooo

canbooo t1_j8onqb4 wrote

No, it's a valid question; I just find it difficult to give examples that are easy to understand, but let me try. Yes, OP's example is not a good one to demonstrate the use case. Let us think about a swarm of drones and their physics, specifically the airflow around them. Hypothetically, you may be able to describe the physics of a single drone accurately, although this would probably take quite some time in reality: think days on a simple laptop for a specific configuration if you really want high accuracy. Nevertheless, if you want to model, say, 50 drones, things get more complicated. The airflow of one affects the behavior/airflow of the others, and new turbulence sources and other effects emerge. Actually simulating such a complex system may be infeasible even with supercomputers. Moreover, you are probably interested in many configurations (flight patterns, drone design, etc.) so that you can choose the best one. In this case, functional interpolation is not very helpful due to the interactions and newly emerging effects, as we only know the form of the function for a single drone. Sure, you know the underlying equations, but you still can't really predict the behavior of the whole without solving them, which, as mentioned, is costly. The premise of PINNs in this case is to learn to predict the behavior of this system, where the inductive bias is expected to decrease the number of samples required for generalization.
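To make the inductive-bias idea concrete, here is a minimal PINN sketch in PyTorch for a toy harmonic oscillator u''(t) + ω²u(t) = 0 with u(0) = 1, u'(0) = 0. This is my own toy construction, not the drone case; there, the residual would be replaced by the (much harder) governing flow equations.

```python
import torch

torch.manual_seed(0)
omega = 2.0  # hypothetical frequency for the toy oscillator

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    # Collocation points: no labeled (t, u) data needed here.
    t = (torch.rand(128, 1) * 4.0).requires_grad_(True)
    u = net(t)
    du = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, t, torch.ones_like(du), create_graph=True)[0]
    # The physics residual is the inductive bias: it penalizes
    # predictions that violate the known governing equation.
    residual = (d2u + omega ** 2 * u).pow(2).mean()

    # Initial conditions u(0) = 1, u'(0) = 0 pin down the particular solution.
    t0 = torch.zeros(1, 1, requires_grad=True)
    u0 = net(t0)
    du0 = torch.autograd.grad(u0, t0, torch.ones_like(u0), create_graph=True)[0]
    loss = residual + (u0 - 1.0).pow(2).sum() + du0.pow(2).sum()

    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point is that the loss terms come from the equations themselves, so the network can generalize from far fewer (or zero) labeled samples than a purely data-driven model.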

4

canbooo t1_j8ohxf9 wrote

Especially in engineering applications, i.e. with complex systems/physics, the fundamental physical equations are known, but not how they influence each other and the observed data. Alternatively, they are too expensive to compute for all possible states. In those cases, we already build ML models on the data to, e.g., optimize designs or do stuff like predictive maintenance. However, these models often do not generalize well to out-of-domain samples, and producing samples is often very costly, since we either need laboratory experiments or have to actually create designs that are bound to fail (stupid examples, but for clarity: think planes with rectangular wings, or cranes so thin they could not even pick up a feather; real-world use cases are more complicated to fit into these brackets). In some cases, the only available data may come from products in use, and you may want to model failure modes without observing them. In all these cases PINNs could help. However, none of the models I have tested so far are actually robust to real-world data, and they require much more tuning compared to MLPs, RNNs etc., which are already more difficult to tune compared to more conventional approaches. So I am yet to find an actual use case that is not academic.

TL;DR: physics (and simulations) may be inefficient/inapplicable in some cases. PINNs allow us to embed our knowledge about first principles in the form of an inductive bias to improve generalization to unseen/unobservable states.

1

canbooo t1_j8mfn6z wrote

Since you have been waiting for 6 hours without any response, let me share my 5 cents. You are probably inspired by ChatGPT and the success of HRL, so why not start there: https://openreview.net/forum?id=20-xDadEYeU

But this idea is not novel, only its application to NLP; it has been applied to other areas like games and autonomous driving. They use PPO, which is to me the most robust on-policy algorithm. However, any other on-policy algorithm could have been used instead, and off-policy methods like SAC could improve sample efficiency but might run into convergence problems. Also, you could be more generalistic and try off-policy algorithms independent of a specific language model. This would allow using the same experience/value model to fine-tune other LMs, but it might require much, much more data to achieve similar performance. In any case, the application of RL to NLP (except for language-based games) is quite new, and many questions remain to be answered.
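For reference, here is roughly what the PPO clipped objective looks like; this is a generic PyTorch sketch, and the tensor names (`logp_new`, `logp_old`, `advantages`) are placeholders for per-token quantities that a real RLHF pipeline would compute from the LM, a value model, and a reward model.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate loss; gradients flow only through logp_new."""
    ratio = torch.exp(logp_new - logp_old.detach())
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # PPO maximizes the minimum of the two surrogates; negate to get a loss.
    return -torch.min(unclipped, clipped).mean()
```

In practice you would add a KL penalty against the frozen initial model and a value-function loss on top, as in the usual RLHF recipe.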

3

canbooo t1_j7z0lku wrote

I agree about the size of the difference yet disagree with the examples, as there is ML research considering all three (causal ML, conformal ML/prediction/forecasting, AI safety, reliability, etc.). I think the difference is more like deduction vs. induction, in the sense that the processes of finding the answers are different. Since I am finishing pooping on corporate time, I will keep this short.

ML: Data -> Method -> Hypothesis -> Answers

Statistics: Hypothesis -> Method -> Data -> Answers

This may be too simplistic, and please do propose a better distinction, but do not postulate that ML does not care about the things statistics does.

0

canbooo t1_iydgqzt wrote

Oh, fair enough, my bad, I misunderstood what you meant. You are absolutely right for that case. For me the question is rather P(X >= x) = .2, since having more intelligence implies you have (implicitly at least) 20%, but this is already too many arguments for a joke. Enjoy the conference!

1

canbooo t1_iy3pylo wrote

Bad initialization can be a problem if you do it yourself (i.e., with bad scaling of the weights) and if you are not using batch or other kinds of normalization, since it might make your neurons die. E.g., a tanh neuron with too large an input scale will only output -1 or 1 for all data, which leaves it dead, i.e., not learning anything due to a ~0 gradient over the entire data set.
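A quick sanity check of this in PyTorch (my own toy construction): scale a single tanh unit's weight up and watch its gradient vanish.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1000, 1)

for scale in (1.0, 100.0):
    w = torch.full((1, 1), scale, requires_grad=True)  # the "initialization"
    y = torch.tanh(x @ w)  # saturates to ±1 for nearly all inputs at scale 100
    y.sum().backward()
    print(f"weight scale {scale:>6.1f}: mean |grad| = {w.grad.abs().mean():.2e}")
```

The badly scaled unit receives a gradient several orders of magnitude smaller, so it effectively stops learning on the whole data set.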

6

canbooo t1_ivydtlt wrote

You are right, what I am asking may be practically irrelevant, and I really should RTFP. However, think about the edge case of one layer with one input and one output. Each node having 1 as its input weight sees the same gradient, similar to the nodes having 0. Increasing the number of inputs makes it combinatorially improbable to end up with the same configuration, but increasing the number of nodes in a layer makes it likelier. So it could be relevant for low dimensions or models with a narrow bottleneck. I am sure the authors have already thought about this problem and either discarded it as quite unlikely in their tested settings, or they already have a solution/analysis somewhere in the paper; hence my question.
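A toy construction of that edge case (mine, not the paper's): if two hidden units start with identical incoming and outgoing weights, they receive identical gradients, so gradient descent keeps them tied forever.

```python
import torch

torch.manual_seed(0)
x = torch.randn(64, 1)
y = torch.randn(64, 1)

w1 = torch.ones(1, 2, requires_grad=True)          # identical incoming weights
w2 = torch.full((2, 1), 0.5, requires_grad=True)   # identical outgoing weights

loss = ((torch.tanh(x @ w1) @ w2 - y) ** 2).mean()
loss.backward()
print(w1.grad)  # both columns equal -> the units never differentiate
print(w2.grad)  # both rows equal, too
```

Breaking the tie requires some asymmetry, either in the initialization itself or injected elsewhere (e.g. dropout).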

2

canbooo t1_iv6j73z wrote

I think the comment above you is gold, and you are approaching this kind of wrong if this is about research. The fact that they are not (yet) solving CV/NLP tasks is an advantage rather than a disadvantage. Although I must admit I see a more direct relation to RL than to anything else, this makes it even more interesting, since any idea you come up with will probably be novel.

6