Exarctus t1_j0zqq3b wrote
Reply to comment by vprokopev in [D] Why are we stuck with Python for something that require so much speed and parallelism (neural networks)? by vprokopev
The vast majority of PyTorch function calls are implemented in either CUDA C or OpenMP-parallelized C++.
Python is only used as a front end; very little of the computational workload is done by the Python interpreter.
Additionally, the C++ API for PyTorch is very much in the same style as the Python API. Obviously you have some additional flexibility in how you optimize your code, but the tensor-based operations are the same.
PyTorch also makes it trivially easy to write optimized CUDA C code and provide Python bindings for it, so you keep the faster development time of Python while retaining the computational benefits of C/C++/CUDA C for typical workloads.
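For example, a minimal sketch of that workflow using torch.utils.cpp_extension.load_inline (the op fused_scale_add and the extension name are just illustrative placeholders):

```python
# Minimal sketch (illustrative only): JIT-compile a tiny C++ op and call it from Python.
import torch
from torch.utils.cpp_extension import load_inline

cpp_source = r"""
#include <torch/extension.h>

// Runs entirely in C++; works on CPU or CUDA tensors because it reuses ATen ops.
torch::Tensor fused_scale_add(torch::Tensor x, torch::Tensor y, double alpha) {
    return x * alpha + y;
}
"""

# `functions=[...]` auto-generates the pybind11 bindings; the build is cached after the first call.
ext = load_inline(name="fused_scale_add_ext",
                  cpp_sources=cpp_source,
                  functions=["fused_scale_add"])

out = ext.fused_scale_add(torch.randn(4), torch.randn(4), 0.5)
```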
vprokopev OP t1_j0zrven wrote
I understand this. This is not answering my question.
Using the Python front end, I have to implement any algorithm I have in mind in terms of vectorized PyTorch operations. I can't use loops, indexing, or other Python libraries, or my code will be slow and run on only one core.
How is that supposed to make my job easier?
ok531441 t1_j0zsvqr wrote
It’s easier than the alternatives. If you don’t think it is, use whatever you think is better. You’ll either solve your problem and find that better language or you’ll learn why Python is used so much.
dumbmachines t1_j0zsn0r wrote
The alternative is writing your own CUDA code or C++. Fortunately for you, PyTorch is pretty easily extensible. If you have something that needs to be done quickly, why not write a C++ extension?
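A hedged sketch of what the packaged (non-JIT) route can look like; the module name and my_ops.cpp are hypothetical placeholders, not anything from this thread:

```python
# setup.py sketch: build a C++ extension against PyTorch's headers, then `pip install .`
from setuptools import setup
from torch.utils.cpp_extension import CppExtension, BuildExtension

setup(
    name="my_ops",
    ext_modules=[CppExtension(name="my_ops", sources=["my_ops.cpp"])],  # my_ops.cpp is hypothetical
    cmdclass={"build_ext": BuildExtension},
)
```

After installing, `import my_ops` behaves like any other Python module while the heavy lifting stays in compiled code.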
Exarctus t1_j0ztwve wrote
I’ve not encountered many situations where I can’t use existing vectorized PyTorch indexing operations to do complicated masking or indexing, and I’ve written some pretty complex codebases during my PhD.
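As a toy (made-up) example, the kind of interpreter loop that kills performance usually has a one-line vectorized equivalent:

```python
# Toy sketch: clamp negatives to zero, loop vs. vectorized masking.
import torch

x = torch.randn(100_000)

# Interpreter loop: one element at a time, single core, slow.
clipped_loop = x.clone()
for i in range(x.numel()):
    if clipped_loop[i] < 0:
        clipped_loop[i] = 0.0

# Vectorized masking: the comparison and the masked assignment both dispatch to C++/CUDA kernels.
clipped_vec = x.clone()
clipped_vec[clipped_vec < 0] = 0.0

assert torch.equal(clipped_loop, clipped_vec)
```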
Alternatively, you can write your code in C++/CUDA C however you like and provide PyTorch bindings to include it in your Python workflow.
float16 t1_j10sam3 wrote
OK guys, you can chill with the downvotes. They're just asking questions.
As mentioned elsewhere, Python does not do much of the work; the important parts run in CUDA. So even if you used some other language such as C++, you still couldn't write plain loops, and you would still have to use the framework's data structures.
Dependent_Change_831 t1_j1hc9x4 wrote
You’re getting a lot of hate, but IMO you’re totally right. Python may be convenient in the short term, but it really does not scale.
I’ve been working in ML for around 8 years now, and I’ve just joined a new project where we’re training models on billions of pieces of dialog for semantic parsing. It took us weeks to fix a bug in the way torch and our code interact with multiprocessing…
There are memory leaks caused by using Python objects like lists in dataset objects, but only if you use DistributedDataParallel (or a library like DeepSpeed, i.e. multiple processes)…
Loading and saving our custom data format requires calling into our own C++ code to avoid waiting hours to deserialize data for every training run…
Wish I could say there’s a better alternative right now, but given where the existing community resources are, there isn’t; we can hope for the future.
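On the dataset memory issue specifically, a common workaround (a generic sketch with made-up names, not that project's code) is to pack samples into a contiguous numpy array or tensor instead of a Python list, so forked loader/DDP processes don't trigger copy-on-write page copies through per-element refcount updates:

```python
# Hedged sketch (illustrative names): avoid Python lists inside Dataset objects.
import numpy as np
import torch
from torch.utils.data import Dataset

class PackedDataset(Dataset):
    def __init__(self, samples):
        # Pack everything into one contiguous array up front instead of keeping a list
        # of Python objects; forked worker processes then share it effectively read-only.
        self.data = np.asarray(samples, dtype=np.int64)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Materialize a tensor only at access time, inside the worker process.
        return torch.as_tensor(self.data[idx])
```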
vprokopev OP t1_j1j0xs5 wrote
Thank you for sharing your experience!
My intuition is that C++ Python extensions make it easier to do easy things (than writing them directly in C++) but harder to do hard things.
People always go for convenience first and then move to something more fundamental and flexible.
Data science was mostly done in R and MATLAB about 12-15 years ago. Then people moved to the more general-purpose Python. The next step is a compiled language with static types, imo.
30katz t1_j11v4wa wrote
Maybe you’d take all this free software and make it easier for others in the future?