corgis_are_awesome t1_j7puwt4 wrote

Yes, I can 100% design a machine that will iteratively develop an algorithm that can solve Rubik’s cubes, without ever knowing exactly how to solve them myself.

1

apfejes t1_j7pw9ls wrote

Feel free to join the crowd of people who are trying to do that.

I've spent the last year talking with people in this space, and all of the big pharmaceutical companies are now saying they won't work with AI-based companies because their algorithms don't work on complex biology data. Too many people have made the claim that they could use machine learning to mine patterns out of biology data sets and failed.

It's not a knock on ML or AI. How would your algorithm know that the data it's working on is unreliable and that biology data often has 50% false positive rates on yeast-2-hybrid screens, or a given SNP may be a miscall that has propagated through 10 generations of reference genomes? Or that the assay that generated the data you're looking at used a promiscuous antibody that's triggered on a related protein that happens to express in the lab culture you're working on? If the data you're working on isn't clean, how are you planning on getting a clean signal out?

Rubik's cubes are child's play compared to the networks that Recursion is working on.

4

corgis_are_awesome t1_j7pzjpd wrote

https://i.imgur.com/vX5hSEX.jpg

Draw a circle around the intersection of Data Scientist, Programmer, Superman, and Bioinformatician.

That’s basically my career target.

1

apfejes t1_j7q02uh wrote

Thank you for citing my own figure to refute me!

You can't be in the "superman" or bioinformatician areas without having an understanding of biology - that's how Venn diagrams work.

2

corgis_are_awesome t1_j7q2bzp wrote

Haha yeah I figured you might like that. :-)

Do you have any recommendations on the most efficient way to become knowledgeable about biology, especially in the way that would be useful to longevity research?

Would I have to go through a full college degree on the topic, or is there a way to bypass a lot of the noise and focus on learning the key parts that matter? I have a long history of rapidly learning new things. I like to start with a problem and work my way backwards towards the solution, learning and leveraging different technologies as I iterate toward a solution.

For example, when I was 13, I was approached by a company that wanted a software system that would give their support staff a communal inbox, and a way for individual team members to pick up an email and start responding to it without stepping on someone else’s toes. So I repurposed a forum Perl script from Matt’s Script Archive, taught myself the basics of Perl, and then molded it into a support ticket system that met their needs. I did that in a matter of weeks, at the age of 13, in a language I didn’t even know.

That was a long time ago, sure, but I have since learned many other languages and built many other solutions for companies over the years. For example, I learned Python and got a job working with AI in education, specifically because I knew that Python was big in the machine learning world, and I wanted to move my career in that general direction.

1

apfejes t1_j7q4thu wrote

Actually, I don't have a recommendation, unfortunately. There are many different fields in biology, and learning each one can be a few years of work, plus the common foundations - so the question isn't "How do you learn?" but "How much do you need to know to do a specific job?"

Unfortunately, biology is the opposite of programming. Programming is a logical set of tools that build on each other. If you learn arrays, dictionaries, or other data structures, you can go out and apply them logically. You can figure out which one will have the best performance in a given situation, and optimization is a logical extension of what you know. You can spend a lifetime learning, but the basics don't change.

In biology, EVERYTHING is an exception to something else. Learn the entire "biochemical pathway" chart, and then you'll discover that some animals do things differently, or short-circuit pieces of it, or just get a specific chemical from their diet and don't need to do a certain part of it. It's all chaos. Biology is the Mad Hatter's perspective, and there's no real guarantee that something is going to work the way you think it should, or the way you were taught. E.g., translation of RNA to protein always begins with a methionine (AUG codon)... except that sometimes it doesn't. Sometimes organisms have found a way to get things started with a missing base, or sometimes things are just wobbly... or maybe sometimes it's just not at all what you think it's going to be.

That's the rambly way of saying that you'll never know what you need to know until it's too late and you discover something was wrong. For my Masters thesis, I worked on a really slow-growing bacterium, and was trying to convince it to do something for months (take up a plasmid so I could knock out a gene). I worked on that system for about a year, and never got it to work. A couple of years later, working on a different project, I discovered that the post-doc who set up the system had missed a critical detail: the half-life of one of the antibiotics, around which the entire system had been built, was shorter than the incubation time of the bacteria we were growing. The system could never have worked on that organism, and no amount of work would ever have changed it. I wasted months on that, and never once thought to validate the actual system that had been used by the guy for a year before I started. Who knows what to make of the data he'd recorded... is it all garbage? I really don't know.

How deeply would I have had to study to know to look at the half-life of kanamycin? I haven't a clue. In biology, it's not what you know that gets you - it's what you don't know.
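To see why a half-life shorter than the incubation time is fatal, here is a minimal sketch. It assumes simple first-order (exponential) decay, and the 48-hour half-life and 240-hour incubation are made-up illustrative numbers, not measured values for kanamycin or for the organism in the story. `fraction_remaining` is a hypothetical helper name.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of a compound left after `elapsed_hours`, assuming first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


# Hypothetical numbers for illustration only (not real kanamycin data).
HALF_LIFE_HOURS = 48.0     # assumed antibiotic half-life in the medium
INCUBATION_HOURS = 240.0   # assumed time needed for colonies to appear

left = fraction_remaining(INCUBATION_HOURS, HALF_LIFE_HOURS)
print(f"Antibiotic remaining at end of incubation: {left:.1%}")
```

With these assumed values, five half-lives pass before the colonies are visible, so only about 3% of the antibiotic is still active and the selection has effectively stopped selecting.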

2

corgis_are_awesome t1_j7q87s5 wrote

I don’t know… to be honest, the more you describe biological systems, the more I think of how real-world software systems actually evolve in the wild, and the nightmare that is debugging large, complex, undocumented systems. But even if it seems chaotic, there are logical patterns that can be found, and understanding that can be developed.

Out in the real world, software programs rarely grow into the perfectly optimized and well-organized logical constructs taught in college. More often than not, they are full of extremely wonky solutions and poorly documented workarounds that were duct-taped together years ago by random people pasting code from Stack Overflow.

In my mind, biology isn’t even a biology problem as much as it is a particle physics problem.

For example - Particle Life: https://youtu.be/p4YirERTVF0

1

apfejes t1_j7qablb wrote

> In my mind, biology isn’t even a biology problem as much as it is a particle physics problem.

Emergence is a thing, but 3.7 billion years of emergent property evolution has created levels of complexity that are far FAR beyond the level of the simple software tools that can mimic the surface-level complexity you see in "computer life" simulations.

The software complexity you're talking about, with its wonky solutions and poorly documented code, is, on average, about 40 years old.

The biological equivalent would be to continue building the same way for about 100,000,000x longer.

I don't dispute the analogy, but it's a bit of Dunning-Kruger, again. The level of complexity isn't going to be obvious to you until you start trying to solve the problems. 3.7 billion years of wonky solutions layered on top of each other is a lot different than 40 years.
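As a rough sanity check on that scale comparison, a minimal sketch using only the two figures quoted in this comment (3.7 billion years and 40 years); the variable names are just for illustration:

```python
# Figures taken from the comment above; names are illustrative.
BIOLOGY_YEARS = 3.7e9   # years of evolution
SOFTWARE_YEARS = 40     # rough age of the legacy software being compared

ratio = BIOLOGY_YEARS / SOFTWARE_YEARS
print(f"Biology has had roughly {ratio:,.0f}x longer to accumulate workarounds")
```

That works out to about 92.5 million, which is where the "about 100,000,000x" figure comes from once rounded.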

2

t_rexinated t1_j89jsip wrote

the overhype-underdelivery cycle is real and that's led to very understandable vaporware vibes amongst bigger biotech and pharma.

honestly, if you think that you'll simply be able to just pop the data from your absolute trash of an experiment into a magical shiny black box and get anything meaningful out of it, then you're an idiot and you deserve to lose your money on something you think will solve all of your problems for you.

agreed: if you're shoveling hot garbage in, hot garbage is def gonna be coming out.

when done properly and when done well, AI/ML/GNNs/CNNs/GANs/blah blah blah are absolutely amazing and powerful tools. it just takes a lot of hard work to get to that point, and few do it well. when it is done well though, peeps are doing some really awesome work...especially in image processing phenotypic profiling:

https://www.nature.com/articles/d41586-022-02964-6

1

apfejes t1_j89mgov wrote

Completely agree that AI has massive potential, but only when paired with people who understand the data they’re feeding in.

1