
sometimesgoodadvice t1_jclaqrc wrote

Isolating genes is usually done with simple PCR. These days you almost never isolate a gene without knowing its sequence, so it's a matter of designing PCR primers that are unique (fairly straightforward). You can check that the appropriate gene was isolated by simple gel electrophoresis, and once it's inserted into the vector (which makes amplification in an organism very easy) you can once again sequence.
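
Purely as a toy sketch (the primer sequences below are arbitrary, and real primer design tools use nearest-neighbor thermodynamics rather than this crude rule), a first-pass sanity check on candidate primers might look like:

```python
# Rough sanity checks for PCR primer candidates (illustrative only).
# Wallace rule: Tm ~ 2*(A+T) + 4*(G+C), a crude estimate for short primers.

def gc_content(seq: str) -> float:
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq: str) -> float:
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

for primer in ["ATGGCTAGCAAGGAGATATACC", "TTACTTGTCGTCATCGTCTTTG"]:
    print(primer, f"GC={gc_content(primer):.0%}", f"Tm~{wallace_tm(primer)}C")
```

In practice you would also check that the primer pair only matches the gene of interest (e.g. by BLASTing against the source genome) before ordering.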

Isolating the protein product after expression is a little more difficult but is well understood. Almost always you will use one or more liquid chromatography methods. You can append small "tag" sequences to the gene to create an amino-acid sequence that specifically binds certain metals (a polyhistidine tag binds nickel) or an antibody, either of which can be immobilized on a solid "column". You then flow a lysate of the bacteria that expressed the protein across the column, and everything flows past except your protein of interest. The protein can then be eluted from the column with a different, specific buffer. There are many variations depending on your protein of interest and application needs, but this is the general approach. In cases where tags can't be added due to activity needs or cost requirements, proteins can be isolated similarly based on their size, charge, and hydrophobicity. If you perform those separations in various orders and are clever about it, you can get a very pure product.

Finally, confirming the identity and activity of the final product might require mass spec and, for an enzyme, activity assays where you compare the reaction rate against the total amount of protein added to get an idea of the fraction of protein that is active.
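
That last comparison is just a specific-activity calculation; a back-of-the-envelope version (all numbers invented for illustration) looks like:

```python
# Toy specific-activity estimate: product formed per minute per mg of protein.
product_umol_per_min = 12.0   # measured reaction rate (hypothetical)
total_protein_mg = 0.5        # total protein loaded into the assay (hypothetical)

specific_activity = product_umol_per_min / total_protein_mg  # U/mg, where 1 U = 1 umol/min
print(f"Specific activity: {specific_activity:.1f} U/mg")
# Comparing this against the value expected for fully active enzyme gives a
# rough estimate of what fraction of the purified protein is actually active.
```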

3

sometimesgoodadvice t1_j4u9hi1 wrote

Really not sure about the premise and how much normal ranges actually change. Let's assume it is true though and use some logic. What defines a normal range for a given biomarker? Let's say I was a really good scientist and wanted to do "better science" to find out what a healthy range for red blood cell count is. I would probably take a cross-section of people that doctors have called "healthy" and ones they have diagnosed with a disease that affects RBC. Then I would do some nice statistics and say that 95% of people with a disease corresponding to low RBC had counts <3e12/L and 95% of people with a disease corresponding to elevated RBC had counts >7e12/L. So I will define my "healthy" range as 3-7e12/L and show that it corresponds to 98% of "healthy" people. Then I do some more statistics to determine false positive and negative rates, teach doctors how and when to properly use this knowledge (including performing the test exactly as I did), and be done.
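
A minimal sketch of that kind of empirical cutoff-setting, with completely fabricated cohorts (real reference intervals are built from much more careful sampling and statistics):

```python
import numpy as np

rng = np.random.default_rng(0)
# Fabricated RBC counts (x1e12 cells/L) for three cohorts.
healthy      = rng.normal(5.0, 0.7, 10_000)
low_disease  = rng.normal(2.3, 0.5, 2_000)   # condition with depressed RBC
high_disease = rng.normal(7.8, 0.6, 2_000)   # condition with elevated RBC

# Thresholds chosen so that ~95% of each disease cohort falls outside them.
low_cut  = np.percentile(low_disease, 95)    # 95% of the low-RBC disease cohort sits below this
high_cut = np.percentile(high_disease, 5)    # 95% of the high-RBC disease cohort sits above this

covered = np.mean((healthy > low_cut) & (healthy < high_cut))
print(f'"Healthy" range: {low_cut:.1f}-{high_cut:.1f} x1e12/L, '
      f"covering {covered:.0%} of the healthy cohort")
```

The point is that the numbers come entirely from whichever populations you sampled, which is exactly why the range can drift when you repeat the exercise decades later.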

There really is no different way to do this. There is no equation that can tell you how many red blood cells you need to have. There are some hard upper and lower bounds, but those are not very useful, so we have to be empirical.

Now let's say I do this again, 50 years later, and find that the value shifted. Is there a big problem? After all, the healthy range still corresponds to people that are "healthy" and the unhealthy range to those that have some underlying condition that doctors can diagnose. Maybe the range shifted because people in general have become "unhealthy". But then they would be diagnosed as such. It's just as likely that the range shifted because fewer people are eating lead paint, or because we decided to include people from diverse backgrounds with different genetics or environmental stimuli that were not represented in the first study. Maybe the range shifted because people are much "healthier" now, with more monitoring and resources available to stay healthy. As long as the range serves its purpose - identifying values that are indicative of an underlying disease - it does not matter what the absolute value is.

5

sometimesgoodadvice t1_j4u81tt wrote

Your hypotheses are pretty spot on. There is some observational bias: I think you are mostly looking at small molecules, since biologics typically have dosages on the order of grams. Small molecules are actually not too different in size; molecular weight may range ~10x between some of the smaller and larger compounds (excluding outliers), which in biology is not much of a difference considering that typical biological molecules range from the size of water or CO2 (18 and 44 Da) all the way to protein complexes of >1 MDa.

Next, the biology. Most drugs are given systemically, which means they pretty much dilute themselves in blood, which is close to the same volume for most people. And of course, those drugs are designed to interact with or interfere with typical biological processes, which through evolution and for kinetic/thermodynamic reasons operate over a relatively narrow range of concentrations of the enzymes/receptors in the body. So a relatively narrow range of concentrations, combined with an almost constant volume and a relatively narrow range of molecular weights, yields similar total dosages.
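
To see how that translates into a number, here is a deliberately oversimplified estimate (target concentration and molecular weight are hypothetical, and real dosing also accounts for distribution beyond blood, bioavailability, and clearance):

```python
# Back-of-the-envelope dose estimate: concentration x volume x molecular weight.
target_conc_molar = 1e-6      # desired plasma concentration, ~1 uM (hypothetical)
blood_volume_l = 5.0          # typical adult blood volume, roughly constant
mol_weight_g_per_mol = 400.0  # typical small-molecule MW (hypothetical)

dose_g = target_conc_molar * blood_volume_l * mol_weight_g_per_mol
print(f"Dose to reach target concentration: ~{dose_g * 1000:.0f} mg")  # ~2 mg here
```

Because all three factors sit in fairly narrow ranges across drugs, the product (the dose) does too.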

Lastly, there is pharmacokinetics. Every drug you take has three competing "things" it does. The effect you are looking for, the effect you are not looking for, and removal. The first two parts determine what's called a "therapeutic window". This is the range of concentrations where the intended effect is useful and the side effects are minimized. If you are above this range, the number and severity of side-effects will increase (again more or less back to basic thermodynamics) and if you are below, then your therapy will not be potent enough to have a considerable positive effect. This window can vary quite a bit, but at first approximation will center around the concentration of other similar molecules in the blood, which we already discussed above.

Then there is the clearance. If you are lucky enough to have a large therapeutic window, it may still not be advantageous to give lots of the drug. Most drugs are cleared by roughly first-order kinetics, whether through the liver or kidneys. This means the more drug there is, the faster it is cleared; as concentration decreases, so does the clearance rate. So imagine you have a wonderful drug whose therapeutic window spans a ~1000x range of concentration. If you give the max amount, it will be active for about 10 half-lives before the concentration drops enough to no longer provide benefit (2^10 ~ 1000). Now take that same drug and double its dose. You have increased its longevity by one half-life, meaning a 10% increase (11 half-lives compared to 10) over the previous dosage in how long it stays effective. But to get that 10% you used 2x the drug. Not a great trade-off.
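
A quick numerical check of that trade-off under simple first-order decay (a sketch with an invented half-life, not a real PK model):

```python
import math

half_life_h = 6.0       # hypothetical elimination half-life
min_effective = 1.0     # concentration where benefit stops (arbitrary units)
top_of_window = 1000.0  # upper edge of the therapeutic window, ~1000x the minimum

def time_effective(initial_conc: float) -> float:
    # First-order decay: C(t) = C0 * 0.5**(t / t_half); solve C(t) = min_effective for t.
    return half_life_h * math.log2(initial_conc / min_effective)

t1 = time_effective(top_of_window)       # dose to the top of the window
t2 = time_effective(2 * top_of_window)   # double the dose
print(f"{t1:.0f} h vs {t2:.0f} h effective (+{(t2 - t1) / t1:.0%} duration for 2x the drug)")
```

Running this gives roughly 60 h vs 66 h, i.e. the ~10% gain for double the drug described above.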

At the end of the day, each drug will have a dosage based on how effective it is at certain concentrations, which dosages minimize side-effects, what concentrations the formulation allows, and also what will lead to the highest rate of patients actually taking the drug (tons of people are working on ways to minimize insulin injections, for example). There is some economics going on as well, but not as much, since the production cost for most drugs is small compared to the price.

6

sometimesgoodadvice t1_izkdgst wrote

Much like there are defined translational start and stop sites (ATG and TAG/TAA/TGA, respectively), there are transcriptional start and stop sites. These tend to be more complicated, as there is variability in the UTRs of the RNA. The mechanisms are different in prokaryotes and eukaryotes, and plenty of exceptions exist.

Without going into too much detail, there are various proteins that interact with DNA and also with elements of the RNA polymerase complex to initiate transcription. These are known as transcription factors, and different ones bind different DNA sequences. This is how you get control over which genes are expressed when. If a transcription factor is present, it helps initiate transcription of those genes whose upstream sequences it binds. The whole region where this occurs is known as a "promoter".

Similarly, there are sites where transcription halts. In translation, the stop codon works as a stop site because no aminoacylated tRNA reads it; instead, release factor proteins recognize the stop codon, no new amino acid is added, the finished protein is released, and the stalled ribosome dissociates from the mRNA. Eventually (this is pretty fast) the ribosome, the mRNA, and the protein all go their separate ways. Transcriptional stop sites work very analogously and are called "terminators". However, since there is no "empty" nucleotide to attach to the RNA, terminators instead use structure to simply stop the RNA polymerase from moving forward. This is usually done by the newly made RNA sequence natively folding into a tight hairpin that the polymerase cannot transcribe through. The polymerase falls off, and the double-stranded DNA is eventually zipped back up.

For the practical part of your question: when designing a plasmid, you can have multiple promoters and multiple terminators. Bacteria can translate multiple proteins from the same mRNA; eukaryotes generally can't, so they require a promoter-terminator pair for every ORF. There are certain techniques around that, but that's beyond the scope here. In general, you want to avoid repeating the same promoter or terminator in a plasmid, since that can cause recombination, and the transcription factors will be split between the identical promoters, causing variability in expression levels between cells. Whether you put multiple proteins on the same transcript or on different ones in bacteria depends on your goal. Proteins on the same transcript will be expressed in roughly the same ratio, since the amount of mRNA for one protein is exactly the same as for the other. If you want differential control over the expression of one protein vs. another, then they should be under the control of different promoters.
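
As a trivial sketch of that "don't repeat parts" rule (part names like pTac and rrnB_T1 are just example choices, and real design tools check sequence homology rather than labels), one might scan a planned part list before ordering anything:

```python
from collections import Counter

# Hypothetical plasmid layout as an ordered list of (part_type, part_name).
plasmid = [
    ("promoter", "pTac"),
    ("orf", "geneA"),
    ("orf", "geneB"),        # bacteria can translate two ORFs from one transcript
    ("terminator", "rrnB_T1"),
    ("promoter", "pTac"),    # repeated promoter: recombination / expression-variability risk
    ("orf", "geneC"),
    ("terminator", "rrnB_T1"),
]

counts = Counter(name for kind, name in plasmid if kind in ("promoter", "terminator"))
for name, n in counts.items():
    if n > 1:
        print(f"Warning: {name} appears {n} times; consider swapping in a different part")
```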

6

sometimesgoodadvice t1_iy9wuje wrote

What we learned directly from sequencing the human genome is precisely that - its structure and sequence. What that enabled (directly) is the ability to understand what genetic material is there to govern all of the complicated biochemistry going on in the body.

More importantly, by sequencing a large genome like the human one (nowhere near the largest, but pretty big compared to what had been sequenced before), we gained the technology (which has since become orders of magnitude better) to sequence more genomes. From this we can compare the genomes of humans and other animals to help understand what makes our biology different (or similar), and also the genomes of different humans to help understand what makes the biology of some humans different from others.

The genome was sequenced only about 20 years ago, but pretty much any medical advancement happening today uses that knowledge of an accurate genomic sequence somewhere in development.

The best analogy may be the invention of the transistor. At the time of its invention, 75 years ago, the basic understanding of electronics was already there, and it performed a function not too dissimilar from the vacuum tubes that already existed. However, the transistor, combined with other inventions such as integrated circuits, photolithography, and many more, ended up revolutionizing the approach to electronics and the speed of their development. In the same sense, having an accurate genome sequence and being able to sequence human cells, combined with other developments, has revolutionized our approach to molecular biology and medicine and is a very important building block. Hence why it's regarded as a big achievement.

3

sometimesgoodadvice t1_its067g wrote

It depends on the severity of the disease and type of medication proposed. As others have said, preclinical testing is required before human studies are allowed by the governing body. I am mostly familiar with how this works in the US under the control of the FDA so I will use them as an example, but similar approaches are taken by the various governing bodies in respective regions/countries (e.g. EMA).

A package for your proposed study is submitted to the FDA, which includes all information known about the medicine: molecular structure, mechanism of action, biological models used for study, how the drug will be manufactured and quality-tested, relevant animal studies on safety (and ideally efficacy), etc. There is typically discussion with the FDA before submission and after on what would be satisfactory for review, so that everyone gets to the goal quicker, the goal being to test potentially useful therapies in the safest and most ethical way possible.

For the study design itself, very few studies actually test drug vs. placebo alone. Similarly, very few phase I studies are actually conducted in healthy individuals. Everything depends, of course, on the disease. If it's a very rare disease, you may not be able to recruit enough diseased individuals for the safety portion, so you may add some healthy volunteers. All trials are run against standard of care, so a placebo is used only if there is no other care available.

Take cancer for example. If you have lung cancer, there are already multiple lines of treatment available. A study will thus be designed either to be administered to patients after all previous lines have failed, or to directly compete with the last line of treatment. The reasons for one vs. the other are complex and vary with the disease, commercial landscape, mechanism of action of the drug, etc. So a person in the study is someone who has had chemo, has had a checkpoint inhibitor, maybe even radiotherapy, and nothing has helped. At this point they can join the study, where they are informed that they have a ~50% chance of getting the new untested drug + standard of care and a ~50% chance of getting placebo + standard of care (which could be more chemo or maybe nothing else, depending on what's available).

If they consent, they go on the study, where no one knows which arm they are in. If you are clever and the study design is poor, you can sometimes figure it out based on side effects, but really, you should not be able to know.

As far as safety, there is even more going on. All that preclinical work provides a rough estimate of what dosage is safe/efficacious. Those estimates are based on years of data on how different classes of therapies translate between, say, non-human primates and humans (the most common, but not the only, comparison used). If some primates started getting adverse reactions at a 15 mg/kg dose of a biologic, we know that will roughly translate to, say, 12 mg/kg in humans (numbers made up), so we expect to see adverse effects at 12 mg/kg. The study will start administering the drug at 0.1 mg/kg, however, significantly lower than the expected threshold. A few people get that dosage, and if there are no adverse effects, the amounts slowly move up until the maximum dosage in the study design is reached or serious adverse effects are noticed. So the first 3 people may receive 0.1 mg/kg, the next 3 get 0.3 mg/kg, the next 3 get 1 mg/kg, and so on. Let's say the expected efficacy window does not start until 3 mg/kg. It really sucks for the first 18 people (3 dosages x 3 people x 2 for drug/placebo), but safety comes first. In many studies, those people are allowed to re-enroll later to receive therapeutic doses if it's safe. Unfortunately, for things like cancer, it's unlikely many of them will have survived to that point, given disease progression (remember that these patients have already undergone all known therapies).
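
A bare-bones sketch of that escalation bookkeeping, using the made-up numbers above (real trials follow formal designs such as 3+3 with explicit dose-limiting-toxicity rules):

```python
# Hypothetical dose-escalation ladder for a first-in-human study.
doses_mg_per_kg = [0.1, 0.3, 1.0, 3.0, 10.0]
cohort_size = 3
expected_efficacy_mg_per_kg = 3.0    # lowest dose expected to do anything (made up)
expected_toxicity_mg_per_kg = 12.0   # threshold extrapolated from animal data (made up)

patients_dosed = 0
for dose in doses_mg_per_kg:
    patients_dosed += cohort_size
    note = "below expected efficacy" if dose < expected_efficacy_mg_per_kg else "in expected efficacy range"
    print(f"Cohort at {dose} mg/kg ({note}); {patients_dosed} patients dosed so far")
    # In a real trial, escalation to the next cohort happens only if no
    # dose-limiting toxicities are observed at the current level.
```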

Once safety is figured out, then you look at efficacy. If the efficacy is better than standard of care, the drug gets approval. If you tested on patients who have already had 3 lines of therapy and you show benefit vs. the existing 4th line (or there isn't one), you become the 4th line of therapy. If you think your drug can help people earlier in the progression, you can design another study to go against 3rd or 2nd or 1st line therapy. Safety is likely less of a concern (that's been established in your earlier trials), so you just do phase II/III studies.

COVID vaccine trials were a bit different from what I described, but nothing special. Since they are preventative studies, they were not tested against other therapies, although of course people who got COVID during the trial were treated with best practices. Similarly, everyone was told to continue using all of the precautions (social distancing, masks, hand washing, etc.) regardless of whether they got placebo or vaccine (again, no one knew what they got). Most vaccine trials take a lot longer because disease prevalence is low and you need to wait until a statistically significant number of cases is seen in the placebo population, but with COVID that was not an issue because cases were so widespread.
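
For intuition on why case counts drive the timeline, the headline efficacy number is essentially a ratio of attack rates between the two arms (the counts below are invented, and real analyses add confidence intervals and person-time adjustments):

```python
# Toy vaccine-efficacy estimate from case counts in a two-arm trial.
n_vaccine, n_placebo = 15_000, 15_000    # participants per arm (hypothetical)
cases_vaccine, cases_placebo = 8, 80     # confirmed cases in each arm (hypothetical)

attack_vaccine = cases_vaccine / n_vaccine
attack_placebo = cases_placebo / n_placebo
efficacy = 1 - attack_vaccine / attack_placebo
print(f"Point estimate of vaccine efficacy: {efficacy:.0%}")  # 90% with these numbers
# With a rare disease it can take years to accumulate enough placebo-arm cases
# for this estimate to be statistically meaningful; with COVID it took months.
```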

9

sometimesgoodadvice t1_isp75d7 wrote

An interesting analogy, but slightly flawed in that it looks at the genomes of already-viable organisms. A person whose genome is sequenced to compare to the reference has already passed the selection criteria for viability and development. Basically, there are plenty of sites where single mutations would lead to a complete breakdown of making a "human", but those would never show up in a sequenced genome.

The other main difference is, of course, that code is written to be concise and concrete. As far as I know, no one pastes in random code that doesn't perform a function just in case it may be needed in the future. Biology, of course, works precisely that way, and the genome is a mess of evolutionary history with plenty of room for modification without really resulting in any functional change. So a better analogue of those 0.6% may be typos in the comments of the code. In fact, for any large piece of software, I would be surprised if the comments contained fewer than 0.5% typos.

3