pharmaway123 t1_j1ntpo7 wrote

Health care data scientist here. I'll copy and paste my thoughts from our company Slack:

> The model has a shockingly high sensitivity and specificity. Looks very impressive off the bat. But once you dig in, it's a bit more of a "well duh" moment.

> The most predictive factors were opioid-related poisoning diags, long-term use of opioids, and high daily MME (morphine milligram equivalents, a standardized measure of opioid dosage).

> I think that leads to a couple of high-level takeaways.

  1. Oftentimes with claims data, you have highly informative signals; so informative, in fact, that you'd rather train a model to predict that signal than use it as an input. A classic example is predicting Alzheimer's disease in claims: the strongest predictive signals are diags like "age-related cognitive decline" and a bunch of other R41.x codes. If you want to add value, you probably want to use those diags as your outcome variable. Otherwise, you can build a nearly perfect model just by ID'ing folks with (ex) an opioid-related poisoning diag. But by that point, they've probably already had a very expensive inpatient/ER episode (where the opioid poisoning was coded). See the pandas sketch after this list.

  2. In spite of the above, an initial predictive model like the one in this paper can be super useful. If you ask a clinician how to identify folks who use too many opioids, they're unlikely to hand you a list of diagnosis codes that includes the opioid poisoning codes. Similarly, it's unrealistic to expect them to name a specific daily MME cutoff that is both sensitive and specific for opioid use disorder (see the cutoff-sweep sketch below). Those initial models can be used in a "human in the loop" process where you review the output with clinicians and refine the inclusion criteria or outcome variable.

  3. The last thing I'd highlight: it is rarely a good idea to take a predictive model developed on a different dataset and apply it to our own data. Differences in population, coding patterns, and temporal utilization all mean you can expect a drastic drop in performance if you take an existing model and apply it to our data. Said differently: medical predictive models generally have poor generalizability. We have the luxury of lots and lots of data, so we can likely get much better performance by developing models internally.
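
To make (1) concrete, here's a minimal sketch of flipping a highly informative signal from a feature into the outcome label. Everything in it is illustrative, not from the paper: the tiny claims DataFrame, the column names, and the restriction to ICD-10 T40.0–T40.4/T40.6 poisoning codes are all assumptions.

```python
import pandas as pd

# Hypothetical claims extract: one row per claim line with an ICD-10 code.
claims = pd.DataFrame({
    "member_id": [1, 1, 2, 3, 3],
    "icd10":     ["T40.2X1A", "Z79.891", "E11.9", "T40.601A", "I10"],
})

# ICD-10 T40.0-T40.4 and T40.6 cover opioid-related poisoning.
OPIOID_POISONING_PREFIXES = ("T40.0", "T40.1", "T40.2", "T40.3", "T40.4", "T40.6")
is_poisoning = claims["icd10"].str[:5].isin(OPIOID_POISONING_PREFIXES)

# Instead of feeding these codes to the model as predictors (which makes the
# model look great but mostly re-identifies people who already had the expensive
# episode), roll them up into a member-level outcome label.
outcome = (
    claims.assign(opioid_poisoning=is_poisoning)
          .groupby("member_id")["opioid_poisoning"]
          .any()
          .rename("y_opioid_poisoning")
)
print(outcome)
# Features would then come only from claims dated *before* the first T40.x code,
# so the model predicts the event rather than restating it.
```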

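And for (2), a rough illustration of how you might derive an MME cutoff empirically and then hand it to clinicians for review, rather than asking them to guess one up front. The numbers are synthetic, the OUD labels are made up, and Youden's J is just one of several reasonable ways to pick a starting threshold.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)

# Synthetic daily MME for members without / with a (hypothetical) OUD label.
mme = np.concatenate([rng.gamma(2.0, 15.0, 500),    # no OUD label
                      rng.gamma(4.0, 30.0, 100)])   # OUD label
y = np.concatenate([np.zeros(500), np.ones(100)])

# Treat daily MME itself as the score and sweep every possible cutoff.
fpr, tpr, thresholds = roc_curve(y, mme)

# Youden's J (sensitivity + specificity - 1): one simple criterion for a
# starting cutoff that clinicians can then review and refine.
j = tpr - fpr
best = np.argmax(j)
print(f"candidate cutoff: {thresholds[best]:.0f} MME/day, "
      f"sensitivity={tpr[best]:.2f}, specificity={1 - fpr[best]:.2f}")
```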