thiru_2718 t1_j4piklu wrote on January 17, 2023 at 10:25 AM

Interesting question. My intuition is that if you maintain a continuously updated cache of the metric you use to split your branches (i.e. continuously recompute mutual information at each fork), and your new data roughly follows the same distribution as your old data, you may be able to get away with modifying only the downstream branches of your trees, which should be more efficient than rebuilding each tree in full.

But if that assumption doesn't hold, the new data will change splits closer to the root, which invalidates most of each tree below them, and there's little benefit.
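
To make that a bit more concrete, here's a rough sketch of what I mean by caching the split metric at a node. It's only an illustration, not a tested incremental tree learner: `NodeStats` and its methods are names I made up, it assumes categorical features, and it uses information gain (the mutual information between a feature and the label) as the split metric. After each new batch you update the cached counts and check whether the best split at that node has changed; if it hasn't, only the branches below it need attention.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a collection of label counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c)

class NodeStats:
    """Hypothetical per-node cache of (feature value, label) counts."""

    def __init__(self, n_features):
        self.n_features = n_features
        # One Counter per feature, keyed by (value, label).
        self.joint = [Counter() for _ in range(n_features)]
        self.labels = Counter()

    def update(self, rows, labels):
        """Fold a new batch into the cached counts (no old data needed)."""
        for x, y in zip(rows, labels):
            self.labels[y] += 1
            for j in range(self.n_features):
                self.joint[j][(x[j], y)] += 1

    def info_gain(self, j):
        """Mutual information between feature j and the label, from counts."""
        n = sum(self.labels.values())
        by_value = {}
        for (v, y), c in self.joint[j].items():
            by_value.setdefault(v, Counter())[y] += c
        cond = sum(
            sum(cnt.values()) / n * entropy(cnt.values())
            for cnt in by_value.values()
        )
        return entropy(self.labels.values()) - cond

    def best_feature(self):
        return max(range(self.n_features), key=self.info_gain)

# Initial data: feature 0 determines the label, feature 1 is noise.
stats = NodeStats(n_features=2)
stats.update([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 0, 1, 1])
split = stats.best_feature()

# New batch from roughly the same distribution: the split should survive.
stats.update([(0, 1), (1, 0)], [0, 1])
needs_rebuild = stats.best_feature() != split
```

If `needs_rebuild` comes back true at a node near the root, you're in the second case and the whole subtree under that node has to be regrown.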

monkeysingmonkeynew OP t1_j4pjoxj wrote on January 17, 2023 at 10:40 AM

Thanks! I'll mull this over