Submitted by N3urAlgorithm t3_1115h5o in deeplearning
N3urAlgorithm OP t1_j8cwwcr wrote
Reply to comment by Zeratas in GPU comparisons: RTX 6000 ADA vs Hopper H100 by N3urAlgorithm
Yes, since I'm going to use it for work, building a server setup will be fine.
artsybashev t1_j8e2dmj wrote
I understand that you have given up on the cloud. Just so you understand the options: $50k buys you about 1,000 days of 4x A100 from vast.ai at today's pricing. Since at least one new GPU generation will arrive within 3 years, you will probably get more like 6 years of 4x A100, or one year of 4x A100 plus one year of 4x H100. Keeping your rig at 100% utilization for 3 years might also be hard if you plan to take holidays.
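A quick back-of-the-envelope check of that figure. The hourly rate below is an assumption back-derived from the $50k / ~1,000-day estimate, not a quoted vast.ai price:

```python
# Rough cloud-rental math, assuming the rate implied by the comment above.
# The hourly rate is a back-derived assumption, not an actual vast.ai quote.

budget_usd = 50_000
hourly_rate_4x_a100 = 2.08  # assumed $/hr for a 4x A100 instance (~$0.52 per GPU-hour)

days_of_compute = budget_usd / (hourly_rate_4x_a100 * 24)
print(f"{days_of_compute:.0f} days of 4x A100")  # ~1000 days, i.e. ~2.7 years at 100% utilization
```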
Appropriate_Ant_4629 t1_j8h5l44 wrote
> Keeping your rig at 100% utilization for 3 years might be hard if you plan to have holidays.
Given what he's asking for, he probably has jobs big enough that they'll run right through the holidays.
artsybashev t1_j8i33cp wrote
Yeah, might be. I've only seen companies do machine learning in two ways. One is to rent a cluster of GPUs and train something big for a week or two to explore something interesting. The other pattern is to retrain a model every week with fresh data. Maybe that's the case for the OP: retraining a model each week and serving it from some cloud platform. It makes sense to build a dedicated instance for a recurring task if you know you'll need it for more than a year. I'd guess it also works out cheaper than AWS's upfront (reserved) payment option.
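To make the break-even intuition concrete, here is a hypothetical utilization check. The rental rate and 3-year lifetime are assumptions, power and hosting costs are ignored, and only the $50k budget comes from the thread:

```python
# Hypothetical utilization check for the dedicated-rig argument above.
# Rental rate and lifetime are illustrative assumptions; only the $50k
# budget comes from the thread. Power/hosting costs are ignored.

rig_cost_usd = 50_000         # dedicated server budget discussed in the thread
rental_rate_per_hour = 2.08   # assumed $/hr for a comparable 4-GPU cloud instance
lifetime_years = 3            # assumed useful life before the next GPU generation

lifetime_hours = lifetime_years * 365 * 24
break_even_hours = rig_cost_usd / rental_rate_per_hour
utilization_needed = break_even_hours / lifetime_hours

print(f"Rig needs ~{utilization_needed:.0%} utilization over {lifetime_years} years "
      f"to beat on-demand rental")  # ~91% with these assumptions
```

With these numbers the rig only wins if it stays busy almost all the time, which is why the weekly-retraining (or always-queued) workload is the case where buying hardware tends to pay off.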