
Zeratas t1_j8cwg7l wrote

You're not going to be putting an H100 in a workstation. That's a server card.

With the GPUs you were mentioning, are you prepared to spend 30 to 50 thousand dollars just on the GPUs?

IIRC, the A6000s are the top-of-the-line desktop cards.

IMHO, take a look at the specs and the performance on your own workload. You'd get better value from something like one or two A6000s, and maybe investing in a longer-term server-based solution.

10

N3urAlgorithm OP t1_j8cwwcr wrote

Yes, since I'm going to use it for work, it'll be OK to build a server option.

2

artsybashev t1_j8e2dmj wrote

I understand that you have given up hope for cloud. Just so you understand the options, $50k gives you about 1000 days of 4x A100 from vast.ai at today's pricing. Since there is going to be at least one new generation within 3 years, you will probably get more like 6 years of 4x A100, or one year of 4x A100 + 1 year of 4x H100. Keeping your rig at 100% utilization for 3 years might be hard if you plan to have holidays.
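To make the arithmetic concrete, here is a rough sketch using only the two figures quoted above ($50k and about 1000 days of 4x A100). The function names are made up for illustration, the 3-year lifetime and the utilization levels are assumptions, and it ignores electricity, hosting, resale value, and future price drops, so treat it as a back-of-the-envelope comparison rather than a real cost model.

```python
# Back-of-the-envelope rent-vs-buy arithmetic.
# Inputs from the comment: $50k budget, ~1000 days of 4x A100 rental.
# Everything else (3-year lifetime, utilization levels) is an assumption.

BUDGET_USD = 50_000
RENTAL_DAYS = 1000      # days of 4x A100 the budget buys, per the comment
GPUS = 4


def implied_daily_rate(budget_usd, rental_days):
    """Rental price per day for the whole 4-GPU instance."""
    return budget_usd / rental_days


def owned_cost_per_used_day(budget_usd, years, utilization):
    """Purchase price spread over the days the rig is actually busy.

    Ignores power, hosting, and resale value (assumption).
    """
    busy_days = years * 365 * utilization
    return budget_usd / busy_days


daily = implied_daily_rate(BUDGET_USD, RENTAL_DAYS)
per_gpu_hour = daily / 24 / GPUS

print(f"implied rental: ${daily:.2f}/day, ${per_gpu_hour:.2f}/GPU-hour")

for utilization in (1.0, 0.75, 0.5):
    cost = owned_cost_per_used_day(BUDGET_USD, years=3, utilization=utilization)
    print(f"owned rig at {utilization:.0%} utilization: ${cost:.2f} per busy day")
```

Under these assumptions, the owned rig only beats the implied rental rate if it stays busy nearly all the time, which is the point about holidays.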

3

Appropriate_Ant_4629 t1_j8h5l44 wrote

> Keeping your rig at 100% utilization for 3 years might be hard if you plan to have holidays.

With his ask, he probably has jobs big enough they'll run through the holidays.

1

artsybashev t1_j8i33cp wrote

Yeah, might be. I've only seen companies do machine learning in two ways. One is to rent a cluster of GPUs and train something big for a week or two to explore something interesting. The other use pattern is to retrain a model every week with fresh data. Maybe this is the case for OP: retraining a model each week and serving that model with some cloud platform. It makes sense to build a dedicated instance for recurring tasks if you know that there is a need for it for more than a year. I guess it is also cheaper than using the upfront payment option in AWS.

1