Help Me Build My Deep Learning Workstation

Hi oz-bargainers,

After a few weeks of research I've reached the point where I can put together the parts for my deep learning rig.
I thought I'd post my current configuration here to get some feedback from the experts.

The price is currently sitting at ~$4.5k (and I need to buy a monitor too!). I'd appreciate advice on alternate configs, or any suggestions for making this better or cheaper.

The most important thing is the GPU. After doing a fair bit of research I decided to go for a dual-GPU setup consisting of 2 x Asus GeForce RTX 2070 SUPER 8 GB Turbo EVOs instead of 1 x RTX 2080 Ti, as I get more VRAM for the same price (important for deep learning). I've chosen my other parts around these GPUs.
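As a rough sanity check, the VRAM-per-dollar comparison behind that choice can be sketched like this (the prices below are illustrative placeholders, not actual quotes):

```python
# VRAM-per-dollar comparison: dual 2070 SUPER vs single 2080 Ti.
# Prices are illustrative placeholders, not actual quotes.
options = {
    "2x RTX 2070 SUPER": {"vram_gb": 2 * 8, "price_aud": 2 * 900},
    "1x RTX 2080 Ti":    {"vram_gb": 11,    "price_aud": 1800},
}
for name, o in options.items():
    per_1k = o["vram_gb"] / o["price_aud"] * 1000
    print(f"{name}: {o['vram_gb']} GB total VRAM, {per_1k:.1f} GB per $1000")
```

One caveat worth keeping in mind: without model parallelism, a single model still only sees 8 GB per card, so the 16 GB is "total" rather than "per model".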

Here is the link to the parts. Please let me know what you think.

https://au.pcpartpicker.com/list/z8Zkn7

Comments

  • +1

    One issue I can see is that the mobo has 1x x16 slot and 1x x8 slot, so your 2nd GPU is only getting PCIe 3.0 x8.

    It's a dual-channel memory CPU, so you won't have the memory bandwidth of the Threadripper CPUs.

    The 970 EVO is a low-endurance drive; I'd have gone for the 970 Pro for the extra endurance.

    Good choice on the blower GPUs, but make sure you've got enough airflow pushing toward them (especially if you're air cooling your CPU).

    https://au.pcpartpicker.com/list/fFxyBZ

    Not sure if the link works, but here's my setup (due to compatibility issues within PCPartPicker I wasn't able to set it up correctly).

    Essentially, mostly as per link with differences being:
    4x Nvidia 1080Ti FE's
    1x U.2 960GB Optane 905P in lieu of PCIe

    I went with high memory bandwidth and low-latency storage, and with the X299 SAGE there are 4x PCIe 3.0 x16 slots driven by a PCIe switch for max bandwidth to each GPU.

    Starting my courses soon (delayed due to injury), so I look forward to training on this machine. Autonomous vehicles for starters for me.

    • Which motherboard could I go for that would have 2x PCIe x16? The X299 boards are not compatible with my CPU.

      Can you also explain what you meant by "It's a dual channel memory CPU, so you won't have the memory bandwidth of the Threadripper CPUs"?

      • +2

        Ryzen 3000 CPUs have a total of 24 PCIe Gen 4 lanes, meaning you don't actually have enough lanes to run two cards at x16.

        Four of the twenty-four are used for the interconnect to the X570 chipset, which leaves you 20 Gen 4 lanes for everything else.

        16 of those lanes (the PCIe x16 slot wiring) are intended for graphics cards, connected either as one card at x16 or two cards at x8.

        To get two full x16 PCIe slots for your graphics cards, you need HEDT parts. Basically that's either TRX40 or X399 when you're looking at Ryzen (Threadripper).
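        The lane arithmetic above can be sketched in a few lines (numbers straight from this comment):

```python
# PCIe lane budget on Ryzen 3000 / X570, per the breakdown above.
total_lanes = 24                      # Gen 4 lanes from the CPU
chipset_link = 4                      # reserved for the X570 interconnect
usable = total_lanes - chipset_link   # 20 lanes left for everything else
gpu_lanes = 16                        # lanes wired for the graphics slots
per_gpu = gpu_lanes // 2              # two cards must share them: x8 each
print(f"usable lanes: {usable}, per GPU with two cards: x{per_gpu}")
```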

  • Depends on your application. I'd suggest 4 to 8 GPUs if you want to build and test your own models. It's a really slow process, and if the results aren't coming fast you can easily get frustrated. Waiting for days to train your network is a real pain.

  • +2

    Is it possible to rent time on some big hardware? (akin to AWS or whatnot)?

    Could work out cheaper and faster (not as awesome though).

    • In the long run it's better to have your own machine.

    • +1

      Vast.ai is always handy if you want to rent cheap.

      • Thanks. I'll have a look
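    The rent-vs-buy question in this thread can be sanity-checked with a quick break-even sketch. The hourly rate and build cost below are hypothetical placeholders; plug in real quotes from AWS or Vast.ai before deciding.

```python
# Break-even point between renting GPU time and buying a rig outright.
# Both figures below are hypothetical placeholders, not real quotes.
build_cost_aud = 4500          # roughly the build discussed in this post
rent_per_hour_aud = 1.0        # assumed rate for a comparable cloud instance
breakeven_hours = build_cost_aud / rent_per_hour_aud
print(f"Renting breaks even after ~{breakeven_hours:.0f} GPU-hours")
```

    If your expected training time is well under that, renting likely wins on cost; well over, and owning the machine pays off.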

  • +2

    You might be better off using AWS etc. until the COVID prices calm down a bit. GPUs are approaching 2x their pre-COVID pricing.

    • oO I didn't know that. I need to get this before EOFY as it's for my business.

      • Talk to your accountant about the asset write-off then; you might be able to splash out on a proper workstation setup (i.e. workstation mobo/CPU and monitor etc.) if it's business related.

        • +1

          That's the plan. I should be able to write off the entire purchase. At the same time business wasn't very good this year so I don't want to spend too much.

  • +1

    AI training is not my speciality, but I know a bit generally about HPC…

    Have you characterised your workload? There's talk here about trade-offs with more GPU RAM across multiple cards, x8/x16 PCIe, etc. Do you know what effect these will have on your computations? If so, how robust are those predictions to changes in algorithm or tooling? For my work we have spreadsheets that estimate this kind of thing, but I work with bigger systems and pretty well-defined workloads.

    I would go for a reputable brand of PSU (I've not heard of Cougar, but I've been out of the game of picking parts for a while). I'd also suggest a better power-efficiency rating: less heat, and less power usage both directly and indirectly (assuming you're running this box in an air-conditioned office).

    Are you only doing training, not loads of inference? As I understand it, different accelerators are more suited to one or the other.
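    A back-of-envelope version of that workload characterisation, for the x8-vs-x16 question raised earlier in the thread (every number here is an assumption you'd replace with a measurement):

```python
# Crude check of whether PCIe x8 vs x16 matters for a training workload:
# compare per-batch host-to-device transfer time against GPU compute time.
# All figures are illustrative assumptions, not measurements.
batch_mb = 200                           # data copied to the GPU per batch
compute_ms = 150                         # GPU compute time per batch
pcie_gbps = {"x16": 15.8, "x8": 7.9}     # rough PCIe 3.0 practical throughput
for slot, gbps in pcie_gbps.items():
    transfer_ms = batch_mb / 1024 / gbps * 1000
    print(f"{slot}: transfer {transfer_ms:.1f} ms vs compute {compute_ms} ms")
```

    If transfer time is a small fraction of compute time (and copies can overlap compute), dropping to x8 costs little; if the workload is transfer-bound, it matters a lot more.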

  • +2

    I'm not a DL expert and don't mean to be too negative, but if you're building a deep learning workstation for business purposes then I think you should already have a pretty clear idea of what you need, and I'm not sure OzBargain is going to be the best place for expert advice specific to your situation.

    Having recently built a personal workstation for machine learning (and possibly DL in the not-too-distant future), I think you have to realise the limitations of a consumer-grade system. An example is that 128 GB of RAM is the max for your board, with no scope for upgrading without changing to a new board and a different chip (i.e. Threadripper). You may also be limited to two GPUs AFAIK. Even if you get two RTX Titans, depending on your model 48 GB of VRAM may not be enough, so you might end up relying on rented hardware anyway.

    Overall I think getting the 2070 SUPER over the 2080 Ti is not a bad idea for the extra VRAM. Also, I suspect prices for the 2080 Ti will drop more than the 2070 SUPER when new models are released.

    One other thing to note: NVIDIA has released a new data-centre GPU, the A100, which is meant to make outsourcing deep learning much cheaper from what I've read.
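    On the "48 GB of VRAM may not be enough" point above, here is a rough rule-of-thumb estimate of training memory (the 4x multiplier is a common heuristic, not an exact figure for any particular framework):

```python
# Rule-of-thumb GPU memory needed to train a model, before activations.
# The 4x multiplier (weights + gradients + Adam moment buffers) is a
# common heuristic, not an exact figure for any particular framework.
def train_memory_gb(params_millions, bytes_per_param=4, multiplier=4):
    return params_millions * 1e6 * bytes_per_param * multiplier / 1e9

# e.g. a hypothetical 1-billion-parameter model:
print(f"~{train_memory_gb(1000):.0f} GB before activations")
```

    Activations, batch size and framework overhead come on top of that, which is why even generous VRAM budgets can run out on large models.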

  • Jesus, I see graphics cards haven't taken their foot off the pedal; near $1000 for a 2070 is crazy.

    If I was spending this much money, for an extra $200 I'd be inclined to go for the 3900X.
