ML Workstation Build

Finally decided to get some GPUs running locally. It's a part of the stack I want to learn more about, and having a machine that is always running, one I don't have to reserve in the cloud, just made sense. A machine that I can control end to end. Decided on the following specs:

The goal is to eventually build it up to 1 TB of RAM and 4 x RTX Pro 6000, but the family was not too keen to switch to a ramen-only diet just yet, so I will build this one out slowly over time.

The hardware side of the build was very straightforward (almost destroying the CPU aside). Just parsing through a lot of manuals (AI-assisted, of course) and dealing with a lot of screws. Things that made the build a breeze:

Decided to run Ubuntu 24.04 headless on it. Some issues I ran into after the first boot:

BUG: kernel NULL pointer dereference
RIP: bit_entry+0x15/0x110 [nouveau]
nouveau 0000:f1:00.0: vgaarb: deactivate vga console
Console: switching to colour dummy device 80x25

The fix was to blacklist the nouveau driver and rebuild the initramfs so the driver is not loaded on the next boot. Once the machine booted successfully, I installed the official NVIDIA drivers, which worked like a charm.
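The blacklisting step can be sketched roughly as below, assuming the stock Ubuntu tooling (a modprobe.d drop-in plus update-initramfs). The helper name write_nouveau_blacklist and the file name blacklist-nouveau.conf are illustrative choices, not necessarily the exact commands used on this machine.

```shell
# Sketch: keep nouveau from loading at boot.
# write_nouveau_blacklist is a hypothetical helper; it takes the modprobe.d
# directory as an argument so it can be tried against a scratch directory first.
write_nouveau_blacklist() {
    dir="$1"
    # "blacklist" stops the module from being auto-loaded via its aliases;
    # modeset=0 disables kernel mode setting in case it still gets pulled in.
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' \
        > "$dir/blacklist-nouveau.conf"
}

# On the real machine (as root), the sequence would look something like:
#   write_nouveau_blacklist /etc/modprobe.d
#   update-initramfs -u      # rebuild the initramfs so the blacklist applies at early boot
#   reboot
#   lsmod | grep nouveau     # should print nothing after the reboot
```

Writing the file through a function keeps the risky part (touching /etc and rebuilding the initramfs) as an explicit, separate step.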

nvidia-smi --query-gpu=index,name,pci.bus_id,pcie.link.width.current,pcie.link.width.max,pcie.link.gen.current,pcie.link.gen.max --format=csv
index, name, pci.bus_id, pcie.link.width.current, pcie.link.width.max, pcie.link.gen.current, pcie.link.gen.max
0, NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition, 00000000:21:00.0, 16, 16, 5, 5
1, NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition, 00000000:F1:00.0, 16, 4, 5, 5
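To spot a GPU that negotiated a different link width or generation than its reported maximum, the same query can be filtered with a few lines of awk. This is a minimal sketch: check_pcie_links is a hypothetical helper name, and it assumes the name column is dropped from the query and the header is suppressed with --format=csv,noheader.

```shell
# Sketch: read nvidia-smi CSV rows (no header) on stdin and flag any GPU whose
# current PCIe link width or generation differs from its reported maximum.
# Expected columns: index, pci.bus_id, width.current, width.max, gen.current, gen.max
check_pcie_links() {
    awk -F', *' '
        {
            status = ($3 == $4 && $5 == $6) ? "ok" : "MISMATCH"
            printf "GPU %s: width %s/%s, gen %s/%s -> %s\n", $1, $3, $4, $5, $6, status
        }
    '
}

# Usage on the workstation:
#   nvidia-smi --query-gpu=index,pci.bus_id,pcie.link.width.current,pcie.link.width.max,pcie.link.gen.current,pcie.link.gen.max \
#       --format=csv,noheader | check_pcie_links
```

A generation mismatch at idle is not necessarily a fault, since the link can downshift to save power; checking while the GPU is under load should give a more reliable reading.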

With all that out of the way, the machine is alive and well. It's humming away, and I have been restricted to a diet of ramen for the next few months.
