A guide on how to run Nemotron 3 Super 120B Thinking on 2 Nvidia DGX Spark

2 points posted 9 hours ago
by TechPreacher

3 Comments

orbanlevi

9 hours ago

I have 1 DGX Spark and am running models with vLLM too. Out of curiosity, why not use Llama.cpp, TensorRT-LLM, or any of the other alternatives?

awedisee

6 hours ago

Oh thank god. Finally a man of the people who can show us how to optimize 10k worth of equipment.

Because we all have at least two of these. Shout out to OP!!