abtinf
3 months ago
From the FAQ… doesn’t seem promising when they pose a crucial question and then evade it.
> What is the memory bandwidth supported by Ascent GX10? AI applications often require a bigger memory. With the NVIDIA Blackwell GPU that supports 128GB of unified memory, ASUS Ascent GX10 is an AI supercomputer that enables faster training, better real-time inference, and support larger models like LLMs.
Youden
3 months ago
They seem to have another FAQ here that gives a real answer (273GB/s): https://www.asus.com/us/support/faq/1056142/
suprjami
3 months ago
Now we can see why they avoided giving a straight answer.
File this one in the blue folder like the DGX
stogot
3 months ago
Noob here. Why is that number bad?
TomatoCo
3 months ago
LLM performance depends on doing a lot of math on a lot of different numbers. For example, if your model has 8 billion parameters and each parameter is one byte, then at 256 GB/s you can't do better than 32 tokens per second. So if you load a model that's 80 GB, you only get 3.2 tokens per second, which is kinda bad for something that costs $3-4k.
There are newer models called "Mixture of Experts" that are, say, 120B parameters but only use 5B parameters per token (the specific parameters are chosen by a much smaller routing model). That is the kind of model that excels on this machine. Unfortunately for this product, those models also work really well with hybrid inference, because the GPU can handle the small-but-computationally-complex fully connected layers while the CPU handles the large-but-computationally-easy expert layers.
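Here's a back-of-the-envelope sketch of that ceiling in Python (my own toy function, assuming the only per-token cost is streaming every active weight from memory once):

```python
def max_tokens_per_sec(bandwidth_gb_s, active_params_billions, bytes_per_param=1.0):
    """Decode-speed ceiling for a bandwidth-bound LLM: every active
    parameter has to be streamed from memory once per generated token."""
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

print(max_tokens_per_sec(256, 8))   # dense 8B at 1 byte/param -> 32.0 tok/s
print(max_tokens_per_sec(256, 80))  # 80 GB of weights -> 3.2 tok/s
print(max_tokens_per_sec(256, 5))   # MoE with ~5B active params -> 51.2 tok/s
```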
This product doesn't really have a niche for inference. Training and prototyping are another story, but I'm a noob on those topics.
abtinf
3 months ago
My Mac laptop has 400 GB/s of memory bandwidth. LLMs are bandwidth-bound.
kennethallen
3 months ago
Running LLMs will be slow and training them is basically out of the question. You can get a Framework Desktop with similar bandwidth for less than a third of the price of this thing (though that isn't NVIDIA).
embedding-shape
3 months ago
> Running LLMs will be slow and training them is basically out of the question
I think it's the reverse: the use case for these boxes is basically training and fine-tuning, not inference.
kennethallen
3 months ago
The use case for these boxes is a local NVIDIA development platform before you do your actual training run on your A100 cluster.
NaomiLehman
3 months ago
Refurbished M1 MacBooks for $1,500 have more bandwidth with less latency.
fancyfredbot
3 months ago
They have failed to provide answers to other FAQ entries as well. The answers are really awkward and don't read like LLM output, which I'd expect to be much more fluent. Perhaps a model that was lobotomized through FP4 quantisation and "fine-tuning" on one of these.
LeifCarrotson
3 months ago
It sounds good, but it ultimately fails to comprehend the question: ignoring the word "bandwidth" and just spewing pretty nonsense.
Which is appropriate, given the applications!
I see that they mention it uses LPDDR5X, so bandwidth will be nowhere near as high as something using HBM or GDDR7, even if the bus width is large.
Edit: I found elsewhere that the GB10 has a 256-bit LPDDR5X-9400 memory interface, allowing for ~300 GB/s of memory bandwidth.
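If anyone wants to sanity-check that edit, the number falls straight out of transfer rate times bus width (a quick sketch; the 256-bit and 9400 MT/s figures are the ones quoted above):

```python
def peak_bandwidth_gb_s(bus_width_bits, megatransfers_per_sec):
    """Peak memory bandwidth = transfer rate x bus width in bytes."""
    return megatransfers_per_sec * 1e6 * (bus_width_bits / 8) / 1e9

print(peak_bandwidth_gb_s(256, 9400))  # 256-bit LPDDR5X-9400 -> 300.8 GB/s
```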
tuhgdetzhh
3 months ago
For comparison, the RTX 5090 has a memory bandwidth of 1,792 GB/s. The GX10 will likely be quite disappointing in terms of tokens per second and therefore not well suited for real-time interaction with a state-of-the-art large language model or as a coding assistant.
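To make that concrete, here's the same bandwidth-bound estimate applied to the numbers from this thread (a hypothetical 40 GB of 4-bit weights, purely illustrative):

```python
# Same bandwidth-bound ceiling as above: tokens/sec <= bandwidth / bytes per token.
weights_gb = 40  # hypothetical ~70B dense model at 4-bit quantization
for name, bw_gb_s in [("GX10", 273), ("M1 Max MacBook", 400), ("RTX 5090", 1792)]:
    print(f"{name} ({bw_gb_s} GB/s): ~{bw_gb_s / weights_gb:.1f} tok/s")
# GX10: ~6.8, M1 Max: ~10.0, RTX 5090: ~44.8
# (Ignores that 40 GB wouldn't even fit in the 5090's 32 GB of VRAM.)
```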
guerrilla
3 months ago
It doesn't sound good at all. It sounds like malicious evasion and marketing bullshit.
exe34
3 months ago
It gives you a very good idea of the capability of the models you'll be running on it!
guerrilla
3 months ago
It doesn't give a good idea of anything. We already know it has 128GB unified memory from the first bullet point on the page.
darkwater
3 months ago
GP was subtly implying that the text was written by an LLM (running in the very same Ascent GX10).
guerrilla
3 months ago
Ah! Thanks for explaining. haha
BikiniPrince
3 months ago
With a little tinkering we can just have the AI gaslight us about its capabilities.
epolanski
3 months ago
I think the previous user was joking that LLMs spew nonsense, much like AI marketing BS, which makes this product quite fitting.
curvaturearth
3 months ago
Written by an LLM?