Nemotron 3 Ultra: Open MoE Hybrid Mamba-Transformer for Agentic Reasoning [pdf]

20 points, posted 5 hours ago
by victormustar

2 Comments

2001zhaozhao

21 minutes ago

This model seems like a really big deal. Is this the biggest Western open-source AI model in the world (beating out Llama 3 405B)?

throwa356262

4 hours ago

Is this the one from Jensen's Computex presentation the other day?

It is significantly bigger than Qwen for the same level of intelligence, but I think the key strength was inference speed.