gardnr
5 days ago
This is a 30B parameter MoE with 3B active parameters and is the successor to their previous 7B omni model. [1]
You can expect this model to have similar performance to the non-omni version. [2]
There aren't many open-weight omni models, so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application while doing the heavy lifting with other tech behind the scenes (rough sketch below). There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.
1. https://huggingface.co/Qwen/Qwen2.5-Omni-7B
2. https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct
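To make the keyboard-and-monitor idea concrete, here's the rough shape I mean. Everything below is a made-up placeholder, not a real API; it just shows where the omni model sits relative to the rest of the app:

```python
# Hypothetical sketch: the omni model is only the voice I/O layer,
# and everything substantial happens in ordinary backend code behind it.

def understand(audio_chunk: bytes) -> dict:
    # placeholder for the omni model: speech in -> text plus a parsed intent
    return {"intent": "order_status", "text": "where is my order?"}

def heavy_lifting(request: dict) -> str:
    # placeholder for the real application: databases, tools, other models
    return "Your order ships tomorrow."

def speak(reply: str) -> bytes:
    # placeholder for the omni model's speech output: text -> waveform
    return reply.encode()  # stand-in for audio samples

def handle_turn(audio_chunk: bytes) -> bytes:
    request = understand(audio_chunk)  # omni model replaces the keyboard
    answer = heavy_lifting(request)    # the actual work stays behind the scenes
    return speak(answer)               # omni model replaces the monitor

print(handle_turn(b"...mic samples..."))
```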
red2awn
5 days ago
This is a stack of models:
- 650M Audio Encoder
- 540M Vision Encoder
- 30B-A3B LLM
- 3B-A0.3B Audio LLM
- 80M Transformer/200M ConvNet audio token to waveform
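Roughly, those pieces chain like this (a sketch only; the stub names are mine, not the actual module names, and the parameter counts are just the ones listed above):

```python
# Dataflow sketch of the stack above. Plain stubs, not the real modules.

def audio_encoder(audio):    return "audio embeddings"   # ~650M encoder
def vision_encoder(image):   return "vision embeddings"  # ~540M encoder
def thinker_llm(a, v, text): return "text reply"         # 30B-A3B MoE LLM
def talker_llm(reply):       return [1, 2, 3]            # 3B-A0.3B audio LLM -> audio tokens
def code2wav(tokens):        return b"waveform"          # 80M transformer / 200M ConvNet

def omni_forward(audio, image, prompt):
    a, v = audio_encoder(audio), vision_encoder(image)
    reply = thinker_llm(a, v, prompt)        # text answer comes from the big LLM
    waveform = code2wav(talker_llm(reply))   # speech is synthesized from the reply
    return reply, waveform

print(omni_forward(b"audio", b"image", "What is in the clip?"))
```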
This is a closed-weight update to their Qwen3-Omni model. They had a previous open-weight release, Qwen/Qwen3-Omni-30B-A3B-Instruct, and a closed version, Qwen3-Omni-Flash.
You basically can't use this model right now, since none of the open-source inference frameworks have it fully implemented. It works in transformers, but it's extremely slow.
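For anyone who wants to try the open-weight release anyway, the transformers path looks roughly like this. This is a sketch from memory of the pattern on the Qwen3-Omni-30B-A3B-Instruct model card; the class names and the qwen_omni_utils helper are assumptions on my part, so check the card before relying on it:

```python
# Sketch only: class names and helpers are my recollection of the model card,
# not verified, and it will be slow, as noted above.
from transformers import Qwen3OmniMoeForConditionalGeneration, Qwen3OmniMoeProcessor
from qwen_omni_utils import process_mm_info  # helper package referenced by the card

model_id = "Qwen/Qwen3-Omni-30B-A3B-Instruct"
model = Qwen3OmniMoeForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto")
processor = Qwen3OmniMoeProcessor.from_pretrained(model_id)

conversation = [{"role": "user", "content": [
    {"type": "audio", "audio": "question.wav"},
    {"type": "text", "text": "Answer the question in the clip."},
]}]

text = processor.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
audios, images, videos = process_mm_info(conversation, use_audio_in_video=False)
inputs = processor(text=text, audio=audios, images=images, videos=videos,
                   return_tensors="pt", padding=True).to(model.device)

# per the card's pattern, generate() returns text token ids plus synthesized speech
text_ids, audio = model.generate(**inputs)
print(processor.batch_decode(text_ids, skip_special_tokens=True)[0])
```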
olafura
5 days ago
Looks like it's not open source: https://www.alibabacloud.com/help/en/model-studio/qwen-omni#...
coder543
5 days ago
No... that website is not helpful. If you take it at face value, it is claiming that the previous Qwen3-Omni-Flash wasn't open either, but that seems wrong? It is very common for these blog posts to get published before the model weights are uploaded.
red2awn
5 days ago
The previous -Flash weights are closed source. They do have weights for the original model, which is slightly behind in performance: https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct
coder543
5 days ago
Based on things I had read over the past several months, Qwen3-Flash seemed to just be a weird marketing term for the Qwen3-Omni-30B-A3B series, not a different model. If they are not the same, then that is interesting/confusing.
red2awn
5 days ago
It is an in-house, closed-weight model for their own chat platform, mentioned in Section 5 of the original paper: https://arxiv.org/pdf/2509.17765
I've seen it in their online materials too but can't seem to find it now.
gardnr
5 days ago
I can't find the weights for this new version anywhere; I checked ModelScope and Hugging Face. It looks like they may have extended the context window to 200K+ tokens, but the actual weights are nowhere to be found.
pythux
5 days ago
The blog post links to https://huggingface.co/collections/Qwen/qwen3-omni-68d100a86... but that seems to redirect to their main space on HF, so maybe they haven't made the model public yet?
tensegrist
5 days ago
> There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.
Last I checked (months ago), Claude used to do this.
plipt
5 days ago
I don't think the Flash model discussed in the article is 30B.
Their benchmark table shows it beating Qwen3-235B-A22B.
Does "Flash" in the name of a Qwen model indicate a model-as-a-service and not open weights?
red2awn
5 days ago
Flash is a closed-weight version of https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct (it is 30B, but with additional training on top of the open-weight release). They deploy the Flash version on Qwen's own chat.
plipt
5 days ago
Thanks
Was it obvious to you from the article that it's closed-weight? I'm trying to understand why I was confused; I hadn't seen the "Flash" designation before.
Also, can 30B models beat a semi-recent 235B with just some additional training?
red2awn
5 days ago
They had a Flash variant released alongside the original open weight release. It is also mentioned in Section 5 of the paper: https://arxiv.org/pdf/2509.17765
For the evals, it's probably just trained on a lot of benchmark-adjacent data compared to the 235B model. A similar thing happened with another model today: https://x.com/NousResearch/status/1998536543565127968 (a 30B model trained specifically to do well at maths gets near-SOTA scores).
andy_ppp
4 days ago
Haha, you could hear how its mind thinks, maybe by putting a lot of reverb on the thinking tokens or some other effect…
andy_xor_andrew
5 days ago
> This is a 30B parameter MoE with 3B active parameters
Where are you finding that info? Not saying you're wrong; just saying that I didn't see that specified anywhere in the linked page, or on their HF.
gardnr
2 days ago
I was wrong. I confused this with their open model. Looking at it more closely, it is likely an omni version of Qwen3-235B-A22B. I wonder why they benchmarked it against Qwen2.5-Omni-7B instead of Qwen3-Omni-30B-A3B.
I wish I could delete the comment.
plipt
5 days ago
The link[1] at the top of their article to HuggingFace goes to some models named Qwen3-Omni-30B-A3B that were last updated in September. None of them have "Flash" in the name.
The benchmark table shows this Flash model beating their Qwen3-235B-A22B. I don't see how that is possible if it is a 30B-A3B model.
I don't see a mention of a parameter count anywhere in the article. Do you? This may not be an open-weights model.
This article feels a bit deceptive.