minimaxir
13 hours ago
Everyone is sleeping on Gemini 2.5 Flash Image / Nano Banana. As shown in the OP, it's substantially more powerful than most other models at the same price per image, and thanks to its text encoder it can handle significantly larger and more nuanced prompts to get exactly what you want. I open-sourced a Python package for generating from it with examples (https://github.com/minimaxir/gemimg) and am currently working on a blog post with even more representative examples. Google also allows generations for free with aspect ratio control in AI Studio: https://aistudio.google.com/prompts/new_chat
That said, I am surprised Seedream 4.0 beat it in these tests.
daemonologist
12 hours ago
I don't think people are really sleeping on it - nano-banana more or less went viral when it first came out. I'd argue that aside from the capabilities built into ChatGPT (with the Ghibli craze and whatnot), it's the best-known image editing model.
minimaxir
10 hours ago
It's a weird situation where the Gemini mobile app hit #2 on the App Stores because of free Nano Banana, but no one ever talks about it and most disclosed image generations I've seen are still ChatGPT.
ec109685
7 hours ago
Google photos should just include the feature. It’s kinda buried in Gemini.
Google is so weirdly non-integrated.
piquadrat
3 hours ago
They announced that Nano Banana will be integrated in Google Photos a couple weeks ago.
https://blog.google/technology/ai/nano-banana-google-product...
troupo
3 hours ago
> It’s kinda buried in Gemini.
> Google is so weirdly non-integrated.
Where by [try gemini] non-integrated [have you tried gemini] you mean [gemini is here] they shove [use gemini] Gemini into every single product they have?
vunderba
10 hours ago
> That said, I am surprised Seedream 4.0 beat it in these tests.
OP here. While Seedream did have the edge in adherence, it also tends to introduce slight (but noticeable) color gradation changes. It's not a huge deal for me, but it might be for other people depending on their goals, in which case NanoBanana would be the better choice.
herval
12 hours ago
Gemini is great when it gets it right, but in my experience, it sometimes gives you completely unexpected results and won't get it right no matter what. You can see that in some of the examples (e.g. the Girl with a Pearl Earring one). I'm constantly surprised by how good Flux is, but the tragedy is most people (me included) will just default to whatever they normally use (ChatGPT and Gemini, in my case), so it doesn't really matter that it's better.
dimitri-vs
11 hours ago
Agreed, to the point where I built my own UI where I can simultaneously generate three images and see a before/after. Most often only one of three is what I actually wanted.
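For anyone who wants the same "generate three, pick one" workflow without a UI, here is a minimal sketch. It is not my actual code: `generate` is a hypothetical callable you would wire to whatever image model client you use, and the function name is made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical hook: (source_image, prompt) -> edited_image.
GenerateFn = Callable[[bytes, str], bytes]


def generate_candidates(
    generate: GenerateFn, source: bytes, prompt: str, n: int = 3
) -> List[bytes]:
    """Run the same edit n times in parallel and return every candidate."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Same source and prompt each time; variation comes from the model itself.
        futures = [pool.submit(generate, source, prompt) for _ in range(n)]
        # Collect in submission order so candidates line up with their slots.
        return [future.result() for future in futures]
```

Showing the source next to each returned candidate gives you the before/after comparison.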
tigershark
5 hours ago
Flux Kontext quality is noticeably worse than nano banana, Qwen Image 2509 and Seedream 4 most of the time. For pure image generation, though, Hunyuan Image is scarily good.
cosama
12 hours ago
I was trying to use gemini 2.5 flash image / nano banana to tidy up a picture of my messy kitchen. It failed horribly on my first attempt. I was quite surprised how much trouble it had with this simple task (similar to cleaning up the street in the post). On my second attempt I had it first analyze the image to point out all the items that clutter the space, and then on a second prompt had it remove all those items. That worked much better, showing how important prompt engineering is.
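Roughly, the two-step flow looks like the sketch below. `describe` and `edit` are hypothetical callables standing in for whichever model client you use, and the prompt wording is only an illustration, not the exact text I typed.

```python
from typing import Callable

# Hypothetical model hooks: wire these to whatever client you actually use.
DescribeFn = Callable[[bytes, str], str]  # (image, prompt) -> text answer
EditFn = Callable[[bytes, str], bytes]    # (image, prompt) -> edited image


def two_pass_cleanup(image: bytes, describe: DescribeFn, edit: EditFn) -> bytes:
    """Ask the model to list the clutter first, then ask it to remove that list."""
    # Pass 1: analysis only, no editing.
    clutter = describe(
        image,
        "List every item that clutters this kitchen, one per line. "
        "Do not edit the image.",
    )
    # Drop bullet characters and blank lines from the model's answer.
    items = [line.strip("-* \t") for line in clutter.splitlines()]
    items = [item for item in items if item]
    if not items:
        return image  # nothing identified, leave the picture alone

    # Pass 2: edit using the explicit item list from pass 1.
    prompt = (
        "Remove exactly these items and keep everything else unchanged: "
        + "; ".join(items)
    )
    return edit(image, prompt)
```

The point of the split is that the second prompt names concrete objects instead of leaving "tidy up" open to interpretation.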
vunderba
4 hours ago
Yeah, that's part of the reason I list the number of attempts as part of the stats for each model + respective prompt. It's a loose metric of how "steerable" a given model is, or put another way, how much I had to fight with it before we were able to get it to follow the prompt directives.
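As a rough illustration of that attempt count (a sketch, not the harness behind the post): `generate` and `accept` are hypothetical callables, where `accept` stands in for a human looking at the result and deciding whether it followed the prompt.

```python
from typing import Callable, Optional, Tuple


def attempts_until_accepted(
    generate: Callable[[str], bytes],
    accept: Callable[[bytes], bool],
    prompt: str,
    max_attempts: int = 10,
) -> Tuple[Optional[bytes], int]:
    """Regenerate until a result is accepted; return it with the attempts used."""
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)
        if accept(image):
            return image, attempt  # a lower count suggests a more steerable model
    # Nothing acceptable within the budget.
    return None, max_attempts
```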
tigershark
5 hours ago
Seedream 4 is better than nano banana on average, so that test result seems accurate to me.
BoorishBears
10 hours ago
No one is sleeping on nano-banana/Gemini Flash. It's highly over-tuned for editing vs novel generation and maxes out at a pretty low resolution.
Seedream 4.0 is somewhat slept on for being 4K at the same cost as nano-banana. It's not as great at perfect 1:1 edits, but its aesthetics are much better and it's significantly more reliable in production for me.
Models with LLM backbones/omni-modal models are not rare anymore; even Qwen Image Edit is out there as open weights.
cpursley
11 hours ago
Meh, most Google AI products look great on paper but fail in real-world use. And that ranges from their Claude Code clone to their buggy storybook thing, which I really wanted to like.