danielhanchen
3 days ago
Downloaded params.json: GELU and 2D RoPE are used for the vision adapter. The vocab size also got larger, now 131072.
Also, Mistral's latest tokenizer PR shows 3 new tokens (the image, the start & end).
The torrent is 24 GB, and I guess the implementation will be up on HF in a few days!
Exciting times!
generalizations
2 days ago
At 24 GB, I wonder how small the quantized models will go.
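A rough back-of-the-envelope sketch of that question, assuming the 24 GB checkpoint stores weights at 16 bits each (an assumption; nothing in the thread confirms the dtype). The helper name `quantized_size_gb` is made up for illustration, and the estimate ignores quantization overhead such as per-block scales, as well as any layers kept at higher precision.

```python
def quantized_size_gb(checkpoint_gb: float, source_bits: int, target_bits: float) -> float:
    """Scale a checkpoint size by the ratio of target to source bits per weight."""
    return checkpoint_gb * target_bits / source_bits


CHECKPOINT_GB = 24.0  # torrent size mentioned in the thread
SOURCE_BITS = 16      # ASSUMPTION: weights are stored as 16-bit floats

for bits in (8, 4, 2):
    size = quantized_size_gb(CHECKPOINT_GB, SOURCE_BITS, bits)
    print(f"{bits}-bit: ~{size:.1f} GB")
```

Real quantized files would likely come out somewhat larger than these numbers, since most formats store extra metadata alongside the packed weights.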