
oneout

8 hours ago

Large Language Models are amazing tools for text processing and generation. Yet people keep complaining that LLMs can't count the number of r's in the word 'strawberry' and make mistakes in math calculations.

So, instead of explaining tokenizers for the nth time, I decided to build a system that can both give you a soup recipe and multiply 74398 by 94673.

Solved:
- Counting letters in a word/words in a sentence.
- Accurately performing calculations.
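The post does not say how Cloudberry handles these tasks internally. As a rough illustration of why they become trivial once they are handed to ordinary code instead of being answered token by token, here is a minimal Python sketch. The function names are hypothetical and are not taken from the project.

```python
# Hypothetical helpers, not Cloudberry's actual implementation:
# deterministic code for the tasks listed above.

def count_letter(text: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in text."""
    return text.lower().count(letter.lower())


def count_words(sentence: str) -> int:
    """Count whitespace-separated words in a sentence."""
    return len(sentence.split())


def multiply(a: int, b: int) -> int:
    """Multiply two integers exactly (Python ints have arbitrary precision)."""
    return a * b


print(count_letter("strawberry", "r"))       # 3
print(count_words("give me a soup recipe"))  # 5
print(multiply(74398, 94673))                # 7043481854
```

The code operates on characters and integers directly, so the tokenizer never gets a chance to hide individual letters or digits from the computation.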

Limitations:
- The system lacks spatial reasoning.
- Some tokenizer-related challenges persist (e.g., producing output that contains a specified number of words).
- The number of application instances and the inference speed are limited: this is a research project, and I didn't build it to scale or handle high load.

This is just a thought experiment that has been on my mind for a long time.

I'll be happy to hear your feedback or collaborate on the next iteration of the project!

Project Cloudberry: how to teach an LLM to count
