hackernews client

deepsquirrelnet

9 months ago

Over the last few weeks, I've been working hard at adding semantic splitting/chunking into WordLlama, and I'm excited to share what I came up with. Because of the fast and lightweight nature of WordLlama, I felt it was a great application for the platform, and aligns with our goal of creating a useful utility for LLM-related interfacing tasks.

In this blog post, I demonstrate the methodology that I arrived at. At the end, I show semantic splitting on a 1 million character text of "The Lord of the Rings". The new (python) method `wl.split(text)` executes on a single cpu core of my T480 Thinkpad in 700ms.

You might use a feature like this when chunking for building knowledge bases (RAG) -- or extracting and filtering text to send to an LLM (combining wl.spit(...) and wl.filter(...)) in online applications.

I hope you enjoy the technical deep dive, and find this feature useful.

magicalhippo

9 months ago

Did indeed enjoy, thanks for sharing.

Have you by any chance compared this to the rolling sentence method described briefly here[1], with discussion here[2]?

I'd take it your method is more compute optimized, given you have to do far more embedding calculations using the rolling sentence method. However, from what I can gather, you sacrifice some "boundary quality" for speed by doing the chunking "blind", so could perhaps be interesting to compare the quality and performance of the two approaches.

[1]: https://gpt3experiments.substack.com/p/a-new-chunking-approa...

[2]: https://news.ycombinator.com/item?id=41643388

deepsquirrelnet

9 months ago

I’ve taken ideas from blog posts like that one, but mostly I’ve found ideas and haven’t really seen a well described approach. There’s a lot of details to be worked out which I feel are all important in the process. I probably spent more time figuring out how to distribute the text into small segments to even perform similarity comparisons than anything else.

That was one motivation I had in doing a full write up. It has some aspects of a rolling sentence method (rolling window similarity), and some aspects of regular chunking (trying to retain paragraph structure).

I wanted to demonstrate an approach that I worked on with all the gory details for people to follow if they want to understand what’s happening under the hood or take ideas for their own experiments.

magicalhippo

9 months ago

Thanks again. Definitely something I want to play with.

The efficiency of your approach is very nice, thought it would be interesting to compare with the "rolling embeddings" approach to see if the quality stays sufficient. If so it's a huge gain in speed.

Show HN: Semantic Splitting with WordLlama

4 Comments

deepsquirrelnet

magicalhippo

deepsquirrelnet

magicalhippo