Show HN: Local-first fast CPU image to text for screenshots, PDFs, webpages

19 points, posted 19 days ago
by mrkn1

18 Comments

KetoManx64

18 days ago

What's the performance like compared to Tesseract? I don't see Tesseract mentioned anywhere in the README, which is surprising considering it's the number one tool most people go to for image > text OCR.

mrkn1

18 days ago

No rigorous eval, and I love Tesseract. Here's the example that motivated me to build textsnap (it's in the GitHub README), parsed with Tesseract:

https://imgur.com/a/i2eQra8

KetoManx64

18 days ago

Very noticeable difference, and the exact issue I run into repeatedly with Tesseract! Definitely going to try dropping textsnap into my scripts now. Thanks!!

abstract257

19 days ago

Curious how it does on multi-page scanned PDFs vs. single screenshots? The ORT vision/decoder split is the part that usually makes or breaks CPU VLM OCR...

krunck

19 days ago

I had to extract the page images from the PDF for it to work, then run it on each extracted page image.
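For anyone wanting to script that workaround, here is a rough sketch, not something from the textsnap docs. It assumes poppler's `pdftoppm` is installed for rendering pages, and that `textsnap` accepts a local image path and prints the recognized text to stdout (the thread only shows it being called with a URL). The helper names `render_cmd`, `page_number`, and `ocr_pdf` are made up for illustration.

```python
import subprocess
from pathlib import Path


def render_cmd(pdf: Path, prefix: Path, dpi: int = 200) -> list[str]:
    """Build the pdftoppm command that writes one PNG per page as <prefix>-<n>.png."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(prefix)]


def page_number(image: Path) -> int:
    """Read the page number that pdftoppm appends after the last hyphen."""
    return int(image.stem.rsplit("-", 1)[1])


def ocr_pdf(pdf: Path, workdir: Path, run=subprocess.run) -> str:
    """Render every page to an image, OCR each one, and join the text in page order."""
    workdir.mkdir(parents=True, exist_ok=True)
    prefix = workdir / "page"
    run(render_cmd(pdf, prefix), check=True)

    # Sort numerically so page-10 comes after page-2 even without zero padding.
    pages = sorted(workdir.glob("page-*.png"), key=page_number)

    texts = []
    for image in pages:
        # Assumption: textsnap takes a local image path and prints text to stdout.
        result = run(["textsnap", str(image)], check=True,
                     capture_output=True, text=True)
        texts.append(result.stdout.strip())
    return "\n\n".join(texts)
```

Calling `ocr_pdf(Path("scan.pdf"), Path("pages"))` would leave the rendered PNGs in `pages/` and return the text of all pages separated by blank lines. The `run` parameter is only there so the subprocess calls can be swapped out when testing.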

vivzkestrel

19 days ago

How well do you think this will work with code? I mean taking code screenshots and converting them into actual code for VS Code.

mrkn1

19 days ago

Just ran

  textsnap "https://i.ytimg.com/vi/LBNDfxjEYlA/maxresdefault.jpg"
and got this

  $('.count').each(function () {
  $('this').prop('Counter', 0).animate({
    Counter: $('this').text()
  }, {
      duration: 4000,
      easing: 'swing',
      step: 'function (now) {
          $('this").text(Math.ceil(now));
      }
    }); 
  });

lavaman131

18 days ago

This is awesome! Been needing something like this for some research paper diagrams I've been indexing.

monosma

19 days ago

What was the reason for adopting PaddleOCR? Can other OCR models be used as well?

mrkn1

19 days ago

No reason other than their Q4 model working reasonably well and fast on my CPU laptop. It should work with any ONNX VLM model.

garrett2558

19 days ago

Very cool, I'm building my own local-first product as well

mrkn1

19 days ago

thank you! what is it about?

kouru225

19 days ago

Roman alphabet only or does this work with other alphabets?

mrkn1

19 days ago

109 languages, including other alphabets.

BIGFOOT_EXISTS

19 days ago

Now this is legit cool, keep up the great work.

mrkn1

19 days ago

thank you!