Show HN: YouTube Transcript Optimizer – Turn Videos into Polished Documents

3 points, posted 6 hours ago
by eigenvalue

2 Comments

skeptrune

an hour ago

Is there a market for making folks' content libraries searchable? I imagine creators might want to see what they've posted on a given topic in the past to cut down writing time for new content.

Lots of folks have asked us to build search for podcasts and YouTube channels, and we've tried it with the raw transcripts, but it doesn't work too well.

Chunking transcripts into semantic pieces for the search index by sentence splitting or other naive techniques isn't great, and I have not seen a product that can do speaker recognition out of the box.

Speaker recognition for multi-speaker podcasts is probably the best chunking technique for those. However, I think you have the best one for this style of educational content.
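
To make the speaker-based chunking idea concrete, here is a minimal sketch in Python. It assumes some upstream diarization step has already produced segments tagged with anonymous labels like Speaker1 and Speaker2; the `Segment` type, the `chunk_by_speaker` helper, and the `max_chars` limit are all hypothetical names for illustration, not part of any particular product.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    speaker: str  # anonymous diarization label, e.g. "Speaker1" (assumed input)
    start: float  # start time in seconds
    text: str

def chunk_by_speaker(segments, max_chars=800):
    """Merge consecutive segments from the same speaker into one chunk.

    A new chunk starts when the speaker changes or when adding the next
    segment would push the chunk past max_chars.
    """
    chunks = []
    for seg in segments:
        last = chunks[-1] if chunks else None
        fits = last is not None and len(last["text"]) + 1 + len(seg.text) <= max_chars
        if last is not None and last["speaker"] == seg.speaker and fits:
            last["text"] += " " + seg.text
        else:
            chunks.append({"speaker": seg.speaker, "start": seg.start, "text": seg.text})
    return chunks
```

Each resulting chunk keeps its speaker label and start time, so a search hit could link back to the right moment in the episode.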

Also, cool project!!!

eigenvalue

36 minutes ago

Yes, I think that's one of the use cases for my project. If you're a YouTube creator, you've already invested a lot of care and energy into making your videos. If you could easily convert those videos in a fully automated way to written documents, complete direct transcripts, and other content like quizzes, then you can add those to your website and it should help you rank higher with search engines.

Once you have the complete direct transcripts and the optimized written documents, I think you could just use regular text search on them and it would work well. Something like Elastic, or Algolia for a hosted option, would work great. Even a prolific YouTuber probably isn't going to have more than a couple thousand pages' worth of text to search through. But yeah, I guess you could also build semantic search on top of that.
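
As a rough illustration of how little machinery plain keyword search needs at this scale, here is a minimal in-memory inverted index in Python. It is only a sketch standing in for a real engine like Elastic or Algolia; the function names and the sample document ids are made up for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical documents, keyed by a made-up video id.
docs = {
    "v1": "Eigenvalues and eigenvectors explained",
    "v2": "Intro to matrix multiplication",
    "v3": "Matrix eigenvalues in practice",
}
index = build_index(docs)
matches = search(index, "matrix eigenvalues")
```

A real engine adds ranking, stemming, and typo tolerance on top of this, which is the main reason to reach for a hosted option rather than rolling your own.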

I don't think speaker identification is that important for most videos, since they tend to have a single narrator/speaker. In any case, the written documents that my tool creates just sort of ignore that aspect and turn the content into more expository writing that conveys the same information. It's also hard to make a fully automated tool that does speaker identification where you know the identity of the speaker and it's not just Speaker1, Speaker2.

Thanks for your feedback! You should try submitting a video. You get free credits just for signing up, so you can try it with a few videos.