samuelm2
5 hours ago
Hey all!
A few months ago, I launched vid2scene.com, a free platform for creating 3D Gaussian Splat scenes from phone videos. Since then, thousands of people have generated thousands of scenes with it. I've absolutely loved talking to so many users and learning about the incredible diversity of use cases: from earthquake damage documentation, to people selling commercial equipment, to creating entire 3D worlds from text prompts using AI-generated video (a project using the vid2scene API for this recently won a major Supercell games hackathon!)
When Meta's Horizon Hyperscape came out, I was impressed by the quality, but I didn't like that users don't control their data: it all stays locked in Meta's ecosystem. So I built a scanning UX called OpenQuestCapture, an open-source, MIT-licensed Quest 3 reconstruction app.
Here's the GitHub repo: https://github.com/samuelm2/OpenQuestCapture
It captures images, depth maps, and pose data from the Quest 3 headset and generates a point cloud from them. While you're capturing, it shows a live 3D point cloud visualization so you can see which areas (and from which angles) you've covered. The repo's submodules include a Python script that converts the raw Quest sensor data into COLMAP format for processing with Gaussian Splatting (or whatever pipeline you prefer). If you don't want to run the processing yourself, you can also zip the raw Quest data and upload it directly to vid2scene.com/upload/quest/ to generate a 3D Gaussian Splat scene.
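For anyone curious, the COLMAP conversion is conceptually simple: you just need per-image world-to-camera poses plus camera intrinsics written into COLMAP's text files. Here's a minimal Python sketch of that step. The file names, the poses.json layout (4x4 camera-to-world matrices keyed by image name), and the shared pinhole intrinsics are assumptions for illustration only, not necessarily how the raw capture or the repo's actual script is structured:

    # Hypothetical sketch: write Quest-style poses into COLMAP's text format.
    # Assumes each frame has a 4x4 camera-to-world pose and shared pinhole
    # intrinsics; the actual raw-capture layout in OpenQuestCapture may differ.
    import json
    from pathlib import Path
    import numpy as np
    from scipy.spatial.transform import Rotation

    def write_colmap_text(capture_dir, out_dir, fx, fy, cx, cy, width, height):
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)

        # One shared PINHOLE camera (cameras.txt).
        (out / "cameras.txt").write_text(
            f"1 PINHOLE {width} {height} {fx} {fy} {cx} {cy}\n")

        # Assume poses.json maps image file names to 4x4 camera-to-world matrices.
        poses = json.loads((Path(capture_dir) / "poses.json").read_text())
        lines = []
        for image_id, (name, c2w) in enumerate(sorted(poses.items()), start=1):
            w2c = np.linalg.inv(np.asarray(c2w, dtype=np.float64))
            # COLMAP stores world-to-camera rotation (qw qx qy qz) + translation.
            qx, qy, qz, qw = Rotation.from_matrix(w2c[:3, :3]).as_quat()
            tx, ty, tz = w2c[:3, 3]
            lines.append(f"{image_id} {qw} {qx} {qy} {qz} {tx} {ty} {tz} 1 {name}")
            lines.append("")  # second line per image (2D keypoints) left empty
        (out / "images.txt").write_text("\n".join(lines) + "\n")

        # Placeholder; splat pipelines can initialize from the captured point cloud.
        (out / "points3D.txt").write_text("")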
It's still pretty new and barebones, and the raw capture files are quite large. The quality isn't quite as good as Hyperscape yet, but I'm hoping this might push Meta to be more open with Hyperscape data. At minimum, it's something the community can build on and improve.
There's still a lot to improve in the app. Here are some of the things that are top of mind for me:
- An intermediate step of the reconstruction post-process is a high-quality, Matterport-like triangulated colored 3D mesh. That could be a valuable artifact for users in its own right, so maybe there could be more pipeline development around extracting and exporting it (a rough sketch of one possible export step is below this list).
- The visualization UX could also be improved. I haven't found a UX that does a great job of showing exactly what (and from what angles) you've captured. If anyone has ideas or wants to contribute, please feel free to submit a PR!
- The raw Quest sensor data files are massive right now, so I'm considering more advanced Quest-side compression of the raw data. I'll probably add QOI compression to the raw RGB data at capture time, which should losslessly shrink it by roughly 50% (a quick sanity check of that figure is also sketched below).
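On the mesh point above: purely as an illustration of the kind of artifact I mean, here's one way to get a rough colored mesh out of a captured point cloud using Open3D's Poisson reconstruction. This isn't the pipeline's actual intermediate mesh, and the input path is made up; it's just a sketch of what an export step could look like:

    # Illustrative only: turn a captured colored point cloud into a triangulated
    # mesh with Open3D. The real Matterport-like intermediate would come out of
    # the reconstruction pipeline itself, not this post-hoc step.
    import open3d as o3d

    pcd = o3d.io.read_point_cloud("capture_points.ply")  # assumed export path
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

    # Poisson surface reconstruction; point colors are interpolated onto vertices.
    mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=9)
    o3d.io.write_triangle_mesh("capture_mesh.ply", mesh)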
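And on the compression idea: here's a quick way to sanity-check the ~50% figure offline, assuming the qoi Python bindings (pip install qoi) and a folder of raw RGB frames. The real encoding would of course happen Quest-side in the app, not in Python:

    # Rough offline check of QOI's lossless compression ratio on captured frames.
    # "frames/*.png" is a made-up dump location for raw RGB frames.
    import glob
    import numpy as np
    import qoi
    from PIL import Image

    raw_bytes, qoi_bytes = 0, 0
    for path in glob.glob("frames/*.png"):
        rgb = np.asarray(Image.open(path).convert("RGB"))
        raw_bytes += rgb.nbytes                  # size if stored uncompressed
        qoi_bytes += len(qoi.encode(rgb))        # losslessly QOI-encoded size

    print(f"QOI output is {100 * qoi_bytes / raw_bytes:.1f}% of raw RGB size")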
If anyone wants to take on one of these (or any other cool idea!), I'd love to collaborate. And if you decide to try it out, let me know if you have any questions or run into issues, or file a GitHub issue. Always happy to hear feedback!
Tl;dr: try out OpenQuestCapture at the GitHub link above.