Launch HN: Panora (YC S24) – Data Integration API for LLMs

100 points, posted 10 months ago
by nael_ob

20 Comments

swyx

10 months ago

as a former yc data integration startup employee... this is a very very very challenging business. i'd consider a different business if i were you. being "x but open source" isn't all it's cracked up to be. i don't really care enough to elaborate but please do feel free to tell me if i am wrong in five years.

memhole

10 months ago

Only a data engineer here, but I agree. If I’m understanding right, your competitors are more likely Fivetran/Airbyte and Snowflake/Databricks. There’s a lot of money and marketing you’ll need to overcome. Mage reminds me of the Zero ETL idea, which seems to be getting popular again with companies like Trino.

nael_ob

10 months ago

Could I ask for your reasoning? It would be helpful to have your detailed thoughts.

undefinedblog

10 months ago

I’m all ears, please share your thoughts!

thelittleone

10 months ago

Sounds like your hosted version will end up with a lot of potentially sensitive information. You will probably want to add ISO 27001 and/or SOC 2 Type 2 as a priority. Not to say an org with that is more secure than one without, but you will certainly need to show evidence of a comprehensive security program to pass procurement. Choosing carefully which third parties you add now (libraries, platforms, etc.) can save you a TON down the road.

nael_ob

10 months ago

Yes you're right. SOC 2 is our priority for the next few weeks. If you have experience in enterprise sales, I'd love to chat (nael@panora.dev). Thanks.

thelittleone

10 months ago

> nael@panora.dev

Happy to chat. Email sent.

tayloramurphy

10 months ago

The other open source option for this that I'm familiar with is Nango[0]. How are you different?

Also, a big challenge in this space is pricing. How are you thinking about tackling that?

[0] https://github.com/nangoHQ/nango

nael_ob

10 months ago

Yes, they built a cool product! We aim to focus on companies feeding their LLMs, by providing embeddings and chunking out of the box on top of all the data we sync. We don't only help you connect with third parties; we also return data that is ready to be used in AI use cases (e.g., RAG).

thenaturalist

10 months ago

The optimal chunking strategy is often highly, highly dependent on the data used and questions to be answered.

The net is plastered with blog posts about optimal strategies, of which there seem to be more than 10, with new approaches popping up often.

The consensus seems to be that trial and error is the way to go to optimize cost and performance.

How do you plan to tackle this when providing it out of the box?

nael_ob

10 months ago

That's why we wanted to try the OSS approach where contributors can help keep up with the optimal strategy. We also plan to build an engine to test each strategy and compare retrieval perf before choosing one at runtime.
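
To make the "test each strategy and compare retrieval perf" idea concrete, here is a minimal sketch in Python. It is not Panora's engine: the two chunkers, the word-overlap scorer (a toy stand-in for embedding similarity), and the names `hit_rate` and `pick_strategy` are all assumptions made up for illustration.

```python
import re
from typing import Callable, Dict, List, Set, Tuple

Chunker = Callable[[str], List[str]]


def fixed_size_chunks(text: str, size: int = 8) -> List[str]:
    """Split text into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def sentence_chunks(text: str) -> List[str]:
    """Split text after sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def tokens(s: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def retrieve(chunks: List[str], query: str) -> str:
    """Return the chunk with the highest Jaccard overlap with the query.

    Toy scorer only; a real system would compare embeddings instead.
    """
    q = tokens(query)

    def score(chunk: str) -> float:
        t = tokens(chunk)
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    return max(chunks, key=score)


def hit_rate(chunker: Chunker, text: str, labelled: List[Tuple[str, str]]) -> float:
    """Fraction of (query, answer) pairs whose top chunk contains the answer."""
    chunks = chunker(text)
    hits = sum(
        1 for query, answer in labelled
        if answer.lower() in retrieve(chunks, query).lower()
    )
    return hits / len(labelled)


def pick_strategy(
    strategies: Dict[str, Chunker], text: str, labelled: List[Tuple[str, str]]
) -> Tuple[str, Dict[str, float]]:
    """Score every strategy on the labelled queries and return the best one."""
    scores = {name: hit_rate(fn, text, labelled) for name, fn in strategies.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

Given a sample of the synced data and a handful of labelled (query, expected answer) pairs, `pick_strategy({"fixed": fixed_size_chunks, "sentence": sentence_chunks}, text, labelled)` returns the winning strategy name plus the per-strategy scores. The hard part thenaturalist points at is still there: the result is only as good as the labelled queries.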

rflih96

10 months ago

Hey - for pricing, we're going usage-based on two metrics: number of third-party connections and volume of data transformed (for chunking/embedding). PS: This will probably evolve over the next few months!

gavmor

10 months ago

ctrl + f "tool calling"

ctrl + f "function calling"

Have these terms already become passe, or not yet caught on? Or are they an implementation detail which Panora seeks to gracefully elide?

Edit: Oh, very cool, though. I'm envious, in fact.

nael_ob

10 months ago

We first wanted to let people handle it themselves, since our API already provides the necessary abstraction to extract/write data; "function calling" is then just a matter of plugging in the right API calls. Really curious to hear your thoughts on whether that's something we should expose as well.
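
As a rough illustration of "plugging the right API calls", here is a minimal sketch in Python. Everything in it is hypothetical: `list_contacts` is an in-memory stand-in rather than a real Panora endpoint or client, and the tool description only follows the common JSON Schema style that LLM tool-calling interfaces tend to use.

```python
import json
from typing import Any, Callable, Dict, List


def list_contacts(limit: int = 10) -> List[Dict[str, str]]:
    """Hypothetical stand-in for a unified-API call; returns fake CRM rows."""
    fake_crm = [
        {"name": "Ada", "email": "ada@example.com"},
        {"name": "Linus", "email": "linus@example.com"},
    ]
    return fake_crm[:limit]


# Tool description handed to the model (JSON Schema style parameters).
TOOLS: List[Dict[str, Any]] = [
    {
        "name": "list_contacts",
        "description": "List contacts from the connected CRM.",
        "parameters": {
            "type": "object",
            "properties": {"limit": {"type": "integer", "minimum": 1}},
            "required": [],
        },
    }
]

# Maps tool names to the Python callables that perform the API call.
REGISTRY: Dict[str, Callable[..., Any]] = {"list_contacts": list_contacts}


def dispatch(tool_call: Dict[str, str]) -> str:
    """Run the tool the model asked for and return a JSON string result.

    `tool_call` is assumed to look like
    {"name": "...", "arguments": "<JSON-encoded object>"}.
    """
    fn = REGISTRY.get(tool_call["name"])
    if fn is None:
        return json.dumps({"error": f"unknown tool: {tool_call['name']}"})
    args = json.loads(tool_call.get("arguments") or "{}")
    return json.dumps(fn(**args))
```

For example, `dispatch({"name": "list_contacts", "arguments": '{"limit": 1}'})` returns a JSON string with the first fake contact, which would then be passed back to the model as the tool result.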

zkid18

10 months ago

any differences from nango or supaglue?

nael_ob

10 months ago

Yes, we aim to focus on companies feeding their LLMs, by providing embeddings and chunking out of the box on top of all the data we sync across different software.

zkid18

10 months ago

sounds like a neat use-case!

ilrwbwrkhv

10 months ago

This is one of the better "open source but also a business" projects, as it puts running it on my own machine front and center. I still don't know why you are tying yourself to VCs, but well done.