OrioleDB Patent: now freely available to the Postgres community

235 points, posted 5 hours ago
by tosh

79 Comments

iam_saurabh

an hour ago

Open-sourcing a patent in the database space is rare. Do you think this signals a shift where companies will realize that open ecosystems drive adoption faster than closed IP walls?

wslh

an hour ago

I think no-open-source is a no-go. In the "best" case it adds a lot of friction in a sales funnel for premium offerings. You can avoid open source in special cases, mostly without complementary offerings.

916c0553e164269

4 hours ago

from the blog: "The patent is intended as a shield, not a sword, to protect Open Source from hostile IP claims."

vs. the current license:

  "IF ANY LITIGATION IS INSTITUTED AGAINST SUPABASE, INC. BY A LICENSEE OF THIS SOFTWARE, THEN THE LICENSE GRANTED TO SAID LICENSEE SHALL TERMINATE AS OF THE DATE SUCH LITIGATION IS FILED."
( https://github.com/orioledb/orioledb/blob/main/LICENSE )

imho: the current wording might discourage state organisations, since even a trivial lawsuit (e.g. a minor tax delay) could terminate the licence - perhaps a narrower patent-focused clause would work better (or an OSI-approved licence?).

kiwicopple

4 hours ago

(Supabase CEO)

I’ll revisit this with legal to try to make it clearer.

Our intentions here are clear - if people have examples that we can follow, we will do what we can to make this irrevocable (even to the extent of donating the patent if/when the community is ready to bear the cost of the maintenance)

oefrha

2 hours ago

Facebook famously dropped Patents from their BSD + Patents for React and a bunch of other projects, and went MIT unencumbered.

https://engineering.fb.com/2017/09/22/web/relicensing-react-...

cyphar

11 minutes ago

The whole patents kerfuffle with Facebook was about a larger issue with their patent grant. Critically the issue was that it practically stopped you from suing Facebook for any patent issues (not just those granted for React, which would be more like the standard reactive termination clause), including counter-suits. Here is the key text from their patent license:

    The license granted hereunder will terminate, automatically and without notice,
    for anyone that makes any claim (including by filing any lawsuit, assertion or
    other action) alleging (a) direct, indirect, or contributory infringement or
    inducement to infringe any patent: (i) by Facebook or any of its subsidiaries or
    affiliates, whether or not such claim is related to the Software, (ii) by any
    party if such claim arises in whole or in part from any software, product or
    service of Facebook or any of its subsidiaries or affiliates, whether or not
    such claim is related to the Software, or (iii) by any party relating to the
    Software; or (b) that any right in any patent claim of Facebook is invalid or
    unenforceable.

And so that was a fairly justified reaction IMHO. However, MIT has _no_ patent protections and is strictly worse than almost any license with some patent protections for users included. The modern landscape of software patent trolls is far less insane than it was in the 90s but I would really think twice about using something that is likely patented under a license other than Apache-2.0, MPLv2, or GPLv3.

tux3

3 hours ago

Google has a strong patent shield situation with AV1. Despite burning interest from patent trolls, no one is going after AOMedia members directly.

nightpool

25 minutes ago

Agree with this—the A/V media system has some of the most active patent trolls around. https://aomedia.org/license/patent-license/

The relevant patent license is the following:

> 1.3. Defensive Termination. If any Licensee, its Affiliates, or its agents initiates patent litigation or files, maintains, or voluntarily participates in a lawsuit against another entity or any person asserting that any Implementation infringes Necessary Claims, any patent licenses granted under this License directly to the Licensee are immediately terminated as of the date of the initiation of action unless 1) that suit was in response to a corresponding suit regarding an Implementation first brought against an initiating entity, or 2) that suit was brought to enforce the terms of this License (including intervention in a third-party action by a Licensee).

916c0553e164269

3 hours ago

Appreciate the intent!

For practical adoption, especially in larger orgs, OSI-approved licences are much easier to get through legal review than custom ones.

kiwicopple

3 hours ago

The current license is PostgreSQL (which is OSI approved)

We could also change to MIT/Apache but we feel PostgreSQL is more appropriate given our intentions to upstream the code

crote

2 hours ago

> The current license is PostgreSQL

That's just not true. Your license[0] adds a clause to the PostgreSQL license[1]. This makes it a different license, which by extension also means it isn't OSI approved.

It's the same with the BSD licenses[2]: the 4-clause one is OSI-approved, whereas the 3-clause one is not. Turns out that one additional "all advertising must display the following acknowledgement" clause was rather important - and so is your lawsuit clause.

[0]: https://github.com/orioledb/orioledb?tab=License-1-ov-file

[1]: https://github.com/postgres/postgres?tab=License-1-ov-file

[2]: https://en.wikipedia.org/wiki/BSD_licenses#4-clause_license_...

nightpool

24 minutes ago

(er, surely it's the other way around? the 3-clause one is OSI approved and the 4-clause one is not)

Anyway, I'm not sure this is true. Having a separate software license + secondary patent grant license is very very common in open source projects where patent trolls are common. See e.g. https://aomedia.org/about/legal/

I would just put them in separate files and then you're good to go.

limagnolia

an hour ago

The PostgreSQL license does not have a termination clause, you added that. I see that you are trying to use the PostgreSQL license as the basis and simply add the patent clause onto it, but it fundamentally changes the license.

I hope you can look at the Apache 2 patent grant as a better clause, or even adopt something like Google's Additional IP License found here: https://www.webmproject.org/license/additional/, which doesn't modify the open source license but instead adds an additional grant as a separate license.

Supabase is doing great work, thank you!

tomnipotent

an hour ago

The existing Postgres license already has an "as is" disclaimer, so adding this clause means you want to _punitively_ punish companies that sue you for reasons outside of this software. The interpretation then is you want to punish users of your software that find themselves in a (potentially legitimate) situation to sue you over unrelated matters.

For example, if Supabase failed to pay a vendor that happened to use OrioleDB they wouldn't be able to sue you for damages without compromising their stack. That's uncool.

My take-away from the Facebook/React license issue was that the community agrees this violates the spirit of FOSS and invalidates claiming to be open source (at least OSI-approved), with many taking offense to the punitive nature of the clause.

Granted Facebook was in a position to see litigation over a lot more reasons.

gobdovan

3 hours ago

Can you acquire atlasgo too, or is that still on the secret roadmap?

kiwicopple

3 hours ago

we will have something to announce in this space within a few months

(if the atlasgo team are reading this feel free to reach out too)

jacquesm

2 hours ago

This is highly unprofessional.

916c0553e164269

4 hours ago

Apache 2.0 has a better patent clause - it is scoped to hostile IP claims, so a tax dispute would not terminate the OrioleDB license:

"If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed."

https://opensource.org/license/apache-2-0

crote

an hour ago

It also seems a lot less strict on what is being terminated.

On violation the Apache 2.0 license terminates the patent license. I might be mistaken, but that reads an awful lot like you're still allowed to use the software provided you do so in a way which doesn't violate the patent.

On the other hand, the OrioleDB license seems to terminate the entire license - so the way I read this it would include parts of the software which aren't covered under the patent itself.

crote

an hour ago

Does the current license even allow for friendly forks, or redistribution?

It starts off nice with the usual:

> PERMISSION TO USE, COPY, MODIFY, AND DISTRIBUTE THIS SOFTWARE AND ITS DOCUMENTATION FOR ANY PURPOSE, WITHOUT FEE, AND WITHOUT A WRITTEN AGREEMENT IS HEREBY GRANTED

.. but then there's the:

> HEREBY GRANTS A (..) LICENSE TO UNITED STATES PATENT NO. 10,325,030 TO MAKE, HAVE MADE, USE, HAVE USED, OFFER TO SELL, OFFERED TO SELL, SELL, SOLD, IMPORT INTO THE UNITED STATES, IMPORTED INTO THE UNITED STATES, AND OTHERWISE TRANSFER THIS SOFTWARE

.. which to me seems to be missing some kind of "modify" clause? Sure, it seems like you're allowing me to distribute it as-is the way a store like Amazon distributes boxes, but what happens when I start modifying the code and distributing those modifications? Is it still "this software", or has it become a derivative? Is the license I get to that patent even sublicensable? What happens to users of a fork when the forkee sues Supabase: do they also by extension lose their patent license?

The GPLv2, for example, has a clause stating that "Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor" which makes it very clear what happens. If you're adding a poison pill to open-source code, you really shouldn't be this sloppy: it should be painfully obvious to every reader what the implications are, or nobody will ever risk using it.

yellow_lead

4 hours ago

A shield for Supabase, not for us

Reubend

4 hours ago

So what? I don't see any conflict between what they said and what the license says. As they stated, it's being used as a shield. If you're suing them, you probably don't deserve a free license to their patented tech.

graemep

4 hours ago

The difference is that the license is terminated by ANY litigation against Supabase - e.g. if you sue them for breach of contract completely unrelated to the software.

Use as a shield would mean limiting it to patent litigation against a user of the software.

It also only covers litigation against Supabase - it does not provide a shield against litigation against OrioleDB users.

cwillu

4 hours ago

Or litigation from a future license violation

giancarlostoro

4 hours ago

Sounds like the MS-PL, which Microsoft used to use before switching to MIT. MS-PL is basically MIT, but it covers your butt against patent litigation.

gallypette

5 minutes ago

It is time to realize that open source drives innovation.

fuzzy_biscuit

4 hours ago

I strongly dislike the idea of patenting data structures.

kiwicopple

4 hours ago

fwiw, this is not our m.o. - oriole was under development before we took on the maintenance/development.

Our goal now is to ensure that it’s as F/OSS as possible given the pre-existing conditions

gethly

2 hours ago

Software patents are such an Americanism. In this case, I prefer the Chinese approach of ignoring patent law altogether.

navigate8310

2 hours ago

That simply kills innovation and dries up funding for research.

Zetaphor

2 hours ago

China is far ahead of the US in many sectors, notably electric cars and solar panels, which are two industries whose progress heavily depends on research and innovation.

throw0101d

an hour ago

> China is far ahead of the US in many sectors, notably electric cars and solar panels, which are two industries whose progress heavily depends on research and innovation.

Ahead in production. Did China research/innovate/develop those industries, or were they 'just' fast followers? (Early in its history the US used the same 'tactics' relative to the UK and other European countries.)

henry700

2 hours ago

It's what I think too, BUT curiously is not the case for China. Imagine if the DeepSeek breakthroughs were patented and closed instead of published in the open. And here we are, and they're not patented and not built on patented technology.

samlambert

2 hours ago

This is not an open source license and it's untrue to say it's an open source project when it's licensed this way.

"IF ANY LITIGATION IS INSTITUTED AGAINST SUPABASE, INC. BY A LICENSEE OF THIS SOFTWARE, THEN THE LICENSE GRANTED TO SAID LICENSEE SHALL TERMINATE AS OF THE DATE SUCH LITIGATION IS FILED."

This is a poison pill. At best the licensing is naive and blocks even customers of Supabase from using OrioleDB; at worst it's an attempt by Supabase to provide themselves unlimited indemnity under the guise of a community project. It means the moment you sue Supabase for anything (contract dispute, IP, employment, unrelated tort), you lose the license. They could lose your data, and if you try to do anything about it they can immediately countersue for a license violation. Using the brand of the postgres license to slip this in is pretty wild.

OrioleDB looks like a promising project and undoubtedly a great contribution from Supabase, but it's not open source or really usable by anyone with this license.

jitl

2 hours ago

I recall Facebook had a similar rider on the React license for many years until eventually removing it. It’s visually similar to the Apache2 patent clause but not scoped to just the licensed software use

seveibar

an hour ago

Isn’t this just Apache 2-style permissive licensing?

samlambert

an hour ago

lol no they both read as permissive on the surface. apache 2 doesn't include a termination clause that broadly protects an entity against any litigation. this is incredibly broad and not community safe.

0xb0565e486

4 hours ago

I did not know you could patent data structures like that.

jonathaneunice

3 hours ago

IP owners often play the game of “patent what you can, threaten with the rest.” So you might not be able to strictly patent the way data is laid out, but specific, novel algorithms that update or manipulate that layout and improve what was possible before? Those can be understood as key steps of an “innovative process”—and courts have been willing to uphold process claims, especially when tied to what they understand are genuine technical improvements. Fighting even a marginal patent usually means a long, expensive slog with plenty of downside risk.

IANAL nor a patent judge, but this is my understanding after watching the space for some years.

thayne

24 minutes ago

At least this is how it works in the US. And in the US algorithms are (unfortunately) patentable. That is not the case in all countries.

wokkel

4 hours ago

You can in the US. Not so much in the rest of the world.

psychoslave

4 hours ago

That's jurisdiction dependent. Europe didn't allow such a thing last time I checked. But lobbying to do so has been recurrent on this topic, just like putting governmental backdoors everywhere. They will try until it passes. There should be a legal penalty for such a stubborn will to destroy civil liberty. At least in this case they can't play the card "think of the children, nazi pedophiles use this".

dkhenry

4 hours ago

I am super bullish on OrioleDB. It really seems like the next logical progression for scaling Postgres for 99% of all databases out there. I have been following their development for a while and running benchmarks to see if their performance claims are legitimate, and so far it has been amazing

https://airtable.com/app7jp5t0dEHyDpa8/shr00etqywoDW2N6N

kiwicopple

3 hours ago

Thanks for verifying the benchmarks. We’re close to a full RC, aiming for December

Just to add: if anyone wants to contribute (beyond code) benchmarking and stress-testing is very helpful for us

Sesse__

3 hours ago

I assume you get this a lot, but how much patching is left in PG 18?

kiwicopple

3 hours ago

We are tracking the patches here:

https://www.orioledb.com/docs#patch-set

The actual storage engine is written as an extension - these patches are mostly to improve the TAM API. If these are accepted by the community then it should be simpler for anyone to write their own Storage extensions.

I think (correctly) it will take a lot longer to upstream the extension - the PG community should take a “default no” attitude with big changes like this. Over time we will prove its worthiness (hopefully beyond just supabase - it would be good to collaborate with other Postgres providers)

Sesse__

3 hours ago

OK, so basically no big change with PG 18, and for the time being, one needs to basically build your own Postgres?

Would be really nice with a pgdg package, as this is definitely the kind of thing I would want to test in a separate cluster :-)

btown

2 hours ago

Based on the README [0] and discussion [1] it seems like it might especially shine on high-write-volume workflows, with the implementation of anti-bloat measures. Do you have a sense for whether it would shine even further where those rows have large text/JSONB fields that might be TOASTed?

And more generally, curious if you have any sense for what might make up the "1%" of workflows this wouldn't be advisable for? Any downsides to be aware of?

[0] https://github.com/orioledb/orioledb?tab=readme-ov-file#orio...

[1] https://news.ycombinator.com/item?id=30462695 (2022)

dkhenry

an hour ago

I haven't explicitly tested how it handles TOASTed columns, but since there is an upcoming RC I will try it out next time. I don't generally like using JSONB/text columns for very large rows, as they have other performance problems on the DB, like causing lots of WAL write overhead.

In terms of other workloads it might not be great for, all my testing has shown a great improvement in every workload I have thrown at it.

dangoodmanUT

2 hours ago

> OrioleDB tables don't support SERIALIZABLE isolation level.

This is an unfortunate limitation to be aware of when evaluating

https://www.orioledb.com/docs/usage/getting-started#current-...

akorotkov

12 minutes ago

We will eventually add the SERIALIZABLE isolation level to OrioleDB, but right now that's not our highest priority. Let me explain why. First, SSI (serializable snapshot isolation) in PostgreSQL comes with significant shortcomings, including:

1) Overhead. SSI implies a heavyweight lock on any involved index page or heap tuple (even for reads). The overhead of SSI was initially measured at ~10%, but nowadays, scalability has gone much farther. That many heavyweight locks could slow a typical workload on a multicore machine down several times over.

2) SSI needs the user to be able to repeat any transaction due to serialization failure. Even a read-only transaction needs to be DEFERRABLE or might be aborted due to serialization failure (it might "see impossible things" and need a retry).

In contrast, it's typically not hard to resolve the concurrency problems of writing transactions using explicit row-level and advisory locks, while REPEATABLE READ is enough for reporting. Frankly speaking, during my whole career, I haven't seen a single case where the SERIALIZABLE isolation level was justified.
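To make point 2 concrete, here is a minimal sketch of the retry loop an application has to carry when it runs under SERIALIZABLE. Everything in it is hypothetical: `SerializationFailure` stands in for whatever exception a real driver raises for SQLSTATE 40001, and `run_with_retry` and `flaky_transfer` are illustrative names, not part of PostgreSQL or OrioleDB.

```python
import random
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""


def run_with_retry(txn, max_attempts=5, base_delay=0.01):
    """Run txn() and re-run it from scratch whenever it fails to serialize."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Back off with jitter so competing transactions are less
            # likely to collide again on the next attempt.
            time.sleep(base_delay * attempt * random.random())


attempts = {"count": 0}


def flaky_transfer():
    """Fake transaction: fails to serialize twice, then commits."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"


result = run_with_retry(flaky_transfer)
print(result, "after", attempts["count"], "attempts")
```

The point of the sketch is that the whole transaction body has to be safe to repeat, which is the burden described above.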

thayne

22 minutes ago

Eh. If you care about performance enough to use OrioleDB you probably also want to avoid SERIALIZABLE.

hardwaresofton

5 hours ago

Supabase consistently delivering massive value to the postgres ecosystem

8cvor6j844qw_d6

4 hours ago

Is OrioleDB just PostgreSQL but with some underlying modifications for cloud environments?

How does it compare with Neon DB?

LtdJorge

4 hours ago

It’s a different storage engine for Postgres

boxed

4 hours ago

The "cloud environments" part sounds like marketing fluff. "The cloud" is just someone else's servers after all. There's nothing special about it.

IgorPartola

4 hours ago

That’s like saying a chair is just a tree that has been modified. Technically true, practically there are some very specific differences.

throwaway894345

2 hours ago

What are the relevant differences? I’m not as cynical as the parent commenter, but I’m also unclear about what OrioleDB is doing that is meaningfully “CloudNative”. From skimming the main page, it seems like it’s just doing storage differently, but so far I’ve seen nothing to suggest that difference is “leveraging cloud services” or anything else.

IgorPartola

11 minutes ago

I am not familiar with this particular product, but generally if you run on, say, AWS, you either need to account for the greatly increased disk latency due to EBS being network storage, or build provisions for local storage, which is not necessarily unlimited and where it is unclear what kind of disk controller it is attached to, etc. It could also mean optimizing for the AWS-specific CPU architecture. Or it could mean using S3 as storage, which has yet different durability and consistency semantics compared to other storage systems. It might also mean optimizing for the pricing of a given cloud provider in some way.

pbronez

4 hours ago

In this case, it seems to refer to their support for S3-compatible object storage for persistence.

pbronez

4 hours ago

OrioleDB is new to me.

According to the docs, it “uses Postgres Table Access Method (TAM) to provide a pluggable storage engine for PostgreSQL. […] Pluggable Storage gives developers the ability to use different storage engines for different tables within the same database. Developers will be able to choose a storage method that is optimized for their specific needs: some tables could be configured for high transactional loads, others for analytics workloads, and still others for archiving.”

https://www.orioledb.com/docs

jitl

3 hours ago

> OrioleDB implements default 64-bit transaction identifiers

RDS support when

boxed

4 hours ago

The graphs for OrioleDB look very impressive. Can someone give a counter argument to switching to this?

wwizo

4 hours ago

OrioleDB is not plug-and-play yet. From their docs ( https://www.orioledb.com/docs ):

> OrioleDB currently requires a set of patches to PostgreSQL to enhance the pluggable storage API and other PostgreSQL subsystems. All of these patches have been submitted to the PostgreSQL community and are under review.

Sesse__

4 hours ago

You get basically most of the advantages of a B-tree-oriented table, but also most of the disadvantages AFAIK. In particular, any index lookup/scan is going to take twice as long (index blocks don't point to the place on disk, they just contain the primary key and then you need to go look _that_ up in the actual table).
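A toy sketch of the two lookup paths, in case it helps. Dicts stand in for B-trees and nested lists for heap pages; the table contents and function names are made up, and the model only counts tree descents, ignoring caching and page-access costs entirely.

```python
# Heap-organized table: the secondary index stores a physical location
# (block, offset), so the row is fetched directly after one descent.
heap = [
    [("alice", 1, "a@example.com")],   # block 0
    [("bob", 2, "b@example.com")],     # block 1
]
heap_email_index = {"a@example.com": (0, 0), "b@example.com": (1, 0)}


def heap_lookup(email):
    block, offset = heap_email_index[email]   # one descent: secondary index
    return heap[block][offset], 1


# Index-organized table: rows live in the primary-key tree, and the
# secondary index stores the primary key instead of a location.
pk_tree = {1: ("alice", 1, "a@example.com"), 2: ("bob", 2, "b@example.com")}
iot_email_index = {"a@example.com": 1, "b@example.com": 2}


def iot_lookup(email):
    pk = iot_email_index[email]               # first descent: secondary index
    return pk_tree[pk], 2                     # second descent: primary-key tree


heap_row, heap_descents = heap_lookup("b@example.com")
iot_row, iot_descents = iot_lookup("b@example.com")
print(heap_descents, iot_descents)
```

Both paths return the same row; the difference is only in how many trees are walked to get there.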

akorotkov

3 hours ago

This is generally true, but there are some additional aspects.

1. With PostgreSQL heap, you need to access the heap page itself. And it's not for free. It goes all through the buffer manager and other related components.

2. In OrioleDB, we have a lightweight protocol to read from pages. In-memory pages are connected using direct links (https://www.orioledb.com/docs/architecture/overview#dual-poi...), and pages are read lock-less (https://www.orioledb.com/docs/architecture/overview#page-str...). Additionally, tree navigation for simple data types skips both copying and tuple deforming (https://www.orioledb.com/blog/orioledb-fastpath-search).

Given all of the above, I believe OrioleDB still wins in the case of secondary key lookup. I think this is indirectly confirmed by the results of the TPC-C benchmark, which contains quite a lot of secondary key lookups. However, this subject is worth dedicated benchmarking in the future.

Sesse__

3 hours ago

It would be interesting to see how OrioleDB does with more OLAP-like loads. From when I spent a lot of time benchmarking this, the indirect index design was _the_ main reason why MySQL+InnoDB was losing significantly to Postgres on TPC-H (well, DBT-3).[1] There was a lot of working around it with covering indexes etc.

Of course, the flip side of the coin is that if you do an UPDATE of a row in the presence of a secondary index, and the UPDATE doesn't touch the key, then you don't need to update the index(es) at all. So it really depends on how much you update rows versus how often you index-scan them IME.

[1] TPC-H doesn't have difficult enough queries to really stress the planner, so it mattered comparatively less there than in other OLAP work.
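The UPDATE trade-off above can be sketched with another toy model. All names here are made up, dicts stand in for indexes, and the heap side deliberately ignores HOT updates, which let stock Postgres skip the index writes when no indexed column changes and the new version fits in the same block. The toy also overwrites index entries where a real heap would add new ones.

```python
# Heap table: an UPDATE writes a new row version at a new location,
# so every index entry pointing at the old location needs a successor.
heap_rows = [("alice", "a@example.com", 10)]            # location 0
heap_indexes = {
    "by_name": {"alice": 0},
    "by_email": {"a@example.com": 0},
}


def heap_update_balance(name, new_balance):
    old_loc = heap_indexes["by_name"][name]
    row_name, email, _ = heap_rows[old_loc]
    heap_rows.append((row_name, email, new_balance))    # new version
    new_loc = len(heap_rows) - 1
    heap_indexes["by_name"][row_name] = new_loc
    heap_indexes["by_email"][email] = new_loc
    return 2                                            # index writes


# Index-organized table: the row is stored under its primary key and the
# secondary index stores that key, so a non-key UPDATE leaves it untouched.
iot_rows = {"alice": ("alice", "a@example.com", 10)}
iot_indexes = {"by_email": {"a@example.com": "alice"}}


def iot_update_balance(name, new_balance):
    row_name, email, _ = iot_rows[name]
    iot_rows[name] = (row_name, email, new_balance)     # same key
    return 0                                            # index writes


heap_writes = heap_update_balance("alice", 20)
iot_writes = iot_update_balance("alice", 20)
print(heap_writes, iot_writes)
```

So the indirection that costs an extra descent on reads is the same thing that saves index maintenance on writes.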

akorotkov

3 hours ago

Thank you, that would be on the TODO list.

jitl

3 hours ago

That’s how regular Postgres b-tree indexes work too.

Sesse__

3 hours ago

I'll take a [citation needed] on that one.

jitl

an hour ago

https://www.postgresql.org/docs/current/indexes-index-only-s...

This is why Postgres b-tree indexes offer CREATE INDEX (indexCol1, indexCol2, ...) INCLUDE (includeCol1, includeCol2, ...). With INCLUDE, the index will directly store the listed additional columns, so if your query does `SELECT includeCol1 WHERE indexCol1 = X AND indexCol2 > Y`, you avoid needing to look up the entire row in the heap, because includeCol1 is stored in the index already. This is called a "covering index" because the index itself covers all the data necessary to answer the query, and you get an "index only scan" in your query plan.

The downside to creating covering indexes is that it's more work for Postgres to go update all the INCLUDE values in all your covering indexes at write time, so you are trading write speed for increased read speed.

I think it's quite typical to see this in SQL databases. SQLite behaves the same way for indexes; the exception is that if you create a WITHOUT ROWID table, then the table itself is sorted by primary key instead of by ROWID, so you get at most 1 index that maps directly to the row value. (sqlite docs: https://sqlite.org/withoutrowid.html)
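A small sketch of the covering-index effect, using SQLite only because it ships with Python. Note the assumptions: SQLite has no INCLUDE clause, so the extra column is appended to the index key instead, and the table and index names are made up. The plan wording differs between SQLite versions, so the code only checks for the "COVERING INDEX" marker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c TEXT)"
)
conn.executemany(
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
    [(i % 10, i, f"row{i}") for i in range(1000)],
)
# Column b rides along in the index so queries on a can read it from there.
conn.execute("CREATE INDEX t_a_b ON t (a, b)")


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


covered = plan("SELECT b FROM t WHERE a = 3")       # b is in the index
not_covered = plan("SELECT c FROM t WHERE a = 3")   # c needs the table row

print(covered)
print(not_covered)
```

The first plan should report a covering index, while the second has to visit the table for column c.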

Sesse__

40 minutes ago

That link directly contradicts what you are saying.

> This means that in an ordinary index scan, each row retrieval requires fetching data from both the index and the heap.

Note that it says _index and the heap_. Not _index and the primary index and the heap_. (For a B-tree-organized table, the leaf heap nodes are essentially the bottom of the primary index, so it means that to find anything, you need to follow the primary index from the top, which may or may not entail extra disk accesses. For a normal Postgres heap, this does not happen, you can just go directly to the right block.)

Index-only scans (and by extension, INCLUDE) are to avoid reaching into the heap at all.

> The downside to creating covering indexes is that it's more work for Postgres to go update all the INCLUDE values in all your covering indexes at write time, so you are trading write speed for increased read speed.

For updates, even those that don't touch INCLUDE values, Postgres generally needs to go update the index anyway (this is the main weakness of such a scheme). HOT is an exception, where you can avoid that update if there's room in the same heap block, and the index scans will follow the marker(s) you left to “here's the new row!” instead of fetching it directly.

akorotkov

2 hours ago

Yep, regular PostgreSQL indexes point to a heap location (block number + offset). And it is the same for primary and secondary indexes.

awaseem

2 hours ago

Honestly so amazing! Supabase doing great work as always