alcroito
3 days ago
I wish PURL proposed something sensible or at least usable for tracking C / C++ native libraries that are NOT hosted on a registry like conan.io, or one of the linux distro registries, but are still (self-)hosted somewhere online.
For libraries that are hosted on `github`, there's at least the github type.
But there is no official `gitlab` or `git` type, and I've read comments that even the `github` type is considered a mistake.
One example of such a library could be a Qt or KDE / Plasma library.
They are hosted on their own forges, https://code.qt.io/ and https://invent.kde.org respectively.
So to the more knowledgeable people out there, what is the PURL way of identifying a C++ library like that?
Is `generic` type + vcs_url qualifier really the only way?
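To make the question concrete, here is a minimal Python sketch of what the `generic` type + `vcs_url` combination could look like for a Qt library. The repository path, tag, and both helper names are assumptions for illustration, and the percent-encoding is simply what `urllib.parse.quote` emits, not a claim about the spec's canonical form.

```python
from urllib.parse import quote, unquote

def generic_purl(name: str, version: str, vcs_url: str) -> str:
    """Build a pkg:generic PURL that carries the source location in vcs_url."""
    # Hypothetical helper: encode every reserved character so the qualifier
    # value cannot be confused with the PURL's own '?', '&', '#' or '@'.
    return (
        f"pkg:generic/{quote(name, safe='')}@{quote(version, safe='')}"
        f"?vcs_url={quote(vcs_url, safe='')}"
    )

def read_vcs_url(purl: str) -> str:
    """Recover the decoded vcs_url qualifier from a PURL string."""
    query = purl.split("?", 1)[1].split("#", 1)[0]
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key == "vcs_url":
            return unquote(value)
    raise KeyError("vcs_url")

# Assumed repository path and tag, for illustration only.
purl = generic_purl("qtbase", "6.7.0", "git+https://code.qt.io/qt/qtbase.git@v6.7.0")
print(purl)
print(read_vcs_url(purl))
```

The identifier is well-formed, but nothing in it tells a vulnerability database which ecosystem to look in, which is the gap described here.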
Right now it seems impossible to track vulnerabilities for such libraries with OSS / open tools, because none of the open tools or databases support a custom type or registry or ecosystem.
For example, none of the services here support some custom C++ ecosystem (putting aside conan):
https://docs.dependencytrack.org/analysis-types/known-vulner...
giantrobot
3 days ago
Something else PURLs don't capture well for native libraries is any sort of build configuration. I don't know of any clear way in a PURL to describe, say, a Debian package built from a src package with a custom set of compiler options.
For Java and interpreted language packages the "build" configuration is less important or non-existent. For compiled packages the build environment is important.
It seems the only way is to use a custom namespace and abuse the qualifiers but then you've got a non-canonical PURL and its utility in things like SBOMs is limited.
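A rough Python sketch of that workaround, under stated assumptions: the `example-site` namespace, the package version, and the `cflags` qualifier are all made up for illustration and are not defined by the PURL spec. The recognised set is the spec's known qualifier keys plus `arch` and `distro` as used in its deb examples.

```python
from urllib.parse import parse_qsl

# Keys a deb-aware consumer would plausibly recognise. The first five are the
# known qualifier keys from the PURL spec; arch and distro are used for deb.
RECOGNISED = {
    "repository_url", "download_url", "vcs_url", "file_name", "checksum",
    "arch", "distro",
}

def split_qualifiers(purl: str):
    """Return (recognised, custom) qualifier dicts for a PURL string."""
    query = purl.split("?", 1)[1].split("#", 1)[0] if "?" in purl else ""
    recognised, custom = {}, {}
    for key, value in parse_qsl(query):
        (recognised if key in RECOGNISED else custom)[key] = value
    return recognised, custom

# Hypothetical site-rebuilt package: namespace, version and cflags are invented.
purl = (
    "pkg:deb/example-site/curl@7.88.1-10"
    "?arch=amd64&distro=bookworm&cflags=-O2%20-march%3Dnative"
)
known, custom = split_qualifiers(purl)
print(known)
print(custom)
```

A tool that only understands the recognised keys would see an ordinary deb PURL and silently drop the build options, which is why the result is of limited use in an SBOM.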
pombreda
2 days ago
Good point, but that may not be in scope either... since this is not even something you can get from Debian easily: not just by looking at a Debian pool or diving into a package's control files AFAIK?
Say I rebuild a Debian package with some new build options.
Is this the same or a new package? I'd say a new one.
Is this the same name? I'd say a new one.
Is this distributed by Debian? Nope, so this comes from another repo and pool, right?
The idea with PURL is to have simple and short PURLs for the common case, and make it possible to handle less common cases. Rebuilding a package and sharing it on another repo would be a less common case to me? WDYT?
giantrobot
2 days ago
I've worked with ingesting and generating SBOMs a bit, which is where my experience with PURLs comes from. I loved the idea because it gets about 80% of the way to usefully identifying software components. So just to be clear, I don't dislike them and think you've done good work.
I don't necessarily agree that a site-built package is a different package. It's just a single line of text might not be enough to encode build configurations.
A binary package built by Debian's build fleet is a unique artifact signed by the project's keys. It's a thing with a canonical identifier. A deb-src, Gentoo package, or FreeBSD port might have a canonical identifier for the original source but that isn't canonical once it's built on a machine. In many cases the difference is immaterial but there are a lot of `#ifdef`s in a lot of code. Then there's whatever autoconf generates for any given system.
The canonical source distribution is useful information but then so is the build information. I'm not sure this can be captured via qualifiers, at least I can't think of a way to do it.
Maybe just a source package is enough. For reporting a bug or CVE knowing something came from a particular source package is a start to triaging an issue. But you'd want a distinct namespace for source packages. A source package namespace at least tells you "in summary this package contains all the diffs Debian uses" versus the PURL for the upstream source package (from GitHub etc).
pabs3
3 hours ago
Nitpick: Debian does not sign binary packages, they sign Release files, which contain hashes of Packages files, which contain hashes of .deb binary packages.
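As a sketch of that chain in Python, using synthetic in-memory data rather than a real archive: the file paths and package name below are invented, and verifying the signature on the Release file itself (via `InRelease` or `Release.gpg`) is deliberately left out.

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def release_hashes(release_text: str) -> dict:
    """Map path -> sha256 from the 'SHA256:' block of a Release file."""
    hashes, in_block = {}, False
    for line in release_text.splitlines():
        if not line.startswith(" "):
            # A new top-level field starts; only SHA256 entries are collected.
            in_block = line.strip() == "SHA256:"
        elif in_block:
            digest, _size, path = line.split()
            hashes[path] = digest
    return hashes

def package_hashes(packages_text: str) -> dict:
    """Map Filename -> SHA256 from the stanzas of a Packages file."""
    hashes = {}
    for stanza in packages_text.strip().split("\n\n"):
        fields = dict(
            line.split(": ", 1)
            for line in stanza.splitlines()
            if ": " in line and not line.startswith(" ")
        )
        hashes[fields["Filename"]] = fields["SHA256"]
    return hashes

def deb_is_covered(release_text, packages_path, packages_bytes, deb_path, deb_bytes):
    """Check Release -> Packages -> .deb; the Release signature is NOT checked."""
    if release_hashes(release_text).get(packages_path) != sha256(packages_bytes):
        return False
    return package_hashes(packages_bytes.decode()).get(deb_path) == sha256(deb_bytes)

# Synthetic stand-ins for real archive files (all names are made up).
deb_path = "pool/main/c/curl/curl_1.0_amd64.deb"
deb = b"not a real .deb, just bytes"
packages = f"Package: curl\nFilename: {deb_path}\nSHA256: {sha256(deb)}\n".encode()
release = (
    "Suite: stable\n"
    "SHA256:\n"
    f" {sha256(packages)} {len(packages)} main/binary-amd64/Packages\n"
)

ok = deb_is_covered(release, "main/binary-amd64/Packages", packages, deb_path, deb)
tampered = deb_is_covered(release, "main/binary-amd64/Packages", packages, deb_path, deb + b"!")
print(ok, tampered)
```

The signature covers only the top of the chain, so trust in an individual .deb is indirect: it holds only as long as each hash along the way matches.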
Debian uses .buildinfo files for builders to record the information about the inputs to building a binary package, including the source hashes, environment variables etc.
A site-built package could be a different package, but it could also be a bit-identical package, due to Debian working on Reproducible Builds.
pombreda
a day ago
You wrote:
> It's just a single line of text might not be enough to encode build configurations.
That's the tough part, and IMHO outside of PURL? ... Note that for C/C++ code ... @alcroito mentions CPS in the same comment page at https://news.ycombinator.com/item?id=44196246 ... and at a quick glance it looks like it attempts to capture these details, maybe?
So it could be a happy combo?
pombreda
3 days ago
You wrote:
> So to the more knowledgeable people out there, what is the PURL way of identifying a C++ library like that?
That's a blind spot. This is a real problem for everyone, as you rightly explained.
So I have been thinking a lot about how to track C/C++ native libraries, and I have been working on a plan to deal with this.
You can read a summary here (that I just posted to support this discussion!) - https://github.com/aboutcode-org/www.aboutcode.org/issues/30
And this comment links to a more detailed work-in-progress planning doc: - https://github.com/aboutcode-org/www.aboutcode.org/issues/30...
If you want to chip in and help, this would be awesome.
And IMHO, aligned with your thinking, this should not be tied to a build system or a for-profit operation like conan.io, or a linux distro, or for that matter a specific build tool or approach, as there are so many. It should be self-hosted, easy to sync, and simple to store in a git repo.
alcroito
3 days ago
Thanks for the links! I hope the proposal works out. I skimmed through the doc, and one thing I’d suggest is to consider using the CPS format rather than the ABOUT one for the metadata. The format is driven by Kitware, the developers of CMake, and thus if it’s contributed to them, a big chunk of the C++ ecosystem would get buy-in just because of the inertia of using CMake, and getting it for free with the tool.
https://cps-org.github.io/cps/overview.html
I’m not sure how I can help, but I’m open to discussion, because the company I work for is also interested in how to handle this well for our products.
pombreda
2 days ago
Let's chat. There are really a lot of folks interested because of the suffering! ABOUT is just a suggestion, and TIL about CPS and it looks awesome! Reach me at pombredanne@aboutcode.org, or leave a comment on the issue or doc linked.
pombreda
3 days ago
Note that there should be a gitlab type as it is planned for: https://github.com/package-url/purl-spec/blob/a90ee02679afc3...
gitlab and github do provide package-like discoverability. Do you have a pointer that says a github package is a mistake?
alcroito
2 days ago
I believe I was thinking of the comments at https://github.com/package-url/purl-spec/issues/59 but I see you've already replied there.
donenext
3 days ago
Completely agree here. A `git` type using the namespace of your choice would be plenty to enable tools to find these packages. Even though it's not "officially" supported in the spec, this is what we do internally.
pombreda
3 days ago
IMHO, a bare git repo would be a git URL as specified in pip and SPDX, and not a PURL... I would be interested to know more about your use case. Feel free to drop a note at pombredanne@aboutcode.org