From: Richard Purdie <richard.purdie@linuxfoundation.org>
To: Marta Rybczynska <rybczynska@gmail.com>
Cc: benjamin.robin@bootlin.com,
openembedded-core@lists.openembedded.org, ross.burton@arm.com,
peter.marko@siemens.com, jpewhacker@gmail.com,
olivier.benjamin@bootlin.com, antonin.godard@bootlin.com,
mathieu.dubois-briand@bootlin.com, thomas.petazzoni@bootlin.com
Subject: Re: [OE-core] [PATCH RFC 0/2] sbom-cve-check: Download CVE DB using BitBake fetcher
Date: Thu, 19 Mar 2026 07:52:13 +0000
Message-ID: <793e23609ccbbd3e139136ee8243d6ed2d116a55.camel@linuxfoundation.org>
In-Reply-To: <CAApg2=T6f1yQ8qQFF+V3K+0yhpZWscDUqO+pXA2MK0jik3C4Yw@mail.gmail.com>
On Thu, 2026-03-19 at 08:29 +0100, Marta Rybczynska wrote:
>
>
> On Wed, Mar 18, 2026 at 6:45 PM Richard Purdie via lists.openembedded.org <richard.purdie=linuxfoundation.org@lists.openembedded.org> wrote:
> > Hi,
> >
> > On Mon, 2026-03-09 at 12:57 +0100, Benjamin Robin via lists.openembedded.org wrote:
> > > This series is an RFC and a follow-up to patch 6/6 ("Add class for
> > > post-build CVE analysis"), which was previously discussed [1].
> > > I have prepared two RFC series, this one and another, each exploring
> > > different approaches to handling the download of CVE databases.
> > >
> > > I explored using BitBake's internal fetcher instead of direct Git calls
> > > for fetching CVE databases. However, I encountered two major issues:
> > >
> > > - No proper shallow clone support: I wanted to clone the repository
> > > without downloading the entire history (which is very large). While
> > > `BB_GIT_SHALLOW` exists, it creates multiple tarballs in the download
> > > directory, which is inefficient for updates.
> > >
> > > In this series, we are going to do a full clone of the git repository,
> > > so this point is not going to be fixed.
> > >
> > > - Performance overhead for CVE databases deployment: The recipes
> > > downloading CVE databases must copy them to the sysroot or to the
> > > deploy directory. This requires copying the extracted databases
> > > multiple times, even with hard links, which is slow due to the
> > > combined size (~6 GB, ~672,000 small files).
> > >
> > > In this series, we use a custom deploy task that copies the git
> > > repository with rsync directly into the final deploy directory,
> > > bypassing all the BitBake logic.
> > >
> > > Additionally, there's no built-in way to control the interval between
> > > CVE database fetches: In this series, we are going to use AUTOREV,
> > > which implies querying the git repositories on each build to check
> > > whether there is a new git revision.
> > >
> > > Moreover, this series ensures that the CVE analysis runs only when
> > > the original SBOM changes or when the CVE databases are updated.
> > >
> > > Upon revisiting the class and its associated recipes, I identified
> > > several areas for improvement, which were fixed in the first commit.
> > > This series also includes a second commit making the VEX class optional
> > > rather than mandatory.
> > >
> > > [1] https://lore.kernel.org/all/20260226-add-sbom-cve-check-v3-0-2e60423f4d35@bootlin.com/
> >
> > I've just been trying to work out where we're at with this coming up to
> > release and we need to get this resolved.
> >
> > I feel quite strongly that we need to use the fetcher for obtaining
> > this data. "fetching" isn't trivial and is full of
> > license/auditing/SBOM issues. Making any exception to that, even for
> > CVE data, tends to become problematic later.
> >
> > The existing approach was only done as it was a sqlite database and we
> > didn't have fetcher support for such a thing. If we need to improve the
> > git fetcher in some way to better support this use case (e.g. shallow
> > clone update efficiency), that is something we can work on.
> >
> > As such, I was wondering if you had newer versions of these patches?
> >
> > I'd note that we can't set AUTOREV by default, we'll need to specify a
> > revision, and document how the user can enable AUTOREV in their config
> > (maybe even a config fragment?). Whilst it is annoying to do that, it
> > is a requirement that the system doesn't touch the network outside
> > mirrors unless configured to.
>
>
> Fetching the complete git repos has a number of problems. Why not use release
> tarballs like those in https://github.com/CVEProject/cvelistV5/releases ?
> Fkie feeds also have them https://github.com/fkie-cad/nvd-json-data-feeds/releases

FWIW we can shallow clone git repos; it just isn't optimal in how
updates are handled, which was Benjamin's concern, as the shallow clones
end up more like tarballs.

If we use the BitBake fetcher, it also becomes much easier to use
tarballs directly, since the fetcher supports those too and it then
just becomes a simple SRC_URI change.

Cheers,
Richard
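
For illustration only, the SRC_URI point above could look roughly like the
recipe fragment below. This is a minimal sketch, not taken from the patches
under discussion: the branch name, the pinned revision and the release
archive name are placeholders or assumptions and would need to be checked
against the actual repository and release page.

```
# Hypothetical recipe fragment (sketch only, not from the RFC series).

# Option A: git fetcher with a pinned revision, so that a default build
# does not query the upstream repository for new commits.
# The branch name "main" is an assumption.
SRC_URI = "git://github.com/CVEProject/cvelistV5.git;protocol=https;branch=main"
SRCREV = "<pinned-commit-sha>"

# Opt-in from the user's configuration instead of a pinned revision:
# SRCREV = "${AUTOREV}"

# Option B: a release tarball instead of git; only SRC_URI and its
# checksum change. The tag and archive name are placeholders.
# SRC_URI = "https://github.com/CVEProject/cvelistV5/releases/download/<release-tag>/<archive-name>"
# SRC_URI[sha256sum] = "<checksum>"
```

With a pinned SRCREV the fetch can be satisfied from mirrors, while users
who want fresh data on every build would override SRCREV themselves.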
Thread overview: 13+ messages
2026-03-09 11:57 [PATCH RFC 0/2] sbom-cve-check: Download CVE DB using BitBake fetcher Benjamin Robin
2026-03-09 11:57 ` [PATCH RFC 1/2] " Benjamin Robin
2026-03-09 11:57 ` [PATCH RFC 2/2] sbom-cve-check: VEX class is no longer mandatory Benjamin Robin
2026-03-18 17:45 ` [OE-core] [PATCH RFC 0/2] sbom-cve-check: Download CVE DB using BitBake fetcher Richard Purdie
2026-03-19 7:29 ` Marta Rybczynska
2026-03-19 7:52 ` Richard Purdie [this message]
2026-03-19 9:07 ` Benjamin Robin
2026-03-19 9:57 ` Benjamin Robin
2026-03-19 8:45 ` Benjamin Robin
2026-03-19 8:58 ` Marta Rybczynska
2026-03-19 9:48 ` Benjamin Robin
2026-03-19 12:00 ` Marta Rybczynska
2026-03-19 12:03 ` Benjamin Robin