Future of CVE scanning in Yocto

Openembedded Core Discussions

* Future of CVE scanning in Yocto
From: Ross Burton @ 2025-03-07 14:36 UTC
  To: openembedded-core, openembedded-architecture

Hi,

[ this turned into a wall of text, I clearly need to go and buy The Art of Explanation.  Feel free to skim down to tl;dr ]

This isn’t going to be a surprise to anyone who has been keeping up with the CVE situation, but there have been problems and changes at NIST related to the National Vulnerability Database (NVD).  For example, The Register[1] had an article last year discussing the almost complete stall in processing new issues, and the NVD news site[2] talks about how they are attempting to deal with the backlog, but also admits that they’re struggling with API stability and with integrating data at an acceptable speed.

Unfortunately, our cve-check class uses the NVD database because at the time it was by far the best open solution: it was easily downloaded, the data had machine-readable metadata identifying what software was affected (called CPEs), and the CPEs were easily updated if they were incorrect by just sending an email.  The “upstream” CVE project (cve.org, run by MITRE) at the time didn’t have a convenient way to batch-download the database and didn’t include any CPE information, so it wasn’t really an option.

The end result is that the security reports generated by our cve-check class may look reasonable, but they’re missing large numbers of issues because the CPEs that the scanner needs in order to match them are missing.  A randomly picked example is https://nvd.nist.gov/vuln/detail/CVE-2024-53589: the human-readable summary says this is an issue in objdump 2.43 but there’s no CPE, so cve-check cannot process it.
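
To make the failure mode concrete, here is a minimal sketch of a CPE-only matcher. It is not the cve-check implementation; the record type, the function names and the CVE IDs are all made up for illustration. It only shows why a record with no CPE can never appear in a report, however clear its summary is.

```python
from dataclasses import dataclass, field


@dataclass
class CveRecord:
    cve_id: str
    summary: str
    # (vendor, product) pairs taken from the record's CPEs; empty if none were assigned.
    products: set = field(default_factory=set)


def match_by_cpe(records, vendor, product):
    """Return IDs of records whose CPE data names (vendor, product)."""
    return [r.cve_id for r in records if (vendor, product) in r.products]


def records_without_cpe(records):
    """Return IDs of records that a CPE-only matcher can never report."""
    return [r.cve_id for r in records if not r.products]


# Hypothetical records, not real CVE data.
records = [
    CveRecord("CVE-0000-0001", "issue with CPE data", {("gnu", "binutils")}),
    CveRecord("CVE-0000-0002", "issue in objdump, no CPE assigned"),
]

matched = match_by_cpe(records, "gnu", "binutils")
unmatchable = records_without_cpe(records)
print("reported:", matched)
print("invisible to a CPE-only scan:", unmatchable)
```

The second record is silently dropped from the report rather than flagged, which is why the output looks clean.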

This is a problem we need to mitigate: the reports _look_ good but the reality is unknown.

The good news is that the CVE Project has been quietly catching up.  The database is available in a git repository on GitHub[3] and there’s a “vulnrichment” program to add more metadata to all future CVEs and to backfill it as needed.  This means we’re now in a situation where newer CVEs often have no CPE in the NVD[4] but have a CPE in the CVE database[5], as they’re pulling more information from the provider (GitHub in the case of this specific issue).
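
As a rough sketch of what reading CPEs out of a CVE database record could look like: the field names below (containers, cna, adp, affected, cpes) are my understanding of the CVE JSON 5 record layout and should be checked against the schema in the repository, and the sample record is invented.

```python
import json


def cpes_from_cve_record(record):
    """Collect CPE strings from the CNA container and any ADP containers."""
    containers = record.get("containers", {})
    # Assumed layout: the CNA container is one object, ADP containers are a list.
    sources = [containers.get("cna", {})] + list(containers.get("adp", []))
    found = set()
    for container in sources:
        for affected in container.get("affected", []):
            found.update(affected.get("cpes", []))
    return sorted(found)


# Made-up record for illustration; not real CVE data.
sample = json.loads("""
{
  "cveMetadata": {"cveId": "CVE-0000-0001"},
  "containers": {
    "cna": {"affected": [{"vendor": "example", "product": "widget",
                          "cpes": ["cpe:2.3:a:example:widget:*:*:*:*:*:*:*:*"]}]},
    "adp": [{"affected": [{"vendor": "example", "product": "widget",
                           "cpes": ["cpe:2.3:a:example:widget:1.0:*:*:*:*:*:*:*"]}]}]
  }
}
""")

cpes = cpes_from_cve_record(sample)
print(sample["cveMetadata"]["cveId"], cpes)
```

Looking at both the CNA and ADP containers matters here, since enrichment data added after publication would not live in the CNA's own section.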

So there’s an argument to be made that the cve-check class should be ported to use the CVE database.  However, the cve-check class itself is quite limited because it only runs at build time: there’s no way to take the output of a build and re-run the security scanner a year later without involving bitbake again.

I believe that, like many things, SPDX3 can help us here.  SPDX3 can contain vulnerability information to the extent that it’s simple to extract it as OpenVEX if that’s the sort of thing you want[6].  Our SPDX contains for every recipe:
- All known CPEs
- Details of issues that are mitigated with CVE_STATUS
- Details of issues that are mitigated with patches[7]

This means we have a single file that can act as both the image manifest and vulnerability list.  With suitable tooling (using the CVE database) that can run either standalone or directly post-build, we can have security reports at build time or periodically after the build.  This would then make both the cve-check and vex classes, and their associated recipes, obsolete.
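
A standalone tool would start by pulling the package list and CPEs out of the SBOM. The sketch below assumes an SPDX 3 JSON-LD document with an "@graph" list, "software_Package" elements and "externalIdentifier" entries typed "cpe22" or "cpe23"; those names are my reading of the SPDX 3 serialization and have not been checked against what our classes actually emit. The sample document is invented.

```python
def packages_with_cpes(spdx_doc):
    """Map 'name version' to the CPE identifiers recorded for that package."""
    result = {}
    for element in spdx_doc.get("@graph", []):
        if element.get("type") != "software_Package":
            continue
        cpes = [
            ident["identifier"]
            for ident in element.get("externalIdentifier", [])
            if ident.get("externalIdentifierType") in ("cpe22", "cpe23")
        ]
        name = element.get("name", "")
        version = element.get("software_packageVersion", "")
        result[f"{name} {version}".strip()] = cpes
    return result


# Invented SBOM fragment; a real tool would json.load() the file instead.
sample_doc = {
    "@graph": [
        {
            "type": "software_Package",
            "name": "widget",
            "software_packageVersion": "1.0",
            "externalIdentifier": [
                {
                    "type": "ExternalIdentifier",
                    "externalIdentifierType": "cpe23",
                    "identifier": "cpe:2.3:a:example:widget:1.0:*:*:*:*:*:*:*",
                }
            ],
        },
        {"type": "software_File", "name": "widget.c"},
    ]
}

found = packages_with_cpes(sample_doc)
print(found)
```

Because the only input is the SBOM file, the same function could be called from a post-build task or from a machine that has never seen bitbake.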

tl;dr:  I propose:
- For the upcoming (April 2025) release we should add to the release notes that we’re aware that NVD, which our CVE tooling is based on, is struggling, and that we’re working on a solution for the October 2025 release.
- The future of CVE tooling should revolve around SPDX3 SBOMs. If there’s any missing metadata that isn’t in the SPDX already, we should add it.
- We should ideally find, or alternatively write, a tool that can do automated CVE detection sweeps like cve-check does using the CVE Database and SPDX3. This tool should be able to run either at build time inside a Yocto build, or on a separate system with just the SPDX as an input.
- Automated tooling is just a piece of the puzzle, but it’s a useful piece.  More on that later...

Cheers,
Ross

[1] https://www.theregister.com/2024/03/22/opinion_column_nist/
[2] https://www.nist.gov/itl/nvd
[3] https://github.com/CVEProject/cvelistV5
[4] https://nvd.nist.gov/vuln/detail/CVE-2025-26603
[5] https://www.cve.org/CVERecord?id=CVE-2025-26603
[6] https://github.com/JPEWdev/spdx3toopenvex
[7] Once https://lore.kernel.org/openembedded-core/20250306212007.44880-1-JPEWhacker@gmail.com/T/#u is merged


* Re: [Openembedded-architecture] Future of CVE scanning in Yocto
From: Marta Rybczynska @ 2025-03-10 15:37 UTC
  To: ross.burton; +Cc: openembedded-core, openembedded-architecture


On Fri, Mar 7, 2025 at 3:36 PM Ross Burton via lists.openembedded.org
<ross.burton=arm.com@lists.openembedded.org> wrote:

> Hi,
>
> [ this turned into a wall of text, I clearly need to go and buy The Art of
> Explanation.  Feel free to skim down to tl;dr ]
>

Well... the answer will also be a wall of text...

>
> This isn’t going to be a surprise to anyone who has been keeping up with
> the CVE situation, but there have been problems and changes at NIST related
> to the National Vulnerability Database (NVD).  For example, The Register[1]
> had an article last year discussing the almost complete stall in processing
> new issues, and the NVD news site[2] talks about how they are attempting to
> deal with the backlog, but also admits that they’re struggling with API
> stability and with integrating data at an acceptable speed.
>
> Unfortunately, our cve-check class uses the NVD database because at the
> time it was by far the best open solution: it was easily downloaded, the
> data had machine-readable metadata identifying what software was affected
> (called CPEs), and the CPEs were easily updated if they were incorrect by
> just sending an email.  The “upstream” CVE project (cve.org, run by
> MITRE) at the time didn’t have a convenient way to batch-download the
> database and didn’t include any CPE information so it wasn’t really an
> option.
>
> The end result is that the security reports generated by our cve-check
> class may look reasonable, but they’re missing large numbers of issues
> because the CPEs that the scanner needs in order to match them are missing.
> A randomly picked example is
> https://nvd.nist.gov/vuln/detail/CVE-2024-53589: the human-readable
> summary says this is an issue in objdump 2.43 but there’s no CPE, so
> cve-check cannot process it.
>
> This is a problem we need to mitigate: the reports _look_ good but the
> reality is unknown.
>

To add more context: it was NVD that was adding CPE data to CVE entries,
and also creating CPE information for projects that didn't have any.
cve-check runs under the assumption that there's a CPE identifying the
product for each CVE. That is no longer the case. The NVD backlog is more
than 20k entries right now, and growing. That is frustrating: the CVE
exists, but if it is one from the backlog, cve-check won't show it.

>
> The good news is that the CVE Project has been quietly catching up.  The
> database is available in a git repository on GitHub[3] and there’s a
> “vulnrichment” program to add more metadata to all future CVEs and to
> backfill it as needed.  This means we’re now in a situation where newer
> CVEs often have no CPE in the NVD[4] but have a CPE in the CVE database[5],
> as they’re pulling more information from the provider (GitHub in the case
> of this specific issue).
>

There are two fields to look into. One is the product/vendor field, which
is mandatory (some vendors unfortunately put "n/a" in there). The other is
the CPE field, which was added quite recently. With the effort the CVE
programme is putting into this, we can reasonably assume that the situation
is going to improve in the future, at least for new entries. According to
the CVE programme rules, modifying an existing record is a little more
complicated than it was with NVD (it has to be done by either the issuer of
the entry, or an "ADP", of which there is only one today: CISA).
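
A minimal sketch of how a matcher might use those two fields: prefer the CPE when present, otherwise fall back to vendor/product, and treat "n/a" as missing. The function name, the key format and the placeholder set are hypothetical, and the entry layout (vendor, product, cpes keys) is an assumption about the record format.

```python
# Values treated as "no usable data"; this set is an assumption for illustration.
PLACEHOLDERS = {"", "n/a"}


def product_keys(affected_entry):
    """Return identifiers usable for matching one 'affected' entry.

    CPEs are preferred when present; otherwise fall back to the
    vendor/product pair, unless either half is a placeholder such as "n/a".
    """
    cpes = affected_entry.get("cpes", [])
    if cpes:
        return [("cpe", cpe) for cpe in cpes]
    vendor = affected_entry.get("vendor", "").strip().lower()
    product = affected_entry.get("product", "").strip().lower()
    if vendor in PLACEHOLDERS or product in PLACEHOLDERS:
        return []
    return [("vendor-product", f"{vendor}:{product}")]


with_cpe = product_keys({"vendor": "Example", "product": "Widget",
                         "cpes": ["cpe:2.3:a:example:widget:*:*:*:*:*:*:*:*"]})
fallback = product_keys({"vendor": "Example", "product": "Widget"})
unusable = product_keys({"vendor": "n/a", "product": "Widget"})
print(with_cpe, fallback, unusable)
```

An empty result is worth reporting separately rather than dropping, since it marks a record that needs manual triage.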

>
> So there’s an argument to be made that the cve-check class should be
> ported to use the CVE database.  However, the cve-check class itself is
> quite limited because it only runs at build time: there’s no way to take
> the output of a build and re-run the security scanner a year later without
> involving bitbake again.
>
> I believe that, like many things, SPDX3 can help us here.  SPDX3 can
> contain vulnerability information to the extent that it’s simple to extract
> it as OpenVEX if that’s the sort of thing you want[6].  Our SPDX contains
> for every recipe:
> - All known CPEs
> - Details of issues that are mitigated with CVE_STATUS
> - Details of issues that are mitigated with patches[7]
>

> This means we have a single file that can act as both the image manifest
> and vulnerability list.  With suitable tooling (using the CVE database)
> that can run either standalone or directly post-build, we can have
> security reports at build time or periodically after the build.  This would
> then make both the cve-check and vex classes, and their associated recipes,
> obsolete.
>

From a security person's point of view, I think that the ideal solution
would be to have an SBOM with the list of packages and all other immutable
data, and the vulnerability information (VEX) separately. The main reason
is that the VEX file will change every day, while the list of packages
won't. There are more and more cases where people ask for signed SBOMs, and
re-signing a modified file is clearly problematic. Also, SPDX (like all SBOM
standards) supports a separate VEX file. I also expect people to need to
modify the VEX files to take into account the specifics of their board and
configuration, for example marking some CVEs as not exploitable because of
compiler options, the package's configuration options, kernel hardening,
etc.
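
The split can be illustrated with a small sketch, using a SHA-256 digest as a stand-in for a signature. The document structures here are invented and are neither SPDX nor a real VEX format; the point is only that the vulnerability file can change daily while the SBOM bytes, and therefore anything signed over them, stay the same.

```python
import hashlib
import json


def sha256_hex(data):
    """Hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def make_vex(sbom_bytes, statements):
    """Build a standalone VEX-like document that refers to the SBOM by digest."""
    return {"sbom_sha256": sha256_hex(sbom_bytes), "statements": statements}


# Invented SBOM content, serialized once at build time and then left alone.
sbom_bytes = json.dumps(
    {"packages": [{"name": "widget", "version": "1.0"}]}, sort_keys=True
).encode()
digest_at_signing = sha256_hex(sbom_bytes)

# Day 1: nothing known. Day 2: an analyst adds a board-specific assessment.
vex_day1 = make_vex(sbom_bytes, [])
vex_day2 = make_vex(sbom_bytes, [
    {"cve": "CVE-0000-0001", "status": "not_affected",
     "note": "feature disabled by configuration"},
])

print(vex_day1["sbom_sha256"] == digest_at_signing)
print(vex_day2["sbom_sha256"] == digest_at_signing)
```

Had the statements been written into the SBOM itself, the digest would differ on day 2 and the file would need signing again.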

Also, the metadata we need, like CVE_STATUS et al., does not always
have an existing representation in VEX. VEX standards have been written for
security analysts who analyze the complete final product. They do not cover
certain transitional cases that "cve-check" needs in order to work
(a database error, for example). At least I do not know how to do it. I've
talked with multiple people about this problem and the current solution
they propose is to use custom extensions. This is unfortunate in the case
of composition, but there seems to be no better way.

However, I would really like to have a discussion about the complete view
of how SPDX and VEX are used. If everyone only wants "cve-check", that's
fine by me. I see other use cases showing up too, such as compositional
analysis, where the build is only one component of a bigger solution and
its results need to be aggregated into a bigger "cve-check".

>
> tl;dr:  I propose:
> - For the upcoming (April 2025) release we should add to the release notes
> that we’re aware that NVD, which our CVE tooling is based on, is
> struggling, and that we’re working on a solution for the October 2025
> release.
>

And switch the source to a different fetcher, such as the FKIE one, and tell
them we are making the switch. I've been wondering for months how big a load
we put on the NVD servers.

> - The future of CVE tooling should revolve around SPDX3 SBOMs. If there’s
> any missing metadata that isn’t in the SPDX already, we should add it.
>
We also need to fix the cases where there was no appropriate SPDX generation
when we wrote yocto-vex-check. The most notable case for SPDX2 was the
world build. The dependency graph is also something to look into:
"cve-check" allows rapid generation of CVE data for a world build, as it
doesn't need to do a build. This wasn't the case with SPDX2. A world build
that includes a complete build is, well... long, and it is not necessary in
order to perform a CVE scan for a set of layers or a distribution.

> - We should ideally find, or alternatively write, a tool that can do
> automated CVE detection sweeps like cve-check does using the CVE Database
> and SPDX3. This tool should be able to run either at build time inside a
> Yocto build, or on a separate system with just the SPDX as an input.
>

I think that, in the October time frame, the tool will need to merge NVD
and CVE records (or scan runs). The CVE data is improving but it is far
from ideal (see below). To have a reasonably working tool, someone will
have to invest in fixing CVE records, at a much faster pace than NVD is
doing now.
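
One simple way to picture the merge is a per-CVE union of whatever product identifiers each source provides. This is only a sketch: the function name and the input shape (a dict from CVE ID to a set of CPE strings) are assumptions, the data is invented, and a real tool would also have to reconcile conflicting version ranges.

```python
def merge_sources(nvd, cve_db):
    """Union per-CVE CPE sets from two sources, keyed by CVE ID."""
    merged = {}
    for source in (nvd, cve_db):
        for cve_id, cpes in source.items():
            merged.setdefault(cve_id, set()).update(cpes)
    return merged


# Invented data: each source covers something the other does not.
nvd = {
    "CVE-0000-0001": {"cpe:2.3:a:example:widget:*:*:*:*:*:*:*:*"},
    "CVE-0000-0002": set(),  # stuck in the backlog, no CPE yet
}
cve_db = {
    "CVE-0000-0002": {"cpe:2.3:a:example:gadget:*:*:*:*:*:*:*:*"},
    "CVE-0000-0003": {"cpe:2.3:a:example:gizmo:*:*:*:*:*:*:*:*"},
    "CVE-0000-0004": set(),  # no CPE in either source
}

merged = merge_sources(nvd, cve_db)
still_unmatchable = sorted(cve for cve, cpes in merged.items() if not cpes)
print(len(merged), "records,", still_unmatchable, "still have no CPE")
```

The entries left in still_unmatchable are the ones where someone would have to fix the record by hand.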

> - Automated tooling is just a piece of the puzzle, but it’s a useful
> piece.  More on that later...
>

I have my opinion on the topic too :)

Kind regards,
Marta

PS. Here's the yocto-vex-check result for the master branch as of this
night. There are a few false positives and I'm looking at the Linux kernel
entries (which usually parse way better).

Issues for package graphene (version 1.10.8):
Unpatched: CVE-2024-1984
Count: 1

Issues for package linux-yocto (version 6.12.13):
Unpatched: CVE-2021-46978 CVE-2021-47089 CVE-2021-47137 CVE-2024-26666 CVE-2024-34027 CVE-2024-35919
Count: 6

Issues for package go-runtime (version 1.24.0):
Unpatched: CVE-2024-24786
Count: 1

Issues for package libsndfile1 (version 1.2.2):
Unpatched: CVE-2024-50613
Count: 1

Issues for package libsoup-2.4 (version 2.74.3):
Unpatched: CVE-2024-52530 CVE-2024-52531 CVE-2024-52532
Count: 3

Issues for package ghostscript (version 10.04.0):
Unpatched: CVE-2024-46951 CVE-2024-46952 CVE-2024-46953 CVE-2024-46954 CVE-2024-46955 CVE-2024-46956
Count: 6

Global issue count: 18


* Re: [OE-core] Future of CVE scanning in Yocto
From: Antonin Godard @ 2025-03-11 11:00 UTC
  To: ross.burton, openembedded-core, openembedded-architecture

Hi Ross,

On Fri Mar 7, 2025 at 3:36 PM CET, Ross Burton via lists.openembedded.org wrote:
[...]
> tl;dr:  I propose:
> - For the upcoming (April 2025) release we should add to the release notes that we’re aware that NVD, which our CVE tooling is based on, is struggling, and that we’re working on a solution for the October 2025 release.

I've tried to summarize this and add the known issue to the release notes here:
https://lists.yoctoproject.org/g/docs/message/6530

Is there a bug associated with this? I could add it to the release notes so that
people can track the progress on it.

Antonin

-- 
Antonin Godard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

