From: Thomas Perale via buildroot <buildroot@buildroot.org>
To: Fabien Lehoussel <fabien.lehoussel@smile.fr>
Cc: Thomas Perale <thomas.perale@mind.be>, buildroot@buildroot.org
Subject: Re: [Buildroot] [PATCH v3 2/2] support/scripts/cve-check: add kernel CVE filtering based on compiled files
Date: Fri, 27 Feb 2026 10:50:11 +0100
Message-ID: <20260227095011.47847-1-thomas.perale@mind.be>
In-Reply-To: <20260225105707.139320-3-fabien.lehoussel@smile.fr>
Hi Fabien,
In reply to:
> Introduce optional filtering for Linux kernel CVEs using compile_commands.json
> and the CNA database [1].
> This reduces false positives by marking CVEs that don't
> affect compiled files as "not_affected" with "code_not_present" justification,
> while reporting only CVEs that affect compiled files, enhancing the relevance
> of CVE reports for specific builds.
>
> Changes:
> - Add CVEDatabase class for generic CVE database management (clone/pull)
> supporting both NVD and CNA databases with automatic sync capability
> Database path argument is the repository root directly (no automatic 'git' subfolder)
> enabling seamless CI/CD integration and clearer path semantics
> - Add CVE_LINUX class for specialized Linux kernel CVE analysis
> - Load compiled files from compile_commands.json
> - Match CVE affected files from CNA database against compiled files
> - Categorize CVEs as applicable, not applicable, or insufficient data
> - Enhance cve-check script with optional kernel CVE filtering
> - Activate filtering automatically when --cc-path and --cna-path are provided
> - Replace --no-nvd-update with generic --no-db-update flag for both databases
> - Mark non-compiled kernel CVEs as "not_affected" with "code_not_present" justification
> - Preserve audit trail by keeping all vulnerabilities in SBOM
> - Keeps potentially applicable and uncertain Linux CVEs for manual review
>
> Usage:
> make show-info | utils/generate-cyclonedx | support/script/cve-check \
> --nvd-path dl/buildroot-nvd/ \
> --cc-path output/images/linux_compile_commands.json \
> --cna-path dl/buildroot-cvelistV5 \
> --no-db-update
>
> Requires:
> - linux_compile_commands.json from kernel build (BR2_LINUX_KERNEL_COMPILE_COMMANDS)
>
> [1] CNA database: https://github.com/CVEProject/cvelistV5.git
>
> Signed-off-by: Fabien Lehoussel <fabien.lehoussel@smile.fr>
> ---
> support/scripts/cve-check | 119 ++++++++++++++-
> support/scripts/cve.py | 294 +++++++++++++++++++++++++++++++++++---
> 2 files changed, 386 insertions(+), 27 deletions(-)
>
> diff --git a/support/scripts/cve-check b/support/scripts/cve-check
> index ff14e4b238..1e3743248a 100755
> --- a/support/scripts/cve-check
> +++ b/support/scripts/cve-check
> @@ -7,11 +7,20 @@
> # The NVD database is cloned using a mirror of it and the content is compared
> # locally.
> #
> +# For Linux kernel CVEs, optionally filters them based on compiled files using
> +# the CNA database (cvelistV5) and compile_commands.json matching.
> +#
> # Example usage:
> # $ make show-info | utils/generate-cyclonedx | support/script/cve-check --nvd-path dl/buildroot-nvd/
> +#
> +# With kernel CVE filtering:
> +# $ make show-info | utils/generate-cyclonedx | support/script/cve-check \
> +# --nvd-path dl/buildroot-nvd/ \
> +# --cc-path output/images/linux_compile_commands.json \
> +# --cna-path dl/buildroot-cvelistV5
> from collections import defaultdict
> from pathlib import Path
> -from typing import TypedDict
> +from typing import TypedDict, Optional
> import argparse
> import sys
> import json
> @@ -29,6 +38,16 @@ database.
>
> The NVD database is cloned using a mirror of it and the content is compared
> locally.
> +
> +For Linux kernel CVEs, optionally filters them based on compiled files using
> +the CNA database (cvelistV5) and compile_commands.json matching.
> +
> +When kernel CVE filtering is enabled (via --cc-path and --cna-path):
> +- Marks false positives as "not_affected": CVEs where NO affected files are compiled
> + with "code_not_present" justification, preserving complete audit trail
> +- Keeps potentially applicable CVEs: where affected files ARE compiled
> +- Keeps uncertain CVEs: where CNA database lacks sufficient information
> +
> """
>
>
> @@ -281,8 +300,70 @@ def enrich_vulnerabilities(nvd_path: Path, sbom):
> vuln_append_or_update_affects_if_exists(vulnerabilities, vulnerability)
>
>
> +def analyze_kernel_cves(sbom, cve_linux: cvecheck.CVE_LINUX, compiled_files: set):
> + """
> + Analyze Linux kernel CVEs based on compiled files.
> +
> + Marks CVEs where NO affected files are compiled as "not_affected" with
> + "code_not_present" justification, preserving complete audit trail.
> + Keeps CVEs where affected files ARE compiled or insufficient data.
> +
> + Args:
> + sbom (dict): Input SBOM with vulnerabilities.
> + cve_linux (CVE_LINUX): CVE_LINUX instance for analysis.
> + compiled_files (set): Set of compiled kernel files.
> + """
> + vulnerabilities = sbom.get("vulnerabilities", [])
> +
> + # Process vulnerabilities: mark non-applicable kernel CVEs as not_affected
> + for vuln in vulnerabilities:
> + cve_id = vuln.get("id", "")
> +
> + # Check if this CVE affects "linux" component
> + affects = vuln.get("affects", [])
> + affects_linux = any(a.get("ref") == "linux" for a in affects)
> +
> + if not affects_linux or not cve_id.startswith("CVE-"):
> + # Not a kernel CVE, leave it unchanged
> + continue
> +
> + # Check if kernel CVE is applicable
> + status = cve_linux.affects(cve_id, compiled_files)
> +
> + if status == cvecheck.CVE_LINUX.CVE_NOT_APPLICABLE:
> + # Mark as not_affected since code is not present
> + vuln["analysis"] = {
> + "state": "not_affected",
> + "justification": "code_not_present"
> + }
> +
> + sbom["vulnerabilities"] = vulnerabilities
> +
> +def apply_kernel_cve_filtering(sbom, cve_linux, cc_path):
> + """
> + Apply kernel CVE filtering based on compiled files.
> +
> + Args:
> + sbom (dict): Input SBOM with vulnerabilities
> + cve_linux (CVE_LINUX): CVE_LINUX instance for analysis
> + cc_path (str): Path to compile_commands.json
> + """
> + # Load compiled files
> + compiled_files = cvecheck.CVE_LINUX.load_compiled_files(cc_path)
> + if not compiled_files:
> + print("Error: Failed to load compiled files", file=sys.stderr)
> + return False
> +
> + # Filter kernel CVEs
> + analyze_kernel_cves(sbom, cve_linux, compiled_files)
> + return True
> +
> +
> def main():
> - parser = argparse.ArgumentParser(description=DESCRIPTION)
> + parser = argparse.ArgumentParser(
> + description=DESCRIPTION,
> + formatter_class=argparse.RawDescriptionHelpFormatter
> + )
> parser.add_argument("-i", "--in-file", nargs="?", type=argparse.FileType("r"),
> default=(None if sys.stdin.isatty() else sys.stdin))
> parser.add_argument("-o", "--out-file", nargs="?", type=argparse.FileType("w"),
> @@ -297,8 +378,15 @@ def main():
> parser.add_argument("--include-resolved", default=False, action='store_true',
> help="Add vulnerabilities already 'resolved' that don't affect a " +
> "component to the output CycloneDX vulnerabilities analysis.")
> - parser.add_argument("--no-nvd-update", default=False, action='store_true',
> - help="Doesn't update the NVD database.")
> + parser.add_argument('--cc-path', dest='cc_path', default=None,
> + help="Path to the kernel_compile_commands.json file for CVE filtering. " +
> + "Marks non-compiled kernel CVEs as 'not_affected' with 'code_not_present' justification. " +
> + "Requires --cna-path. " +
> + "Requires BR2_LINUX_KERNEL_COMPILE_COMMANDS to be enabled.")
Why not give `--cna-path` a default such as `default=brpath / 'dl' / 'buildroot-cna'`, similar to `--nvd-path`?
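To illustrate, here is a minimal sketch. It assumes that `brpath` is the Buildroot top-level directory (the value below is only a placeholder) and that `--nvd-path` is declared along the same lines. With such a default in place, passing `--cc-path` alone would be enough to enable the filtering:

```python
import argparse
from pathlib import Path

# Assumption: brpath is the Buildroot top-level directory; placeholder value.
brpath = Path("/path/to/buildroot")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Assumed shape of the existing option, shown only for comparison.
    parser.add_argument("--nvd-path", dest="nvd_path", type=Path,
                        default=brpath / "dl" / "buildroot-nvd")
    # Suggested: follow the same pattern for the CNA database.
    parser.add_argument("--cna-path", dest="cna_path", type=Path,
                        default=brpath / "dl" / "buildroot-cna")
    parser.add_argument("--cc-path", dest="cc_path", type=Path, default=None)
    return parser


# Only --cc-path is given; --cna-path falls back to its default.
args = build_parser().parse_args(["--cc-path", "linux_compile_commands.json"])
print(args.cna_path)
```

The explicit "cna_path is required" check could then go away, since `args.cna_path` would never be `None`.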
> + parser.add_argument('--cna-path', dest='cna_path', default=None,
> + help='Path to CNA database (cvelistV5) for kernel CVE filtering')
> + parser.add_argument("--no-db-update", default=False, action='store_true',
> + help="Doesn't clone/pull the CVE databases (NVD/CNA).")
I would do the rename in a separate commit.
>
> args = parser.parse_args()
>
> @@ -306,21 +394,42 @@ def main():
> parser.print_help()
> sys.exit(1)
>
> + cve_linux = None
> + if args.cc_path:
> + if args.cna_path is None:
> + print("ERROR: cna_path is required when cc_path is specified!", file=sys.stderr)
> + parser.print_help()
> + sys.exit(1)
> + else:
> + # Initialize CVE_LINUX for kernel CVE analysis
> + cve_linux = cvecheck.CVE_LINUX(args.cna_path)
> +
> sbom = json.load(args.in_file)
>
> opt = Options(
> include_resolved=args.include_resolved,
> )
>
> + # Sync NVD database if requested
> args.nvd_path.mkdir(parents=True, exist_ok=True)
> - if not args.no_nvd_update:
> + if not args.no_db_update:
> cvecheck.CVE.download_nvd(args.nvd_path)
>
> + # Sync CNA database if requested
> + if cve_linux and not args.no_db_update:
> + cve_linux.sync_database()
> +
> + # Process
> if args.enrich_only:
> enrich_vulnerabilities(args.nvd_path, sbom)
> else:
> check_package_cves(args.nvd_path, sbom, opt)
>
> + # Apply kernel CVE filtering if enabled
> + if cve_linux:
> + apply_kernel_cve_filtering(sbom, cve_linux, args.cc_path)
> +
> + # write results
> args.out_file.write(json.dumps(sbom, indent=2))
> args.out_file.write('\n')
>
> diff --git a/support/scripts/cve.py b/support/scripts/cve.py
> index 3875c4258c..4aaa978302 100755
> --- a/support/scripts/cve.py
> +++ b/support/scripts/cve.py
> @@ -24,11 +24,14 @@ import json
> import subprocess
> import sys
> import operator
> +from pathlib import Path
> +from typing import Optional, Set, List
>
> sys.path.append('utils/')
>
> NVD_START_YEAR = 1999
> -NVD_BASE_URL = "https://github.com/fkie-cad/nvd-json-data-feeds/"
> +NVD_BASE_URL = "https://github.com/fkie-cad/nvd-json-data-feeds.git"
> +CNA_REPO_URL = "https://github.com/CVEProject/cvelistV5.git"
>
> ops = {
> '>=': operator.ge,
> @@ -39,6 +42,67 @@ ops = {
> }
>
>
> +class CVEDatabase:
> + """Generic class for managing CVE database operations (clone, pull, sync)"""
> +
> + def __init__(self, repo_url: str, db_dir: str):
> + """
> + Initialize CVE database manager.
> +
> + Args:
> + repo_url (str): Git repository URL for the CVE database
> + db_dir (str): Local directory path for the database
> + """
> + self.repo_url = repo_url
> + self.db_dir = db_dir
> +
> + def exists(self) -> bool:
> + """Check if the database directory exists"""
> + return os.path.exists(self.db_dir)
> +
> + def is_git_repo(self) -> bool:
> + """Check if the database directory is a git repository"""
> + return os.path.exists(os.path.join(self.db_dir, ".git"))
> +
> + def sync(self) -> bool:
> + """
> + Clone or update the CVE database from GitHub.
> +
> + If the directory doesn't exist, clone the repository.
> + If it exists and is a git repository, pull the latest changes.
> +
> + Returns:
> + bool: True if successful, False otherwise
> + """
> + try:
> + if self.is_git_repo():
> + # Directory is a git repo, pull latest changes
> + subprocess.run(
> + ["git", "pull"],
> + cwd=self.db_dir,
> + stdout=subprocess.DEVNULL,
> + stderr=subprocess.DEVNULL,
> + check=True,
> + )
> + return True
> + else:
> + # Clone the repository
> + os.makedirs(self.db_dir, exist_ok=True)
> + subprocess.run(
> + ["git", "clone", self.repo_url, self.db_dir],
> + stdout=subprocess.DEVNULL,
> + stderr=subprocess.DEVNULL,
> + check=True,
> + )
> + return True
> + except subprocess.CalledProcessError:
> + print(f"Warning: Failed to sync database from {self.repo_url}", file=sys.stderr)
> + return False
> + except FileNotFoundError:
> + print("Warning: git is not installed or not in PATH", file=sys.stderr)
> + return False
> +
> +
I would introduce this class in a separate commit, so that the commit carrying
the kernel-specific CVE logic stays smaller.
> class CPE:
> DISJOINT = 0
> SUBSET = 1
> @@ -145,24 +209,9 @@ class CVE:
>
> @staticmethod
> def download_nvd(nvd_dir):
> - nvd_git_dir = os.path.join(nvd_dir, "git")
> -
> - if os.path.exists(nvd_git_dir):
> - subprocess.check_call(
> - ["git", "pull"],
> - cwd=nvd_git_dir,
> - stdout=subprocess.DEVNULL,
> - stderr=subprocess.DEVNULL,
> - )
> - else:
> - # Create the directory and its parents; git
> - # happily clones into an empty directory.
> - os.makedirs(nvd_git_dir)
> - subprocess.check_call(
> - ["git", "clone", NVD_BASE_URL, nvd_git_dir],
> - stdout=subprocess.DEVNULL,
> - stderr=subprocess.DEVNULL,
> - )
> + """Download or update the NVD database"""
> + db = CVEDatabase(NVD_BASE_URL, nvd_dir)
> + db.sync()
>
Maybe this is a good occasion to rethink the CVE class and move some of the
static methods present here into some kind of DB base class that `NVDDB` and
`CNADB` inherit from.
I don't know yet whether it's possible (or even makes sense), but I think it
would be useful to be able to compare packages (not only the kernel) against
both the NVD and CNA databases.
I'm thinking about the `read_nvd_dir` method, which should probably share some
code with `load_cve_from_cna`.
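A rough, untested-in-tree sketch of what I have in mind. `CVEDB`, `iter_records()` and `is_valid()` are hypothetical names, and the NVD validity check is an assumption about the feed layout rather than something taken from the existing code:

```python
import json
from pathlib import Path
from typing import Iterator


class CVEDB:
    """Hypothetical base class: owns the repository URL and the on-disk walk."""
    repo_url = ""

    def __init__(self, db_dir):
        self.db_dir = Path(db_dir)

    def iter_records(self) -> Iterator[dict]:
        # Shared by both databases: walk the tree, parse JSON, skip bad files.
        for path in sorted(self.db_dir.rglob("CVE-*.json")):
            try:
                with open(path, encoding="utf-8") as f:
                    data = json.load(f)
            except (json.JSONDecodeError, OSError):
                continue
            if isinstance(data, dict) and self.is_valid(data):
                yield data

    def is_valid(self, data: dict) -> bool:
        raise NotImplementedError


class NVDDB(CVEDB):
    repo_url = "https://github.com/fkie-cad/nvd-json-data-feeds.git"

    def is_valid(self, data: dict) -> bool:
        # Assumption: NVD feed entries carry a top-level "id" field.
        return "id" in data


class CNADB(CVEDB):
    repo_url = "https://github.com/CVEProject/cvelistV5.git"

    def is_valid(self, data: dict) -> bool:
        # Same check as load_cve_from_cna in the patch.
        return data.get("dataType") == "CVE_RECORD"
```

The clone/pull logic from `CVEDatabase.sync()` would then live in the base class as well, and only the record parsing would differ between the two subclasses.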
> @staticmethod
> def sort_id(cve_ids):
> @@ -178,10 +227,8 @@ class CVE:
> feeds since NVD_START_YEAR. If the files are missing or outdated in
> nvd_dir, a fresh copy will be downloaded, and kept in .json.gz
> """
> - nvd_git_dir = os.path.join(nvd_dir, "git")
> -
> for year in range(NVD_START_YEAR, datetime.datetime.now().year + 1):
> - for dirpath, _, filenames in os.walk(os.path.join(nvd_git_dir, f"CVE-{year}")):
> + for dirpath, _, filenames in os.walk(os.path.join(nvd_dir, f"CVE-{year}")):
> for filename in filenames:
> if filename[-5:] != ".json":
> continue
> @@ -340,3 +387,206 @@ class CVE:
> return self.CVE_AFFECTS
>
> return self.CVE_DOESNT_AFFECT
> +
> +
> +class CVE_LINUX:
> + """Specialized class for Linux kernel CVE analysis based on compiled files.
> +
> + Uses the CNA (CVE Numbering Authority) database (cvelistV5) to determine
> + if CVEs affecting the Linux kernel are applicable to the current build
> + based on which files are actually compiled.
> + """
> +
> + CVE_APPLICABLE = 1 # CVE affects compiled files
> + CVE_NOT_APPLICABLE = 2 # CVE doesn't affect any compiled files
> + CVE_INSUFFICIENT_DATA = 3 # CVE not found in CNA or no program files info
These are similar to the existing CVE.CVE_AFFECTS/CVE_DOESNT_AFFECT/CVE_UNKNOWN
categories. It would make sense to find a way to share them.
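One possible way to do it, as a sketch only: `CVEStatus` is a hypothetical name, the two classes below are bare stand-ins for the real ones, and the numeric values for `CVE` are assumed to match the ones used for `CVE_LINUX` in this patch:

```python
from enum import IntEnum


class CVEStatus(IntEnum):
    """Hypothetical shared result type for both matchers."""
    AFFECTS = 1
    DOESNT_AFFECT = 2
    UNKNOWN = 3


class CVE:
    # Existing names kept as aliases so current callers keep working.
    CVE_AFFECTS = CVEStatus.AFFECTS
    CVE_DOESNT_AFFECT = CVEStatus.DOESNT_AFFECT
    CVE_UNKNOWN = CVEStatus.UNKNOWN


class CVE_LINUX:
    # Names from this patch, mapped onto the same shared values.
    CVE_APPLICABLE = CVEStatus.AFFECTS
    CVE_NOT_APPLICABLE = CVEStatus.DOESNT_AFFECT
    CVE_INSUFFICIENT_DATA = CVEStatus.UNKNOWN
```

Since `IntEnum` members compare equal to plain integers, code that currently compares against the numeric constants should not need to change.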
> +
> + def __init__(self, cna_dir: str):
> + """Initialize with path to CNA database (cvelistV5)"""
> + self.cna_dir = cna_dir
> + self.db = CVEDatabase(CNA_REPO_URL, cna_dir)
> +
> + def sync_database(self) -> bool:
> + """Sync (clone or pull) the CNA database from GitHub"""
> + return self.db.sync()
> +
> + @staticmethod
> + def load_compiled_files(compile_commands_path: str) -> Set[str]:
> + """
> + Load compile_commands.json and extract the list of compiled files.
> + Returns a set of relative paths (e.g., 'drivers/net/foo.c').
> + """
> + p = Path(compile_commands_path)
> + if not p.exists():
> + print(f"Error: compile_commands.json not found: {compile_commands_path}", file=sys.stderr)
> + return set()
> +
> + try:
> + with open(p, encoding="utf-8") as f:
> + commands = json.load(f)
> + except (json.JSONDecodeError, OSError) as e:
> + print(f"Error: Failed to load compile_commands.json: {e}", file=sys.stderr)
> + return set()
> +
> + if not isinstance(commands, list):
> + print("Error: Unexpected format in compile_commands.json", file=sys.stderr)
> + return set()
> +
> + # Infer kernel_dir from compile_commands
> + kernel_dir = None
> + if commands:
> + first_dir = Path(commands[0].get("directory", ""))
> + kernel_dir = str(first_dir.resolve())
> +
> + kernel_root = Path(kernel_dir).resolve() if kernel_dir else None
> + compiled_files = set()
> +
> + for entry in commands:
> + file_path = Path(entry.get("file", ""))
> +
> + # Make relative to kernel_root if possible
> + if kernel_root:
> + try:
> + rel_path = file_path.resolve().relative_to(kernel_root)
> + compiled_files.add(str(rel_path))
> + except ValueError:
> + # File outside kernel tree → ignore
> + continue
> + else:
> + # Fallback: use path as-is
> + compiled_files.add(str(file_path))
> +
> + return compiled_files
> +
> + def load_cve_from_cna(self, cve_id: str) -> Optional[dict]:
> + """
> + Load a specific CVE from the CNA directory.
> +
> + Searches for a file named CVE-YYYY-NNNNN.json or similar.
> + Returns a dict with extracted info, or None if not found/invalid.
> + """
> + cna_path = Path(self.cna_dir)
> +
> + # Construct possible file names
> + possible_names = [
> + f"{cve_id}.json",
> + f"{cve_id.replace('CVE-', '')}.json",
> + ]
> +
> + cve_file = None
> + for name in possible_names:
> + candidate = cna_path / name
> + if candidate.exists():
> + cve_file = candidate
> + break
> +
> + # Recursive search as last resort
> + if not cve_file:
> + matches = list(cna_path.rglob(f"{cve_id}.json"))
> + if matches:
> + cve_file = matches[0]
> +
> + if not cve_file:
> + return None
> +
> + try:
> + with open(cve_file, encoding="utf-8") as f:
> + data = json.load(f)
> + except (json.JSONDecodeError, OSError):
> + return None
> +
> + # Verify CNA 5.x format
> + if data.get("dataType") != "CVE_RECORD":
> + return None
> +
> + cna = data.get("containers", {}).get("cna", {})
> + if not cna:
> + return None
> +
> + # Extract programFiles
> + program_files = []
> + for affected in cna.get("affected", []):
> + if affected.get("vendor") == "Linux" and affected.get("product") == "Linux":
> + program_files.extend(affected.get("programFiles", []))
> +
> + # Deduplicate
> + program_files = sorted(set(program_files))
> +
> + return {
> + "cve_id": cve_id,
> + "program_files": program_files,
> + }
> +
> + @staticmethod
> + def match_files(program_files: List[str], compiled_files: Set[str]) -> List[str]:
> + """
> + Return compiled files that match the CVE's programFiles.
> + Uses suffix matching to handle path differences.
> + """
> + matched = []
> +
> + for prog_file in program_files:
> + # Normalize: remove leading slash
> + prog_norm = prog_file.lstrip("/")
> +
> + for compiled in compiled_files:
> + # Exact match
> + if compiled == prog_norm:
> + if compiled not in matched:
> + matched.append(compiled)
> + break
> +
> + return sorted(matched)
> +
> + def affects(self, cve_id: str, compiled_files: Set[str]) -> int:
> + """
> + Determine if a Linux kernel CVE affects the current build.
> +
> + Returns:
> + CVE_APPLICABLE: CVE affects compiled files
> + CVE_NOT_APPLICABLE: CVE doesn't affect any compiled files
> + CVE_INSUFFICIENT_DATA: CVE not found in CNA or no program files info
> + """
> + cve_cna = self.load_cve_from_cna(cve_id)
> +
> + if not cve_cna:
> + # CVE not found in CNA database - keep for review (insufficient data)
> + return self.CVE_INSUFFICIENT_DATA
> +
> + if not cve_cna["program_files"]:
> + # No program files info - keep for review
> + return self.CVE_INSUFFICIENT_DATA
> +
> + # Check if any affected files are compiled
> + matched = self.match_files(cve_cna["program_files"], compiled_files)
> +
> + if len(matched) > 0:
> + return self.CVE_APPLICABLE
> + else:
> + return self.CVE_NOT_APPLICABLE
> +
> + def get_affected_files(self, cve_id: str, compiled_files: Set[str]) -> dict:
> + """
> + Get details about which files are affected by a CVE and which are compiled.
> +
> + Returns a dict with:
> + - cve_id: The CVE identifier
> + - program_files: All affected program files from CNA
> + - matched_compiled: Compiled files that match affected files
> + """
> + cve_cna = self.load_cve_from_cna(cve_id)
> +
> + if not cve_cna:
> + return {
> + "cve_id": cve_id,
> + "program_files": [],
> + "matched_compiled": [],
> + }
> +
> + matched = self.match_files(cve_cna["program_files"], compiled_files)
> +
> + return {
> + "cve_id": cve_id,
> + "program_files": cve_cna["program_files"],
> + "matched_compiled": matched,
> + }
> --
> 2.43.0
>
> _______________________________________________
> buildroot mailing list
> buildroot@buildroot.org
> https://lists.buildroot.org/mailman/listinfo/buildroot