From: Luis <luis.augenstein@tngtech.com>
To: nathan@kernel.org, nsc@kernel.org
Cc: linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, gregkh@linuxfoundation.org,
	kstewart@linuxfoundation.org, maximilian.huber@tngtech.com,
	Luis Augenstein <luis.augenstein@tngtech.com>
Subject: [PATCH v5 08/15] scripts/sbom: add JSON-LD serialization
Date: Fri, 10 Apr 2026 23:22:48 +0200
Message-ID: <20260410212255.9883-9-luis.augenstein@tngtech.com>
In-Reply-To: <20260410212255.9883-1-luis.augenstein@tngtech.com>
From: Luis Augenstein <luis.augenstein@tngtech.com>
Add infrastructure to serialize an SPDX graph as a JSON-LD
document. NamespaceMaps in the SPDX document are converted
to custom prefixes in the @context field of the JSON-LD output.
The SBOM tool uses NamespaceMaps solely to shorten SPDX IDs:
a short prefix stands in for the full namespace URI, so the URI
is not repeated in every ID.
Assisted-by: Cursor:claude-sonnet-4-5
Assisted-by: OpenCode:GLM-4-7
Co-developed-by: Maximilian Huber <maximilian.huber@tngtech.com>
Signed-off-by: Maximilian Huber <maximilian.huber@tngtech.com>
Signed-off-by: Luis Augenstein <luis.augenstein@tngtech.com>
---
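Note: the NamespaceMap to @context conversion described in the commit
message can be sketched roughly as follows. This is only an illustration,
not the code from this series: the NamespaceMap dataclass, build_context(),
compact_id() and the context URL used here are assumptions made for the
example. Longest-namespace-first matching is also an assumption; it is used
because the "source/", "build/" and "output/" namespaces in this patch are
nested below the base namespace.

```python
import json
from dataclasses import dataclass

# Assumed context URL, for illustration only.
SPDX_CONTEXT_URL = "https://spdx.org/rdf/3.0.1/spdx-context.jsonld"


@dataclass
class NamespaceMap:
    """Hypothetical stand-in for the SPDX NamespaceMap class."""

    prefix: str
    namespace: str


def build_context(namespace_maps):
    """Return a JSON-LD @context: the SPDX context plus custom prefixes."""
    prefixes = {m.prefix: m.namespace for m in namespace_maps}
    return [SPDX_CONTEXT_URL, prefixes] if prefixes else [SPDX_CONTEXT_URL]


def compact_id(spdx_id, namespace_maps):
    """Shorten a full SPDX ID to "prefix:local" if a namespace matches."""
    # Try the longest namespace first so nested namespaces take precedence.
    for m in sorted(namespace_maps, key=lambda m: len(m.namespace), reverse=True):
        if spdx_id.startswith(m.namespace):
            return f"{m.prefix}:{spdx_id[len(m.namespace):]}"
    return spdx_id


maps = [
    NamespaceMap(prefix="p", namespace="urn:spdx.dev:1234/"),
    NamespaceMap(prefix="s", namespace="urn:spdx.dev:1234/source/"),
]
document = {
    "@context": build_context(maps),
    "@graph": [{"spdxId": compact_id("urn:spdx.dev:1234/source/file-0", maps)}],
}
print(json.dumps(document, indent=2))
```

A JSON-LD processor expands "s:file-0" back to the full URI by looking up
the "s" prefix in @context, so no information is lost by shortening.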
 Makefile                                      |  3 +-
 scripts/sbom/sbom.py                          | 52 +++++++++++++++++
 scripts/sbom/sbom/config.py                   | 56 +++++++++++++++++++
 scripts/sbom/sbom/spdx_graph/__init__.py      |  7 +++
 .../sbom/sbom/spdx_graph/build_spdx_graphs.py | 36 ++++++++++++
 .../sbom/sbom/spdx_graph/spdx_graph_model.py  | 36 ++++++++++++
 6 files changed, 189 insertions(+), 1 deletion(-)
create mode 100644 scripts/sbom/sbom/spdx_graph/__init__.py
create mode 100644 scripts/sbom/sbom/spdx_graph/build_spdx_graphs.py
create mode 100644 scripts/sbom/sbom/spdx_graph/spdx_graph_model.py
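
Note: the placeholder-UUID scheme used in sbom.py below can be sketched in
isolation like this. It is a minimal illustration under stated assumptions,
not the code of the patch: elements are plain dicts instead of SPDX objects,
content_uuid() and finalize_namespace() are hypothetical helper names, and
sort_keys=True is an addition of this sketch (the patch calls json.dumps()
without it).

```python
import json
import uuid

PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000000"


def content_uuid(elements):
    """Derive a name-based (version 5) UUID from the serialized elements."""
    # sort_keys makes the payload independent of dict insertion order
    # (assumption of this sketch, not taken from the patch).
    payload = "".join(json.dumps(element, sort_keys=True) for element in elements)
    return uuid.uuid5(uuid.NAMESPACE_URL, payload)


def finalize_namespace(namespace, doc_uuid):
    """Swap the placeholder in a namespace URI for the real UUID."""
    return namespace.replace(PLACEHOLDER_UUID, str(doc_uuid))


# IDs are first generated against the placeholder namespace ...
namespace = f"urn:spdx.dev:{PLACEHOLDER_UUID}/source/"
elements = [{"spdxId": "s:file-0", "name": "init/main.c"}]

# ... then the UUID is computed from the content and patched in afterwards.
doc_uuid = content_uuid(elements)
print(finalize_namespace(namespace, doc_uuid))
```

Because uuid5 is a pure function of its input, identical graph content
always yields the same UUID, and any change in the content yields a
different one.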
diff --git a/Makefile b/Makefile
index 394ebd46e82..279e3abd34c 100644
--- a/Makefile
+++ b/Makefile
@@ -2174,7 +2174,8 @@ quiet_cmd_sbom = GEN $(sbom_targets)
--src-tree $(abspath $(srctree)) \
--obj-tree $(abspath $(objtree)) \
--roots-file "$(tmp-target)" \
- --output-directory $(abspath $(objtree));
+ --output-directory $(abspath $(objtree)) \
+ --generate-spdx;
PHONY += sbom
sbom: $(notdir $(KBUILD_IMAGE)) include/generated/autoconf.h $(if $(CONFIG_MODULES),modules modules.order)
$(call cmd,sbom)
diff --git a/scripts/sbom/sbom.py b/scripts/sbom/sbom.py
index 25d912a282d..426521ade46 100644
--- a/scripts/sbom/sbom.py
+++ b/scripts/sbom/sbom.py
@@ -6,13 +6,18 @@
Compute software bill of materials in SPDX format describing a kernel build.
"""
+import json
import logging
import os
import sys
import time
+import uuid
import sbom.sbom_logging as sbom_logging
from sbom.config import get_config
from sbom.path_utils import is_relative_to
+from sbom.spdx import JsonLdSpdxDocument, SpdxIdGenerator
+from sbom.spdx.core import CreationInfo, SpdxDocument
+from sbom.spdx_graph import SpdxIdGeneratorCollection, build_spdx_graphs
from sbom.cmd_graph import CmdGraph
@@ -56,10 +61,57 @@ def main():
f.write("\n".join(str(file_path) for file_path in used_files))
logging.debug(f"Successfully saved {used_files_path}")
+    if config.generate_spdx is False:
+        return
+
+    # Build SPDX Documents
+    logging.debug("Start generating SPDX graph based on cmd graph")
+    start_time = time.time()
+
+    # The real uuid will be generated based on the content of the SPDX graphs
+    # to ensure that the same SPDX document is always assigned the same uuid.
+    PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000000"
+    spdx_id_base_namespace = f"{config.spdxId_prefix}{PLACEHOLDER_UUID}/"
+    spdx_id_generators = SpdxIdGeneratorCollection(
+        base=SpdxIdGenerator(prefix="p", namespace=spdx_id_base_namespace),
+        source=SpdxIdGenerator(prefix="s", namespace=f"{spdx_id_base_namespace}source/"),
+        build=SpdxIdGenerator(prefix="b", namespace=f"{spdx_id_base_namespace}build/"),
+        output=SpdxIdGenerator(prefix="o", namespace=f"{spdx_id_base_namespace}output/"),
+    )
+
+    spdx_graphs = build_spdx_graphs(
+        cmd_graph,
+        spdx_id_generators,
+        config,
+    )
+    spdx_id_uuid = uuid.uuid5(
+        uuid.NAMESPACE_URL,
+        "".join(
+            json.dumps(element.to_dict()) for spdx_graph in spdx_graphs.values() for element in spdx_graph.to_list()
+        ),
+    )
+    logging.debug(f"Generated SPDX graph in {time.time() - start_time} seconds")
+
     # Report collected warnings and errors in case of failure
     warning_summary = sbom_logging.summarize_warnings()
     error_summary = sbom_logging.summarize_errors()
+    if not sbom_logging.has_errors() or config.write_output_on_error:
+        for kernel_sbom_kind, spdx_graph in spdx_graphs.items():
+            spdx_graph_objects = spdx_graph.to_list()
+            # Add warning and error summary to creation info comment
+            creation_info = next(element for element in spdx_graph_objects if isinstance(element, CreationInfo))
+            creation_info.comment = "\n".join([warning_summary, error_summary]).strip()
+            # Replace Placeholder uuid with real uuid for spdxIds
+            spdx_document = next(element for element in spdx_graph_objects if isinstance(element, SpdxDocument))
+            for namespaceMap in spdx_document.namespaceMap:
+                namespaceMap.namespace = namespaceMap.namespace.replace(PLACEHOLDER_UUID, str(spdx_id_uuid))
+            # Serialize SPDX graph to JSON-LD
+            spdx_doc = JsonLdSpdxDocument(graph=spdx_graph_objects)
+            save_path = os.path.join(config.output_directory, config.spdx_file_names[kernel_sbom_kind])
+            spdx_doc.save(save_path, config.prettify_json)
+            logging.debug(f"Successfully saved {save_path}")
+
     if warning_summary:
         logging.warning(warning_summary)
     if error_summary:
diff --git a/scripts/sbom/sbom/config.py b/scripts/sbom/sbom/config.py
index 39e556a4c53..0985457c3ca 100644
--- a/scripts/sbom/sbom/config.py
+++ b/scripts/sbom/sbom/config.py
@@ -3,11 +3,18 @@
import argparse
from dataclasses import dataclass
+from enum import Enum
import os
from typing import Any
from sbom.path_utils import PathStr
+class KernelSpdxDocumentKind(Enum):
+    SOURCE = "source"
+    BUILD = "build"
+    OUTPUT = "output"
+
+
@dataclass
class KernelSbomConfig:
src_tree: PathStr
@@ -19,6 +26,13 @@ class KernelSbomConfig:
     root_paths: list[PathStr]
     """List of paths to root outputs (relative to obj_tree) to base the SBOM on."""
+    generate_spdx: bool
+    """Whether to generate SPDX SBOM documents. If False, no SPDX files are created."""
+
+    spdx_file_names: dict[KernelSpdxDocumentKind, str]
+    """If `generate_spdx` is True, defines the file names for each SPDX SBOM kind
+    (source, build, output) to store on disk."""
+
     generate_used_files: bool
     """Whether to generate a flat list of all source files used in the build.
     If False, no used-files document is created."""
@@ -38,6 +52,12 @@ class KernelSbomConfig:
     write_output_on_error: bool
     """Whether to write output documents even if errors occur."""
+    spdxId_prefix: str
+    """Prefix to use for all SPDX element IDs."""
+
+    prettify_json: bool
+    """Whether to pretty-print generated SPDX JSON documents."""
+
 def _parse_cli_arguments() -> dict[str, Any]:
     """
@@ -72,6 +92,15 @@ def _parse_cli_arguments() -> dict[str, Any]:
         "--roots-file",
         help="Path to a file containing the root paths (one per line). Cannot be used together with --roots.",
     )
+    parser.add_argument(
+        "--generate-spdx",
+        action="store_true",
+        default=False,
+        help=(
+            "Whether to create sbom-source.spdx.json, sbom-build.spdx.json and "
+            "sbom-output.spdx.json documents (default: False)"
+        ),
+    )
     parser.add_argument(
         "--generate-used-files",
         action="store_true",
@@ -119,6 +148,20 @@ def _parse_cli_arguments() -> dict[str, Any]:
         ),
     )
+    # SPDX specific options
+    spdx_group = parser.add_argument_group("SPDX options", "Options for customizing SPDX document generation")
+    spdx_group.add_argument(
+        "--spdxId-prefix",
+        default="urn:spdx.dev:",
+        help="The prefix to use for all spdxId properties. (default: urn:spdx.dev:)",
+    )
+    spdx_group.add_argument(
+        "--prettify-json",
+        action="store_true",
+        default=False,
+        help="Whether to pretty print the generated spdx.json documents (default: False)",
+    )
+
     args = vars(parser.parse_args())
     return args
@@ -144,6 +187,7 @@ def get_config() -> KernelSbomConfig:
root_paths = args["roots"]
_validate_path_arguments(src_tree, obj_tree, root_paths)
+ generate_spdx = args["generate_spdx"]
generate_used_files = args["generate_used_files"]
output_directory = os.path.realpath(args["output_directory"])
debug = args["debug"]
@@ -151,19 +195,31 @@ def get_config() -> KernelSbomConfig:
     fail_on_unknown_build_command = not args["do_not_fail_on_unknown_build_command"]
     write_output_on_error = args["write_output_on_error"]
+    spdxId_prefix = args["spdxId_prefix"]
+    prettify_json = args["prettify_json"]
+
     # Hardcoded config
+    spdx_file_names = {
+        KernelSpdxDocumentKind.SOURCE: "sbom-source.spdx.json",
+        KernelSpdxDocumentKind.BUILD: "sbom-build.spdx.json",
+        KernelSpdxDocumentKind.OUTPUT: "sbom-output.spdx.json",
+    }
     used_files_file_name = "sbom.used-files.txt"
     return KernelSbomConfig(
         src_tree=src_tree,
         obj_tree=obj_tree,
         root_paths=root_paths,
+        generate_spdx=generate_spdx,
+        spdx_file_names=spdx_file_names,
         generate_used_files=generate_used_files,
         used_files_file_name=used_files_file_name,
         output_directory=output_directory,
         debug=debug,
         fail_on_unknown_build_command=fail_on_unknown_build_command,
         write_output_on_error=write_output_on_error,
+        spdxId_prefix=spdxId_prefix,
+        prettify_json=prettify_json,
     )
diff --git a/scripts/sbom/sbom/spdx_graph/__init__.py b/scripts/sbom/sbom/spdx_graph/__init__.py
new file mode 100644
index 00000000000..3557b1d51bf
--- /dev/null
+++ b/scripts/sbom/sbom/spdx_graph/__init__.py
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+from .build_spdx_graphs import build_spdx_graphs
+from .spdx_graph_model import SpdxIdGeneratorCollection
+
+__all__ = ["build_spdx_graphs", "SpdxIdGeneratorCollection"]
diff --git a/scripts/sbom/sbom/spdx_graph/build_spdx_graphs.py b/scripts/sbom/sbom/spdx_graph/build_spdx_graphs.py
new file mode 100644
index 00000000000..bb3db4e423d
--- /dev/null
+++ b/scripts/sbom/sbom/spdx_graph/build_spdx_graphs.py
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+
+from typing import Protocol
+
+from sbom.config import KernelSpdxDocumentKind
+from sbom.cmd_graph import CmdGraph
+from sbom.path_utils import PathStr
+from sbom.spdx_graph.spdx_graph_model import SpdxGraph, SpdxIdGeneratorCollection
+
+
+class SpdxGraphConfig(Protocol):
+    obj_tree: PathStr
+    src_tree: PathStr
+
+
+def build_spdx_graphs(
+    cmd_graph: CmdGraph,
+    spdx_id_generators: SpdxIdGeneratorCollection,
+    config: SpdxGraphConfig,
+) -> dict[KernelSpdxDocumentKind, SpdxGraph]:
+    """
+    Builds SPDX graphs (output, source, and build) based on a cmd dependency graph.
+    If the source and object trees are identical, no dedicated source graph can be created.
+    In that case the source files are added to the build graph instead.
+
+    Args:
+        cmd_graph: The dependency graph of a kernel build.
+        spdx_id_generators: Collection of SPDX ID generators.
+        config: Configuration options.
+
+    Returns:
+        Dictionary of SPDX graphs
+    """
+    return {}
diff --git a/scripts/sbom/sbom/spdx_graph/spdx_graph_model.py b/scripts/sbom/sbom/spdx_graph/spdx_graph_model.py
new file mode 100644
index 00000000000..682194d4362
--- /dev/null
+++ b/scripts/sbom/sbom/spdx_graph/spdx_graph_model.py
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0-only OR MIT
+# Copyright (C) 2025 TNG Technology Consulting GmbH
+
+from dataclasses import dataclass
+from sbom.spdx.core import CreationInfo, SoftwareAgent, SpdxDocument, SpdxObject
+from sbom.spdx.software import Sbom
+from sbom.spdx.spdxId import SpdxIdGenerator
+
+
+@dataclass
+class SpdxGraph:
+    """Represents the complete graph of a single SPDX document."""
+
+    spdx_document: SpdxDocument
+    agent: SoftwareAgent
+    creation_info: CreationInfo
+    sbom: Sbom
+
+    def to_list(self) -> list[SpdxObject]:
+        return [
+            self.spdx_document,
+            self.agent,
+            self.creation_info,
+            self.sbom,
+            *self.sbom.element,
+        ]
+
+
+@dataclass
+class SpdxIdGeneratorCollection:
+    """Holds SPDX ID generators for different document types to ensure globally unique SPDX IDs."""
+
+    base: SpdxIdGenerator
+    source: SpdxIdGenerator
+    build: SpdxIdGenerator
+    output: SpdxIdGenerator
--
2.43.0