[PATCH 00/24] better handle media headers

linux-doc.vger.kernel.org archive mirror

* [PATCH 00/24] better handle media headers
@ 2025-08-21 14:21 Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 01/24] docs: parse-headers.pl: improve its debug output format Mauro Carvalho Chehab
                   ` (24 more replies)
  0 siblings, 25 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel, Mauro Carvalho Chehab,
	Sean Young, linux-media

Hi Jon,

Sorry for the big series. I wanted it to be smaller, but this
is still only the first half of the story. I have a pile of other
patches on top of this one to be sent, part of them to media.

This series comes after:
	https://lore.kernel.org/linux-doc/cover.1755258303.git.mchehab+huawei@kernel.org/

Its goal is to drop one of the most ancient and ugliest hacks from
the documentation build system.

Before migrating to Sphinx, the media subsystem already had
a very comprehensive uAPI book, together with a build-time
system to detect and point out any documentation gaps.

When migrating to Sphinx, we ported the logic to a Perl script
(parse-headers.pl) and Markus came up with a Sphinx extension
(kernel_include.py). We also added some files to control how
parse-headers produces results, and a Makefile.

With the initial Sphinx versions (1.4.1 if I recall correctly), when
a new symbol was added to videodev2.h, a new warning was
produced at documentation build time if the patchset didn't have
the corresponding documentation patch.

While kernel-include is generic, the only user at the moment
is the media subsystem.

This series gets rid of the Perl script, replacing it with a
Python command-line script and a class. The parse header class
can optionally be used by kernel-include to produce
enriched code that contains cross-references.
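
As a rough illustration of what "enriched code with cross-references"
means here, the sketch below classifies a #define line as an ioctl or a
plain define and builds the matching Sphinx reference. The two regexes
and the reference naming mirror the ones in patch 02 of this series; the
cross_reference() helper itself is hypothetical and is not part of the
patches.

```python
import re

# Regexes mirroring the ones used by parse-headers.py in patch 02
RE_IOCTL = re.compile(r"^\s*#\s*define\s+([\w_]+)\s+_IO")
RE_DEFINE = re.compile(r"^\s*#\s*define\s+([\w_]+)(\s+|$)")


def cross_reference(line):
    """Return a (type, reference) tuple for a #define line, or None.

    Hypothetical helper, written only to illustrate the idea.
    """
    match = RE_IOCTL.match(line)
    if match:
        symbol = match.group(1)
        # ioctl reference names keep their underscores
        return "ioctl", f":ref:`{symbol} <{symbol.lower()}>`"

    match = RE_DEFINE.match(line)
    if match:
        symbol = match.group(1)
        # other defines use "-" instead of "_" in the reference name
        ref_name = symbol.lower().replace("_", "-")
        return "define", f":ref:`{symbol} <{ref_name}>`"

    return None
```

The ioctl check has to run first, since an ioctl definition also matches
the generic #define pattern.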

As with the other conversions, it starts with a bug-compatible
version of parse-headers, but the subsequent patches
add more functionality and fix bugs.

It should be noted that modern versions of Sphinx disabled the
cross-reference warnings. So, in the next series, I'll be
re-adding them in a controlled way (e.g. just for the
references from kernel-include that have a special
argument).

The script now also supports generating a "toc" output,
which will be used in the next series.

Mauro Carvalho Chehab (24):
  docs: parse-headers.pl: improve its debug output format
  docs: parse-headers.py: convert parse-headers.pl
  docs: parse-headers.py: improve --help logic
  docs: parse-headers.py: better handle @var arguments
  docs: parse-headers.py: simplify the rules for hashes
  tools: docs: parse-headers.py: move it from sphinx dir
  tools: docs: parse_data_structs.py: add methods to return output
  MAINTAINERS: add files from tools/docs to documentation entry
  docs: uapi: media: Makefile: use parse-headers.py
  docs: : Update its coding style
  docs: kernel_include.py: allow cross-reference generation
  docs: kernel_include.py: generate warnings for broken refs
  docs: kernel_include.py: move rawtext logic to separate functions
  docs: kernel_include.py: move range logic to a separate function
  docs: kernel_include.py: remove range restriction for gen docs
  docs: kernel_include.py: move code and literal functions
  docs: kernel_include.py: add support to generate a TOC table
  docs: kernel_include.py: append line numbers to better report errors
  docs: kernel_include.py: move apply_range() and add a docstring
  docs: kernel_include.py: remove line numbers from parsed-literal
  docs: kernel_include.py: remove Include class inheritance
  docs: kernel_include.py: document all supported parameters
  scripts: sphinx-build-wrapper: get rid of uapi/media Makefile
  docs: sphinx: drop parse-headers.pl

 .pylintrc                                     |   2 +-
 Documentation/sphinx/kernel_include.py        | 519 +++++++++++++-----
 Documentation/sphinx/parse-headers.pl         | 404 --------------
 Documentation/userspace-api/media/Makefile    |  64 ---
 .../userspace-api/media/cec/cec-header.rst    |   5 +-
 .../media/{ => cec}/cec.h.rst.exceptions      |   0
 .../media/{ => dvb}/ca.h.rst.exceptions       |   0
 .../media/{ => dvb}/dmx.h.rst.exceptions      |   0
 .../media/{ => dvb}/frontend.h.rst.exceptions |   0
 .../userspace-api/media/dvb/headers.rst       |  17 +-
 .../media/{ => dvb}/net.h.rst.exceptions      |   0
 .../media/mediactl/media-header.rst           |   5 +-
 .../{ => mediactl}/media.h.rst.exceptions     |   0
 .../userspace-api/media/rc/lirc-header.rst    |   4 +-
 .../media/{ => rc}/lirc.h.rst.exceptions      |   0
 .../userspace-api/media/v4l/videodev.rst      |   4 +-
 .../{ => v4l}/videodev2.h.rst.exceptions      |   0
 MAINTAINERS                                   |   1 +
 scripts/sphinx-build-wrapper                  |  48 --
 tools/docs/lib/__init__.py                    |   0
 tools/docs/lib/enrich_formatter.py            |  70 +++
 tools/docs/lib/parse_data_structs.py          | 452 +++++++++++++++
 tools/docs/parse-headers.py                   |  60 ++
 23 files changed, 984 insertions(+), 671 deletions(-)
 delete mode 100755 Documentation/sphinx/parse-headers.pl
 delete mode 100644 Documentation/userspace-api/media/Makefile
 rename Documentation/userspace-api/media/{ => cec}/cec.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/ca.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/dmx.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/frontend.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/net.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => mediactl}/media.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => rc}/lirc.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => v4l}/videodev2.h.rst.exceptions (100%)
 create mode 100644 tools/docs/lib/__init__.py
 create mode 100644 tools/docs/lib/enrich_formatter.py
 create mode 100755 tools/docs/lib/parse_data_structs.py
 create mode 100755 tools/docs/parse-headers.py

-- 
2.50.1




* [PATCH 01/24] docs: parse-headers.pl: improve its debug output format
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 02/24] docs: parse-headers.py: convert parse-headers.pl Mauro Carvalho Chehab
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel

Change the --debug logic to make it easier to compare its results
with a new Python script.
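
For reference, the output format this patch switches to (one group per
symbol type, keys sorted, "key -> value" per line, empty groups skipped)
can be sketched in Python as below. The format_debug() name and the
dict-of-dicts input are assumptions made for illustration only.

```python
def format_debug(symbols):
    """Group symbols per type and list them sorted, one per line.

    'symbols' is assumed to map a type name ("ioctl", "define", ...)
    to a dict of {symbol: reference}.
    """
    out = []
    for c_type, refs in symbols.items():
        if not refs:  # skip empty groups, as the Perl code does
            continue
        out.append(f"{c_type}:")
        for symbol, ref in sorted(refs.items()):
            out.append(f"  {symbol} -> {ref}")
        out.append("")  # blank line between groups
    return "\n".join(out)
```

Since both scripts would emit the same sorted text, a plain diff of the
two debug outputs should be enough to spot conversion regressions.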

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/parse-headers.pl | 31 ++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/Documentation/sphinx/parse-headers.pl b/Documentation/sphinx/parse-headers.pl
index 7b1458544e2e..560685926cdb 100755
--- a/Documentation/sphinx/parse-headers.pl
+++ b/Documentation/sphinx/parse-headers.pl
@@ -31,8 +31,6 @@ my %enums;
 my %enum_symbols;
 my %structs;
 
-require Data::Dumper if ($debug);
-
 #
 # read the file and get identifiers
 #
@@ -197,6 +195,9 @@ if ($file_exceptions) {
 		} else {
 			$reftype = $def_reftype{$type};
 		}
+		if (!$reftype) {
+		    print STDERR "Warning: can't find ref type for $type";
+		}
 		$new = "$reftype:`$old <$new>`";
 
 		if ($type eq "ioctl") {
@@ -229,12 +230,26 @@ if ($file_exceptions) {
 }
 
 if ($debug) {
-	print Data::Dumper->Dump([\%ioctls], [qw(*ioctls)]) if (%ioctls);
-	print Data::Dumper->Dump([\%typedefs], [qw(*typedefs)]) if (%typedefs);
-	print Data::Dumper->Dump([\%enums], [qw(*enums)]) if (%enums);
-	print Data::Dumper->Dump([\%structs], [qw(*structs)]) if (%structs);
-	print Data::Dumper->Dump([\%defines], [qw(*defines)]) if (%defines);
-	print Data::Dumper->Dump([\%enum_symbols], [qw(*enum_symbols)]) if (%enum_symbols);
+	my @all_hashes = (
+		{ioctl      => \%ioctls},
+		{typedef    => \%typedefs},
+		{enum       => \%enums},
+		{struct     => \%structs},
+		{define     => \%defines},
+		{symbol     => \%enum_symbols}
+	);
+
+	foreach my $hash (@all_hashes) {
+		while (my ($name, $hash_ref) = each %$hash) {
+			next unless %$hash_ref;  # Skip empty hashes
+
+			print "$name:\n";
+			for my $key (sort keys %$hash_ref) {
+				print "  $key -> $hash_ref->{$key}\n";
+			}
+			print "\n";
+		}
+	}
 }
 
 #
-- 
2.50.1



* [PATCH 02/24] docs: parse-headers.py: convert parse-headers.pl
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 01/24] docs: parse-headers.pl: improve its debug output format Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 03/24] docs: parse-headers.py: improve --help logic Mauro Carvalho Chehab
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

When the Kernel started to use Sphinx, we had to come up with
a solution to parse media headers. At that time, we didn't have
much experience with Sphinx extensions. So, we came up with our
own script-based solution that basically implemented a
set of rules we used to have in the Makefile.

Convert it to Python, keeping it bug-compatible with the
original script.

While here, try to better document it.
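
As a minimal sketch of the exception-rule syntax handled by the script
("ignore <type> <symbol>" and "replace <type> <old> <new>"), the
snippet below parses a single rule line. The regexes follow the ones in
the patch; parse_rule() and its tuple return values are hypothetical,
and it raises ValueError where the script calls sys.exit().

```python
import re

VALID_TYPES = {"ioctl", "define", "symbol", "typedef", "enum", "struct"}


def parse_rule(line):
    """Parse one rule line; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None

    # ignore <type> <symbol>
    match = re.match(r"^ignore\s+(\w+)\s+(\S+)", line)
    if match:
        c_type, symbol = match.groups()
        if c_type not in VALID_TYPES:
            raise ValueError(f"{c_type} is invalid")
        return ("ignore", c_type, symbol)

    # replace <type> <old_symbol> <new_reference>
    match = re.match(r"^replace\s+(\S+)\s+(\S+)\s+(\S+)", line)
    if not match:
        raise ValueError(f"invalid line: {line}")

    c_type, old, new = match.groups()
    if c_type not in VALID_TYPES:
        raise ValueError(f"{c_type} is invalid")
    return ("replace", c_type, old, new)
```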

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/parse-headers.py | 429 ++++++++++++++++++++++++++
 1 file changed, 429 insertions(+)
 create mode 100755 Documentation/sphinx/parse-headers.py

diff --git a/Documentation/sphinx/parse-headers.py b/Documentation/sphinx/parse-headers.py
new file mode 100755
index 000000000000..b39284d21090
--- /dev/null
+++ b/Documentation/sphinx/parse-headers.py
@@ -0,0 +1,429 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>.
+# pylint: disable=C0103,R0902,R0912,R0914,R0915
+
+"""
+Convert a C header or source file (C_FILE), into a ReStructured Text
+included via ..parsed-literal block with cross-references for the
+documentation files that describe the API. It accepts an optional
+EXCEPTIONS_FILE with describes what elements will be either ignored or
+be pointed to a non-default reference.
+
+The output is written at the (OUT_FILE).
+
+It is capable of identifying defines, functions, structs, typedefs,
+enums and enum symbols and create cross-references for all of them.
+It is also capable of distinguish #define used for specifying a Linux
+ioctl.
+
+The EXCEPTIONS_FILE contains a set of rules like:
+
+    ignore ioctl VIDIOC_ENUM_FMT
+    replace ioctl VIDIOC_DQBUF vidioc_qbuf
+    replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det`
+"""
+
+import argparse
+import os
+import re
+import sys
+
+
+class ParseHeader:
+    """
+    Creates an enriched version of a Kernel header file with cross-links
+    to each C data structure type.
+
+    It is meant to allow having a more comprehensive documentation, where
+    uAPI headers will create cross-reference links to the code.
+
+    It is capable of identifying defines, functions, structs, typedefs,
+    enums and enum symbols and create cross-references for all of them.
+    It is also capable of distinguish #define used for specifying a Linux
+    ioctl.
+
+    By default, it create rules for all symbols and defines, but it also
+    allows parsing an exception file. Such file contains a set of rules
+    using the syntax below:
+
+    1. Ignore rules:
+
+        ignore <type> <symbol>`
+
+    Removes the symbol from reference generation.
+
+    2. Replace rules:
+
+        replace <type> <old_symbol> <new_reference>
+
+    Replaces how old_symbol with a new reference. The new_reference can be:
+        - A simple symbol name;
+        - A full Sphinx reference.
+
+    On both cases, <type> can be:
+        - ioctl: for defines that end with _IO*, e.g. ioctl definitions
+        - define: for other defines
+        - symbol: for symbols defined within enums;
+        - typedef: for typedefs;
+        - enum: for the name of a non-anonymous enum;
+        - struct: for structs.
+
+    Examples:
+
+        ignore define __LINUX_MEDIA_H
+        ignore ioctl VIDIOC_ENUM_FMT
+        replace ioctl VIDIOC_DQBUF vidioc_qbuf
+        replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det`
+    """
+
+    # Parser regexes with multiple ways to capture enums and structs
+    RE_ENUMS = [
+        re.compile(r"^\s*enum\s+([\w_]+)\s*\{"),
+        re.compile(r"^\s*enum\s+([\w_]+)\s*$"),
+        re.compile(r"^\s*typedef\s*enum\s+([\w_]+)\s*\{"),
+        re.compile(r"^\s*typedef\s*enum\s+([\w_]+)\s*$"),
+    ]
+    RE_STRUCTS = [
+        re.compile(r"^\s*struct\s+([_\w][\w\d_]+)\s*\{"),
+        re.compile(r"^\s*struct\s+([_\w][\w\d_]+)$"),
+        re.compile(r"^\s*typedef\s*struct\s+([_\w][\w\d_]+)\s*\{"),
+        re.compile(r"^\s*typedef\s*struct\s+([_\w][\w\d_]+)$"),
+    ]
+
+    # FIXME: the original code was written a long time before Sphinx C
+    # domain to have multiple namespaces. To avoid to much turn at the
+    # existing hyperlinks, the code kept using "c:type" instead of the
+    # right types. To change that, we need to change the types not only
+    # here, but also at the uAPI media documentation.
+    DEF_SYMBOL_TYPES = {
+        "ioctl": {
+            "prefix": "\\ ",
+            "suffix": "\\ ",
+            "ref_type": ":ref",
+        },
+        "define": {
+            "prefix": "\\ ",
+            "suffix": "\\ ",
+            "ref_type": ":ref",
+        },
+        # We're calling each definition inside an enum as "symbol"
+        "symbol": {
+            "prefix": "\\ ",
+            "suffix": "\\ ",
+            "ref_type": ":ref",
+        },
+        "typedef": {
+            "prefix": "\\ ",
+            "suffix": "\\ ",
+            "ref_type": ":c:type",
+        },
+        # This is the name of the enum itself
+        "enum": {
+            "prefix": "",
+            "suffix": "\\ ",
+            "ref_type": ":c:type",
+        },
+        "struct": {
+            "prefix": "",
+            "suffix": "\\ ",
+            "ref_type": ":c:type",
+        },
+    }
+
+    def __init__(self, debug: bool = False):
+        """Initialize internal vars"""
+        self.debug = debug
+        self.data = ""
+
+        self.symbols = {}
+
+        for symbol_type in self.DEF_SYMBOL_TYPES:
+            self.symbols[symbol_type] = {}
+
+    def store_type(self, symbol_type: str, symbol: str,
+                   ref_name: str = None, replace_underscores: bool = True):
+        """
+        Stores a new symbol at self.symbols under symbol_type.
+
+        By default, underscores are replaced by "-"
+        """
+        defs = self.DEF_SYMBOL_TYPES[symbol_type]
+
+        prefix = defs.get("prefix", "")
+        suffix = defs.get("suffix", "")
+        ref_type = defs.get("ref_type")
+
+        # Determine ref_link based on symbol type
+        if ref_type:
+            if symbol_type == "enum":
+                ref_link = f"{ref_type}:`{symbol}`"
+            else:
+                if not ref_name:
+                    ref_name = symbol.lower()
+
+                if replace_underscores:
+                    ref_name = ref_name.replace("_", "-")
+
+                ref_link = f"{ref_type}:`{symbol} <{ref_name}>`"
+        else:
+            ref_link = symbol
+
+        self.symbols[symbol_type][symbol] = f"{prefix}{ref_link}{suffix}"
+
+    def store_line(self, line):
+        """Stores a line at self.data, properly indented"""
+        line = "    " + line.expandtabs()
+        self.data += line.rstrip(" ")
+
+    def parse_file(self, file_in: str):
+        """Reads a C source file and get identifiers"""
+        self.data = ""
+        is_enum = False
+        is_comment = False
+        multiline = ""
+
+        with open(file_in, "r",
+                  encoding="utf-8", errors="backslashreplace") as f:
+            for line_no, line in enumerate(f):
+                self.store_line(line)
+                line = line.strip("\n")
+
+                # Handle continuation lines
+                if line.endswith(r"\\"):
+                    multiline += line[-1]
+                    continue
+
+                if multiline:
+                    line = multiline + line
+                    multiline = ""
+
+                # Handle comments. They can be multilined
+                if not is_comment:
+                    if re.search(r"/\*.*", line):
+                        is_comment = True
+                    else:
+                        # Strip C99-style comments
+                        line = re.sub(r"(//.*)", "", line)
+
+                if is_comment:
+                    if re.search(r".*\*/", line):
+                        is_comment = False
+                    else:
+                        multiline = line
+                        continue
+
+                # At this point, line variable may be a multilined statement,
+                # if lines end with \ or if they have multi-line comments
+                # With that, it can safely remove the entire comments,
+                # and there's no need to use re.DOTALL for the logic below
+
+                line = re.sub(r"(/\*.*\*/)", "", line)
+                if not line.strip():
+                    continue
+
+                # It can be useful for debug purposes to print the file after
+                # having comments stripped and multi-lines grouped.
+                if self.debug > 1:
+                    print(f"line {line_no + 1}: {line}")
+
+                # Now the fun begins: parse each type and store it.
+
+                # We opted for a two parsing logic here due to:
+                # 1. it makes easier to debug issues not-parsed symbols;
+                # 2. we want symbol replacement at the entire content, not
+                #    just when the symbol is detected.
+
+                if is_enum:
+                    match = re.match(r"^\s*([_\w][\w\d_]+)\s*[\,=]?", line)
+                    if match:
+                        self.store_type("symbol", match.group(1))
+                    if "}" in line:
+                        is_enum = False
+                    continue
+
+                match = re.match(r"^\s*#\s*define\s+([\w_]+)\s+_IO", line)
+                if match:
+                    self.store_type("ioctl", match.group(1),
+                                    replace_underscores=False)
+                    continue
+
+                match = re.match(r"^\s*#\s*define\s+([\w_]+)(\s+|$)", line)
+                if match:
+                    self.store_type("define", match.group(1))
+                    continue
+
+                match = re.match(r"^\s*typedef\s+([_\w][\w\d_]+)\s+(.*)\s+([_\w][\w\d_]+);",
+                                 line)
+                if match:
+                    name = match.group(2).strip()
+                    symbol = match.group(3)
+                    self.store_type("typedef", symbol, ref_name=name,
+                                    replace_underscores=False)
+                    continue
+
+                for re_enum in self.RE_ENUMS:
+                    match = re_enum.match(line)
+                    if match:
+                        self.store_type("enum", match.group(1))
+                        is_enum = True
+                        break
+
+                for re_struct in self.RE_STRUCTS:
+                    match = re_struct.match(line)
+                    if match:
+                        self.store_type("struct", match.group(1),
+                                        replace_underscores=False)
+                        break
+
+    def process_exceptions(self, fname: str):
+        """
+        Process exceptions file with rules to ignore or replace references.
+        """
+        if not fname:
+            return
+
+        name = os.path.basename(fname)
+
+        with open(fname, "r", encoding="utf-8", errors="backslashreplace") as f:
+            for ln, line in enumerate(f):
+                ln += 1
+                line = line.strip()
+                if not line or line.startswith("#"):
+                    continue
+
+                # Handle ignore rules
+                match = re.match(r"^ignore\s+(\w+)\s+(\S+)", line)
+                if match:
+                    c_type = match.group(1)
+                    symbol = match.group(2)
+
+                    if c_type not in self.DEF_SYMBOL_TYPES:
+                        sys.exit(f"{name}:{ln}: {c_type} is invalid")
+
+                    d = self.symbols[c_type]
+                    if symbol in d:
+                        del d[symbol]
+
+                    continue
+
+                # Handle replace rules
+                match = re.match(r"^replace\s+(\S+)\s+(\S+)\s+(\S+)", line)
+                if not match:
+                    sys.exit(f"{name}:{ln}: invalid line: {line}")
+
+                c_type, old, new = match.groups()
+
+                if c_type not in self.DEF_SYMBOL_TYPES:
+                    sys.exit(f"{name}:{ln}: {c_type} is invalid")
+
+                reftype = None
+
+                # Parse reference type when the type is specified
+
+                match = re.match(r"^\:c\:(data|func|macro|type)\:\`(.+)\`", new)
+                if match:
+                    reftype = f":c:{match.group(1)}"
+                    new = match.group(2)
+                else:
+                    match = re.search(r"(\:ref)\:\`(.+)\`", new)
+                    if match:
+                        reftype = match.group(1)
+                        new = match.group(2)
+
+                # If the replacement rule doesn't have a type, get default
+                if not reftype:
+                    reftype = self.DEF_SYMBOL_TYPES[c_type].get("ref_type")
+                    if not reftype:
+                        reftype = self.DEF_SYMBOL_TYPES[c_type].get("real_type")
+
+                new_ref = f"{reftype}:`{old} <{new}>`"
+
+                # Change self.symbols to use the replacement rule
+                if old in self.symbols[c_type]:
+                    self.symbols[c_type][old] = new_ref
+                else:
+                    print(f"{name}:{ln}: Warning: can't find {old} {c_type}")
+
+    def debug_print(self):
+        """
+        Print debug information containing the replacement rules per symbol.
+        To make easier to check, group them per type.
+        """
+        if not self.debug:
+            return
+
+        for c_type, refs in self.symbols.items():
+            if not refs:  # Skip empty dictionaries
+                continue
+
+            print(f"{c_type}:")
+
+            for symbol, ref in sorted(refs.items()):
+                print(f"  {symbol} -> {ref}")
+
+            print()
+
+    def write_output(self, file_in: str, file_out: str):
+        """Write the formatted output to a file."""
+
+        # Avoid extra blank lines
+        text = re.sub(r"\s+$", "", self.data) + "\n"
+        text = re.sub(r"\n\s+\n", "\n\n", text)
+
+        # Escape Sphinx special characters
+        text = re.sub(r"([\_\`\*\<\>\&\\\\:\/\|\%\$\#\{\}\~\^])", r"\\\1", text)
+
+        # Source uAPI files may have special notes. Use bold font for them
+        text = re.sub(r"DEPRECATED", "**DEPRECATED**", text)
+
+        # Delimiters to catch the entire symbol after escaped
+        start_delim = r"([ \n\t\(=\*\@])"
+        end_delim = r"(\s|,|\\=|\\:|\;|\)|\}|\{)"
+
+        # Process all reference types
+        for ref_dict in self.symbols.values():
+            for symbol, replacement in ref_dict.items():
+                symbol = re.escape(re.sub(r"([\_\`\*\<\>\&\\\\:\/])", r"\\\1", symbol))
+                text = re.sub(fr'{start_delim}{symbol}{end_delim}',
+                              fr'\1{replacement}\2', text)
+
+        # Remove "\ " where not needed: before spaces and at the end of lines
+        text = re.sub(r"\\ ([\n ])", r"\1", text)
+
+        title = os.path.basename(file_in)
+
+        with open(file_out, "w", encoding="utf-8", errors="backslashreplace") as f:
+            f.write(".. -*- coding: utf-8; mode: rst -*-\n\n")
+            f.write(f"{title}\n")
+            f.write("=" * len(title))
+            f.write("\n\n.. parsed-literal::\n\n")
+            f.write(text)
+
+
+def main():
+    """Main function"""
+    parser = argparse.ArgumentParser(description=__doc__,
+                                     formatter_class=argparse.RawDescriptionHelpFormatter)
+
+    parser.add_argument("-d", "--debug", action="count", default=0,
+                        help="Increase debug level. Can be used multiple times")
+    parser.add_argument("file_in", help="Input C file")
+    parser.add_argument("file_out", help="Output RST file")
+    parser.add_argument("file_exceptions", nargs="?",
+                        help="Exceptions file (optional)")
+
+    args = parser.parse_args()
+
+    parser = ParseHeader(debug=args.debug)
+    parser.parse_file(args.file_in)
+
+    if args.file_exceptions:
+        parser.process_exceptions(args.file_exceptions)
+
+    parser.debug_print()
+    parser.write_output(args.file_in, args.file_out)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.50.1



* [PATCH 03/24] docs: parse-headers.py: improve --help logic
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 01/24] docs: parse-headers.pl: improve its debug output format Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 02/24] docs: parse-headers.py: convert parse-headers.pl Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 04/24] docs: parse-headers.py: better handle @var arguments Mauro Carvalho Chehab
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

When printing --help, we'd like the names of the files
from __doc__ to match the displayed positional arguments in
both the usage and argument description lines.

Use a custom formatter class to convert ``foo`` into ANSI SGR
codes that make the argument bold when the output is a TTY, and
adjust the help text to match the argument names.

Here on Plasma, that makes it display in color, which is
really cool. Still, I opted for SGR, as it is best to follow
the terminal color scheme for bold.
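
The markup conversion can be tried on its own with a sketch like the
one below. It uses the same regex idea as the patch; the standalone
enrich_text() function and its explicit "tty" parameter are assumptions
added here so the behaviour can be checked without a terminal.

```python
import re
import sys


def enrich_text(text, tty=None):
    """Replace ``foo`` with ANSI SGR bold when writing to a terminal."""
    if tty is None:
        tty = sys.stdout.isatty()
    if not tty or not text:
        return text

    # SGR 1 turns bold on; SGR 0 resets all attributes
    return re.sub(r"``(.+?)``",
                  lambda m: f"\033[1m{m.group(1)}\033[0m", text)
```

When the output is piped or redirected, the text is returned untouched,
so the ``foo`` markers stay visible instead of raw escape codes.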

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/parse-headers.py | 67 +++++++++++++++++++++++----
 1 file changed, 58 insertions(+), 9 deletions(-)

diff --git a/Documentation/sphinx/parse-headers.py b/Documentation/sphinx/parse-headers.py
index b39284d21090..650f9c9a68d1 100755
--- a/Documentation/sphinx/parse-headers.py
+++ b/Documentation/sphinx/parse-headers.py
@@ -4,20 +4,20 @@
 # pylint: disable=C0103,R0902,R0912,R0914,R0915
 
 """
-Convert a C header or source file (C_FILE), into a ReStructured Text
+Convert a C header or source file ``FILE_IN``, into a ReStructured Text
 included via ..parsed-literal block with cross-references for the
 documentation files that describe the API. It accepts an optional
-EXCEPTIONS_FILE with describes what elements will be either ignored or
-be pointed to a non-default reference.
+``FILE_RULES`` file to describes what elements will be either ignored or
+be pointed to a non-default reference type/name.
 
-The output is written at the (OUT_FILE).
+The output is written at ``FILE_OUT``.
 
 It is capable of identifying defines, functions, structs, typedefs,
 enums and enum symbols and create cross-references for all of them.
 It is also capable of distinguish #define used for specifying a Linux
 ioctl.
 
-The EXCEPTIONS_FILE contains a set of rules like:
+The optional ``FILE_RULES`` contains a set of rules like:
 
     ignore ioctl VIDIOC_ENUM_FMT
     replace ioctl VIDIOC_DQBUF vidioc_qbuf
@@ -400,17 +400,66 @@ class ParseHeader:
             f.write("\n\n.. parsed-literal::\n\n")
             f.write(text)
 
+class EnrichFormatter(argparse.HelpFormatter):
+    """
+    Better format the output, making easier to identify the positional args
+    and how they're used at the __doc__ description.
+    """
+    def __init__(self, *args, **kwargs):
+        """Initialize class and check if is TTY"""
+        super().__init__(*args, **kwargs)
+        self._tty = sys.stdout.isatty()
+
+    def enrich_text(self, text):
+        """Handle ReST markups (currently, only ``foo``)"""
+        if self._tty and text:
+            # Replace ``text`` with ANSI bold
+            return re.sub(r'\`\`(.+?)\`\`',
+                          lambda m: f'\033[1m{m.group(1)}\033[0m', text)
+        return text
+
+    def _fill_text(self, text, width, indent):
+        """Enrich descriptions with markups on it"""
+        enriched = self.enrich_text(text)
+        return "\n".join(indent + line for line in enriched.splitlines())
+
+    def _format_usage(self, usage, actions, groups, prefix):
+        """Enrich positional arguments at usage: line"""
+
+        prog = self._prog
+        parts = []
+
+        for action in actions:
+            if action.option_strings:
+                opt = action.option_strings[0]
+                if action.nargs != 0:
+                    opt += f" {action.dest.upper()}"
+                parts.append(f"[{opt}]")
+            else:
+                # Positional argument
+                parts.append(self.enrich_text(f"``{action.dest.upper()}``"))
+
+        usage_text = f"{prefix or 'usage: '} {prog} {' '.join(parts)}\n"
+        return usage_text
+
+    def _format_action_invocation(self, action):
+        """Enrich argument names"""
+        if not action.option_strings:
+            return self.enrich_text(f"``{action.dest.upper()}``")
+        else:
+            return ", ".join(action.option_strings)
+
 
 def main():
     """Main function"""
     parser = argparse.ArgumentParser(description=__doc__,
-                                     formatter_class=argparse.RawDescriptionHelpFormatter)
+                                     formatter_class=EnrichFormatter)
 
     parser.add_argument("-d", "--debug", action="count", default=0,
                         help="Increase debug level. Can be used multiple times")
     parser.add_argument("file_in", help="Input C file")
     parser.add_argument("file_out", help="Output RST file")
-    parser.add_argument("file_exceptions", nargs="?",
+    parser.add_argument("file_rules", nargs="?",
                         help="Exceptions file (optional)")
 
     args = parser.parse_args()
@@ -418,8 +467,8 @@ def main():
     parser = ParseHeader(debug=args.debug)
     parser.parse_file(args.file_in)
 
-    if args.file_exceptions:
-        parser.process_exceptions(args.file_exceptions)
+    if args.file_rules:
+        parser.process_exceptions(args.file_rules)
 
     parser.debug_print()
     parser.write_output(args.file_in, args.file_out)
-- 
2.50.1



* [PATCH 04/24] docs: parse-headers.py: better handle @var arguments
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 03/24] docs: parse-headers.py: improve --help logic Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 05/24] docs: parse-headers.py: simplify the rules for hashes Mauro Carvalho Chehab
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

The kernel-doc markups inside headers may contain @var markups.

With the current rule, this would be converted into:

     \* @:c:type:`DMX_BUFFER_FLAG_DISCONTINUITY_INDICATOR <dmx_buffer_flags>`\:

Fix it by adding a non-printing space ("\ ") where needed.
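
As an illustration only, the effect of the escaped-space prefix and of the
extra cleanup rule can be sketched as below. The helper names link_symbol()
and cleanup() are hypothetical and are not part of parse-headers.py; only
the two regular expressions in cleanup() are taken from the patch.

```python
import re


def link_symbol(text, symbol, ref_name, prefix="\\ ", suffix="\\ "):
    """Hypothetical helper: wrap a bare symbol in a :c:type: reference."""
    ref = f"{prefix}:c:type:`{symbol} <{ref_name}>`{suffix}"
    # A callable replacement avoids re.sub() parsing the backslashes in ref
    return re.sub(rf"\b{re.escape(symbol)}\b", lambda m: ref, text)


def cleanup(text):
    """Apply the two cleanup rules used by the patch."""
    # Remove "\ " where not needed: before spaces and at the end of lines
    text = re.sub(r"\\ ([\n ])", r"\1", text)
    # Drop the escaped space when a real space already precedes it
    text = re.sub(r" \\ ", " ", text)
    return text


# "@FOO" keeps the escaped space, so "@" and the role stay separate tokens
print(cleanup(link_symbol(" * @FOO: desc\n", "FOO", "foo")), end="")
# After a real space, the escaped space is redundant and gets removed
print(cleanup(link_symbol("enum FOO bar\n", "FOO", "foo")), end="")
```

The escaped space survives only where the reference would otherwise be glued
to the preceding character, which is the @var case described above.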

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/parse-headers.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/sphinx/parse-headers.py b/Documentation/sphinx/parse-headers.py
index 650f9c9a68d1..f4ab9c49d2f5 100755
--- a/Documentation/sphinx/parse-headers.py
+++ b/Documentation/sphinx/parse-headers.py
@@ -120,12 +120,12 @@ class ParseHeader:
         },
         # This is the name of the enum itself
         "enum": {
-            "prefix": "",
+            "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":c:type",
         },
         "struct": {
-            "prefix": "",
+            "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":c:type",
         },
@@ -390,6 +390,8 @@ class ParseHeader:
 
         # Remove "\ " where not needed: before spaces and at the end of lines
         text = re.sub(r"\\ ([\n ])", r"\1", text)
+        text = re.sub(r" \\ ", " ", text)
+
 
         title = os.path.basename(file_in)
 
-- 
2.50.1



* [PATCH 05/24] docs: parse-headers.py: simplify the rules for hashes
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (3 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 04/24] docs: parse-headers.py: better handle @var arguments Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 06/24] tools: docs: parse-headers.py: move it from sphinx dir Mauro Carvalho Chehab
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Normal :ref references accept either hyphens or underscores, but
c-domain ones don't. Fix it and remove the places where we
needlessly disable the underscore transformation.

Ideally, we should have a rule about the default, or change
the way media docs have their references.
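
A minimal sketch of the resulting rule, assuming a standalone helper
(make_ref() is a hypothetical name; in the patch this logic lives inside
store_type()):

```python
def make_ref(symbol, ref_type, ref_name=None, replace_underscores=True):
    """Hypothetical helper mirroring the reference-name rule of the patch."""
    if not ref_name:
        ref_name = symbol.lower()

    # Only :ref targets get underscores replaced; c-domain names are kept
    if ref_type == ":ref" and replace_underscores:
        ref_name = ref_name.replace("_", "-")

    return f"{ref_type}:`{symbol} <{ref_name}>`"


print(make_ref("VIDIOC_DQBUF", ":ref"))
print(make_ref("v4l2_buffer", ":c:type"))
```

With this rule, callers that store structs or typedefs no longer need to
pass replace_underscores=False, since the c-domain case is handled in one
place.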

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/parse-headers.py | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/Documentation/sphinx/parse-headers.py b/Documentation/sphinx/parse-headers.py
index f4ab9c49d2f5..344090ef259c 100755
--- a/Documentation/sphinx/parse-headers.py
+++ b/Documentation/sphinx/parse-headers.py
@@ -162,7 +162,8 @@ class ParseHeader:
                 if not ref_name:
                     ref_name = symbol.lower()
 
-                if replace_underscores:
+                # c-type references don't support hash
+                if ref_type == ":ref" and replace_underscores:
                     ref_name = ref_name.replace("_", "-")
 
                 ref_link = f"{ref_type}:`{symbol} <{ref_name}>`"
@@ -258,8 +259,7 @@ class ParseHeader:
                 if match:
                     name = match.group(2).strip()
                     symbol = match.group(3)
-                    self.store_type("typedef", symbol, ref_name=name,
-                                    replace_underscores=False)
+                    self.store_type("typedef", symbol, ref_name=name)
                     continue
 
                 for re_enum in self.RE_ENUMS:
@@ -272,8 +272,7 @@ class ParseHeader:
                 for re_struct in self.RE_STRUCTS:
                     match = re_struct.match(line)
                     if match:
-                        self.store_type("struct", match.group(1),
-                                        replace_underscores=False)
+                        self.store_type("struct", match.group(1))
                         break
 
     def process_exceptions(self, fname: str):
-- 
2.50.1



* [PATCH 06/24] tools: docs: parse-headers.py: move it from sphinx dir
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (4 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 05/24] docs: parse-headers.py: simplify the rules for hashes Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 07/24] tools: docs: parse_data_structs.py: add methods to return output Mauro Carvalho Chehab
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Donald Hunter, Jakub Kicinski, Jan Stancek,
	linux-kernel

As suggested by Jon, we should start having a tools/docs
directory, instead of placing everything under scripts.

In the specific case of parse-headers.py, the previous
location is where we're placing Sphinx extensions, which is
not the right place for execs.

Move it to tools/docs/parse-headers.py.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 .pylintrc                                     |  2 +-
 tools/docs/lib/__init__.py                    |  0
 tools/docs/lib/enrich_formatter.py            | 70 ++++++++++++++
 .../docs/lib/parse_data_structs.py            | 95 ++-----------------
 tools/docs/parse-headers.py                   | 57 +++++++++++
 5 files changed, 135 insertions(+), 89 deletions(-)
 create mode 100644 tools/docs/lib/__init__.py
 create mode 100644 tools/docs/lib/enrich_formatter.py
 rename Documentation/sphinx/parse-headers.py => tools/docs/lib/parse_data_structs.py (80%)
 create mode 100755 tools/docs/parse-headers.py

diff --git a/.pylintrc b/.pylintrc
index f1d21379254b..ad2476751f80 100644
--- a/.pylintrc
+++ b/.pylintrc
@@ -1,2 +1,2 @@
 [MASTER]
-init-hook='import sys; sys.path += ["scripts/lib", "scripts/lib/kdoc", "scripts/lib/abi"]'
+init-hook='import sys; sys.path += ["scripts/lib", "scripts/lib/kdoc", "scripts/lib/abi", "tools/docs/lib"]'
diff --git a/tools/docs/lib/__init__.py b/tools/docs/lib/__init__.py
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/tools/docs/lib/enrich_formatter.py b/tools/docs/lib/enrich_formatter.py
new file mode 100644
index 000000000000..bb171567a4ca
--- /dev/null
+++ b/tools/docs/lib/enrich_formatter.py
@@ -0,0 +1,70 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 by Mauro Carvalho Chehab <mchehab@kernel.org>.
+
+"""
+Ancillary argparse HelpFormatter class that works in a similar way to
+argparse.RawDescriptionHelpFormatter, i.e. the description keeps its line
+breaks, but it also implements transformations to the help text. The
+actual transformations are given by enrich_text(), if the output is a TTY.
+
+Currently, the following transformations are done:
+
+    - Positional arguments are shown in upper case;
+    - if output is TTY, ``var`` and positional arguments are shown prepended
+      by an ANSI SGR code. This is usually translated to bold. On some
+      terminals, like konsole, this is translated into a colored bold text.
+"""
+
+import argparse
+import re
+import sys
+
+class EnrichFormatter(argparse.HelpFormatter):
+    """
+    Better format the output, making easier to identify the positional args
+    and how they're used at the __doc__ description.
+    """
+    def __init__(self, *args, **kwargs):
+        """Initialize class and check if is TTY"""
+        super().__init__(*args, **kwargs)
+        self._tty = sys.stdout.isatty()
+
+    def enrich_text(self, text):
+        """Handle ReST markups (currently, only ``foo``)"""
+        if self._tty and text:
+            # Replace ``text`` with ANSI SGR (bold)
+            return re.sub(r'\`\`(.+?)\`\`',
+                          lambda m: f'\033[1m{m.group(1)}\033[0m', text)
+        return text
+
+    def _fill_text(self, text, width, indent):
+        """Enrich descriptions with markups on it"""
+        enriched = self.enrich_text(text)
+        return "\n".join(indent + line for line in enriched.splitlines())
+
+    def _format_usage(self, usage, actions, groups, prefix):
+        """Enrich positional arguments at usage: line"""
+
+        prog = self._prog
+        parts = []
+
+        for action in actions:
+            if action.option_strings:
+                opt = action.option_strings[0]
+                if action.nargs != 0:
+                    opt += f" {action.dest.upper()}"
+                parts.append(f"[{opt}]")
+            else:
+                # Positional argument
+                parts.append(self.enrich_text(f"``{action.dest.upper()}``"))
+
+        usage_text = f"{prefix or 'usage: '} {prog} {' '.join(parts)}\n"
+        return usage_text
+
+    def _format_action_invocation(self, action):
+        """Enrich argument names"""
+        if not action.option_strings:
+            return self.enrich_text(f"``{action.dest.upper()}``")
+
+        return ", ".join(action.option_strings)
diff --git a/Documentation/sphinx/parse-headers.py b/tools/docs/lib/parse_data_structs.py
similarity index 80%
rename from Documentation/sphinx/parse-headers.py
rename to tools/docs/lib/parse_data_structs.py
index 344090ef259c..2b7fa6bd8321 100755
--- a/Documentation/sphinx/parse-headers.py
+++ b/tools/docs/lib/parse_data_structs.py
@@ -1,36 +1,32 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: GPL-2.0
-# Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>.
-# pylint: disable=C0103,R0902,R0912,R0914,R0915
+# Copyright (c) 2016-2025 by Mauro Carvalho Chehab <mchehab@kernel.org>.
+# pylint: disable=R0912,R0915
 
 """
-Convert a C header or source file ``FILE_IN``, into a ReStructured Text
-included via ..parsed-literal block with cross-references for the
-documentation files that describe the API. It accepts an optional
-``FILE_RULES`` file to describes what elements will be either ignored or
-be pointed to a non-default reference type/name.
+Parse a source file or header, creating ReStructured Text cross references.
 
-The output is written at ``FILE_OUT``.
+It accepts an optional file to change the default symbol reference or to
+suppress symbols from the output.
 
 It is capable of identifying defines, functions, structs, typedefs,
 enums and enum symbols and create cross-references for all of them.
 It is also capable of distinguish #define used for specifying a Linux
 ioctl.
 
-The optional ``FILE_RULES`` contains a set of rules like:
+The optional rules file contains a set of rules like:
 
     ignore ioctl VIDIOC_ENUM_FMT
     replace ioctl VIDIOC_DQBUF vidioc_qbuf
     replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det`
 """
 
-import argparse
 import os
 import re
 import sys
 
 
-class ParseHeader:
+class ParseDataStructs:
     """
     Creates an enriched version of a Kernel header file with cross-links
     to each C data structure type.
@@ -400,80 +396,3 @@ class ParseHeader:
             f.write("=" * len(title))
             f.write("\n\n.. parsed-literal::\n\n")
             f.write(text)
-
-class EnrichFormatter(argparse.HelpFormatter):
-    """
-    Better format the output, making easier to identify the positional args
-    and how they're used at the __doc__ description.
-    """
-    def __init__(self, *args, **kwargs):
-        """Initialize class and check if is TTY"""
-        super().__init__(*args, **kwargs)
-        self._tty = sys.stdout.isatty()
-
-    def enrich_text(self, text):
-        """Handle ReST markups (currently, only ``foo``)"""
-        if self._tty and text:
-            # Replace ``text`` with ANSI bold
-            return re.sub(r'\`\`(.+?)\`\`',
-                          lambda m: f'\033[1m{m.group(1)}\033[0m', text)
-        return text
-
-    def _fill_text(self, text, width, indent):
-        """Enrich descriptions with markups on it"""
-        enriched = self.enrich_text(text)
-        return "\n".join(indent + line for line in enriched.splitlines())
-
-    def _format_usage(self, usage, actions, groups, prefix):
-        """Enrich positional arguments at usage: line"""
-
-        prog = self._prog
-        parts = []
-
-        for action in actions:
-            if action.option_strings:
-                opt = action.option_strings[0]
-                if action.nargs != 0:
-                    opt += f" {action.dest.upper()}"
-                parts.append(f"[{opt}]")
-            else:
-                # Positional argument
-                parts.append(self.enrich_text(f"``{action.dest.upper()}``"))
-
-        usage_text = f"{prefix or 'usage: '} {prog} {' '.join(parts)}\n"
-        return usage_text
-
-    def _format_action_invocation(self, action):
-        """Enrich argument names"""
-        if not action.option_strings:
-            return self.enrich_text(f"``{action.dest.upper()}``")
-        else:
-            return ", ".join(action.option_strings)
-
-
-def main():
-    """Main function"""
-    parser = argparse.ArgumentParser(description=__doc__,
-                                     formatter_class=EnrichFormatter)
-
-    parser.add_argument("-d", "--debug", action="count", default=0,
-                        help="Increase debug level. Can be used multiple times")
-    parser.add_argument("file_in", help="Input C file")
-    parser.add_argument("file_out", help="Output RST file")
-    parser.add_argument("file_rules", nargs="?",
-                        help="Exceptions file (optional)")
-
-    args = parser.parse_args()
-
-    parser = ParseHeader(debug=args.debug)
-    parser.parse_file(args.file_in)
-
-    if args.file_rules:
-        parser.process_exceptions(args.file_rules)
-
-    parser.debug_print()
-    parser.write_output(args.file_in, args.file_out)
-
-
-if __name__ == "__main__":
-    main()
diff --git a/tools/docs/parse-headers.py b/tools/docs/parse-headers.py
new file mode 100755
index 000000000000..07d3b47c4834
--- /dev/null
+++ b/tools/docs/parse-headers.py
@@ -0,0 +1,57 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2016, 2025 by Mauro Carvalho Chehab <mchehab@kernel.org>.
+# pylint: disable=C0103
+
+"""
+Convert a C header or source file ``FILE_IN`` into a ReStructured Text
+file, included via a .. parsed-literal:: block, with cross-references
+to the documentation files that describe the API. It accepts an optional
+``FILE_RULES`` file that describes which elements will be either ignored
+or pointed to a non-default reference type/name.
+
+The output is written to ``FILE_OUT``.
+
+It is capable of identifying defines, functions, structs, typedefs,
+enums and enum symbols, and of creating cross-references for all of them.
+It is also capable of distinguishing #defines used for specifying a Linux
+ioctl.
+
+The optional ``FILE_RULES`` contains a set of rules like:
+
+    ignore ioctl VIDIOC_ENUM_FMT
+    replace ioctl VIDIOC_DQBUF vidioc_qbuf
+    replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det`
+"""
+
+import argparse
+
+from lib.parse_data_structs import ParseDataStructs
+from lib.enrich_formatter import EnrichFormatter
+
+def main():
+    """Main function"""
+    parser = argparse.ArgumentParser(description=__doc__,
+                                     formatter_class=EnrichFormatter)
+
+    parser.add_argument("-d", "--debug", action="count", default=0,
+                        help="Increase debug level. Can be used multiple times")
+    parser.add_argument("file_in", help="Input C file")
+    parser.add_argument("file_out", help="Output RST file")
+    parser.add_argument("file_rules", nargs="?",
+                        help="Exceptions file (optional)")
+
+    args = parser.parse_args()
+
+    parser = ParseDataStructs(debug=args.debug)
+    parser.parse_file(args.file_in)
+
+    if args.file_rules:
+        parser.process_exceptions(args.file_rules)
+
+    parser.debug_print()
+    parser.write_output(args.file_in, args.file_out)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.50.1



* [PATCH 07/24] tools: docs: parse_data_structs.py: add methods to return output
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (5 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 06/24] tools: docs: parse-headers.py: move it from sphinx dir Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 08/24] MAINTAINERS: add files from tools/docs to documentation entry Mauro Carvalho Chehab
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

When running it from the command line, we want to write an output
file, but when used as a class, one may just want the output
content returned as a string.

Split write_output() into two methods to allow both use cases.

Also add an extra method to produce a TOC.
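
The split can be pictured with the sketch below. HeaderRenderer is a
hypothetical stand-in and not the real ParseDataStructs class; it only
shows the pattern of one method returning the text and another one writing
it to a file.

```python
import os
import tempfile


class HeaderRenderer:
    """Hypothetical stand-in for the parser class; illustration only."""

    def __init__(self, lines):
        self.lines = lines

    def gen_output(self):
        """Return the formatted body as a string (library use case)."""
        return "".join(f"    {line}\n" for line in self.lines)

    def write_output(self, file_in, file_out):
        """Write title and body to file_out (command line use case)."""
        title = os.path.basename(file_in)

        with open(file_out, "w", encoding="utf-8") as f:
            f.write(f"{title}\n")
            f.write("=" * len(title) + "\n\n")
            f.write(".. parsed-literal::\n\n")
            f.write(self.gen_output())


renderer = HeaderRenderer(["struct foo;"])

# Library use: get the text directly
body = renderer.gen_output()

# Command line use: write the same text to a file
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "foo.h.rst")
    renderer.write_output("include/foo.h", out)
    with open(out, encoding="utf-8") as f:
        written = f.read()
```

A Sphinx extension can then call the string-returning method without
touching the filesystem, while the script keeps its file-based interface.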

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 tools/docs/lib/parse_data_structs.py | 62 ++++++++++++++++++++++++++--
 tools/docs/parse-headers.py          |  5 ++-
 2 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/tools/docs/lib/parse_data_structs.py b/tools/docs/lib/parse_data_structs.py
index 2b7fa6bd8321..a5aa2e182052 100755
--- a/tools/docs/lib/parse_data_structs.py
+++ b/tools/docs/lib/parse_data_structs.py
@@ -97,33 +97,39 @@ class ParseDataStructs:
             "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":ref",
+            "description": "IOCTL Commands",
         },
         "define": {
             "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":ref",
+            "description": "Macros and Definitions",
         },
         # We're calling each definition inside an enum as "symbol"
         "symbol": {
             "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":ref",
+            "description": "Enumeration values",
         },
         "typedef": {
             "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":c:type",
+            "description": "Type Definitions",
         },
-        # This is the name of the enum itself
+        # This is the description of the enum itself
         "enum": {
             "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":c:type",
+            "description": "Enumerations",
         },
         "struct": {
             "prefix": "\\ ",
             "suffix": "\\ ",
             "ref_type": ":c:type",
+            "description": "Structures",
         },
     }
 
@@ -359,7 +365,7 @@ class ParseDataStructs:
 
             print()
 
-    def write_output(self, file_in: str, file_out: str):
+    def gen_output(self):
         """Write the formatted output to a file."""
 
         # Avoid extra blank lines
@@ -387,12 +393,60 @@ class ParseDataStructs:
         text = re.sub(r"\\ ([\n ])", r"\1", text)
         text = re.sub(r" \\ ", " ", text)
 
+        return text
 
+    def gen_toc(self):
+        """
+        Create a TOC table pointing to each symbol from the header
+        """
+        text = []
+
+        # Add header
+        text.append(".. contents:: Table of Contents")
+        text.append("   :depth: 2")
+        text.append("   :local:")
+        text.append("")
+
+        # Sort symbol types per description
+        symbol_descriptions = []
+        for k, v in self.DEF_SYMBOL_TYPES.items():
+            symbol_descriptions.append((v['description'], k))
+
+        symbol_descriptions.sort()
+
+        # Process each category
+        for description, c_type in symbol_descriptions:
+
+            refs = self.symbols[c_type]
+            if not refs:  # Skip empty categories
+                continue
+
+            text.append(f"{description}")
+            text.append("-" * len(description))
+            text.append("")
+
+            # Sort symbols alphabetically
+            for symbol, ref in sorted(refs.items()):
+                text.append(f"* :{ref}:")
+
+            text.append("")  # Add empty line between categories
+
+        return "\n".join(text)
+
+    def write_output(self, file_in: str, file_out: str, toc: bool):
         title = os.path.basename(file_in)
 
+        if toc:
+            text = self.gen_toc()
+        else:
+            text = self.gen_output()
+
         with open(file_out, "w", encoding="utf-8", errors="backslashreplace") as f:
             f.write(".. -*- coding: utf-8; mode: rst -*-\n\n")
             f.write(f"{title}\n")
-            f.write("=" * len(title))
-            f.write("\n\n.. parsed-literal::\n\n")
+            f.write("=" * len(title) + "\n\n")
+
+            if not toc:
+                f.write(".. parsed-literal::\n\n")
+
             f.write(text)
diff --git a/tools/docs/parse-headers.py b/tools/docs/parse-headers.py
index 07d3b47c4834..bfa4e46a53e3 100755
--- a/tools/docs/parse-headers.py
+++ b/tools/docs/parse-headers.py
@@ -36,6 +36,9 @@ def main():
 
     parser.add_argument("-d", "--debug", action="count", default=0,
                         help="Increase debug level. Can be used multiple times")
+    parser.add_argument("-t", "--toc", action="store_true",
+                        help="instead of a literal block, outputs a TOC table at the RST file")
+
     parser.add_argument("file_in", help="Input C file")
     parser.add_argument("file_out", help="Output RST file")
     parser.add_argument("file_rules", nargs="?",
@@ -50,7 +53,7 @@ def main():
         parser.process_exceptions(args.file_rules)
 
     parser.debug_print()
-    parser.write_output(args.file_in, args.file_out)
+    parser.write_output(args.file_in, args.file_out, args.toc)
 
 
 if __name__ == "__main__":
-- 
2.50.1



* [PATCH 08/24] MAINTAINERS: add files from tools/docs to documentation entry
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (6 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 07/24] tools: docs: parse_data_structs.py: add methods to return output Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 09/24] docs: uapi: media: Makefile: use parse-headers.py Mauro Carvalho Chehab
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

As we now have a tools directory for docs, add it to its
corresponding entry.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index dafc11712544..ef87548b8f88 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7308,6 +7308,7 @@ F:	scripts/get_abi.py
 F:	scripts/kernel-doc*
 F:	scripts/lib/abi/*
 F:	scripts/lib/kdoc/*
+F:	tools/docs/*
 F:	tools/net/ynl/pyynl/lib/doc_generator.py
 F:	scripts/sphinx-pre-install
 X:	Documentation/ABI/
-- 
2.50.1



* [PATCH 09/24] docs: uapi: media: Makefile: use parse-headers.py
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (7 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 08/24] MAINTAINERS: add files from tools/docs to documentation entry Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 10/24] docs: kernel_include.py: Update its coding style Mauro Carvalho Chehab
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel,
	linux-media

Now that we have a new parser, use it.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/userspace-api/media/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/userspace-api/media/Makefile b/Documentation/userspace-api/media/Makefile
index 3d8aaf5c253b..accc734d045a 100644
--- a/Documentation/userspace-api/media/Makefile
+++ b/Documentation/userspace-api/media/Makefile
@@ -3,7 +3,7 @@
 # Rules to convert a .h file to inline RST documentation
 
 SRC_DIR=$(srctree)/Documentation/userspace-api/media
-PARSER = $(srctree)/Documentation/sphinx/parse-headers.pl
+PARSER = $(srctree)/tools/docs/parse-headers.py
 UAPI = $(srctree)/include/uapi/linux
 KAPI = $(srctree)/include/linux
 
-- 
2.50.1



* [PATCH 10/24] docs: kernel_include.py: Update its coding style
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (8 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 09/24] docs: uapi: media: Makefile: use parse-headers.py Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 11/24] docs: kernel_include.py: allow cross-reference generation Mauro Carvalho Chehab
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

With the help of tools like black, pylint, autopep8 and flake8,
improve the code style in preparation for further changes.

No functional changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 100 ++++++++++++-------------
 1 file changed, 47 insertions(+), 53 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 1e566e87ebcd..1212786ac516 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -1,7 +1,6 @@
 #!/usr/bin/env python3
-# -*- coding: utf-8; mode: python -*-
 # SPDX-License-Identifier: GPL-2.0
-# pylint: disable=R0903, C0330, R0914, R0912, E0401
+# pylint: disable=R0903, R0912, R0914, R0915, C0209,W0707
 
 """
     kernel-include
@@ -40,41 +39,38 @@ from docutils.parsers.rst import directives
 from docutils.parsers.rst.directives.body import CodeBlock, NumberLines
 from docutils.parsers.rst.directives.misc import Include
 
-__version__  = '1.0'
+__version__ = "1.0"
+
 
 # ==============================================================================
 def setup(app):
-# ==============================================================================
-
+    """Setup Sphinx extension"""
     app.add_directive("kernel-include", KernelInclude)
-    return dict(
-        version = __version__,
-        parallel_read_safe = True,
-        parallel_write_safe = True
-    )
+    return {
+        "version": __version__,
+        "parallel_read_safe": True,
+        "parallel_write_safe": True,
+    }
+
 
 # ==============================================================================
 class KernelInclude(Include):
-# ==============================================================================
-
     """KernelInclude (``kernel-include``) directive"""
 
     def run(self):
         env = self.state.document.settings.env
-        path = os.path.realpath(
-            os.path.expandvars(self.arguments[0]))
+        path = os.path.realpath(os.path.expandvars(self.arguments[0]))
 
         # to get a bit security back, prohibit /etc:
         if path.startswith(os.sep + "etc"):
-            raise self.severe(
-                'Problems with "%s" directive, prohibited path: %s'
-                % (self.name, path))
+            raise self.severe('Problems with "%s" directive, prohibited path: %s' %
+                              (self.name, path))
 
         self.arguments[0] = path
 
         env.note_dependency(os.path.abspath(path))
 
-        #return super(KernelInclude, self).run() # won't work, see HINTs in _run()
+        # return super(KernelInclude, self).run() # won't work, see HINTs in _run()
         return self._run()
 
     def _run(self):
@@ -87,41 +83,39 @@ class KernelInclude(Include):
 
         if not self.state.document.settings.file_insertion_enabled:
             raise self.warning('"%s" directive disabled.' % self.name)
-        source = self.state_machine.input_lines.source(
-            self.lineno - self.state_machine.input_offset - 1)
+        source = self.state_machine.input_lines.source(self.lineno -
+                                                       self.state_machine.input_offset - 1)
         source_dir = os.path.dirname(os.path.abspath(source))
         path = directives.path(self.arguments[0])
-        if path.startswith('<') and path.endswith('>'):
+        if path.startswith("<") and path.endswith(">"):
             path = os.path.join(self.standard_include_path, path[1:-1])
         path = os.path.normpath(os.path.join(source_dir, path))
 
         # HINT: this is the only line I had to change / commented out:
-        #path = utils.relative_path(None, path)
+        # path = utils.relative_path(None, path)
 
-        encoding = self.options.get(
-            'encoding', self.state.document.settings.input_encoding)
-        e_handler=self.state.document.settings.input_encoding_error_handler
-        tab_width = self.options.get(
-            'tab-width', self.state.document.settings.tab_width)
+        encoding = self.options.get("encoding",
+                                    self.state.document.settings.input_encoding)
+        e_handler = self.state.document.settings.input_encoding_error_handler
+        tab_width = self.options.get("tab-width",
+                                     self.state.document.settings.tab_width)
         try:
             self.state.document.settings.record_dependencies.add(path)
-            include_file = io.FileInput(source_path=path,
-                                        encoding=encoding,
+            include_file = io.FileInput(source_path=path, encoding=encoding,
                                         error_handler=e_handler)
-        except UnicodeEncodeError as error:
+        except UnicodeEncodeError:
             raise self.severe('Problems with "%s" directive path:\n'
                               'Cannot encode input file path "%s" '
-                              '(wrong locale?).' %
-                              (self.name, SafeString(path)))
+                              "(wrong locale?)." % (self.name, SafeString(path)))
         except IOError as error:
-            raise self.severe('Problems with "%s" directive path:\n%s.' %
-                      (self.name, ErrorString(error)))
-        startline = self.options.get('start-line', None)
-        endline = self.options.get('end-line', None)
+            raise self.severe('Problems with "%s" directive path:\n%s.'
+                              % (self.name, ErrorString(error)))
+        startline = self.options.get("start-line", None)
+        endline = self.options.get("end-line", None)
         try:
             if startline or (endline is not None):
                 lines = include_file.readlines()
-                rawtext = ''.join(lines[startline:endline])
+                rawtext = "".join(lines[startline:endline])
             else:
                 rawtext = include_file.read()
         except UnicodeError as error:
@@ -129,43 +123,43 @@ class KernelInclude(Include):
                               (self.name, ErrorString(error)))
         # start-after/end-before: no restrictions on newlines in match-text,
         # and no restrictions on matching inside lines vs. line boundaries
-        after_text = self.options.get('start-after', None)
+        after_text = self.options.get("start-after", None)
         if after_text:
             # skip content in rawtext before *and incl.* a matching text
             after_index = rawtext.find(after_text)
             if after_index < 0:
                 raise self.severe('Problem with "start-after" option of "%s" '
-                                  'directive:\nText not found.' % self.name)
-            rawtext = rawtext[after_index + len(after_text):]
-        before_text = self.options.get('end-before', None)
+                                  "directive:\nText not found." % self.name)
+            rawtext = rawtext[after_index + len(after_text) :]
+        before_text = self.options.get("end-before", None)
         if before_text:
             # skip content in rawtext after *and incl.* a matching text
             before_index = rawtext.find(before_text)
             if before_index < 0:
                 raise self.severe('Problem with "end-before" option of "%s" '
-                                  'directive:\nText not found.' % self.name)
+                                  "directive:\nText not found." % self.name)
             rawtext = rawtext[:before_index]
 
         include_lines = statemachine.string2lines(rawtext, tab_width,
                                                   convert_whitespace=True)
-        if 'literal' in self.options:
+        if "literal" in self.options:
             # Convert tabs to spaces, if `tab_width` is positive.
             if tab_width >= 0:
                 text = rawtext.expandtabs(tab_width)
             else:
                 text = rawtext
             literal_block = nodes.literal_block(rawtext, source=path,
-                                    classes=self.options.get('class', []))
+                                                classes=self.options.get("class", [])
+            )
             literal_block.line = 1
             self.add_name(literal_block)
-            if 'number-lines' in self.options:
+            if "number-lines" in self.options:
                 try:
-                    startline = int(self.options['number-lines'] or 1)
+                    startline = int(self.options["number-lines"] or 1)
                 except ValueError:
-                    raise self.error(':number-lines: with non-integer '
-                                     'start value')
+                    raise self.error(":number-lines: with non-integer start value")
                 endline = startline + len(include_lines)
-                if text.endswith('\n'):
+                if text.endswith("\n"):
                     text = text[:-1]
                 tokens = NumberLines([([], text)], startline, endline)
                 for classes, value in tokens:
@@ -177,12 +171,12 @@ class KernelInclude(Include):
             else:
                 literal_block += nodes.Text(text, text)
             return [literal_block]
-        if 'code' in self.options:
-            self.options['source'] = path
+        if "code" in self.options:
+            self.options["source"] = path
             codeblock = CodeBlock(self.name,
-                                  [self.options.pop('code')], # arguments
+                                  [self.options.pop("code")],  # arguments
                                   self.options,
-                                  include_lines, # content
+                                  include_lines,  # content
                                   self.lineno,
                                   self.content_offset,
                                   self.block_text,
-- 
2.50.1



* [PATCH 11/24] docs: kernel_include.py: allow cross-reference generation
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (9 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 10/24] docs: kernel_include.py: Update its coding style Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 12/24] docs: kernel_include.py: generate warnings for broken refs Mauro Carvalho Chehab
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

The kernel_include extension was originally designed for the
comprehensive media uAPI documentation, where, instead of the simpler
kernel-doc markup, the uAPI documentation is enriched with longer
text, images, complex tables, graphs, etc.

There, we wanted to include the much simpler yet documented .h
file.

This extension is needed to include files from other parts of the
Kernel tree outside Documentation, because the original Sphinx
include tag doesn't allow going outside of the directory passed
via sphinx-build command line.

Yet, the cross-references to the comprehensive documentation were
generated by a Perl script.

As the Perl script is now converted to Python and there is a
Python class producing an include-compatible output with
cross-references, add two optional arguments to kernel_include.py:

1. :generate-cross-refs:

        If present, instead of reading the file, it calls ParseDataStructs()
        class, which converts C data structures into cross-references to
        be linked to ReST files containing a more comprehensive documentation;

        Don't use it together with :start-line: and/or :end-line:, as
        filtering input file line range is currently not supported.

2. :exception-file:

        Used together with :generate-cross-refs:. Points to a file containing
        rules to ignore C data structs or to use a different reference name,
        optionally using a different reference type.
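
As a rough illustration of how the two options are registered, the
sketch below mimics the copy-then-update pattern on option_spec with
stand-in code. FakeInclude, FakeKernelInclude, flag(), unchanged() and
parse_options() are hypothetical stand-ins for docutils' Include,
directives.flag and directives.unchanged, and the exception file name is
just an example; only the two option names come from this patch.

```python
# Hypothetical stand-ins for the docutils converters, for illustration only.
def flag(argument):
    """Accept an option that takes no value (mimics directives.flag)."""
    if argument and argument.strip():
        raise ValueError("no argument is allowed")
    return None


def unchanged(argument):
    """Return the option value as-is (mimics directives.unchanged)."""
    return argument or ""


class FakeInclude:
    """Hypothetical stand-in for docutils' Include directive."""
    option_spec = {"start-line": int, "end-line": int}


class FakeKernelInclude(FakeInclude):
    """Adds the two new options without touching the base class."""
    # Copy first, so the base class dict is not modified in place
    option_spec = FakeInclude.option_spec.copy()
    option_spec.update({
        "generate-cross-refs": flag,
        "exception-file": unchanged,
    })


def parse_options(spec, raw):
    """Convert raw option strings using the converters found in spec."""
    return {name: spec[name](value) for name, value in raw.items()}


opts = parse_options(FakeKernelInclude.option_spec,
                     {"generate-cross-refs": "",
                      "exception-file": "videodev2.h.rst.exceptions"})
```

Copying the dict before updating it keeps the stock include directive
unaware of the new options, so only kernel-include accepts them.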

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 94 ++++++++++++++++++++------
 1 file changed, 74 insertions(+), 20 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 1212786ac516..fc37e6fa9d96 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -25,6 +25,24 @@
     Substrings of the form $name or ${name} are replaced by the value of
     environment variable name. Malformed variable names and references to
     non-existing variables are left unchanged.
+
+    This extension overrides Sphinx include directory, adding two extra
+    arguments:
+
+    1. :generate-cross-refs:
+
+        If present, instead of reading the file, it calls ParseDataStructs()
+        class, which converts C data structures into cross-references to
+        be linked to ReST files containing a more comprehensive documentation;
+
+        Don't use it together with :start-line: and/or :end-line:, as
+        filtering input file line range is currently not supported.
+
+    2. :exception-file:
+
+        Used together with :generate-cross-refs:. Points to a file containing
+        rules to ignore C data structs or to use a different reference name,
+        optionally using a different reference type.
 """
 
 # ==============================================================================
@@ -32,6 +50,7 @@
 # ==============================================================================
 
 import os.path
+import sys
 
 from docutils import io, nodes, statemachine
 from docutils.utils.error_reporting import SafeString, ErrorString
@@ -39,6 +58,11 @@ from docutils.parsers.rst import directives
 from docutils.parsers.rst.directives.body import CodeBlock, NumberLines
 from docutils.parsers.rst.directives.misc import Include
 
+srctree = os.path.abspath(os.environ["srctree"])
+sys.path.insert(0, os.path.join(srctree, "tools/docs/lib"))
+
+from parse_data_structs import ParseDataStructs
+
 __version__ = "1.0"
 
 
@@ -57,6 +81,14 @@ def setup(app):
 class KernelInclude(Include):
     """KernelInclude (``kernel-include``) directive"""
 
+    # Add extra options
+    option_spec = Include.option_spec.copy()
+
+    option_spec.update({
+        'generate-cross-refs': directives.flag,
+        'exception-file': directives.unchanged,
+    })
+
     def run(self):
         env = self.state.document.settings.env
         path = os.path.realpath(os.path.expandvars(self.arguments[0]))
@@ -99,28 +131,49 @@ class KernelInclude(Include):
         e_handler = self.state.document.settings.input_encoding_error_handler
         tab_width = self.options.get("tab-width",
                                      self.state.document.settings.tab_width)
-        try:
-            self.state.document.settings.record_dependencies.add(path)
-            include_file = io.FileInput(source_path=path, encoding=encoding,
-                                        error_handler=e_handler)
-        except UnicodeEncodeError:
-            raise self.severe('Problems with "%s" directive path:\n'
-                              'Cannot encode input file path "%s" '
-                              "(wrong locale?)." % (self.name, SafeString(path)))
-        except IOError as error:
-            raise self.severe('Problems with "%s" directive path:\n%s.'
-                              % (self.name, ErrorString(error)))
         startline = self.options.get("start-line", None)
         endline = self.options.get("end-line", None)
-        try:
-            if startline or (endline is not None):
-                lines = include_file.readlines()
-                rawtext = "".join(lines[startline:endline])
-            else:
-                rawtext = include_file.read()
-        except UnicodeError as error:
-            raise self.severe('Problem with "%s" directive:\n%s' %
-                              (self.name, ErrorString(error)))
+
+        # Get optional arguments to related to cross-references generation
+        if 'generate-cross-refs' in self.options:
+            parser = ParseDataStructs()
+            parser.parse_file(path)
+
+            exceptions_file = self.options.get('exception-file')
+            if exceptions_file:
+                exceptions_file = os.path.join(source_dir, exceptions_file)
+                parser.process_exceptions(exceptions_file)
+
+            title = os.path.basename(path)
+            rawtext = parser.gen_output()
+            if startline or endline:
+                raise self.severe('generate-cross-refs can\'t be used together with "start-line" or "end-line"')
+
+            if "code" not in self.options:
+                rawtext = ".. parsed-literal::\n\n" + rawtext
+        else:
+            try:
+                self.state.document.settings.record_dependencies.add(path)
+                include_file = io.FileInput(source_path=path, encoding=encoding,
+                                            error_handler=e_handler)
+            except UnicodeEncodeError:
+                raise self.severe('Problems with "%s" directive path:\n'
+                                'Cannot encode input file path "%s" '
+                                "(wrong locale?)." % (self.name, SafeString(path)))
+            except IOError as error:
+                raise self.severe('Problems with "%s" directive path:\n%s.'
+                                % (self.name, ErrorString(error)))
+
+            try:
+                if startline or (endline is not None):
+                    lines = include_file.readlines()
+                    rawtext = "".join(lines[startline:endline])
+                else:
+                    rawtext = include_file.read()
+            except UnicodeError as error:
+                raise self.severe('Problem with "%s" directive:\n%s' %
+                                (self.name, ErrorString(error)))
+
         # start-after/end-before: no restrictions on newlines in match-text,
         # and no restrictions on matching inside lines vs. line boundaries
         after_text = self.options.get("start-after", None)
@@ -171,6 +224,7 @@ class KernelInclude(Include):
             else:
                 literal_block += nodes.Text(text, text)
             return [literal_block]
+
         if "code" in self.options:
             self.options["source"] = path
             codeblock = CodeBlock(self.name,
-- 
2.50.1



* [PATCH 12/24] docs: kernel_include.py: generate warnings for broken refs
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (10 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 11/24] docs: kernel_include.py: allow cross-reference generation Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 13/24] docs: kernel_include.py: move rawtext logic to separate functions Mauro Carvalho Chehab
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

In the past, Sphinx used to warn about broken references. That was
basically the rationale for adding media uAPI files: to get
warnings about missing symbols.

This is no longer the case, so we need to check them explicitly
after the doctree-resolved event.

While here, move setup() to the end, to make it closer to
what we do in other extensions.
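
The checking logic can be sketched without Sphinx as below. This is a
simplified model, not the extension itself: check_missing_ref(),
merge_xref_files() and the warn callback are hypothetical names, and the
file paths are only examples. The message format, the "reported" set and
the idea of merging the per-process file sets follow the diff in this
patch, which hooks the missing-reference and env-merge-info events.

```python
reported = set()  # (source, message) pairs that already produced a warning


def check_missing_ref(xref_files, source, domain, reftype, target, warn):
    """Warn once per (source, message), only for files that opted in."""
    if not source or source not in xref_files:
        return
    msg = f"can't link to: {domain}:{reftype}:: {target}"
    key = (source, msg)
    if key in reported:
        return  # don't duplicate warnings
    reported.add(key)
    warn(msg)


def merge_xref_files(main, other):
    """Fold the set collected by a parallel worker into the main one."""
    main.update(other)


warnings = []
main_files = {"include/uapi/linux/videodev2.h"}
merge_xref_files(main_files, {"include/uapi/linux/media.h"})

for _ in range(2):  # the same broken reference is seen twice
    check_missing_ref(main_files, "include/uapi/linux/media.h",
                      "c", "type", "media_device_info", warnings.append)

# A file that did not ask for :warn-broken: is ignored
check_missing_ref(main_files, "other.rst", "std", "ref", "foo",
                  warnings.append)
```

Only one warning ends up in the list: the duplicate is dropped and the
file that never opted in is skipped.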

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 108 ++++++++++++++++++++-----
 1 file changed, 89 insertions(+), 19 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index fc37e6fa9d96..0a3e5377dd1e 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -26,7 +26,7 @@
     environment variable name. Malformed variable names and references to
     non-existing variables are left unchanged.
 
-    This extension overrides Sphinx include directory, adding two extra
+    This extension overrides Sphinx include directory, adding some extra
     arguments:
 
     1. :generate-cross-refs:
@@ -35,14 +35,20 @@
         class, which converts C data structures into cross-references to
         be linked to ReST files containing a more comprehensive documentation;
 
-        Don't use it together with :start-line: and/or :end-line:, as
-        filtering input file line range is currently not supported.
-
     2. :exception-file:
 
-        Used together with :generate-cross-refs:. Points to a file containing
-        rules to ignore C data structs or to use a different reference name,
-        optionally using a different reference type.
+        Used together with :generate-cross-refs
+
+        Points to a file containing rules to ignore C data structs or to
+        use a different reference name, optionally using a different
+        reference type.
+
+    3. :warn-broken:
+
+        Used together with :generate-cross-refs:
+
+        Detect if the auto-generated cross references doesn't exist.
+
 """
 
 # ==============================================================================
@@ -50,6 +56,7 @@
 # ==============================================================================
 
 import os.path
+import re
 import sys
 
 from docutils import io, nodes, statemachine
@@ -58,23 +65,18 @@ from docutils.parsers.rst import directives
 from docutils.parsers.rst.directives.body import CodeBlock, NumberLines
 from docutils.parsers.rst.directives.misc import Include
 
+from sphinx.util import logging
+
 srctree = os.path.abspath(os.environ["srctree"])
 sys.path.insert(0, os.path.join(srctree, "tools/docs/lib"))
 
 from parse_data_structs import ParseDataStructs
 
 __version__ = "1.0"
+logger = logging.getLogger(__name__)
 
-
-# ==============================================================================
-def setup(app):
-    """Setup Sphinx exension"""
-    app.add_directive("kernel-include", KernelInclude)
-    return {
-        "version": __version__,
-        "parallel_read_safe": True,
-        "parallel_write_safe": True,
-    }
+RE_DOMAIN_REF = re.compile(r'\\ :(ref|c:type|c:func):`([^<`]+)(?:<([^>]+)>)?`\\')
+RE_SIMPLE_REF = re.compile(r'`([^`]+)`')
 
 
 # ==============================================================================
@@ -86,6 +88,7 @@ class KernelInclude(Include):
 
     option_spec.update({
         'generate-cross-refs': directives.flag,
+        'warn-broken': directives.flag,
         'exception-file': directives.unchanged,
     })
 
@@ -103,9 +106,9 @@ class KernelInclude(Include):
         env.note_dependency(os.path.abspath(path))
 
         # return super(KernelInclude, self).run() # won't work, see HINTs in _run()
-        return self._run()
+        return self._run(env)
 
-    def _run(self):
+    def _run(self, env):
         """Include a file as part of the content of this reST file."""
 
         # HINT: I had to copy&paste the whole Include.run method. I'am not happy
@@ -151,6 +154,10 @@ class KernelInclude(Include):
 
             if "code" not in self.options:
                 rawtext = ".. parsed-literal::\n\n" + rawtext
+
+            # Store references on a symbol dict to be used at check time
+            if 'warn-broken' in self.options:
+                env._xref_files.add(path)
         else:
             try:
                 self.state.document.settings.record_dependencies.add(path)
@@ -239,3 +246,66 @@ class KernelInclude(Include):
             return codeblock.run()
         self.state_machine.insert_input(include_lines, path)
         return []
+
+# ==============================================================================
+
+reported = set()
+
+def check_missing_refs(app, env, node, contnode):
+    """Check broken refs for the files it creates xrefs"""
+    if not node.source:
+        return None
+
+    try:
+        xref_files = env._xref_files
+    except AttributeError:
+        logger.critical("FATAL: _xref_files not initialized!")
+        raise
+
+    # Only show missing references for kernel-include reference-parsed files
+    if node.source not in xref_files:
+        return None
+
+    target = node.get('reftarget', '')
+    domain = node.get('refdomain', 'std')
+    reftype = node.get('reftype', '')
+
+    msg = f"can't link to: {domain}:{reftype}:: {target}"
+
+    # Don't duplicate warnings
+    data = (node.source, msg)
+    if data in reported:
+        return None
+    reported.add(data)
+
+    logger.warning(msg, location=node, type='ref', subtype='missing')
+
+    return None
+
+def merge_xref_info(app, env, docnames, other):
+    """
+    As each process modify env._xref_files, we need to merge them back.
+    """
+    if not hasattr(other, "_xref_files"):
+        return
+    env._xref_files.update(getattr(other, "_xref_files", set()))
+
+def init_xref_docs(app, env, docnames):
+    """Initialize a list of files that we're generating cross references¨"""
+    app.env._xref_files = set()
+
+# ==============================================================================
+
+def setup(app):
+    """Setup Sphinx exension"""
+
+    app.connect("env-before-read-docs", init_xref_docs)
+    app.connect("env-merge-info", merge_xref_info)
+    app.add_directive("kernel-include", KernelInclude)
+    app.connect("missing-reference", check_missing_refs)
+
+    return {
+        "version": __version__,
+        "parallel_read_safe": True,
+        "parallel_write_safe": True,
+    }
-- 
2.50.1



* [PATCH 13/24] docs: kernel_include.py: move rawtext logic to separate functions
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (11 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 12/24] docs: kernel_include.py: generate warnings for broken refs Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 14/24] docs: kernel_include.py: move range logic to a separate function Mauro Carvalho Chehab
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

The run() function is too complex. Merge run() and _run() into
a single function and move the read logic to separate functions.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 82 ++++++++++++++------------
 1 file changed, 43 insertions(+), 39 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 0a3e5377dd1e..ef86ee9e79d6 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -92,7 +92,47 @@ class KernelInclude(Include):
         'exception-file': directives.unchanged,
     })
 
+    def read_rawtext(self, path, encoding):
+            """Read and process file content with error handling"""
+            try:
+                self.state.document.settings.record_dependencies.add(path)
+                include_file = io.FileInput(source_path=path,
+                                            encoding=encoding,
+                                            error_handler=self.state.document.settings.input_encoding_error_handler)
+            except UnicodeEncodeError:
+                raise self.severe('Problems with directive path:\n'
+                                'Cannot encode input file path "%s" '
+                                '(wrong locale?).' % SafeString(path))
+            except IOError as error:
+                raise self.severe('Problems with directive path:\n%s.' % ErrorString(error))
+
+            try:
+                return include_file.read()
+            except UnicodeError as error:
+                raise self.severe('Problem with directive:\n%s' % ErrorString(error))
+
+    def read_rawtext_with_xrefs(self, env, path):
+        parser = ParseDataStructs()
+        parser.parse_file(path)
+
+        if 'exception-file' in self.options:
+            source_dir = os.path.dirname(os.path.abspath(
+                self.state_machine.input_lines.source(
+                    self.lineno - self.state_machine.input_offset - 1)))
+            exceptions_file = os.path.join(source_dir, self.options['exception-file'])
+            parser.process_exceptions(exceptions_file)
+
+        if self.options.get("start-line") or self.options.get("end-line"):
+            raise self.severe('generate-cross-refs can\'t be used with "start-line" or "end-line"')
+
+        # Store references on a symbol dict to be used at check time
+        if 'warn-broken' in self.options:
+            env._xref_files.add(path)
+
+        return parser.gen_output()
+
     def run(self):
+        """Include a file as part of the content of this reST file."""
         env = self.state.document.settings.env
         path = os.path.realpath(os.path.expandvars(self.arguments[0]))
 
@@ -105,12 +145,6 @@ class KernelInclude(Include):
 
         env.note_dependency(os.path.abspath(path))
 
-        # return super(KernelInclude, self).run() # won't work, see HINTs in _run()
-        return self._run(env)
-
-    def _run(self, env):
-        """Include a file as part of the content of this reST file."""
-
         # HINT: I had to copy&paste the whole Include.run method. I'am not happy
         # with this, but due to security reasons, the Include.run method does
         # not allow absolute or relative pathnames pointing to locations *above*
@@ -139,47 +173,17 @@ class KernelInclude(Include):
 
         # Get optional arguments to related to cross-references generation
         if 'generate-cross-refs' in self.options:
-            parser = ParseDataStructs()
-            parser.parse_file(path)
-
-            exceptions_file = self.options.get('exception-file')
-            if exceptions_file:
-                exceptions_file = os.path.join(source_dir, exceptions_file)
-                parser.process_exceptions(exceptions_file)
+            rawtext = self.read_rawtext_with_xrefs(env, path)
 
             title = os.path.basename(path)
-            rawtext = parser.gen_output()
+
             if startline or endline:
                 raise self.severe('generate-cross-refs can\'t be used together with "start-line" or "end-line"')
 
             if "code" not in self.options:
                 rawtext = ".. parsed-literal::\n\n" + rawtext
-
-            # Store references on a symbol dict to be used at check time
-            if 'warn-broken' in self.options:
-                env._xref_files.add(path)
         else:
-            try:
-                self.state.document.settings.record_dependencies.add(path)
-                include_file = io.FileInput(source_path=path, encoding=encoding,
-                                            error_handler=e_handler)
-            except UnicodeEncodeError:
-                raise self.severe('Problems with "%s" directive path:\n'
-                                'Cannot encode input file path "%s" '
-                                "(wrong locale?)." % (self.name, SafeString(path)))
-            except IOError as error:
-                raise self.severe('Problems with "%s" directive path:\n%s.'
-                                % (self.name, ErrorString(error)))
-
-            try:
-                if startline or (endline is not None):
-                    lines = include_file.readlines()
-                    rawtext = "".join(lines[startline:endline])
-                else:
-                    rawtext = include_file.read()
-            except UnicodeError as error:
-                raise self.severe('Problem with "%s" directive:\n%s' %
-                                (self.name, ErrorString(error)))
+            rawtext = self.read_rawtext(path, encoding)
 
         # start-after/end-before: no restrictions on newlines in match-text,
         # and no restrictions on matching inside lines vs. line boundaries
-- 
2.50.1



* [PATCH 14/24] docs: kernel_include.py: move range logic to a separate function
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (12 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 13/24] docs: kernel_include.py: move rawtext logic to separate functions Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 15/24] docs: kernel_include.py: remove range restriction for gen docs Mauro Carvalho Chehab
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

Clean up the run() function by moving the range logic to a separate
function.

Here, I ended up checking the current Sphinx implementation, as it
has some extra logic for the range check.
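
For reference, the range logic boils down to something like the
standalone sketch below. It is a simplified model that takes plain
keyword arguments instead of self.options and raises ValueError instead
of self.severe(); the function signature and the sample header text are
assumptions made for the example.

```python
def apply_range(rawtext, start_line=None, end_line=None,
                start_after=None, end_before=None):
    """Restrict rawtext first by line numbers, then by marker texts."""
    # Line-based slicing: 0-based start, end line is excluded
    if start_line or (end_line is not None):
        lines = rawtext.splitlines()
        rawtext = "\n".join(lines[start_line:end_line])

    # Skip everything before *and including* the start-after text
    if start_after:
        idx = rawtext.find(start_after)
        if idx < 0:
            raise ValueError('"start-after" text not found')
        rawtext = rawtext[idx + len(start_after):]

    # Skip everything from the end-before text onwards
    if end_before:
        idx = rawtext.find(end_before)
        if idx < 0:
            raise ValueError('"end-before" text not found')
        rawtext = rawtext[:idx]

    return rawtext


header = "/* SPDX */\n#define A 1\n#define B 2\n#define C 3\n"
by_lines = apply_range(header, start_line=1, end_line=3)
by_text = apply_range(header, start_after="/* SPDX */\n",
                      end_before="#define C")
```

Both calls select the two middle #define lines. The line-based variant
loses the trailing newline, as splitlines() followed by "\n".join()
does not restore it, while the text-based variant keeps it.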

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 51 +++++++++++++++++---------
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index ef86ee9e79d6..c5f4f34e22cb 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -131,6 +131,38 @@ class KernelInclude(Include):
 
         return parser.gen_output()
 
+    def apply_range(self, rawtext):
+        # Get to-be-included content
+        startline = self.options.get('start-line', None)
+        endline = self.options.get('end-line', None)
+        try:
+            if startline or (endline is not None):
+                lines = rawtext.splitlines()
+                rawtext = '\n'.join(lines[startline:endline])
+        except UnicodeError as error:
+            raise self.severe(f'Problem with "{self.name}" directive:\n'
+                              + io.error_string(error))
+        # start-after/end-before: no restrictions on newlines in match-text,
+        # and no restrictions on matching inside lines vs. line boundaries
+        after_text = self.options.get("start-after", None)
+        if after_text:
+            # skip content in rawtext before *and incl.* a matching text
+            after_index = rawtext.find(after_text)
+            if after_index < 0:
+                raise self.severe('Problem with "start-after" option of "%s" '
+                                  "directive:\nText not found." % self.name)
+            rawtext = rawtext[after_index + len(after_text) :]
+        before_text = self.options.get("end-before", None)
+        if before_text:
+            # skip content in rawtext after *and incl.* a matching text
+            before_index = rawtext.find(before_text)
+            if before_index < 0:
+                raise self.severe('Problem with "end-before" option of "%s" '
+                                  "directive:\nText not found." % self.name)
+            rawtext = rawtext[:before_index]
+
+        return rawtext
+
     def run(self):
         """Include a file as part of the content of this reST file."""
         env = self.state.document.settings.env
@@ -185,24 +217,7 @@ class KernelInclude(Include):
         else:
             rawtext = self.read_rawtext(path, encoding)
 
-        # start-after/end-before: no restrictions on newlines in match-text,
-        # and no restrictions on matching inside lines vs. line boundaries
-        after_text = self.options.get("start-after", None)
-        if after_text:
-            # skip content in rawtext before *and incl.* a matching text
-            after_index = rawtext.find(after_text)
-            if after_index < 0:
-                raise self.severe('Problem with "start-after" option of "%s" '
-                                  "directive:\nText not found." % self.name)
-            rawtext = rawtext[after_index + len(after_text) :]
-        before_text = self.options.get("end-before", None)
-        if before_text:
-            # skip content in rawtext after *and incl.* a matching text
-            before_index = rawtext.find(before_text)
-            if before_index < 0:
-                raise self.severe('Problem with "end-before" option of "%s" '
-                                  "directive:\nText not found." % self.name)
-            rawtext = rawtext[:before_index]
+        rawtext = self.apply_range(rawtext)
 
         include_lines = statemachine.string2lines(rawtext, tab_width,
                                                   convert_whitespace=True)
-- 
2.50.1



* [PATCH 15/24] docs: kernel_include.py: remove range restriction for gen docs
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (13 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 14/24] docs: kernel_include.py: move range logic to a separate function Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 16/24] docs: kernel_include.py: move code and literal functions Mauro Carvalho Chehab
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

Originally, parse-headers generated an output whose first two
lines set up a literal block.

The script now gets only the actual parsed data, without that
preamble, so it is now safe to allow the start-line and end-line
parameters to be handled.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
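For reference, the line-range handling that this change unblocks boils
down to a list slice. A minimal sketch, assuming start-line and end-line
act as Python slice bounds (0-based start, exclusive end) the way
apply_range() uses them; the helper name slice_lines() is hypothetical
and is not part of kernel_include.py:

```python
def slice_lines(rawtext, startline=None, endline=None):
    """Keep only lines[startline:endline] of rawtext (hypothetical helper)."""
    if startline or (endline is not None):
        lines = rawtext.splitlines()
        # Joining drops the trailing newline of the selected range
        rawtext = "\n".join(lines[startline:endline])
    return rawtext


header = "struct a {\n\tint x;\n};\nstruct b {\n\tint y;\n};\n"

# Select only the second struct (lines 3, 4 and 5, counting from zero)
print(slice_lines(header, 3, 6))
```

When neither bound is given, the text is returned untouched, which is
why the slice is guarded by the "if".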
 Documentation/sphinx/kernel_include.py | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index c5f4f34e22cb..4cdd1c77982e 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -122,9 +122,6 @@ class KernelInclude(Include):
             exceptions_file = os.path.join(source_dir, self.options['exception-file'])
             parser.process_exceptions(exceptions_file)
 
-        if self.options.get("start-line") or self.options.get("end-line"):
-            raise self.severe('generate-cross-refs can\'t be used with "start-line" or "end-line"')
-
         # Store references on a symbol dict to be used at check time
         if 'warn-broken' in self.options:
             env._xref_files.add(path)
@@ -209,9 +206,6 @@ class KernelInclude(Include):
 
             title = os.path.basename(path)
 
-            if startline or endline:
-                raise self.severe('generate-cross-refs can\'t be used together with "start-line" or "end-line"')
-
             if "code" not in self.options:
                 rawtext = ".. parsed-literal::\n\n" + rawtext
         else:
-- 
2.50.1



* [PATCH 16/24] docs: kernel_include.py: move code and literal functions
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (14 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 15/24] docs: kernel_include.py: remove range restriction for gen docs Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 17/24] docs: kernel_include.py: add support to generate a TOC table Mauro Carvalho Chehab
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

Simplify run() even more by moving the code which handles
code and literal blocks to their own functions.

No functional changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 100 +++++++++++++++----------
 1 file changed, 59 insertions(+), 41 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 4cdd1c77982e..0909eb3a07ea 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -160,6 +160,52 @@ class KernelInclude(Include):
 
         return rawtext
 
+    def literal(self, path, tab_width, rawtext):
+        """Output a literal block"""
+
+        # Convert tabs to spaces, if `tab_width` is positive.
+        if tab_width >= 0:
+            text = rawtext.expandtabs(tab_width)
+        else:
+            text = rawtext
+        literal_block = nodes.literal_block(rawtext, source=path,
+                                            classes=self.options.get("class", []))
+        literal_block.line = 1
+        self.add_name(literal_block)
+        if "number-lines" in self.options:
+            try:
+                startline = int(self.options["number-lines"] or 1)
+            except ValueError:
+                raise self.error(":number-lines: with non-integer start value")
+            endline = startline + len(include_lines)
+            if text.endswith("\n"):
+                text = text[:-1]
+            tokens = NumberLines([([], text)], startline, endline)
+            for classes, value in tokens:
+                if classes:
+                    literal_block += nodes.inline(value, value,
+                                                    classes=classes)
+                else:
+                    literal_block += nodes.Text(value, value)
+        else:
+            literal_block += nodes.Text(text, text)
+        return [literal_block]
+
+    def code(self, path, include_lines):
+        """Output a code block"""
+
+        self.options["source"] = path
+        codeblock = CodeBlock(self.name,
+                                [self.options.pop("code")],  # arguments
+                                self.options,
+                                include_lines,
+                                self.lineno,
+                                self.content_offset,
+                                self.block_text,
+                                self.state,
+                                self.state_machine)
+        return codeblock.run()
+
     def run(self):
         """Include a file as part of the content of this reST file."""
         env = self.state.document.settings.env
@@ -200,6 +246,13 @@ class KernelInclude(Include):
         startline = self.options.get("start-line", None)
         endline = self.options.get("end-line", None)
 
+        if "literal" in self.options:
+            ouptut_type = "literal"
+        elif "code" in self.options:
+            ouptut_type = "code"
+        else:
+            ouptut_type = "normal"
+
         # Get optional arguments to related to cross-references generation
         if 'generate-cross-refs' in self.options:
             rawtext = self.read_rawtext_with_xrefs(env, path)
@@ -213,50 +266,15 @@ class KernelInclude(Include):
 
         rawtext = self.apply_range(rawtext)
 
+        if ouptut_type == "literal":
+            return self.literal(path, tab_width, rawtext)
+
         include_lines = statemachine.string2lines(rawtext, tab_width,
                                                   convert_whitespace=True)
-        if "literal" in self.options:
-            # Convert tabs to spaces, if `tab_width` is positive.
-            if tab_width >= 0:
-                text = rawtext.expandtabs(tab_width)
-            else:
-                text = rawtext
-            literal_block = nodes.literal_block(rawtext, source=path,
-                                                classes=self.options.get("class", [])
-            )
-            literal_block.line = 1
-            self.add_name(literal_block)
-            if "number-lines" in self.options:
-                try:
-                    startline = int(self.options["number-lines"] or 1)
-                except ValueError:
-                    raise self.error(":number-lines: with non-integer start value")
-                endline = startline + len(include_lines)
-                if text.endswith("\n"):
-                    text = text[:-1]
-                tokens = NumberLines([([], text)], startline, endline)
-                for classes, value in tokens:
-                    if classes:
-                        literal_block += nodes.inline(value, value,
-                                                      classes=classes)
-                    else:
-                        literal_block += nodes.Text(value, value)
-            else:
-                literal_block += nodes.Text(text, text)
-            return [literal_block]
 
-        if "code" in self.options:
-            self.options["source"] = path
-            codeblock = CodeBlock(self.name,
-                                  [self.options.pop("code")],  # arguments
-                                  self.options,
-                                  include_lines,  # content
-                                  self.lineno,
-                                  self.content_offset,
-                                  self.block_text,
-                                  self.state,
-                                  self.state_machine)
-            return codeblock.run()
+        if ouptut_type == "code":
+            return self.code(path, include_lines)
+
         self.state_machine.insert_input(include_lines, path)
         return []
 
-- 
2.50.1



* [PATCH 17/24] docs: kernel_include.py: add support to generate a TOC table
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (15 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 16/24] docs: kernel_include.py: move code and literal functions Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 18/24] docs: kernel_include.py: append line numbers to better report errors Mauro Carvalho Chehab
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

When generate-cross-refs is used, instead of just implementing
the default of generating a literal block, we can also
generate a ReST file as a TOC.

The advantage is that, by being a ReST file, missing references
will point to the place inside the header file that has the
broken link.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 36 ++++++++++++++++----------
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 0909eb3a07ea..79682408105e 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -89,6 +89,7 @@ class KernelInclude(Include):
     option_spec.update({
         'generate-cross-refs': directives.flag,
         'warn-broken': directives.flag,
+        'toc': directives.flag,
         'exception-file': directives.unchanged,
     })
 
@@ -111,7 +112,7 @@ class KernelInclude(Include):
             except UnicodeError as error:
                 raise self.severe('Problem with directive:\n%s' % ErrorString(error))
 
-    def read_rawtext_with_xrefs(self, env, path):
+    def read_rawtext_with_xrefs(self, env, path, output_type):
         parser = ParseDataStructs()
         parser.parse_file(path)
 
@@ -126,7 +127,10 @@ class KernelInclude(Include):
         if 'warn-broken' in self.options:
             env._xref_files.add(path)
 
-        return parser.gen_output()
+        if output_type == "toc":
+            return parser.gen_toc()
+
+        return ".. parsed-literal::\n\n" + parser.gen_output()
 
     def apply_range(self, rawtext):
         # Get to-be-included content
@@ -243,39 +247,43 @@ class KernelInclude(Include):
         e_handler = self.state.document.settings.input_encoding_error_handler
         tab_width = self.options.get("tab-width",
                                      self.state.document.settings.tab_width)
-        startline = self.options.get("start-line", None)
-        endline = self.options.get("end-line", None)
 
         if "literal" in self.options:
-            ouptut_type = "literal"
+            output_type = "literal"
         elif "code" in self.options:
-            ouptut_type = "code"
+            output_type = "code"
         else:
-            ouptut_type = "normal"
+            output_type = "rst"
 
         # Get optional arguments to related to cross-references generation
-        if 'generate-cross-refs' in self.options:
-            rawtext = self.read_rawtext_with_xrefs(env, path)
+        if "generate-cross-refs" in self.options:
+            if "toc" in self.options:
+                 output_type = "toc"
+
+            rawtext = self.read_rawtext_with_xrefs(env, path, output_type)
+
+            # When :generate-cross-refs: is used, the input is always a C
+            # file, so it has to be handled as a parsed-literal
+            if output_type == "rst":
+                output_type = "literal"
 
             title = os.path.basename(path)
-
-            if "code" not in self.options:
-                rawtext = ".. parsed-literal::\n\n" + rawtext
         else:
             rawtext = self.read_rawtext(path, encoding)
 
         rawtext = self.apply_range(rawtext)
 
-        if ouptut_type == "literal":
+        if output_type == "literal":
             return self.literal(path, tab_width, rawtext)
 
         include_lines = statemachine.string2lines(rawtext, tab_width,
                                                   convert_whitespace=True)
 
-        if ouptut_type == "code":
+        if output_type == "code":
             return self.code(path, include_lines)
 
         self.state_machine.insert_input(include_lines, path)
+
         return []
 
 # ==============================================================================
-- 
2.50.1



* [PATCH 18/24] docs: kernel_include.py: append line numbers to better report errors
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (16 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 17/24] docs: kernel_include.py: add support to generate a TOC table Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 19/24] docs: kernel_include.py: move apply_range() and add a docstring Mauro Carvalho Chehab
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

It is better to point to the original line of code that generated
an error than to point to the beginning of a directive.

Add support for it. It should be noted that this won't work
for literal or code blocks, as Sphinx will ignore it, pointing
to the beginning of the directive. Yet, when the output is known
to be in ReST format, like on TOC, this makes the error a lot
easier to handle.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
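The idea can be sketched without docutils: every included line is
stored together with the file it came from and its offset inside that
file, so an error can be traced back to the header instead of the
directive. This is only an illustration; tag_lines() is a hypothetical
stand-in for filling a docutils ViewList, and plain tuples replace the
ViewList entries:

```python
def tag_lines(include_lines, path, startline=None):
    """Return (line, source, offset) tuples, one per included line.

    Hypothetical stand-in for appending to a docutils ViewList: each
    line carries the file it came from and its offset in that file.
    """
    # Same offset rule as the patch: shift by start-line when it is set
    offset = startline - 1 if startline and startline > 0 else 0
    return [(line, path, ln)
            for ln, line in enumerate(include_lines, start=offset)]


for entry in tag_lines(["first", "second"], "videodev2.h", startline=10):
    print(entry)
```

The real code hands such entries to the state machine, which is what
lets the reported position follow the original file.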
 Documentation/sphinx/kernel_include.py | 81 ++++++++++++++------------
 1 file changed, 44 insertions(+), 37 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 79682408105e..90ed8428f776 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -60,6 +60,7 @@ import re
 import sys
 
 from docutils import io, nodes, statemachine
+from docutils.statemachine import ViewList
 from docutils.utils.error_reporting import SafeString, ErrorString
 from docutils.parsers.rst import directives
 from docutils.parsers.rst.directives.body import CodeBlock, NumberLines
@@ -112,7 +113,14 @@ class KernelInclude(Include):
             except UnicodeError as error:
                 raise self.severe('Problem with directive:\n%s' % ErrorString(error))
 
-    def read_rawtext_with_xrefs(self, env, path, output_type):
+    def xref_text(self, env, path, tab_width):
+        """
+        Read and add contents from a C file parsed to have cross references.
+
+        There are two types of supported output here:
+        - A C source code with cross-references;
+        - a TOC table containing cross references.
+        """
         parser = ParseDataStructs()
         parser.parse_file(path)
 
@@ -127,10 +135,33 @@ class KernelInclude(Include):
         if 'warn-broken' in self.options:
             env._xref_files.add(path)
 
-        if output_type == "toc":
-            return parser.gen_toc()
+        if "toc" in self.options:
+            rawtext = parser.gen_toc()
+        else:
+            rawtext = ".. parsed-literal::\n\n" + parser.gen_output()
+            self.apply_range(rawtext)
 
-        return ".. parsed-literal::\n\n" + parser.gen_output()
+        title = os.path.basename(path)
+
+        include_lines = statemachine.string2lines(rawtext, tab_width,
+                                                  convert_whitespace=True)
+
+        # Append line numbers data
+
+        startline = self.options.get('start-line', None)
+
+        result = ViewList()
+        if startline and startline > 0:
+            offset = startline - 1
+        else:
+            offset = 0
+
+        for ln, line in enumerate(include_lines, start=offset):
+            result.append(line, path, ln)
+
+        self.state_machine.insert_input(result, path)
+
+        return []
 
     def apply_range(self, rawtext):
         # Get to-be-included content
@@ -195,9 +226,12 @@ class KernelInclude(Include):
             literal_block += nodes.Text(text, text)
         return [literal_block]
 
-    def code(self, path, include_lines):
+    def code(self, path, tab_width):
         """Output a code block"""
 
+        include_lines = statemachine.string2lines(rawtext, tab_width,
+                                                  convert_whitespace=True)
+
         self.options["source"] = path
         codeblock = CodeBlock(self.name,
                                 [self.options.pop("code")],  # arguments
@@ -244,47 +278,20 @@ class KernelInclude(Include):
 
         encoding = self.options.get("encoding",
                                     self.state.document.settings.input_encoding)
-        e_handler = self.state.document.settings.input_encoding_error_handler
         tab_width = self.options.get("tab-width",
                                      self.state.document.settings.tab_width)
 
-        if "literal" in self.options:
-            output_type = "literal"
-        elif "code" in self.options:
-            output_type = "code"
-        else:
-            output_type = "rst"
-
         # Get optional arguments to related to cross-references generation
         if "generate-cross-refs" in self.options:
-            if "toc" in self.options:
-                 output_type = "toc"
-
-            rawtext = self.read_rawtext_with_xrefs(env, path, output_type)
-
-            # When :generate-cross-refs: is used, the input is always a C
-            # file, so it has to be handled as a parsed-literal
-            if output_type == "rst":
-                output_type = "literal"
-
-            title = os.path.basename(path)
-        else:
-            rawtext = self.read_rawtext(path, encoding)
+            return self.xref_text(env, path, tab_width)
 
+        rawtext = self.read_rawtext(path, encoding)
         rawtext = self.apply_range(rawtext)
 
-        if output_type == "literal":
-            return self.literal(path, tab_width, rawtext)
+        if "code" in self.options:
+            return self.code(path, tab_width, rawtext)
 
-        include_lines = statemachine.string2lines(rawtext, tab_width,
-                                                  convert_whitespace=True)
-
-        if output_type == "code":
-            return self.code(path, include_lines)
-
-        self.state_machine.insert_input(include_lines, path)
-
-        return []
+        return self.literal(path, tab_width, rawtext)
 
 # ==============================================================================
 
-- 
2.50.1



* [PATCH 19/24] docs: kernel_include.py: move apply_range() and add a docstring
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (17 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 18/24] docs: kernel_include.py: append line numbers to better report errors Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 20/24] docs: kernel_include.py: remove line numbers from parsed-literal Mauro Carvalho Chehab
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

While not required, it is better to have caller functions at the end.
As apply_range() is now called by xref_text(), move it to be
before the latter.

No functional changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
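As a reading aid, the start-after/end-before half of apply_range() can
be modelled as two substring cuts. A minimal sketch under stated
assumptions: trim_between() is a hypothetical name, and ValueError
replaces the self.severe() errors that the directive raises:

```python
def trim_between(rawtext, after_text=None, before_text=None):
    """Drop everything up to and including after_text, and everything
    from before_text onwards (hypothetical helper)."""
    if after_text:
        after_index = rawtext.find(after_text)
        if after_index < 0:
            raise ValueError('"start-after" text not found')
        rawtext = rawtext[after_index + len(after_text):]
    if before_text:
        # Searched in the already trimmed text, so it must come later
        before_index = rawtext.find(before_text)
        if before_index < 0:
            raise ValueError('"end-before" text not found')
        rawtext = rawtext[:before_index]
    return rawtext


text = "/* BEGIN */\nint x;\n/* END */\nint y;\n"
print(repr(trim_between(text, "/* BEGIN */", "/* END */")))
```

Both markers are matched as plain text, with no restriction on line
boundaries, and neither marker is kept in the result.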
 Documentation/sphinx/kernel_include.py | 68 ++++++++++++++------------
 1 file changed, 36 insertions(+), 32 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 90ed8428f776..fd4887f80577 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -113,6 +113,42 @@ class KernelInclude(Include):
             except UnicodeError as error:
                 raise self.severe('Problem with directive:\n%s' % ErrorString(error))
 
+    def apply_range(self, rawtext):
+        """
+        Handles start-line, end-line, start-after and end-before parameters
+        """
+
+        # Get to-be-included content
+        startline = self.options.get('start-line', None)
+        endline = self.options.get('end-line', None)
+        try:
+            if startline or (endline is not None):
+                lines = rawtext.splitlines()
+                rawtext = '\n'.join(lines[startline:endline])
+        except UnicodeError as error:
+            raise self.severe(f'Problem with "{self.name}" directive:\n'
+                              + io.error_string(error))
+        # start-after/end-before: no restrictions on newlines in match-text,
+        # and no restrictions on matching inside lines vs. line boundaries
+        after_text = self.options.get("start-after", None)
+        if after_text:
+            # skip content in rawtext before *and incl.* a matching text
+            after_index = rawtext.find(after_text)
+            if after_index < 0:
+                raise self.severe('Problem with "start-after" option of "%s" '
+                                  "directive:\nText not found." % self.name)
+            rawtext = rawtext[after_index + len(after_text) :]
+        before_text = self.options.get("end-before", None)
+        if before_text:
+            # skip content in rawtext after *and incl.* a matching text
+            before_index = rawtext.find(before_text)
+            if before_index < 0:
+                raise self.severe('Problem with "end-before" option of "%s" '
+                                  "directive:\nText not found." % self.name)
+            rawtext = rawtext[:before_index]
+
+        return rawtext
+
     def xref_text(self, env, path, tab_width):
         """
         Read and add contents from a C file parsed to have cross references.
@@ -163,38 +199,6 @@ class KernelInclude(Include):
 
         return []
 
-    def apply_range(self, rawtext):
-        # Get to-be-included content
-        startline = self.options.get('start-line', None)
-        endline = self.options.get('end-line', None)
-        try:
-            if startline or (endline is not None):
-                lines = rawtext.splitlines()
-                rawtext = '\n'.join(lines[startline:endline])
-        except UnicodeError as error:
-            raise self.severe(f'Problem with "{self.name}" directive:\n'
-                              + io.error_string(error))
-        # start-after/end-before: no restrictions on newlines in match-text,
-        # and no restrictions on matching inside lines vs. line boundaries
-        after_text = self.options.get("start-after", None)
-        if after_text:
-            # skip content in rawtext before *and incl.* a matching text
-            after_index = rawtext.find(after_text)
-            if after_index < 0:
-                raise self.severe('Problem with "start-after" option of "%s" '
-                                  "directive:\nText not found." % self.name)
-            rawtext = rawtext[after_index + len(after_text) :]
-        before_text = self.options.get("end-before", None)
-        if before_text:
-            # skip content in rawtext after *and incl.* a matching text
-            before_index = rawtext.find(before_text)
-            if before_index < 0:
-                raise self.severe('Problem with "end-before" option of "%s" '
-                                  "directive:\nText not found." % self.name)
-            rawtext = rawtext[:before_index]
-
-        return rawtext
-
     def literal(self, path, tab_width, rawtext):
         """Output a literal block"""
 
-- 
2.50.1



* [PATCH 20/24] docs: kernel_include.py: remove line numbers from parsed-literal
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (18 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 19/24] docs: kernel_include.py: move apply_range() and add a docstring Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 21/24] docs: kernel_include.py: remove Include class inheritance Mauro Carvalho Chehab
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

When a parsed-literal directive is added to rawtext, cross
references are properly displayed, but Sphinx ignores
line numbers. So, it is not worth adding them.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index fd4887f80577..3a1753486319 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -171,13 +171,24 @@ class KernelInclude(Include):
         if 'warn-broken' in self.options:
             env._xref_files.add(path)
 
-        if "toc" in self.options:
-            rawtext = parser.gen_toc()
-        else:
+        if "toc" not in self.options:
+
             rawtext = ".. parsed-literal::\n\n" + parser.gen_output()
             self.apply_range(rawtext)
 
-        title = os.path.basename(path)
+            include_lines = statemachine.string2lines(rawtext, tab_width,
+                                                      convert_whitespace=True)
+
+            # Sphinx always blame the ".. <directive>", so placing
+            # line numbers here won't make any difference
+
+            self.state_machine.insert_input(include_lines, path)
+            return []
+
+        # TOC output is a ReST file, not a literal. So, we can add line
+        # numbers
+
+        rawtext = parser.gen_toc()
 
         include_lines = statemachine.string2lines(rawtext, tab_width,
                                                   convert_whitespace=True)
-- 
2.50.1



* [PATCH 21/24] docs: kernel_include.py: remove Include class inheritance
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (19 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 20/24] docs: kernel_include.py: remove line numbers from parsed-literal Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 22/24] docs: kernel_include.py: document all supported parameters Mauro Carvalho Chehab
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

While the original code came from the docutils Include class,
that class is monolithic: it has only one function that does
everything, and 3 variables that are used:

	- required_arguments
	- optional_arguments
	- option_spec

So, basically those are the only members that remain from
the original class, but hey! Those are the same vars that every
other Sphinx directive extension has to define!

In summary, keeping inheritance here doesn't make much sense.

Worse than that, kernel-include doesn't support the full set
of options that the current Include class has, and it also
has its own set of options.

So, let's fill in the argument vars with what it does
support, dropping the rest.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
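To show why an explicit option_spec is enough, here is a rough model of
how such a table is used: each option name maps to a converter that
validates the raw string. This is not the docutils implementation;
flag(), OPTION_SPEC and convert_options() are hypothetical names, and
only a few of the options from the patch are listed:

```python
def flag(argument):
    """Accept an option only when it has no value (hypothetical converter)."""
    if argument and argument.strip():
        raise ValueError("no argument is allowed")
    return None


# Subset of the options declared by the patch, for illustration only
OPTION_SPEC = {
    "literal": flag,
    "tab-width": int,
    "start-line": int,
    "end-line": int,
    "toc": flag,
}


def convert_options(raw_options, option_spec=OPTION_SPEC):
    """Validate raw {name: string} options against option_spec."""
    converted = {}
    for name, value in raw_options.items():
        if name not in option_spec:
            raise KeyError(f'unknown option: "{name}"')
        converted[name] = option_spec[name](value)
    return converted


print(convert_options({"start-line": "10", "toc": ""}))
```

An option that is missing from the table is rejected, which is the
practical effect of listing only what kernel-include implements.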
 Documentation/sphinx/kernel_include.py | 40 ++++++++++++++++++++------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index 3a1753486319..e6f734476ab3 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -62,9 +62,8 @@ import sys
 from docutils import io, nodes, statemachine
 from docutils.statemachine import ViewList
 from docutils.utils.error_reporting import SafeString, ErrorString
-from docutils.parsers.rst import directives
+from docutils.parsers.rst import Directive, directives
 from docutils.parsers.rst.directives.body import CodeBlock, NumberLines
-from docutils.parsers.rst.directives.misc import Include
 
 from sphinx.util import logging
 
@@ -81,18 +80,43 @@ RE_SIMPLE_REF = re.compile(r'`([^`]+)`')
 
 
 # ==============================================================================
-class KernelInclude(Include):
-    """KernelInclude (``kernel-include``) directive"""
+class KernelInclude(Directive):
+    """
+    KernelInclude (``kernel-include``) directive
 
-    # Add extra options
-    option_spec = Include.option_spec.copy()
+    Most of the stuff here came from Include directive defined at:
+        docutils/parsers/rst/directives/misc.py
 
-    option_spec.update({
+    Yet, overriding the class don't has any benefits: the original class
+    only have run() and argument list. Not all of them are implemented,
+    when checked against latest Sphinx version, as with time more arguments
+    were added.
+
+    So, keep its own list of supported arguments
+    """
+
+    required_arguments = 1
+    optional_arguments = 0
+    final_argument_whitespace = True
+    option_spec = {
+        'literal': directives.flag,
+        'code': directives.unchanged,
+        'encoding': directives.encoding,
+        'tab-width': int,
+        'start-line': int,
+        'end-line': int,
+        'start-after': directives.unchanged_required,
+        'end-before': directives.unchanged_required,
+        # ignored except for 'literal' or 'code':
+        'number-lines': directives.unchanged,  # integer or None
+        'class': directives.class_option,
+
+        # Arguments that aren't from Sphinx Include directive
         'generate-cross-refs': directives.flag,
         'warn-broken': directives.flag,
         'toc': directives.flag,
         'exception-file': directives.unchanged,
-    })
+    }
 
     def read_rawtext(self, path, encoding):
             """Read and process file content with error handling"""
-- 
2.50.1



* [PATCH 22/24] docs: kernel_include.py: document all supported parameters
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (20 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 21/24] docs: kernel_include.py: remove Include class inheritance Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 23/24] scripts: sphinx-build-wrapper: get rid of uapi/media Makefile Mauro Carvalho Chehab
                   ` (2 subsequent siblings)
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

As this directive is now effectively a fork of the Sphinx Include
directive, update its docstring to document the parameters that
are actually implemented.

Let's use :param: for parameters, as defined at:
https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kernel_include.py | 88 +++++++++++++++++---------
 1 file changed, 58 insertions(+), 30 deletions(-)
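
For illustration only (not part of this patch): a minimal sketch of how a
docstring written with the :param: convention could be checked against the
directive's option list. The helper names and the shortened DOCSTRING and
OPTION_NAMES values are hypothetical; only the option names themselves are
taken from this series.

```python
import re

# Shortened, hypothetical excerpt of a module docstring using :param: fields.
DOCSTRING = """
:param literal:
    If present, the included file is inserted as a literal block.

:param generate-cross-refs:
    Convert C data structures into cross-references.

:param exception-file:
    (Used with generate-cross-refs)
"""

# Subset of the option names that the directive lists in option_spec.
OPTION_NAMES = {"literal", "generate-cross-refs", "exception-file", "toc"}


def documented_params(docstring):
    """Return the set of names declared with ':param <name>:' fields."""
    return set(re.findall(r"^:param\s+([\w-]+):", docstring, flags=re.MULTILINE))


def undocumented(option_names, docstring):
    """Return the option names without a ':param:' entry, sorted."""
    return sorted(set(option_names) - documented_params(docstring))


if __name__ == "__main__":
    print(undocumented(OPTION_NAMES, DOCSTRING))
```

Run against the full docstring and option_spec from this series, a check
like this would likely point at 'toc', which is accepted as an option but has
no :param: entry in the docstring below.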

diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py
index e6f734476ab3..23566ab74866 100755
--- a/Documentation/sphinx/kernel_include.py
+++ b/Documentation/sphinx/kernel_include.py
@@ -2,53 +2,81 @@
 # SPDX-License-Identifier: GPL-2.0
 # pylint: disable=R0903, R0912, R0914, R0915, C0209,W0707
 
+
 """
-    kernel-include
-    ~~~~~~~~~~~~~~
+Implementation of the ``kernel-include`` reST-directive.
 
-    Implementation of the ``kernel-include`` reST-directive.
+:copyright:  Copyright (C) 2016  Markus Heiser
+:license:    GPL Version 2, June 1991 see linux/COPYING for details.
 
-    :copyright:  Copyright (C) 2016  Markus Heiser
-    :license:    GPL Version 2, June 1991 see linux/COPYING for details.
+The ``kernel-include`` reST-directive is a replacement for the ``include``
+directive. The ``kernel-include`` directive expands environment variables in
+the path name and allows including files from arbitrary locations.
 
-    The ``kernel-include`` reST-directive is a replacement for the ``include``
-    directive. The ``kernel-include`` directive expand environment variables in
-    the path name and allows to include files from arbitrary locations.
+.. hint::
 
-    .. hint::
+    Including files from arbitrary locations (e.g. from ``/etc``) is a
+    security risk for builders. This is why the ``include`` directive from
+    docutils *prohibits* pathnames pointing to locations *above* the filesystem
+    tree where the reST document with the include directive is placed.
 
-      Including files from arbitrary locations (e.g. from ``/etc``) is a
-      security risk for builders. This is why the ``include`` directive from
-      docutils *prohibit* pathnames pointing to locations *above* the filesystem
-      tree where the reST document with the include directive is placed.
+Substrings of the form $name or ${name} are replaced by the value of
+environment variable name. Malformed variable names and references to
+non-existing variables are left unchanged.
 
-    Substrings of the form $name or ${name} are replaced by the value of
-    environment variable name. Malformed variable names and references to
-    non-existing variables are left unchanged.
+**Supported Sphinx Include Options**:
 
-    This extension overrides Sphinx include directory, adding some extra
-    arguments:
+:param literal:
+    If present, the included file is inserted as a literal block.
 
-    1. :generate-cross-refs:
+:param code:
+    Specify the language for syntax highlighting (e.g., 'c', 'python').
 
-        If present, instead of reading the file, it calls ParseDataStructs()
-        class, which converts C data structures into cross-references to
-        be linked to ReST files containing a more comprehensive documentation;
+:param encoding:
+    Specify the encoding of the included file (default: 'utf-8').
 
-    2. :exception-file:
+:param tab-width:
+    Specify the number of spaces that a tab represents.
 
-        Used together with :generate-cross-refs
+:param start-line:
+    Line number at which to start including the file (1-based).
 
-        Points to a file containing rules to ignore C data structs or to
-        use a different reference name, optionally using a different
-        reference type.
+:param end-line:
+    Line number at which to stop including the file (inclusive).
 
-    3. :warn-broken:
+:param start-after:
+    Include lines after the first line matching this text.
 
-        Used together with :generate-cross-refs:
+:param end-before:
+    Include lines before the first line matching this text.
 
-        Detect if the auto-generated cross references doesn't exist.
+:param number-lines:
+    Number the included lines (integer specifies start number).
+    Only effective with 'literal' or 'code' options.
 
+:param class:
+    Specify HTML class attribute for the included content.
+
+**Kernel-specific Extensions**:
+
+:param generate-cross-refs:
+    If present, instead of directly including the file, it calls
+    ParseDataStructs() to convert C data structures into cross-references
+    that link to comprehensive documentation in other ReST files.
+
+:param exception-file:
+    (Used with generate-cross-refs)
+
+    Path to a file containing rules for handling special cases:
+    - Ignore specific C data structures
+    - Use alternative reference names
+    - Specify different reference types
+
+:param warn-broken:
+    (Used with generate-cross-refs)
+
+    Enables warnings when auto-generated cross-references don't point to
+    existing documentation targets.
 """
 
 # ==============================================================================
-- 
2.50.1



* [PATCH 23/24] scripts: sphinx-build-wrapper: get rid of uapi/media Makefile
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (21 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 22/24] docs: kernel_include.py: document all supported parameters Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-21 14:21 ` [PATCH 24/24] docs: sphinx: drop parse-headers.pl Mauro Carvalho Chehab
  2025-08-25  7:48 ` [PATCH 00/24] better handle media headers Jani Nikula
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Benjamin Gaignard, Erling Ljunggren,
	Hans Verkuil, Hans de Goede, Mauro Carvalho Chehab,
	Ricardo Ribalda, Sean Young, Yunke Cao, linux-kernel, linux-media

Now that the kernel-include directive supports parsing data
structs directly, we can finally get rid of the horrible hack
we added to support parsing media uAPI symbols.

As a side effect, Documentation/output no longer contains
auto-generated media .rst files.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/userspace-api/media/Makefile    | 64 -------------------
 .../userspace-api/media/cec/cec-header.rst    |  5 +-
 .../media/{ => cec}/cec.h.rst.exceptions      |  0
 .../media/{ => dvb}/ca.h.rst.exceptions       |  0
 .../media/{ => dvb}/dmx.h.rst.exceptions      |  0
 .../media/{ => dvb}/frontend.h.rst.exceptions |  0
 .../userspace-api/media/dvb/headers.rst       | 17 +++--
 .../media/{ => dvb}/net.h.rst.exceptions      |  0
 .../media/mediactl/media-header.rst           |  5 +-
 .../{ => mediactl}/media.h.rst.exceptions     |  0
 .../userspace-api/media/rc/lirc-header.rst    |  4 +-
 .../media/{ => rc}/lirc.h.rst.exceptions      |  0
 .../userspace-api/media/v4l/videodev.rst      |  4 +-
 .../{ => v4l}/videodev2.h.rst.exceptions      |  0
 scripts/sphinx-build-wrapper                  | 48 --------------
 15 files changed, 25 insertions(+), 122 deletions(-)
 delete mode 100644 Documentation/userspace-api/media/Makefile
 rename Documentation/userspace-api/media/{ => cec}/cec.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/ca.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/dmx.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/frontend.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => dvb}/net.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => mediactl}/media.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => rc}/lirc.h.rst.exceptions (100%)
 rename Documentation/userspace-api/media/{ => v4l}/videodev2.h.rst.exceptions (100%)
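
For illustration only (not part of this patch): the *.rst.exceptions files
renamed above hold per-header rules. Judging from the parse-headers.pl script
dropped in patch 24, each non-comment line is either
"ignore <type> <name>" or "replace <type> <old> <new>". A minimal sketch of a
reader for that format follows; parse_exceptions() and the SAMPLE rules are
hypothetical and are not the API of the Python replacement.

```python
import re

# Symbol types accepted by the rules in the removed Perl script.
KNOWN_TYPES = {"ioctl", "define", "symbol", "typedef", "enum", "struct"}


def parse_exceptions(text):
    """Split exception rules into (ignored, replaced) collections."""
    ignored = set()   # {(type, name)}
    replaced = {}     # {(type, old_name): new_reference}

    for lineno, line in enumerate(text.splitlines(), start=1):
        # Blank lines and '#' comments carry no rule
        if not line.strip() or line.lstrip().startswith("#"):
            continue

        match = re.match(r"^ignore\s+(\S+)\s+(\S+)", line)
        if match and match.group(1) in KNOWN_TYPES:
            ignored.add((match.group(1), match.group(2)))
            continue

        match = re.match(r"^replace\s+(\S+)\s+(\S+)\s+(\S+)", line)
        if match and match.group(1) in KNOWN_TYPES:
            replaced[(match.group(1), match.group(2))] = match.group(3)
            continue

        # Anything else is treated as a syntax error
        raise ValueError(f"line {lineno}: can't parse {line!r}")

    return ignored, replaced


# Hypothetical rules, for illustration only
SAMPLE = """\
# comment lines are skipped
ignore define _EXAMPLE_H

replace ioctl EXAMPLE_IOCTL example-ioctl-target
"""

if __name__ == "__main__":
    print(parse_exceptions(SAMPLE))
```

With this patch, the directive's :exception-file: option names such a file
next to the .rst document that includes the header, which is why each file
moves into its subsystem directory.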

diff --git a/Documentation/userspace-api/media/Makefile b/Documentation/userspace-api/media/Makefile
deleted file mode 100644
index accc734d045a..000000000000
--- a/Documentation/userspace-api/media/Makefile
+++ /dev/null
@@ -1,64 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-# Rules to convert a .h file to inline RST documentation
-
-SRC_DIR=$(srctree)/Documentation/userspace-api/media
-PARSER = $(srctree)/tools/docs/parse-headers.py
-UAPI = $(srctree)/include/uapi/linux
-KAPI = $(srctree)/include/linux
-
-FILES = ca.h.rst dmx.h.rst frontend.h.rst net.h.rst \
-	videodev2.h.rst media.h.rst cec.h.rst lirc.h.rst
-
-TARGETS := $(addprefix $(BUILDDIR)/, $(FILES))
-
-gen_rst = \
-	echo ${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions; \
-	${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions
-
-quiet_gen_rst = echo '  PARSE   $(patsubst $(srctree)/%,%,$<)'; \
-	${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions
-
-silent_gen_rst = ${gen_rst}
-
-$(BUILDDIR)/ca.h.rst: ${UAPI}/dvb/ca.h ${PARSER} $(SRC_DIR)/ca.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-$(BUILDDIR)/dmx.h.rst: ${UAPI}/dvb/dmx.h ${PARSER} $(SRC_DIR)/dmx.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-$(BUILDDIR)/frontend.h.rst: ${UAPI}/dvb/frontend.h ${PARSER} $(SRC_DIR)/frontend.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-$(BUILDDIR)/net.h.rst: ${UAPI}/dvb/net.h ${PARSER} $(SRC_DIR)/net.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-$(BUILDDIR)/videodev2.h.rst: ${UAPI}/videodev2.h ${PARSER} $(SRC_DIR)/videodev2.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-$(BUILDDIR)/media.h.rst: ${UAPI}/media.h ${PARSER} $(SRC_DIR)/media.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-$(BUILDDIR)/cec.h.rst: ${UAPI}/cec.h ${PARSER} $(SRC_DIR)/cec.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-$(BUILDDIR)/lirc.h.rst: ${UAPI}/lirc.h ${PARSER} $(SRC_DIR)/lirc.h.rst.exceptions
-	@$($(quiet)gen_rst)
-
-# Media build rules
-
-.PHONY: all html texinfo epub xml latex
-
-all: $(IMGDOT) $(BUILDDIR) ${TARGETS}
-html: all
-texinfo: all
-epub: all
-xml: all
-latex: $(IMGPDF) all
-linkcheck:
-
-clean:
-	-rm -f $(DOTTGT) $(IMGTGT) ${TARGETS} 2>/dev/null
-
-$(BUILDDIR):
-	$(Q)mkdir -p $@
diff --git a/Documentation/userspace-api/media/cec/cec-header.rst b/Documentation/userspace-api/media/cec/cec-header.rst
index d70736ac2b1d..f67003bb8740 100644
--- a/Documentation/userspace-api/media/cec/cec-header.rst
+++ b/Documentation/userspace-api/media/cec/cec-header.rst
@@ -6,5 +6,6 @@
 CEC Header File
 ***************
 
-.. kernel-include:: $BUILDDIR/cec.h.rst
-
+.. kernel-include:: include/uapi/linux/cec.h
+    :generate-cross-refs:
+    :exception-file: cec.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/cec.h.rst.exceptions b/Documentation/userspace-api/media/cec/cec.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/cec.h.rst.exceptions
rename to Documentation/userspace-api/media/cec/cec.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/ca.h.rst.exceptions b/Documentation/userspace-api/media/dvb/ca.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/ca.h.rst.exceptions
rename to Documentation/userspace-api/media/dvb/ca.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/dmx.h.rst.exceptions b/Documentation/userspace-api/media/dvb/dmx.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/dmx.h.rst.exceptions
rename to Documentation/userspace-api/media/dvb/dmx.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/frontend.h.rst.exceptions b/Documentation/userspace-api/media/dvb/frontend.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/frontend.h.rst.exceptions
rename to Documentation/userspace-api/media/dvb/frontend.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/dvb/headers.rst b/Documentation/userspace-api/media/dvb/headers.rst
index 88c3eb33a89e..c75f64cf21d5 100644
--- a/Documentation/userspace-api/media/dvb/headers.rst
+++ b/Documentation/userspace-api/media/dvb/headers.rst
@@ -7,10 +7,19 @@ Digital TV uAPI header files
 Digital TV uAPI headers
 ***********************
 
-.. kernel-include:: $BUILDDIR/frontend.h.rst
+.. kernel-include:: include/uapi/linux/dvb/frontend.h
+    :generate-cross-refs:
+    :exception-file: frontend.h.rst.exceptions
 
-.. kernel-include:: $BUILDDIR/dmx.h.rst
+.. kernel-include:: include/uapi/linux/dvb/dmx.h
+    :generate-cross-refs:
+    :exception-file: dmx.h.rst.exceptions
 
-.. kernel-include:: $BUILDDIR/ca.h.rst
+.. kernel-include:: include/uapi/linux/dvb/ca.h
+    :generate-cross-refs:
+    :exception-file: ca.h.rst.exceptions
+
+.. kernel-include:: include/uapi/linux/dvb/net.h
+    :generate-cross-refs:
+    :exception-file: net.h.rst.exceptions
 
-.. kernel-include:: $BUILDDIR/net.h.rst
diff --git a/Documentation/userspace-api/media/net.h.rst.exceptions b/Documentation/userspace-api/media/dvb/net.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/net.h.rst.exceptions
rename to Documentation/userspace-api/media/dvb/net.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/mediactl/media-header.rst b/Documentation/userspace-api/media/mediactl/media-header.rst
index c674271c93f5..d561d2845f3d 100644
--- a/Documentation/userspace-api/media/mediactl/media-header.rst
+++ b/Documentation/userspace-api/media/mediactl/media-header.rst
@@ -6,5 +6,6 @@
 Media Controller Header File
 ****************************
 
-.. kernel-include:: $BUILDDIR/media.h.rst
-
+.. kernel-include:: include/uapi/linux/media.h
+    :generate-cross-refs:
+    :exception-file: media.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/media.h.rst.exceptions b/Documentation/userspace-api/media/mediactl/media.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/media.h.rst.exceptions
rename to Documentation/userspace-api/media/mediactl/media.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/rc/lirc-header.rst b/Documentation/userspace-api/media/rc/lirc-header.rst
index 54cb40b8a065..a53328327847 100644
--- a/Documentation/userspace-api/media/rc/lirc-header.rst
+++ b/Documentation/userspace-api/media/rc/lirc-header.rst
@@ -6,5 +6,7 @@
 LIRC Header File
 ****************
 
-.. kernel-include:: $BUILDDIR/lirc.h.rst
+.. kernel-include:: include/uapi/linux/lirc.h
+    :generate-cross-refs:
+    :exception-file: lirc.h.rst.exceptions
 
diff --git a/Documentation/userspace-api/media/lirc.h.rst.exceptions b/Documentation/userspace-api/media/rc/lirc.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/lirc.h.rst.exceptions
rename to Documentation/userspace-api/media/rc/lirc.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/v4l/videodev.rst b/Documentation/userspace-api/media/v4l/videodev.rst
index c866fec417eb..cde485bc9a5f 100644
--- a/Documentation/userspace-api/media/v4l/videodev.rst
+++ b/Documentation/userspace-api/media/v4l/videodev.rst
@@ -6,4 +6,6 @@
 Video For Linux Two Header File
 *******************************
 
-.. kernel-include:: $BUILDDIR/videodev2.h.rst
+.. kernel-include:: include/uapi/linux/videodev2.h
+    :generate-cross-refs:
+    :exception-file: videodev2.h.rst.exceptions
diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/v4l/videodev2.h.rst.exceptions
similarity index 100%
rename from Documentation/userspace-api/media/videodev2.h.rst.exceptions
rename to Documentation/userspace-api/media/v4l/videodev2.h.rst.exceptions
diff --git a/scripts/sphinx-build-wrapper b/scripts/sphinx-build-wrapper
index 0d13c19f6df3..abe8c26ae137 100755
--- a/scripts/sphinx-build-wrapper
+++ b/scripts/sphinx-build-wrapper
@@ -463,56 +463,10 @@ class SphinxBuilder:
             except subprocess.CalledProcessError as e:
                 sys.exit(f"Error generating info docs: {e}")
 
-    def get_make_media(self):
-        """
-        The media uAPI requires an additional Makefile target.
-        """
-
-        mediadir = f"{self.obj}/userspace-api/media"
-
-        make = os.environ.get("MAKE", "make")
-        build = os.environ.get("build", "-f $(srctree)/scripts/Makefile.build obj")
-
-        # Check if the script was started outside docs Makefile
-        if not os.environ.get("obj"):
-            mediadir = os.path.abspath(mediadir)
-
-        # the build makefile var contains macros that require expand
-        make_media = f"{make} {build}={mediadir}"
-        make_media = make_media.replace("$(", "${").replace(")", "}")
-        make_media = os.path.expandvars(make_media)
-
-        # As it also contains multiple arguments, use shlex to split it
-        return shlex.split(make_media)
-
-    def prepare_media(self, builder):
-        """
-        Run userspace-api/media Makefile.
-
-        The logic behind it are from the initial ports to Sphinx.
-        They're old and need to be replaced by a proper Sphinx extension.
-        While we don't do that, we need to explicitly call media Makefile
-        to build some files.
-        """
-
-        cmd = self.get_make_media() + [builder]
-
-        if self.verbose:
-            print(" ".join(cmd))
-
-        with JobserverExec() as jobserver:
-            rc = jobserver.run(cmd, env=self.env)
-
-        if rc:
-            cmd_str = " ".join(cmd)
-            sys.exit(f"Failed to run {cmd_str}")
-
     def cleandocs(self, builder):
 
         shutil.rmtree(self.builddir, ignore_errors=True)
 
-        self.prepare_media(builder)
-
     def build(self, target, sphinxdirs=None, conf="conf.py",
               theme=None, css=None, paper=None):
         """
@@ -533,8 +487,6 @@ class SphinxBuilder:
         if not sphinxbuild:
             sys.exit(f"Error: {self.sphinxbuild} not found in PATH.\n")
 
-        self.prepare_media(builder)
-
         if builder == "latex":
             if not self.pdflatex_cmd and not self.latexmk_cmd:
                 sys.exit("Error: pdflatex or latexmk required for PDF generation")
-- 
2.50.1



* [PATCH 24/24] docs: sphinx: drop parse-headers.pl
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (22 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 23/24] scripts: sphinx-build-wrapper: get rid of uapi/media Makefile Mauro Carvalho Chehab
@ 2025-08-21 14:21 ` Mauro Carvalho Chehab
  2025-08-25  7:48 ` [PATCH 00/24] better handle media headers Jani Nikula
  24 siblings, 0 replies; 26+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-21 14:21 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel

Now that we have a replacement in place, drop the old version.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/parse-headers.pl | 419 --------------------------
 1 file changed, 419 deletions(-)
 delete mode 100755 Documentation/sphinx/parse-headers.pl
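
For illustration only (not part of this patch): a minimal Python sketch of the
default reference naming that the removed Perl script applied, as read from
the deleted lines below. default_ref() is a hypothetical helper, typedefs are
left out for brevity, and the Python replacement may differ in detail.

```python
def default_ref(kind, symbol):
    """Return the default reST replacement text for a C symbol."""
    if kind == "ioctl":
        # ioctl targets are only lower-cased; underscores are kept
        return f"\\ :ref:`{symbol} <{symbol.lower()}>`\\ "
    if kind in ("define", "symbol"):
        # defines and enum symbols: lower-case, '_' becomes '-'
        target = symbol.lower().replace("_", "-")
        return f"\\ :ref:`{symbol} <{target}>`\\ "
    if kind == "enum":
        return f"enum :c:type:`{symbol}`\\ "
    if kind == "struct":
        # structs get no cross-reference by default
        return f"struct {symbol}\\ "
    raise ValueError(f"unsupported kind: {kind}")


if __name__ == "__main__":
    # Symbol names below are examples only
    print(default_ref("ioctl", "VIDIOC_QUERYCAP"))
    print(default_ref("define", "V4L2_CAP_VIDEO_CAPTURE"))
```

The "replace" rules in the exceptions files exist to override these defaults
when a symbol is documented under a different label or reference type.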

diff --git a/Documentation/sphinx/parse-headers.pl b/Documentation/sphinx/parse-headers.pl
deleted file mode 100755
index 560685926cdb..000000000000
--- a/Documentation/sphinx/parse-headers.pl
+++ /dev/null
@@ -1,419 +0,0 @@
-#!/usr/bin/env perl
-# SPDX-License-Identifier: GPL-2.0
-# Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>.
-
-use strict;
-use Text::Tabs;
-use Getopt::Long;
-use Pod::Usage;
-
-my $debug;
-my $help;
-my $man;
-
-GetOptions(
-	"debug" => \$debug,
-	'usage|?' => \$help,
-	'help' => \$man
-) or pod2usage(2);
-
-pod2usage(1) if $help;
-pod2usage(-exitstatus => 0, -verbose => 2) if $man;
-pod2usage(2) if (scalar @ARGV < 2 || scalar @ARGV > 3);
-
-my ($file_in, $file_out, $file_exceptions) = @ARGV;
-
-my $data;
-my %ioctls;
-my %defines;
-my %typedefs;
-my %enums;
-my %enum_symbols;
-my %structs;
-
-#
-# read the file and get identifiers
-#
-
-my $is_enum = 0;
-my $is_comment = 0;
-open IN, $file_in or die "Can't open $file_in";
-while (<IN>) {
-	$data .= $_;
-
-	my $ln = $_;
-	if (!$is_comment) {
-		$ln =~ s,/\*.*(\*/),,g;
-
-		$is_comment = 1 if ($ln =~ s,/\*.*,,);
-	} else {
-		if ($ln =~ s,^(.*\*/),,) {
-			$is_comment = 0;
-		} else {
-			next;
-		}
-	}
-
-	if ($is_enum && $ln =~ m/^\s*([_\w][\w\d_]+)\s*[\,=]?/) {
-		my $s = $1;
-		my $n = $1;
-		$n =~ tr/A-Z/a-z/;
-		$n =~ tr/_/-/;
-
-		$enum_symbols{$s} =  "\\ :ref:`$s <$n>`\\ ";
-
-		$is_enum = 0 if ($is_enum && m/\}/);
-		next;
-	}
-	$is_enum = 0 if ($is_enum && m/\}/);
-
-	if ($ln =~ m/^\s*#\s*define\s+([_\w][\w\d_]+)\s+_IO/) {
-		my $s = $1;
-		my $n = $1;
-		$n =~ tr/A-Z/a-z/;
-
-		$ioctls{$s} = "\\ :ref:`$s <$n>`\\ ";
-		next;
-	}
-
-	if ($ln =~ m/^\s*#\s*define\s+([_\w][\w\d_]+)\s+/) {
-		my $s = $1;
-		my $n = $1;
-		$n =~ tr/A-Z/a-z/;
-		$n =~ tr/_/-/;
-
-		$defines{$s} = "\\ :ref:`$s <$n>`\\ ";
-		next;
-	}
-
-	if ($ln =~ m/^\s*typedef\s+([_\w][\w\d_]+)\s+(.*)\s+([_\w][\w\d_]+);/) {
-		my $s = $2;
-		my $n = $3;
-
-		$typedefs{$n} = "\\ :c:type:`$n <$s>`\\ ";
-		next;
-	}
-	if ($ln =~ m/^\s*enum\s+([_\w][\w\d_]+)\s+\{/
-	    || $ln =~ m/^\s*enum\s+([_\w][\w\d_]+)$/
-	    || $ln =~ m/^\s*typedef\s*enum\s+([_\w][\w\d_]+)\s+\{/
-	    || $ln =~ m/^\s*typedef\s*enum\s+([_\w][\w\d_]+)$/) {
-		my $s = $1;
-
-		$enums{$s} =  "enum :c:type:`$s`\\ ";
-
-		$is_enum = $1;
-		next;
-	}
-	if ($ln =~ m/^\s*struct\s+([_\w][\w\d_]+)\s+\{/
-	    || $ln =~ m/^\s*struct\s+([[_\w][\w\d_]+)$/
-	    || $ln =~ m/^\s*typedef\s*struct\s+([_\w][\w\d_]+)\s+\{/
-	    || $ln =~ m/^\s*typedef\s*struct\s+([[_\w][\w\d_]+)$/
-	    ) {
-		my $s = $1;
-
-		$structs{$s} = "struct $s\\ ";
-		next;
-	}
-}
-close IN;
-
-#
-# Handle multi-line typedefs
-#
-
-my @matches = ($data =~ m/typedef\s+struct\s+\S+?\s*\{[^\}]+\}\s*(\S+)\s*\;/g,
-	       $data =~ m/typedef\s+enum\s+\S+?\s*\{[^\}]+\}\s*(\S+)\s*\;/g,);
-foreach my $m (@matches) {
-	my $s = $m;
-
-	$typedefs{$s} = "\\ :c:type:`$s`\\ ";
-	next;
-}
-
-#
-# Handle exceptions, if any
-#
-
-my %def_reftype = (
-	"ioctl"   => ":ref",
-	"define"  => ":ref",
-	"symbol"  => ":ref",
-	"typedef" => ":c:type",
-	"enum"    => ":c:type",
-	"struct"  => ":c:type",
-);
-
-if ($file_exceptions) {
-	open IN, $file_exceptions or die "Can't read $file_exceptions";
-	while (<IN>) {
-		next if (m/^\s*$/ || m/^\s*#/);
-
-		# Parsers to ignore a symbol
-
-		if (m/^ignore\s+ioctl\s+(\S+)/) {
-			delete $ioctls{$1} if (exists($ioctls{$1}));
-			next;
-		}
-		if (m/^ignore\s+define\s+(\S+)/) {
-			delete $defines{$1} if (exists($defines{$1}));
-			next;
-		}
-		if (m/^ignore\s+typedef\s+(\S+)/) {
-			delete $typedefs{$1} if (exists($typedefs{$1}));
-			next;
-		}
-		if (m/^ignore\s+enum\s+(\S+)/) {
-			delete $enums{$1} if (exists($enums{$1}));
-			next;
-		}
-		if (m/^ignore\s+struct\s+(\S+)/) {
-			delete $structs{$1} if (exists($structs{$1}));
-			next;
-		}
-		if (m/^ignore\s+symbol\s+(\S+)/) {
-			delete $enum_symbols{$1} if (exists($enum_symbols{$1}));
-			next;
-		}
-
-		# Parsers to replace a symbol
-		my ($type, $old, $new, $reftype);
-
-		if (m/^replace\s+(\S+)\s+(\S+)\s+(\S+)/) {
-			$type = $1;
-			$old = $2;
-			$new = $3;
-		} else {
-			die "Can't parse $file_exceptions: $_";
-		}
-
-		if ($new =~ m/^\:c\:(data|func|macro|type)\:\`(.+)\`/) {
-			$reftype = ":c:$1";
-			$new = $2;
-		} elsif ($new =~ m/\:ref\:\`(.+)\`/) {
-			$reftype = ":ref";
-			$new = $1;
-		} else {
-			$reftype = $def_reftype{$type};
-		}
-		if (!$reftype) {
-		    print STDERR "Warning: can't find ref type for $type";
-		}
-		$new = "$reftype:`$old <$new>`";
-
-		if ($type eq "ioctl") {
-			$ioctls{$old} = $new if (exists($ioctls{$old}));
-			next;
-		}
-		if ($type eq "define") {
-			$defines{$old} = $new if (exists($defines{$old}));
-			next;
-		}
-		if ($type eq "symbol") {
-			$enum_symbols{$old} = $new if (exists($enum_symbols{$old}));
-			next;
-		}
-		if ($type eq "typedef") {
-			$typedefs{$old} = $new if (exists($typedefs{$old}));
-			next;
-		}
-		if ($type eq "enum") {
-			$enums{$old} = $new if (exists($enums{$old}));
-			next;
-		}
-		if ($type eq "struct") {
-			$structs{$old} = $new if (exists($structs{$old}));
-			next;
-		}
-
-		die "Can't parse $file_exceptions: $_";
-	}
-}
-
-if ($debug) {
-	my @all_hashes = (
-		{ioctl      => \%ioctls},
-		{typedef    => \%typedefs},
-		{enum       => \%enums},
-		{struct     => \%structs},
-		{define     => \%defines},
-		{symbol     => \%enum_symbols}
-	);
-
-	foreach my $hash (@all_hashes) {
-		while (my ($name, $hash_ref) = each %$hash) {
-			next unless %$hash_ref;  # Skip empty hashes
-
-			print "$name:\n";
-			for my $key (sort keys %$hash_ref) {
-				print "  $key -> $hash_ref->{$key}\n";
-			}
-			print "\n";
-		}
-	}
-}
-
-#
-# Align block
-#
-$data = expand($data);
-$data = "    " . $data;
-$data =~ s/\n/\n    /g;
-$data =~ s/\n\s+$/\n/g;
-$data =~ s/\n\s+\n/\n\n/g;
-
-#
-# Add escape codes for special characters
-#
-$data =~ s,([\_\`\*\<\>\&\\\\:\/\|\%\$\#\{\}\~\^]),\\$1,g;
-
-$data =~ s,DEPRECATED,**DEPRECATED**,g;
-
-#
-# Add references
-#
-
-my $start_delim = "[ \n\t\(\=\*\@]";
-my $end_delim = "(\\s|,|\\\\=|\\\\:|\\;|\\\)|\\}|\\{)";
-
-foreach my $r (keys %ioctls) {
-	my $s = $ioctls{$r};
-
-	$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
-
-	print "$r -> $s\n" if ($debug);
-
-	$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
-}
-
-foreach my $r (keys %defines) {
-	my $s = $defines{$r};
-
-	$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
-
-	print "$r -> $s\n" if ($debug);
-
-	$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
-}
-
-foreach my $r (keys %enum_symbols) {
-	my $s = $enum_symbols{$r};
-
-	$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
-
-	print "$r -> $s\n" if ($debug);
-
-	$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
-}
-
-foreach my $r (keys %enums) {
-	my $s = $enums{$r};
-
-	$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
-
-	print "$r -> $s\n" if ($debug);
-
-	$data =~ s/enum\s+($r)$end_delim/$s$2/g;
-}
-
-foreach my $r (keys %structs) {
-	my $s = $structs{$r};
-
-	$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
-
-	print "$r -> $s\n" if ($debug);
-
-	$data =~ s/struct\s+($r)$end_delim/$s$2/g;
-}
-
-foreach my $r (keys %typedefs) {
-	my $s = $typedefs{$r};
-
-	$r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g;
-
-	print "$r -> $s\n" if ($debug);
-	$data =~ s/($start_delim)($r)$end_delim/$1$s$3/g;
-}
-
-$data =~ s/\\ ([\n\s])/\1/g;
-
-#
-# Generate output file
-#
-
-my $title = $file_in;
-$title =~ s,.*/,,;
-
-open OUT, "> $file_out" or die "Can't open $file_out";
-print OUT ".. -*- coding: utf-8; mode: rst -*-\n\n";
-print OUT "$title\n";
-print OUT "=" x length($title);
-print OUT "\n\n.. parsed-literal::\n\n";
-print OUT $data;
-close OUT;
-
-__END__
-
-=head1 NAME
-
-parse_headers.pl - parse a C file, in order to identify functions, structs,
-enums and defines and create cross-references to a Sphinx book.
-
-=head1 SYNOPSIS
-
-B<parse_headers.pl> [<options>] <C_FILE> <OUT_FILE> [<EXCEPTIONS_FILE>]
-
-Where <options> can be: --debug, --help or --usage.
-
-=head1 OPTIONS
-
-=over 8
-
-=item B<--debug>
-
-Put the script in verbose mode, useful for debugging.
-
-=item B<--usage>
-
-Prints a brief help message and exits.
-
-=item B<--help>
-
-Prints a more detailed help message and exits.
-
-=back
-
-=head1 DESCRIPTION
-
-Convert a C header or source file (C_FILE), into a ReStructured Text
-included via ..parsed-literal block with cross-references for the
-documentation files that describe the API. It accepts an optional
-EXCEPTIONS_FILE with describes what elements will be either ignored or
-be pointed to a non-default reference.
-
-The output is written at the (OUT_FILE).
-
-It is capable of identifying defines, functions, structs, typedefs,
-enums and enum symbols and create cross-references for all of them.
-It is also capable of distinguish #define used for specifying a Linux
-ioctl.
-
-The EXCEPTIONS_FILE contain two rules to allow ignoring a symbol or
-to replace the default references by a custom one.
-
-Please read Documentation/doc-guide/parse-headers.rst at the Kernel's
-tree for more details.
-
-=head1 BUGS
-
-Report bugs to Mauro Carvalho Chehab <mchehab@kernel.org>
-
-=head1 COPYRIGHT
-
-Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>.
-
-License GPLv2: GNU GPL version 2 <https://gnu.org/licenses/gpl.html>.
-
-This is free software: you are free to change and redistribute it.
-There is NO WARRANTY, to the extent permitted by law.
-
-=cut
-- 
2.50.1



* Re: [PATCH 00/24] better handle media headers
  2025-08-21 14:21 [PATCH 00/24] better handle media headers Mauro Carvalho Chehab
                   ` (23 preceding siblings ...)
  2025-08-21 14:21 ` [PATCH 24/24] docs: sphinx: drop parse-headers.pl Mauro Carvalho Chehab
@ 2025-08-25  7:48 ` Jani Nikula
  24 siblings, 0 replies; 26+ messages in thread
From: Jani Nikula @ 2025-08-25  7:48 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel, Mauro Carvalho Chehab,
	Sean Young, linux-media

On Thu, 21 Aug 2025, Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Its goal is to drop one of the most ancient and ugliest hack from
> the documentation build system.

Yay! \o/

Cheers,
Jani.


-- 
Jani Nikula, Intel


