linux-doc.vger.kernel.org archive mirror
* [PATCH v3 00/33] Implement kernel-doc in Python
@ 2025-04-08 10:09 Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 01/33] scripts/kernel-doc: rename it to scripts/kernel-doc.pl Mauro Carvalho Chehab
                   ` (35 more replies)
  0 siblings, 36 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel, Gustavo A. R. Silva,
	Kees Cook, Russell King, linux-hardening, netdev

Hi Jon,

This changeset contains the kernel-doc.py script to replace the venerable
kernel-doc originally written in Perl. It supersedes the first version and the
second series I sent on top of it.

I tried to stay as close as possible to the original Perl implementation
in the first patch introducing kernel-doc.py, as that helps to double-check
that each function was properly translated to Python. This has been
helpful when debugging problems that happened during the conversion.

I worked hard to make it bug-compatible with the original one. Still, its
output has a couple of differences from the original one:

- Tab expansion works better with the Python script. As a result, some
  outputs that contain tabs in kernel-doc markups are now different;

- The new script does a better job of stripping blank lines. So, a couple
  of empty lines are now stripped with this version;

- There is buggy logic in kernel-doc to strip empty description and
  return sections. I was not able to replicate the exact behavior, so I
  ended up adding extra logic to strip empty sections with a different
  algorithm.

Yet, in my tests, the results are compatible with the venerable script's
output for all .. kernel-doc tags found in Documentation/. I double-checked
this by adding support for printing the kernel-doc commands when V=1, and
then I ran a diff between kernel-doc.pl and kernel-doc.py for the same
command lines.
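
The comparison described above could be scripted along these lines. This
is only a sketch, not the procedure actually used: the helper names
run_kernel_doc and diff_outputs are hypothetical, and it assumes that
both scripts accept the same argument list.

```python
import difflib
import subprocess


def run_kernel_doc(script, args):
    """Run one kernel-doc implementation and return its stdout as text.

    'script' is the path to kernel-doc.pl or kernel-doc.py; 'args' is the
    argument list to pass to it (assumed to be identical for both scripts).
    """
    result = subprocess.run([script, *args], capture_output=True,
                            text=True, check=False)
    return result.stdout


def diff_outputs(old_text, new_text):
    """Return unified-diff lines; an empty list means identical output."""
    return list(difflib.unified_diff(old_text.splitlines(),
                                     new_text.splitlines(),
                                     fromfile="kernel-doc.pl",
                                     tofile="kernel-doc.py",
                                     lineterm=""))
```

Each command line printed with V=1 would be run once per script through
run_kernel_doc, and the two results passed to diff_outputs.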

The only patch that doesn't belong to this series is a patch dropping
kernel-doc.pl. I opted to keep it for now, as it can help to better
test the new tools.

With such changes, if one wants to build docs with the old script,
all that is needed is to use the KERNELDOC parameter, e.g.:

	$ make KERNELDOC=scripts/kernel-doc.pl htmldocs

---

v3:
- Rebased on top of v6.15-rc1;
- Removed patches that weren't touching kernel-doc and its Sphinx extension;
- The "Re" class was renamed to "KernRe";
- It contains one patch from Sean with an additional hunk for the
  Python version.

Mauro Carvalho Chehab (32):
  scripts/kernel-doc: rename it to scripts/kernel-doc.pl
  scripts/kernel-doc: add a symlink to the Perl version of kernel-doc
  scripts/kernel-doc.py: add a Python parser
  scripts/kernel-doc.py: output warnings the same way as kerneldoc
  scripts/kernel-doc.py: better handle empty sections
  scripts/kernel-doc.py: properly handle struct_group macros
  scripts/kernel-doc.py: move regex methods to a separate file
  scripts/kernel-doc.py: move KernelDoc class to a separate file
  scripts/kernel-doc.py: move KernelFiles class to a separate file
  scripts/kernel-doc.py: move output classes to a separate file
  scripts/kernel-doc.py: convert message output to an interactor
  scripts/kernel-doc.py: move file lists to the parser function
  scripts/kernel-doc.py: implement support for -no-doc-sections
  scripts/kernel-doc.py: fix line number output
  scripts/kernel-doc.py: fix handling of doc output check
  scripts/kernel-doc.py: properly handle out_section for ReST
  scripts/kernel-doc.py: postpone warnings to the output plugin
  docs: add a .pylintrc file with sys path for docs scripts
  docs: sphinx: kerneldoc: verbose kernel-doc command if V=1
  docs: sphinx: kerneldoc: ignore "\" characters from options
  docs: sphinx: kerneldoc: use kernel-doc.py script
  scripts/kernel-doc.py: Set an output format for --none
  scripts/kernel-doc.py: adjust some coding style issues
  scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13
  scripts/kernel-doc.py: move modulename to man class
  scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP
  scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency
  scripts/kernel-doc.py: Properly handle Werror and exit codes
  scripts/kernel-doc: switch to use kernel-doc.py
  scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname
  scripts/kernel_doc.py: better handle exported symbols
  scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe

Sean Anderson (1):
  scripts: kernel-doc: fix parsing function-like typedefs (again)

 .pylintrc                         |    2 +
 Documentation/Makefile            |    2 +-
 Documentation/conf.py             |    2 +-
 Documentation/sphinx/kerneldoc.py |   46 +
 scripts/kernel-doc                | 2440 +----------------------------
 scripts/kernel-doc.pl             | 2439 ++++++++++++++++++++++++++++
 scripts/kernel-doc.py             |  315 ++++
 scripts/lib/kdoc/kdoc_files.py    |  282 ++++
 scripts/lib/kdoc/kdoc_output.py   |  793 ++++++++++
 scripts/lib/kdoc/kdoc_parser.py   | 1715 ++++++++++++++++++++
 scripts/lib/kdoc/kdoc_re.py       |  273 ++++
 11 files changed, 5868 insertions(+), 2441 deletions(-)
 create mode 100644 .pylintrc
 mode change 100755 => 120000 scripts/kernel-doc
 create mode 100755 scripts/kernel-doc.pl
 create mode 100755 scripts/kernel-doc.py
 create mode 100644 scripts/lib/kdoc/kdoc_files.py
 create mode 100755 scripts/lib/kdoc/kdoc_output.py
 create mode 100755 scripts/lib/kdoc/kdoc_parser.py
 create mode 100755 scripts/lib/kdoc/kdoc_re.py

-- 
2.49.0




* [PATCH v3 01/33] scripts/kernel-doc: rename it to scripts/kernel-doc.pl
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 02/33] scripts/kernel-doc: add a symlink to the Perl version of kernel-doc Mauro Carvalho Chehab
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

In preparation for deprecating scripts/kernel-doc in favor of a
new version written in Python, rename it to scripts/kernel-doc.pl.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/{kernel-doc => kernel-doc.pl} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename scripts/{kernel-doc => kernel-doc.pl} (100%)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc.pl
similarity index 100%
rename from scripts/kernel-doc
rename to scripts/kernel-doc.pl
-- 
2.49.0



* [PATCH v3 02/33] scripts/kernel-doc: add a symlink to the Perl version of kernel-doc
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 01/33] scripts/kernel-doc: rename it to scripts/kernel-doc.pl Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 03/33] scripts/kernel-doc.py: add a Python parser Mauro Carvalho Chehab
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Preserve the kernel-doc name, associating it with the current version
in Perl.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc | 1 +
 1 file changed, 1 insertion(+)
 create mode 120000 scripts/kernel-doc

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
new file mode 120000
index 000000000000..f175155c1e66
--- /dev/null
+++ b/scripts/kernel-doc
@@ -0,0 +1 @@
+kernel-doc.pl
\ No newline at end of file
-- 
2.49.0



* [PATCH v3 03/33] scripts/kernel-doc.py: add a Python parser
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 01/33] scripts/kernel-doc: rename it to scripts/kernel-doc.pl Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 02/33] scripts/kernel-doc: add a symlink to the Perl version of kernel-doc Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 04/33] scripts/kernel-doc.py: output warnings the same way as kerneldoc Mauro Carvalho Chehab
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Gustavo A. R. Silva, Kees Cook,
	linux-hardening, linux-kernel

Maintaining kernel-doc has been a challenge, as there aren't many
Perl developers among maintainers. Also, the logic there is too
complex. Having lots of global variables and using pure functions
doesn't help.

Rewrite the script in Python, placing most global variables
inside classes. This should help with maintaining the script in the
long term.

It also allows better integration with the kernel-doc Sphinx
extension in the future.

I opted to keep this version as close as possible to what we
have already in Perl. There are some differences though:

1. There is one regular expression that required a rewrite:

	/\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;/

   This one uses two features that aren't available in the native
   Python regular expression module (re):

	- recursive patterns: (?1)
	- atomic grouping: (?>...)

   Rewrite it to use a much simpler regular expression:

	/\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;/

   Extra care should be taken when validating this script, as such
   replacement might cause some regressions.

2. The filters are now applied only during output generation.
   In particular, "nosymbol" argument is only handled there.

   It means that, if the same file is processed twice for
   different symbols, the warnings will be duplicated.

   I opted to use this behavior as it allows the Sphinx extension
   to read the file(s) only once, and apply the filtering only
   when producing the ReST output. This will hopefully help
   to speed up doc generation.

3. This version can handle multiple files and multiple directories.

   So, if one just wants to produce a big output with everything
   inside a file, this could be done with

   $ time ./scripts/kernel-doc.py -man . 2>/dev/null >new
   real    0m54.592s
   user    0m53.345s
   sys     0m0.997s

4. I tried to replicate as much as possible the same arguments
   from kernel-doc, with about the same behavior, for the
   command line parameters starting with a single dash (-parameter).

   I also added one letter aliases for each parameter, and a
   --parameter (sometimes with a better name).

5. There are some subtle differences in how Perl handles
   certain regular expressions. In particular, the qr operator,
   which compiles a regular expression, also works as a
   non-capturing group. It means that some regexes like
   this one:

	my $type1 = qr{[\w\s]+};

   need to be mapped as:

	type1 = r'(?:[\w\s]+)?'
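
Items 1 and 5 above both concern differences between Perl and Python
regular expressions, and can be illustrated with the short sketch below.
It is not taken from the script: the sample strings are hypothetical, and
the second half assumes that the compiled qr object is later interpolated
into a larger pattern and followed by a quantifier.

```python
import re

# Item 1: the original Perl pattern uses the (?1) recursion syntax, which
# the stdlib "re" module rejects at compile time. (Python 3.11 added
# atomic groups, but recursion is still not supported by "re".)
perl_pattern = r'\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;'
try:
    re.compile(perl_pattern)
    recursion_supported = True
except re.error:
    recursion_supported = False

# The simplified replacement from the commit message.
simple = re.compile(r'\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;')

# Hypothetical inputs: one without nested parentheses, one with them.
m_flat = simple.search("STRUCT_GROUP(int a; int b;) hdr;")
m_nested = simple.search("STRUCT_GROUP(int (*cb)(void); int b;) hdr;")

# The flat case captures the whole argument; the nested case stops at the
# first ")", which is the kind of regression the commit message warns about.
flat_args = m_flat.group(1)      # "int a; int b;"
nested_args = m_nested.group(1)  # "int (*cb"

# Item 5: Perl's qr{} interpolates as a non-capturing group, so a
# quantifier placed after it applies to the whole sub-pattern. With plain
# Python strings, the group has to be written explicitly.
type1_bare = r'[\w\s]+'
type1_grouped = r'(?:[\w\s]+)'

lazy = re.compile('^' + type1_bare + '?')         # becomes "+?": lazy match
optional = re.compile('^' + type1_grouped + '?')  # whole group is optional
```

With the bare string, the appended "?" silently turns "+" into a lazy
quantifier, so lazy.match("unsigned int") matches only "u" and
lazy.match("*ptr") fails. With the explicit group, optional matches
"unsigned int" in the first case and the empty string in the second.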

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

---

TODO:
- in this RFC, the man output doesn't yet match the output of
  kernel-doc. The ReST output matches, except for some whitespace
  and suppressed empty sections;
- this version lacks support for -W<filter> parameters: it will just
  output all warnings;
- all classes are in the same file. During development, it is easier
  to have everything in a single file, but I plan to split the classes
  into multiple files for the final version to help maintain the
  script.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py | 2832 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2832 insertions(+)
 create mode 100755 scripts/kernel-doc.py

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
new file mode 100755
index 000000000000..114f3699bf7c
--- /dev/null
+++ b/scripts/kernel-doc.py
@@ -0,0 +1,2832 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,R1702
+# pylint: disable=C0302,C0103,C0301
+# pylint: disable=C0116,C0115,W0511,W0613
+#
+# Converted from the kernel-doc script originally written in Perl
+# under GPLv2, copyrighted since 1998 by the following authors:
+#
+#    Aditya Srivastava <yashsri421@gmail.com>
+#    Akira Yokosawa <akiyks@gmail.com>
+#    Alexander A. Klimov <grandmaster@al2klimov.de>
+#    Alexander Lobakin <aleksander.lobakin@intel.com>
+#    André Almeida <andrealmeid@igalia.com>
+#    Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+#    Anna-Maria Behnsen <anna-maria@linutronix.de>
+#    Armin Kuster <akuster@mvista.com>
+#    Bart Van Assche <bart.vanassche@sandisk.com>
+#    Ben Hutchings <ben@decadent.org.uk>
+#    Borislav Petkov <bbpetkov@yahoo.de>
+#    Chen-Yu Tsai <wenst@chromium.org>
+#    Coco Li <lixiaoyan@google.com>
+#    Conchúr Navid <conchur@web.de>
+#    Daniel Santos <daniel.santos@pobox.com>
+#    Danilo Cesar Lemes de Paula <danilo.cesar@collabora.co.uk>
+#    Dan Luedtke <mail@danrl.de>
+#    Donald Hunter <donald.hunter@gmail.com>
+#    Gabriel Krisman Bertazi <krisman@collabora.co.uk>
+#    Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+#    Harvey Harrison <harvey.harrison@gmail.com>
+#    Horia Geanta <horia.geanta@freescale.com>
+#    Ilya Dryomov <idryomov@gmail.com>
+#    Jakub Kicinski <kuba@kernel.org>
+#    Jani Nikula <jani.nikula@intel.com>
+#    Jason Baron <jbaron@redhat.com>
+#    Jason Gunthorpe <jgg@nvidia.com>
+#    Jérémy Bobbio <lunar@debian.org>
+#    Johannes Berg <johannes.berg@intel.com>
+#    Johannes Weiner <hannes@cmpxchg.org>
+#    Jonathan Cameron <Jonathan.Cameron@huawei.com>
+#    Jonathan Corbet <corbet@lwn.net>
+#    Jonathan Neuschäfer <j.neuschaefer@gmx.net>
+#    Kamil Rytarowski <n54@gmx.com>
+#    Kees Cook <kees@kernel.org>
+#    Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+#    Levin, Alexander (Sasha Levin) <alexander.levin@verizon.com>
+#    Linus Torvalds <torvalds@linux-foundation.org>
+#    Lucas De Marchi <lucas.demarchi@profusion.mobi>
+#    Mark Rutland <mark.rutland@arm.com>
+#    Markus Heiser <markus.heiser@darmarit.de>
+#    Martin Waitz <tali@admingilde.org>
+#    Masahiro Yamada <masahiroy@kernel.org>
+#    Matthew Wilcox <willy@infradead.org>
+#    Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+#    Michal Wajdeczko <michal.wajdeczko@intel.com>
+#    Michael Zucchi
+#    Mike Rapoport <rppt@linux.ibm.com>
+#    Niklas Söderlund <niklas.soderlund@corigine.com>
+#    Nishanth Menon <nm@ti.com>
+#    Paolo Bonzini <pbonzini@redhat.com>
+#    Pavan Kumar Linga <pavan.kumar.linga@intel.com>
+#    Pavel Pisa <pisa@cmp.felk.cvut.cz>
+#    Peter Maydell <peter.maydell@linaro.org>
+#    Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
+#    Randy Dunlap <rdunlap@infradead.org>
+#    Richard Kennedy <richard@rsk.demon.co.uk>
+#    Rich Walker <rw@shadow.org.uk>
+#    Rolf Eike Beer <eike-kernel@sf-tec.de>
+#    Sakari Ailus <sakari.ailus@linux.intel.com>
+#    Silvio Fricke <silvio.fricke@gmail.com>
+#    Simon Huggins
+#    Tim Waugh <twaugh@redhat.com>
+#    Tomasz Warniełło <tomasz.warniello@gmail.com>
+#    Utkarsh Tripathi <utripathi2002@gmail.com>
+#    valdis.kletnieks@vt.edu <valdis.kletnieks@vt.edu>
+#    Vegard Nossum <vegard.nossum@oracle.com>
+#    Will Deacon <will.deacon@arm.com>
+#    Yacine Belkadi <yacine.belkadi.1@gmail.com>
+#    Yujie Liu <yujie.liu@intel.com>
+
+# TODO: implement warning filtering
+
+"""
+kernel_doc
+==========
+
+Print formatted kernel documentation to stdout
+
+Read C language source or header FILEs, extract embedded
+documentation comments, and print formatted documentation
+to standard output.
+
+The documentation comments are identified by the "/**"
+opening comment mark.
+
+See Documentation/doc-guide/kernel-doc.rst for the
+documentation comment syntax.
+"""
+
+import argparse
+import logging
+import os
+import re
+import sys
+
+from datetime import datetime
+from pprint import pformat
+
+from dateutil import tz
+
+# Local cache for regular expressions
+re_cache = {}
+
+
+class Re:
+    """
+    Helper class to simplify regex declaration and usage.
+
+    It calls re.compile for a given pattern. It also allows adding
+    regular expressions and define sub at class init time.
+
+    Regular expressions can be cached via an argument, helping to speedup
+    searches.
+    """
+
+    def _add_regex(self, string, flags):
+        if string in re_cache:
+            self.regex = re_cache[string]
+        else:
+            self.regex = re.compile(string, flags=flags)
+
+            if self.cache:
+                re_cache[string] = self.regex
+
+    def __init__(self, string, cache=True, flags=0):
+        self.cache = cache
+        self.last_match = None
+
+        self._add_regex(string, flags)
+
+    def __str__(self):
+        return self.regex.pattern
+
+    def __add__(self, other):
+        return Re(str(self) + str(other), cache=self.cache or other.cache,
+                  flags=self.regex.flags | other.regex.flags)
+
+    def match(self, string):
+        self.last_match = self.regex.match(string)
+        return self.last_match
+
+    def search(self, string):
+        self.last_match = self.regex.search(string)
+        return self.last_match
+
+    def findall(self, string):
+        return self.regex.findall(string)
+
+    def split(self, string):
+        return self.regex.split(string)
+
+    def sub(self, sub, string, count=0):
+        return self.regex.sub(sub, string, count=count)
+
+    def group(self, num):
+        return self.last_match.group(num)
+
+#
+# Regular expressions used to parse kernel-doc markups at KernelDoc class.
+#
+# Let's declare them in lowercase outside any class to make it easier to
+# convert from the Perl script.
+#
+# As those are evaluated at the beginning, no need to cache them
+#
+
+
+# Allow whitespace at end of comment start.
+doc_start = Re(r'^/\*\*\s*$', cache=False)
+
+doc_end = Re(r'\*/', cache=False)
+doc_com = Re(r'\s*\*\s*', cache=False)
+doc_com_body = Re(r'\s*\* ?', cache=False)
+doc_decl = doc_com + Re(r'(\w+)', cache=False)
+
+# @params and a strictly limited set of supported section names
+# Specifically:
+#   Match @word:
+#         @...:
+#         @{section-name}:
+# while trying to not match literal block starts like "example::"
+#
+doc_sect = doc_com + \
+            Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:([^:].*)?$',
+                flags=re.I, cache=False)
+
+doc_content = doc_com_body + Re(r'(.*)', cache=False)
+doc_block = doc_com + Re(r'DOC:\s*(.*)?', cache=False)
+doc_inline_start = Re(r'^\s*/\*\*\s*$', cache=False)
+doc_inline_sect = Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
+doc_inline_end = Re(r'^\s*\*/\s*$', cache=False)
+doc_inline_oneline = Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
+function_pointer = Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
+attribute = Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
+               flags=re.I | re.S, cache=False)
+
+# match expressions used to find embedded type information
+type_constant = Re(r"\b``([^\`]+)``\b", cache=False)
+type_constant2 = Re(r"\%([-_*\w]+)", cache=False)
+type_func = Re(r"(\w+)\(\)", cache=False)
+type_param = Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+type_param_ref = Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+
+# Special RST handling for func ptr params
+type_fp_param = Re(r"\@(\w+)\(\)", cache=False)
+
+# Special RST handling for structs with func ptr params
+type_fp_param2 = Re(r"\@(\w+->\S+)\(\)", cache=False)
+
+type_env = Re(r"(\$\w+)", cache=False)
+type_enum = Re(r"\&(enum\s*([_\w]+))", cache=False)
+type_struct = Re(r"\&(struct\s*([_\w]+))", cache=False)
+type_typedef = Re(r"\&(typedef\s*([_\w]+))", cache=False)
+type_union = Re(r"\&(union\s*([_\w]+))", cache=False)
+type_member = Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
+type_fallback = Re(r"\&([_\w]+)", cache=False)
+type_member_func = type_member + Re(r"\(\)", cache=False)
+
+export_symbol = Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
+export_symbol_ns = Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
+
+class KernelDoc:
+    # Parser states
+    STATE_NORMAL        = 0        # normal code
+    STATE_NAME          = 1        # looking for function name
+    STATE_BODY_MAYBE    = 2        # body - or maybe more description
+    STATE_BODY          = 3        # the body of the comment
+    STATE_BODY_WITH_BLANK_LINE = 4 # the body which has a blank line
+    STATE_PROTO         = 5        # scanning prototype
+    STATE_DOCBLOCK      = 6        # documentation block
+    STATE_INLINE        = 7        # gathering doc outside main block
+
+    st_name = [
+        "NORMAL",
+        "NAME",
+        "BODY_MAYBE",
+        "BODY",
+        "BODY_WITH_BLANK_LINE",
+        "PROTO",
+        "DOCBLOCK",
+        "INLINE",
+    ]
+
+    # Inline documentation state
+    STATE_INLINE_NA     = 0 # not applicable ($state != STATE_INLINE)
+    STATE_INLINE_NAME   = 1 # looking for member name (@foo:)
+    STATE_INLINE_TEXT   = 2 # looking for member documentation
+    STATE_INLINE_END    = 3 # done
+    STATE_INLINE_ERROR  = 4 # error - Comment without header was found.
+                            # Spit a warning as it's not
+                            # proper kernel-doc and ignore the rest.
+
+    st_inline_name = [
+        "",
+        "_NAME",
+        "_TEXT",
+        "_END",
+        "_ERROR",
+    ]
+
+    # Section names
+
+    section_default = "Description"  # default section
+    section_intro = "Introduction"
+    section_context = "Context"
+    section_return = "Return"
+
+    undescribed = "-- undescribed --"
+
+    def __init__(self, config, fname):
+        """Initialize internal variables"""
+
+        self.fname = fname
+        self.config = config
+
+        # Initial state for the state machines
+        self.state = self.STATE_NORMAL
+        self.inline_doc_state = self.STATE_INLINE_NA
+
+        # Store entry currently being processed
+        self.entry = None
+
+        # Place all potential outputs into an array
+        self.entries = []
+
+    def show_warnings(self, dtype, declaration_name):
+        # TODO: implement it
+
+        return True
+
+    # TODO: rename to emit_message
+    def emit_warning(self, ln, msg, warning=True):
+        """Emit a message"""
+
+        if warning:
+            self.config.log.warning("%s:%d %s", self.fname, ln, msg)
+        else:
+            self.config.log.info("%s:%d %s", self.fname, ln, msg)
+
+    def dump_section(self, start_new=True):
+        """
+        Dumps section contents to arrays/hashes intended for that purpose.
+        """
+
+        name = self.entry.section
+        contents = self.entry.contents
+
+        if type_param.match(name):
+            name = type_param.group(1)
+
+            self.entry.parameterdescs[name] = contents
+            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
+
+            self.entry.sectcheck += name + " "
+            self.entry.new_start_line = 0
+
+        elif name == "@...":
+            name = "..."
+            self.entry.parameterdescs[name] = contents
+            self.entry.sectcheck += name + " "
+            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
+            self.entry.new_start_line = 0
+
+        else:
+            if name in self.entry.sections and self.entry.sections[name] != "":
+                # Only warn on user-specified duplicate section names
+                if name != self.section_default:
+                    self.emit_warning(self.entry.new_start_line,
+                                      f"duplicate section name '{name}'\n")
+                self.entry.sections[name] += contents
+            else:
+                self.entry.sections[name] = contents
+                self.entry.sectionlist.append(name)
+                self.entry.section_start_lines[name] = self.entry.new_start_line
+                self.entry.new_start_line = 0
+
+#        self.config.log.debug("Section: %s : %s", name, pformat(vars(self.entry)))
+
+        if start_new:
+            self.entry.section = self.section_default
+            self.entry.contents = ""
+
+    # TODO: rename it to store_declaration
+    def output_declaration(self, dtype, name, **args):
+        """
+        Stores the entry into an entry array.
+
+        The actual output and output filters will be handled elsewhere
+        """
+
+        # The implementation here is different than the original kernel-doc:
+        # instead of checking for output filters or actually output anything,
+        # it just stores the declaration content at self.entries, as the
+        # output will happen on a separate class.
+        #
+        # For now, we're keeping the same name of the function just to make
+        # easier to compare the source code of both scripts
+
+        if "declaration_start_line" not in args:
+            args["declaration_start_line"] = self.entry.declaration_start_line
+
+        args["type"] = dtype
+
+        self.entries.append((name, args))
+
+        self.config.log.debug("Output: %s:%s = %s", dtype, name, pformat(args))
+
+    def reset_state(self, ln):
+        """
+        Ancillary routine to create a new entry. It initializes all
+        variables used by the state machine.
+        """
+
+        self.entry = argparse.Namespace()
+
+        self.entry.contents = ""
+        self.entry.function = ""
+        self.entry.sectcheck = ""
+        self.entry.struct_actual = ""
+        self.entry.prototype = ""
+
+        self.entry.parameterlist = []
+        self.entry.parameterdescs = {}
+        self.entry.parametertypes = {}
+        self.entry.parameterdesc_start_lines = {}
+
+        self.entry.section_start_lines = {}
+        self.entry.sectionlist = []
+        self.entry.sections = {}
+
+        self.entry.anon_struct_union = False
+
+        self.entry.leading_space = None
+
+        # State flags
+        self.state = self.STATE_NORMAL
+        self.inline_doc_state = self.STATE_INLINE_NA
+        self.entry.brcount = 0
+
+        self.entry.in_doc_sect = False
+        self.entry.declaration_start_line = ln
+
+    def push_parameter(self, ln, decl_type, param, dtype,
+                       org_arg, declaration_name):
+        if self.entry.anon_struct_union and dtype == "" and param == "}":
+            return  # Ignore the ending }; from anonymous struct/union
+
+        self.entry.anon_struct_union = False
+
+        param = Re(r'[\[\)].*').sub('', param, count=1)
+
+        if dtype == "" and param.endswith("..."):
+            if Re(r'\w\.\.\.$').search(param):
+                # For named variable parameters of the form `x...`,
+                # remove the dots
+                param = param[:-3]
+            else:
+                # Handles unnamed variable parameters
+                param = "..."
+
+            if param not in self.entry.parameterdescs or \
+                not self.entry.parameterdescs[param]:
+
+                self.entry.parameterdescs[param] = "variable arguments"
+
+        elif dtype == "" and (not param or param == "void"):
+            param = "void"
+            self.entry.parameterdescs[param] = "no arguments"
+
+        elif dtype == "" and param in ["struct", "union"]:
+            # Handle unnamed (anonymous) union or struct
+            dtype = param
+            param = "{unnamed_" + param + "}"
+            self.entry.parameterdescs[param] = "anonymous\n"
+            self.entry.anon_struct_union = True
+
+        # Handle cache group enforcing variables: they do not need
+        # to be described in header files
+        elif "__cacheline_group" in param:
+            # Ignore __cacheline_group_begin and __cacheline_group_end
+            return
+
+        # Warn if parameter has no description
+        # (but ignore ones starting with # as these are not parameters
+        # but inline preprocessor statements)
+        if param not in self.entry.parameterdescs and not param.startswith("#"):
+            self.entry.parameterdescs[param] = self.undescribed
+
+            if self.show_warnings(dtype, declaration_name) and "." not in param:
+                if decl_type == 'function':
+                    dname = f"{decl_type} parameter"
+                else:
+                    dname = f"{decl_type} member"
+
+                self.emit_warning(ln,
+                                  f"{dname} '{param}' not described in '{declaration_name}'")
+
+        # Strip spaces from param so that it is one continuous string on
+        # parameterlist. This fixes a problem where check_sections()
+        # cannot find a parameter like "addr[6 + 2]" because it actually
+        # appears as "addr[6", "+", "2]" on the parameter list.
+        # However, it's better to maintain the param string unchanged for
+        # output, so just weaken the string compare in check_sections()
+        # to ignore "[blah" in a parameter string.
+
+        self.entry.parameterlist.append(param)
+        org_arg = Re(r'\s\s+').sub(' ', org_arg, count=1)
+        self.entry.parametertypes[param] = org_arg
+
+    def save_struct_actual(self, actual):
+        """
+        Strip all spaces from the actual param so that it looks like
+        one string item.
+        """
+
+        actual = Re(r'\s*').sub("", actual, count=1)
+
+        self.entry.struct_actual += actual + " "
+
+    def create_parameter_list(self, ln, decl_type, args, splitter, declaration_name):
+
+        # temporarily replace all commas inside function pointer definition
+        arg_expr = Re(r'(\([^\),]+),')
+        while arg_expr.search(args):
+            args = arg_expr.sub(r"\1#", args)
+
+        for arg in args.split(splitter):
+            # Strip comments
+            arg = Re(r'\/\*.*\*\/').sub('', arg)
+
+            # Ignore argument attributes
+            arg = Re(r'\sPOS0?\s').sub(' ', arg)
+
+            # Strip leading/trailing spaces
+            arg = arg.strip()
+            arg = Re(r'\s+').sub(' ', arg, count=1)
+
+            if arg.startswith('#'):
+                # Treat preprocessor directive as a typeless variable just to fill
+                # corresponding data structures "correctly". Catch it later in
+                # output_* subs.
+
+                # Treat preprocessor directive as a typeless variable
+                self.push_parameter(ln, decl_type, arg, "",
+                                    "", declaration_name)
+
+            elif Re(r'\(.+\)\s*\(').search(arg):
+                # Pointer-to-function
+
+                arg = arg.replace('#', ',')
+
+                r = Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
+                if r.match(arg):
+                    param = r.group(1)
+                else:
+                    self.emit_warning(ln, f"Invalid param: {arg}")
+                    param = arg
+
+                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+                self.save_struct_actual(param)
+                self.push_parameter(ln, decl_type, param, dtype,
+                                    arg, declaration_name)
+
+            elif Re(r'\(.+\)\s*\[').search(arg):
+                # Array-of-pointers
+
+                arg = arg.replace('#', ',')
+                r = Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
+                if r.match(arg):
+                    param = r.group(1)
+                else:
+                    self.emit_warning(ln, f"Invalid param: {arg}")
+                    param = arg
+
+                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+
+                self.save_struct_actual(param)
+                self.push_parameter(ln, decl_type, param, dtype,
+                                    arg, declaration_name)
+
+            elif arg:
+                arg = Re(r'\s*:\s*').sub(":", arg)
+                arg = Re(r'\s*\[').sub('[', arg)
+
+                args = Re(r'\s*,\s*').split(arg)
+                if args[0] and '*' in args[0]:
+                    args[0] = re.sub(r'(\*+)\s*', r' \1', args[0])
+
+                first_arg = []
+                r = Re(r'^(.*\s+)(.*?\[.*\].*)$')
+                if args[0] and r.match(args[0]):
+                    args.pop(0)
+                    first_arg.extend(r.group(1).split())
+                    first_arg.append(r.group(2))
+                else:
+                    first_arg = Re(r'\s+').split(args.pop(0))
+
+                args.insert(0, first_arg.pop())
+                dtype = ' '.join(first_arg)
+
+                for param in args:
+                    if Re(r'^(\*+)\s*(.*)').match(param):
+                        r = Re(r'^(\*+)\s*(.*)')
+                        if not r.match(param):
+                            self.emit_warning(ln, f"Invalid param: {param}")
+                            continue
+
+                        param = r.group(1)
+
+                        self.save_struct_actual(r.group(2))
+                        self.push_parameter(ln, decl_type, r.group(2),
+                                            f"{dtype} {r.group(1)}",
+                                            arg, declaration_name)
+
+                    elif Re(r'(.*?):(\w+)').search(param):
+                        r = Re(r'(.*?):(\w+)')
+                        if not r.match(param):
+                            self.emit_warning(ln, f"Invalid param: {param}")
+                            continue
+
+                        if dtype != "":  # Skip unnamed bit-fields
+                            self.save_struct_actual(r.group(1))
+                            self.push_parameter(ln, decl_type, r.group(1),
+                                                f"{dtype}:{r.group(2)}",
+                                                arg, declaration_name)
+                    else:
+                        self.save_struct_actual(param)
+                        self.push_parameter(ln, decl_type, param, dtype,
+                                            arg, declaration_name)
+
+    def check_sections(self, ln, decl_name, decl_type, sectcheck, prmscheck):
+        sects = sectcheck.split()
+        prms = prmscheck.split()
+        err = False
+
+        for sx in range(len(sects)):                  # pylint: disable=C0200
+            err = True
+            for px in range(len(prms)):               # pylint: disable=C0200
+                prm_clean = prms[px]
+                prm_clean = Re(r'\[.*\]').sub('', prm_clean)
+                prm_clean = attribute.sub('', prm_clean)
+
+                # ignore array size in a parameter string;
+                # however, the original param string may contain
+                # spaces, e.g.:  addr[6 + 2]
+                # and this appears in @prms as "addr[6" since the
+                # parameter list is split at spaces;
+                # hence just ignore "[..." for the sections check;
+                prm_clean = Re(r'\[.*').sub('', prm_clean)
+
+                if prm_clean == sects[sx]:
+                    err = False
+                    break
+
+            if err:
+                if decl_type == 'function':
+                    dname = f"{decl_type} parameter"
+                else:
+                    dname = f"{decl_type} member"
+
+                self.emit_warning(ln,
+                                  f"Excess {dname} '{sects[sx]}' description in '{decl_name}'")
+
+    def check_return_section(self, ln, declaration_name, return_type):
+
+        if not self.config.wreturn:
+            return
+
+        # Ignore an empty return type (It's a macro)
+        # Ignore functions with a "void" return type (but not "void *")
+        if not return_type or Re(r'void\s*\w*\s*$').search(return_type):
+            return
+
+        if not self.entry.sections.get("Return", None):
+            self.emit_warning(ln,
+                              f"No description found for return value of '{declaration_name}'")
+
+    def dump_struct(self, ln, proto):
+        """
+        Store an entry for a struct or union
+        """
+
+        type_pattern = r'(struct|union)'
+
+        qualifiers = [
+            "__attribute__",
+            "__packed",
+            "__aligned",
+            "____cacheline_aligned_in_smp",
+            "____cacheline_aligned",
+        ]
+
+        definition_body = r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) + ")?"
+        struct_members = Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
+
+        # Extract struct/union definition
+        members = None
+        declaration_name = None
+        decl_type = None
+
+        r = Re(type_pattern + r'\s+(\w+)\s*' + definition_body)
+        if r.search(proto):
+            decl_type = r.group(1)
+            declaration_name = r.group(2)
+            members = r.group(3)
+        else:
+            r = Re(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
+
+            if r.search(proto):
+                decl_type = r.group(1)
+                declaration_name = r.group(3)
+                members = r.group(2)
+
+        if not members:
+            self.emit_warning(ln, f"{proto} error: Cannot parse struct or union!")
+            self.config.errors += 1
+            return
+
+        if self.entry.identifier != declaration_name:
+            self.emit_warning(ln,
+                              f"expecting prototype for {decl_type} {self.entry.identifier}. Prototype was for {decl_type} {declaration_name} instead\n")
+            return
+
+        args_pattern = r'([^,)]+)'
+
+        sub_prefixes = [
+            (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I),  ''),
+            (Re(r'\/\*\s*private:.*', re.S | re.I),  ''),
+
+            # Strip comments
+            (Re(r'\/\*.*?\*\/', re.S),  ''),
+
+            # Strip attributes
+            (attribute, ' '),
+            (Re(r'\s*__aligned\s*\([^;]*\)', re.S),  ' '),
+            (Re(r'\s*__counted_by\s*\([^;]*\)', re.S),  ' '),
+            (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S),  ' '),
+            (Re(r'\s*__packed\s*', re.S),  ' '),
+            (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S),  ' '),
+            (Re(r'\s*____cacheline_aligned_in_smp', re.S),  ' '),
+            (Re(r'\s*____cacheline_aligned', re.S),  ' '),
+
+            # Unwrap struct_group() based on this definition:
+            # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
+            # which has variants like: struct_group(NAME, MEMBERS...)
+
+            (Re(r'\bstruct_group\s*\(([^,]*,)', re.S),  r'STRUCT_GROUP('),
+            (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S),  r'STRUCT_GROUP('),
+            (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S),  r'struct \1 \2; STRUCT_GROUP('),
+            (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S),  r'STRUCT_GROUP('),
+
+            # This is incompatible with Python re, as it uses:
+            #  recursive patterns ((?1)) and atomic grouping ((?>...)):
+            #   '\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;'
+            # Let's see if this works instead:
+            (Re(r'\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;', re.S),  r'\1'),
+
+            # Replace macros
+            (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),  r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
+            (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S),  r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
+            (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'unsigned long \1[BITS_TO_LONGS(\2)]'),
+            (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'unsigned long \1[1 << ((\2) - 1)]'),
+            (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'\2 *\1'),
+            (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'\2 *\1'),
+            (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'\1 \2[]'),
+            (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S),  r'dma_addr_t \1'),
+            (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S),  r'__u32 \1'),
+        ]
+
+        for search, sub in sub_prefixes:
+            members = search.sub(sub, members)
+
+        # Keeps the original declaration as-is
+        declaration = members
+
+        # Split nested struct/union elements
+        #
+        # This loop was simpler in the original kernel-doc Perl version, as
+        #   while ($members =~ m/$struct_members/) { ... }
+        # re-reads the 'members' string on each iteration.
+        #
+        # Python behavior is different: findall() parses 'members' only once,
+        # creating a list of tuples from the first iteration.
+        #
+        # In other words, a single pass won't get nested structs.
+        #
+        # So, we need an extra loop in Python to work around this
+        # re limitation.
+
+        while True:
+            tuples = struct_members.findall(members)
+            if not tuples:
+                break
+
+            for t in tuples:
+                newmember = ""
+                maintype = t[0]
+                s_ids = t[5]
+                content = t[3]
+
+                oldmember = "".join(t)
+
+                for s_id in s_ids.split(','):
+                    s_id = s_id.strip()
+
+                    newmember += f"{maintype} {s_id}; "
+                    s_id = Re(r'[:\[].*').sub('', s_id)
+                    s_id = Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id)
+
+                    for arg in content.split(';'):
+                        arg = arg.strip()
+
+                        if not arg:
+                            continue
+
+                        r = Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)')
+                        if r.match(arg):
+                            # Pointer-to-function
+                            dtype = r.group(1)
+                            name = r.group(2)
+                            extra = r.group(3)
+
+                            if not name:
+                                continue
+
+                            if not s_id:
+                                # Anonymous struct/union
+                                newmember += f"{dtype}{name}{extra}; "
+                            else:
+                                newmember += f"{dtype}{s_id}.{name}{extra}; "
+
+                        else:
+                            arg = arg.strip()
+                            # Handle bit-fields
+                            arg = Re(r':\s*\d+\s*').sub('', arg)
+
+                            # Handle arrays
+                            arg = Re(r'\[.*\]').sub('', arg)
+
+                            # Handle multiple IDs
+                            arg = Re(r'\s*,\s*').sub(',', arg)
+
+
+                            r = Re(r'(.*)\s+([\S+,]+)')
+
+                            if r.search(arg):
+                                dtype = r.group(1)
+                                names = r.group(2)
+                            else:
+                                newmember += f"{arg}; "
+                                continue
+
+                            for name in names.split(','):
+                                name = Re(r'^\s*\**(\S+)\s*').sub(r'\1', name).strip()
+
+                                if not name:
+                                    continue
+
+                                if not s_id:
+                                    # Anonymous struct/union
+                                    newmember += f"{dtype} {name}; "
+                                else:
+                                    newmember += f"{dtype} {s_id}.{name}; "
+
+                members = members.replace(oldmember, newmember)
+
+        # Ignore other nested elements, like enums
+        members = re.sub(r'(\{[^\{\}]*\})', '', members)
+
+        self.create_parameter_list(ln, decl_type, members, ';',
+                                   declaration_name)
+        self.check_sections(ln, declaration_name, decl_type,
+                            self.entry.sectcheck, self.entry.struct_actual)
+
+        # Adjust declaration for better display
+        declaration = Re(r'([\{;])').sub(r'\1\n', declaration)
+        declaration = Re(r'\}\s+;').sub('};', declaration)
+
+        # Better handle inlined enums
+        while True:
+            r = Re(r'(enum\s+\{[^\}]+),([^\n])')
+            if not r.search(declaration):
+                break
+
+            declaration = r.sub(r'\1,\n\2', declaration)
+
+        def_args = declaration.split('\n')
+        level = 1
+        declaration = ""
+        for clause in def_args:
+
+            clause = clause.strip()
+            clause = Re(r'\s+').sub(' ', clause, count=1)
+
+            if not clause:
+                continue
+
+            if '}' in clause and level > 1:
+                level -= 1
+
+            if not Re(r'^\s*#').match(clause):
+                declaration += "\t" * level
+
+            declaration += "\t" + clause + "\n"
+            if "{" in clause and "}" not in clause:
+                level += 1
+
+        self.output_declaration(decl_type, declaration_name,
+                    struct=declaration_name,
+                    module=self.entry.modulename,
+                    definition=declaration,
+                    parameterlist=self.entry.parameterlist,
+                    parameterdescs=self.entry.parameterdescs,
+                    parametertypes=self.entry.parametertypes,
+                    sectionlist=self.entry.sectionlist,
+                    sections=self.entry.sections,
+                    purpose=self.entry.declaration_purpose)
+
+    def dump_enum(self, ln, proto):
+
+        # Ignore members marked private
+        proto = Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=re.S).sub('', proto)
+        proto = Re(r'\/\*\s*private:.*}', flags=re.S).sub('}', proto)
+
+        # Strip comments
+        proto = Re(r'\/\*.*?\*\/', flags=re.S).sub('', proto)
+
+        # Strip #define macros inside enums
+        proto = Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=re.S).sub('', proto)
+
+        members = None
+        declaration_name = None
+
+        r = Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;')
+        if r.search(proto):
+            declaration_name = r.group(2)
+            members = r.group(1).rstrip()
+        else:
+            r = Re(r'enum\s+(\w*)\s*\{(.*)\}')
+            if r.match(proto):
+                declaration_name = r.group(1)
+                members = r.group(2).rstrip()
+
+        if not members:
+            self.emit_warning(ln, f"{proto}: error: Cannot parse enum!")
+            self.config.errors += 1
+            return
+
+        if self.entry.identifier != declaration_name:
+            if self.entry.identifier == "":
+                self.emit_warning(ln,
+                                  f"{proto}: wrong kernel-doc identifier on prototype")
+            else:
+                self.emit_warning(ln,
+                                  f"expecting prototype for enum {self.entry.identifier}. Prototype was for enum {declaration_name} instead")
+            return
+
+        if not declaration_name:
+            declaration_name = "(anonymous)"
+
+        member_set = set()
+
+        members = Re(r'\([^;]*?[\)]').sub('', members)
+
+        for arg in members.split(','):
+            if not arg:
+                continue
+            arg = Re(r'^\s*(\w+).*').sub(r'\1', arg)
+            self.entry.parameterlist.append(arg)
+            if arg not in self.entry.parameterdescs:
+                self.entry.parameterdescs[arg] = self.undescribed
+                if self.show_warnings("enum", declaration_name):
+                    self.emit_warning(ln,
+                                      f"Enum value '{arg}' not described in enum '{declaration_name}'")
+            member_set.add(arg)
+
+        for k in self.entry.parameterdescs:
+            if k not in member_set:
+                if self.show_warnings("enum", declaration_name):
+                    self.emit_warning(ln,
+                                      f"Excess enum value '%{k}' description in '{declaration_name}'")
+
+        self.output_declaration('enum', declaration_name,
+                   enum=declaration_name,
+                   module=self.config.modulename,
+                   parameterlist=self.entry.parameterlist,
+                   parameterdescs=self.entry.parameterdescs,
+                   sectionlist=self.entry.sectionlist,
+                   sections=self.entry.sections,
+                   purpose=self.entry.declaration_purpose)
+
+    def dump_declaration(self, ln, prototype):
+        if self.entry.decl_type == "enum":
+            self.dump_enum(ln, prototype)
+            return
+
+        if self.entry.decl_type == "typedef":
+            self.dump_typedef(ln, prototype)
+            return
+
+        if self.entry.decl_type in ["union", "struct"]:
+            self.dump_struct(ln, prototype)
+            return
+
+        # TODO: handle other types
+        self.output_declaration(self.entry.decl_type, prototype,
+                   entry=self.entry)
+
+    def dump_function(self, ln, prototype):
+
+        func_macro = False
+        return_type = ''
+        decl_type = 'function'
+
+        # Prefixes that would be removed
+        sub_prefixes = [
+            (r"^static +", "", 0),
+            (r"^extern +", "", 0),
+            (r"^asmlinkage +", "", 0),
+            (r"^inline +", "", 0),
+            (r"^__inline__ +", "", 0),
+            (r"^__inline +", "", 0),
+            (r"^__always_inline +", "", 0),
+            (r"^noinline +", "", 0),
+            (r"^__FORTIFY_INLINE +", "", 0),
+            (r"__init +", "", 0),
+            (r"__init_or_module +", "", 0),
+            (r"__deprecated +", "", 0),
+            (r"__flatten +", "", 0),
+            (r"__meminit +", "", 0),
+            (r"__must_check +", "", 0),
+            (r"__weak +", "", 0),
+            (r"__sched +", "", 0),
+            (r"_noprof", "", 0),
+            (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0),
+            (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", 0),
+            (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0),
+            (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2", 0),
+            (r"__attribute_const__ +", "", 0),
+
+            # It seems that Python support for re.X is broken:
+            # At least for me (Python 3.13), this didn't work
+#            (r"""
+#              __attribute__\s*\(\(
+#                (?:
+#                    [\w\s]+          # attribute name
+#                    (?:\([^)]*\))?   # attribute arguments
+#                    \s*,?            # optional comma at the end
+#                )+
+#              \)\)\s+
+#             """, "", re.X),
+
+            # So, remove whitespaces and comments from it
+            (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+", "", 0),
+        ]
+
+        for search, sub, flags in sub_prefixes:
+            prototype = Re(search, flags=flags).sub(sub, prototype)
+
+        # Macros are a special case, as they change the prototype format
+        new_proto = Re(r"^#\s*define\s+").sub("", prototype)
+        if new_proto != prototype:
+            is_define_proto = True
+            prototype = new_proto
+        else:
+            is_define_proto = False
+
+        # Yes, this truly is vile.  We are looking for:
+        # 1. Return type (may be nothing if we're looking at a macro)
+        # 2. Function name
+        # 3. Function parameters.
+        #
+        # All the while we have to watch out for function pointer parameters
+        # (which IIRC is what the two sections are for), C types (these
+        # regexps don't even start to express all the possibilities), and
+        # so on.
+        #
+        # If you mess with these regexps, it's a good idea to check that
+        # the following functions' documentation still comes out right:
+        # - parport_register_device (function pointer parameters)
+        # - atomic_set (macro)
+        # - pci_match_device, __copy_to_user (long return type)
+
+        name = r'[a-zA-Z0-9_~:]+'
+        prototype_end1 = r'[^\(]*'
+        prototype_end2 = r'[^\{]*'
+        prototype_end = fr'\(({prototype_end1}|{prototype_end2})\)'
+
+        # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing group.
+        # So, this needs to be mapped in Python with (?:...)? or (?:...)+
+
+        type1 = r'(?:[\w\s]+)?'
+        type2 = r'(?:[\w\s]+\*+)+'
+
+        found = False
+
+        if is_define_proto:
+            r = Re(r'^()(' + name + r')\s+')
+
+            if r.search(prototype):
+                return_type = ''
+                declaration_name = r.group(2)
+                func_macro = True
+
+                found = True
+
+        if not found:
+            patterns = [
+                rf'^()({name})\s*{prototype_end}',
+                rf'^({type1})\s+({name})\s*{prototype_end}',
+                rf'^({type2})\s*({name})\s*{prototype_end}',
+            ]
+
+            for p in patterns:
+                r = Re(p)
+
+                if r.match(prototype):
+
+                    return_type = r.group(1)
+                    declaration_name = r.group(2)
+                    args = r.group(3)
+
+                    self.create_parameter_list(ln, decl_type, args, ',',
+                                               declaration_name)
+
+                    found = True
+                    break
+        if not found:
+            self.emit_warning(ln,
+                              f"cannot understand function prototype: '{prototype}'")
+            return
+
+        if self.entry.identifier != declaration_name:
+            self.emit_warning(ln,
+                              f"expecting prototype for {self.entry.identifier}(). Prototype was for {declaration_name}() instead")
+            return
+
+        prms = " ".join(self.entry.parameterlist)
+        self.check_sections(ln, declaration_name, "function",
+                            self.entry.sectcheck, prms)
+
+        self.check_return_section(ln, declaration_name, return_type)
+
+        if 'typedef' in return_type:
+            self.output_declaration(decl_type, declaration_name,
+                       function=declaration_name,
+                       typedef=True,
+                       module=self.config.modulename,
+                       functiontype=return_type,
+                       parameterlist=self.entry.parameterlist,
+                       parameterdescs=self.entry.parameterdescs,
+                       parametertypes=self.entry.parametertypes,
+                       sectionlist=self.entry.sectionlist,
+                       sections=self.entry.sections,
+                       purpose=self.entry.declaration_purpose,
+                       func_macro=func_macro)
+        else:
+            self.output_declaration(decl_type, declaration_name,
+                       function=declaration_name,
+                       typedef=False,
+                       module=self.config.modulename,
+                       functiontype=return_type,
+                       parameterlist=self.entry.parameterlist,
+                       parameterdescs=self.entry.parameterdescs,
+                       parametertypes=self.entry.parametertypes,
+                       sectionlist=self.entry.sectionlist,
+                       sections=self.entry.sections,
+                       purpose=self.entry.declaration_purpose,
+                       func_macro=func_macro)
+
+    def dump_typedef(self, ln, proto):
+        typedef_type = r'((?:\s+[\w\*]+\b){1,8})\s*'
+        typedef_ident = r'\*?\s*(\w\S+)\s*'
+        typedef_args = r'\s*\((.*)\);'
+
+        typedef1 = Re(r'typedef' + typedef_type + r'\(' + typedef_ident + r'\)' + typedef_args)
+        typedef2 = Re(r'typedef' + typedef_type + typedef_ident + typedef_args)
+
+        # Strip comments
+        proto = Re(r'/\*.*?\*/', flags=re.S).sub('', proto)
+
+        # Parse function typedef prototypes
+        for r in [typedef1, typedef2]:
+            if not r.match(proto):
+                continue
+
+            return_type = r.group(1).strip()
+            declaration_name = r.group(2)
+            args = r.group(3)
+
+            if self.entry.identifier != declaration_name:
+                self.emit_warning(ln,
+                                  f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
+                return
+
+            decl_type = 'function'
+            self.create_parameter_list(ln, decl_type, args, ',', declaration_name)
+
+            self.output_declaration(decl_type, declaration_name,
+                       function=declaration_name,
+                       typedef=True,
+                       module=self.entry.modulename,
+                       functiontype=return_type,
+                       parameterlist=self.entry.parameterlist,
+                       parameterdescs=self.entry.parameterdescs,
+                       parametertypes=self.entry.parametertypes,
+                       sectionlist=self.entry.sectionlist,
+                       sections=self.entry.sections,
+                       purpose=self.entry.declaration_purpose)
+            return
+
+        # Handle nested parentheses or brackets
+        r = Re(r'(\(*.\)\s*|\[*.\]\s*);$')
+        while r.search(proto):
+            proto = r.sub(';', proto)
+
+        # Parse simple typedefs
+        r = Re(r'typedef.*\s+(\w+)\s*;')
+        if r.match(proto):
+            declaration_name = r.group(1)
+
+            if self.entry.identifier != declaration_name:
+                self.emit_warning(ln, f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
+                return
+
+            self.output_declaration('typedef', declaration_name,
+                       typedef=declaration_name,
+                       module=self.entry.modulename,
+                       sectionlist=self.entry.sectionlist,
+                       sections=self.entry.sections,
+                       purpose=self.entry.declaration_purpose)
+            return
+
+        self.emit_warning(ln, "error: Cannot parse typedef!")
+        self.config.errors += 1
+
+    @staticmethod
+    def process_export(function_table, line):
+        """
+        process EXPORT_SYMBOL* tags
+
+        This method is called both internally and externally, so, it
+        doesn't use self.
+        """
+
+        if export_symbol.search(line):
+            symbol = export_symbol.group(2)
+            function_table.add(symbol)
+
+        if export_symbol_ns.search(line):
+            symbol = export_symbol_ns.group(2)
+            function_table.add(symbol)
+
+    def process_normal(self, ln, line):
+        """
+        STATE_NORMAL: looking for the /** to begin everything.
+        """
+
+        if not doc_start.match(line):
+            return
+
+        # start a new entry
+        self.reset_state(ln + 1)
+        self.entry.in_doc_sect = False
+
+        # next line is always the function name
+        self.state = self.STATE_NAME
+
+    def process_name(self, ln, line):
+        """
+        STATE_NAME: Looking for the "name - description" line
+        """
+
+        if doc_block.search(line):
+            self.entry.new_start_line = ln
+
+            if not doc_block.group(1):
+                self.entry.section = self.section_intro
+            else:
+                self.entry.section = doc_block.group(1)
+
+            self.state = self.STATE_DOCBLOCK
+            return
+
+        if doc_decl.search(line):
+            self.entry.identifier = doc_decl.group(1)
+            self.entry.is_kernel_comment = False
+
+            decl_start = str(doc_com)       # comment block asterisk
+            fn_type = r"(?:\w+\s*\*\s*)?"  # type (for non-functions)
+            parenthesis = r"(?:\(\w*\))?"   # optional parenthesis on function
+            decl_end = r"(?:[-:].*)"         # end of the name part
+
+            # test for pointer declaration type, foo * bar() - desc
+            r = Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}?$")
+            if r.search(line):
+                self.entry.identifier = r.group(1)
+
+            # Test for data declaration
+            r = Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)")
+            if r.search(line):
+                self.entry.decl_type = r.group(1)
+                self.entry.identifier = r.group(2)
+                self.entry.is_kernel_comment = True
+            else:
+                # Look for foo() or static void foo() - description;
+                # or misspelt identifier
+
+                r1 = Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s*{decl_end}?$")
+                r2 = Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis}\s*{decl_end}$")
+
+                for r in [r1, r2]:
+                    if r.search(line):
+                        self.entry.identifier = r.group(1)
+                        self.entry.decl_type = "function"
+
+                        r = Re(r"define\s+")
+                        self.entry.identifier = r.sub("", self.entry.identifier)
+                        self.entry.is_kernel_comment = True
+                        break
+
+            self.entry.identifier = self.entry.identifier.strip(" ")
+
+            self.state = self.STATE_BODY
+
+            # if there are no @param blocks, set up the default section here
+            self.entry.section = self.section_default
+            self.entry.new_start_line = ln + 1
+
+            r = Re("[-:](.*)")
+            if r.search(line):
+                # strip leading/trailing/multiple spaces
+                self.entry.descr = r.group(1).strip(" ")
+
+                r = Re(r"\s+")
+                self.entry.descr = r.sub(" ", self.entry.descr)
+                self.entry.declaration_purpose = self.entry.descr
+                self.state = self.STATE_BODY_MAYBE
+            else:
+                self.entry.declaration_purpose = ""
+
+            if not self.entry.is_kernel_comment:
+                self.emit_warning(ln,
+                                  f"This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst\n{line}")
+                self.state = self.STATE_NORMAL
+
+            if not self.entry.declaration_purpose and self.config.wshort_desc:
+                self.emit_warning(ln,
+                                  f"missing initial short description on line:\n{line}")
+
+            if not self.entry.identifier and self.entry.decl_type != "enum":
+                self.emit_warning(ln,
+                                  f"wrong kernel-doc identifier on line:\n{line}")
+                self.state = self.STATE_NORMAL
+
+            if self.config.verbose:
+                self.emit_warning(ln,
+                                  f"Scanning doc for {self.entry.decl_type} {self.entry.identifier}",
+                                  warning=False)
+
+            return
+
+        # Failed to find an identifier. Emit a warning
+        self.emit_warning(ln, f"Cannot find identifier on line:\n{line}")
+
+    def process_body(self, ln, line):
+        """
+        STATE_BODY and STATE_BODY_MAYBE: the bulk of a kerneldoc comment.
+        """
+
+        if self.state == self.STATE_BODY_WITH_BLANK_LINE:
+            r = Re(r"\s*\*\s?\S")
+            if r.match(line):
+                self.dump_section()
+                self.entry.section = self.section_default
+                self.entry.new_start_line = ln
+                self.entry.contents = ""
+
+        if doc_sect.search(line):
+            self.entry.in_doc_sect = True
+            newsection = doc_sect.group(1)
+
+            if newsection.lower() in ["description", "context"]:
+                newsection = newsection.title()
+
+            # Special case: @return is a section, not a param description
+            if newsection.lower() in ["@return", "@returns",
+                                    "return", "returns"]:
+                newsection = "Return"
+
+            # Perl kernel-doc has a check here for contents before sections.
+            # The logic there is always false, as the in_doc_sect variable is
+            # always true. So, just don't implement Wcontents_before_sections.
+
+            newcontents = doc_sect.group(2)
+            if not newcontents:
+                newcontents = ""
+
+            if self.entry.contents.strip("\n"):
+                self.dump_section()
+
+            self.entry.new_start_line = ln
+            self.entry.section = newsection
+            self.entry.leading_space = None
+
+            self.entry.contents = newcontents.lstrip()
+            if self.entry.contents:
+                self.entry.contents += "\n"
+
+            self.state = self.STATE_BODY
+            return
+
+        if doc_end.search(line):
+            if self.entry.contents.strip("\n"):
+                self.dump_section()
+
+            # Look for doc_com + <text> + doc_end:
+            r = Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
+            if r.match(line):
+                self.emit_warning(ln, f"suspicious ending line: {line}")
+
+            self.entry.prototype = ""
+            self.entry.new_start_line = ln + 1
+
+            self.state = self.STATE_PROTO
+            return
+
+        if doc_content.search(line):
+            cont = doc_content.group(1)
+
+            if cont == "":
+                if self.entry.section == self.section_context:
+                    self.dump_section()
+
+                    self.entry.new_start_line = ln
+                    self.state = self.STATE_BODY
+                else:
+                    if self.entry.section != self.section_default:
+                        self.state = self.STATE_BODY_WITH_BLANK_LINE
+                    else:
+                        self.state = self.STATE_BODY
+
+                    self.entry.contents += "\n"
+
+            elif self.state == self.STATE_BODY_MAYBE:
+
+                # Continued declaration purpose
+                self.entry.declaration_purpose = self.entry.declaration_purpose.rstrip()
+                self.entry.declaration_purpose += " " + cont
+
+                r = Re(r"\s+")
+                self.entry.declaration_purpose = r.sub(' ',
+                                                       self.entry.declaration_purpose)
+
+            else:
+                if self.entry.section.startswith('@') or        \
+                   self.entry.section == self.section_context:
+                    if self.entry.leading_space is None:
+                        r = Re(r'^(\s+)')
+                        if r.match(cont):
+                            self.entry.leading_space = len(r.group(1))
+                        else:
+                            self.entry.leading_space = 0
+
+                    # Double-check if leading spaces are really spaces
+                    pos = 0
+                    for i in range(0, min(self.entry.leading_space, len(cont))):
+                        if cont[i] != " ":
+                            break
+                        pos += 1
+
+                    cont = cont[pos:]
+
+                    # NEW LOGIC:
+                    # In case it is different, update it
+                    if self.entry.leading_space != pos:
+                        self.entry.leading_space = pos
+
+                self.entry.contents += cont + "\n"
+            return
+
+        # Unknown line, ignore
+        self.emit_warning(ln, f"bad line: {line}")
+
+    def process_inline(self, ln, line):
+        """STATE_INLINE: docbook comments within a prototype."""
+
+        if self.inline_doc_state == self.STATE_INLINE_NAME and \
+           doc_inline_sect.search(line):
+            self.entry.section = doc_inline_sect.group(1)
+            self.entry.new_start_line = ln
+
+            self.entry.contents = doc_inline_sect.group(2).lstrip()
+            if self.entry.contents != "":
+                self.entry.contents += "\n"
+
+            self.inline_doc_state = self.STATE_INLINE_TEXT
+            return
+
+        # Documentation block end */
+        if doc_inline_end.search(line):
+            if self.entry.contents not in ["", "\n"]:
+                self.dump_section()
+
+            self.state = self.STATE_PROTO
+            self.inline_doc_state = self.STATE_INLINE_NA
+            return
+
+        if doc_content.search(line):
+            if self.inline_doc_state == self.STATE_INLINE_TEXT:
+                self.entry.contents += doc_content.group(1) + "\n"
+                if not self.entry.contents.strip(" ").rstrip("\n"):
+                    self.entry.contents = ""
+
+            elif self.inline_doc_state == self.STATE_INLINE_NAME:
+                self.emit_warning(ln,
+                                  f"Incorrect use of kernel-doc format: {line}")
+
+                self.inline_doc_state = self.STATE_INLINE_ERROR
+
+    def syscall_munge(self, ln, proto):
+        """
+        Handle syscall definitions
+        """
+
+        is_void = False
+
+        # Strip newlines/CR's
+        proto = re.sub(r'[\r\n]+', ' ', proto)
+
+        # Check if it's a SYSCALL_DEFINE0
+        if 'SYSCALL_DEFINE0' in proto:
+            is_void = True
+
+        # Replace SYSCALL_DEFINE with correct return type & function name
+        proto = Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto)
+
+        r = Re(r'long\s+(sys_.*?),')
+        if r.search(proto):
+            proto = proto.replace(',', '(', 1)
+        elif is_void:
+            proto = proto.replace(')', '(void)', 1)
+
+        # Now delete all of the odd-numbered commas in the proto
+        # so that argument types & names don't have a comma between them
+        count = 0
+        length = len(proto)
+
+        if is_void:
+            length = 0  # skip the loop if is_void
+
+        for ix in range(length):
+            if proto[ix] == ',':
+                count += 1
+                if count % 2 == 1:
+                    proto = proto[:ix] + ' ' + proto[ix+1:]
+
+        return proto
+
+    def tracepoint_munge(self, ln, proto):
+        """
+        Handle tracepoint definitions
+        """
+
+        tracepointname = None
+        tracepointargs = None
+
+        # Match tracepoint name based on different patterns
+        r = Re(r'TRACE_EVENT\((.*?),')
+        if r.search(proto):
+            tracepointname = r.group(1)
+
+        r = Re(r'DEFINE_SINGLE_EVENT\((.*?),')
+        if r.search(proto):
+            tracepointname = r.group(1)
+
+        r = Re(r'DEFINE_EVENT\((.*?),(.*?),')
+        if r.search(proto):
+            tracepointname = r.group(2)
+
+        if tracepointname:
+            tracepointname = tracepointname.lstrip()
+
+        r = Re(r'TP_PROTO\((.*?)\)')
+        if r.search(proto):
+            tracepointargs = r.group(1)
+
+        if not tracepointname or not tracepointargs:
+            self.emit_warning(ln,
+                              f"Unrecognized tracepoint format:\n{proto}\n")
+        else:
+            proto = f"static inline void trace_{tracepointname}({tracepointargs})"
+            self.entry.identifier = f"trace_{self.entry.identifier}"
+
+        return proto
+
+    def process_proto_function(self, ln, line):
+        """Ancillary routine to process a function prototype"""
+
+        # strip C99-style comments to end of line
+        r = Re(r"\/\/.*$", re.S)
+        line = r.sub('', line)
+
+        if Re(r'\s*#\s*define').match(line):
+            self.entry.prototype = line
+        elif line.startswith('#'):
+            # Strip other macros like #ifdef/#ifndef/#endif/...
+            pass
+        else:
+            r = Re(r'([^\{]*)')
+            if r.match(line):
+                self.entry.prototype += r.group(1) + " "
+
+        if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line):
+            # strip comments
+            r = Re(r'/\*.*?\*/')
+            self.entry.prototype = r.sub('', self.entry.prototype)
+
+            # strip newlines/cr's
+            r = Re(r'[\r\n]+')
+            self.entry.prototype = r.sub(' ', self.entry.prototype)
+
+            # strip leading spaces
+            r = Re(r'^\s+')
+            self.entry.prototype = r.sub('', self.entry.prototype)
+
+            # Handle prototypes for function pointers like:
+            #       int (*pcs_config)(struct foo)
+
+            r = Re(r'^(\S+\s+)\(\s*\*(\S+)\)')
+            self.entry.prototype = r.sub(r'\1\2', self.entry.prototype)
+
+            if 'SYSCALL_DEFINE' in self.entry.prototype:
+                self.entry.prototype = self.syscall_munge(ln,
+                                                          self.entry.prototype)
+
+            r = Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT')
+            if r.search(self.entry.prototype):
+                self.entry.prototype = self.tracepoint_munge(ln,
+                                                             self.entry.prototype)
+
+            self.dump_function(ln, self.entry.prototype)
+            self.reset_state(ln)
+
+    def process_proto_type(self, ln, line):
+        """Ancillary routine to process a type"""
+
+        # Strip newlines/cr's.
+        line = Re(r'[\r\n]+', re.S).sub(' ', line)
+
+        # Strip leading spaces
+        line = Re(r'^\s+', re.S).sub('', line)
+
+        # Strip trailing spaces
+        line = Re(r'\s+$', re.S).sub('', line)
+
+        # Strip C99-style comments to the end of the line
+        line = Re(r"\/\/.*$", re.S).sub('', line)
+
+        # To distinguish preprocessor directive from regular declaration later.
+        if line.startswith('#'):
+            line += ";"
+
+        r = Re(r'([^\{\};]*)([\{\};])(.*)')
+        while True:
+            if r.search(line):
+                if self.entry.prototype:
+                    self.entry.prototype += " "
+                self.entry.prototype += r.group(1) + r.group(2)
+
+                self.entry.brcount += r.group(2).count('{')
+                self.entry.brcount -= r.group(2).count('}')
+
+                self.entry.brcount = max(self.entry.brcount, 0)
+
+                if r.group(2) == ';' and self.entry.brcount == 0:
+                    self.dump_declaration(ln, self.entry.prototype)
+                    self.reset_state(ln)
+                    break
+
+                line = r.group(3)
+            else:
+                self.entry.prototype += line
+                break
+
+    def process_proto(self, ln, line):
+        """STATE_PROTO: reading a function/whatever prototype."""
+
+        if doc_inline_oneline.search(line):
+            self.entry.section = doc_inline_oneline.group(1)
+            self.entry.contents = doc_inline_oneline.group(2)
+
+            if self.entry.contents != "":
+                self.entry.contents += "\n"
+                self.dump_section(start_new=False)
+
+        elif doc_inline_start.search(line):
+            self.state = self.STATE_INLINE
+            self.inline_doc_state = self.STATE_INLINE_NAME
+
+        elif self.entry.decl_type == 'function':
+            self.process_proto_function(ln, line)
+
+        else:
+            self.process_proto_type(ln, line)
+
+    def process_docblock(self, ln, line):
+        """STATE_DOCBLOCK: within a DOC: block."""
+
+        if doc_end.search(line):
+            self.dump_section()
+            self.output_declaration("doc", None,
+                                    sectionlist=self.entry.sectionlist,
+                                    sections=self.entry.sections,
+                                    module=self.config.modulename)
+            self.reset_state(ln)
+
+        elif doc_content.search(line):
+            self.entry.contents += doc_content.group(1) + "\n"
+
+    def run(self):
+        """
+        Open and process each line of a C source file.
+        The parsing is controlled via a state machine, and the line is passed
+        to a different process function depending on the state. The process
+        function may update the state as needed.
+        """
+
+        cont = False
+        prev = ""
+        prev_ln = None
+
+        try:
+            with open(self.fname, "r", encoding="utf8",
+                      errors="backslashreplace") as fp:
+                for ln, line in enumerate(fp):
+
+                    line = line.expandtabs().strip("\n")
+
+                    # Group continuation lines on prototypes
+                    if self.state == self.STATE_PROTO:
+                        if line.endswith("\\"):
+                            prev += line.removesuffix("\\")
+                            cont = True
+
+                            if prev_ln is None:
+                                prev_ln = ln
+
+                            continue
+
+                        if cont:
+                            ln = prev_ln
+                            line = prev + line
+                            prev = ""
+                            cont = False
+                            prev_ln = None
+
+                    self.config.log.debug("%d %s%s: %s",
+                                          ln, self.st_name[self.state],
+                                          self.st_inline_name[self.inline_doc_state],
+                                          line)
+
+                    # TODO: not all states allow EXPORT_SYMBOL*, so this
+                    # can be optimized later on to speed up parsing
+                    self.process_export(self.config.function_table, line)
+
+                    # Hand this line to the appropriate state handler
+                    if self.state == self.STATE_NORMAL:
+                        self.process_normal(ln, line)
+                    elif self.state == self.STATE_NAME:
+                        self.process_name(ln, line)
+                    elif self.state in [self.STATE_BODY, self.STATE_BODY_MAYBE,
+                                        self.STATE_BODY_WITH_BLANK_LINE]:
+                        self.process_body(ln, line)
+                    elif self.state == self.STATE_INLINE:  # scanning for inline parameters
+                        self.process_inline(ln, line)
+                    elif self.state == self.STATE_PROTO:
+                        self.process_proto(ln, line)
+                    elif self.state == self.STATE_DOCBLOCK:
+                        self.process_docblock(ln, line)
+        except OSError:
+            self.config.log.error(f"Error: Cannot open file {self.fname}")
+            self.config.errors += 1
+
+
+class GlobSourceFiles:
+    """
+    Parse C source code file names and directories via an iterator.
+
+    """
+
+    def __init__(self, srctree=None, valid_extensions=None):
+        """
+        Initialize valid extensions with a tuple.
+
+        If not defined, assume default C extensions (.c and .h)
+
+        It would be possible to use python's glob function, but it is
+        very slow, and it is not iterative. So, it would wait to read all
+        directories before actually doing something.
+
+        So, let's use our own implementation.
+        """
+
+        if not valid_extensions:
+            self.extensions = (".c", ".h")
+        else:
+            self.extensions = valid_extensions
+
+        self.srctree = srctree
+
+    def _parse_dir(self, dirname):
+        """Internal function to parse files recursively"""
+
+        with os.scandir(dirname) as obj:
+            for entry in obj:
+                name = os.path.join(dirname, entry.name)
+
+                if entry.is_dir():
+                    yield from self._parse_dir(name)
+
+                if not entry.is_file():
+                    continue
+
+                basename = os.path.basename(name)
+
+                if not basename.endswith(self.extensions):
+                    continue
+
+                yield name
+
+    def parse_files(self, file_list, file_not_found_cb):
+        for fname in file_list:
+            if self.srctree:
+                f = os.path.join(self.srctree, fname)
+            else:
+                f = fname
+
+            if os.path.isdir(f):
+                yield from self._parse_dir(f)
+            elif os.path.isfile(f):
+                yield f
+            elif file_not_found_cb:
+                file_not_found_cb(fname)
+
+
+class KernelFiles():
+
+    def parse_file(self, fname):
+
+        doc = KernelDoc(self.config, fname)
+        doc.run()
+
+        return doc
+
+    def process_export_file(self, fname):
+        try:
+            with open(fname, "r", encoding="utf8",
+                      errors="backslashreplace") as fp:
+                for line in fp:
+                    KernelDoc.process_export(self.config.function_table, line)
+
+        except IOError:
+            print(f"Error: Cannot open file {fname}", file=sys.stderr)
+            self.config.errors += 1
+
+    def file_not_found_cb(self, fname):
+        self.config.log.error("Cannot find file %s", fname)
+        self.config.errors += 1
+
+    def __init__(self, files=None, verbose=False, out_style=None,
+                 werror=False, wreturn=False, wshort_desc=False,
+                 wcontents_before_sections=False,
+                 logger=None, modulename=None, export_file=None):
+        """Initialize startup variables and parse all files"""
+
+
+        if not verbose:
+            verbose = os.environ.get("KBUILD_VERBOSE", "0") not in ("", "0")
+
+        if not modulename:
+            modulename = "Kernel API"
+
+        dt = datetime.now()
+        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
+            # use UTC TZ
+            to_zone = tz.gettz('UTC')
+            dt = dt.astimezone(to_zone)
+
+        if not werror:
+            kcflags = os.environ.get("KCFLAGS", None)
+            if kcflags:
+                match = re.search(r"(\s|^)-Werror(\s|$)", kcflags)
+                if match:
+                    werror = True
+
+            # reading this variable is for backwards compat just in case
+            # someone was calling it with the variable from outside the
+            # kernel's build system
+            kdoc_werror = os.environ.get("KDOC_WERROR", None)
+            if kdoc_werror:
+                werror = kdoc_werror
+
+        # Set global config data used on all files
+        self.config = argparse.Namespace()
+
+        self.config.verbose = verbose
+        self.config.werror = werror
+        self.config.wreturn = wreturn
+        self.config.wshort_desc = wshort_desc
+        self.config.wcontents_before_sections = wcontents_before_sections
+        self.config.modulename = modulename
+
+        self.config.function_table = set()
+        self.config.source_map = {}
+
+        if not logger:
+            self.config.log = logging.getLogger("kernel-doc")
+        else:
+            self.config.log = logger
+
+        self.config.kernel_version = os.environ.get("KERNELVERSION",
+                                                    "unknown kernel version")
+        self.config.src_tree = os.environ.get("SRCTREE", None)
+
+        self.out_style = out_style
+        self.export_file = export_file
+
+        # Initialize internal variables
+
+        self.config.errors = 0
+        self.results = []
+
+        self.file_list = files
+        self.files = set()
+
+    def parse(self):
+        """
+        Parse all files
+        """
+
+        glob = GlobSourceFiles(srctree=self.config.src_tree)
+
+        # Let's use a set here to avoid duplicating files
+
+        for fname in glob.parse_files(self.file_list, self.file_not_found_cb):
+            if fname in self.files:
+                continue
+
+            self.files.add(fname)
+
+            res = self.parse_file(fname)
+            self.results.append((res.fname, res.entries))
+
+        if not self.files:
+            sys.exit(1)
+
+        # If a list of export files was provided, parse EXPORT_SYMBOL*
+        # from the ones not already parsed
+
+        if self.export_file:
+            files = self.files
+
+            glob = GlobSourceFiles(srctree=self.config.src_tree)
+
+            for fname in glob.parse_files(self.export_file,
+                                          self.file_not_found_cb):
+                if fname not in files:
+                    files.add(fname)
+
+                    self.process_export_file(fname)
+
+    def out_msg(self, fname, name, arg):
+        # TODO: filter out unwanted parts
+
+        return self.out_style.msg(fname, name, arg)
+
+    def msg(self, enable_lineno=False, export=False, internal=False,
+            symbol=None, nosymbol=None):
+
+        function_table = self.config.function_table
+
+        if symbol:
+            for s in symbol:
+                function_table.add(s)
+
+        # Output none mode: only warnings will be shown
+        if not self.out_style:
+            return
+
+        self.out_style.set_config(self.config)
+
+        self.out_style.set_filter(export, internal, symbol, nosymbol,
+                                  function_table, enable_lineno)
+
+        for fname, arg_tuple in self.results:
+            for name, arg in arg_tuple:
+                if self.out_msg(fname, name, arg):
+                    ln = arg.get("ln", 0)
+                    dtype = arg.get('type', "")
+
+                    self.config.log.warning("%s:%d Can't handle %s",
+                                            fname, ln, dtype)
+
+
+class OutputFormat:
+    # output mode.
+    OUTPUT_ALL          = 0 # output all symbols and doc sections
+    OUTPUT_INCLUDE      = 1 # output only specified symbols
+    OUTPUT_EXPORTED     = 2 # output exported symbols
+    OUTPUT_INTERNAL     = 3 # output non-exported symbols
+
+    # Virtual member to be overridden by the inherited classes
+    highlights = []
+
+    def __init__(self):
+        """Declare internal vars and set mode to OUTPUT_ALL"""
+
+        self.out_mode = self.OUTPUT_ALL
+        self.enable_lineno = None
+        self.nosymbol = {}
+        self.symbol = None
+        self.function_table = set()
+        self.config = None
+
+    def set_config(self, config):
+        self.config = config
+
+    def set_filter(self, export, internal, symbol, nosymbol, function_table,
+                   enable_lineno):
+        """
+        Initialize filter variables according to the requested mode.
+
+        Only one choice is valid between export, internal and symbol.
+
+        The nosymbol filter can be used on all modes.
+        """
+
+        self.enable_lineno = enable_lineno
+
+        if symbol:
+            self.out_mode = self.OUTPUT_INCLUDE
+            function_table = symbol
+        elif export:
+            self.out_mode = self.OUTPUT_EXPORTED
+        elif internal:
+            self.out_mode = self.OUTPUT_INTERNAL
+        else:
+            self.out_mode = self.OUTPUT_ALL
+
+        if nosymbol:
+            self.nosymbol = set(nosymbol)
+
+        if function_table:
+            self.function_table = function_table
+
+    def highlight_block(self, block):
+        """
+        Apply the RST highlights to a sub-block of text.
+        """
+
+        for r, sub in self.highlights:
+            block = r.sub(sub, block)
+
+        return block
+
+    def check_doc(self, name):
+        """Check if DOC should be output"""
+
+        if self.out_mode == self.OUTPUT_ALL:
+            return True
+
+        if self.out_mode == self.OUTPUT_INCLUDE:
+            if name in self.nosymbol:
+                return False
+
+            if name in self.function_table:
+                return True
+
+        return False
+
+    def check_declaration(self, dtype, name):
+        if name in self.nosymbol:
+            return False
+
+        if self.out_mode == self.OUTPUT_ALL:
+            return True
+
+        if self.out_mode in [self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED]:
+            if name in self.function_table:
+                return True
+
+        if self.out_mode == self.OUTPUT_INTERNAL:
+            if dtype != "function":
+                return True
+
+            if name not in self.function_table:
+                return True
+
+        return False
+
+    def check_function(self, fname, name, args):
+        return True
+
+    def check_enum(self, fname, name, args):
+        return True
+
+    def check_typedef(self, fname, name, args):
+        return True
+
+    def msg(self, fname, name, args):
+
+        dtype = args.get('type', "")
+
+        if dtype == "doc":
+            self.out_doc(fname, name, args)
+            return False
+
+        if not self.check_declaration(dtype, name):
+            return False
+
+        if dtype == "function":
+            self.out_function(fname, name, args)
+            return False
+
+        if dtype == "enum":
+            self.out_enum(fname, name, args)
+            return False
+
+        if dtype == "typedef":
+            self.out_typedef(fname, name, args)
+            return False
+
+        if dtype in ["struct", "union"]:
+            self.out_struct(fname, name, args)
+            return False
+
+        # Warn if some type requires an output logic
+        self.config.log.warning("doesn't know how to output '%s' block",
+                                dtype)
+
+        return True
+
+    # Virtual methods to be overridden by inherited classes
+    def out_doc(self, fname, name, args):
+        pass
+
+    def out_function(self, fname, name, args):
+        pass
+
+    def out_enum(self, fname, name, args):
+        pass
+
+    def out_typedef(self, fname, name, args):
+        pass
+
+    def out_struct(self, fname, name, args):
+        pass
+
+
+class RestFormat(OutputFormat):
+    """Consts and functions used by ReST output"""
+
+    highlights = [
+        (type_constant, r"``\1``"),
+        (type_constant2, r"``\1``"),
+
+        # Note: need to escape () to avoid func matching later
+        (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"),
+        (type_member, r":c:type:`\1\2\3 <\1>`"),
+        (type_fp_param, r"**\1\\(\\)**"),
+        (type_fp_param2, r"**\1\\(\\)**"),
+        (type_func, r"\1()"),
+        (type_enum, r":c:type:`\1 <\2>`"),
+        (type_struct, r":c:type:`\1 <\2>`"),
+        (type_typedef, r":c:type:`\1 <\2>`"),
+        (type_union, r":c:type:`\1 <\2>`"),
+
+        # in rst this can refer to any type
+        (type_fallback, r":c:type:`\1`"),
+        (type_param_ref, r"**\1\2**")
+    ]
+    blankline = "\n"
+
+    sphinx_literal = Re(r'^[^.].*::$', cache=False)
+    sphinx_cblock = Re(r'^\.\.\ +code-block::', cache=False)
+
+    def __init__(self):
+        """
+        Creates class variables.
+
+        Not really mandatory, but it is a good coding style and makes
+        pylint happy.
+        """
+
+        super().__init__()
+        self.lineprefix = ""
+
+    def print_lineno(self, ln):
+        """Outputs a line number"""
+
+        if self.enable_lineno and ln:
+            print(f".. LINENO {ln}")
+
+    def output_highlight(self, args):
+        input_text = args
+        output = ""
+        in_literal = False
+        litprefix = ""
+        block = ""
+
+        for line in input_text.strip("\n").split("\n"):
+
+            # If we're in a literal block, see if we should drop out of it.
+            # Otherwise, pass the line straight through unmunged.
+            if in_literal:
+                if line.strip():  # If the line is not blank
+                    # If this is the first non-blank line in a literal block,
+                    # figure out the proper indent.
+                    if not litprefix:
+                        r = Re(r'^(\s*)')
+                        if r.match(line):
+                            litprefix = '^' + r.group(1)
+                        else:
+                            litprefix = ""
+
+                        output += line + "\n"
+                    elif not Re(litprefix).match(line):
+                        in_literal = False
+                    else:
+                        output += line + "\n"
+                else:
+                    output += line + "\n"
+
+            # Not in a literal block (or just dropped out)
+            if not in_literal:
+                block += line + "\n"
+                if self.sphinx_literal.match(line) or self.sphinx_cblock.match(line):
+                    in_literal = True
+                    litprefix = ""
+                    output += self.highlight_block(block)
+                    block = ""
+
+        # Handle any remaining block
+        if block:
+            output += self.highlight_block(block)
+
+        # Print the output with the line prefix
+        for line in output.strip("\n").split("\n"):
+            print(self.lineprefix + line)
+
+    def out_section(self, args, out_reference=False):
+        """
+        Outputs a block section.
+
+        This could use some work; it's used to output the DOC: sections, and
+        starts by putting out the name of the doc section itself, but that
+        tends to duplicate a header already in the template file.
+        """
+
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+        section_start_lines = args.get('section_start_lines', {})
+
+        for section in sectionlist:
+            # Skip sections that are in the nosymbol_table
+            if section in self.nosymbol:
+                continue
+
+            if self.out_mode != self.OUTPUT_INCLUDE:
+                if out_reference:
+                    print(f".. _{section}:\n")
+
+                if not self.symbol:
+                    print(f'{self.lineprefix}**{section}**\n')
+
+            self.print_lineno(section_start_lines.get(section, 0))
+            self.output_highlight(sections[section])
+            print()
+        print()
+
+    def out_doc(self, fname, name, args):
+        if not self.check_doc(name):
+            return
+
+        self.out_section(args, out_reference=True)
+
+    def out_function(self, fname, name, args):
+
+        oldprefix = self.lineprefix
+        signature = ""
+
+        func_macro = args.get('func_macro', False)
+        if func_macro:
+            signature = args['function']
+        else:
+            if args.get('functiontype'):
+                signature = args['functiontype'] + " "
+            signature += args['function'] + " ("
+
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
+
+        ln = args.get('ln', 0)
+
+        count = 0
+        for parameter in parameterlist:
+            if count != 0:
+                signature += ", "
+            count += 1
+            dtype = args['parametertypes'].get(parameter, "")
+
+            if function_pointer.search(dtype):
+                signature += function_pointer.group(1) + parameter + function_pointer.group(3)
+            else:
+                signature += dtype
+
+        if not func_macro:
+            signature += ")"
+
+        if args.get('typedef') or not args.get('functiontype'):
+            print(f".. c:macro:: {args['function']}\n")
+
+            if args.get('typedef'):
+                self.print_lineno(ln)
+                print("   **Typedef**: ", end="")
+                self.lineprefix = ""
+                self.output_highlight(args.get('purpose', ""))
+                print("\n\n**Syntax**\n")
+                print(f"  ``{signature}``\n")
+            else:
+                print(f"``{signature}``\n")
+        else:
+            print(f".. c:function:: {signature}\n")
+
+        if not args.get('typedef'):
+            self.print_lineno(ln)
+            self.lineprefix = "   "
+            self.output_highlight(args.get('purpose', ""))
+            print()
+
+        # Put descriptive text into a container (HTML <div>) to help set
+        # function prototypes apart
+        self.lineprefix = "  "
+
+        if parameterlist:
+            print(".. container:: kernelindent\n")
+            print(f"{self.lineprefix}**Parameters**\n")
+
+        for parameter in parameterlist:
+            parameter_name = Re(r'\[.*').sub('', parameter)
+            dtype = args['parametertypes'].get(parameter, "")
+
+            if dtype:
+                print(f"{self.lineprefix}``{dtype}``")
+            else:
+                print(f"{self.lineprefix}``{parameter}``")
+
+            self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
+
+            self.lineprefix = "    "
+            if parameter_name in parameterdescs and \
+               parameterdescs[parameter_name] != KernelDoc.undescribed:
+
+                self.output_highlight(parameterdescs[parameter_name])
+                print()
+            else:
+                print(f"{self.lineprefix}*undescribed*\n")
+            self.lineprefix = "  "
+
+        self.out_section(args)
+        self.lineprefix = oldprefix
+
+    def out_enum(self, fname, name, args):
+
+        oldprefix = self.lineprefix
+        name = args.get('enum', '')
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        ln = args.get('ln', 0)
+
+        print(f"\n\n.. c:enum:: {name}\n")
+
+        self.print_lineno(ln)
+        self.lineprefix = "  "
+        self.output_highlight(args.get('purpose', ''))
+        print()
+
+        print(".. container:: kernelindent\n")
+        outer = self.lineprefix + "  "
+        self.lineprefix = outer + "  "
+        print(f"{outer}**Constants**\n")
+
+        for parameter in parameterlist:
+            print(f"{outer}``{parameter}``")
+
+            if parameterdescs.get(parameter, '') != KernelDoc.undescribed:
+                self.output_highlight(parameterdescs[parameter])
+            else:
+                print(f"{self.lineprefix}*undescribed*\n")
+            print()
+
+        self.lineprefix = oldprefix
+        self.out_section(args)
+
+    def out_typedef(self, fname, name, args):
+
+        oldprefix = self.lineprefix
+        name = args.get('typedef', '')
+        ln = args.get('ln', 0)
+
+        print(f"\n\n.. c:type:: {name}\n")
+
+        self.print_lineno(ln)
+        self.lineprefix = "   "
+
+        self.output_highlight(args.get('purpose', ''))
+
+        print()
+
+        self.lineprefix = oldprefix
+        self.out_section(args)
+
+    def out_struct(self, fname, name, args):
+
+        name = args.get('struct', "")
+        purpose = args.get('purpose', "")
+        declaration = args.get('definition', "")
+        dtype = args.get('type', "struct")
+        ln = args.get('ln', 0)
+
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
+
+        print(f"\n\n.. c:{dtype}:: {name}\n")
+
+        self.print_lineno(ln)
+
+        oldprefix = self.lineprefix
+        self.lineprefix += "  "
+
+        self.output_highlight(purpose)
+        print()
+
+        print(".. container:: kernelindent\n")
+        print(f"{self.lineprefix}**Definition**::\n")
+
+        self.lineprefix = self.lineprefix + "  "
+
+        declaration = declaration.replace("\t", self.lineprefix)
+
+        print(f"{self.lineprefix}{dtype} {name}" + ' {')
+        print(f"{declaration}{self.lineprefix}" + "};\n")
+
+        self.lineprefix = "  "
+        print(f"{self.lineprefix}**Members**\n")
+        for parameter in parameterlist:
+            if not parameter or parameter.startswith("#"):
+                continue
+
+            parameter_name = parameter.split("[", maxsplit=1)[0]
+
+            if parameterdescs.get(parameter_name) == KernelDoc.undescribed:
+                continue
+
+            self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
+
+            print(f"{self.lineprefix}``{parameter}``")
+
+            self.lineprefix = "    "
+            self.output_highlight(parameterdescs[parameter_name])
+            self.lineprefix = "  "
+
+            print()
+
+        print()
+
+        self.lineprefix = oldprefix
+        self.out_section(args)
+
+
+class ManFormat(OutputFormat):
+    """Consts and functions used by man pages output"""
+
+    highlights = (
+        (type_constant, r"\1"),
+        (type_constant2, r"\1"),
+        (type_func, r"\\fB\1\\fP"),
+        (type_enum, r"\\fI\1\\fP"),
+        (type_struct, r"\\fI\1\\fP"),
+        (type_typedef, r"\\fI\1\\fP"),
+        (type_union, r"\\fI\1\\fP"),
+        (type_param, r"\\fI\1\\fP"),
+        (type_param_ref, r"\\fI\1\2\\fP"),
+        (type_member, r"\\fI\1\2\3\\fP"),
+        (type_fallback, r"\\fI\1\\fP")
+    )
+    blankline = ""
+
+    def __init__(self):
+        """
+        Creates class variables.
+
+        Not really mandatory, but it is a good coding style and makes
+        pylint happy.
+        """
+
+        super().__init__()
+
+        dt = datetime.now()
+        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
+            # use UTC TZ
+            to_zone = tz.gettz('UTC')
+            dt = dt.astimezone(to_zone)
+
+        self.man_date = dt.strftime("%B %Y")
+
+    def output_highlight(self, block):
+
+        contents = self.highlight_block(block)
+
+        if isinstance(contents, list):
+            contents = "\n".join(contents)
+
+        for line in contents.strip("\n").split("\n"):
+            line = Re(r"^\s*").sub("", line)
+
+            if line and line[0] == ".":
+                print("\\&" + line)
+            else:
+                print(line)
+
+    def out_doc(self, fname, name, args):
+        module = args.get('module')
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX')
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections.get(section))
+
+    def out_function(self, fname, name, args):
+        """output function in man"""
+
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"{args['function']} \\- {args['purpose']}")
+
+        print(".SH SYNOPSIS")
+        if args.get('functiontype', ''):
+            print(f'.B "{args['functiontype']}" {args['function']}')
+        else:
+            print(f'.B "{args['function']}')
+
+        count = 0
+        parenth = "("
+        post = ","
+
+        for parameter in parameterlist:
+            if count == len(parameterlist) - 1:
+                post = ");"
+
+            dtype = args['parametertypes'].get(parameter, "")
+            if function_pointer.match(dtype):
+                # Pointer-to-function
+                print(f'.BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"')
+            else:
+                dtype = Re(r'([^\*])$').sub(r'\1 ', dtype)
+
+                print(f'.BI "{parenth}{dtype}"  "{post}"')
+            count += 1
+            parenth = ""
+
+        if parameterlist:
+            print(".SH ARGUMENTS")
+
+        for parameter in parameterlist:
+            parameter_name = re.sub(r'\[.*', '', parameter)
+
+            print(f'.IP "{parameter}" 12')
+            self.output_highlight(parameterdescs.get(parameter_name, ""))
+
+        for section in sectionlist:
+            print(f'.SH "{section.upper()}"')
+            self.output_highlight(sections[section])
+
+    def out_enum(self, fname, name, args):
+
+        name = args.get('enum', '')
+        parameterlist = args.get('parameterlist', [])
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_date}" "API Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"enum {args['enum']} \\- {args['purpose']}")
+
+        print(".SH SYNOPSIS")
+        print(f"enum {args['enum']}" + " {")
+
+        count = 0
+        for parameter in parameterlist:
+            print(f'.br\n.BI "    {parameter}"')
+            if count == len(parameterlist) - 1:
+                print("\n};")
+            else:
+                print(", \n.br")
+
+            count += 1
+
+        print(".SH Constants")
+
+        for parameter in parameterlist:
+            parameter_name = Re(r'\[.*').sub('', parameter)
+            print(f'.IP "{parameter}" 12')
+            self.output_highlight(args['parameterdescs'].get(parameter_name, ""))
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections[section])
+
+    def out_typedef(self, fname, name, args):
+        module = args.get('module')
+        typedef = args.get('typedef')
+        purpose = args.get('purpose')
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"typedef {typedef} \\- {purpose}")
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections.get(section))
+
+    def out_struct(self, fname, name, args):
+        module = args.get('module')
+        struct_type = args.get('type')
+        struct_name = args.get('struct')
+        purpose = args.get('purpose')
+        definition = args.get('definition')
+        sectionlist = args.get('sectionlist', [])
+        parameterlist = args.get('parameterlist', [])
+        sections = args.get('sections', {})
+        parameterdescs = args.get('parameterdescs', {})
+
+        print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_date}" "API Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"{struct_type} {struct_name} \\- {purpose}")
+
+        # Replace tabs with two spaces and handle newlines
+        declaration = definition.replace("\t", "  ")
+        declaration = Re(r"\n").sub('"\n.br\n.BI "', declaration)
+
+        print(".SH SYNOPSIS")
+        print(f"{struct_type} {struct_name} " + "{" +"\n.br")
+        print(f'.BI "{declaration}\n' + "};\n.br\n")
+
+        print(".SH Members")
+        for parameter in parameterlist:
+            if parameter.startswith("#"):
+                continue
+
+            parameter_name = re.sub(r"\[.*", "", parameter)
+
+            if parameterdescs.get(parameter_name) == KernelDoc.undescribed:
+                continue
+
+            print(f'.IP "{parameter}" 12')
+            self.output_highlight(parameterdescs.get(parameter_name))
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections.get(section))
+
+
+# Command line interface
+
+
+DESC = """
+Read C language source or header FILEs, extract embedded documentation comments,
+and print formatted documentation to standard output.
+
+The documentation comments are identified by the "/**" opening comment mark.
+
+See Documentation/doc-guide/kernel-doc.rst for the documentation comment syntax.
+"""
+
+EXPORT_FILE_DESC = """
+Specify an additional FILE in which to look for EXPORT_SYMBOL information.
+
+May be used multiple times.
+"""
+
+EXPORT_DESC = """
+Only output documentation for the symbols that have been
+exported using EXPORT_SYMBOL() and related macros in any input
+FILE or -export-file FILE.
+"""
+
+INTERNAL_DESC = """
+Only output documentation for the symbols that have NOT been
+exported using EXPORT_SYMBOL() and related macros in any input
+FILE or -export-file FILE.
+"""
+
+FUNCTION_DESC = """
+Only output documentation for the given function or DOC: section
+title. All other functions and DOC: sections are ignored.
+
+May be used multiple times.
+"""
+
+NOSYMBOL_DESC = """
+Exclude the specified symbol from the output documentation.
+
+May be used multiple times.
+"""
+
+FILES_DESC = """
+Header and C source files to be parsed.
+"""
+
+WARN_CONTENTS_BEFORE_SECTIONS_DESC = """
+Warns if there are contents before sections (deprecated).
+
+This option is kept just for backward-compatibility, but it does nothing,
+neither here nor at the original Perl script.
+"""
+
+
+def main():
+    """Main program"""
+
+    parser = argparse.ArgumentParser(formatter_class=argparse.RawTextHelpFormatter,
+                                     description=DESC)
+
+    # Normal arguments
+
+    parser.add_argument("-v", "-verbose", "--verbose", action="store_true",
+                        help="Verbose output, more warnings and other information.")
+
+    parser.add_argument("-d", "-debug", "--debug", action="store_true",
+                        help="Enable debug messages")
+
+    parser.add_argument("-M", "-modulename", "--modulename",
+                        help="Allow setting a module name at the output.")
+
+    parser.add_argument("-l", "-enable-lineno", "--enable_lineno",
+                        action="store_true",
+                        help="Enable line number output (only in ReST mode)")
+
+    # Arguments to control the warning behavior
+
+    parser.add_argument("-Wreturn", "--wreturn", action="store_true",
+                        help="Warns about the lack of a return markup on functions.")
+
+    parser.add_argument("-Wshort-desc", "-Wshort-description", "--wshort-desc",
+                        action="store_true",
+                        help="Warns if initial short description is missing")
+
+    parser.add_argument("-Wcontents-before-sections",
+                        "--wcontents-before-sections", action="store_true",
+                        help=WARN_CONTENTS_BEFORE_SECTIONS_DESC)
+
+    parser.add_argument("-Wall", "--wall", action="store_true",
+                        help="Enable all types of warnings")
+
+    parser.add_argument("-Werror", "--werror", action="store_true",
+                        help="Treat warnings as errors.")
+
+    parser.add_argument("-export-file", "--export-file", action='append',
+                        help=EXPORT_FILE_DESC)
+
+    # Output format mutually-exclusive group
+
+    out_group = parser.add_argument_group("Output format selection (mutually exclusive)")
+
+    out_fmt = out_group.add_mutually_exclusive_group()
+
+    out_fmt.add_argument("-m", "-man", "--man", action="store_true",
+                         help="Output troff manual page format.")
+    out_fmt.add_argument("-r", "-rst", "--rst", action="store_true",
+                         help="Output reStructuredText format (default).")
+    out_fmt.add_argument("-N", "-none", "--none", action="store_true",
+                         help="Do not output documentation, only warnings.")
+
+    # Output selection mutually-exclusive group
+
+    sel_group = parser.add_argument_group("Output selection (mutually exclusive)")
+    sel_mut = sel_group.add_mutually_exclusive_group()
+
+    sel_mut.add_argument("-e", "-export", "--export", action='store_true',
+                         help=EXPORT_DESC)
+
+    sel_mut.add_argument("-i", "-internal", "--internal", action='store_true',
+                         help=INTERNAL_DESC)
+
+    sel_mut.add_argument("-s", "-function", "--symbol", action='append',
+                         help=FUNCTION_DESC)
+
+    # This one is valid for all 3 types of filter
+    parser.add_argument("-n", "-nosymbol", "--nosymbol", action='append',
+                         help=NOSYMBOL_DESC)
+
+    parser.add_argument("files", metavar="FILE",
+                        nargs="+", help=FILES_DESC)
+
+    args = parser.parse_args()
+
+    if args.wall:
+        args.wreturn = True
+        args.wshort_desc = True
+        args.wcontents_before_sections = True
+
+    if not args.debug:
+        level = logging.INFO
+    else:
+        level = logging.DEBUG
+
+    if args.man:
+        out_style = ManFormat()
+    elif args.none:
+        out_style = None
+    else:
+        out_style = RestFormat()
+
+    logging.basicConfig(level=level, format="%(levelname)s: %(message)s")
+
+    kfiles = KernelFiles(files=args.files, verbose=args.verbose,
+                         out_style=out_style, werror=args.werror,
+                         wreturn=args.wreturn, wshort_desc=args.wshort_desc,
+                         wcontents_before_sections=args.wcontents_before_sections,
+                         modulename=args.modulename,
+                         export_file=args.export_file)
+
+    kfiles.parse()
+
+    kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
+               internal=args.internal, symbol=args.symbol,
+               nosymbol=args.nosymbol)
+
+
+# Call main method
+if __name__ == "__main__":
+    main()
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v3 04/33] scripts/kernel-doc.py: output warnings the same way as kerneldoc
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 03/33] scripts/kernel-doc.py: add a Python parser Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 05/33] scripts/kernel-doc.py: better handle empty sections Mauro Carvalho Chehab
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Add a formatter to logging to produce output in a format similar
to the one used by kernel-doc. This should help make it more
compatible with existing scripts.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
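Illustration only, not part of the patch: a minimal sketch of what the
new formatter does to the level name. The logger name "kdoc-demo", the
sample message and the StringIO stream are assumptions made for this
example; the patch itself attaches the handler to the root logger and
uses the default StreamHandler stream.

```python
import io
import logging


class MsgFormatter(logging.Formatter):
    """Same idea as in the patch: 'WARNING' is printed as 'Warning'."""

    def format(self, record):
        record.levelname = record.levelname.capitalize()
        return logging.Formatter.format(self, record)


def demo():
    """Log two messages through MsgFormatter and return what was written."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(MsgFormatter('%(levelname)s: %(message)s'))

    # Hypothetical, private logger so the example does not touch the root one
    logger = logging.getLogger("kdoc-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.warning("foo.c:10: missing description")
        logger.debug("filtered out, as the level is INFO")
    finally:
        logger.removeHandler(handler)

    return stream.getvalue()
```

With logging.basicConfig() and the stock formatter, the same call would
start with the upper-case "WARNING:" prefix instead.
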
 scripts/kernel-doc.py | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 114f3699bf7c..8625209d6293 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -2715,6 +2715,11 @@ neither here nor at the original Perl script.
 """
 
 
+class MsgFormatter(logging.Formatter):
+    def format(self, record):
+        record.levelname = record.levelname.capitalize()
+        return logging.Formatter.format(self, record)
+
 def main():
     """Main program"""
 
@@ -2799,10 +2804,19 @@ def main():
         args.wshort_desc = True
         args.wcontents_before_sections = True
 
+    logger = logging.getLogger()
+
     if not args.debug:
-        level = logging.INFO
+        logger.setLevel(logging.INFO)
     else:
-        level = logging.DEBUG
+        logger.setLevel(logging.DEBUG)
+
+    formatter = MsgFormatter('%(levelname)s: %(message)s')
+
+    handler = logging.StreamHandler()
+    handler.setFormatter(formatter)
+
+    logger.addHandler(handler)
 
     if args.man:
         out_style = ManFormat()
@@ -2811,8 +2825,6 @@ def main():
     else:
         out_style = RestFormat()
 
-    logging.basicConfig(level=level, format="%(levelname)s: %(message)s")
-
     kfiles = KernelFiles(files=args.files, verbose=args.verbose,
                          out_style=out_style, werror=args.werror,
                          wreturn=args.wreturn, wshort_desc=args.wshort_desc,
-- 
2.49.0


* [PATCH v3 05/33] scripts/kernel-doc.py: better handle empty sections
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (3 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 04/33] scripts/kernel-doc.py: output warnings the same way as kerneldoc Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 06/33] scripts/kernel-doc.py: properly handle struct_group macros Mauro Carvalho Chehab
                   ` (30 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

While doing the conversion, we opted to skip empty sections
(description, return), but this makes it harder to see the differences
between kernel-doc (Perl) and kernel-doc.py.

Also, the logic doesn't always work properly. So, change the
way this is done by adding an extra step to remove such
sections, doing it only for Return and Description.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
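Illustration only, not part of the patch: a minimal sketch of the extra
step that drops empty sections. The helper name drop_empty_sections()
and the sample data are assumptions made for this example; in the patch
the loop is inlined before the entry is appended.

```python
def drop_empty_sections(sections, sectionlist,
                        names=("Description", "Return")):
    """
    Remove the sections listed in 'names' when they contain only
    whitespace. Both containers are modified in place; any other
    section is kept, even if it is empty.
    """
    for section in names:
        if section in sectionlist and not sections[section].rstrip():
            del sections[section]
            sectionlist.remove(section)


def demo():
    """Run the helper on a small hand-written example."""
    sections = {
        "Description": "\n",            # only whitespace: dropped
        "Return": "0 on success\n",     # has content: kept
        "Note": "",                     # empty, but not in 'names': kept
    }
    sectionlist = ["Description", "Return", "Note"]

    drop_empty_sections(sections, sectionlist)

    return sections, sectionlist
```

Restricting the check to Description and Return keeps the output of all
other sections identical to what the Perl script produces.
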
 scripts/kernel-doc.py | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 8625209d6293..90808d538de7 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -317,6 +317,19 @@ class KernelDoc:
         name = self.entry.section
         contents = self.entry.contents
 
+        # TODO: we can prevent dumping empty sections here with:
+        #
+        #    if not self.entry.contents.strip("\n"):
+        #        if start_new:
+        #            self.entry.section = self.section_default
+        #            self.entry.contents = ""
+        #
+        #        return
+        #
+        # But, as we want to produce the same output as the
+        # venerable kernel-doc Perl tool, let's just output everything,
+        # at least for now
+
         if type_param.match(name):
             name = type_param.group(1)
 
@@ -373,6 +386,19 @@ class KernelDoc:
 
         args["type"] = dtype
 
+        # TODO: use collections.OrderedDict
+
+        sections = args.get('sections', {})
+        sectionlist = args.get('sectionlist', [])
+
+        # Drop empty sections
+        # TODO: improve it to emit warnings
+        for section in [ "Description", "Return" ]:
+            if section in sectionlist:
+                if not sections[section].rstrip():
+                    del sections[section]
+                    sectionlist.remove(section)
+
         self.entries.append((name, args))
 
         self.config.log.debug("Output: %s:%s = %s", dtype, name, pformat(args))
@@ -476,7 +502,7 @@ class KernelDoc:
         # to ignore "[blah" in a parameter string.
 
         self.entry.parameterlist.append(param)
-        org_arg = Re(r'\s\s+').sub(' ', org_arg, count=1)
+        org_arg = Re(r'\s\s+').sub(' ', org_arg)
         self.entry.parametertypes[param] = org_arg
 
     def save_struct_actual(self, actual):
@@ -1384,8 +1410,7 @@ class KernelDoc:
             return
 
         if doc_end.search(line):
-            if self.entry.contents.strip("\n"):
-                self.dump_section()
+            self.dump_section()
 
             # Look for doc_com + <text> + doc_end:
             r = Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
-- 
2.49.0


* [PATCH v3 06/33] scripts/kernel-doc.py: properly handle struct_group macros
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (4 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 05/33] scripts/kernel-doc.py: better handle empty sections Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 07/33] scripts/kernel-doc.py: move regex methods to a separate file Mauro Carvalho Chehab
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Handling nested parentheses with regular expressions is not an
easy task. It is even harder with Python's re module, as it
supports only a subset of regular expression syntax and lacks
more advanced features.

We could use the third-party Python regex module instead, but
the resulting regular expressions would still be very hard to
understand. So, instead, add logic to properly match delimiters.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
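Illustration only, not part of the patch: a minimal sketch of the
delimiter-counting idea, compared with the simpler [^\)]+ regex. The
helper names naive_unwrap(), balanced_end() and unwrap(), as well as the
sample line, are assumptions made for this example; the NestedMatch
class in the patch is more generic (generators, multiple matches, count).

```python
import re

PAIRS = {'(': ')', '[': ']', '{': '}'}
RE_DELIM = re.compile(r'[\{\}\[\]\(\)]')

LINE = "struct s { STRUCT_GROUP(int a; DECLARE_BITMAP(b, 8); int c;); };"


def naive_unwrap(line):
    """Simple regex: stops at the first ')', so nested calls break it."""
    return re.sub(r'\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;', r'\1', line)


def balanced_end(line, start):
    """
    line[start] must be an opening delimiter. Return the index just
    after its matching closing delimiter, or -1 if it is never closed.
    Closing delimiters that don't match the top of the stack are ignored.
    """
    stack = [PAIRS[line[start]]]

    for match in RE_DELIM.finditer(line, start + 1):
        d = match.group(0)

        if d in PAIRS:
            stack.append(PAIRS[d])
        elif d == stack[-1]:
            stack.pop()
            if not stack:
                return match.end()

    return -1


def unwrap(prefix, line):
    """Replace the first PREFIX(<body>) with <body>, dropping a ';' after it."""
    match = re.search(prefix + r'\(', line)
    if not match:
        return line

    end = balanced_end(line, match.end() - 1)
    if end < 0:
        return line

    body = line[match.end():end - 1]
    if end < len(line) and line[end] == ';':
        end += 1

    return line[:match.start()] + body + line[end:]
```

On the sample line, naive_unwrap() cuts DECLARE_BITMAP() in the middle,
while unwrap(r'\bSTRUCT_GROUP', LINE) keeps the inner call intact.
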
 scripts/kernel-doc.py | 220 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 213 insertions(+), 7 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 90808d538de7..fb96d42d287c 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -167,6 +167,172 @@ class Re:
     def group(self, num):
         return self.last_match.group(num)
 
+class NestedMatch:
+    """
+    Finding nested delimiters is hard with regular expressions. It is
+    even harder in Python with its standard re module, as several
+    advanced regular expression features are missing.
+
+    This is the case of this pattern:
+
+            '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;'
+
+    which is used to properly match the open/close parentheses of
+    STRUCT_GROUP().
+
+    Add a class that counts pairs of delimiters, using it to match and
+    replace nested expressions.
+
+    The original approach was suggested by:
+        https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
+
+    Although I re-implemented it to make it more generic and match 3 types
+    of delimiters. The logic checks if delimiters are paired. If not, it
+    will ignore the search string.
+    """
+
+    # TODO:
+    # Right now, regular expressions to match it are defined only up to
+    #       the start delimiter, e.g.:
+    #
+    #       \bSTRUCT_GROUP\(
+    #
+    # is similar to: STRUCT_GROUP\((.*)\)
+    # except that the content inside the match group is delimiter-balanced.
+    #
+    # The content inside the parentheses is converted into a single
+    # replace group (e.g. r'\1').
+    #
+    # It would be nice to change this definition to support multiple
+    # match groups, allowing a regex equivalent to:
+    #
+    #   FOO\((.*), (.*), (.*)\)
+    #
+    # It is probably easier to define it not as a regular expression, but
+    # with some lexical definition like:
+    #
+    #   FOO(arg1, arg2, arg3)
+
+
+    DELIMITER_PAIRS = {
+        '{': '}',
+        '(': ')',
+        '[': ']',
+    }
+
+    RE_DELIM = re.compile(r'[\{\}\[\]\(\)]')
+
+    def _search(self, regex, line):
+        """
+        Finds paired blocks for a regex that ends with a delimiter.
+
+        The suggestion of using finditer to match pairs came from:
+        https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
+        but I ended up using a different implementation to handle all three
+        types of delimiters and to search for an initial regular expression.
+
+        The algorithm looks for open/close paired delimiters and places them
+        into a stack, yielding the start/stop positions of each match when
+        the stack becomes empty.
+
+        The algorithm should work fine for properly paired lines, but will
+        silently ignore end delimiters that precede a start delimiter.
+        This should be OK for the kernel-doc parser, as unbalanced delimiters
+        would cause compilation errors. So, we don't need to raise exceptions
+        to cover such issues.
+        """
+
+        stack = []
+
+        for match_re in regex.finditer(line):
+            start = match_re.start()
+            offset = match_re.end()
+
+            d = line[offset - 1]
+            if d not in self.DELIMITER_PAIRS:
+                continue
+
+            end = self.DELIMITER_PAIRS[d]
+            stack.append(end)
+
+            for match in self.RE_DELIM.finditer(line[offset:]):
+                pos = match.start() + offset
+
+                d = line[pos]
+
+                if d in self.DELIMITER_PAIRS:
+                    end = self.DELIMITER_PAIRS[d]
+
+                    stack.append(end)
+                    continue
+
+                # Does the end delimiter match what it is expected?
+                if stack and d == stack[-1]:
+                    stack.pop()
+
+                    if not stack:
+                        yield start, offset, pos + 1
+                        break
+
+    def search(self, regex, line):
+        """
+        This is similar to re.search:
+
+        It matches a regex that is followed by a delimiter,
+        returning occurrences only if all delimiters are paired.
+        """
+
+        for t in self._search(regex, line):
+
+            yield line[t[0]:t[2]]
+
+    def sub(self, regex, sub, line, count=0):
+        """
+        This is similar to re.sub:
+
+        It matches a regex that is followed by a delimiter,
+        replacing occurrences only if all delimiters are paired.
+
+        If r'\1' is used, it works just like re: it places there the
+        matched paired data with the delimiter stripped.
+
+        If count is different than zero, it will replace at most count
+        items.
+        """
+        out = ""
+
+        cur_pos = 0
+        n = 0
+
+        found = False
+        for start, end, pos in self._search(regex, line):
+            out += line[cur_pos:start]
+
+            # Value, ignoring start/end delimiters
+            value = line[end:pos - 1]
+
+            # replaces \1 at the sub string, if \1 is used there
+            new_sub = sub
+            new_sub = new_sub.replace(r'\1', value)
+
+            out += new_sub
+
+            # Drop end ';' if any
+            if pos < len(line) and line[pos] == ';':
+                pos += 1
+
+            cur_pos = pos
+            n += 1
+
+            if count and n >= count:
+                break
+
+        # Append the remaining string
+        l = len(line)
+        out += line[cur_pos:l]
+
+        return out
+
 #
 # Regular expressions used to parse kernel-doc markups at KernelDoc class.
 #
@@ -738,22 +904,49 @@ class KernelDoc:
             (Re(r'\s*____cacheline_aligned_in_smp', re.S),  ' '),
             (Re(r'\s*____cacheline_aligned', re.S),  ' '),
 
-            # Unwrap struct_group() based on this definition:
+            # Unwrap struct_group macros based on this definition:
             # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
             # which has variants like: struct_group(NAME, MEMBERS...)
+            # Only MEMBERS arguments require documentation.
+            #
+            # Parsing them happens in two steps:
+            #
+            # 1. drop struct group arguments that aren't part of MEMBERS,
+            #    storing them as STRUCT_GROUP(MEMBERS)
+            #
+            # 2. remove STRUCT_GROUP() ancillary macro.
+            #
+            # The original logic used to remove STRUCT_GROUP() using an
+            # advanced regex:
+            #
+            #   \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;
+            #
+            # with two patterns that are incompatible with
+            # Python re module, as it has:
+            #
+            #   - a recursive pattern: (?1)
+            #   - an atomic grouping: (?>...)
+            #
+            # I tried a simpler version, but it didn't work either:
+            #   \bSTRUCT_GROUP\(([^\)]+)\)[^;]*;
+            #
+            # as it doesn't properly match the end parenthesis in some cases.
+            #
+            # So, a better solution was crafted: there's now a NestedMatch
+            # class that ensures that delimiters after a search are properly
+            # matched. So, the implementation to drop STRUCT_GROUP() is
+            # handled separately.
 
             (Re(r'\bstruct_group\s*\(([^,]*,)', re.S),  r'STRUCT_GROUP('),
             (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S),  r'STRUCT_GROUP('),
             (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S),  r'struct \1 \2; STRUCT_GROUP('),
             (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S),  r'STRUCT_GROUP('),
 
-            # This is incompatible with Python re, as it uses:
-            #  recursive patterns ((?1)) and atomic grouping ((?>...)):
-            #   '\bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;'
-            # Let's see if this works instead:
-            (Re(r'\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;', re.S),  r'\1'),
-
             # Replace macros
+            #
+            # TODO: it is better to also move those to the NestedMatch logic,
+            # to ensure that parenthesis will be properly matched.
+
             (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),  r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
             (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S),  r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
             (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'unsigned long \1[BITS_TO_LONGS(\2)]'),
@@ -765,9 +958,22 @@ class KernelDoc:
             (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S),  r'__u32 \1'),
         ]
 
+        # Regexes here are guaranteed to have the end delimiter matching
+        # the start delimiter. Yet, right now, only one replace group
+        # is allowed.
+
+        sub_nested_prefixes = [
+            (re.compile(r'\bSTRUCT_GROUP\('),  r'\1'),
+        ]
+
         for search, sub in sub_prefixes:
             members = search.sub(sub, members)
 
+        nested = NestedMatch()
+
+        for search, sub in sub_nested_prefixes:
+            members = nested.sub(search, sub, members)
+
         # Keeps the original declaration as-is
         declaration = members
 
-- 
2.49.0
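
A note on why the simpler regex quoted in the patch comment is not enough: [^\)]+ stops at the first closing parenthesis, so a member that itself contains parentheses (a function pointer, for instance) gets cut short. The sketch below is only an illustration under stated assumptions, not the NestedMatch code from the patch: strip_group is a hypothetical helper name, it handles only parentheses, and the sample members string is made up.

```python
import re

# The simpler pattern quoted in the patch comment.
NAIVE = re.compile(r'\bSTRUCT_GROUP\(([^\)]+)\)[^;]*;', re.S)


def strip_group(text, prefix=r'\bSTRUCT_GROUP\('):
    """Hypothetical helper: unwrap PREFIX(...) by counting parentheses."""
    out = []
    cur = 0
    for m in re.finditer(prefix, text):
        if m.start() < cur:
            continue  # already consumed by an outer match
        depth = 1
        pos = m.end()
        # Walk forward until the parenthesis opened by the prefix closes
        while pos < len(text) and depth:
            if text[pos] == '(':
                depth += 1
            elif text[pos] == ')':
                depth -= 1
            pos += 1
        if depth:
            break  # unbalanced: leave the rest untouched
        out.append(text[cur:m.start()])
        out.append(text[m.end():pos - 1])  # content without delimiters
        # Drop a trailing ';', guarding against end of string
        if pos < len(text) and text[pos] == ';':
            pos += 1
        cur = pos
    out.append(text[cur:])
    return ''.join(out)


# Made-up input: a group whose members include a function pointer
members = 'STRUCT_GROUP(int a; void (*cb)(int); int c;);'

naive_out = NAIVE.sub(r'\1', members)   # truncated at the first ')'
nested_out = strip_group(members)       # all members preserved
```

With this input, the naive pattern loses the ")(int);" part of the function pointer and leaves a stray ");" at the end, while the depth-counting version returns the three members intact.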



* [PATCH v3 07/33] scripts/kernel-doc.py: move regex methods to a separate file
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (5 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 06/33] scripts/kernel-doc.py: properly handle struct_group macros Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 08/33] scripts/kernel-doc.py: move KernelDoc class " Mauro Carvalho Chehab
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

In preparation for letting the kerneldoc Sphinx extension import
Python libraries, move the regex ancillary classes to a separate
file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py       | 223 +----------------------------
 scripts/lib/kdoc/kdoc_re.py | 272 ++++++++++++++++++++++++++++++++++++
 2 files changed, 277 insertions(+), 218 deletions(-)
 create mode 100755 scripts/lib/kdoc/kdoc_re.py

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index fb96d42d287c..7f00c8c86a78 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -110,228 +110,15 @@ from pprint import pformat
 
 from dateutil import tz
 
-# Local cache for regular expressions
-re_cache = {}
+# Import Python modules
 
+LIB_DIR = "lib/kdoc"
+SRC_DIR = os.path.dirname(os.path.realpath(__file__))
 
-class Re:
-    """
-    Helper class to simplify regex declaration and usage,
+sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
 
-    It calls re.compile for a given pattern. It also allows adding
-    regular expressions and define sub at class init time.
+from kdoc_re import Re, NestedMatch
 
-    Regular expressions can be cached via an argument, helping to speedup
-    searches.
-    """
-
-    def _add_regex(self, string, flags):
-        if string in re_cache:
-            self.regex = re_cache[string]
-        else:
-            self.regex = re.compile(string, flags=flags)
-
-            if self.cache:
-                re_cache[string] = self.regex
-
-    def __init__(self, string, cache=True, flags=0):
-        self.cache = cache
-        self.last_match = None
-
-        self._add_regex(string, flags)
-
-    def __str__(self):
-        return self.regex.pattern
-
-    def __add__(self, other):
-        return Re(str(self) + str(other), cache=self.cache or other.cache,
-                  flags=self.regex.flags | other.regex.flags)
-
-    def match(self, string):
-        self.last_match = self.regex.match(string)
-        return self.last_match
-
-    def search(self, string):
-        self.last_match = self.regex.search(string)
-        return self.last_match
-
-    def findall(self, string):
-        return self.regex.findall(string)
-
-    def split(self, string):
-        return self.regex.split(string)
-
-    def sub(self, sub, string, count=0):
-        return self.regex.sub(sub, string, count=count)
-
-    def group(self, num):
-        return self.last_match.group(num)
-
-class NestedMatch:
-    """
-    Finding nested delimiters is hard with regular expressions. It is
-    even harder on Python with its normal re module, as there are several
-    advanced regular expressions that are missing.
-
-    This is the case of this pattern:
-
-            '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;'
-
-    which is used to properly match open/close parenthesis of the
-    string search STRUCT_GROUP(),
-
-    Add a class that counts pairs of delimiters, using it to match and
-    replace nested expressions.
-
-    The original approach was suggested by:
-        https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
-
-    Although I re-implemented it to make it more generic and match 3 types
-    of delimiters. The logic checks if delimiters are paired. If not, it
-    will ignore the search string.
-    """
-
-    # TODO:
-    # Right now, regular expressions to match it are defined only up to
-    #       the start delimiter, e.g.:
-    #
-    #       \bSTRUCT_GROUP\(
-    #
-    # is similar to: STRUCT_GROUP\((.*)\)
-    # except that the content inside the match group is delimiter's aligned.
-    #
-    # The content inside parenthesis are converted into a single replace
-    # group (e.g. r`\1').
-    #
-    # It would be nice to change such definition to support multiple
-    # match groups, allowing a regex equivalent to.
-    #
-    #   FOO\((.*), (.*), (.*)\)
-    #
-    # it is probably easier to define it not as a regular expression, but
-    # with some lexical definition like:
-    #
-    #   FOO(arg1, arg2, arg3)
-
-
-    DELIMITER_PAIRS = {
-        '{': '}',
-        '(': ')',
-        '[': ']',
-    }
-
-    RE_DELIM = re.compile(r'[\{\}\[\]\(\)]')
-
-    def _search(self, regex, line):
-        """
-        Finds paired blocks for a regex that ends with a delimiter.
-
-        The suggestion of using finditer to match pairs came from:
-        https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
-        but I ended using a different implementation to align all three types
-        of delimiters and seek for an initial regular expression.
-
-        The algorithm seeks for open/close paired delimiters and place them
-        into a stack, yielding a start/stop position of each match  when the
-        stack is zeroed.
-
-        The algorithm shoud work fine for properly paired lines, but will
-        silently ignore end delimiters that preceeds an start delimiter.
-        This should be OK for kernel-doc parser, as unaligned delimiters
-        would cause compilation errors. So, we don't need to rise exceptions
-        to cover such issues.
-        """
-
-        stack = []
-
-        for match_re in regex.finditer(line):
-            start = match_re.start()
-            offset = match_re.end()
-
-            d = line[offset -1]
-            if d not in self.DELIMITER_PAIRS:
-                continue
-
-            end = self.DELIMITER_PAIRS[d]
-            stack.append(end)
-
-            for match in self.RE_DELIM.finditer(line[offset:]):
-                pos = match.start() + offset
-
-                d = line[pos]
-
-                if d in self.DELIMITER_PAIRS:
-                    end = self.DELIMITER_PAIRS[d]
-
-                    stack.append(end)
-                    continue
-
-                # Does the end delimiter match what it is expected?
-                if stack and d == stack[-1]:
-                    stack.pop()
-
-                    if not stack:
-                        yield start, offset, pos + 1
-                        break
-
-    def search(self, regex, line):
-        """
-        This is similar to re.search:
-
-        It matches a regex that it is followed by a delimiter,
-        returning occurrences only if all delimiters are paired.
-        """
-
-        for t in self._search(regex, line):
-
-            yield line[t[0]:t[2]]
-
-    def sub(self, regex, sub, line, count=0):
-        """
-        This is similar to re.sub:
-
-        It matches a regex that it is followed by a delimiter,
-        replacing occurrences only if all delimiters are paired.
-
-        if r'\1' is used, it works just like re: it places there the
-        matched paired data with the delimiter stripped.
-
-        If count is different than zero, it will replace at most count
-        items.
-        """
-        out = ""
-
-        cur_pos = 0
-        n = 0
-
-        found = False
-        for start, end, pos in self._search(regex, line):
-            out += line[cur_pos:start]
-
-            # Value, ignoring start/end delimiters
-            value = line[end:pos - 1]
-
-            # replaces \1 at the sub string, if \1 is used there
-            new_sub = sub
-            new_sub = new_sub.replace(r'\1', value)
-
-            out += new_sub
-
-            # Drop end ';' if any
-            if line[pos] == ';':
-                pos += 1
-
-            cur_pos = pos
-            n += 1
-
-            if count and count >= n:
-                break
-
-        # Append the remaining string
-        l = len(line)
-        out += line[cur_pos:l]
-
-        return out
 
 #
 # Regular expressions used to parse kernel-doc markups at KernelDoc class.
diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
new file mode 100755
index 000000000000..512b6521e79d
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_re.py
@@ -0,0 +1,272 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+
+"""
+Regular expression ancillary classes.
+
+These help cache regular expressions and perform matching for kernel-doc.
+"""
+
+import re
+
+# Local cache for regular expressions
+re_cache = {}
+
+
+class Re:
+    """
+    Helper class to simplify regex declaration and usage.
+
+    It calls re.compile for a given pattern. It also allows adding
+    (concatenating) regular expressions with the + operator.
+
+    Regular expressions can be cached via an argument, helping to speed up
+    searches.
+    """
+
+    def _add_regex(self, string, flags):
+        """
+        Adds a new regex or reuses it from the cache.
+        """
+
+        if string in re_cache:
+            self.regex = re_cache[string]
+        else:
+            self.regex = re.compile(string, flags=flags)
+
+            if self.cache:
+                re_cache[string] = self.regex
+
+    def __init__(self, string, cache=True, flags=0):
+        """
+        Compile a regular expression and initialize internal vars.
+        """
+
+        self.cache = cache
+        self.last_match = None
+
+        self._add_regex(string, flags)
+
+    def __str__(self):
+        """
+        Return the regular expression pattern.
+        """
+        return self.regex.pattern
+
+    def __add__(self, other):
+        """
+        Allows adding two regular expressions into one.
+        """
+
+        return Re(str(self) + str(other), cache=self.cache or other.cache,
+                  flags=self.regex.flags | other.regex.flags)
+
+    def match(self, string):
+        """
+        Handles a re.match storing its results
+        """
+
+        self.last_match = self.regex.match(string)
+        return self.last_match
+
+    def search(self, string):
+        """
+        Handles a re.search storing its results
+        """
+
+        self.last_match = self.regex.search(string)
+        return self.last_match
+
+    def findall(self, string):
+        """
+        Alias to re.findall
+        """
+
+        return self.regex.findall(string)
+
+    def split(self, string):
+        """
+        Alias to re.split
+        """
+
+        return self.regex.split(string)
+
+    def sub(self, sub, string, count=0):
+        """
+        Alias to re.sub
+        """
+
+        return self.regex.sub(sub, string, count=count)
+
+    def group(self, num):
+        """
+        Returns the group results of the last match
+        """
+
+        return self.last_match.group(num)
+
+
+class NestedMatch:
+    """
+    Finding nested delimiters is hard with regular expressions. It is
+    even harder in Python with its standard re module, as several
+    advanced regular expression features are missing.
+
+    This is the case of this pattern:
+
+            '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;'
+
+    which is used to properly match the open/close parentheses of the
+    search string STRUCT_GROUP().
+
+    This class counts pairs of delimiters, using them to match and
+    replace nested expressions.
+
+    The original approach was suggested by:
+        https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
+
+    However, I re-implemented it to make it more generic and to match
+    3 types of delimiters. The logic checks if delimiters are paired. If
+    not, it will ignore the search string.
+    """
+
+    # TODO:
+    # Right now, regular expressions to match it are defined only up to
+    # the start delimiter, e.g.:
+    #
+    #       \bSTRUCT_GROUP\(
+    #
+    # is similar to: STRUCT_GROUP\((.*)\)
+    # except that the content inside the match group is delimiter-aligned.
+    #
+    # The content inside the parentheses is converted into a single
+    # replace group (e.g. r'\1').
+    #
+    # It would be nice to change this definition to support multiple
+    # match groups, allowing a regex equivalent to:
+    #
+    #   FOO\((.*), (.*), (.*)\)
+    #
+    # It is probably easier to define it not as a regular expression, but
+    # with some lexical definition like:
+    #
+    #   FOO(arg1, arg2, arg3)
+
+    DELIMITER_PAIRS = {
+        '{': '}',
+        '(': ')',
+        '[': ']',
+    }
+
+    RE_DELIM = re.compile(r'[\{\}\[\]\(\)]')
+
+    def _search(self, regex, line):
+        """
+        Finds paired blocks for a regex that ends with a delimiter.
+
+        The suggestion of using finditer to match pairs came from:
+        https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
+        but I ended up using a different implementation to align all three
+        types of delimiters and to seek an initial regular expression.
+
+        The algorithm looks for open/close paired delimiters and places them
+        on a stack, yielding the start/stop positions of each match when the
+        stack is empty.
+
+        The algorithm should work fine for properly paired lines, but will
+        silently ignore end delimiters that precede a start delimiter.
+        This should be OK for the kernel-doc parser, as unaligned delimiters
+        would cause compilation errors. So, we don't need to raise exceptions
+        to cover such issues.
+        """
+
+        stack = []
+
+        for match_re in regex.finditer(line):
+            start = match_re.start()
+            offset = match_re.end()
+
+            d = line[offset - 1]
+            if d not in self.DELIMITER_PAIRS:
+                continue
+
+            end = self.DELIMITER_PAIRS[d]
+            stack.append(end)
+
+            for match in self.RE_DELIM.finditer(line[offset:]):
+                pos = match.start() + offset
+
+                d = line[pos]
+
+                if d in self.DELIMITER_PAIRS:
+                    end = self.DELIMITER_PAIRS[d]
+
+                    stack.append(end)
+                    continue
+
+                # Does the end delimiter match what is expected?
+                if stack and d == stack[-1]:
+                    stack.pop()
+
+                    if not stack:
+                        yield start, offset, pos + 1
+                        break
+
+    def search(self, regex, line):
+        """
+        This is similar to re.search:
+
+        It matches a regex that is followed by a delimiter,
+        returning occurrences only if all delimiters are paired.
+        """
+
+        for t in self._search(regex, line):
+
+            yield line[t[0]:t[2]]
+
+    def sub(self, regex, sub, line, count=0):
+        """
+        This is similar to re.sub:
+
+        It matches a regex that is followed by a delimiter,
+        replacing occurrences only if all delimiters are paired.
+
+        If r'\1' is used, it works just like re: it is replaced by the
+        matched paired data with the delimiters stripped.
+
+        If count is non-zero, it will replace at most count
+        items.
+        """
+        out = ""
+
+        cur_pos = 0
+        n = 0
+
+        for start, end, pos in self._search(regex, line):
+            out += line[cur_pos:start]
+
+            # Value, ignoring start/end delimiters
+            value = line[end:pos - 1]
+
+            # replaces \1 at the sub string, if \1 is used there
+            new_sub = sub
+            new_sub = new_sub.replace(r'\1', value)
+
+            out += new_sub
+
+            # Drop end ';' if any
+            if pos < len(line) and line[pos] == ';':
+                pos += 1
+
+            cur_pos = pos
+            n += 1
+
+            if count and n >= count:
+                break
+
+        # Append the remaining string
+        l = len(line)
+        out += line[cur_pos:l]
+
+        return out
-- 
2.49.0
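
For readers following the series, here is a small sketch of how the Re helper tends to be used by the parser: patterns are concatenated with "+", and the last match is kept on the object so that an "if x.match(...)" test can be followed by "x.group(1)". To keep it runnable without the kernel tree, it uses MiniRe, a trimmed, hypothetical stand-in with no caching; it is not the class from kdoc_re.py, and the sample comment line is made up.

```python
import re


class MiniRe:
    """Trimmed, hypothetical stand-in for the Re helper (no cache)."""

    def __init__(self, pattern, flags=0):
        self.regex = re.compile(pattern, flags)
        self.last_match = None

    def __str__(self):
        return self.regex.pattern

    def __add__(self, other):
        # Concatenate both patterns and merge their flags
        return MiniRe(str(self) + str(other),
                      flags=self.regex.flags | other.regex.flags)

    def match(self, string):
        # Remember the result so group() can be called afterwards
        self.last_match = self.regex.match(string)
        return self.last_match

    def group(self, num):
        return self.last_match.group(num)


# Same composition style as doc_com/doc_decl in the patch
doc_com = MiniRe(r'\s*\*\s*')
doc_decl = doc_com + MiniRe(r'(\w+)')

name = None
if doc_decl.match(' * my_function - does something'):
    name = doc_decl.group(1)
```

Keeping the match on the object mirrors the Perl idiom of testing a regex and then reading $1, which is presumably why the helper was shaped this way; note that group() is only safe to call after a successful match.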



* [PATCH v3 08/33] scripts/kernel-doc.py: move KernelDoc class to a separate file
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (6 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 07/33] scripts/kernel-doc.py: move regex methods to a separate file Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 09/33] scripts/kernel-doc.py: move KernelFiles " Mauro Carvalho Chehab
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Gustavo A. R. Silva, Kees Cook,
	Sean Anderson, linux-hardening, linux-kernel

In preparation for letting the kerneldoc Sphinx extension import
Python libraries, move the KernelDoc class to a separate
file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py           | 1634 +-----------------------------
 scripts/lib/kdoc/kdoc_parser.py | 1690 +++++++++++++++++++++++++++++++
 2 files changed, 1692 insertions(+), 1632 deletions(-)
 create mode 100755 scripts/lib/kdoc/kdoc_parser.py

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 7f00c8c86a78..f030a36a165b 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -117,53 +117,15 @@ SRC_DIR = os.path.dirname(os.path.realpath(__file__))
 
 sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
 
-from kdoc_re import Re, NestedMatch
+from kdoc_parser import KernelDoc, type_param
+from kdoc_re import Re
 
-
-#
-# Regular expressions used to parse kernel-doc markups at KernelDoc class.
-#
-# Let's declare them in lowercase outside any class to make easier to
-# convert from the python script.
-#
-# As those are evaluated at the beginning, no need to cache them
-#
-
-
-# Allow whitespace at end of comment start.
-doc_start = Re(r'^/\*\*\s*$', cache=False)
-
-doc_end = Re(r'\*/', cache=False)
-doc_com = Re(r'\s*\*\s*', cache=False)
-doc_com_body = Re(r'\s*\* ?', cache=False)
-doc_decl = doc_com + Re(r'(\w+)', cache=False)
-
-# @params and a strictly limited set of supported section names
-# Specifically:
-#   Match @word:
-#         @...:
-#         @{section-name}:
-# while trying to not match literal block starts like "example::"
-#
-doc_sect = doc_com + \
-            Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:([^:].*)?$',
-                flags=re.I, cache=False)
-
-doc_content = doc_com_body + Re(r'(.*)', cache=False)
-doc_block = doc_com + Re(r'DOC:\s*(.*)?', cache=False)
-doc_inline_start = Re(r'^\s*/\*\*\s*$', cache=False)
-doc_inline_sect = Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
-doc_inline_end = Re(r'^\s*\*/\s*$', cache=False)
-doc_inline_oneline = Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
 function_pointer = Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
-attribute = Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
-               flags=re.I | re.S, cache=False)
 
 # match expressions used to find embedded type information
 type_constant = Re(r"\b``([^\`]+)``\b", cache=False)
 type_constant2 = Re(r"\%([-_*\w]+)", cache=False)
 type_func = Re(r"(\w+)\(\)", cache=False)
-type_param = Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
 type_param_ref = Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
 
 # Special RST handling for func ptr params
@@ -181,1598 +143,6 @@ type_member = Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
 type_fallback = Re(r"\&([_\w]+)", cache=False)
 type_member_func = type_member + Re(r"\(\)", cache=False)
 
-export_symbol = Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
-export_symbol_ns = Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
-
-class KernelDoc:
-    # Parser states
-    STATE_NORMAL        = 0        # normal code
-    STATE_NAME          = 1        # looking for function name
-    STATE_BODY_MAYBE    = 2        # body - or maybe more description
-    STATE_BODY          = 3        # the body of the comment
-    STATE_BODY_WITH_BLANK_LINE = 4 # the body which has a blank line
-    STATE_PROTO         = 5        # scanning prototype
-    STATE_DOCBLOCK      = 6        # documentation block
-    STATE_INLINE        = 7        # gathering doc outside main block
-
-    st_name = [
-        "NORMAL",
-        "NAME",
-        "BODY_MAYBE",
-        "BODY",
-        "BODY_WITH_BLANK_LINE",
-        "PROTO",
-        "DOCBLOCK",
-        "INLINE",
-    ]
-
-    # Inline documentation state
-    STATE_INLINE_NA     = 0 # not applicable ($state != STATE_INLINE)
-    STATE_INLINE_NAME   = 1 # looking for member name (@foo:)
-    STATE_INLINE_TEXT   = 2 # looking for member documentation
-    STATE_INLINE_END    = 3 # done
-    STATE_INLINE_ERROR  = 4 # error - Comment without header was found.
-                            # Spit a warning as it's not
-                            # proper kernel-doc and ignore the rest.
-
-    st_inline_name = [
-        "",
-        "_NAME",
-        "_TEXT",
-        "_END",
-        "_ERROR",
-    ]
-
-    # Section names
-
-    section_default = "Description"  # default section
-    section_intro = "Introduction"
-    section_context = "Context"
-    section_return = "Return"
-
-    undescribed = "-- undescribed --"
-
-    def __init__(self, config, fname):
-        """Initialize internal variables"""
-
-        self.fname = fname
-        self.config = config
-
-        # Initial state for the state machines
-        self.state = self.STATE_NORMAL
-        self.inline_doc_state = self.STATE_INLINE_NA
-
-        # Store entry currently being processed
-        self.entry = None
-
-        # Place all potential outputs into an array
-        self.entries = []
-
-    def show_warnings(self, dtype, declaration_name):
-        # TODO: implement it
-
-        return True
-
-    # TODO: rename to emit_message
-    def emit_warning(self, ln, msg, warning=True):
-        """Emit a message"""
-
-        if warning:
-            self.config.log.warning("%s:%d %s", self.fname, ln, msg)
-        else:
-            self.config.log.info("%s:%d %s", self.fname, ln, msg)
-
-    def dump_section(self, start_new=True):
-        """
-        Dumps section contents to arrays/hashes intended for that purpose.
-        """
-
-        name = self.entry.section
-        contents = self.entry.contents
-
-        # TODO: we can prevent dumping empty sections here with:
-        #
-        #    if self.entry.contents.strip("\n"):
-        #       if start_new:
-        #           self.entry.section = self.section_default
-        #           self.entry.contents = ""
-        #
-        #        return
-        #
-        # But, as we want to be producing the same output of the
-        # venerable kernel-doc Perl tool, let's just output everything,
-        # at least for now
-
-        if type_param.match(name):
-            name = type_param.group(1)
-
-            self.entry.parameterdescs[name] = contents
-            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
-
-            self.entry.sectcheck += name + " "
-            self.entry.new_start_line = 0
-
-        elif name == "@...":
-            name = "..."
-            self.entry.parameterdescs[name] = contents
-            self.entry.sectcheck += name + " "
-            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
-            self.entry.new_start_line = 0
-
-        else:
-            if name in self.entry.sections and self.entry.sections[name] != "":
-                # Only warn on user-specified duplicate section names
-                if name != self.section_default:
-                    self.emit_warning(self.entry.new_start_line,
-                                      f"duplicate section name '{name}'\n")
-                self.entry.sections[name] += contents
-            else:
-                self.entry.sections[name] = contents
-                self.entry.sectionlist.append(name)
-                self.entry.section_start_lines[name] = self.entry.new_start_line
-                self.entry.new_start_line = 0
-
-#        self.config.log.debug("Section: %s : %s", name, pformat(vars(self.entry)))
-
-        if start_new:
-            self.entry.section = self.section_default
-            self.entry.contents = ""
-
-    # TODO: rename it to store_declaration
-    def output_declaration(self, dtype, name, **args):
-        """
-        Stores the entry into an entry array.
-
-        The actual output and output filters will be handled elsewhere
-        """
-
-        # The implementation here is different than the original kernel-doc:
-        # instead of checking for output filters or actually output anything,
-        # it just stores the declaration content at self.entries, as the
-        # output will happen on a separate class.
-        #
-        # For now, we're keeping the same name of the function just to make
-        # easier to compare the source code of both scripts
-
-        if "declaration_start_line" not in args:
-            args["declaration_start_line"] = self.entry.declaration_start_line
-
-        args["type"] = dtype
-
-        # TODO: use colletions.OrderedDict
-
-        sections = args.get('sections', {})
-        sectionlist = args.get('sectionlist', [])
-
-        # Drop empty sections
-        # TODO: improve it to emit warnings
-        for section in [ "Description", "Return" ]:
-            if section in sectionlist:
-                if not sections[section].rstrip():
-                    del sections[section]
-                    sectionlist.remove(section)
-
-        self.entries.append((name, args))
-
-        self.config.log.debug("Output: %s:%s = %s", dtype, name, pformat(args))
-
-    def reset_state(self, ln):
-        """
-        Ancillary routine to create a new entry. It initializes all
-        variables used by the state machine.
-        """
-
-        self.entry = argparse.Namespace
-
-        self.entry.contents = ""
-        self.entry.function = ""
-        self.entry.sectcheck = ""
-        self.entry.struct_actual = ""
-        self.entry.prototype = ""
-
-        self.entry.parameterlist = []
-        self.entry.parameterdescs = {}
-        self.entry.parametertypes = {}
-        self.entry.parameterdesc_start_lines = {}
-
-        self.entry.section_start_lines = {}
-        self.entry.sectionlist = []
-        self.entry.sections = {}
-
-        self.entry.anon_struct_union = False
-
-        self.entry.leading_space = None
-
-        # State flags
-        self.state = self.STATE_NORMAL
-        self.inline_doc_state = self.STATE_INLINE_NA
-        self.entry.brcount = 0
-
-        self.entry.in_doc_sect = False
-        self.entry.declaration_start_line = ln
-
-    def push_parameter(self, ln, decl_type, param, dtype,
-                       org_arg, declaration_name):
-        if self.entry.anon_struct_union and dtype == "" and param == "}":
-            return  # Ignore the ending }; from anonymous struct/union
-
-        self.entry.anon_struct_union = False
-
-        param = Re(r'[\[\)].*').sub('', param, count=1)
-
-        if dtype == "" and param.endswith("..."):
-            if Re(r'\w\.\.\.$').search(param):
-                # For named variable parameters of the form `x...`,
-                # remove the dots
-                param = param[:-3]
-            else:
-                # Handles unnamed variable parameters
-                param = "..."
-
-            if param not in self.entry.parameterdescs or \
-                not self.entry.parameterdescs[param]:
-
-                self.entry.parameterdescs[param] = "variable arguments"
-
-        elif dtype == "" and (not param or param == "void"):
-            param = "void"
-            self.entry.parameterdescs[param] = "no arguments"
-
-        elif dtype == "" and param in ["struct", "union"]:
-            # Handle unnamed (anonymous) union or struct
-            dtype = param
-            param = "{unnamed_" + param + "}"
-            self.entry.parameterdescs[param] = "anonymous\n"
-            self.entry.anon_struct_union = True
-
-        # Handle cache group enforcing variables: they do not need
-        # to be described in header files
-        elif "__cacheline_group" in param:
-            # Ignore __cacheline_group_begin and __cacheline_group_end
-            return
-
-        # Warn if parameter has no description
-        # (but ignore ones starting with # as these are not parameters
-        # but inline preprocessor statements)
-        if param not in self.entry.parameterdescs and not param.startswith("#"):
-            self.entry.parameterdescs[param] = self.undescribed
-
-            if self.show_warnings(dtype, declaration_name) and "." not in param:
-                if decl_type == 'function':
-                    dname = f"{decl_type} parameter"
-                else:
-                    dname = f"{decl_type} member"
-
-                self.emit_warning(ln,
-                                  f"{dname} '{param}' not described in '{declaration_name}'")
-
-        # Strip spaces from param so that it is one continuous string on
-        # parameterlist. This fixes a problem where check_sections()
-        # cannot find a parameter like "addr[6 + 2]" because it actually
-        # appears as "addr[6", "+", "2]" on the parameter list.
-        # However, it's better to maintain the param string unchanged for
-        # output, so just weaken the string compare in check_sections()
-        # to ignore "[blah" in a parameter string.
-
-        self.entry.parameterlist.append(param)
-        org_arg = Re(r'\s\s+').sub(' ', org_arg)
-        self.entry.parametertypes[param] = org_arg
-
-    def save_struct_actual(self, actual):
-        """
-        Strip all spaces from the actual param so that it looks like
-        one string item.
-        """
-
-        actual = Re(r'\s*').sub("", actual, count=1)
-
-        self.entry.struct_actual += actual + " "
-
-    def create_parameter_list(self, ln, decl_type, args, splitter, declaration_name):
-
-        # temporarily replace all commas inside function pointer definition
-        arg_expr = Re(r'(\([^\),]+),')
-        while arg_expr.search(args):
-            args = arg_expr.sub(r"\1#", args)
-
-        for arg in args.split(splitter):
-            # Strip comments
-            arg = Re(r'\/\*.*\*\/').sub('', arg)
-
-            # Ignore argument attributes
-            arg = Re(r'\sPOS0?\s').sub(' ', arg)
-
-            # Strip leading/trailing spaces
-            arg = arg.strip()
-            arg = Re(r'\s+').sub(' ', arg, count=1)
-
-            if arg.startswith('#'):
-                # Treat preprocessor directive as a typeless variable just to fill
-                # corresponding data structures "correctly". Catch it later in
-                # output_* subs.
-                self.push_parameter(ln, decl_type, arg, "",
-                                    "", declaration_name)
-
-            elif Re(r'\(.+\)\s*\(').search(arg):
-                # Pointer-to-function
-
-                arg = arg.replace('#', ',')
-
-                r = Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
-                if r.match(arg):
-                    param = r.group(1)
-                else:
-                    self.emit_warning(ln, f"Invalid param: {arg}")
-                    param = arg
-
-                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
-                self.save_struct_actual(param)
-                self.push_parameter(ln, decl_type, param, dtype,
-                                    arg, declaration_name)
-
-            elif Re(r'\(.+\)\s*\[').search(arg):
-                # Array-of-pointers
-
-                arg = arg.replace('#', ',')
-                r = Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
-                if r.match(arg):
-                    param = r.group(1)
-                else:
-                    self.emit_warning(ln, f"Invalid param: {arg}")
-                    param = arg
-
-                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
-
-                self.save_struct_actual(param)
-                self.push_parameter(ln, decl_type, param, dtype,
-                                    arg, declaration_name)
-
-            elif arg:
-                arg = Re(r'\s*:\s*').sub(":", arg)
-                arg = Re(r'\s*\[').sub('[', arg)
-
-                args = Re(r'\s*,\s*').split(arg)
-                if args[0] and '*' in args[0]:
-                    args[0] = re.sub(r'(\*+)\s*', r' \1', args[0])
-
-                first_arg = []
-                r = Re(r'^(.*\s+)(.*?\[.*\].*)$')
-                if args[0] and r.match(args[0]):
-                    args.pop(0)
-                    first_arg.extend(r.group(1))
-                    first_arg.append(r.group(2))
-                else:
-                    first_arg = Re(r'\s+').split(args.pop(0))
-
-                args.insert(0, first_arg.pop())
-                dtype = ' '.join(first_arg)
-
-                for param in args:
-                    if Re(r'^(\*+)\s*(.*)').match(param):
-                        r = Re(r'^(\*+)\s*(.*)')
-                        if not r.match(param):
-                            self.emit_warning(ln, f"Invalid param: {param}")
-                            continue
-
-                        param = r.group(1)
-
-                        self.save_struct_actual(r.group(2))
-                        self.push_parameter(ln, decl_type, r.group(2),
-                                            f"{dtype} {r.group(1)}",
-                                            arg, declaration_name)
-
-                    elif Re(r'(.*?):(\w+)').search(param):
-                        r = Re(r'(.*?):(\w+)')
-                        if not r.match(param):
-                            self.emit_warning(ln, f"Invalid param: {param}")
-                            continue
-
-                        if dtype != "":  # Skip unnamed bit-fields
-                            self.save_struct_actual(r.group(1))
-                            self.push_parameter(ln, decl_type, r.group(1),
-                                                f"{dtype}:{r.group(2)}",
-                                                arg, declaration_name)
-                    else:
-                        self.save_struct_actual(param)
-                        self.push_parameter(ln, decl_type, param, dtype,
-                                            arg, declaration_name)
-
-    def check_sections(self, ln, decl_name, decl_type, sectcheck, prmscheck):
-        sects = sectcheck.split()
-        prms = prmscheck.split()
-        err = False
-
-        for sx in range(len(sects)):                  # pylint: disable=C0200
-            err = True
-            for px in range(len(prms)):               # pylint: disable=C0200
-                prm_clean = prms[px]
-                prm_clean = Re(r'\[.*\]').sub('', prm_clean)
-                prm_clean = attribute.sub('', prm_clean)
-
-                # ignore array size in a parameter string;
-                # however, the original param string may contain
-                # spaces, e.g.:  addr[6 + 2]
-                # and this appears in @prms as "addr[6" since the
-                # parameter list is split at spaces;
-                # hence just ignore "[..." for the sections check;
-                prm_clean = Re(r'\[.*').sub('', prm_clean)
-
-                if prm_clean == sects[sx]:
-                    err = False
-                    break
-
-            if err:
-                if decl_type == 'function':
-                    dname = f"{decl_type} parameter"
-                else:
-                    dname = f"{decl_type} member"
-
-                self.emit_warning(ln,
-                                  f"Excess {dname} '{sects[sx]}' description in '{decl_name}'")
-
-    def check_return_section(self, ln, declaration_name, return_type):
-
-        if not self.config.wreturn:
-            return
-
-        # Ignore an empty return type (It's a macro)
-        # Ignore functions with a "void" return type (but not "void *")
-        if not return_type or Re(r'void\s*\w*\s*$').search(return_type):
-            return
-
-        if not self.entry.sections.get("Return", None):
-            self.emit_warning(ln,
-                              f"No description found for return value of '{declaration_name}'")
-
-    def dump_struct(self, ln, proto):
-        """
-        Store an entry for a struct or union
-        """
-
-        type_pattern = r'(struct|union)'
-
-        qualifiers = [
-            "__attribute__",
-            "__packed",
-            "__aligned",
-            "____cacheline_aligned_in_smp",
-            "____cacheline_aligned",
-        ]
-
-        definition_body = r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) + ")?"
-        struct_members = Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
-
-        # Extract struct/union definition
-        members = None
-        declaration_name = None
-        decl_type = None
-
-        r = Re(type_pattern + r'\s+(\w+)\s*' + definition_body)
-        if r.search(proto):
-            decl_type = r.group(1)
-            declaration_name = r.group(2)
-            members = r.group(3)
-        else:
-            r = Re(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
-
-            if r.search(proto):
-                decl_type = r.group(1)
-                declaration_name = r.group(3)
-                members = r.group(2)
-
-        if not members:
-            self.emit_warning(ln, f"{proto} error: Cannot parse struct or union!")
-            self.config.errors += 1
-            return
-
-        if self.entry.identifier != declaration_name:
-            self.emit_warning(ln,
-                              f"expecting prototype for {decl_type} {self.entry.identifier}. Prototype was for {decl_type} {declaration_name} instead\n")
-            return
-
-        args_pattern = r'([^,)]+)'
-
-        sub_prefixes = [
-            (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I),  ''),
-            (Re(r'\/\*\s*private:.*', re.S| re.I),  ''),
-
-            # Strip comments
-            (Re(r'\/\*.*?\*\/', re.S),  ''),
-
-            # Strip attributes
-            (attribute, ' '),
-            (Re(r'\s*__aligned\s*\([^;]*\)', re.S),  ' '),
-            (Re(r'\s*__counted_by\s*\([^;]*\)', re.S),  ' '),
-            (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S),  ' '),
-            (Re(r'\s*__packed\s*', re.S),  ' '),
-            (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S),  ' '),
-            (Re(r'\s*____cacheline_aligned_in_smp', re.S),  ' '),
-            (Re(r'\s*____cacheline_aligned', re.S),  ' '),
-
-            # Unwrap struct_group macros based on this definition:
-            # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
-            # which has variants like: struct_group(NAME, MEMBERS...)
-            # Only MEMBERS arguments require documentation.
-            #
-            # Parsing them happens in two steps:
-            #
-            # 1. drop the struct group arguments that are not part of MEMBERS,
-            #    storing the result as STRUCT_GROUP(MEMBERS)
-            #
-            # 2. remove STRUCT_GROUP() ancillary macro.
-            #
-            # The original logic used to remove STRUCT_GROUP() using an
-            # advanced regex:
-            #
-            #   \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;
-            #
-            # which relies on two constructs that the Python re module
-            # does not handle:
-            #
-            #   - a recursive pattern: (?1)
-            #   - an atomic grouping: (?>...), unavailable before Python 3.11
-            #
-            # I tried a simpler version, but it didn't work either:
-            #   \bSTRUCT_GROUP\(([^\)]+)\)[^;]*;
-            #
-            # as it doesn't properly match the closing parenthesis in some cases.
-            #
-            # So, a better solution was crafted: there's now a NestedMatch
-            # class that ensures that delimiters after a search are properly
-            # matched. So, dropping STRUCT_GROUP() is handled separately,
-            # after the simple substitutions are applied.
-
-            (Re(r'\bstruct_group\s*\(([^,]*,)', re.S),  r'STRUCT_GROUP('),
-            (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S),  r'STRUCT_GROUP('),
-            (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S),  r'struct \1 \2; STRUCT_GROUP('),
-            (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S),  r'STRUCT_GROUP('),
-
-            # Replace macros
-            #
-            # TODO: it is better to also move those to the NestedMatch logic,
-            # to ensure that parenthesis will be properly matched.
-
-            (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S),  r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
-            (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S),  r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
-            (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'unsigned long \1[BITS_TO_LONGS(\2)]'),
-            (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'unsigned long \1[1 << ((\2) - 1)]'),
-            (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'\2 *\1'),
-            (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'\2 *\1'),
-            (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S),  r'\1 \2[]'),
-            (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S),  r'dma_addr_t \1'),
-            (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S),  r'__u32 \1'),
-        ]
-
-        # Regexes here are guaranteed to have the end delimiter matching
-        # the start delimiter. Yet, right now, only one replace group
-        # is allowed.
-
-        sub_nested_prefixes = [
-            (re.compile(r'\bSTRUCT_GROUP\('),  r'\1'),
-        ]
-
-        for search, sub in sub_prefixes:
-            members = search.sub(sub, members)
-
-        nested = NestedMatch()
-
-        for search, sub in sub_nested_prefixes:
-            members = nested.sub(search, sub, members)
-
-        # Keeps the original declaration as-is
-        declaration = members
-
-        # Split nested struct/union elements
-        #
-        # This loop was simpler in the original kernel-doc Perl version, as
-        #   while ($members =~ m/$struct_members/) { ... }
-        # re-reads the 'members' string on each iteration.
-        #
-        # Python behaves differently: findall() parses 'members' only once,
-        # creating a list of tuples from that first pass.
-        #
-        # In other words, a single pass won't catch nested structs.
-        #
-        # So, we need an extra outer loop in Python to work around this
-        # re limitation.
-
-        while True:
-            tuples = struct_members.findall(members)
-            if not tuples:
-                break
-
-            for t in tuples:
-                newmember = ""
-                maintype = t[0]
-                s_ids = t[5]
-                content = t[3]
-
-                oldmember = "".join(t)
-
-                for s_id in s_ids.split(','):
-                    s_id = s_id.strip()
-
-                    newmember += f"{maintype} {s_id}; "
-                    s_id = Re(r'[:\[].*').sub('', s_id)
-                    s_id = Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id)
-
-                    for arg in content.split(';'):
-                        arg = arg.strip()
-
-                        if not arg:
-                            continue
-
-                        r = Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)')
-                        if r.match(arg):
-                            # Pointer-to-function
-                            dtype = r.group(1)
-                            name = r.group(2)
-                            extra = r.group(3)
-
-                            if not name:
-                                continue
-
-                            if not s_id:
-                                # Anonymous struct/union
-                                newmember += f"{dtype}{name}{extra}; "
-                            else:
-                                newmember += f"{dtype}{s_id}.{name}{extra}; "
-
-                        else:
-                            arg = arg.strip()
-                            # Handle bit-fields
-                            arg = Re(r':\s*\d+\s*').sub('', arg)
-
-                            # Handle arrays
-                            arg = Re(r'\[.*\]').sub('', arg)
-
-                            # Handle multiple IDs
-                            arg = Re(r'\s*,\s*').sub(',', arg)
-
-
-                            r = Re(r'(.*)\s+([\S+,]+)')
-
-                            if r.search(arg):
-                                dtype = r.group(1)
-                                names = r.group(2)
-                            else:
-                                newmember += f"{arg}; "
-                                continue
-
-                            for name in names.split(','):
-                                name = Re(r'^\s*\**(\S+)\s*').sub(r'\1', name).strip()
-
-                                if not name:
-                                    continue
-
-                                if not s_id:
-                                    # Anonymous struct/union
-                                    newmember += f"{dtype} {name}; "
-                                else:
-                                    newmember += f"{dtype} {s_id}.{name}; "
-
-                members = members.replace(oldmember, newmember)
-
-        # Ignore other nested elements, like enums
-        members = re.sub(r'(\{[^\{\}]*\})', '', members)
-
-        self.create_parameter_list(ln, decl_type, members, ';',
-                                   declaration_name)
-        self.check_sections(ln, declaration_name, decl_type,
-                            self.entry.sectcheck, self.entry.struct_actual)
-
-        # Adjust declaration for better display
-        declaration = Re(r'([\{;])').sub(r'\1\n', declaration)
-        declaration = Re(r'\}\s+;').sub('};', declaration)
-
-        # Better handle inlined enums
-        while True:
-            r = Re(r'(enum\s+\{[^\}]+),([^\n])')
-            if not r.search(declaration):
-                break
-
-            declaration = r.sub(r'\1,\n\2', declaration)
-
-        def_args = declaration.split('\n')
-        level = 1
-        declaration = ""
-        for clause in def_args:
-
-            clause = clause.strip()
-            clause = Re(r'\s+').sub(' ', clause, count=1)
-
-            if not clause:
-                continue
-
-            if '}' in clause and level > 1:
-                level -= 1
-
-            if not Re(r'^\s*#').match(clause):
-                declaration += "\t" * level
-
-            declaration += "\t" + clause + "\n"
-            if "{" in clause and "}" not in clause:
-                level += 1
-
-        self.output_declaration(decl_type, declaration_name,
-                    struct=declaration_name,
-                    module=self.entry.modulename,
-                    definition=declaration,
-                    parameterlist=self.entry.parameterlist,
-                    parameterdescs=self.entry.parameterdescs,
-                    parametertypes=self.entry.parametertypes,
-                    sectionlist=self.entry.sectionlist,
-                    sections=self.entry.sections,
-                    purpose=self.entry.declaration_purpose)
-
-    def dump_enum(self, ln, proto):
-
-        # Ignore members marked private
-        proto = Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=re.S).sub('', proto)
-        proto = Re(r'\/\*\s*private:.*}', flags=re.S).sub('}', proto)
-
-        # Strip comments
-        proto = Re(r'\/\*.*?\*\/', flags=re.S).sub('', proto)
-
-        # Strip #define macros inside enums
-        proto = Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=re.S).sub('', proto)
-
-        members = None
-        declaration_name = None
-
-        r = Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;')
-        if r.search(proto):
-            declaration_name = r.group(2)
-            members = r.group(1).rstrip()
-        else:
-            r = Re(r'enum\s+(\w*)\s*\{(.*)\}')
-            if r.match(proto):
-                declaration_name = r.group(1)
-                members = r.group(2).rstrip()
-
-        if not members:
-            self.emit_warning(ln, f"{proto}: error: Cannot parse enum!")
-            self.config.errors += 1
-            return
-
-        if self.entry.identifier != declaration_name:
-            if self.entry.identifier == "":
-                self.emit_warning(ln,
-                                  f"{proto}: wrong kernel-doc identifier on prototype")
-            else:
-                self.emit_warning(ln,
-                                  f"expecting prototype for enum {self.entry.identifier}. Prototype was for enum {declaration_name} instead")
-            return
-
-        if not declaration_name:
-            declaration_name = "(anonymous)"
-
-        member_set = set()
-
-        members = Re(r'\([^;]*?[\)]').sub('', members)
-
-        for arg in members.split(','):
-            if not arg:
-                continue
-            arg = Re(r'^\s*(\w+).*').sub(r'\1', arg)
-            self.entry.parameterlist.append(arg)
-            if arg not in self.entry.parameterdescs:
-                self.entry.parameterdescs[arg] = self.undescribed
-                if self.show_warnings("enum", declaration_name):
-                    self.emit_warning(ln,
-                                      f"Enum value '{arg}' not described in enum '{declaration_name}'")
-            member_set.add(arg)
-
-        for k in self.entry.parameterdescs:
-            if k not in member_set:
-                if self.show_warnings("enum", declaration_name):
-                    self.emit_warning(ln,
-                                      f"Excess enum value '%{k}' description in '{declaration_name}'")
-
-        self.output_declaration('enum', declaration_name,
-                   enum=declaration_name,
-                   module=self.config.modulename,
-                   parameterlist=self.entry.parameterlist,
-                   parameterdescs=self.entry.parameterdescs,
-                   sectionlist=self.entry.sectionlist,
-                   sections=self.entry.sections,
-                   purpose=self.entry.declaration_purpose)
-
-    def dump_declaration(self, ln, prototype):
-        if self.entry.decl_type == "enum":
-            self.dump_enum(ln, prototype)
-            return
-
-        if self.entry.decl_type == "typedef":
-            self.dump_typedef(ln, prototype)
-            return
-
-        if self.entry.decl_type in ["union", "struct"]:
-            self.dump_struct(ln, prototype)
-            return
-
-        # TODO: handle other types
-        self.output_declaration(self.entry.decl_type, prototype,
-                   entry=self.entry)
-
-    def dump_function(self, ln, prototype):
-
-        func_macro = False
-        return_type = ''
-        decl_type = 'function'
-
-        # Prefixes and annotations to be stripped from the prototype
-        sub_prefixes = [
-            (r"^static +", "", 0),
-            (r"^extern +", "", 0),
-            (r"^asmlinkage +", "", 0),
-            (r"^inline +", "", 0),
-            (r"^__inline__ +", "", 0),
-            (r"^__inline +", "", 0),
-            (r"^__always_inline +", "", 0),
-            (r"^noinline +", "", 0),
-            (r"^__FORTIFY_INLINE +", "", 0),
-            (r"__init +", "", 0),
-            (r"__init_or_module +", "", 0),
-            (r"__deprecated +", "", 0),
-            (r"__flatten +", "", 0),
-            (r"__meminit +", "", 0),
-            (r"__must_check +", "", 0),
-            (r"__weak +", "", 0),
-            (r"__sched +", "", 0),
-            (r"_noprof", "", 0),
-            (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0),
-            (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", 0),
-            (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0),
-            (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2", 0),
-            (r"__attribute_const__ +", "", 0),
-
-            # It seems that Python support for re.X is broken:
-            # At least for me (Python 3.13), this didn't work
-#            (r"""
-#              __attribute__\s*\(\(
-#                (?:
-#                    [\w\s]+          # attribute name
-#                    (?:\([^)]*\))?   # attribute arguments
-#                    \s*,?            # optional comma at the end
-#                )+
-#              \)\)\s+
-#             """, "", re.X),
-
-            # So, remove whitespaces and comments from it
-            (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+", "", 0),
-        ]
-
-        for search, sub, flags in sub_prefixes:
-            prototype = Re(search, flags).sub(sub, prototype)
-
-        # Macros are a special case, as they change the prototype format
-        new_proto = Re(r"^#\s*define\s+").sub("", prototype)
-        if new_proto != prototype:
-            is_define_proto = True
-            prototype = new_proto
-        else:
-            is_define_proto = False
-
-        # Yes, this truly is vile.  We are looking for:
-        # 1. Return type (may be nothing if we're looking at a macro)
-        # 2. Function name
-        # 3. Function parameters.
-        #
-        # All the while we have to watch out for function pointer parameters
-        # (which IIRC is what the two sections are for), C types (these
-        # regexps don't even start to express all the possibilities), and
-        # so on.
-        #
-        # If you mess with these regexps, it's a good idea to check that
-        # the following functions' documentation still comes out right:
-        # - parport_register_device (function pointer parameters)
-        # - atomic_set (macro)
-        # - pci_match_device, __copy_to_user (long return type)
-
-        name = r'[a-zA-Z0-9_~:]+'
-        prototype_end1 = r'[^\(]*'
-        prototype_end2 = r'[^\{]*'
-        prototype_end = fr'\(({prototype_end1}|{prototype_end2})\)'
-
-        # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing group.
-        # So, this needs to be mapped in Python with (?:...)? or (?:...)+
-
-        type1 = r'(?:[\w\s]+)?'
-        type2 = r'(?:[\w\s]+\*+)+'
-
-        found = False
-
-        if is_define_proto:
-            r = Re(r'^()(' + name + r')\s+')
-
-            if r.search(prototype):
-                return_type = ''
-                declaration_name = r.group(2)
-                func_macro = True
-
-                found = True
-
-        if not found:
-            patterns = [
-                rf'^()({name})\s*{prototype_end}',
-                rf'^({type1})\s+({name})\s*{prototype_end}',
-                rf'^({type2})\s*({name})\s*{prototype_end}',
-            ]
-
-            for p in patterns:
-                r = Re(p)
-
-                if r.match(prototype):
-
-                    return_type = r.group(1)
-                    declaration_name = r.group(2)
-                    args = r.group(3)
-
-                    self.create_parameter_list(ln, decl_type, args, ',',
-                                               declaration_name)
-
-                    found = True
-                    break
-        if not found:
-            self.emit_warning(ln,
-                              f"cannot understand function prototype: '{prototype}'")
-            return
-
-        if self.entry.identifier != declaration_name:
-            self.emit_warning(ln,
-                              f"expecting prototype for {self.entry.identifier}(). Prototype was for {declaration_name}() instead")
-            return
-
-        prms = " ".join(self.entry.parameterlist)
-        self.check_sections(ln, declaration_name, "function",
-                            self.entry.sectcheck, prms)
-
-        self.check_return_section(ln, declaration_name, return_type)
-
-        if 'typedef' in return_type:
-            self.output_declaration(decl_type, declaration_name,
-                       function=declaration_name,
-                       typedef=True,
-                       module=self.config.modulename,
-                       functiontype=return_type,
-                       parameterlist=self.entry.parameterlist,
-                       parameterdescs=self.entry.parameterdescs,
-                       parametertypes=self.entry.parametertypes,
-                       sectionlist=self.entry.sectionlist,
-                       sections=self.entry.sections,
-                       purpose=self.entry.declaration_purpose,
-                       func_macro=func_macro)
-        else:
-            self.output_declaration(decl_type, declaration_name,
-                       function=declaration_name,
-                       typedef=False,
-                       module=self.config.modulename,
-                       functiontype=return_type,
-                       parameterlist=self.entry.parameterlist,
-                       parameterdescs=self.entry.parameterdescs,
-                       parametertypes=self.entry.parametertypes,
-                       sectionlist=self.entry.sectionlist,
-                       sections=self.entry.sections,
-                       purpose=self.entry.declaration_purpose,
-                       func_macro=func_macro)
-
-    def dump_typedef(self, ln, proto):
-        typedef_type = r'((?:\s+[\w\*]+\b){1,8})\s*'
-        typedef_ident = r'\*?\s*(\w\S+)\s*'
-        typedef_args = r'\s*\((.*)\);'
-
-        typedef1 = Re(r'typedef' + typedef_type + r'\(' + typedef_ident + r'\)' + typedef_args)
-        typedef2 = Re(r'typedef' + typedef_type + typedef_ident + typedef_args)
-
-        # Strip comments
-        proto = Re(r'/\*.*?\*/', flags=re.S).sub('', proto)
-
-        # Parse function typedef prototypes
-        for r in [typedef1, typedef2]:
-            if not r.match(proto):
-                continue
-
-            return_type = r.group(1).strip()
-            declaration_name = r.group(2)
-            args = r.group(3)
-
-            if self.entry.identifier != declaration_name:
-                self.emit_warning(ln,
-                                  f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
-                return
-
-            decl_type = 'function'
-            self.create_parameter_list(ln, decl_type, args, ',', declaration_name)
-
-            self.output_declaration(decl_type, declaration_name,
-                       function=declaration_name,
-                       typedef=True,
-                       module=self.entry.modulename,
-                       functiontype=return_type,
-                       parameterlist=self.entry.parameterlist,
-                       parameterdescs=self.entry.parameterdescs,
-                       parametertypes=self.entry.parametertypes,
-                       sectionlist=self.entry.sectionlist,
-                       sections=self.entry.sections,
-                       purpose=self.entry.declaration_purpose)
-            return
-
-        # Handle nested parentheses or brackets
-        r = Re(r'(\(*.\)\s*|\[*.\]\s*);$')
-        while r.search(proto):
-            proto = r.sub('', proto)
-
-        # Parse simple typedefs
-        r = Re(r'typedef.*\s+(\w+)\s*;')
-        if r.match(proto):
-            declaration_name = r.group(1)
-
-            if self.entry.identifier != declaration_name:
-                self.emit_warning(ln, f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
-                return
-
-            self.output_declaration('typedef', declaration_name,
-                       typedef=declaration_name,
-                       module=self.entry.modulename,
-                       sectionlist=self.entry.sectionlist,
-                       sections=self.entry.sections,
-                       purpose=self.entry.declaration_purpose)
-            return
-
-        self.emit_warning(ln, "error: Cannot parse typedef!")
-        self.config.errors += 1
-
-    @staticmethod
-    def process_export(function_table, line):
-        """
-        process EXPORT_SYMBOL* tags
-
-        This method is called both internally and externally, so, it
-        doesn't use self.
-        """
-
-        if export_symbol.search(line):
-            symbol = export_symbol.group(2)
-            function_table.add(symbol)
-
-        if export_symbol_ns.search(line):
-            symbol = export_symbol_ns.group(2)
-            function_table.add(symbol)
-
-    def process_normal(self, ln, line):
-        """
-        STATE_NORMAL: looking for the /** to begin everything.
-        """
-
-        if not doc_start.match(line):
-            return
-
-        # start a new entry
-        self.reset_state(ln + 1)
-        self.entry.in_doc_sect = False
-
-        # next line is always the function name
-        self.state = self.STATE_NAME
-
-    def process_name(self, ln, line):
-        """
-        STATE_NAME: Looking for the "name - description" line
-        """
-
-        if doc_block.search(line):
-            self.entry.new_start_line = ln
-
-            if not doc_block.group(1):
-                self.entry.section = self.section_intro
-            else:
-                self.entry.section = doc_block.group(1)
-
-            self.state = self.STATE_DOCBLOCK
-            return
-
-        if doc_decl.search(line):
-            self.entry.identifier = doc_decl.group(1)
-            self.entry.is_kernel_comment = False
-
-            decl_start = str(doc_com)       # comment block asterisk
-            fn_type = r"(?:\w+\s*\*\s*)?"  # type (for non-functions)
-            parenthesis = r"(?:\(\w*\))?"   # optional parenthesis on function
-            decl_end = r"(?:[-:].*)"         # end of the name part
-
-            # test for pointer declaration type, foo * bar() - desc
-            r = Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}?$")
-            if r.search(line):
-                self.entry.identifier = r.group(1)
-
-            # Test for data declaration
-            r = Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)")
-            if r.search(line):
-                self.entry.decl_type = r.group(1)
-                self.entry.identifier = r.group(2)
-                self.entry.is_kernel_comment = True
-            else:
-                # Look for foo() or static void foo() - description;
-                # or misspelt identifier
-
-                r1 = Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s*{decl_end}?$")
-                r2 = Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis}\s*{decl_end}$")
-
-                for r in [r1, r2]:
-                    if r.search(line):
-                        self.entry.identifier = r.group(1)
-                        self.entry.decl_type = "function"
-
-                        r = Re(r"define\s+")
-                        self.entry.identifier = r.sub("", self.entry.identifier)
-                        self.entry.is_kernel_comment = True
-                        break
-
-            self.entry.identifier = self.entry.identifier.strip(" ")
-
-            self.state = self.STATE_BODY
-
-            # if there's no @param blocks need to set up default section here
-            self.entry.section = self.section_default
-            self.entry.new_start_line = ln + 1
-
-            r = Re("[-:](.*)")
-            if r.search(line):
-                # strip leading/trailing/multiple spaces
-                self.entry.descr = r.group(1).strip(" ")
-
-                r = Re(r"\s+")
-                self.entry.descr = r.sub(" ", self.entry.descr)
-                self.entry.declaration_purpose = self.entry.descr
-                self.state = self.STATE_BODY_MAYBE
-            else:
-                self.entry.declaration_purpose = ""
-
-            if not self.entry.is_kernel_comment:
-                self.emit_warning(ln,
-                                  f"This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst\n{line}")
-                self.state = self.STATE_NORMAL
-
-            if not self.entry.declaration_purpose and self.config.wshort_desc:
-                self.emit_warning(ln,
-                                  f"missing initial short description on line:\n{line}")
-
-            if not self.entry.identifier and self.entry.decl_type != "enum":
-                self.emit_warning(ln,
-                                  f"wrong kernel-doc identifier on line:\n{line}")
-                self.state = self.STATE_NORMAL
-
-            if self.config.verbose:
-                self.emit_warning(ln,
-                                  f"Scanning doc for {self.entry.decl_type} {self.entry.identifier}",
-                             warning=False)
-
-            return
-
-        # Failed to find an identifier. Emit a warning
-        self.emit_warning(ln, f"Cannot find identifier on line:\n{line}")
-
-    def process_body(self, ln, line):
-        """
-        STATE_BODY and STATE_BODY_MAYBE: the bulk of a kerneldoc comment.
-        """
-
-        if self.state == self.STATE_BODY_WITH_BLANK_LINE:
-            r = Re(r"\s*\*\s?\S")
-            if r.match(line):
-                self.dump_section()
-                self.entry.section = self.section_default
-                self.entry.new_start_line = ln
-                self.entry.contents = ""
-
-        if doc_sect.search(line):
-            self.entry.in_doc_sect = True
-            newsection = doc_sect.group(1)
-
-            if newsection.lower() in ["description", "context"]:
-                newsection = newsection.title()
-
-            # Special case: @return is a section, not a param description
-            if newsection.lower() in ["@return", "@returns",
-                                    "return", "returns"]:
-                newsection = "Return"
-
-            # Perl kernel-doc has a check here for contents before sections.
-            # the logic there is always false, as in_doc_sect variable is
-            # always true. So, just don't implement Wcontents_before_sections
-
-            # .title()
-            newcontents = doc_sect.group(2)
-            if not newcontents:
-                newcontents = ""
-
-            if self.entry.contents.strip("\n"):
-                self.dump_section()
-
-            self.entry.new_start_line = ln
-            self.entry.section = newsection
-            self.entry.leading_space = None
-
-            self.entry.contents = newcontents.lstrip()
-            if self.entry.contents:
-                self.entry.contents += "\n"
-
-            self.state = self.STATE_BODY
-            return
-
-        if doc_end.search(line):
-            self.dump_section()
-
-            # Look for doc_com + <text> + doc_end:
-            r = Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
-            if r.match(line):
-                self.emit_warning(ln, f"suspicious ending line: {line}")
-
-            self.entry.prototype = ""
-            self.entry.new_start_line = ln + 1
-
-            self.state = self.STATE_PROTO
-            return
-
-        if doc_content.search(line):
-            cont = doc_content.group(1)
-
-            if cont == "":
-                if self.entry.section == self.section_context:
-                    self.dump_section()
-
-                    self.entry.new_start_line = ln
-                    self.state = self.STATE_BODY
-                else:
-                    if self.entry.section != self.section_default:
-                        self.state = self.STATE_BODY_WITH_BLANK_LINE
-                    else:
-                        self.state = self.STATE_BODY
-
-                    self.entry.contents += "\n"
-
-            elif self.state == self.STATE_BODY_MAYBE:
-
-                # Continued declaration purpose
-                self.entry.declaration_purpose = self.entry.declaration_purpose.rstrip()
-                self.entry.declaration_purpose += " " + cont
-
-                r = Re(r"\s+")
-                self.entry.declaration_purpose = r.sub(' ',
-                                                       self.entry.declaration_purpose)
-
-            else:
-                if self.entry.section.startswith('@') or        \
-                   self.entry.section == self.section_context:
-                    if self.entry.leading_space is None:
-                        r = Re(r'^(\s+)')
-                        if r.match(cont):
-                            self.entry.leading_space = len(r.group(1))
-                        else:
-                            self.entry.leading_space = 0
-
-                    # Double-check if leading spaces are really spaces
-                    pos = 0
-                    for i in range(0, self.entry.leading_space):
-                        if cont[i] != " ":
-                            break
-                        pos += 1
-
-                    cont = cont[pos:]
-
-                    # NEW LOGIC:
-                    # In case it is different, update it
-                    if self.entry.leading_space != pos:
-                        self.entry.leading_space = pos
-
-                self.entry.contents += cont + "\n"
-            return
-
-        # Unknown line, ignore
-        self.emit_warning(ln, f"bad line: {line}")
-
-    def process_inline(self, ln, line):
-        """STATE_INLINE: docbook comments within a prototype."""
-
-        if self.inline_doc_state == self.STATE_INLINE_NAME and \
-           doc_inline_sect.search(line):
-            self.entry.section = doc_inline_sect.group(1)
-            self.entry.new_start_line = ln
-
-            self.entry.contents = doc_inline_sect.group(2).lstrip()
-            if self.entry.contents != "":
-                self.entry.contents += "\n"
-
-            self.inline_doc_state = self.STATE_INLINE_TEXT
-            # Documentation block end */
-            return
-
-        if doc_inline_end.search(line):
-            if self.entry.contents not in ["", "\n"]:
-                self.dump_section()
-
-            self.state = self.STATE_PROTO
-            self.inline_doc_state = self.STATE_INLINE_NA
-            return
-
-        if doc_content.search(line):
-            if self.inline_doc_state == self.STATE_INLINE_TEXT:
-                self.entry.contents += doc_content.group(1) + "\n"
-                if not self.entry.contents.strip(" ").rstrip("\n"):
-                    self.entry.contents = ""
-
-            elif self.inline_doc_state == self.STATE_INLINE_NAME:
-                self.emit_warning(ln,
-                                  f"Incorrect use of kernel-doc format: {line}")
-
-                self.inline_doc_state = self.STATE_INLINE_ERROR
-
-    def syscall_munge(self, ln, proto):
-        """
-        Handle syscall definitions
-        """
-
-        is_void = False
-
-        # Strip newlines/CR's
-        proto = re.sub(r'[\r\n]+', ' ', proto)
-
-        # Check if it's a SYSCALL_DEFINE0
-        if 'SYSCALL_DEFINE0' in proto:
-            is_void = True
-
-        # Replace SYSCALL_DEFINE with correct return type & function name
-        proto = Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto)
-
-        r = Re(r'long\s+(sys_.*?),')
-        if r.search(proto):
-            proto = proto.replace(',', '(', 1)
-        elif is_void:
-            proto = proto.replace(')', '(void)', 1)
-
-        # Now delete all of the odd-numbered commas in the proto
-        # so that argument types & names don't have a comma between them
-        count = 0
-        length = len(proto)
-
-        if is_void:
-            length = 0  # skip the loop if is_void
-
-        for ix in range(length):
-            if proto[ix] == ',':
-                count += 1
-                if count % 2 == 1:
-                    proto = proto[:ix] + ' ' + proto[ix+1:]
-
-        return proto
-
-    def tracepoint_munge(self, ln, proto):
-        """
-        Handle tracepoint definitions
-        """
-
-        tracepointname = None
-        tracepointargs = None
-
-        # Match tracepoint name based on different patterns
-        r = Re(r'TRACE_EVENT\((.*?),')
-        if r.search(proto):
-            tracepointname = r.group(1)
-
-        r = Re(r'DEFINE_SINGLE_EVENT\((.*?),')
-        if r.search(proto):
-            tracepointname = r.group(1)
-
-        r = Re(r'DEFINE_EVENT\((.*?),(.*?),')
-        if r.search(proto):
-            tracepointname = r.group(2)
-
-        if tracepointname:
-            tracepointname = tracepointname.lstrip()
-
-        r = Re(r'TP_PROTO\((.*?)\)')
-        if r.search(proto):
-            tracepointargs = r.group(1)
-
-        if not tracepointname or not tracepointargs:
-            self.emit_warning(ln,
-                              f"Unrecognized tracepoint format:\n{proto}\n")
-        else:
-            proto = f"static inline void trace_{tracepointname}({tracepointargs})"
-            self.entry.identifier = f"trace_{self.entry.identifier}"
-
-        return proto
-
-    def process_proto_function(self, ln, line):
-        """Ancillary routine to process a function prototype"""
-
-        # strip C99-style comments to end of line
-        r = Re(r"\/\/.*$", re.S)
-        line = r.sub('', line)
-
-        if Re(r'\s*#\s*define').match(line):
-            self.entry.prototype = line
-        elif line.startswith('#'):
-            # Strip other macros like #ifdef/#ifndef/#endif/...
-            pass
-        else:
-            r = Re(r'([^\{]*)')
-            if r.match(line):
-                self.entry.prototype += r.group(1) + " "
-
-        if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line):
-            # strip comments
-            r = Re(r'/\*.*?\*/')
-            self.entry.prototype = r.sub('', self.entry.prototype)
-
-            # strip newlines/cr's
-            r = Re(r'[\r\n]+')
-            self.entry.prototype = r.sub(' ', self.entry.prototype)
-
-            # strip leading spaces
-            r = Re(r'^\s+')
-            self.entry.prototype = r.sub('', self.entry.prototype)
-
-            # Handle self.entry.prototypes for function pointers like:
-            #       int (*pcs_config)(struct foo)
-
-            r = Re(r'^(\S+\s+)\(\s*\*(\S+)\)')
-            self.entry.prototype = r.sub(r'\1\2', self.entry.prototype)
-
-            if 'SYSCALL_DEFINE' in self.entry.prototype:
-                self.entry.prototype = self.syscall_munge(ln,
-                                                          self.entry.prototype)
-
-            r = Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT')
-            if r.search(self.entry.prototype):
-                self.entry.prototype = self.tracepoint_munge(ln,
-                                                             self.entry.prototype)
-
-            self.dump_function(ln, self.entry.prototype)
-            self.reset_state(ln)
-
-    def process_proto_type(self, ln, line):
-        """Ancillary routine to process a type"""
-
-        # Strip newlines/cr's.
-        line = Re(r'[\r\n]+', re.S).sub(' ', line)
-
-        # Strip leading spaces
-        line = Re(r'^\s+', re.S).sub('', line)
-
-        # Strip trailing spaces
-        line = Re(r'\s+$', re.S).sub('', line)
-
-        # Strip C99-style comments to the end of the line
-        line = Re(r"\/\/.*$", re.S).sub('', line)
-
-        # To distinguish preprocessor directive from regular declaration later.
-        if line.startswith('#'):
-            line += ";"
-
-        r = Re(r'([^\{\};]*)([\{\};])(.*)')
-        while True:
-            if r.search(line):
-                if self.entry.prototype:
-                    self.entry.prototype += " "
-                self.entry.prototype += r.group(1) + r.group(2)
-
-                self.entry.brcount += r.group(2).count('{')
-                self.entry.brcount -= r.group(2).count('}')
-
-                self.entry.brcount = max(self.entry.brcount, 0)
-
-                if r.group(2) == ';' and self.entry.brcount == 0:
-                    self.dump_declaration(ln, self.entry.prototype)
-                    self.reset_state(ln)
-                    break
-
-                line = r.group(3)
-            else:
-                self.entry.prototype += line
-                break
-
-    def process_proto(self, ln, line):
-        """STATE_PROTO: reading a function/whatever prototype."""
-
-        if doc_inline_oneline.search(line):
-            self.entry.section = doc_inline_oneline.group(1)
-            self.entry.contents = doc_inline_oneline.group(2)
-
-            if self.entry.contents != "":
-                self.entry.contents += "\n"
-                self.dump_section(start_new=False)
-
-        elif doc_inline_start.search(line):
-            self.state = self.STATE_INLINE
-            self.inline_doc_state = self.STATE_INLINE_NAME
-
-        elif self.entry.decl_type == 'function':
-            self.process_proto_function(ln, line)
-
-        else:
-            self.process_proto_type(ln, line)
-
-    def process_docblock(self, ln, line):
-        """STATE_DOCBLOCK: within a DOC: block."""
-
-        if doc_end.search(line):
-            self.dump_section()
-            self.output_declaration("doc", None,
-                       sectionlist=self.entry.sectionlist,
-                       sections=self.entry.sections,                    module=self.config.modulename)
-            self.reset_state(ln)
-
-        elif doc_content.search(line):
-            self.entry.contents += doc_content.group(1) + "\n"
-
-    def run(self):
-        """
-        Open and process each line of a C source file.
-        The parsing is controlled via a state machine, and the line is passed
-        to a different process function depending on the state. The process
-        function may update the state as needed.
-        """
-
-        cont = False
-        prev = ""
-        prev_ln = None
-
-        try:
-            with open(self.fname, "r", encoding="utf8",
-                      errors="backslashreplace") as fp:
-                for ln, line in enumerate(fp):
-
-                    line = line.expandtabs().strip("\n")
-
-                    # Group continuation lines on prototypes
-                    if self.state == self.STATE_PROTO:
-                        if line.endswith("\\"):
-                            prev += line.removesuffix("\\")
-                            cont = True
-
-                            if not prev_ln:
-                                prev_ln = ln
-
-                            continue
-
-                        if cont:
-                            ln = prev_ln
-                            line = prev + line
-                            prev = ""
-                            cont = False
-                            prev_ln = None
-
-                    self.config.log.debug("%d %s%s: %s",
-                                          ln, self.st_name[self.state],
-                                          self.st_inline_name[self.inline_doc_state],
-                                          line)
-
-                    # TODO: not all states allow EXPORT_SYMBOL*, so this
-                    # can be optimized later on to speedup parsing
-                    self.process_export(self.config.function_table, line)
-
-                    # Hand this line to the appropriate state handler
-                    if self.state == self.STATE_NORMAL:
-                        self.process_normal(ln, line)
-                    elif self.state == self.STATE_NAME:
-                        self.process_name(ln, line)
-                    elif self.state in [self.STATE_BODY, self.STATE_BODY_MAYBE,
-                                        self.STATE_BODY_WITH_BLANK_LINE]:
-                        self.process_body(ln, line)
-                    elif self.state == self.STATE_INLINE:  # scanning for inline parameters
-                        self.process_inline(ln, line)
-                    elif self.state == self.STATE_PROTO:
-                        self.process_proto(ln, line)
-                    elif self.state == self.STATE_DOCBLOCK:
-                        self.process_docblock(ln, line)
-        except OSError:
-            self.config.log.error(f"Error: Cannot open file {self.fname}")
-            self.config.errors += 1
-
-
 class GlobSourceFiles:
     """
     Parse C source code file names and directories via an Interactor.
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
new file mode 100755
index 000000000000..3ce116595546
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -0,0 +1,1690 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=C0301,C0302,R0904,R0912,R0913,R0914,R0915,R0917,R1702
+
+"""
+kdoc_parser
+===========
+
+Read a C language source or header FILE and extract embedded
+documentation comments
+"""
+
+import argparse
+import re
+from pprint import pformat
+
+from kdoc_re import NestedMatch, Re
+
+
+#
+# Regular expressions used to parse kernel-doc markups at KernelDoc class.
+#
+# Let's declare them in lowercase outside any class to make it easier to
+# convert from the python script.
+#
+# As those are evaluated at the beginning, no need to cache them
+#
+
+# Allow whitespace at end of comment start.
+doc_start = Re(r'^/\*\*\s*$', cache=False)
+
+doc_end = Re(r'\*/', cache=False)
+doc_com = Re(r'\s*\*\s*', cache=False)
+doc_com_body = Re(r'\s*\* ?', cache=False)
+doc_decl = doc_com + Re(r'(\w+)', cache=False)
+
+# @params and a strictly limited set of supported section names
+# Specifically:
+#   Match @word:
+#         @...:
+#         @{section-name}:
+# while trying to not match literal block starts like "example::"
+#
+doc_sect = doc_com + \
+            Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:([^:].*)?$',
+                flags=re.I, cache=False)
+
+doc_content = doc_com_body + Re(r'(.*)', cache=False)
+doc_block = doc_com + Re(r'DOC:\s*(.*)?', cache=False)
+doc_inline_start = Re(r'^\s*/\*\*\s*$', cache=False)
+doc_inline_sect = Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
+doc_inline_end = Re(r'^\s*\*/\s*$', cache=False)
+doc_inline_oneline = Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
+attribute = Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
+               flags=re.I | re.S, cache=False)
+
+export_symbol = Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
+export_symbol_ns = Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
+
+type_param = Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+
+
+class KernelDoc:
+    """
+    Read a C language source or header FILE and extract embedded
+    documentation comments.
+    """
+
+    # Parser states
+    STATE_NORMAL        = 0        # normal code
+    STATE_NAME          = 1        # looking for function name
+    STATE_BODY_MAYBE    = 2        # body - or maybe more description
+    STATE_BODY          = 3        # the body of the comment
+    STATE_BODY_WITH_BLANK_LINE = 4 # the body which has a blank line
+    STATE_PROTO         = 5        # scanning prototype
+    STATE_DOCBLOCK      = 6        # documentation block
+    STATE_INLINE        = 7        # gathering doc outside main block
+
+    st_name = [
+        "NORMAL",
+        "NAME",
+        "BODY_MAYBE",
+        "BODY",
+        "BODY_WITH_BLANK_LINE",
+        "PROTO",
+        "DOCBLOCK",
+        "INLINE",
+    ]
+
+    # Inline documentation state
+    STATE_INLINE_NA     = 0 # not applicable ($state != STATE_INLINE)
+    STATE_INLINE_NAME   = 1 # looking for member name (@foo:)
+    STATE_INLINE_TEXT   = 2 # looking for member documentation
+    STATE_INLINE_END    = 3 # done
+    STATE_INLINE_ERROR  = 4 # error - Comment without header was found.
+                            # Spit a warning as it's not
+                            # proper kernel-doc and ignore the rest.
+
+    st_inline_name = [
+        "",
+        "_NAME",
+        "_TEXT",
+        "_END",
+        "_ERROR",
+    ]
+
+    # Section names
+
+    section_default = "Description"  # default section
+    section_intro = "Introduction"
+    section_context = "Context"
+    section_return = "Return"
+
+    undescribed = "-- undescribed --"
+
+    def __init__(self, config, fname):
+        """Initialize internal variables"""
+
+        self.fname = fname
+        self.config = config
+
+        # Initial state for the state machines
+        self.state = self.STATE_NORMAL
+        self.inline_doc_state = self.STATE_INLINE_NA
+
+        # Store entry currently being processed
+        self.entry = None
+
+        # Place all potential outputs into an array
+        self.entries = []
+
+    def show_warnings(self, dtype, declaration_name):  # pylint: disable=W0613
+        """
+        Allow filtering out warnings
+        """
+
+        # TODO: implement it
+
+        return True
+
+    # TODO: rename to emit_message
+    def emit_warning(self, ln, msg, warning=True):
+        """Emit a message"""
+
+        if warning:
+            self.config.log.warning("%s:%d %s", self.fname, ln, msg)
+        else:
+            self.config.log.info("%s:%d %s", self.fname, ln, msg)
+
+    def dump_section(self, start_new=True):
+        """
+        Dumps section contents to arrays/hashes intended for that purpose.
+        """
+
+        name = self.entry.section
+        contents = self.entry.contents
+
+        # TODO: we can prevent dumping empty sections here with:
+        #
+        #    if not self.entry.contents.strip("\n"):
+        #        if start_new:
+        #            self.entry.section = self.section_default
+        #            self.entry.contents = ""
+        #
+        #        return
+        #
+        # But, as we want to produce the same output as the
+        # venerable kernel-doc Perl tool, let's just output everything,
+        # at least for now
+
+        if type_param.match(name):
+            name = type_param.group(1)
+
+            self.entry.parameterdescs[name] = contents
+            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
+
+            self.entry.sectcheck += name + " "
+            self.entry.new_start_line = 0
+
+        elif name == "@...":
+            name = "..."
+            self.entry.parameterdescs[name] = contents
+            self.entry.sectcheck += name + " "
+            self.entry.parameterdesc_start_lines[name] = self.entry.new_start_line
+            self.entry.new_start_line = 0
+
+        else:
+            if name in self.entry.sections and self.entry.sections[name] != "":
+                # Only warn on user-specified duplicate section names
+                if name != self.section_default:
+                    self.emit_warning(self.entry.new_start_line,
+                                      f"duplicate section name '{name}'\n")
+                self.entry.sections[name] += contents
+            else:
+                self.entry.sections[name] = contents
+                self.entry.sectionlist.append(name)
+                self.entry.section_start_lines[name] = self.entry.new_start_line
+                self.entry.new_start_line = 0
+
+#        self.config.log.debug("Section: %s : %s", name, pformat(vars(self.entry)))
+
+        if start_new:
+            self.entry.section = self.section_default
+            self.entry.contents = ""
+
+    # TODO: rename it to store_declaration
+    def output_declaration(self, dtype, name, **args):
+        """
+        Stores the entry into an entry array.
+
+        The actual output and output filters will be handled elsewhere
+        """
+
+        # The implementation here is different from the original kernel-doc:
+        # instead of checking for output filters or actually outputting
+        # anything, it just stores the declaration content at self.entries,
+        # as the output will happen in a separate class.
+        #
+        # For now, we're keeping the same function name just to make it
+        # easier to compare the source code of both scripts
+
+        if "declaration_start_line" not in args:
+            args["declaration_start_line"] = self.entry.declaration_start_line
+
+        args["type"] = dtype
+
+        # TODO: use collections.OrderedDict
+
+        sections = args.get('sections', {})
+        sectionlist = args.get('sectionlist', [])
+
+        # Drop empty sections
+        # TODO: improve it to emit warnings
+        for section in ["Description", "Return"]:
+            if section in sectionlist:
+                if not sections[section].rstrip():
+                    del sections[section]
+                    sectionlist.remove(section)
+
+        self.entries.append((name, args))
+
+        self.config.log.debug("Output: %s:%s = %s", dtype, name, pformat(args))
+
+    def reset_state(self, ln):
+        """
+        Ancillary routine to create a new entry. It initializes all
+        variables used by the state machine.
+        """
+
+        self.entry = argparse.Namespace
+
+        self.entry.contents = ""
+        self.entry.function = ""
+        self.entry.sectcheck = ""
+        self.entry.struct_actual = ""
+        self.entry.prototype = ""
+
+        self.entry.parameterlist = []
+        self.entry.parameterdescs = {}
+        self.entry.parametertypes = {}
+        self.entry.parameterdesc_start_lines = {}
+
+        self.entry.section_start_lines = {}
+        self.entry.sectionlist = []
+        self.entry.sections = {}
+
+        self.entry.anon_struct_union = False
+
+        self.entry.leading_space = None
+
+        # State flags
+        self.state = self.STATE_NORMAL
+        self.inline_doc_state = self.STATE_INLINE_NA
+        self.entry.brcount = 0
+
+        self.entry.in_doc_sect = False
+        self.entry.declaration_start_line = ln
+
+    def push_parameter(self, ln, decl_type, param, dtype,
+                       org_arg, declaration_name):
+        """
+        Store parameters and their descriptions at self.entry.
+        """
+
+        if self.entry.anon_struct_union and dtype == "" and param == "}":
+            return  # Ignore the ending }; from anonymous struct/union
+
+        self.entry.anon_struct_union = False
+
+        param = Re(r'[\[\)].*').sub('', param, count=1)
+
+        if dtype == "" and param.endswith("..."):
+            if Re(r'\w\.\.\.$').search(param):
+                # For named variable parameters of the form `x...`,
+                # remove the dots
+                param = param[:-3]
+            else:
+                # Handles unnamed variable parameters
+                param = "..."
+
+            if param not in self.entry.parameterdescs or \
+                not self.entry.parameterdescs[param]:
+
+                self.entry.parameterdescs[param] = "variable arguments"
+
+        elif dtype == "" and (not param or param == "void"):
+            param = "void"
+            self.entry.parameterdescs[param] = "no arguments"
+
+        elif dtype == "" and param in ["struct", "union"]:
+            # Handle unnamed (anonymous) union or struct
+            dtype = param
+            param = "{unnamed_" + param + "}"
+            self.entry.parameterdescs[param] = "anonymous\n"
+            self.entry.anon_struct_union = True
+
+        # Handle cache group enforcing variables: they do not need
+        # to be described in header files
+        elif "__cacheline_group" in param:
+            # Ignore __cacheline_group_begin and __cacheline_group_end
+            return
+
+        # Warn if parameter has no description
+        # (but ignore ones starting with # as these are not parameters
+        # but inline preprocessor statements)
+        if param not in self.entry.parameterdescs and not param.startswith("#"):
+            self.entry.parameterdescs[param] = self.undescribed
+
+            if self.show_warnings(dtype, declaration_name) and "." not in param:
+                if decl_type == 'function':
+                    dname = f"{decl_type} parameter"
+                else:
+                    dname = f"{decl_type} member"
+
+                self.emit_warning(ln,
+                                  f"{dname} '{param}' not described in '{declaration_name}'")
+
+        # Strip spaces from param so that it is one continuous string on
+        # parameterlist. This fixes a problem where check_sections()
+        # cannot find a parameter like "addr[6 + 2]" because it actually
+        # appears as "addr[6", "+", "2]" on the parameter list.
+        # However, it's better to maintain the param string unchanged for
+        # output, so just weaken the string compare in check_sections()
+        # to ignore "[blah" in a parameter string.
+
+        self.entry.parameterlist.append(param)
+        org_arg = Re(r'\s\s+').sub(' ', org_arg)
+        self.entry.parametertypes[param] = org_arg
+
+    def save_struct_actual(self, actual):
+        """
+        Strip all spaces from the actual param so that it looks like
+        one string item.
+        """
+
+        actual = Re(r'\s*').sub("", actual)
+
+        self.entry.struct_actual += actual + " "
+
+    def create_parameter_list(self, ln, decl_type, args,
+                              splitter, declaration_name):
+        """
+        Creates a list of parameters, storing them at self.entry.
+        """
+
+        # temporarily replace all commas inside function pointer definition
+        arg_expr = Re(r'(\([^\),]+),')
+        while arg_expr.search(args):
+            args = arg_expr.sub(r"\1#", args)
+
+        for arg in args.split(splitter):
+            # Strip comments
+            arg = Re(r'\/\*.*\*\/').sub('', arg)
+
+            # Ignore argument attributes
+            arg = Re(r'\sPOS0?\s').sub(' ', arg)
+
+            # Strip leading/trailing spaces
+            arg = arg.strip()
+            arg = Re(r'\s+').sub(' ', arg, count=1)
+
+            if arg.startswith('#'):
+                # Treat preprocessor directive as a typeless variable just to fill
+                # corresponding data structures "correctly". Catch it later in
+                # output_* subs.
+
+                # Treat preprocessor directive as a typeless variable
+                self.push_parameter(ln, decl_type, arg, "",
+                                    "", declaration_name)
+
+            elif Re(r'\(.+\)\s*\(').search(arg):
+                # Pointer-to-function
+
+                arg = arg.replace('#', ',')
+
+                r = Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
+                if r.match(arg):
+                    param = r.group(1)
+                else:
+                    self.emit_warning(ln, f"Invalid param: {arg}")
+                    param = arg
+
+                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+                self.save_struct_actual(param)
+                self.push_parameter(ln, decl_type, param, dtype,
+                                    arg, declaration_name)
+
+            elif Re(r'\(.+\)\s*\[').search(arg):
+                # Array-of-pointers
+
+                arg = arg.replace('#', ',')
+                r = Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
+                if r.match(arg):
+                    param = r.group(1)
+                else:
+                    self.emit_warning(ln, f"Invalid param: {arg}")
+                    param = arg
+
+                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+
+                self.save_struct_actual(param)
+                self.push_parameter(ln, decl_type, param, dtype,
+                                    arg, declaration_name)
+
+            elif arg:
+                arg = Re(r'\s*:\s*').sub(":", arg)
+                arg = Re(r'\s*\[').sub('[', arg)
+
+                args = Re(r'\s*,\s*').split(arg)
+                if args[0] and '*' in args[0]:
+                    args[0] = re.sub(r'(\*+)\s*', r' \1', args[0])
+
+                first_arg = []
+                r = Re(r'^(.*\s+)(.*?\[.*\].*)$')
+                if args[0] and r.match(args[0]):
+                    args.pop(0)
+                    first_arg.extend(r.group(1))
+                    first_arg.append(r.group(2))
+                else:
+                    first_arg = Re(r'\s+').split(args.pop(0))
+
+                args.insert(0, first_arg.pop())
+                dtype = ' '.join(first_arg)
+
+                for param in args:
+                    if Re(r'^(\*+)\s*(.*)').match(param):
+                        r = Re(r'^(\*+)\s*(.*)')
+                        if not r.match(param):
+                            self.emit_warning(ln, f"Invalid param: {param}")
+                            continue
+
+                        param = r.group(1)
+
+                        self.save_struct_actual(r.group(2))
+                        self.push_parameter(ln, decl_type, r.group(2),
+                                            f"{dtype} {r.group(1)}",
+                                            arg, declaration_name)
+
+                    elif Re(r'(.*?):(\w+)').search(param):
+                        r = Re(r'(.*?):(\w+)')
+                        if not r.match(param):
+                            self.emit_warning(ln, f"Invalid param: {param}")
+                            continue
+
+                        if dtype != "":  # Skip unnamed bit-fields
+                            self.save_struct_actual(r.group(1))
+                            self.push_parameter(ln, decl_type, r.group(1),
+                                                f"{dtype}:{r.group(2)}",
+                                                arg, declaration_name)
+                    else:
+                        self.save_struct_actual(param)
+                        self.push_parameter(ln, decl_type, param, dtype,
+                                            arg, declaration_name)
+
+    def check_sections(self, ln, decl_name, decl_type, sectcheck, prmscheck):
+        """
+        Check for errors inside sections, emitting warnings for described
+        parameters that are not found in the declaration.
+        """
+
+        sects = sectcheck.split()
+        prms = prmscheck.split()
+        err = False
+
+        for sx in range(len(sects)):                  # pylint: disable=C0200
+            err = True
+            for px in range(len(prms)):               # pylint: disable=C0200
+                prm_clean = prms[px]
+                prm_clean = Re(r'\[.*\]').sub('', prm_clean)
+                prm_clean = attribute.sub('', prm_clean)
+
+                # ignore array size in a parameter string;
+                # however, the original param string may contain
+                # spaces, e.g.:  addr[6 + 2]
+                # and this appears in @prms as "addr[6" since the
+                # parameter list is split at spaces;
+                # hence just ignore "[..." for the sections check;
+                prm_clean = Re(r'\[.*').sub('', prm_clean)
+
+                if prm_clean == sects[sx]:
+                    err = False
+                    break
+
+            if err:
+                if decl_type == 'function':
+                    dname = f"{decl_type} parameter"
+                else:
+                    dname = f"{decl_type} member"
+
+                self.emit_warning(ln,
+                                  f"Excess {dname} '{sects[sx]}' description in '{decl_name}'")
+
+    def check_return_section(self, ln, declaration_name, return_type):
+        """
+        If the function doesn't return void, warns about the lack of a
+        return description.
+        """
+
+        if not self.config.wreturn:
+            return
+
+        # Ignore an empty return type (It's a macro)
+        # Ignore functions with a "void" return type (but not "void *")
+        if not return_type or Re(r'void\s*\w*\s*$').search(return_type):
+            return
+
+        if not self.entry.sections.get("Return", None):
+            self.emit_warning(ln,
+                              f"No description found for return value of '{declaration_name}'")
+
+    def dump_struct(self, ln, proto):
+        """
+        Store an entry for a struct or union
+        """
+
+        type_pattern = r'(struct|union)'
+
+        qualifiers = [
+            "__attribute__",
+            "__packed",
+            "__aligned",
+            "____cacheline_aligned_in_smp",
+            "____cacheline_aligned",
+        ]
+
+        definition_body = r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) + ")?"
+        struct_members = Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
+
+        # Extract struct/union definition
+        members = None
+        declaration_name = None
+        decl_type = None
+
+        r = Re(type_pattern + r'\s+(\w+)\s*' + definition_body)
+        if r.search(proto):
+            decl_type = r.group(1)
+            declaration_name = r.group(2)
+            members = r.group(3)
+        else:
+            r = Re(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
+
+            if r.search(proto):
+                decl_type = r.group(1)
+                declaration_name = r.group(3)
+                members = r.group(2)
+
+        if not members:
+            self.emit_warning(ln, f"{proto} error: Cannot parse struct or union!")
+            self.config.errors += 1
+            return
+
+        if self.entry.identifier != declaration_name:
+            self.emit_warning(ln,
+                              f"expecting prototype for {decl_type} {self.entry.identifier}. Prototype was for {decl_type} {declaration_name} instead\n")
+            return
+
+        args_pattern = r'([^,)]+)'
+
+        sub_prefixes = [
+            (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), ''),
+            (Re(r'\/\*\s*private:.*', re.S | re.I), ''),
+
+            # Strip comments
+            (Re(r'\/\*.*?\*\/', re.S), ''),
+
+            # Strip attributes
+            (attribute, ' '),
+            (Re(r'\s*__aligned\s*\([^;]*\)', re.S), ' '),
+            (Re(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '),
+            (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '),
+            (Re(r'\s*__packed\s*', re.S), ' '),
+            (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '),
+            (Re(r'\s*____cacheline_aligned_in_smp', re.S), ' '),
+            (Re(r'\s*____cacheline_aligned', re.S), ' '),
+
+            # Unwrap struct_group macros based on this definition:
+            # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
+            # which has variants like: struct_group(NAME, MEMBERS...)
+            # Only MEMBERS arguments require documentation.
+            #
+            # Parsing them happens in two steps:
+            #
+            # 1. drop struct group arguments that aren't in MEMBERS,
+            #    storing them as STRUCT_GROUP(MEMBERS)
+            #
+            # 2. remove STRUCT_GROUP() ancillary macro.
+            #
+            # The original logic used to remove STRUCT_GROUP() using an
+            # advanced regex:
+            #
+            #   \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;
+            #
+            # with two patterns that are incompatible with
+            # Python re module, as it has:
+            #
+            #   - a recursive pattern: (?1)
+            #   - an atomic grouping: (?>...)
+            #
+            # I tried a simpler version, but it didn't work either:
+            #   \bSTRUCT_GROUP\(([^\)]+)\)[^;]*;
+            #
+            # as it doesn't properly match the closing parenthesis in some cases.
+            #
+            # So, a better solution was crafted: there's now a NestedMatch
+            # class that ensures that delimiters after a search are properly
+            # matched. So, the logic to drop STRUCT_GROUP() is handled
+            # separately.
+
+            (Re(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('),
+            (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GROUP('),
+            (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'struct \1 \2; STRUCT_GROUP('),
+            (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP('),
+
+            # Replace macros
+            #
+            # TODO: it is better to also move those to the NestedMatch logic,
+            # to ensure that parentheses are properly matched.
+
+            (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
+            (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
+            (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'),
+            (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'),
+            (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+            (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+            (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\1 \2[]'),
+            (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S), r'dma_addr_t \1'),
+            (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S), r'__u32 \1'),
+        ]
+
+        # Regexes here are guaranteed to have the end delimiter matching
+        # the start delimiter. Yet, right now, only one replace group
+        # is allowed.
+
+        sub_nested_prefixes = [
+            (re.compile(r'\bSTRUCT_GROUP\('), r'\1'),
+        ]
+
+        for search, sub in sub_prefixes:
+            members = search.sub(sub, members)
+
+        nested = NestedMatch()
+
+        for search, sub in sub_nested_prefixes:
+            members = nested.sub(search, sub, members)
+
+        # Keeps the original declaration as-is
+        declaration = members
+
+        # Split nested struct/union elements
+        #
+        # This loop was simpler in the original kernel-doc Perl version, as
+        #   while ($members =~ m/$struct_members/) { ... }
+        # re-reads the 'members' string on each iteration.
+        #
+        # Python behavior is different: findall() parses 'members' only once,
+        # creating a list of tuples from the first pass.
+        #
+        # In other words, a single pass won't get nested structs.
+        #
+        # So, we need an extra loop in Python to work around this
+        # re limitation.
+
+        while True:
+            tuples = struct_members.findall(members)
+            if not tuples:
+                break
+
+            for t in tuples:
+                newmember = ""
+                maintype = t[0]
+                s_ids = t[5]
+                content = t[3]
+
+                oldmember = "".join(t)
+
+                for s_id in s_ids.split(','):
+                    s_id = s_id.strip()
+
+                    newmember += f"{maintype} {s_id}; "
+                    s_id = Re(r'[:\[].*').sub('', s_id)
+                    s_id = Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id)
+
+                    for arg in content.split(';'):
+                        arg = arg.strip()
+
+                        if not arg:
+                            continue
+
+                        r = Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)')
+                        if r.match(arg):
+                            # Pointer-to-function
+                            dtype = r.group(1)
+                            name = r.group(2)
+                            extra = r.group(3)
+
+                            if not name:
+                                continue
+
+                            if not s_id:
+                                # Anonymous struct/union
+                                newmember += f"{dtype}{name}{extra}; "
+                            else:
+                                newmember += f"{dtype}{s_id}.{name}{extra}; "
+
+                        else:
+                            arg = arg.strip()
+                            # Handle bit-fields
+                            arg = Re(r':\s*\d+\s*').sub('', arg)
+
+                            # Handle arrays
+                            arg = Re(r'\[.*\]').sub('', arg)
+
+                            # Handle multiple IDs
+                            arg = Re(r'\s*,\s*').sub(',', arg)
+
+                            r = Re(r'(.*)\s+([\S+,]+)')
+
+                            if r.search(arg):
+                                dtype = r.group(1)
+                                names = r.group(2)
+                            else:
+                                newmember += f"{arg}; "
+                                continue
+
+                            for name in names.split(','):
+                                name = Re(r'^\s*\**(\S+)\s*').sub(r'\1', name).strip()
+
+                                if not name:
+                                    continue
+
+                                if not s_id:
+                                    # Anonymous struct/union
+                                    newmember += f"{dtype} {name}; "
+                                else:
+                                    newmember += f"{dtype} {s_id}.{name}; "
+
+                members = members.replace(oldmember, newmember)
+
+        # Ignore other nested elements, like enums
+        members = re.sub(r'(\{[^\{\}]*\})', '', members)
+
+        self.create_parameter_list(ln, decl_type, members, ';',
+                                   declaration_name)
+        self.check_sections(ln, declaration_name, decl_type,
+                            self.entry.sectcheck, self.entry.struct_actual)
+
+        # Adjust declaration for better display
+        declaration = Re(r'([\{;])').sub(r'\1\n', declaration)
+        declaration = Re(r'\}\s+;').sub('};', declaration)
+
+        # Better handle inlined enums
+        while True:
+            r = Re(r'(enum\s+\{[^\}]+),([^\n])')
+            if not r.search(declaration):
+                break
+
+            declaration = r.sub(r'\1,\n\2', declaration)
+
+        def_args = declaration.split('\n')
+        level = 1
+        declaration = ""
+        for clause in def_args:
+
+            clause = clause.strip()
+            clause = Re(r'\s+').sub(' ', clause, count=1)
+
+            if not clause:
+                continue
+
+            if '}' in clause and level > 1:
+                level -= 1
+
+            if not Re(r'^\s*#').match(clause):
+                declaration += "\t" * level
+
+            declaration += "\t" + clause + "\n"
+            if "{" in clause and "}" not in clause:
+                level += 1
+
+        self.output_declaration(decl_type, declaration_name,
+                                struct=declaration_name,
+                                module=self.entry.modulename,
+                                definition=declaration,
+                                parameterlist=self.entry.parameterlist,
+                                parameterdescs=self.entry.parameterdescs,
+                                parametertypes=self.entry.parametertypes,
+                                sectionlist=self.entry.sectionlist,
+                                sections=self.entry.sections,
+                                purpose=self.entry.declaration_purpose)
+
+    def dump_enum(self, ln, proto):
+        """
+        Stores an enum inside self.entries array.
+        """
+
+        # Ignore members marked private
+        proto = Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=re.S).sub('', proto)
+        proto = Re(r'\/\*\s*private:.*}', flags=re.S).sub('}', proto)
+
+        # Strip comments
+        proto = Re(r'\/\*.*?\*\/', flags=re.S).sub('', proto)
+
+        # Strip #define macros inside enums
+        proto = Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=re.S).sub('', proto)
+
+        members = None
+        declaration_name = None
+
+        r = Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;')
+        if r.search(proto):
+            declaration_name = r.group(2)
+            members = r.group(1).rstrip()
+        else:
+            r = Re(r'enum\s+(\w*)\s*\{(.*)\}')
+            if r.match(proto):
+                declaration_name = r.group(1)
+                members = r.group(2).rstrip()
+
+        if not members:
+            self.emit_warning(ln, f"{proto}: error: Cannot parse enum!")
+            self.config.errors += 1
+            return
+
+        if self.entry.identifier != declaration_name:
+            if self.entry.identifier == "":
+                self.emit_warning(ln,
+                                  f"{proto}: wrong kernel-doc identifier on prototype")
+            else:
+                self.emit_warning(ln,
+                                  f"expecting prototype for enum {self.entry.identifier}. Prototype was for enum {declaration_name} instead")
+            return
+
+        if not declaration_name:
+            declaration_name = "(anonymous)"
+
+        member_set = set()
+
+        members = Re(r'\([^;]*?[\)]').sub('', members)
+
+        for arg in members.split(','):
+            if not arg:
+                continue
+            arg = Re(r'^\s*(\w+).*').sub(r'\1', arg)
+            self.entry.parameterlist.append(arg)
+            if arg not in self.entry.parameterdescs:
+                self.entry.parameterdescs[arg] = self.undescribed
+                if self.show_warnings("enum", declaration_name):
+                    self.emit_warning(ln,
+                                      f"Enum value '{arg}' not described in enum '{declaration_name}'")
+            member_set.add(arg)
+
+        for k in self.entry.parameterdescs:
+            if k not in member_set:
+                if self.show_warnings("enum", declaration_name):
+                    self.emit_warning(ln,
+                                      f"Excess enum value '%{k}' description in '{declaration_name}'")
+
+        self.output_declaration('enum', declaration_name,
+                                enum=declaration_name,
+                                module=self.config.modulename,
+                                parameterlist=self.entry.parameterlist,
+                                parameterdescs=self.entry.parameterdescs,
+                                sectionlist=self.entry.sectionlist,
+                                sections=self.entry.sections,
+                                purpose=self.entry.declaration_purpose)
+
+    def dump_declaration(self, ln, prototype):
+        """
+        Stores a data declaration inside self.entries array.
+        """
+
+        if self.entry.decl_type == "enum":
+            self.dump_enum(ln, prototype)
+            return
+
+        if self.entry.decl_type == "typedef":
+            self.dump_typedef(ln, prototype)
+            return
+
+        if self.entry.decl_type in ["union", "struct"]:
+            self.dump_struct(ln, prototype)
+            return
+
+        # TODO: handle other types
+        self.output_declaration(self.entry.decl_type, prototype,
+                                entry=self.entry)
+
+    def dump_function(self, ln, prototype):
+        """
+        Stores a function or function macro inside self.entries array.
+        """
+
+        func_macro = False
+        return_type = ''
+        decl_type = 'function'
+
+        # Prefixes to be removed
+        sub_prefixes = [
+            (r"^static +", "", 0),
+            (r"^extern +", "", 0),
+            (r"^asmlinkage +", "", 0),
+            (r"^inline +", "", 0),
+            (r"^__inline__ +", "", 0),
+            (r"^__inline +", "", 0),
+            (r"^__always_inline +", "", 0),
+            (r"^noinline +", "", 0),
+            (r"^__FORTIFY_INLINE +", "", 0),
+            (r"__init +", "", 0),
+            (r"__init_or_module +", "", 0),
+            (r"__deprecated +", "", 0),
+            (r"__flatten +", "", 0),
+            (r"__meminit +", "", 0),
+            (r"__must_check +", "", 0),
+            (r"__weak +", "", 0),
+            (r"__sched +", "", 0),
+            (r"_noprof", "", 0),
+            (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0),
+            (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", 0),
+            (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0),
+            (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2", 0),
+            (r"__attribute_const__ +", "", 0),
+
+            # It seems that Python support for re.X is broken:
+            # At least for me (Python 3.13), this didn't work
+#            (r"""
+#              __attribute__\s*\(\(
+#                (?:
+#                    [\w\s]+          # attribute name
+#                    (?:\([^)]*\))?   # attribute arguments
+#                    \s*,?            # optional comma at the end
+#                )+
+#              \)\)\s+
+#             """, "", re.X),
+
+            # So, remove whitespaces and comments from it
+            (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+", "", 0),
+        ]
+
+        for search, sub, flags in sub_prefixes:
+            prototype = Re(search, flags).sub(sub, prototype)
+
+        # Macros are a special case, as they change the prototype format
+        new_proto = Re(r"^#\s*define\s+").sub("", prototype)
+        if new_proto != prototype:
+            is_define_proto = True
+            prototype = new_proto
+        else:
+            is_define_proto = False
+
+        # Yes, this truly is vile.  We are looking for:
+        # 1. Return type (may be nothing if we're looking at a macro)
+        # 2. Function name
+        # 3. Function parameters.
+        #
+        # All the while we have to watch out for function pointer parameters
+        # (which IIRC is what the two sections are for), C types (these
+        # regexps don't even start to express all the possibilities), and
+        # so on.
+        #
+        # If you mess with these regexps, it's a good idea to check that
+        # the following functions' documentation still comes out right:
+        # - parport_register_device (function pointer parameters)
+        # - atomic_set (macro)
+        # - pci_match_device, __copy_to_user (long return type)
+
+        name = r'[a-zA-Z0-9_~:]+'
+        prototype_end1 = r'[^\(]*'
+        prototype_end2 = r'[^\{]*'
+        prototype_end = fr'\(({prototype_end1}|{prototype_end2})\)'
+
+        # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing group.
+        # So, this needs to be mapped in Python with (?:...)? or (?:...)+
+
+        type1 = r'(?:[\w\s]+)?'
+        type2 = r'(?:[\w\s]+\*+)+'
+
+        found = False
+
+        if is_define_proto:
+            r = Re(r'^()(' + name + r')\s+')
+
+            if r.search(prototype):
+                return_type = ''
+                declaration_name = r.group(2)
+                func_macro = True
+
+                found = True
+
+        if not found:
+            patterns = [
+                rf'^()({name})\s*{prototype_end}',
+                rf'^({type1})\s+({name})\s*{prototype_end}',
+                rf'^({type2})\s*({name})\s*{prototype_end}',
+            ]
+
+            for p in patterns:
+                r = Re(p)
+
+                if r.match(prototype):
+
+                    return_type = r.group(1)
+                    declaration_name = r.group(2)
+                    args = r.group(3)
+
+                    self.create_parameter_list(ln, decl_type, args, ',',
+                                               declaration_name)
+
+                    found = True
+                    break
+        if not found:
+            self.emit_warning(ln,
+                              f"cannot understand function prototype: '{prototype}'")
+            return
+
+        if self.entry.identifier != declaration_name:
+            self.emit_warning(ln,
+                              f"expecting prototype for {self.entry.identifier}(). Prototype was for {declaration_name}() instead")
+            return
+
+        prms = " ".join(self.entry.parameterlist)
+        self.check_sections(ln, declaration_name, "function",
+                            self.entry.sectcheck, prms)
+
+        self.check_return_section(ln, declaration_name, return_type)
+
+        if 'typedef' in return_type:
+            self.output_declaration(decl_type, declaration_name,
+                                    function=declaration_name,
+                                    typedef=True,
+                                    module=self.config.modulename,
+                                    functiontype=return_type,
+                                    parameterlist=self.entry.parameterlist,
+                                    parameterdescs=self.entry.parameterdescs,
+                                    parametertypes=self.entry.parametertypes,
+                                    sectionlist=self.entry.sectionlist,
+                                    sections=self.entry.sections,
+                                    purpose=self.entry.declaration_purpose,
+                                    func_macro=func_macro)
+        else:
+            self.output_declaration(decl_type, declaration_name,
+                                    function=declaration_name,
+                                    typedef=False,
+                                    module=self.config.modulename,
+                                    functiontype=return_type,
+                                    parameterlist=self.entry.parameterlist,
+                                    parameterdescs=self.entry.parameterdescs,
+                                    parametertypes=self.entry.parametertypes,
+                                    sectionlist=self.entry.sectionlist,
+                                    sections=self.entry.sections,
+                                    purpose=self.entry.declaration_purpose,
+                                    func_macro=func_macro)
+
+    def dump_typedef(self, ln, proto):
+        """
+        Stores a typedef inside self.entries array.
+        """
+
+        typedef_type = r'((?:\s+[\w\*]+\b){1,8})\s*'
+        typedef_ident = r'\*?\s*(\w\S+)\s*'
+        typedef_args = r'\s*\((.*)\);'
+
+        typedef1 = Re(r'typedef' + typedef_type + r'\(' + typedef_ident + r'\)' + typedef_args)
+        typedef2 = Re(r'typedef' + typedef_type + typedef_ident + typedef_args)
+
+        # Strip comments
+        proto = Re(r'/\*.*?\*/', flags=re.S).sub('', proto)
+
+        # Parse function typedef prototypes
+        for r in [typedef1, typedef2]:
+            if not r.match(proto):
+                continue
+
+            return_type = r.group(1).strip()
+            declaration_name = r.group(2)
+            args = r.group(3)
+
+            if self.entry.identifier != declaration_name:
+                self.emit_warning(ln,
+                                  f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
+                return
+
+            decl_type = 'function'
+            self.create_parameter_list(ln, decl_type, args, ',', declaration_name)
+
+            self.output_declaration(decl_type, declaration_name,
+                                    function=declaration_name,
+                                    typedef=True,
+                                    module=self.entry.modulename,
+                                    functiontype=return_type,
+                                    parameterlist=self.entry.parameterlist,
+                                    parameterdescs=self.entry.parameterdescs,
+                                    parametertypes=self.entry.parametertypes,
+                                    sectionlist=self.entry.sectionlist,
+                                    sections=self.entry.sections,
+                                    purpose=self.entry.declaration_purpose)
+            return
+
+        # Handle nested parentheses or brackets
+        r = Re(r'(\(*.\)\s*|\[*.\]\s*);$')
+        while r.search(proto):
+            proto = r.sub('', proto)
+
+        # Parse simple typedefs
+        r = Re(r'typedef.*\s+(\w+)\s*;')
+        if r.match(proto):
+            declaration_name = r.group(1)
+
+            if self.entry.identifier != declaration_name:
+                self.emit_warning(ln, f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
+                return
+
+            self.output_declaration('typedef', declaration_name,
+                                    typedef=declaration_name,
+                                    module=self.entry.modulename,
+                                    sectionlist=self.entry.sectionlist,
+                                    sections=self.entry.sections,
+                                    purpose=self.entry.declaration_purpose)
+            return
+
+        self.emit_warning(ln, "error: Cannot parse typedef!")
+        self.config.errors += 1
+
+    @staticmethod
+    def process_export(function_table, line):
+        """
+        process EXPORT_SYMBOL* tags
+
+        This method is called both internally and externally, so, it
+        doesn't use self.
+        """
+
+        if export_symbol.search(line):
+            symbol = export_symbol.group(2)
+            function_table.add(symbol)
+
+        if export_symbol_ns.search(line):
+            symbol = export_symbol_ns.group(2)
+            function_table.add(symbol)
+
+    def process_normal(self, ln, line):
+        """
+        STATE_NORMAL: looking for the /** to begin everything.
+        """
+
+        if not doc_start.match(line):
+            return
+
+        # start a new entry
+        self.reset_state(ln + 1)
+        self.entry.in_doc_sect = False
+
+        # next line is always the function name
+        self.state = self.STATE_NAME
+
+    def process_name(self, ln, line):
+        """
+        STATE_NAME: Looking for the "name - description" line
+        """
+
+        if doc_block.search(line):
+            self.entry.new_start_line = ln
+
+            if not doc_block.group(1):
+                self.entry.section = self.section_intro
+            else:
+                self.entry.section = doc_block.group(1)
+
+            self.state = self.STATE_DOCBLOCK
+            return
+
+        if doc_decl.search(line):
+            self.entry.identifier = doc_decl.group(1)
+            self.entry.is_kernel_comment = False
+
+            decl_start = str(doc_com)       # comment block asterisk
+            fn_type = r"(?:\w+\s*\*\s*)?"  # type (for non-functions)
+            parenthesis = r"(?:\(\w*\))?"   # optional parenthesis on function
+            decl_end = r"(?:[-:].*)"         # end of the name part
+
+            # test for pointer declaration type, foo * bar() - desc
+            r = Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}?$")
+            if r.search(line):
+                self.entry.identifier = r.group(1)
+
+            # Test for data declaration
+            r = Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)")
+            if r.search(line):
+                self.entry.decl_type = r.group(1)
+                self.entry.identifier = r.group(2)
+                self.entry.is_kernel_comment = True
+            else:
+                # Look for foo() or static void foo() - description;
+                # or misspelt identifier
+
+                r1 = Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s*{decl_end}?$")
+                r2 = Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis}\s*{decl_end}$")
+
+                for r in [r1, r2]:
+                    if r.search(line):
+                        self.entry.identifier = r.group(1)
+                        self.entry.decl_type = "function"
+
+                        r = Re(r"define\s+")
+                        self.entry.identifier = r.sub("", self.entry.identifier)
+                        self.entry.is_kernel_comment = True
+                        break
+
+            self.entry.identifier = self.entry.identifier.strip(" ")
+
+            self.state = self.STATE_BODY
+
+            # if there are no @param blocks, set up the default section here
+            self.entry.section = self.section_default
+            self.entry.new_start_line = ln + 1
+
+            r = Re("[-:](.*)")
+            if r.search(line):
+                # strip leading/trailing/multiple spaces
+                self.entry.descr = r.group(1).strip(" ")
+
+                r = Re(r"\s+")
+                self.entry.descr = r.sub(" ", self.entry.descr)
+                self.entry.declaration_purpose = self.entry.descr
+                self.state = self.STATE_BODY_MAYBE
+            else:
+                self.entry.declaration_purpose = ""
+
+            if not self.entry.is_kernel_comment:
+                self.emit_warning(ln,
+                                  f"This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst\n{line}")
+                self.state = self.STATE_NORMAL
+
+            if not self.entry.declaration_purpose and self.config.wshort_desc:
+                self.emit_warning(ln,
+                                  f"missing initial short description on line:\n{line}")
+
+            if not self.entry.identifier and self.entry.decl_type != "enum":
+                self.emit_warning(ln,
+                                  f"wrong kernel-doc identifier on line:\n{line}")
+                self.state = self.STATE_NORMAL
+
+            if self.config.verbose:
+                self.emit_warning(ln,
+                                  f"Scanning doc for {self.entry.decl_type} {self.entry.identifier}",
+                                  warning=False)
+
+            return
+
+        # Failed to find an identifier. Emit a warning
+        self.emit_warning(ln, f"Cannot find identifier on line:\n{line}")
+
+    def process_body(self, ln, line):
+        """
+        STATE_BODY and STATE_BODY_MAYBE: the bulk of a kerneldoc comment.
+        """
+
+        if self.state == self.STATE_BODY_WITH_BLANK_LINE:
+            r = Re(r"\s*\*\s?\S")
+            if r.match(line):
+                self.dump_section()
+                self.entry.section = self.section_default
+                self.entry.new_start_line = ln
+                self.entry.contents = ""
+
+        if doc_sect.search(line):
+            self.entry.in_doc_sect = True
+            newsection = doc_sect.group(1)
+
+            if newsection.lower() in ["description", "context"]:
+                newsection = newsection.title()
+
+            # Special case: @return is a section, not a param description
+            if newsection.lower() in ["@return", "@returns",
+                                      "return", "returns"]:
+                newsection = "Return"
+
+            # Perl kernel-doc has a check here for contents before sections.
+            # the logic there is always false, as in_doc_sect variable is
+            # always true. So, just don't implement Wcontents_before_sections
+
+            # .title()
+            newcontents = doc_sect.group(2)
+            if not newcontents:
+                newcontents = ""
+
+            if self.entry.contents.strip("\n"):
+                self.dump_section()
+
+            self.entry.new_start_line = ln
+            self.entry.section = newsection
+            self.entry.leading_space = None
+
+            self.entry.contents = newcontents.lstrip()
+            if self.entry.contents:
+                self.entry.contents += "\n"
+
+            self.state = self.STATE_BODY
+            return
+
+        if doc_end.search(line):
+            self.dump_section()
+
+            # Look for doc_com + <text> + doc_end:
+            r = Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
+            if r.match(line):
+                self.emit_warning(ln, f"suspicious ending line: {line}")
+
+            self.entry.prototype = ""
+            self.entry.new_start_line = ln + 1
+
+            self.state = self.STATE_PROTO
+            return
+
+        if doc_content.search(line):
+            cont = doc_content.group(1)
+
+            if cont == "":
+                if self.entry.section == self.section_context:
+                    self.dump_section()
+
+                    self.entry.new_start_line = ln
+                    self.state = self.STATE_BODY
+                else:
+                    if self.entry.section != self.section_default:
+                        self.state = self.STATE_BODY_WITH_BLANK_LINE
+                    else:
+                        self.state = self.STATE_BODY
+
+                    self.entry.contents += "\n"
+
+            elif self.state == self.STATE_BODY_MAYBE:
+
+                # Continued declaration purpose
+                self.entry.declaration_purpose = self.entry.declaration_purpose.rstrip()
+                self.entry.declaration_purpose += " " + cont
+
+                r = Re(r"\s+")
+                self.entry.declaration_purpose = r.sub(' ',
+                                                       self.entry.declaration_purpose)
+
+            else:
+                if self.entry.section.startswith('@') or        \
+                   self.entry.section == self.section_context:
+                    if self.entry.leading_space is None:
+                        r = Re(r'^(\s+)')
+                        if r.match(cont):
+                            self.entry.leading_space = len(r.group(1))
+                        else:
+                            self.entry.leading_space = 0
+
+                    # Double-check that the leading characters are really spaces
+                    pos = 0
+                    for i in range(0, self.entry.leading_space):
+                        if cont[i] != " ":
+                            break
+                        pos += 1
+
+                    cont = cont[pos:]
+
+                    # NEW LOGIC:
+                    # In case it is different, update it
+                    if self.entry.leading_space != pos:
+                        self.entry.leading_space = pos
+
+                self.entry.contents += cont + "\n"
+            return
+
+        # Unknown line, ignore
+        self.emit_warning(ln, f"bad line: {line}")
+
+    def process_inline(self, ln, line):
+        """STATE_INLINE: docbook comments within a prototype."""
+
+        if self.inline_doc_state == self.STATE_INLINE_NAME and \
+           doc_inline_sect.search(line):
+            self.entry.section = doc_inline_sect.group(1)
+            self.entry.new_start_line = ln
+
+            self.entry.contents = doc_inline_sect.group(2).lstrip()
+            if self.entry.contents != "":
+                self.entry.contents += "\n"
+
+            self.inline_doc_state = self.STATE_INLINE_TEXT
+            # Documentation block end */
+            return
+
+        if doc_inline_end.search(line):
+            if self.entry.contents not in ["", "\n"]:
+                self.dump_section()
+
+            self.state = self.STATE_PROTO
+            self.inline_doc_state = self.STATE_INLINE_NA
+            return
+
+        if doc_content.search(line):
+            if self.inline_doc_state == self.STATE_INLINE_TEXT:
+                self.entry.contents += doc_content.group(1) + "\n"
+                if not self.entry.contents.strip(" ").rstrip("\n"):
+                    self.entry.contents = ""
+
+            elif self.inline_doc_state == self.STATE_INLINE_NAME:
+                self.emit_warning(ln,
+                                  f"Incorrect use of kernel-doc format: {line}")
+
+                self.inline_doc_state = self.STATE_INLINE_ERROR
+
+    def syscall_munge(self, ln, proto):         # pylint: disable=W0613
+        """
+        Handle syscall definitions
+        """
+
+        is_void = False
+
+        # Strip newlines/CR's
+        proto = re.sub(r'[\r\n]+', ' ', proto)
+
+        # Check if it's a SYSCALL_DEFINE0
+        if 'SYSCALL_DEFINE0' in proto:
+            is_void = True
+
+        # Replace SYSCALL_DEFINE with correct return type & function name
+        proto = Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto)
+
+        r = Re(r'long\s+(sys_.*?),')
+        if r.search(proto):
+            proto = proto.replace(',', '(', 1)
+        elif is_void:
+            proto = proto.replace(')', '(void)', 1)
+
+        # Now delete all of the odd-numbered commas in the proto
+        # so that argument types & names don't have a comma between them
+        count = 0
+        length = len(proto)
+
+        if is_void:
+            length = 0  # skip the loop if is_void
+
+        for ix in range(length):
+            if proto[ix] == ',':
+                count += 1
+                if count % 2 == 1:
+                    proto = proto[:ix] + ' ' + proto[ix + 1:]
+
+        return proto
+
+    def tracepoint_munge(self, ln, proto):
+        """
+        Handle tracepoint definitions
+        """
+
+        tracepointname = None
+        tracepointargs = None
+
+        # Match tracepoint name based on different patterns
+        r = Re(r'TRACE_EVENT\((.*?),')
+        if r.search(proto):
+            tracepointname = r.group(1)
+
+        r = Re(r'DEFINE_SINGLE_EVENT\((.*?),')
+        if r.search(proto):
+            tracepointname = r.group(1)
+
+        r = Re(r'DEFINE_EVENT\((.*?),(.*?),')
+        if r.search(proto):
+            tracepointname = r.group(2)
+
+        if tracepointname:
+            tracepointname = tracepointname.lstrip()
+
+        r = Re(r'TP_PROTO\((.*?)\)')
+        if r.search(proto):
+            tracepointargs = r.group(1)
+
+        if not tracepointname or not tracepointargs:
+            self.emit_warning(ln,
+                              f"Unrecognized tracepoint format:\n{proto}\n")
+        else:
+            proto = f"static inline void trace_{tracepointname}({tracepointargs})"
+            self.entry.identifier = f"trace_{self.entry.identifier}"
+
+        return proto
+
+    def process_proto_function(self, ln, line):
+        """Ancillary routine to process a function prototype"""
+
+        # strip C99-style comments to end of line
+        r = Re(r"\/\/.*$", re.S)
+        line = r.sub('', line)
+
+        if Re(r'\s*#\s*define').match(line):
+            self.entry.prototype = line
+        elif line.startswith('#'):
+            # Strip other macros like #ifdef/#ifndef/#endif/...
+            pass
+        else:
+            r = Re(r'([^\{]*)')
+            if r.match(line):
+                self.entry.prototype += r.group(1) + " "
+
+        if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line):
+            # strip comments
+            r = Re(r'/\*.*?\*/')
+            self.entry.prototype = r.sub('', self.entry.prototype)
+
+            # strip newlines/cr's
+            r = Re(r'[\r\n]+')
+            self.entry.prototype = r.sub(' ', self.entry.prototype)
+
+            # strip leading spaces
+            r = Re(r'^\s+')
+            self.entry.prototype = r.sub('', self.entry.prototype)
+
+            # Handle prototypes for function pointers like:
+            #       int (*pcs_config)(struct foo)
+
+            r = Re(r'^(\S+\s+)\(\s*\*(\S+)\)')
+            self.entry.prototype = r.sub(r'\1\2', self.entry.prototype)
+
+            if 'SYSCALL_DEFINE' in self.entry.prototype:
+                self.entry.prototype = self.syscall_munge(ln,
+                                                          self.entry.prototype)
+
+            r = Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT')
+            if r.search(self.entry.prototype):
+                self.entry.prototype = self.tracepoint_munge(ln,
+                                                             self.entry.prototype)
+
+            self.dump_function(ln, self.entry.prototype)
+            self.reset_state(ln)
+
+    def process_proto_type(self, ln, line):
+        """Ancillary routine to process a type"""
+
+        # Strip newlines/cr's.
+        line = Re(r'[\r\n]+', re.S).sub(' ', line)
+
+        # Strip leading spaces
+        line = Re(r'^\s+', re.S).sub('', line)
+
+        # Strip trailing spaces
+        line = Re(r'\s+$', re.S).sub('', line)
+
+        # Strip C99-style comments to the end of the line
+        line = Re(r"\/\/.*$", re.S).sub('', line)
+
+        # To distinguish preprocessor directive from regular declaration later.
+        if line.startswith('#'):
+            line += ";"
+
+        r = Re(r'([^\{\};]*)([\{\};])(.*)')
+        while True:
+            if r.search(line):
+                if self.entry.prototype:
+                    self.entry.prototype += " "
+                self.entry.prototype += r.group(1) + r.group(2)
+
+                self.entry.brcount += r.group(2).count('{')
+                self.entry.brcount -= r.group(2).count('}')
+
+                self.entry.brcount = max(self.entry.brcount, 0)
+
+                if r.group(2) == ';' and self.entry.brcount == 0:
+                    self.dump_declaration(ln, self.entry.prototype)
+                    self.reset_state(ln)
+                    break
+
+                line = r.group(3)
+            else:
+                self.entry.prototype += line
+                break
+
+    def process_proto(self, ln, line):
+        """STATE_PROTO: reading a function/whatever prototype."""
+
+        if doc_inline_oneline.search(line):
+            self.entry.section = doc_inline_oneline.group(1)
+            self.entry.contents = doc_inline_oneline.group(2)
+
+            if self.entry.contents != "":
+                self.entry.contents += "\n"
+                self.dump_section(start_new=False)
+
+        elif doc_inline_start.search(line):
+            self.state = self.STATE_INLINE
+            self.inline_doc_state = self.STATE_INLINE_NAME
+
+        elif self.entry.decl_type == 'function':
+            self.process_proto_function(ln, line)
+
+        else:
+            self.process_proto_type(ln, line)
+
+    def process_docblock(self, ln, line):
+        """STATE_DOCBLOCK: within a DOC: block."""
+
+        if doc_end.search(line):
+            self.dump_section()
+            self.output_declaration("doc", None,
+                                    sectionlist=self.entry.sectionlist,
+                                    sections=self.entry.sections, module=self.config.modulename)
+            self.reset_state(ln)
+
+        elif doc_content.search(line):
+            self.entry.contents += doc_content.group(1) + "\n"
+
+    def run(self):
+        """
+        Open and process each line of a C source file.
+        The parsing is controlled via a state machine, and the line is passed
+        to a different process function depending on the state. The process
+        function may update the state as needed.
+        """
+
+        cont = False
+        prev = ""
+        prev_ln = None
+
+        try:
+            with open(self.fname, "r", encoding="utf8",
+                      errors="backslashreplace") as fp:
+                for ln, line in enumerate(fp):
+
+                    line = line.expandtabs().strip("\n")
+
+                    # Group continuation lines on prototypes
+                    if self.state == self.STATE_PROTO:
+                        if line.endswith("\\"):
+                            prev += line.removesuffix("\\")
+                            cont = True
+
+                            if not prev_ln:
+                                prev_ln = ln
+
+                            continue
+
+                        if cont:
+                            ln = prev_ln
+                            line = prev + line
+                            prev = ""
+                            cont = False
+                            prev_ln = None
+
+                    self.config.log.debug("%d %s%s: %s",
+                                          ln, self.st_name[self.state],
+                                          self.st_inline_name[self.inline_doc_state],
+                                          line)
+
+                    # TODO: not all states allow EXPORT_SYMBOL*, so this
+                    # can be optimized later on to speedup parsing
+                    self.process_export(self.config.function_table, line)
+
+                    # Hand this line to the appropriate state handler
+                    if self.state == self.STATE_NORMAL:
+                        self.process_normal(ln, line)
+                    elif self.state == self.STATE_NAME:
+                        self.process_name(ln, line)
+                    elif self.state in [self.STATE_BODY, self.STATE_BODY_MAYBE,
+                                        self.STATE_BODY_WITH_BLANK_LINE]:
+                        self.process_body(ln, line)
+                    elif self.state == self.STATE_INLINE:  # scanning for inline parameters
+                        self.process_inline(ln, line)
+                    elif self.state == self.STATE_PROTO:
+                        self.process_proto(ln, line)
+                    elif self.state == self.STATE_DOCBLOCK:
+                        self.process_docblock(ln, line)
+        except OSError:
+            self.config.log.error(f"Error: Cannot open file {self.fname}")
+            self.config.errors += 1
-- 
2.49.0



* [PATCH v3 09/33] scripts/kernel-doc.py: move KernelFiles class to a separate file
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (7 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 08/33] scripts/kernel-doc.py: move KernelDoc class " Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 10/33] scripts/kernel-doc.py: move output classes " Mauro Carvalho Chehab
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

The KernelFiles class is the main dispatcher which parses each
source file.

In preparation for letting the kerneldoc Sphinx extension import
Python libraries, move the KernelFiles class, together with the
GlobSourceFiles helper it relies on, to a separate file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
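Note for reviewers: the directory handling moved by this patch is a
lazy generator built on os.scandir(), so files can be parsed as soon as
they are found. A minimal standalone sketch of that pattern is below. It
is not the code from this patch: walk_sources() and find_sources() are
hypothetical names used only for illustration, and the duplicate
filtering (done by KernelFiles.parse() in the patch) is folded into the
generator here for brevity.

```python
import os
import tempfile


def walk_sources(dirname, extensions=(".c", ".h")):
    """Lazily yield files under dirname whose names end with extensions."""
    with os.scandir(dirname) as it:
        for entry in it:
            name = os.path.join(dirname, entry.name)
            if entry.is_dir():
                # Recurse; results are yielded as soon as they are found
                yield from walk_sources(name, extensions)
            elif entry.is_file() and entry.name.endswith(extensions):
                yield name


def find_sources(file_list, not_found_cb=None, extensions=(".c", ".h")):
    """Expand a mixed list of files and directories, skipping duplicates."""
    seen = set()
    for fname in file_list:
        if os.path.isdir(fname):
            candidates = walk_sources(fname, extensions)
        elif os.path.isfile(fname):
            # Files named explicitly are accepted whatever their extension
            candidates = [fname]
        else:
            if not_found_cb:
                not_found_cb(fname)
            continue

        for path in candidates:
            if path not in seen:
                seen.add(path)
                yield path


if __name__ == "__main__":
    # Small demo on a throwaway tree
    with tempfile.TemporaryDirectory() as top:
        open(os.path.join(top, "demo.c"), "w").close()
        for path in find_sources([top]):
            print(path)
```

Since both functions are generators, a caller can start handling the
first file before the rest of the tree has been scanned, which is the
behaviour the GlobSourceFiles docstring asks for.
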
 scripts/kernel-doc.py          | 220 +--------------------------
 scripts/lib/kdoc/kdoc_files.py | 270 +++++++++++++++++++++++++++++++++
 2 files changed, 271 insertions(+), 219 deletions(-)
 create mode 100755 scripts/lib/kdoc/kdoc_files.py

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index f030a36a165b..d09ada2d862a 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -119,6 +119,7 @@ sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
 
 from kdoc_parser import KernelDoc, type_param
 from kdoc_re import Re
+from kdoc_files import KernelFiles
 
 function_pointer = Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
 
@@ -143,225 +144,6 @@ type_member = Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
 type_fallback = Re(r"\&([_\w]+)", cache=False)
 type_member_func = type_member + Re(r"\(\)", cache=False)
 
-class GlobSourceFiles:
-    """
-    Parse C source code file names and directories via an Interactor.
-
-    """
-
-    def __init__(self, srctree=None, valid_extensions=None):
-        """
-        Initialize valid extensions with a tuple.
-
-        If not defined, assume default C extensions (.c and .h)
-
-        It would be possible to use python's glob function, but it is
-        very slow, and it is not interactive. So, it would wait to read all
-        directories before actually do something.
-
-        So, let's use our own implementation.
-        """
-
-        if not valid_extensions:
-            self.extensions = (".c", ".h")
-        else:
-            self.extensions = valid_extensions
-
-        self.srctree = srctree
-
-    def _parse_dir(self, dirname):
-        """Internal function to parse files recursively"""
-
-        with os.scandir(dirname) as obj:
-            for entry in obj:
-                name = os.path.join(dirname, entry.name)
-
-                if entry.is_dir():
-                    yield from self._parse_dir(name)
-
-                if not entry.is_file():
-                    continue
-
-                basename = os.path.basename(name)
-
-                if not basename.endswith(self.extensions):
-                    continue
-
-                yield name
-
-    def parse_files(self, file_list, file_not_found_cb):
-        for fname in file_list:
-            if self.srctree:
-                f = os.path.join(self.srctree, fname)
-            else:
-                f = fname
-
-            if os.path.isdir(f):
-                yield from self._parse_dir(f)
-            elif os.path.isfile(f):
-                yield f
-            elif file_not_found_cb:
-                file_not_found_cb(fname)
-
-
-class KernelFiles():
-
-    def parse_file(self, fname):
-
-        doc = KernelDoc(self.config, fname)
-        doc.run()
-
-        return doc
-
-    def process_export_file(self, fname):
-        try:
-            with open(fname, "r", encoding="utf8",
-                      errors="backslashreplace") as fp:
-                for line in fp:
-                    KernelDoc.process_export(self.config.function_table, line)
-
-        except IOError:
-            print(f"Error: Cannot open fname {fname}", fname=sys.stderr)
-            self.config.errors += 1
-
-    def file_not_found_cb(self, fname):
-        self.config.log.error("Cannot find file %s", fname)
-        self.config.errors += 1
-
-    def __init__(self, files=None, verbose=False, out_style=None,
-                 werror=False, wreturn=False, wshort_desc=False,
-                 wcontents_before_sections=False,
-                 logger=None, modulename=None, export_file=None):
-        """Initialize startup variables and parse all files"""
-
-
-        if not verbose:
-            verbose = bool(os.environ.get("KBUILD_VERBOSE", 0))
-
-        if not modulename:
-            modulename = "Kernel API"
-
-        dt = datetime.now()
-        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
-            # use UTC TZ
-            to_zone = tz.gettz('UTC')
-            dt = dt.astimezone(to_zone)
-
-        if not werror:
-            kcflags = os.environ.get("KCFLAGS", None)
-            if kcflags:
-                match = re.search(r"(\s|^)-Werror(\s|$)/", kcflags)
-                if match:
-                    werror = True
-
-            # reading this variable is for backwards compat just in case
-            # someone was calling it with the variable from outside the
-            # kernel's build system
-            kdoc_werror = os.environ.get("KDOC_WERROR", None)
-            if kdoc_werror:
-                werror = kdoc_werror
-
-        # Set global config data used on all files
-        self.config = argparse.Namespace
-
-        self.config.verbose = verbose
-        self.config.werror = werror
-        self.config.wreturn = wreturn
-        self.config.wshort_desc = wshort_desc
-        self.config.wcontents_before_sections = wcontents_before_sections
-        self.config.modulename = modulename
-
-        self.config.function_table = set()
-        self.config.source_map = {}
-
-        if not logger:
-            self.config.log = logging.getLogger("kernel-doc")
-        else:
-            self.config.log = logger
-
-        self.config.kernel_version = os.environ.get("KERNELVERSION",
-                                                    "unknown kernel version'")
-        self.config.src_tree = os.environ.get("SRCTREE", None)
-
-        self.out_style = out_style
-        self.export_file = export_file
-
-        # Initialize internal variables
-
-        self.config.errors = 0
-        self.results = []
-
-        self.file_list = files
-        self.files = set()
-
-    def parse(self):
-        """
-        Parse all files
-        """
-
-        glob = GlobSourceFiles(srctree=self.config.src_tree)
-
-        # Let's use a set here to avoid duplicating files
-
-        for fname in glob.parse_files(self.file_list, self.file_not_found_cb):
-            if fname in self.files:
-                continue
-
-            self.files.add(fname)
-
-            res = self.parse_file(fname)
-            self.results.append((res.fname, res.entries))
-
-        if not self.files:
-            sys.exit(1)
-
-        # If a list of export files was provided, parse EXPORT_SYMBOL*
-        # from the ones not already parsed
-
-        if self.export_file:
-            files = self.files
-
-            glob = GlobSourceFiles(srctree=self.config.src_tree)
-
-            for fname in glob.parse_files(self.export_file,
-                                          self.file_not_found_cb):
-                if fname not in files:
-                    files.add(fname)
-
-                    self.process_export_file(fname)
-
-    def out_msg(self, fname, name, arg):
-        # TODO: filter out unwanted parts
-
-        return self.out_style.msg(fname, name, arg)
-
-    def msg(self, enable_lineno=False, export=False, internal=False,
-            symbol=None, nosymbol=None):
-
-        function_table = self.config.function_table
-
-        if symbol:
-            for s in symbol:
-                function_table.add(s)
-
-        # Output none mode: only warnings will be shown
-        if not self.out_style:
-            return
-
-        self.out_style.set_config(self.config)
-
-        self.out_style.set_filter(export, internal, symbol, nosymbol,
-                                  function_table, enable_lineno)
-
-        for fname, arg_tuple in self.results:
-            for name, arg in arg_tuple:
-                if self.out_msg(fname, name, arg):
-                    ln = arg.get("ln", 0)
-                    dtype = arg.get('type', "")
-
-                    self.config.log.warning("%s:%d Can't handle %s",
-                                            fname, ln, dtype)
-
 
 class OutputFormat:
     # output mode.
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
new file mode 100755
index 000000000000..8bcdc7ead984
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -0,0 +1,270 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=R0903,R0913,R0914,R0917
+
+# TODO: implement warning filtering
+
+"""
+Parse kernel-doc tags on multiple kernel source files.
+"""
+
+import argparse
+import logging
+import os
+import re
+import sys
+from datetime import datetime
+
+from dateutil import tz
+
+from kdoc_parser import KernelDoc
+
+
+class GlobSourceFiles:
+    """
+    Parse C source code file names and directories via an iterator.
+    """
+
+    def __init__(self, srctree=None, valid_extensions=None):
+        """
+        Initialize valid extensions with a tuple.
+
+        If not defined, assume default C extensions (.c and .h)
+
+        It would be possible to use python's glob function, but it is
+        very slow, and it is not an iterator. So, it would read all
+        directories before actually doing something.
+
+        So, let's use our own implementation.
+        """
+
+        if not valid_extensions:
+            self.extensions = (".c", ".h")
+        else:
+            self.extensions = valid_extensions
+
+        self.srctree = srctree
+
+    def _parse_dir(self, dirname):
+        """Internal function to parse files recursively"""
+
+        with os.scandir(dirname) as obj:
+            for entry in obj:
+                name = os.path.join(dirname, entry.name)
+
+                if entry.is_dir():
+                    yield from self._parse_dir(name)
+
+                if not entry.is_file():
+                    continue
+
+                basename = os.path.basename(name)
+
+                if not basename.endswith(self.extensions):
+                    continue
+
+                yield name
+
+    def parse_files(self, file_list, file_not_found_cb):
+        """
+        Define an iterator to parse all source files from file_list,
+        handling directories if any
+        """
+
+        for fname in file_list:
+            if self.srctree:
+                f = os.path.join(self.srctree, fname)
+            else:
+                f = fname
+
+            if os.path.isdir(f):
+                yield from self._parse_dir(f)
+            elif os.path.isfile(f):
+                yield f
+            elif file_not_found_cb:
+                file_not_found_cb(fname)
+
+
+class KernelFiles():
+    """
+    Parse kernel-doc tags on multiple kernel source files.
+    """
+
+    def parse_file(self, fname):
+        """
+        Parse a single Kernel source.
+        """
+
+        doc = KernelDoc(self.config, fname)
+        doc.run()
+
+        return doc
+
+    def process_export_file(self, fname):
+        """
+        Parses EXPORT_SYMBOL* macros from a single Kernel source file.
+        """
+        try:
+            with open(fname, "r", encoding="utf8",
+                      errors="backslashreplace") as fp:
+                for line in fp:
+                    KernelDoc.process_export(self.config.function_table, line)
+
+        except IOError:
+            print(f"Error: Cannot open file {fname}", file=sys.stderr)
+            self.config.errors += 1
+
+    def file_not_found_cb(self, fname):
+        """
+        Callback to warn if a file was not found.
+        """
+
+        self.config.log.error("Cannot find file %s", fname)
+        self.config.errors += 1
+
+    def __init__(self, files=None, verbose=False, out_style=None,
+                 werror=False, wreturn=False, wshort_desc=False,
+                 wcontents_before_sections=False,
+                 logger=None, modulename=None, export_file=None):
+        """
+        Initialize startup variables and parse all files
+        """
+
+        if not verbose:
+            verbose = bool(os.environ.get("KBUILD_VERBOSE", 0))
+
+        if not modulename:
+            modulename = "Kernel API"
+
+        dt = datetime.now()
+        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
+            # use UTC TZ
+            to_zone = tz.gettz('UTC')
+            dt = dt.astimezone(to_zone)
+
+        if not werror:
+            kcflags = os.environ.get("KCFLAGS", None)
+            if kcflags:
+                match = re.search(r"(\s|^)-Werror(\s|$)", kcflags)
+                if match:
+                    werror = True
+
+            # reading this variable is for backwards compat just in case
+            # someone was calling it with the variable from outside the
+            # kernel's build system
+            kdoc_werror = os.environ.get("KDOC_WERROR", None)
+            if kdoc_werror:
+                werror = kdoc_werror
+
+        # Set global config data used on all files
+        self.config = argparse.Namespace()
+
+        self.config.verbose = verbose
+        self.config.werror = werror
+        self.config.wreturn = wreturn
+        self.config.wshort_desc = wshort_desc
+        self.config.wcontents_before_sections = wcontents_before_sections
+        self.config.modulename = modulename
+
+        self.config.function_table = set()
+        self.config.source_map = {}
+
+        if not logger:
+            self.config.log = logging.getLogger("kernel-doc")
+        else:
+            self.config.log = logger
+
+        self.config.kernel_version = os.environ.get("KERNELVERSION",
+                                                    "unknown kernel version")
+        self.config.src_tree = os.environ.get("SRCTREE", None)
+
+        self.out_style = out_style
+        self.export_file = export_file
+
+        # Initialize internal variables
+
+        self.config.errors = 0
+        self.results = []
+
+        self.file_list = files
+        self.files = set()
+
+    def parse(self):
+        """
+        Parse all files
+        """
+
+        glob = GlobSourceFiles(srctree=self.config.src_tree)
+
+        # Let's use a set here to avoid duplicating files
+
+        for fname in glob.parse_files(self.file_list, self.file_not_found_cb):
+            if fname in self.files:
+                continue
+
+            self.files.add(fname)
+
+            res = self.parse_file(fname)
+            self.results.append((res.fname, res.entries))
+
+        if not self.files:
+            sys.exit(1)
+
+        # If a list of export files was provided, parse EXPORT_SYMBOL*
+        # from the ones not already parsed
+
+        if self.export_file:
+            files = self.files
+
+            glob = GlobSourceFiles(srctree=self.config.src_tree)
+
+            for fname in glob.parse_files(self.export_file,
+                                          self.file_not_found_cb):
+                if fname not in files:
+                    files.add(fname)
+
+                    self.process_export_file(fname)
+
+    def out_msg(self, fname, name, arg):
+        """
+        Output messages from a file name using the output style filtering.
+
+        If the output type was not handled by the output style, return True.
+        """
+
+        # NOTE: we can add rules here to filter out unwanted parts,
+        # although OutputFormat.msg already does that.
+
+        return self.out_style.msg(fname, name, arg)
+
+    def msg(self, enable_lineno=False, export=False, internal=False,
+            symbol=None, nosymbol=None):
+        """
+        Iterates over the kernel-doc results and outputs messages.
+        """
+
+        function_table = self.config.function_table
+
+        if symbol:
+            for s in symbol:
+                function_table.add(s)
+
+        # Output none mode: only warnings will be shown
+        if not self.out_style:
+            return
+
+        self.out_style.set_config(self.config)
+
+        self.out_style.set_filter(export, internal, symbol, nosymbol,
+                                  function_table, enable_lineno)
+
+        for fname, arg_tuple in self.results:
+            for name, arg in arg_tuple:
+                if self.out_msg(fname, name, arg):
+                    ln = arg.get("ln", 0)
+                    dtype = arg.get('type', "")
+
+                    self.config.log.warning("%s:%d Can't handle %s",
+                                            fname, ln, dtype)
-- 
2.49.0



* [PATCH v3 10/33] scripts/kernel-doc.py: move output classes to a separate file
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (8 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 09/33] scripts/kernel-doc.py: move KernelFiles " Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 11/33] scripts/kernel-doc.py: convert message output to an interactor Mauro Carvalho Chehab
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

In preparation for letting the kerneldoc Sphinx extension import
Python libraries, move the kernel-doc output logic to a separate file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
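For reviewers who want the shape of the moved code at a glance: the base
class dispatches each parsed entry to a per-type hook, and each output
format overrides the hooks it needs. The sketch below is a simplified
illustration only. SketchOutputFormat and SketchRest are hypothetical
names, not the classes in kdoc_output.py, and the sketch collects lines in
a list instead of printing them. It also omits the filtering done by
check_declaration().

```python
class SketchOutputFormat:
    """Hypothetical, simplified stand-in for the OutputFormat base class."""

    def msg(self, name, args):
        """Dispatch on the entry type; return True if it was not handled."""
        handlers = {
            "doc": self.out_doc,
            "function": self.out_function,
        }
        handler = handlers.get(args.get("type", ""))
        if handler is None:
            return True  # the caller is expected to warn about this entry
        handler(name, args)
        return False

    # Hooks meant to be overridden by the format-specific subclasses
    def out_doc(self, name, args):
        pass

    def out_function(self, name, args):
        pass


class SketchRest(SketchOutputFormat):
    """Hypothetical ReST-like format that only handles functions."""

    def __init__(self):
        self.lines = []

    def out_function(self, name, args):
        ftype = args.get("functiontype", "")
        self.lines.append(f".. c:function:: {ftype} {name} ()")


fmt = SketchRest()
handled = not fmt.msg("foo", {"type": "function", "functiontype": "int"})
unknown = fmt.msg("bar", {"type": "something-else"})
```

With this split, a caller such as the Sphinx extension would only need to
import the format class it wants and feed it the parsed entries.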
 scripts/kernel-doc.py           | 727 +------------------------------
 scripts/lib/kdoc/kdoc_output.py | 736 ++++++++++++++++++++++++++++++++
 2 files changed, 739 insertions(+), 724 deletions(-)
 create mode 100755 scripts/lib/kdoc/kdoc_output.py

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index d09ada2d862a..abff78e9160f 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -2,9 +2,7 @@
 # SPDX-License-Identifier: GPL-2.0
 # Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
 #
-# pylint: disable=R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,R1702
-# pylint: disable=C0302,C0103,C0301
-# pylint: disable=C0116,C0115,W0511,W0613
+# pylint: disable=C0103
 #
 # Converted from the kernel-doc script originally written in Perl
 # under GPLv2, copyrighted since 1998 by the following authors:
@@ -102,14 +100,8 @@ documentation comment syntax.
 import argparse
 import logging
 import os
-import re
 import sys
 
-from datetime import datetime
-from pprint import pformat
-
-from dateutil import tz
-
 # Import Python modules
 
 LIB_DIR = "lib/kdoc"
@@ -117,721 +109,8 @@ SRC_DIR = os.path.dirname(os.path.realpath(__file__))
 
 sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
 
-from kdoc_parser import KernelDoc, type_param
-from kdoc_re import Re
-from kdoc_files import KernelFiles
-
-function_pointer = Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
-
-# match expressions used to find embedded type information
-type_constant = Re(r"\b``([^\`]+)``\b", cache=False)
-type_constant2 = Re(r"\%([-_*\w]+)", cache=False)
-type_func = Re(r"(\w+)\(\)", cache=False)
-type_param_ref = Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
-
-# Special RST handling for func ptr params
-type_fp_param = Re(r"\@(\w+)\(\)", cache=False)
-
-# Special RST handling for structs with func ptr params
-type_fp_param2 = Re(r"\@(\w+->\S+)\(\)", cache=False)
-
-type_env = Re(r"(\$\w+)", cache=False)
-type_enum = Re(r"\&(enum\s*([_\w]+))", cache=False)
-type_struct = Re(r"\&(struct\s*([_\w]+))", cache=False)
-type_typedef = Re(r"\&(typedef\s*([_\w]+))", cache=False)
-type_union = Re(r"\&(union\s*([_\w]+))", cache=False)
-type_member = Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
-type_fallback = Re(r"\&([_\w]+)", cache=False)
-type_member_func = type_member + Re(r"\(\)", cache=False)
-
-
-class OutputFormat:
-    # output mode.
-    OUTPUT_ALL          = 0 # output all symbols and doc sections
-    OUTPUT_INCLUDE      = 1 # output only specified symbols
-    OUTPUT_EXPORTED     = 2 # output exported symbols
-    OUTPUT_INTERNAL     = 3 # output non-exported symbols
-
-    # Virtual member to be overriden at the  inherited classes
-    highlights = []
-
-    def __init__(self):
-        """Declare internal vars and set mode to OUTPUT_ALL"""
-
-        self.out_mode = self.OUTPUT_ALL
-        self.enable_lineno = None
-        self.nosymbol = {}
-        self.symbol = None
-        self.function_table = set()
-        self.config = None
-
-    def set_config(self, config):
-        self.config = config
-
-    def set_filter(self, export, internal, symbol, nosymbol, function_table,
-                   enable_lineno):
-        """
-        Initialize filter variables according with the requested mode.
-
-        Only one choice is valid between export, internal and symbol.
-
-        The nosymbol filter can be used on all modes.
-        """
-
-        self.enable_lineno = enable_lineno
-
-        if symbol:
-            self.out_mode = self.OUTPUT_INCLUDE
-            function_table = symbol
-        elif export:
-            self.out_mode = self.OUTPUT_EXPORTED
-        elif internal:
-            self.out_mode = self.OUTPUT_INTERNAL
-        else:
-            self.out_mode = self.OUTPUT_ALL
-
-        if nosymbol:
-            self.nosymbol = set(nosymbol)
-
-        if function_table:
-            self.function_table = function_table
-
-    def highlight_block(self, block):
-        """
-        Apply the RST highlights to a sub-block of text.
-        """
-
-        for r, sub in self.highlights:
-            block = r.sub(sub, block)
-
-        return block
-
-    def check_doc(self, name):
-        """Check if DOC should be output"""
-
-        if self.out_mode == self.OUTPUT_ALL:
-            return True
-
-        if self.out_mode == self.OUTPUT_INCLUDE:
-            if name in self.nosymbol:
-                return False
-
-            if name in self.function_table:
-                return True
-
-        return False
-
-    def check_declaration(self, dtype, name):
-        if name in self.nosymbol:
-            return False
-
-        if self.out_mode == self.OUTPUT_ALL:
-            return True
-
-        if self.out_mode in [ self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED ]:
-            if name in self.function_table:
-                return True
-
-        if self.out_mode == self.OUTPUT_INTERNAL:
-            if dtype != "function":
-                return True
-
-            if name not in self.function_table:
-                return True
-
-        return False
-
-    def check_function(self, fname, name, args):
-        return True
-
-    def check_enum(self, fname, name, args):
-        return True
-
-    def check_typedef(self, fname, name, args):
-        return True
-
-    def msg(self, fname, name, args):
-
-        dtype = args.get('type', "")
-
-        if dtype == "doc":
-            self.out_doc(fname, name, args)
-            return False
-
-        if not self.check_declaration(dtype, name):
-            return False
-
-        if dtype == "function":
-            self.out_function(fname, name, args)
-            return False
-
-        if dtype == "enum":
-            self.out_enum(fname, name, args)
-            return False
-
-        if dtype == "typedef":
-            self.out_typedef(fname, name, args)
-            return False
-
-        if dtype in ["struct", "union"]:
-            self.out_struct(fname, name, args)
-            return False
-
-        # Warn if some type requires an output logic
-        self.config.log.warning("doesn't now how to output '%s' block",
-                                dtype)
-
-        return True
-
-    # Virtual methods to be overridden by inherited classes
-    def out_doc(self, fname, name, args):
-        pass
-
-    def out_function(self, fname, name, args):
-        pass
-
-    def out_enum(self, fname, name, args):
-        pass
-
-    def out_typedef(self, fname, name, args):
-        pass
-
-    def out_struct(self, fname, name, args):
-        pass
-
-
-class RestFormat(OutputFormat):
-    # """Consts and functions used by ReST output"""
-
-    highlights = [
-        (type_constant, r"``\1``"),
-        (type_constant2, r"``\1``"),
-
-        # Note: need to escape () to avoid func matching later
-        (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"),
-        (type_member, r":c:type:`\1\2\3 <\1>`"),
-        (type_fp_param, r"**\1\\(\\)**"),
-        (type_fp_param2, r"**\1\\(\\)**"),
-        (type_func, r"\1()"),
-        (type_enum, r":c:type:`\1 <\2>`"),
-        (type_struct, r":c:type:`\1 <\2>`"),
-        (type_typedef, r":c:type:`\1 <\2>`"),
-        (type_union, r":c:type:`\1 <\2>`"),
-
-        # in rst this can refer to any type
-        (type_fallback, r":c:type:`\1`"),
-        (type_param_ref, r"**\1\2**")
-    ]
-    blankline = "\n"
-
-    sphinx_literal = Re(r'^[^.].*::$', cache=False)
-    sphinx_cblock = Re(r'^\.\.\ +code-block::', cache=False)
-
-    def __init__(self):
-        """
-        Creates class variables.
-
-        Not really mandatory, but it is a good coding style and makes
-        pylint happy.
-        """
-
-        super().__init__()
-        self.lineprefix = ""
-
-    def print_lineno (self, ln):
-        """Outputs a line number"""
-
-        if self.enable_lineno and ln:
-            print(f".. LINENO {ln}")
-
-    def output_highlight(self, args):
-        input_text = args
-        output = ""
-        in_literal = False
-        litprefix = ""
-        block = ""
-
-        for line in input_text.strip("\n").split("\n"):
-
-            # If we're in a literal block, see if we should drop out of it.
-            # Otherwise, pass the line straight through unmunged.
-            if in_literal:
-                if line.strip():  # If the line is not blank
-                    # If this is the first non-blank line in a literal block,
-                    # figure out the proper indent.
-                    if not litprefix:
-                        r = Re(r'^(\s*)')
-                        if r.match(line):
-                            litprefix = '^' + r.group(1)
-                        else:
-                            litprefix = ""
-
-                        output += line + "\n"
-                    elif not Re(litprefix).match(line):
-                        in_literal = False
-                    else:
-                        output += line + "\n"
-                else:
-                    output += line + "\n"
-
-            # Not in a literal block (or just dropped out)
-            if not in_literal:
-                block += line + "\n"
-                if self.sphinx_literal.match(line) or self.sphinx_cblock.match(line):
-                    in_literal = True
-                    litprefix = ""
-                    output += self.highlight_block(block)
-                    block = ""
-
-        # Handle any remaining block
-        if block:
-            output += self.highlight_block(block)
-
-        # Print the output with the line prefix
-        for line in output.strip("\n").split("\n"):
-            print(self.lineprefix + line)
-
-    def out_section(self, args, out_reference=False):
-        """
-        Outputs a block section.
-
-        This could use some work; it's used to output the DOC: sections, and
-        starts by putting out the name of the doc section itself, but that
-        tends to duplicate a header already in the template file.
-        """
-
-        sectionlist = args.get('sectionlist', [])
-        sections = args.get('sections', {})
-        section_start_lines = args.get('section_start_lines', {})
-
-        for section in sectionlist:
-            # Skip sections that are in the nosymbol_table
-            if section in self.nosymbol:
-                continue
-
-            if not self.out_mode == self.OUTPUT_INCLUDE:
-                if out_reference:
-                    print(f".. _{section}:\n")
-
-                if not self.symbol:
-                    print(f'{self.lineprefix}**{section}**\n')
-
-            self.print_lineno(section_start_lines.get(section, 0))
-            self.output_highlight(sections[section])
-            print()
-        print()
-
-    def out_doc(self, fname, name, args):
-        if not self.check_doc(name):
-            return
-
-        self.out_section(args, out_reference=True)
-
-    def out_function(self, fname, name, args):
-
-        oldprefix = self.lineprefix
-        signature = ""
-
-        func_macro = args.get('func_macro', False)
-        if func_macro:
-            signature = args['function']
-        else:
-            if args.get('functiontype'):
-                signature = args['functiontype'] + " "
-            signature += args['function'] + " ("
-
-        parameterlist = args.get('parameterlist', [])
-        parameterdescs = args.get('parameterdescs', {})
-        parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
-
-        ln = args.get('ln', 0)
-
-        count = 0
-        for parameter in parameterlist:
-            if count != 0:
-                signature += ", "
-            count += 1
-            dtype = args['parametertypes'].get(parameter, "")
-
-            if function_pointer.search(dtype):
-                signature += function_pointer.group(1) + parameter + function_pointer.group(3)
-            else:
-                signature += dtype
-
-        if not func_macro:
-            signature += ")"
-
-        if args.get('typedef') or not args.get('functiontype'):
-            print(f".. c:macro:: {args['function']}\n")
-
-            if args.get('typedef'):
-                self.print_lineno(ln)
-                print("   **Typedef**: ", end="")
-                self.lineprefix = ""
-                self.output_highlight(args.get('purpose', ""))
-                print("\n\n**Syntax**\n")
-                print(f"  ``{signature}``\n")
-            else:
-                print(f"``{signature}``\n")
-        else:
-            print(f".. c:function:: {signature}\n")
-
-        if not args.get('typedef'):
-            self.print_lineno(ln)
-            self.lineprefix = "   "
-            self.output_highlight(args.get('purpose', ""))
-            print()
-
-        # Put descriptive text into a container (HTML <div>) to help set
-        # function prototypes apart
-        self.lineprefix = "  "
-
-        if parameterlist:
-            print(".. container:: kernelindent\n")
-            print(f"{self.lineprefix}**Parameters**\n")
-
-        for parameter in parameterlist:
-            parameter_name = Re(r'\[.*').sub('', parameter)
-            dtype = args['parametertypes'].get(parameter, "")
-
-            if dtype:
-                print(f"{self.lineprefix}``{dtype}``")
-            else:
-                print(f"{self.lineprefix}``{parameter}``")
-
-            self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
-
-            self.lineprefix = "    "
-            if parameter_name in parameterdescs and \
-               parameterdescs[parameter_name] != KernelDoc.undescribed:
-
-                self.output_highlight(parameterdescs[parameter_name])
-                print()
-            else:
-                print(f"{self.lineprefix}*undescribed*\n")
-            self.lineprefix = "  "
-
-        self.out_section(args)
-        self.lineprefix = oldprefix
-
-    def out_enum(self, fname, name, args):
-
-        oldprefix = self.lineprefix
-        name = args.get('enum', '')
-        parameterlist = args.get('parameterlist', [])
-        parameterdescs = args.get('parameterdescs', {})
-        ln = args.get('ln', 0)
-
-        print(f"\n\n.. c:enum:: {name}\n")
-
-        self.print_lineno(ln)
-        self.lineprefix = "  "
-        self.output_highlight(args.get('purpose', ''))
-        print()
-
-        print(".. container:: kernelindent\n")
-        outer = self.lineprefix + "  "
-        self.lineprefix = outer + "  "
-        print(f"{outer}**Constants**\n")
-
-        for parameter in parameterlist:
-            print(f"{outer}``{parameter}``")
-
-            if parameterdescs.get(parameter, '') != KernelDoc.undescribed:
-                self.output_highlight(parameterdescs[parameter])
-            else:
-                print(f"{self.lineprefix}*undescribed*\n")
-            print()
-
-        self.lineprefix = oldprefix
-        self.out_section(args)
-
-    def out_typedef(self, fname, name, args):
-
-        oldprefix = self.lineprefix
-        name = args.get('typedef', '')
-        ln = args.get('ln', 0)
-
-        print(f"\n\n.. c:type:: {name}\n")
-
-        self.print_lineno(ln)
-        self.lineprefix = "   "
-
-        self.output_highlight(args.get('purpose', ''))
-
-        print()
-
-        self.lineprefix = oldprefix
-        self.out_section(args)
-
-    def out_struct(self, fname, name, args):
-
-        name = args.get('struct', "")
-        purpose = args.get('purpose', "")
-        declaration = args.get('definition', "")
-        dtype = args.get('type', "struct")
-        ln = args.get('ln', 0)
-
-        parameterlist = args.get('parameterlist', [])
-        parameterdescs = args.get('parameterdescs', {})
-        parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
-
-        print(f"\n\n.. c:{dtype}:: {name}\n")
-
-        self.print_lineno(ln)
-
-        oldprefix = self.lineprefix
-        self.lineprefix += "  "
-
-        self.output_highlight(purpose)
-        print()
-
-        print(".. container:: kernelindent\n")
-        print(f"{self.lineprefix}**Definition**::\n")
-
-        self.lineprefix = self.lineprefix + "  "
-
-        declaration = declaration.replace("\t", self.lineprefix)
-
-        print(f"{self.lineprefix}{dtype} {name}" + ' {')
-        print(f"{declaration}{self.lineprefix}" + "};\n")
-
-        self.lineprefix = "  "
-        print(f"{self.lineprefix}**Members**\n")
-        for parameter in parameterlist:
-            if not parameter or parameter.startswith("#"):
-                continue
-
-            parameter_name = parameter.split("[", maxsplit=1)[0]
-
-            if parameterdescs.get(parameter_name) == KernelDoc.undescribed:
-                continue
-
-            self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
-
-            print(f"{self.lineprefix}``{parameter}``")
-
-            self.lineprefix = "    "
-            self.output_highlight(parameterdescs[parameter_name])
-            self.lineprefix = "  "
-
-            print()
-
-        print()
-
-        self.lineprefix = oldprefix
-        self.out_section(args)
-
-
-class ManFormat(OutputFormat):
-    """Consts and functions used by man pages output"""
-
-    highlights = (
-        (type_constant, r"\1"),
-        (type_constant2, r"\1"),
-        (type_func, r"\\fB\1\\fP"),
-        (type_enum, r"\\fI\1\\fP"),
-        (type_struct, r"\\fI\1\\fP"),
-        (type_typedef, r"\\fI\1\\fP"),
-        (type_union, r"\\fI\1\\fP"),
-        (type_param, r"\\fI\1\\fP"),
-        (type_param_ref, r"\\fI\1\2\\fP"),
-        (type_member, r"\\fI\1\2\3\\fP"),
-        (type_fallback, r"\\fI\1\\fP")
-    )
-    blankline = ""
-
-    def __init__(self):
-        """
-        Creates class variables.
-
-        Not really mandatory, but it is a good coding style and makes
-        pylint happy.
-        """
-
-        super().__init__()
-
-        dt = datetime.now()
-        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
-            # use UTC TZ
-            to_zone = tz.gettz('UTC')
-            dt = dt.astimezone(to_zone)
-
-        self.man_date = dt.strftime("%B %Y")
-
-    def output_highlight(self, block):
-
-        contents = self.highlight_block(block)
-
-        if isinstance(contents, list):
-            contents = "\n".join(contents)
-
-        for line in contents.strip("\n").split("\n"):
-            line = Re(r"^\s*").sub("", line)
-
-            if line and line[0] == ".":
-                print("\\&" + line)
-            else:
-                print(line)
-
-    def out_doc(self, fname, name, args):
-        module = args.get('module')
-        sectionlist = args.get('sectionlist', [])
-        sections = args.get('sections', {})
-
-        print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX')
-
-        for section in sectionlist:
-            print(f'.SH "{section}"')
-            self.output_highlight(sections.get(section))
-
-    def out_function(self, fname, name, args):
-        """output function in man"""
-
-        parameterlist = args.get('parameterlist', [])
-        parameterdescs = args.get('parameterdescs', {})
-        sectionlist = args.get('sectionlist', [])
-        sections = args.get('sections', {})
-
-        print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX')
-
-        print(".SH NAME")
-        print(f"{args['function']} \\- {args['purpose']}")
-
-        print(".SH SYNOPSIS")
-        if args.get('functiontype', ''):
-            print(f'.B "{args['functiontype']}" {args['function']}')
-        else:
-            print(f'.B "{args['function']}')
-
-        count = 0
-        parenth = "("
-        post = ","
-
-        for parameter in parameterlist:
-            if count == len(parameterlist) - 1:
-                post = ");"
-
-            dtype = args['parametertypes'].get(parameter, "")
-            if function_pointer.match(dtype):
-                # Pointer-to-function
-                print(f'".BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"')
-            else:
-                dtype = Re(r'([^\*])$').sub(r'\1 ', dtype)
-
-                print(f'.BI "{parenth}{dtype}"  "{post}"')
-            count += 1
-            parenth = ""
-
-        if parameterlist:
-            print(".SH ARGUMENTS")
-
-        for parameter in parameterlist:
-            parameter_name = re.sub(r'\[.*', '', parameter)
-
-            print(f'.IP "{parameter}" 12')
-            self.output_highlight(parameterdescs.get(parameter_name, ""))
-
-        for section in sectionlist:
-            print(f'.SH "{section.upper()}"')
-            self.output_highlight(sections[section])
-
-    def out_enum(self, fname, name, args):
-
-        name = args.get('enum', '')
-        parameterlist = args.get('parameterlist', [])
-        sectionlist = args.get('sectionlist', [])
-        sections = args.get('sections', {})
-
-        print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_date}" "API Manual" LINUX')
-
-        print(".SH NAME")
-        print(f"enum {args['enum']} \\- {args['purpose']}")
-
-        print(".SH SYNOPSIS")
-        print(f"enum {args['enum']}" + " {")
-
-        count = 0
-        for parameter in parameterlist:
-            print(f'.br\n.BI "    {parameter}"')
-            if count == len(parameterlist) - 1:
-                print("\n};")
-            else:
-                print(", \n.br")
-
-            count += 1
-
-        print(".SH Constants")
-
-        for parameter in parameterlist:
-            parameter_name = Re(r'\[.*').sub('', parameter)
-            print(f'.IP "{parameter}" 12')
-            self.output_highlight(args['parameterdescs'].get(parameter_name, ""))
-
-        for section in sectionlist:
-            print(f'.SH "{section}"')
-            self.output_highlight(sections[section])
-
-    def out_typedef(self, fname, name, args):
-        module = args.get('module')
-        typedef = args.get('typedef')
-        purpose = args.get('purpose')
-        sectionlist = args.get('sectionlist', [])
-        sections = args.get('sections', {})
-
-        print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual" LINUX')
-
-        print(".SH NAME")
-        print(f"typedef {typedef} \\- {purpose}")
-
-        for section in sectionlist:
-            print(f'.SH "{section}"')
-            self.output_highlight(sections.get(section))
-
-    def out_struct(self, fname, name, args):
-        module = args.get('module')
-        struct_type = args.get('type')
-        struct_name = args.get('struct')
-        purpose = args.get('purpose')
-        definition = args.get('definition')
-        sectionlist = args.get('sectionlist', [])
-        parameterlist = args.get('parameterlist', [])
-        sections = args.get('sections', {})
-        parameterdescs = args.get('parameterdescs', {})
-
-        print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_date}" "API Manual" LINUX')
-
-        print(".SH NAME")
-        print(f"{struct_type} {struct_name} \\- {purpose}")
-
-        # Replace tabs with two spaces and handle newlines
-        declaration = definition.replace("\t", "  ")
-        declaration = Re(r"\n").sub('"\n.br\n.BI "', declaration)
-
-        print(".SH SYNOPSIS")
-        print(f"{struct_type} {struct_name} " + "{" +"\n.br")
-        print(f'.BI "{declaration}\n' + "};\n.br\n")
-
-        print(".SH Members")
-        for parameter in parameterlist:
-            if parameter.startswith("#"):
-                continue
-
-            parameter_name = re.sub(r"\[.*", "", parameter)
-
-            if parameterdescs.get(parameter_name) == KernelDoc.undescribed:
-                continue
-
-            print(f'.IP "{parameter}" 12')
-            self.output_highlight(parameterdescs.get(parameter_name))
-
-        for section in sectionlist:
-            print(f'.SH "{section}"')
-            self.output_highlight(sections.get(section))
-
-
-# Command line interface
-
+from kdoc_files import KernelFiles                      # pylint: disable=C0413
+from kdoc_output import RestFormat, ManFormat           # pylint: disable=C0413
 
 DESC = """
 Read C language source or header FILEs, extract embedded documentation comments,
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
new file mode 100755
index 000000000000..24e40b3e7d1d
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -0,0 +1,736 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=C0301,R0911,R0912,R0913,R0914,R0915,R0917
+
+# TODO: implement warning filtering
+
+"""
+Implement output filters to print kernel-doc documentation.
+
+The implementation uses a virtual base class (OutputFormat) which
+dispatches to virtual methods and contains some code to filter
+out output messages.
+
+The actual implementation is done in one separate class per type
+of output. Currently, there are output classes for ReST and man/troff.
+"""
+
+import os
+import re
+from datetime import datetime
+
+from dateutil import tz
+
+from kdoc_parser import KernelDoc, type_param
+from kdoc_re import Re
+
+
+function_pointer = Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
+
+# match expressions used to find embedded type information
+type_constant = Re(r"\b``([^\`]+)``\b", cache=False)
+type_constant2 = Re(r"\%([-_*\w]+)", cache=False)
+type_func = Re(r"(\w+)\(\)", cache=False)
+type_param_ref = Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+
+# Special RST handling for func ptr params
+type_fp_param = Re(r"\@(\w+)\(\)", cache=False)
+
+# Special RST handling for structs with func ptr params
+type_fp_param2 = Re(r"\@(\w+->\S+)\(\)", cache=False)
+
+type_env = Re(r"(\$\w+)", cache=False)
+type_enum = Re(r"\&(enum\s*([_\w]+))", cache=False)
+type_struct = Re(r"\&(struct\s*([_\w]+))", cache=False)
+type_typedef = Re(r"\&(typedef\s*([_\w]+))", cache=False)
+type_union = Re(r"\&(union\s*([_\w]+))", cache=False)
+type_member = Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
+type_fallback = Re(r"\&([_\w]+)", cache=False)
+type_member_func = type_member + Re(r"\(\)", cache=False)
+
+
+class OutputFormat:
+    # output mode.
+    OUTPUT_ALL          = 0 # output all symbols and doc sections
+    OUTPUT_INCLUDE      = 1 # output only specified symbols
+    OUTPUT_EXPORTED     = 2 # output exported symbols
+    OUTPUT_INTERNAL     = 3 # output non-exported symbols
+
+    # Virtual member to be overridden by the inherited classes
+    highlights = []
+
+    def __init__(self):
+        """Declare internal vars and set mode to OUTPUT_ALL"""
+
+        self.out_mode = self.OUTPUT_ALL
+        self.enable_lineno = None
+        self.nosymbol = {}
+        self.symbol = None
+        self.function_table = set()
+        self.config = None
+
+    def set_config(self, config):
+        self.config = config
+
+    def set_filter(self, export, internal, symbol, nosymbol, function_table,
+                   enable_lineno):
+        """
+        Initialize filter variables according to the requested mode.
+
+        Only one choice is valid between export, internal and symbol.
+
+        The nosymbol filter can be used on all modes.
+        """
+
+        self.enable_lineno = enable_lineno
+
+        if symbol:
+            self.out_mode = self.OUTPUT_INCLUDE
+            function_table = symbol
+        elif export:
+            self.out_mode = self.OUTPUT_EXPORTED
+        elif internal:
+            self.out_mode = self.OUTPUT_INTERNAL
+        else:
+            self.out_mode = self.OUTPUT_ALL
+
+        if nosymbol:
+            self.nosymbol = set(nosymbol)
+
+        if function_table:
+            self.function_table = function_table
+
+    def highlight_block(self, block):
+        """
+        Apply the RST highlights to a sub-block of text.
+        """
+
+        for r, sub in self.highlights:
+            block = r.sub(sub, block)
+
+        return block
+
+    def check_doc(self, name):
+        """Check if DOC should be output"""
+
+        if self.out_mode == self.OUTPUT_ALL:
+            return True
+
+        if self.out_mode == self.OUTPUT_INCLUDE:
+            if name in self.nosymbol:
+                return False
+
+            if name in self.function_table:
+                return True
+
+        return False
+
+    def check_declaration(self, dtype, name):
+        if name in self.nosymbol:
+            return False
+
+        if self.out_mode == self.OUTPUT_ALL:
+            return True
+
+        if self.out_mode in [self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED]:
+            if name in self.function_table:
+                return True
+
+        if self.out_mode == self.OUTPUT_INTERNAL:
+            if dtype != "function":
+                return True
+
+            if name not in self.function_table:
+                return True
+
+        return False
+
+    def check_function(self, fname, name, args):
+        return True
+
+    def check_enum(self, fname, name, args):
+        return True
+
+    def check_typedef(self, fname, name, args):
+        return True
+
+    def msg(self, fname, name, args):
+
+        dtype = args.get('type', "")
+
+        if dtype == "doc":
+            self.out_doc(fname, name, args)
+            return False
+
+        if not self.check_declaration(dtype, name):
+            return False
+
+        if dtype == "function":
+            self.out_function(fname, name, args)
+            return False
+
+        if dtype == "enum":
+            self.out_enum(fname, name, args)
+            return False
+
+        if dtype == "typedef":
+            self.out_typedef(fname, name, args)
+            return False
+
+        if dtype in ["struct", "union"]:
+            self.out_struct(fname, name, args)
+            return False
+
+        # Warn if some type requires an output logic
+        self.config.log.warning("doesn't now how to output '%s' block",
+                                dtype)
+
+        return True
+
+    # Virtual methods to be overridden by inherited classes
+    def out_doc(self, fname, name, args):
+        pass
+
+    def out_function(self, fname, name, args):
+        pass
+
+    def out_enum(self, fname, name, args):
+        pass
+
+    def out_typedef(self, fname, name, args):
+        pass
+
+    def out_struct(self, fname, name, args):
+        pass
+
+
+class RestFormat(OutputFormat):
+    # """Consts and functions used by ReST output"""
+
+    highlights = [
+        (type_constant, r"``\1``"),
+        (type_constant2, r"``\1``"),
+
+        # Note: need to escape () to avoid func matching later
+        (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"),
+        (type_member, r":c:type:`\1\2\3 <\1>`"),
+        (type_fp_param, r"**\1\\(\\)**"),
+        (type_fp_param2, r"**\1\\(\\)**"),
+        (type_func, r"\1()"),
+        (type_enum, r":c:type:`\1 <\2>`"),
+        (type_struct, r":c:type:`\1 <\2>`"),
+        (type_typedef, r":c:type:`\1 <\2>`"),
+        (type_union, r":c:type:`\1 <\2>`"),
+
+        # in rst this can refer to any type
+        (type_fallback, r":c:type:`\1`"),
+        (type_param_ref, r"**\1\2**")
+    ]
+    blankline = "\n"
+
+    sphinx_literal = Re(r'^[^.].*::$', cache=False)
+    sphinx_cblock = Re(r'^\.\.\ +code-block::', cache=False)
+
+    def __init__(self):
+        """
+        Creates class variables.
+
+        Not really mandatory, but it is a good coding style and makes
+        pylint happy.
+        """
+
+        super().__init__()
+        self.lineprefix = ""
+
+    def print_lineno(self, ln):
+        """Outputs a line number"""
+
+        if self.enable_lineno and ln:
+            print(f".. LINENO {ln}")
+
+    def output_highlight(self, args):
+        input_text = args
+        output = ""
+        in_literal = False
+        litprefix = ""
+        block = ""
+
+        for line in input_text.strip("\n").split("\n"):
+
+            # If we're in a literal block, see if we should drop out of it.
+            # Otherwise, pass the line straight through unmunged.
+            if in_literal:
+                if line.strip():  # If the line is not blank
+                    # If this is the first non-blank line in a literal block,
+                    # figure out the proper indent.
+                    if not litprefix:
+                        r = Re(r'^(\s*)')
+                        if r.match(line):
+                            litprefix = '^' + r.group(1)
+                        else:
+                            litprefix = ""
+
+                        output += line + "\n"
+                    elif not Re(litprefix).match(line):
+                        in_literal = False
+                    else:
+                        output += line + "\n"
+                else:
+                    output += line + "\n"
+
+            # Not in a literal block (or just dropped out)
+            if not in_literal:
+                block += line + "\n"
+                if self.sphinx_literal.match(line) or self.sphinx_cblock.match(line):
+                    in_literal = True
+                    litprefix = ""
+                    output += self.highlight_block(block)
+                    block = ""
+
+        # Handle any remaining block
+        if block:
+            output += self.highlight_block(block)
+
+        # Print the output with the line prefix
+        for line in output.strip("\n").split("\n"):
+            print(self.lineprefix + line)
+
+    def out_section(self, args, out_reference=False):
+        """
+        Outputs a block section.
+
+        This could use some work; it's used to output the DOC: sections, and
+        starts by putting out the name of the doc section itself, but that
+        tends to duplicate a header already in the template file.
+        """
+
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+        section_start_lines = args.get('section_start_lines', {})
+
+        for section in sectionlist:
+            # Skip sections that are in the nosymbol_table
+            if section in self.nosymbol:
+                continue
+
+            if not self.out_mode == self.OUTPUT_INCLUDE:
+                if out_reference:
+                    print(f".. _{section}:\n")
+
+                if not self.symbol:
+                    print(f'{self.lineprefix}**{section}**\n')
+
+            self.print_lineno(section_start_lines.get(section, 0))
+            self.output_highlight(sections[section])
+            print()
+        print()
+
+    def out_doc(self, fname, name, args):
+        if not self.check_doc(name):
+            return
+
+        self.out_section(args, out_reference=True)
+
+    def out_function(self, fname, name, args):
+
+        oldprefix = self.lineprefix
+        signature = ""
+
+        func_macro = args.get('func_macro', False)
+        if func_macro:
+            signature = args['function']
+        else:
+            if args.get('functiontype'):
+                signature = args['functiontype'] + " "
+            signature += args['function'] + " ("
+
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
+
+        ln = args.get('ln', 0)
+
+        count = 0
+        for parameter in parameterlist:
+            if count != 0:
+                signature += ", "
+            count += 1
+            dtype = args['parametertypes'].get(parameter, "")
+
+            if function_pointer.search(dtype):
+                signature += function_pointer.group(1) + parameter + ") (" + function_pointer.group(2) + ")"
+            else:
+                signature += dtype
+
+        if not func_macro:
+            signature += ")"
+
+        if args.get('typedef') or not args.get('functiontype'):
+            print(f".. c:macro:: {args['function']}\n")
+
+            if args.get('typedef'):
+                self.print_lineno(ln)
+                print("   **Typedef**: ", end="")
+                self.lineprefix = ""
+                self.output_highlight(args.get('purpose', ""))
+                print("\n\n**Syntax**\n")
+                print(f"  ``{signature}``\n")
+            else:
+                print(f"``{signature}``\n")
+        else:
+            print(f".. c:function:: {signature}\n")
+
+        if not args.get('typedef'):
+            self.print_lineno(ln)
+            self.lineprefix = "   "
+            self.output_highlight(args.get('purpose', ""))
+            print()
+
+        # Put descriptive text into a container (HTML <div>) to help set
+        # function prototypes apart
+        self.lineprefix = "  "
+
+        if parameterlist:
+            print(".. container:: kernelindent\n")
+            print(f"{self.lineprefix}**Parameters**\n")
+
+        for parameter in parameterlist:
+            parameter_name = Re(r'\[.*').sub('', parameter)
+            dtype = args['parametertypes'].get(parameter, "")
+
+            if dtype:
+                print(f"{self.lineprefix}``{dtype}``")
+            else:
+                print(f"{self.lineprefix}``{parameter}``")
+
+            self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
+
+            self.lineprefix = "    "
+            if parameter_name in parameterdescs and \
+               parameterdescs[parameter_name] != KernelDoc.undescribed:
+
+                self.output_highlight(parameterdescs[parameter_name])
+                print()
+            else:
+                print(f"{self.lineprefix}*undescribed*\n")
+            self.lineprefix = "  "
+
+        self.out_section(args)
+        self.lineprefix = oldprefix
+
+    def out_enum(self, fname, name, args):
+
+        oldprefix = self.lineprefix
+        name = args.get('enum', '')
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        ln = args.get('ln', 0)
+
+        print(f"\n\n.. c:enum:: {name}\n")
+
+        self.print_lineno(ln)
+        self.lineprefix = "  "
+        self.output_highlight(args.get('purpose', ''))
+        print()
+
+        print(".. container:: kernelindent\n")
+        outer = self.lineprefix + "  "
+        self.lineprefix = outer + "  "
+        print(f"{outer}**Constants**\n")
+
+        for parameter in parameterlist:
+            print(f"{outer}``{parameter}``")
+
+            if parameterdescs.get(parameter, '') != KernelDoc.undescribed:
+                self.output_highlight(parameterdescs[parameter])
+            else:
+                print(f"{self.lineprefix}*undescribed*\n")
+            print()
+
+        self.lineprefix = oldprefix
+        self.out_section(args)
+
+    def out_typedef(self, fname, name, args):
+
+        oldprefix = self.lineprefix
+        name = args.get('typedef', '')
+        ln = args.get('ln', 0)
+
+        print(f"\n\n.. c:type:: {name}\n")
+
+        self.print_lineno(ln)
+        self.lineprefix = "   "
+
+        self.output_highlight(args.get('purpose', ''))
+
+        print()
+
+        self.lineprefix = oldprefix
+        self.out_section(args)
+
+    def out_struct(self, fname, name, args):
+
+        name = args.get('struct', "")
+        purpose = args.get('purpose', "")
+        declaration = args.get('definition', "")
+        dtype = args.get('type', "struct")
+        ln = args.get('ln', 0)
+
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
+
+        print(f"\n\n.. c:{dtype}:: {name}\n")
+
+        self.print_lineno(ln)
+
+        oldprefix = self.lineprefix
+        self.lineprefix += "  "
+
+        self.output_highlight(purpose)
+        print()
+
+        print(".. container:: kernelindent\n")
+        print(f"{self.lineprefix}**Definition**::\n")
+
+        self.lineprefix = self.lineprefix + "  "
+
+        declaration = declaration.replace("\t", self.lineprefix)
+
+        print(f"{self.lineprefix}{dtype} {name}" + ' {')
+        print(f"{declaration}{self.lineprefix}" + "};\n")
+
+        self.lineprefix = "  "
+        print(f"{self.lineprefix}**Members**\n")
+        for parameter in parameterlist:
+            if not parameter or parameter.startswith("#"):
+                continue
+
+            parameter_name = parameter.split("[", maxsplit=1)[0]
+
+            if parameterdescs.get(parameter_name) == KernelDoc.undescribed:
+                continue
+
+            self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
+
+            print(f"{self.lineprefix}``{parameter}``")
+
+            self.lineprefix = "    "
+            self.output_highlight(parameterdescs[parameter_name])
+            self.lineprefix = "  "
+
+            print()
+
+        print()
+
+        self.lineprefix = oldprefix
+        self.out_section(args)
+
+
+class ManFormat(OutputFormat):
+    """Consts and functions used by man pages output"""
+
+    highlights = (
+        (type_constant, r"\1"),
+        (type_constant2, r"\1"),
+        (type_func, r"\\fB\1\\fP"),
+        (type_enum, r"\\fI\1\\fP"),
+        (type_struct, r"\\fI\1\\fP"),
+        (type_typedef, r"\\fI\1\\fP"),
+        (type_union, r"\\fI\1\\fP"),
+        (type_param, r"\\fI\1\\fP"),
+        (type_param_ref, r"\\fI\1\2\\fP"),
+        (type_member, r"\\fI\1\2\3\\fP"),
+        (type_fallback, r"\\fI\1\\fP")
+    )
+    blankline = ""
+
+    def __init__(self):
+        """
+        Creates class variables.
+
+        Not really mandatory, but it is a good coding style and makes
+        pylint happy.
+        """
+
+        super().__init__()
+
+        dt = datetime.now()
+        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
+            # use UTC TZ
+            to_zone = tz.gettz('UTC')
+            dt = dt.astimezone(to_zone)
+
+        self.man_date = dt.strftime("%B %Y")
+
+    def output_highlight(self, block):
+
+        contents = self.highlight_block(block)
+
+        if isinstance(contents, list):
+            contents = "\n".join(contents)
+
+        for line in contents.strip("\n").split("\n"):
+            line = Re(r"^\s*").sub("", line)
+
+            if line and line[0] == ".":
+                print("\\&" + line)
+            else:
+                print(line)
+
+    def out_doc(self, fname, name, args):
+        module = args.get('module')
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX')
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections.get(section))
+
+    def out_function(self, fname, name, args):
+        """output function in man"""
+
+        parameterlist = args.get('parameterlist', [])
+        parameterdescs = args.get('parameterdescs', {})
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"{args['function']} \\- {args['purpose']}")
+
+        print(".SH SYNOPSIS")
+        if args.get('functiontype', ''):
+            print(f'.B "{args['functiontype']}" {args['function']}')
+        else:
+            print(f'.B "{args['function']}')
+
+        count = 0
+        parenth = "("
+        post = ","
+
+        for parameter in parameterlist:
+            if count == len(parameterlist) - 1:
+                post = ");"
+
+            dtype = args['parametertypes'].get(parameter, "")
+            if function_pointer.match(dtype):
+                # Pointer-to-function
+                print(f'".BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"')
+            else:
+                dtype = Re(r'([^\*])$').sub(r'\1 ', dtype)
+
+                print(f'.BI "{parenth}{dtype}"  "{post}"')
+            count += 1
+            parenth = ""
+
+        if parameterlist:
+            print(".SH ARGUMENTS")
+
+        for parameter in parameterlist:
+            parameter_name = re.sub(r'\[.*', '', parameter)
+
+            print(f'.IP "{parameter}" 12')
+            self.output_highlight(parameterdescs.get(parameter_name, ""))
+
+        for section in sectionlist:
+            print(f'.SH "{section.upper()}"')
+            self.output_highlight(sections[section])
+
+    def out_enum(self, fname, name, args):
+
+        name = args.get('enum', '')
+        parameterlist = args.get('parameterlist', [])
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_date}" "API Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"enum {args['enum']} \\- {args['purpose']}")
+
+        print(".SH SYNOPSIS")
+        print(f"enum {args['enum']}" + " {")
+
+        count = 0
+        for parameter in parameterlist:
+            print(f'.br\n.BI "    {parameter}"')
+            if count == len(parameterlist) - 1:
+                print("\n};")
+            else:
+                print(", \n.br")
+
+            count += 1
+
+        print(".SH Constants")
+
+        for parameter in parameterlist:
+            parameter_name = Re(r'\[.*').sub('', parameter)
+            print(f'.IP "{parameter}" 12')
+            self.output_highlight(args['parameterdescs'].get(parameter_name, ""))
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections[section])
+
+    def out_typedef(self, fname, name, args):
+        module = args.get('module')
+        typedef = args.get('typedef')
+        purpose = args.get('purpose')
+        sectionlist = args.get('sectionlist', [])
+        sections = args.get('sections', {})
+
+        print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"typedef {typedef} \\- {purpose}")
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections.get(section))
+
+    def out_struct(self, fname, name, args):
+        module = args.get('module')
+        struct_type = args.get('type')
+        struct_name = args.get('struct')
+        purpose = args.get('purpose')
+        definition = args.get('definition')
+        sectionlist = args.get('sectionlist', [])
+        parameterlist = args.get('parameterlist', [])
+        sections = args.get('sections', {})
+        parameterdescs = args.get('parameterdescs', {})
+
+        print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_date}" "API Manual" LINUX')
+
+        print(".SH NAME")
+        print(f"{struct_type} {struct_name} \\- {purpose}")
+
+        # Replace tabs with two spaces and handle newlines
+        declaration = definition.replace("\t", "  ")
+        declaration = Re(r"\n").sub('"\n.br\n.BI "', declaration)
+
+        print(".SH SYNOPSIS")
+        print(f"{struct_type} {struct_name} " + "{" + "\n.br")
+        print(f'.BI "{declaration}\n' + "};\n.br\n")
+
+        print(".SH Members")
+        for parameter in parameterlist:
+            if parameter.startswith("#"):
+                continue
+
+            parameter_name = re.sub(r"\[.*", "", parameter)
+
+            if parameterdescs.get(parameter_name) == KernelDoc.undescribed:
+                continue
+
+            print(f'.IP "{parameter}" 12')
+            self.output_highlight(parameterdescs.get(parameter_name))
+
+        for section in sectionlist:
+            print(f'.SH "{section}"')
+            self.output_highlight(sections.get(section))
-- 
2.49.0



* [PATCH v3 11/33] scripts/kernel-doc.py: convert message output to an interactor
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (9 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 10/33] scripts/kernel-doc.py: move output classes " Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 12/33] scripts/kernel-doc.py: move file lists to the parser function Mauro Carvalho Chehab
                   ` (24 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Instead of printing output messages directly, change the kdoc classes
to return an iterator over the output messages, letting the actual
display happen in the command-line script.
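
The idea described above can be sketched roughly as follows. This is a
minimal, self-contained illustration and not the code from this patch:
SketchFormat, iter_messages and RESULTS are hypothetical names, and the
output class is reduced to a single "function" type. It only shows the
shape of the change: the output class accumulates text in self.data
instead of calling print(), and the file-level loop yields one
(fname, text) tuple per file, leaving the printing to the caller.

```python
class SketchFormat:
    """Hypothetical stand-in for an output class (not the real RestFormat)."""

    def __init__(self):
        self.data = ""

    def msg(self, name, args):
        """Return the text for one symbol, or None if the type is unknown."""
        self.data = ""
        dtype = args.get("type", "")

        if dtype == "function":
            # Accumulate the text instead of printing it
            self.data += f".. c:function:: {name}\n\n"
            return self.data

        return None


def iter_messages(results, fmt):
    """Yield (fname, text) for each file that produced some output."""
    for fname, entries in results:
        text = ""
        for name, args in entries:
            out = fmt.msg(name, args)
            if out is None:
                # Unknown block type: skip it rather than concatenating None
                continue
            text += out

        if text:
            yield fname, text


# Hypothetical parse results: one file with two symbols, one empty file
RESULTS = [
    ("a.c", [("foo", {"type": "function"}), ("bar", {"type": "mystery"})]),
    ("b.c", []),
]

if __name__ == "__main__":
    # The caller decides what to do with each message; here it just prints
    for fname, text in iter_messages(RESULTS, SketchFormat()):
        print(text, end="")
```

With this shape, a caller other than the command-line script (for
instance a documentation build extension) could consume the same
generator and route the text elsewhere without capturing stdout.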

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py           |   9 +-
 scripts/lib/kdoc/kdoc_files.py  |  15 ++-
 scripts/lib/kdoc/kdoc_output.py | 171 ++++++++++++++++----------------
 3 files changed, 104 insertions(+), 91 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index abff78e9160f..63efec4b3f4b 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -283,9 +283,12 @@ def main():
 
     kfiles.parse()
 
-    kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
-               internal=args.internal, symbol=args.symbol,
-               nosymbol=args.nosymbol)
+    for t in kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
+                          internal=args.internal, symbol=args.symbol,
+                          nosymbol=args.nosymbol):
+        msg = t[1]
+        if msg:
+            print(msg)
 
 
 # Call main method
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index 8bcdc7ead984..817ed98b2727 100755
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -229,9 +229,10 @@ class KernelFiles():
 
     def out_msg(self, fname, name, arg):
         """
-        Output messages from a file name using the output style filtering.
+        Return output messages from a file name using the output style
+        filtering.
 
-        If output type was not handled by the syler, return False.
+        If output type was not handled by the styler, return None.
         """
 
         # NOTE: we can add rules here to filter out unwanted parts,
@@ -242,7 +243,8 @@ class KernelFiles():
     def msg(self, enable_lineno=False, export=False, internal=False,
             symbol=None, nosymbol=None):
         """
-        Interacts over the kernel-doc results and output messages.
+        Iterates over the kernel-doc results and output messages,
+        returning kernel-doc markups on each iteration
         """
 
         function_table = self.config.function_table
@@ -261,10 +263,15 @@ class KernelFiles():
                                   function_table, enable_lineno)
 
         for fname, arg_tuple in self.results:
+            msg = ""
             for name, arg in arg_tuple:
-                if self.out_msg(fname, name, arg):
+                out = self.out_msg(fname, name, arg)
+                msg += out or ""
+                if out is None:
                     ln = arg.get("ln", 0)
                     dtype = arg.get('type', "")
 
                     self.config.log.warning("%s:%d Can't handle %s",
                                             fname, ln, dtype)
+            if msg:
+                yield fname, msg
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index 24e40b3e7d1d..fda07049ecf7 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -71,6 +71,8 @@ class OutputFormat:
         self.function_table = set()
         self.config = None
 
+        self.data = ""
+
     def set_config(self, config):
         self.config = config
 
@@ -157,37 +159,38 @@ class OutputFormat:
         return True
 
     def msg(self, fname, name, args):
+        self.data = ""
 
         dtype = args.get('type', "")
 
         if dtype == "doc":
             self.out_doc(fname, name, args)
-            return False
+            return self.data
 
         if not self.check_declaration(dtype, name):
-            return False
+            return self.data
 
         if dtype == "function":
             self.out_function(fname, name, args)
-            return False
+            return self.data
 
         if dtype == "enum":
             self.out_enum(fname, name, args)
-            return False
+            return self.data
 
         if dtype == "typedef":
             self.out_typedef(fname, name, args)
-            return False
+            return self.data
 
         if dtype in ["struct", "union"]:
             self.out_struct(fname, name, args)
-            return False
+            return self.data
 
         # Warn if some type requires an output logic
         self.config.log.warning("doesn't now how to output '%s' block",
                                 dtype)
 
-        return True
+        return None
 
     # Virtual methods to be overridden by inherited classes
     def out_doc(self, fname, name, args):
@@ -248,7 +251,7 @@ class RestFormat(OutputFormat):
         """Outputs a line number"""
 
         if self.enable_lineno and ln:
-            print(f".. LINENO {ln}")
+            self.data += f".. LINENO {ln}\n"
 
     def output_highlight(self, args):
         input_text = args
@@ -295,7 +298,7 @@ class RestFormat(OutputFormat):
 
         # Print the output with the line prefix
         for line in output.strip("\n").split("\n"):
-            print(self.lineprefix + line)
+            self.data += self.lineprefix + line + "\n"
 
     def out_section(self, args, out_reference=False):
         """
@@ -317,15 +320,15 @@ class RestFormat(OutputFormat):
 
             if not self.out_mode == self.OUTPUT_INCLUDE:
                 if out_reference:
-                    print(f".. _{section}:\n")
+                    self.data += f".. _{section}:\n\n"
 
                 if not self.symbol:
-                    print(f'{self.lineprefix}**{section}**\n')
+                    self.data += f'{self.lineprefix}**{section}**\n\n'
 
             self.print_lineno(section_start_lines.get(section, 0))
             self.output_highlight(sections[section])
-            print()
-        print()
+            self.data += "\n"
+        self.data += "\n"
 
     def out_doc(self, fname, name, args):
         if not self.check_doc(name):
@@ -368,42 +371,42 @@ class RestFormat(OutputFormat):
             signature += ")"
 
         if args.get('typedef') or not args.get('functiontype'):
-            print(f".. c:macro:: {args['function']}\n")
+            self.data += f".. c:macro:: {args['function']}\n\n"
 
             if args.get('typedef'):
                 self.print_lineno(ln)
-                print("   **Typedef**: ", end="")
+                self.data += "   **Typedef**: "
                 self.lineprefix = ""
                 self.output_highlight(args.get('purpose', ""))
-                print("\n\n**Syntax**\n")
-                print(f"  ``{signature}``\n")
+                self.data += "\n\n**Syntax**\n\n"
+                self.data += f"  ``{signature}``\n\n"
             else:
-                print(f"``{signature}``\n")
+                self.data += f"``{signature}``\n\n"
         else:
-            print(f".. c:function:: {signature}\n")
+            self.data += f".. c:function:: {signature}\n\n"
 
         if not args.get('typedef'):
             self.print_lineno(ln)
             self.lineprefix = "   "
             self.output_highlight(args.get('purpose', ""))
-            print()
+            self.data += "\n"
 
         # Put descriptive text into a container (HTML <div>) to help set
         # function prototypes apart
         self.lineprefix = "  "
 
         if parameterlist:
-            print(".. container:: kernelindent\n")
-            print(f"{self.lineprefix}**Parameters**\n")
+            self.data += ".. container:: kernelindent\n\n"
+            self.data += f"{self.lineprefix}**Parameters**\n\n"
 
         for parameter in parameterlist:
             parameter_name = Re(r'\[.*').sub('', parameter)
             dtype = args['parametertypes'].get(parameter, "")
 
             if dtype:
-                print(f"{self.lineprefix}``{dtype}``")
+                self.data += f"{self.lineprefix}``{dtype}``\n"
             else:
-                print(f"{self.lineprefix}``{parameter}``")
+                self.data += f"{self.lineprefix}``{parameter}``\n"
 
             self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
 
@@ -412,9 +415,9 @@ class RestFormat(OutputFormat):
                parameterdescs[parameter_name] != KernelDoc.undescribed:
 
                 self.output_highlight(parameterdescs[parameter_name])
-                print()
+                self.data += "\n"
             else:
-                print(f"{self.lineprefix}*undescribed*\n")
+                self.data += f"{self.lineprefix}*undescribed*\n\n"
             self.lineprefix = "  "
 
         self.out_section(args)
@@ -428,26 +431,26 @@ class RestFormat(OutputFormat):
         parameterdescs = args.get('parameterdescs', {})
         ln = args.get('ln', 0)
 
-        print(f"\n\n.. c:enum:: {name}\n")
+        self.data += f"\n\n.. c:enum:: {name}\n\n"
 
         self.print_lineno(ln)
         self.lineprefix = "  "
         self.output_highlight(args.get('purpose', ''))
-        print()
+        self.data += "\n"
 
-        print(".. container:: kernelindent\n")
+        self.data += ".. container:: kernelindent\n\n"
         outer = self.lineprefix + "  "
         self.lineprefix = outer + "  "
-        print(f"{outer}**Constants**\n")
+        self.data += f"{outer}**Constants**\n\n"
 
         for parameter in parameterlist:
-            print(f"{outer}``{parameter}``")
+            self.data += f"{outer}``{parameter}``\n"
 
             if parameterdescs.get(parameter, '') != KernelDoc.undescribed:
                 self.output_highlight(parameterdescs[parameter])
             else:
-                print(f"{self.lineprefix}*undescribed*\n")
-            print()
+                self.data += f"{self.lineprefix}*undescribed*\n\n"
+            self.data += "\n"
 
         self.lineprefix = oldprefix
         self.out_section(args)
@@ -458,14 +461,14 @@ class RestFormat(OutputFormat):
         name = args.get('typedef', '')
         ln = args.get('ln', 0)
 
-        print(f"\n\n.. c:type:: {name}\n")
+        self.data += f"\n\n.. c:type:: {name}\n\n"
 
         self.print_lineno(ln)
         self.lineprefix = "   "
 
         self.output_highlight(args.get('purpose', ''))
 
-        print()
+        self.data += "\n"
 
         self.lineprefix = oldprefix
         self.out_section(args)
@@ -482,7 +485,7 @@ class RestFormat(OutputFormat):
         parameterdescs = args.get('parameterdescs', {})
         parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
 
-        print(f"\n\n.. c:{dtype}:: {name}\n")
+        self.data += f"\n\n.. c:{dtype}:: {name}\n\n"
 
         self.print_lineno(ln)
 
@@ -490,20 +493,20 @@ class RestFormat(OutputFormat):
         self.lineprefix += "  "
 
         self.output_highlight(purpose)
-        print()
+        self.data += "\n"
 
-        print(".. container:: kernelindent\n")
-        print(f"{self.lineprefix}**Definition**::\n")
+        self.data += ".. container:: kernelindent\n\n"
+        self.data += f"{self.lineprefix}**Definition**::\n\n"
 
         self.lineprefix = self.lineprefix + "  "
 
         declaration = declaration.replace("\t", self.lineprefix)
 
-        print(f"{self.lineprefix}{dtype} {name}" + ' {')
-        print(f"{declaration}{self.lineprefix}" + "};\n")
+        self.data += f"{self.lineprefix}{dtype} {name}" + ' {' + "\n"
+        self.data += f"{declaration}{self.lineprefix}" + "};\n\n"
 
         self.lineprefix = "  "
-        print(f"{self.lineprefix}**Members**\n")
+        self.data += f"{self.lineprefix}**Members**\n\n"
         for parameter in parameterlist:
             if not parameter or parameter.startswith("#"):
                 continue
@@ -515,15 +518,15 @@ class RestFormat(OutputFormat):
 
             self.print_lineno(parameterdesc_start_lines.get(parameter_name, 0))
 
-            print(f"{self.lineprefix}``{parameter}``")
+            self.data += f"{self.lineprefix}``{parameter}``\n"
 
             self.lineprefix = "    "
             self.output_highlight(parameterdescs[parameter_name])
             self.lineprefix = "  "
 
-            print()
+            self.data += "\n"
 
-        print()
+        self.data += "\n"
 
         self.lineprefix = oldprefix
         self.out_section(args)
@@ -576,19 +579,19 @@ class ManFormat(OutputFormat):
             line = Re(r"^\s*").sub("", line)
 
             if line and line[0] == ".":
-                print("\\&" + line)
+                self.data += "\\&" + line + "\n"
             else:
-                print(line)
+                self.data += line + "\n"
 
     def out_doc(self, fname, name, args):
         module = args.get('module')
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        print(f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX')
+        self.data += f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
         for section in sectionlist:
-            print(f'.SH "{section}"')
+            self.data += f'.SH "{section}"' + "\n"
             self.output_highlight(sections.get(section))
 
     def out_function(self, fname, name, args):
@@ -599,16 +602,16 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        print(f'.TH "{args['function']}" 9 "{args['function']}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX')
+        self.data += f'.TH "{args['function']}" 9 "{args['function']}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n"
 
-        print(".SH NAME")
-        print(f"{args['function']} \\- {args['purpose']}")
+        self.data += ".SH NAME\n"
+        self.data += f"{args['function']} \\- {args['purpose']}\n"
 
-        print(".SH SYNOPSIS")
+        self.data += ".SH SYNOPSIS\n"
         if args.get('functiontype', ''):
-            print(f'.B "{args['functiontype']}" {args['function']}')
+            self.data += f'.B "{args['functiontype']}" {args['function']}' + "\n"
         else:
-            print(f'.B "{args['function']}')
+            self.data += f'.B "{args['function']}' + "\n"
 
         count = 0
         parenth = "("
@@ -621,25 +624,25 @@ class ManFormat(OutputFormat):
             dtype = args['parametertypes'].get(parameter, "")
             if function_pointer.match(dtype):
                 # Pointer-to-function
-                print(f'".BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"')
+                self.data += f'".BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"' + "\n"
             else:
                 dtype = Re(r'([^\*])$').sub(r'\1 ', dtype)
 
-                print(f'.BI "{parenth}{dtype}"  "{post}"')
+                self.data += f'.BI "{parenth}{dtype}"  "{post}"' + "\n"
             count += 1
             parenth = ""
 
         if parameterlist:
-            print(".SH ARGUMENTS")
+            self.data += ".SH ARGUMENTS\n"
 
         for parameter in parameterlist:
             parameter_name = re.sub(r'\[.*', '', parameter)
 
-            print(f'.IP "{parameter}" 12')
+            self.data += f'.IP "{parameter}" 12' + "\n"
             self.output_highlight(parameterdescs.get(parameter_name, ""))
 
         for section in sectionlist:
-            print(f'.SH "{section.upper()}"')
+            self.data += f'.SH "{section.upper()}"' + "\n"
             self.output_highlight(sections[section])
 
     def out_enum(self, fname, name, args):
@@ -649,33 +652,33 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        print(f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_date}" "API Manual" LINUX')
+        self.data += f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
-        print(".SH NAME")
-        print(f"enum {args['enum']} \\- {args['purpose']}")
+        self.data += ".SH NAME\n"
+        self.data += f"enum {args['enum']} \\- {args['purpose']}\n"
 
-        print(".SH SYNOPSIS")
-        print(f"enum {args['enum']}" + " {")
+        self.data += ".SH SYNOPSIS\n"
+        self.data += f"enum {args['enum']}" + " {\n"
 
         count = 0
         for parameter in parameterlist:
-            print(f'.br\n.BI "    {parameter}"')
+            self.data += f'.br\n.BI "    {parameter}"' + "\n"
             if count == len(parameterlist) - 1:
-                print("\n};")
+                self.data += "\n};\n"
             else:
-                print(", \n.br")
+                self.data += ", \n.br\n"
 
             count += 1
 
-        print(".SH Constants")
+        self.data += ".SH Constants\n"
 
         for parameter in parameterlist:
             parameter_name = Re(r'\[.*').sub('', parameter)
-            print(f'.IP "{parameter}" 12')
+            self.data += f'.IP "{parameter}" 12' + "\n"
             self.output_highlight(args['parameterdescs'].get(parameter_name, ""))
 
         for section in sectionlist:
-            print(f'.SH "{section}"')
+            self.data += f'.SH "{section}"' + "\n"
             self.output_highlight(sections[section])
 
     def out_typedef(self, fname, name, args):
@@ -685,13 +688,13 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        print(f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual" LINUX')
+        self.data += f'.TH "{module}" 9 "{typedef}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
-        print(".SH NAME")
-        print(f"typedef {typedef} \\- {purpose}")
+        self.data += ".SH NAME\n"
+        self.data += f"typedef {typedef} \\- {purpose}\n"
 
         for section in sectionlist:
-            print(f'.SH "{section}"')
+            self.data += f'.SH "{section}"' + "\n"
             self.output_highlight(sections.get(section))
 
     def out_struct(self, fname, name, args):
@@ -705,20 +708,20 @@ class ManFormat(OutputFormat):
         sections = args.get('sections', {})
         parameterdescs = args.get('parameterdescs', {})
 
-        print(f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_date}" "API Manual" LINUX')
+        self.data += f'.TH "{module}" 9 "{struct_type} {struct_name}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
-        print(".SH NAME")
-        print(f"{struct_type} {struct_name} \\- {purpose}")
+        self.data += ".SH NAME\n"
+        self.data += f"{struct_type} {struct_name} \\- {purpose}\n"
 
         # Replace tabs with two spaces and handle newlines
         declaration = definition.replace("\t", "  ")
         declaration = Re(r"\n").sub('"\n.br\n.BI "', declaration)
 
-        print(".SH SYNOPSIS")
-        print(f"{struct_type} {struct_name} " + "{" + "\n.br")
-        print(f'.BI "{declaration}\n' + "};\n.br\n")
+        self.data += ".SH SYNOPSIS\n"
+        self.data += f"{struct_type} {struct_name} " + "{" + "\n.br\n"
+        self.data += f'.BI "{declaration}\n' + "};\n.br\n\n"
 
-        print(".SH Members")
+        self.data += ".SH Members\n"
         for parameter in parameterlist:
             if parameter.startswith("#"):
                 continue
@@ -728,9 +731,9 @@ class ManFormat(OutputFormat):
             if parameterdescs.get(parameter_name) == KernelDoc.undescribed:
                 continue
 
-            print(f'.IP "{parameter}" 12')
+            self.data += f'.IP "{parameter}" 12' + "\n"
             self.output_highlight(parameterdescs.get(parameter_name))
 
         for section in sectionlist:
-            print(f'.SH "{section}"')
+            self.data += f'.SH "{section}"' + "\n"
             self.output_highlight(sections.get(section))
-- 
2.49.0
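
The conversion in the diff above follows one mechanical rule: print() appends a newline implicitly, so every call that becomes an append to self.data has to add "\n" explicitly, and a print() whose string already ended in "\n" ends in "\n\n". A minimal sketch of that pattern, assuming a hypothetical BufferedFormat class that only stands in for the real output classes:

```python
class BufferedFormat:
    """Hypothetical stand-in for an output class; not the real RestFormat."""

    def __init__(self):
        self.data = ""
        self.lineprefix = "  "

    def out_lines(self, text):
        # Was: print(self.lineprefix + line)
        # The newline that print() added implicitly is now explicit.
        for line in text.strip("\n").split("\n"):
            self.data += self.lineprefix + line + "\n"

    def msg(self, text):
        # Reset the buffer, fill it, and return it to the caller,
        # which decides whether and where to write it.
        self.data = ""
        self.out_lines(text)
        return self.data


fmt = BufferedFormat()
result = fmt.msg("first\nsecond\n")
```

With the text accumulated in a buffer, the caller can write it to stdout or hand it to another consumer.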



* [PATCH v3 12/33] scripts/kernel-doc.py: move file lists to the parser function
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (10 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 11/33] scripts/kernel-doc.py: convert message output to an interactor Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 13/33] scripts/kernel-doc.py: implement support for -no-doc-sections Mauro Carvalho Chehab
                   ` (23 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Instead of setting the file list at __init__ time, pass it to the
actual parsing function. This allows more files to be added for
parsing at run time, by calling the parse function multiple
times.

With this change, the export_files logic was rewritten to
avoid parsing EXPORT_SYMBOL twice for partial matches.

Note that, with this logic, the same file can still be read
twice when export_file is used.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py          |  7 +++----
 scripts/lib/kdoc/kdoc_files.py | 37 ++++++++++++++++------------------
 2 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 63efec4b3f4b..e258a9df7f78 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -274,14 +274,13 @@ def main():
     else:
         out_style = RestFormat()
 
-    kfiles = KernelFiles(files=args.files, verbose=args.verbose,
+    kfiles = KernelFiles(verbose=args.verbose,
                          out_style=out_style, werror=args.werror,
                          wreturn=args.wreturn, wshort_desc=args.wshort_desc,
                          wcontents_before_sections=args.wcontents_before_sections,
-                         modulename=args.modulename,
-                         export_file=args.export_file)
+                         modulename=args.modulename)
 
-    kfiles.parse()
+    kfiles.parse(args.files, export_file=args.export_file)
 
     for t in kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
                           internal=args.internal, symbol=args.symbol,
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index 817ed98b2727..47dab46c89fe 100755
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -124,7 +124,7 @@ class KernelFiles():
         self.config.log.error("Cannot find file %s", fname)
         self.config.errors += 1
 
-    def __init__(self, files=None, verbose=False, out_style=None,
+    def __init__(self, verbose=False, out_style=None,
                  werror=False, wreturn=False, wshort_desc=False,
                  wcontents_before_sections=False,
                  logger=None, modulename=None, export_file=None):
@@ -181,51 +181,48 @@ class KernelFiles():
         self.config.src_tree = os.environ.get("SRCTREE", None)
 
         self.out_style = out_style
-        self.export_file = export_file
 
         # Initialize internal variables
 
         self.config.errors = 0
         self.results = []
 
-        self.file_list = files
         self.files = set()
+        self.export_files = set()
 
-    def parse(self):
+    def parse(self, file_list, export_file=None):
         """
         Parse all files
         """
 
         glob = GlobSourceFiles(srctree=self.config.src_tree)
 
-        # Let's use a set here to avoid duplicating files
+        # Prevent parsing the same file twice to speedup parsing and
+        # avoid reporting errors multiple times
 
-        for fname in glob.parse_files(self.file_list, self.file_not_found_cb):
+        for fname in glob.parse_files(file_list, self.file_not_found_cb):
             if fname in self.files:
                 continue
 
-            self.files.add(fname)
-
             res = self.parse_file(fname)
+
             self.results.append((res.fname, res.entries))
-
-        if not self.files:
-            sys.exit(1)
+            self.files.add(fname)
 
         # If a list of export files was provided, parse EXPORT_SYMBOL*
-        # from the ones not already parsed
+        # from files that weren't fully parsed
 
-        if self.export_file:
-            files = self.files
+        if not export_file:
+            return
 
-            glob = GlobSourceFiles(srctree=self.config.src_tree)
+        self.export_files |= self.files
 
-            for fname in glob.parse_files(self.export_file,
-                                          self.file_not_found_cb):
-                if fname not in files:
-                    files.add(fname)
+        glob = GlobSourceFiles(srctree=self.config.src_tree)
 
-                    self.process_export_file(fname)
+        for fname in glob.parse_files(export_file, self.file_not_found_cb):
+            if fname not in self.export_files:
+                self.process_export_file(fname)
+                self.export_files.add(fname)
 
     def out_msg(self, fname, name, arg):
         """
-- 
2.49.0
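
The bookkeeping in the new parse() can be sketched in isolation. This is only an illustration under simplifying assumptions: FileTracker and its exports_scanned list are hypothetical names, and the real code globs file names and parses their contents where this sketch just records them.

```python
class FileTracker:
    """Hypothetical reduction of the set handling in KernelFiles.parse()."""

    def __init__(self):
        self.files = set()          # files that were fully parsed
        self.export_files = set()   # files already scanned for exports
        self.results = []
        self.exports_scanned = []

    def parse(self, file_list, export_file=None):
        for fname in file_list:
            # Skip files parsed by an earlier call
            if fname in self.files:
                continue
            self.results.append(fname)
            self.files.add(fname)

        if not export_file:
            return

        # Fully parsed files need no separate export scan
        self.export_files |= self.files

        for fname in export_file:
            if fname not in self.export_files:
                self.exports_scanned.append(fname)
                self.export_files.add(fname)


tracker = FileTracker()
tracker.parse(["a.c", "b.c"])
tracker.parse(["b.c", "c.c"], export_file=["a.c", "d.c"])
```

The second call skips b.c, and only d.c is scanned for exports, because a.c was already fully parsed.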



* [PATCH v3 13/33] scripts/kernel-doc.py: implement support for -no-doc-sections
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (11 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 12/33] scripts/kernel-doc.py: move file lists to the parser function Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 14/33] scripts/kernel-doc.py: fix line number output Mauro Carvalho Chehab
                   ` (22 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

The venerable kernel-doc Perl script has a number of options that
aren't properly documented. Among them, there is -no-doc-sections,
which is used by the Sphinx extension.

Implement support for it.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py           | 8 ++++++--
 scripts/lib/kdoc/kdoc_files.py  | 5 +++--
 scripts/lib/kdoc/kdoc_output.py | 7 ++++++-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index e258a9df7f78..90aacd17499a 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -239,10 +239,13 @@ def main():
     sel_mut.add_argument("-s", "-function", "--symbol", action='append',
                          help=FUNCTION_DESC)
 
-    # This one is valid for all 3 types of filter
+    # Those are valid for all 3 types of filter
     parser.add_argument("-n", "-nosymbol", "--nosymbol", action='append',
                          help=NOSYMBOL_DESC)
 
+    parser.add_argument("-D", "-no-doc-sections", "--no-doc-sections",
+                        action='store_true', help="Don't output DOC sections")
+
     parser.add_argument("files", metavar="FILE",
                         nargs="+", help=FILES_DESC)
 
@@ -284,7 +287,8 @@ def main():
 
     for t in kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
                           internal=args.internal, symbol=args.symbol,
-                          nosymbol=args.nosymbol):
+                          nosymbol=args.nosymbol,
+                          no_doc_sections=args.no_doc_sections):
         msg = t[1]
         if msg:
             print(msg)
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index 47dab46c89fe..4c04546a74fe 100755
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -238,7 +238,7 @@ class KernelFiles():
         return self.out_style.msg(fname, name, arg)
 
     def msg(self, enable_lineno=False, export=False, internal=False,
-            symbol=None, nosymbol=None):
+            symbol=None, nosymbol=None, no_doc_sections=False):
         """
         Interacts over the kernel-doc results and output messages,
         returning kernel-doc markups on each interaction
@@ -257,7 +257,8 @@ class KernelFiles():
         self.out_style.set_config(self.config)
 
         self.out_style.set_filter(export, internal, symbol, nosymbol,
-                                  function_table, enable_lineno)
+                                  function_table, enable_lineno,
+                                  no_doc_sections)
 
         for fname, arg_tuple in self.results:
             msg = ""
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index fda07049ecf7..a246d213523c 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -70,6 +70,7 @@ class OutputFormat:
         self.symbol = None
         self.function_table = set()
         self.config = None
+        self.no_doc_sections = False
 
         self.data = ""
 
@@ -77,7 +78,7 @@ class OutputFormat:
         self.config = config
 
     def set_filter(self, export, internal, symbol, nosymbol, function_table,
-                   enable_lineno):
+                   enable_lineno, no_doc_sections):
         """
         Initialize filter variables according with the requested mode.
 
@@ -87,6 +88,7 @@ class OutputFormat:
         """
 
         self.enable_lineno = enable_lineno
+        self.no_doc_sections = no_doc_sections
 
         if symbol:
             self.out_mode = self.OUTPUT_INCLUDE
@@ -117,6 +119,9 @@ class OutputFormat:
     def check_doc(self, name):
         """Check if DOC should be output"""
 
+        if self.no_doc_sections:
+            return False
+
         if self.out_mode == self.OUTPUT_ALL:
             return True
 
-- 
2.49.0
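
A rough sketch of how the new flag flows from the command line to the DOC check. The add_argument() call mirrors the one in the diff; the stand-alone check_doc() function and its output_all parameter are simplifications assumed for illustration, not the method in kdoc_output.py.

```python
import argparse

parser = argparse.ArgumentParser()

# argparse accepts the Perl-style single-dash long spelling as well;
# the destination name is taken from the "--" form: no_doc_sections
parser.add_argument("-D", "-no-doc-sections", "--no-doc-sections",
                    action="store_true", help="Don't output DOC sections")
parser.add_argument("files", metavar="FILE", nargs="+")


def check_doc(no_doc_sections, output_all=True):
    """Simplified check: the flag takes priority over any other mode."""
    if no_doc_sections:
        return False
    return output_all


args = parser.parse_args(["-no-doc-sections", "file.c"])
default_args = parser.parse_args(["file.c"])
```

Checking the flag before any output-mode test, as the patch does, makes it suppress DOC sections whichever filter is active.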



* [PATCH v3 14/33] scripts/kernel-doc.py: fix line number output
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (12 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 13/33] scripts/kernel-doc.py: implement support for -no-doc-sections Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 15/33] scripts/kernel-doc.py: fix handling of doc output check Mauro Carvalho Chehab
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

With the Python version, the actual output happens after parsing,
from records stored in self.entries.

Ensure that line numbers are properly stored there and
that they produce the desired results in the ReST output.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/lib/kdoc/kdoc_output.py | 13 +++++++------
 scripts/lib/kdoc/kdoc_parser.py | 21 +++++++++++++++++----
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index a246d213523c..6a7187980bec 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -255,7 +255,8 @@ class RestFormat(OutputFormat):
     def print_lineno(self, ln):
         """Outputs a line number"""
 
-        if self.enable_lineno and ln:
+        if self.enable_lineno and ln is not None:
+            ln += 1
             self.data += f".. LINENO {ln}\n"
 
     def output_highlight(self, args):
@@ -358,7 +359,7 @@ class RestFormat(OutputFormat):
         parameterdescs = args.get('parameterdescs', {})
         parameterdesc_start_lines = args.get('parameterdesc_start_lines', {})
 
-        ln = args.get('ln', 0)
+        ln = args.get('declaration_start_line', 0)
 
         count = 0
         for parameter in parameterlist:
@@ -375,11 +376,11 @@ class RestFormat(OutputFormat):
         if not func_macro:
             signature += ")"
 
+        self.print_lineno(ln)
         if args.get('typedef') or not args.get('functiontype'):
             self.data += f".. c:macro:: {args['function']}\n\n"
 
             if args.get('typedef'):
-                self.print_lineno(ln)
                 self.data += "   **Typedef**: "
                 self.lineprefix = ""
                 self.output_highlight(args.get('purpose', ""))
@@ -434,7 +435,7 @@ class RestFormat(OutputFormat):
         name = args.get('enum', '')
         parameterlist = args.get('parameterlist', [])
         parameterdescs = args.get('parameterdescs', {})
-        ln = args.get('ln', 0)
+        ln = args.get('declaration_start_line', 0)
 
         self.data += f"\n\n.. c:enum:: {name}\n\n"
 
@@ -464,7 +465,7 @@ class RestFormat(OutputFormat):
 
         oldprefix = self.lineprefix
         name = args.get('typedef', '')
-        ln = args.get('ln', 0)
+        ln = args.get('declaration_start_line', 0)
 
         self.data += f"\n\n.. c:type:: {name}\n\n"
 
@@ -484,7 +485,7 @@ class RestFormat(OutputFormat):
         purpose = args.get('purpose', "")
         declaration = args.get('definition', "")
         dtype = args.get('type', "struct")
-        ln = args.get('ln', 0)
+        ln = args.get('declaration_start_line', 0)
 
         parameterlist = args.get('parameterlist', [])
         parameterdescs = args.get('parameterdescs', {})
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index 3ce116595546..e8c86448d6b5 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -276,7 +276,7 @@ class KernelDoc:
         self.entry.brcount = 0
 
         self.entry.in_doc_sect = False
-        self.entry.declaration_start_line = ln
+        self.entry.declaration_start_line = ln + 1
 
     def push_parameter(self, ln, decl_type, param, dtype,
                        org_arg, declaration_name):
@@ -806,8 +806,10 @@ class KernelDoc:
                                 parameterlist=self.entry.parameterlist,
                                 parameterdescs=self.entry.parameterdescs,
                                 parametertypes=self.entry.parametertypes,
+                                parameterdesc_start_lines=self.entry.parameterdesc_start_lines,
                                 sectionlist=self.entry.sectionlist,
                                 sections=self.entry.sections,
+                                section_start_lines=self.entry.section_start_lines,
                                 purpose=self.entry.declaration_purpose)
 
     def dump_enum(self, ln, proto):
@@ -882,8 +884,10 @@ class KernelDoc:
                                 module=self.config.modulename,
                                 parameterlist=self.entry.parameterlist,
                                 parameterdescs=self.entry.parameterdescs,
+                                parameterdesc_start_lines=self.entry.parameterdesc_start_lines,
                                 sectionlist=self.entry.sectionlist,
                                 sections=self.entry.sections,
+                                section_start_lines=self.entry.section_start_lines,
                                 purpose=self.entry.declaration_purpose)
 
     def dump_declaration(self, ln, prototype):
@@ -1054,8 +1058,10 @@ class KernelDoc:
                                     parameterlist=self.entry.parameterlist,
                                     parameterdescs=self.entry.parameterdescs,
                                     parametertypes=self.entry.parametertypes,
+                                    parameterdesc_start_lines=self.entry.parameterdesc_start_lines,
                                     sectionlist=self.entry.sectionlist,
                                     sections=self.entry.sections,
+                                    section_start_lines=self.entry.section_start_lines,
                                     purpose=self.entry.declaration_purpose,
                                     func_macro=func_macro)
         else:
@@ -1067,8 +1073,10 @@ class KernelDoc:
                                     parameterlist=self.entry.parameterlist,
                                     parameterdescs=self.entry.parameterdescs,
                                     parametertypes=self.entry.parametertypes,
+                                    parameterdesc_start_lines=self.entry.parameterdesc_start_lines,
                                     sectionlist=self.entry.sectionlist,
                                     sections=self.entry.sections,
+                                    section_start_lines=self.entry.section_start_lines,
                                     purpose=self.entry.declaration_purpose,
                                     func_macro=func_macro)
 
@@ -1112,8 +1120,10 @@ class KernelDoc:
                                     parameterlist=self.entry.parameterlist,
                                     parameterdescs=self.entry.parameterdescs,
                                     parametertypes=self.entry.parametertypes,
+                                    parameterdesc_start_lines=self.entry.parameterdesc_start_lines,
                                     sectionlist=self.entry.sectionlist,
                                     sections=self.entry.sections,
+                                    section_start_lines=self.entry.section_start_lines,
                                     purpose=self.entry.declaration_purpose)
             return
 
@@ -1136,6 +1146,7 @@ class KernelDoc:
                                     module=self.entry.modulename,
                                     sectionlist=self.entry.sectionlist,
                                     sections=self.entry.sections,
+                                    section_start_lines=self.entry.section_start_lines,
                                     purpose=self.entry.declaration_purpose)
             return
 
@@ -1168,7 +1179,7 @@ class KernelDoc:
             return
 
         # start a new entry
-        self.reset_state(ln + 1)
+        self.reset_state(ln)
         self.entry.in_doc_sect = False
 
         # next line is always the function name
@@ -1281,7 +1292,7 @@ class KernelDoc:
             if r.match(line):
                 self.dump_section()
                 self.entry.section = self.section_default
-                self.entry.new_start_line = line
+                self.entry.new_start_line = ln
                 self.entry.contents = ""
 
         if doc_sect.search(line):
@@ -1619,7 +1630,9 @@ class KernelDoc:
             self.dump_section()
             self.output_declaration("doc", None,
                                     sectionlist=self.entry.sectionlist,
-                                    sections=self.entry.sections, module=self.config.modulename)
+                                    sections=self.entry.sections,
+                                    section_start_lines=self.entry.section_start_lines,
+                                    module=self.config.modulename)
             self.reset_state(ln)
 
         elif doc_content.search(line):
-- 
2.49.0



* [PATCH v3 15/33] scripts/kernel-doc.py: fix handling of doc output check
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (13 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 14/33] scripts/kernel-doc.py: fix line number output Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 16/33] scripts/kernel-doc.py: properly handle out_section for ReST Mauro Carvalho Chehab
                   ` (20 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

The filtering logic was looking for the DOC name when checking
symbols, but that name is stored only inside a section. Pass it
to output_declaration, as it is quicker and easier to check
the declaration name than to look inside each section.

While here, make sure that the output for both ReST and man
after filtering is similar to what the Perl version of
kernel-doc produces.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/lib/kdoc/kdoc_output.py | 29 ++++++++++++-----------------
 scripts/lib/kdoc/kdoc_parser.py |  3 ++-
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index 6a7187980bec..7a945dd80c9b 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -122,13 +122,13 @@ class OutputFormat:
         if self.no_doc_sections:
             return False
 
+        if name in self.nosymbol:
+            return False
+
         if self.out_mode == self.OUTPUT_ALL:
             return True
 
         if self.out_mode == self.OUTPUT_INCLUDE:
-            if name in self.nosymbol:
-                return False
-
             if name in self.function_table:
                 return True
 
@@ -154,15 +154,6 @@ class OutputFormat:
 
         return False
 
-    def check_function(self, fname, name, args):
-        return True
-
-    def check_enum(self, fname, name, args):
-        return True
-
-    def check_typedef(self, fname, name, args):
-        return True
-
     def msg(self, fname, name, args):
         self.data = ""
 
@@ -306,7 +297,7 @@ class RestFormat(OutputFormat):
         for line in output.strip("\n").split("\n"):
             self.data += self.lineprefix + line + "\n"
 
-    def out_section(self, args, out_reference=False):
+    def out_section(self, args, out_docblock=False):
         """
         Outputs a block section.
 
@@ -325,7 +316,7 @@ class RestFormat(OutputFormat):
                 continue
 
             if not self.out_mode == self.OUTPUT_INCLUDE:
-                if out_reference:
+                if out_docblock:
                     self.data += f".. _{section}:\n\n"
 
                 if not self.symbol:
@@ -339,8 +330,7 @@ class RestFormat(OutputFormat):
     def out_doc(self, fname, name, args):
         if not self.check_doc(name):
             return
-
-        self.out_section(args, out_reference=True)
+        self.out_section(args, out_docblock=True)
 
     def out_function(self, fname, name, args):
 
@@ -583,8 +573,10 @@ class ManFormat(OutputFormat):
 
         for line in contents.strip("\n").split("\n"):
             line = Re(r"^\s*").sub("", line)
+            if not line:
+                continue
 
-            if line and line[0] == ".":
+            if line[0] == ".":
                 self.data += "\\&" + line + "\n"
             else:
                 self.data += line + "\n"
@@ -594,6 +586,9 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
+        if not self.check_doc(name):
+                return
+
         self.data += f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
         for section in sectionlist:
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index e8c86448d6b5..74b311c8184c 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -1198,6 +1198,7 @@ class KernelDoc:
             else:
                 self.entry.section = doc_block.group(1)
 
+            self.entry.identifier = self.entry.section
             self.state = self.STATE_DOCBLOCK
             return
 
@@ -1628,7 +1629,7 @@ class KernelDoc:
 
         if doc_end.search(line):
             self.dump_section()
-            self.output_declaration("doc", None,
+            self.output_declaration("doc", self.entry.identifier,
                                     sectionlist=self.entry.sectionlist,
                                     sections=self.entry.sections,
                                     section_start_lines=self.entry.section_start_lines,
-- 
2.49.0
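
The DOC filtering order described in patch 15/33 can be sketched as a
standalone function. This is only an illustration that mirrors the order of
checks in the diff: the free function, its keyword arguments and the
module-level constants are assumptions made for the example, while the real
code is a method of OutputFormat that reads the same values from self.

```python
# Illustrative stand-ins for OutputFormat.OUTPUT_ALL / OUTPUT_INCLUDE
OUTPUT_ALL = 0
OUTPUT_INCLUDE = 1


def check_doc(name, out_mode, nosymbol=(), function_table=(),
              no_doc_sections=False):
    """Return True if the DOC block called `name` should be output."""
    if no_doc_sections:
        return False

    # After the patch, the exclusion list is consulted for every mode,
    # not only for OUTPUT_INCLUDE.
    if name in nosymbol:
        return False

    if out_mode == OUTPUT_ALL:
        return True

    if out_mode == OUTPUT_INCLUDE:
        return name in function_table

    return False
```

With this ordering, a DOC block listed in nosymbol is dropped even when all
symbols are requested, which is what moving the check out of the
OUTPUT_INCLUDE branch achieves.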



* [PATCH v3 16/33] scripts/kernel-doc.py: properly handle out_section for ReST
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (14 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 15/33] scripts/kernel-doc.py: fix handling of doc output check Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 17/33] scripts/kernel-doc.py: postpone warnings to the output plugin Mauro Carvalho Chehab
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

DOC sections are output differently in include mode. Handle that
difference properly.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/lib/kdoc/kdoc_output.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index 7a945dd80c9b..d0c8cedb0ea5 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -315,12 +315,12 @@ class RestFormat(OutputFormat):
             if section in self.nosymbol:
                 continue
 
-            if not self.out_mode == self.OUTPUT_INCLUDE:
-                if out_docblock:
+            if out_docblock:
+                if not self.out_mode == self.OUTPUT_INCLUDE:
                     self.data += f".. _{section}:\n\n"
-
-                if not self.symbol:
                     self.data += f'{self.lineprefix}**{section}**\n\n'
+            else:
+                self.data += f'{self.lineprefix}**{section}**\n\n'
 
             self.print_lineno(section_start_lines.get(section, 0))
             self.output_highlight(sections[section])
-- 
2.49.0
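
The effect of patch 16/33 on the section heading can be summarised with a
small helper. This is a sketch under assumptions: section_heading() and its
include_mode flag are hypothetical names used only here, and the real method
appends to self.data instead of returning a string.

```python
def section_heading(section, out_docblock, include_mode, lineprefix=""):
    """Return the ReST heading text emitted before a section body."""
    if out_docblock:
        # DOC blocks: label plus bold title, except in include mode,
        # where no heading is emitted at all.
        if include_mode:
            return ""
        return f".. _{section}:\n\n{lineprefix}**{section}**\n\n"

    # Sections of functions, structs, etc. always get the bold title.
    return f"{lineprefix}**{section}**\n\n"
```

The three cases then read: a DOC block outside include mode gets a label and
a title, a DOC block in include mode gets neither, and any other section
gets only the title.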



* [PATCH v3 17/33] scripts/kernel-doc.py: postpone warnings to the output plugin
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (15 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 16/33] scripts/kernel-doc.py: properly handle out_section for ReST Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 18/33] docs: add a .pylintrc file with sys path for docs scripts Mauro Carvalho Chehab
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

We don't want warnings displayed for symbols that weren't
output. So, postpone printing warnings to the output
plugin, where symbol output is validated.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/lib/kdoc/kdoc_output.py | 24 +++++++++++++++----
 scripts/lib/kdoc/kdoc_parser.py | 41 ++++++++++++++++-----------------
 2 files changed, 39 insertions(+), 26 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index d0c8cedb0ea5..6582d1f64d1e 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -116,7 +116,16 @@ class OutputFormat:
 
         return block
 
-    def check_doc(self, name):
+    def out_warnings(self, args):
+        warnings = args.get('warnings', [])
+
+        for warning, log_msg in warnings:
+            if warning:
+                self.config.log.warning(log_msg)
+            else:
+                self.config.log.info(log_msg)
+
+    def check_doc(self, name, args):
         """Check if DOC should be output"""
 
         if self.no_doc_sections:
@@ -126,19 +135,22 @@ class OutputFormat:
             return False
 
         if self.out_mode == self.OUTPUT_ALL:
+            self.out_warnings(args)
             return True
 
         if self.out_mode == self.OUTPUT_INCLUDE:
             if name in self.function_table:
+                self.out_warnings(args)
                 return True
 
         return False
 
-    def check_declaration(self, dtype, name):
+    def check_declaration(self, dtype, name, args):
         if name in self.nosymbol:
             return False
 
         if self.out_mode == self.OUTPUT_ALL:
+            self.out_warnings(args)
             return True
 
         if self.out_mode in [self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED]:
@@ -147,9 +159,11 @@ class OutputFormat:
 
         if self.out_mode == self.OUTPUT_INTERNAL:
             if dtype != "function":
+                self.out_warnings(args)
                 return True
 
             if name not in self.function_table:
+                self.out_warnings(args)
                 return True
 
         return False
@@ -163,7 +177,7 @@ class OutputFormat:
             self.out_doc(fname, name, args)
             return self.data
 
-        if not self.check_declaration(dtype, name):
+        if not self.check_declaration(dtype, name, args):
             return self.data
 
         if dtype == "function":
@@ -328,7 +342,7 @@ class RestFormat(OutputFormat):
         self.data += "\n"
 
     def out_doc(self, fname, name, args):
-        if not self.check_doc(name):
+        if not self.check_doc(name, args):
             return
         self.out_section(args, out_docblock=True)
 
@@ -586,7 +600,7 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        if not self.check_doc(name):
+        if not self.check_doc(name, args):
                 return
 
         self.data += f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX' + "\n"
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index 74b311c8184c..3698ef625367 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -131,23 +131,23 @@ class KernelDoc:
         # Place all potential outputs into an array
         self.entries = []
 
-    def show_warnings(self, dtype, declaration_name):  # pylint: disable=W0613
-        """
-        Allow filtering out warnings
-        """
-
-        # TODO: implement it
-
-        return True
-
     # TODO: rename to emit_message
     def emit_warning(self, ln, msg, warning=True):
         """Emit a message"""
 
+        log_msg = f"{self.fname}:{ln} {msg}"
+
+        if self.entry:
+            # Delegate warning output to output logic, as this way it
+            # will report warnings/info only for symbols that are output
+
+            self.entry.warnings.append((warning, log_msg))
+            return
+
         if warning:
-            self.config.log.warning("%s:%d %s", self.fname, ln, msg)
+            self.config.log.warning(log_msg)
         else:
-            self.config.log.info("%s:%d %s", self.fname, ln, msg)
+            self.config.log.info(log_msg)
 
     def dump_section(self, start_new=True):
         """
@@ -221,10 +221,9 @@ class KernelDoc:
         # For now, we're keeping the same name of the function just to make
         # easier to compare the source code of both scripts
 
-        if "declaration_start_line" not in args:
-            args["declaration_start_line"] = self.entry.declaration_start_line
-
+        args["declaration_start_line"] = self.entry.declaration_start_line
         args["type"] = dtype
+        args["warnings"] = self.entry.warnings
 
         # TODO: use colletions.OrderedDict
 
@@ -257,6 +256,8 @@ class KernelDoc:
         self.entry.struct_actual = ""
         self.entry.prototype = ""
 
+        self.entry.warnings = []
+
         self.entry.parameterlist = []
         self.entry.parameterdescs = {}
         self.entry.parametertypes = {}
@@ -328,7 +329,7 @@ class KernelDoc:
         if param not in self.entry.parameterdescs and not param.startswith("#"):
             self.entry.parameterdescs[param] = self.undescribed
 
-            if self.show_warnings(dtype, declaration_name) and "." not in param:
+            if "." not in param:
                 if decl_type == 'function':
                     dname = f"{decl_type} parameter"
                 else:
@@ -868,16 +869,14 @@ class KernelDoc:
             self.entry.parameterlist.append(arg)
             if arg not in self.entry.parameterdescs:
                 self.entry.parameterdescs[arg] = self.undescribed
-                if self.show_warnings("enum", declaration_name):
-                    self.emit_warning(ln,
-                                      f"Enum value '{arg}' not described in enum '{declaration_name}'")
+                self.emit_warning(ln,
+                                  f"Enum value '{arg}' not described in enum '{declaration_name}'")
             member_set.add(arg)
 
         for k in self.entry.parameterdescs:
             if k not in member_set:
-                if self.show_warnings("enum", declaration_name):
-                    self.emit_warning(ln,
-                                      f"Excess enum value '%{k}' description in '{declaration_name}'")
+                self.emit_warning(ln,
+                                  f"Excess enum value '%{k}' description in '{declaration_name}'")
 
         self.output_declaration('enum', declaration_name,
                                 enum=declaration_name,
-- 
2.49.0
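
The deferral scheme from patch 17/33 can be illustrated in isolation: the
parser queues (is_warning, message) tuples on the current entry, and they are
logged only once the entry passes the output filter. The Entry class,
flush_if_output() and ListHandler below are hypothetical names for this
sketch; the real code keeps the list in self.entry.warnings and flushes it
from OutputFormat.out_warnings().

```python
import logging


class ListHandler(logging.Handler):
    """Collects log messages in a list so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


class Entry:
    """Minimal stand-in for a parsed kernel-doc entry."""

    def __init__(self, name):
        self.name = name
        self.warnings = []          # queued (is_warning, message) tuples


def emit_warning(entry, log, fname, ln, msg, warning=True):
    """Queue the message on the entry, or log at once if there is none."""
    log_msg = f"{fname}:{ln} {msg}"
    if entry is not None:
        entry.warnings.append((warning, log_msg))
        return
    if warning:
        log.warning(log_msg)
    else:
        log.info(log_msg)


def flush_if_output(entry, log, wanted):
    """Log queued messages only when the entry is actually output."""
    if entry.name not in wanted:
        return False
    for warning, log_msg in entry.warnings:
        if warning:
            log.warning(log_msg)
        else:
            log.info(log_msg)
    return True


log = logging.getLogger("kdoc-sketch")
log.setLevel(logging.INFO)
log.propagate = False
handler = ListHandler()
log.addHandler(handler)

kept, dropped = Entry("foo"), Entry("bar")
emit_warning(kept, log, "a.c", 10, "parameter 'x' not described")
emit_warning(dropped, log, "a.c", 42, "parameter 'y' not described")

# Only "foo" was requested, so only its warning reaches the log.
flush_if_output(kept, log, {"foo"})
flush_if_output(dropped, log, {"foo"})
```

After the two flush calls, handler.messages holds just the message queued for
"foo"; the one for "bar" is silently discarded together with the entry.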



* [PATCH v3 18/33] docs: add a .pylintrc file with sys path for docs scripts
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (16 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 17/33] scripts/kernel-doc.py: postpone warnings to the output plugin Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 19/33] docs: sphinx: kerneldoc: verbose kernel-doc command if V=1 Mauro Carvalho Chehab
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

The docs scripts used by Documentation/sphinx keep the classes
shared by both extensions and scripts under the scripts/lib/*
directories.

When pylint is used, it needs to know where those modules
are, otherwise it will bail out. Add a simple RC file that
adds their location to sys.path.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 .pylintrc | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 .pylintrc

diff --git a/.pylintrc b/.pylintrc
new file mode 100644
index 000000000000..30b8ae1659f8
--- /dev/null
+++ b/.pylintrc
@@ -0,0 +1,2 @@
+[MASTER]
+init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi"]'
-- 
2.49.0



* [PATCH v3 19/33] docs: sphinx: kerneldoc: verbose kernel-doc command if V=1
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (17 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 18/33] docs: add a .pylintrc file with sys path for docs scripts Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 20/33] docs: sphinx: kerneldoc: ignore "\" characters from options Mauro Carvalho Chehab
                   ` (16 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

It is useful to know which kernel-doc command was used at
document build time, as it allows one to check the output the same
way the Sphinx extension does.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kerneldoc.py | 34 +++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/Documentation/sphinx/kerneldoc.py b/Documentation/sphinx/kerneldoc.py
index 39ddae6ae7dd..d206eb2be10a 100644
--- a/Documentation/sphinx/kerneldoc.py
+++ b/Documentation/sphinx/kerneldoc.py
@@ -43,6 +43,29 @@ from sphinx.util import logging
 
 __version__  = '1.0'
 
+def cmd_str(cmd):
+    """
+    Helper function to output a command line that can be used to produce
+    the same records via command line. Helpful to debug troubles at the
+    script.
+    """
+
+    cmd_line = ""
+
+    for w in cmd:
+        if w == "" or " " in w:
+            esc_cmd = "'" + w + "'"
+        else:
+            esc_cmd = w
+
+        if cmd_line:
+            cmd_line += " " + esc_cmd
+            continue
+        else:
+            cmd_line = esc_cmd
+
+    return cmd_line
+
 class KernelDocDirective(Directive):
     """Extract kernel-doc comments from the specified file"""
     required_argument = 1
@@ -57,6 +80,7 @@ class KernelDocDirective(Directive):
     }
     has_content = False
     logger = logging.getLogger('kerneldoc')
+    verbose = 0
 
     def run(self):
         env = self.state.document.settings.env
@@ -65,6 +89,13 @@ class KernelDocDirective(Directive):
         filename = env.config.kerneldoc_srctree + '/' + self.arguments[0]
         export_file_patterns = []
 
+        verbose = os.environ.get("V")
+        if verbose:
+            try:
+                self.verbose = int(verbose)
+            except ValueError:
+                pass
+
         # Tell sphinx of the dependency
         env.note_dependency(os.path.abspath(filename))
 
@@ -104,6 +135,9 @@ class KernelDocDirective(Directive):
 
         cmd += [filename]
 
+        if self.verbose >= 1:
+            print(cmd_str(cmd))
+
         try:
             self.logger.verbose("calling kernel-doc '%s'" % (" ".join(cmd)))
 
-- 
2.49.0
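
For comparison with the cmd_str() helper added by patch 19/33, which quotes
only empty words and words containing a space, the standard library offers
shlex.join() (Python 3.8 and later). The sketch below is an alternative, not
what the patch does; cmd_str_shlex() and the sample command are made up for
the example.

```python
import shlex


def cmd_str_shlex(cmd):
    """Return a shell-safe command line for a list of arguments."""
    # shlex.join() applies shlex.quote() to every word, so embedded
    # quotes and shell metacharacters are escaped as well as spaces.
    return shlex.join(cmd)


cmd = ["scripts/kernel-doc.py", "-rst", "-function", "it's odd", ""]
line = cmd_str_shlex(cmd)

# The printed line can be split back into the original argument list.
assert shlex.split(line) == cmd
```

The round trip through shlex.split() is the property that matters when the
printed line is meant to be pasted into a shell to reproduce a build step.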



* [PATCH v3 20/33] docs: sphinx: kerneldoc: ignore "\" characters from options
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (18 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 19/33] docs: sphinx: kerneldoc: verbose kernel-doc command if V=1 Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 21/33] docs: sphinx: kerneldoc: use kernel-doc.py script Mauro Carvalho Chehab
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel

Documentation/driver-api/infiniband.rst has a kernel-doc tag
with "\" characters at the end:

	.. kernel-doc:: drivers/infiniband/ulp/iser/iscsi_iser.c
	   :functions: iscsi_iser_pdu_alloc iser_initialize_task_headers \
	        iscsi_iser_task_init iscsi_iser_mtask_xmit iscsi_iser_task_xmit \
	        iscsi_iser_cleanup_task iscsi_iser_check_protection \
	        iscsi_iser_conn_create iscsi_iser_conn_bind \
	        iscsi_iser_conn_start iscsi_iser_conn_stop \
	        iscsi_iser_session_destroy iscsi_iser_session_create \
	        iscsi_iser_set_param iscsi_iser_ep_connect iscsi_iser_ep_poll \
	        iscsi_iser_ep_disconnect

This is not handled well, as the "\" strings are simply stored inside
the Sphinx options.

While the actual problem deserves to be fixed, it is better to relax the
kerneldoc.py extension to silently strip "\" from the end of strings,
as otherwise this may cause trouble when preparing the arguments
passed to kernel-doc.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/sphinx/kerneldoc.py | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/sphinx/kerneldoc.py b/Documentation/sphinx/kerneldoc.py
index d206eb2be10a..344789ed9ea2 100644
--- a/Documentation/sphinx/kerneldoc.py
+++ b/Documentation/sphinx/kerneldoc.py
@@ -118,6 +118,10 @@ class KernelDocDirective(Directive):
             identifiers = self.options.get('identifiers').split()
             if identifiers:
                 for i in identifiers:
+                    i = i.rstrip("\\").strip()
+                    if not i:
+                        continue
+
                     cmd += ['-function', i]
             else:
                 cmd += ['-no-doc-sections']
@@ -126,9 +130,17 @@ class KernelDocDirective(Directive):
             no_identifiers = self.options.get('no-identifiers').split()
             if no_identifiers:
                 for i in no_identifiers:
+                    i = i.rstrip("\\").strip()
+                    if not i:
+                        continue
+
                     cmd += ['-nosymbol', i]
 
         for pattern in export_file_patterns:
+            pattern = pattern.rstrip("\\").strip()
+            if not pattern:
+                continue
+
             for f in glob.glob(env.config.kerneldoc_srctree + '/' + pattern):
                 env.note_dependency(os.path.abspath(f))
                 cmd += ['-export-file', f]
-- 
2.49.0
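
The stripping rule from patch 20/33 can be tried outside Sphinx. The helper
name clean_identifiers() and the shortened option value are assumptions for
this sketch; the rstrip/strip/skip sequence is the one the patch applies to
each word.

```python
def clean_identifiers(option_value):
    """Split an option value and drop trailing-backslash residue."""
    cleaned = []
    for word in option_value.split():
        # A lone "\" used as a line continuation becomes an empty
        # string here and is skipped.
        word = word.rstrip("\\").strip()
        if not word:
            continue
        cleaned.append(word)
    return cleaned


value = ("iscsi_iser_pdu_alloc iser_initialize_task_headers \\\n"
         "        iscsi_iser_task_init")
names = clean_identifiers(value)
```

Here split() yields the stray "\" as a word of its own, and without the
filter it would be passed to kernel-doc as a -function argument.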



* [PATCH v3 21/33] docs: sphinx: kerneldoc: use kernel-doc.py script
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (19 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 20/33] docs: sphinx: kerneldoc: ignore "\" characters from options Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 22/33] scripts/kernel-doc.py: Set an output format for --none Mauro Carvalho Chehab
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Switch to the new version when producing documentation.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/Makefile | 2 +-
 Documentation/conf.py  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/Makefile b/Documentation/Makefile
index 63094646df28..c022b97c487e 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -60,7 +60,7 @@ endif #HAVE_LATEXMK
 # Internal variables.
 PAPEROPT_a4     = -D latex_paper_size=a4
 PAPEROPT_letter = -D latex_paper_size=letter
-KERNELDOC       = $(srctree)/scripts/kernel-doc
+KERNELDOC       = $(srctree)/scripts/kernel-doc.py
 KERNELDOC_CONF  = -D kerneldoc_srctree=$(srctree) -D kerneldoc_bin=$(KERNELDOC)
 ALLSPHINXOPTS   =  $(KERNELDOC_CONF) $(PAPEROPT_$(PAPER)) $(SPHINXOPTS)
 ifneq ($(wildcard $(srctree)/.config),)
diff --git a/Documentation/conf.py b/Documentation/conf.py
index 3dad1f90b098..b126f6760b5f 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -540,7 +540,7 @@ pdf_documents = [
 # kernel-doc extension configuration for running Sphinx directly (e.g. by Read
 # the Docs). In a normal build, these are supplied from the Makefile via command
 # line arguments.
-kerneldoc_bin = '../scripts/kernel-doc'
+kerneldoc_bin = '../scripts/kernel-doc.py'
 kerneldoc_srctree = '..'
 
 # ------------------------------------------------------------------------------
-- 
2.49.0



* [PATCH v3 22/33] scripts/kernel-doc.py: Set an output format for --none
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (20 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 21/33] docs: sphinx: kerneldoc: use kernel-doc.py script Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 23/33] scripts/kernel-doc.py: adjust some coding style issues Mauro Carvalho Chehab
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Now that warning output is deferred to the output plugin, we
need an output style for none as well.

So, use the OutputFormat base class in such cases.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/lib/kdoc/kdoc_files.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index 4c04546a74fe..dd3dbe87520b 100755
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -20,6 +20,7 @@ from datetime import datetime
 from dateutil import tz
 
 from kdoc_parser import KernelDoc
+from kdoc_output import OutputFormat
 
 
 class GlobSourceFiles:
@@ -138,6 +139,9 @@ class KernelFiles():
         if not modulename:
             modulename = "Kernel API"
 
+        if out_style is None:
+            out_style = OutputFormat()
+
         dt = datetime.now()
         if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
             # use UTC TZ
-- 
2.49.0
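
The fallback added by patch 22/33 amounts to using the base class as a
"render nothing" output. A minimal sketch follows, assuming simplified
classes: render(), reported and pick_output_style() are hypothetical names,
and the real OutputFormat dispatches to out_* methods and reports through a
logger.

```python
class OutputFormat:
    """Base class: reports warnings, renders nothing."""

    def __init__(self):
        self.reported = []              # stand-in for the real logger

    def msg(self, name, args):
        # Warnings are handled here, so they are reported even when
        # no documentation is rendered.
        self.reported.extend(args.get("warnings", []))
        return self.render(name, args)

    def render(self, name, args):
        return None                     # base class produces no output


class RestFormat(OutputFormat):
    """Toy subclass that renders a title only."""

    def render(self, name, args):
        return f"**{name}**\n"


def pick_output_style(out_style=None):
    """Fall back to the base class when no output format was selected."""
    if out_style is None:
        out_style = OutputFormat()
    return out_style
```

Falling back to the base class means the caller never has to test for a
missing output object before asking it to report the queued warnings.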



* [PATCH v3 23/33] scripts/kernel-doc.py: adjust some coding style issues
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (21 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 22/33] scripts/kernel-doc.py: Set an output format for --none Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 24/33] scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13 Mauro Carvalho Chehab
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

Make pylint happier by adding some missing documentation and
addressing a couple of pylint warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py           | 12 ++++----
 scripts/lib/kdoc/kdoc_files.py  |  4 +--
 scripts/lib/kdoc/kdoc_output.py | 50 ++++++++++++++++++++++++++-------
 scripts/lib/kdoc/kdoc_parser.py | 30 +++++---------------
 scripts/lib/kdoc/kdoc_re.py     |  3 +-
 5 files changed, 57 insertions(+), 42 deletions(-)
 mode change 100755 => 100644 scripts/lib/kdoc/kdoc_files.py

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 90aacd17499a..eca7e34f9d03 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -2,7 +2,7 @@
 # SPDX-License-Identifier: GPL-2.0
 # Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
 #
-# pylint: disable=C0103
+# pylint: disable=C0103,R0915
 #
 # Converted from the kernel-doc script originally written in Perl
 # under GPLv2, copyrighted since 1998 by the following authors:
@@ -165,6 +165,8 @@ neither here nor at the original Perl script.
 
 
 class MsgFormatter(logging.Formatter):
+    """Helper class to format warnings on a similar way to kernel-doc.pl"""
+
     def format(self, record):
         record.levelname = record.levelname.capitalize()
         return logging.Formatter.format(self, record)
@@ -241,7 +243,7 @@ def main():
 
     # Those are valid for all 3 types of filter
     parser.add_argument("-n", "-nosymbol", "--nosymbol", action='append',
-                         help=NOSYMBOL_DESC)
+                        help=NOSYMBOL_DESC)
 
     parser.add_argument("-D", "-no-doc-sections", "--no-doc-sections",
                         action='store_true', help="Don't outputt DOC sections")
@@ -286,9 +288,9 @@ def main():
     kfiles.parse(args.files, export_file=args.export_file)
 
     for t in kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
-                          internal=args.internal, symbol=args.symbol,
-                          nosymbol=args.nosymbol,
-                          no_doc_sections=args.no_doc_sections):
+                        internal=args.internal, symbol=args.symbol,
+                        nosymbol=args.nosymbol,
+                        no_doc_sections=args.no_doc_sections):
         msg = t[1]
         if msg:
             print(msg)
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
old mode 100755
new mode 100644
index dd3dbe87520b..e2221db7022a
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -4,8 +4,6 @@
 #
 # pylint: disable=R0903,R0913,R0914,R0917
 
-# TODO: implement warning filtering
-
 """
 Parse lernel-doc tags on multiple kernel source files.
 """
@@ -128,7 +126,7 @@ class KernelFiles():
     def __init__(self, verbose=False, out_style=None,
                  werror=False, wreturn=False, wshort_desc=False,
                  wcontents_before_sections=False,
-                 logger=None, modulename=None, export_file=None):
+                 logger=None, modulename=None):
         """
         Initialize startup variables and parse all files
         """
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index 6582d1f64d1e..7f84bf12f1e1 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -2,9 +2,7 @@
 # SPDX-License-Identifier: GPL-2.0
 # Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
 #
-# pylint: disable=C0301,R0911,R0912,R0913,R0914,R0915,R0917
-
-# TODO: implement warning filtering
+# pylint: disable=C0301,R0902,R0911,R0912,R0913,R0914,R0915,R0917
 
 """
 Implement output filters to print kernel-doc documentation.
@@ -52,6 +50,11 @@ type_member_func = type_member + Re(r"\(\)", cache=False)
 
 
 class OutputFormat:
+    """
+    Base class for OutputFormat. If used as-is, it means that only
+    warnings will be displayed.
+    """
+
     # output mode.
     OUTPUT_ALL          = 0 # output all symbols and doc sections
     OUTPUT_INCLUDE      = 1 # output only specified symbols
@@ -75,6 +78,10 @@ class OutputFormat:
         self.data = ""
 
     def set_config(self, config):
+        """
+        Setup global config variables used by both parser and output.
+        """
+
         self.config = config
 
     def set_filter(self, export, internal, symbol, nosymbol, function_table,
@@ -117,6 +124,10 @@ class OutputFormat:
         return block
 
     def out_warnings(self, args):
+        """
+        Output warnings for identifiers that will be displayed.
+        """
+
         warnings = args.get('warnings', [])
 
         for warning, log_msg in warnings:
@@ -146,6 +157,11 @@ class OutputFormat:
         return False
 
     def check_declaration(self, dtype, name, args):
+        """
+        Checks if a declaration should be output or not based on the
+        filtering criteria.
+        """
+
         if name in self.nosymbol:
             return False
 
@@ -169,6 +185,10 @@ class OutputFormat:
         return False
 
     def msg(self, fname, name, args):
+        """
+        Handles a single entry from kernel-doc parser
+        """
+
         self.data = ""
 
         dtype = args.get('type', "")
@@ -203,24 +223,25 @@ class OutputFormat:
         return None
 
     # Virtual methods to be overridden by inherited classes
+    # At the base class, those do nothing.
     def out_doc(self, fname, name, args):
-        pass
+        """Outputs a DOC block"""
 
     def out_function(self, fname, name, args):
-        pass
+        """Outputs a function"""
 
     def out_enum(self, fname, name, args):
-        pass
+        """Outputs an enum"""
 
     def out_typedef(self, fname, name, args):
-        pass
+        """Outputs a typedef"""
 
     def out_struct(self, fname, name, args):
-        pass
+        """Outputs a struct"""
 
 
 class RestFormat(OutputFormat):
-    # """Consts and functions used by ReST output"""
+    """Consts and functions used by ReST output"""
 
     highlights = [
         (type_constant, r"``\1``"),
@@ -265,6 +286,11 @@ class RestFormat(OutputFormat):
             self.data += f".. LINENO {ln}\n"
 
     def output_highlight(self, args):
+        """
+        Outputs a C symbol that may require being converted to ReST using
+        the self.highlights variable
+        """
+
         input_text = args
         output = ""
         in_literal = False
@@ -579,6 +605,10 @@ class ManFormat(OutputFormat):
         self.man_date = dt.strftime("%B %Y")
 
     def output_highlight(self, block):
+        """
+        Outputs a C symbol that may require being highlighted with
+        self.highlights variable using troff syntax
+        """
 
         contents = self.highlight_block(block)
 
@@ -601,7 +631,7 @@ class ManFormat(OutputFormat):
         sections = args.get('sections', {})
 
         if not self.check_doc(name, args):
-                return
+            return
 
         self.data += f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index 3698ef625367..dcb9515fc40b 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -131,7 +131,7 @@ class KernelDoc:
         # Place all potential outputs into an array
         self.entries = []
 
-    # TODO: rename to emit_message
+    # TODO: rename to emit_message after removal of kernel-doc.pl
     def emit_warning(self, ln, msg, warning=True):
         """Emit a message"""
 
@@ -157,19 +157,6 @@ class KernelDoc:
         name = self.entry.section
         contents = self.entry.contents
 
-        # TODO: we can prevent dumping empty sections here with:
-        #
-        #    if self.entry.contents.strip("\n"):
-        #       if start_new:
-        #           self.entry.section = self.section_default
-        #           self.entry.contents = ""
-        #
-        #        return
-        #
-        # But, as we want to be producing the same output of the
-        # venerable kernel-doc Perl tool, let's just output everything,
-        # at least for now
-
         if type_param.match(name):
             name = type_param.group(1)
 
@@ -205,7 +192,7 @@ class KernelDoc:
             self.entry.section = self.section_default
             self.entry.contents = ""
 
-    # TODO: rename it to store_declaration
+    # TODO: rename it to store_declaration after removal of kernel-doc.pl
     def output_declaration(self, dtype, name, **args):
         """
         Stores the entry into an entry array.
@@ -225,13 +212,13 @@ class KernelDoc:
         args["type"] = dtype
         args["warnings"] = self.entry.warnings
 
-        # TODO: use colletions.OrderedDict
+        # TODO: use collections.OrderedDict to remove sectionlist
 
         sections = args.get('sections', {})
         sectionlist = args.get('sectionlist', [])
 
         # Drop empty sections
-        # TODO: improve it to emit warnings
+        # TODO: improve empty sections logic to emit warnings
         for section in ["Description", "Return"]:
             if section in sectionlist:
                 if not sections[section].rstrip():
@@ -636,7 +623,9 @@ class KernelDoc:
 
             # Replace macros
             #
-            # TODO: it is better to also move those to the NestedMatch logic,
+            # TODO: use NestedMatch for FOO($1, $2, ...) matches
+            #
+            # it is better to also move those to the NestedMatch logic,
             # to ensure that parenthesis will be properly matched.
 
             (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
@@ -906,7 +895,6 @@ class KernelDoc:
             self.dump_struct(ln, prototype)
             return
 
-        # TODO: handle other types
         self.output_declaration(self.entry.decl_type, prototype,
                                 entry=self.entry)
 
@@ -1680,10 +1668,6 @@ class KernelDoc:
                                           self.st_inline_name[self.inline_doc_state],
                                           line)
 
-                    # TODO: not all states allow EXPORT_SYMBOL*, so this
-                    # can be optimized later on to speedup parsing
-                    self.process_export(self.config.function_table, line)
-
                     # Hand this line to the appropriate state handler
                     if self.state == self.STATE_NORMAL:
                         self.process_normal(ln, line)
diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
index 512b6521e79d..d28485ff94d6 100755
--- a/scripts/lib/kdoc/kdoc_re.py
+++ b/scripts/lib/kdoc/kdoc_re.py
@@ -131,7 +131,8 @@ class NestedMatch:
     will ignore the search string.
     """
 
-    # TODO:
+    # TODO: make NestedMatch handle multiple match groups
+    #
     # Right now, regular expressions to match it are defined only up to
     #       the start delimiter, e.g.:
     #
-- 
2.49.0



* [PATCH v3 24/33] scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (22 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 23/33] scripts/kernel-doc.py: adjust some coding style issues Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 25/33] scripts/kernel-doc.py: move modulename to man class Mauro Carvalho Chehab
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

- str.replace() accepts count as a keyword argument only since
  Python 3.13; use Re.sub(), which takes count=, instead;
- before Python 3.12, an expression inside an f-string can't use
  the same quote character that delimits the string itself.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
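For reference, both compatibility issues can be reproduced outside the
kernel tree. This is a minimal sketch, not part of the patch: it uses the
standard re module in place of the kdoc_re.Re wrapper, and the args dict
and proto string are made-up sample values.

```python
import re

# Hypothetical sample data standing in for what the parser would pass.
args = {"function": "kfree", "functiontype": "void"}

# Inner quotes differ from the outer ones, so this parses on old
# interpreters as well as new ones.
man_line = f'.B "{args["functiontype"]}" {args["function"]}'

proto = "long sys_foo, int a, int b)"

# re.sub() accepts count= as a keyword on every supported version.
via_re = re.sub(",", "(", proto, count=1)

# The positional form of str.replace() is portable as well.
via_str = proto.replace(",", "(", 1)
```

Either replacement form avoids the keyword-only problem; the patch picks
the regex one.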
 scripts/lib/kdoc/kdoc_output.py | 8 ++++----
 scripts/lib/kdoc/kdoc_parser.py | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index 7f84bf12f1e1..e0ed79e4d985 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -647,16 +647,16 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        self.data += f'.TH "{args['function']}" 9 "{args['function']}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n"
+        self.data += f'.TH "{args["function"]}" 9 "{args["function"]}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n"
 
         self.data += ".SH NAME\n"
         self.data += f"{args['function']} \\- {args['purpose']}\n"
 
         self.data += ".SH SYNOPSIS\n"
         if args.get('functiontype', ''):
-            self.data += f'.B "{args['functiontype']}" {args['function']}' + "\n"
+            self.data += f'.B "{args["functiontype"]}" {args["function"]}' + "\n"
         else:
-            self.data += f'.B "{args['function']}' + "\n"
+            self.data += f'.B "{args["function"]}' + "\n"
 
         count = 0
         parenth = "("
@@ -697,7 +697,7 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        self.data += f'.TH "{args['module']}" 9 "enum {args['enum']}" "{self.man_date}" "API Manual" LINUX' + "\n"
+        self.data += f'.TH "{args["module"]}" 9 "enum {args["enum"]}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
         self.data += ".SH NAME\n"
         self.data += f"enum {args['enum']} \\- {args['purpose']}\n"
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index dcb9515fc40b..e48ed128ca04 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -1444,9 +1444,9 @@ class KernelDoc:
 
         r = Re(r'long\s+(sys_.*?),')
         if r.search(proto):
-            proto = proto.replace(',', '(', count=1)
+            proto = Re(',').sub('(', proto, count=1)
         elif is_void:
-            proto = proto.replace(')', '(void)', count=1)
+            proto = Re(r'\)').sub('(void)', proto, count=1)
 
         # Now delete all of the odd-numbered commas in the proto
         # so that argument types & names don't have a comma between them
-- 
2.49.0



* [PATCH v3 25/33] scripts/kernel-doc.py: move modulename to man class
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (23 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 24/33] scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13 Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 26/33] scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP Mauro Carvalho Chehab
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

Only man output requires a modulename. Move its definition
to the man class.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
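The resulting data flow can be sketched as follows. ManFormatSketch and
th_line() are hypothetical names used only for illustration; the real
class is ManFormat in kdoc_output.py, and only the argparse default and
the .TH layout are taken from the diff below.

```python
import argparse


class ManFormatSketch:
    """Illustrative stand-in: the formatter owns the module name."""

    def __init__(self, modulename):
        self.modulename = modulename

    def th_line(self, title, man_date):
        # Same .TH layout as the enum output in the patch.
        return (f'.TH "{self.modulename}" 9 "{title}" '
                f'"{man_date}" "API Manual" LINUX')


parser = argparse.ArgumentParser()
# The default now lives in the command line parser, not in KernelFiles.
parser.add_argument("-M", "--modulename", default="Kernel API")
args = parser.parse_args([])

fmt = ManFormatSketch(modulename=args.modulename)
header = fmt.th_line("enum foo", "April 2025")
```

With this layout the parser no longer needs to carry a module name in
every stored entry.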
 scripts/kernel-doc.py           |  6 +++---
 scripts/lib/kdoc/kdoc_files.py  |  6 +-----
 scripts/lib/kdoc/kdoc_output.py | 12 ++++++------
 scripts/lib/kdoc/kdoc_parser.py |  9 +--------
 4 files changed, 11 insertions(+), 22 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index eca7e34f9d03..6a6bc81efd31 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -186,6 +186,7 @@ def main():
                         help="Enable debug messages")
 
     parser.add_argument("-M", "-modulename", "--modulename",
+                        default="Kernel API",
                         help="Allow setting a module name at the output.")
 
     parser.add_argument("-l", "-enable-lineno", "--enable_lineno",
@@ -273,7 +274,7 @@ def main():
     logger.addHandler(handler)
 
     if args.man:
-        out_style = ManFormat()
+        out_style = ManFormat(modulename=args.modulename)
     elif args.none:
         out_style = None
     else:
@@ -282,8 +283,7 @@ def main():
     kfiles = KernelFiles(verbose=args.verbose,
                          out_style=out_style, werror=args.werror,
                          wreturn=args.wreturn, wshort_desc=args.wshort_desc,
-                         wcontents_before_sections=args.wcontents_before_sections,
-                         modulename=args.modulename)
+                         wcontents_before_sections=args.wcontents_before_sections)
 
     kfiles.parse(args.files, export_file=args.export_file)
 
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index e2221db7022a..5a6e92e34d05 100644
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -126,7 +126,7 @@ class KernelFiles():
     def __init__(self, verbose=False, out_style=None,
                  werror=False, wreturn=False, wshort_desc=False,
                  wcontents_before_sections=False,
-                 logger=None, modulename=None):
+                 logger=None):
         """
         Initialize startup variables and parse all files
         """
@@ -134,9 +134,6 @@ class KernelFiles():
         if not verbose:
             verbose = bool(os.environ.get("KBUILD_VERBOSE", 0))
 
-        if not modulename:
-            modulename = "Kernel API"
-
         if out_style is None:
             out_style = OutputFormat()
 
@@ -168,7 +165,6 @@ class KernelFiles():
         self.config.wreturn = wreturn
         self.config.wshort_desc = wshort_desc
         self.config.wcontents_before_sections = wcontents_before_sections
-        self.config.modulename = modulename
 
         self.config.function_table = set()
         self.config.source_map = {}
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index e0ed79e4d985..8be69245c0d0 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -586,7 +586,7 @@ class ManFormat(OutputFormat):
     )
     blankline = ""
 
-    def __init__(self):
+    def __init__(self, modulename):
         """
         Creates class variables.
 
@@ -595,6 +595,7 @@ class ManFormat(OutputFormat):
         """
 
         super().__init__()
+        self.modulename = modulename
 
         dt = datetime.now()
         if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
@@ -626,14 +627,13 @@ class ManFormat(OutputFormat):
                 self.data += line + "\n"
 
     def out_doc(self, fname, name, args):
-        module = args.get('module')
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
         if not self.check_doc(name, args):
             return
 
-        self.data += f'.TH "{module}" 9 "{module}" "{self.man_date}" "API Manual" LINUX' + "\n"
+        self.data += f'.TH "{self.modulename}" 9 "{self.modulename}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
         for section in sectionlist:
             self.data += f'.SH "{section}"' + "\n"
@@ -697,7 +697,7 @@ class ManFormat(OutputFormat):
         sectionlist = args.get('sectionlist', [])
         sections = args.get('sections', {})
 
-        self.data += f'.TH "{args["module"]}" 9 "enum {args["enum"]}" "{self.man_date}" "API Manual" LINUX' + "\n"
+        self.data += f'.TH "{self.modulename}" 9 "enum {args["enum"]}" "{self.man_date}" "API Manual" LINUX' + "\n"
 
         self.data += ".SH NAME\n"
         self.data += f"enum {args['enum']} \\- {args['purpose']}\n"
@@ -727,7 +727,7 @@ class ManFormat(OutputFormat):
             self.output_highlight(sections[section])
 
     def out_typedef(self, fname, name, args):
-        module = args.get('module')
+        module = self.modulename
         typedef = args.get('typedef')
         purpose = args.get('purpose')
         sectionlist = args.get('sectionlist', [])
@@ -743,7 +743,7 @@ class ManFormat(OutputFormat):
             self.output_highlight(sections.get(section))
 
     def out_struct(self, fname, name, args):
-        module = args.get('module')
+        module = self.modulename
         struct_type = args.get('type')
         struct_name = args.get('struct')
         purpose = args.get('purpose')
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index e48ed128ca04..f923600561f8 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -791,7 +791,6 @@ class KernelDoc:
 
         self.output_declaration(decl_type, declaration_name,
                                 struct=declaration_name,
-                                module=self.entry.modulename,
                                 definition=declaration,
                                 parameterlist=self.entry.parameterlist,
                                 parameterdescs=self.entry.parameterdescs,
@@ -869,7 +868,6 @@ class KernelDoc:
 
         self.output_declaration('enum', declaration_name,
                                 enum=declaration_name,
-                                module=self.config.modulename,
                                 parameterlist=self.entry.parameterlist,
                                 parameterdescs=self.entry.parameterdescs,
                                 parameterdesc_start_lines=self.entry.parameterdesc_start_lines,
@@ -1040,7 +1038,6 @@ class KernelDoc:
             self.output_declaration(decl_type, declaration_name,
                                     function=declaration_name,
                                     typedef=True,
-                                    module=self.config.modulename,
                                     functiontype=return_type,
                                     parameterlist=self.entry.parameterlist,
                                     parameterdescs=self.entry.parameterdescs,
@@ -1055,7 +1052,6 @@ class KernelDoc:
             self.output_declaration(decl_type, declaration_name,
                                     function=declaration_name,
                                     typedef=False,
-                                    module=self.config.modulename,
                                     functiontype=return_type,
                                     parameterlist=self.entry.parameterlist,
                                     parameterdescs=self.entry.parameterdescs,
@@ -1102,7 +1098,6 @@ class KernelDoc:
             self.output_declaration(decl_type, declaration_name,
                                     function=declaration_name,
                                     typedef=True,
-                                    module=self.entry.modulename,
                                     functiontype=return_type,
                                     parameterlist=self.entry.parameterlist,
                                     parameterdescs=self.entry.parameterdescs,
@@ -1130,7 +1125,6 @@ class KernelDoc:
 
             self.output_declaration('typedef', declaration_name,
                                     typedef=declaration_name,
-                                    module=self.entry.modulename,
                                     sectionlist=self.entry.sectionlist,
                                     sections=self.entry.sections,
                                     section_start_lines=self.entry.section_start_lines,
@@ -1619,8 +1613,7 @@ class KernelDoc:
             self.output_declaration("doc", self.entry.identifier,
                                     sectionlist=self.entry.sectionlist,
                                     sections=self.entry.sections,
-                                    section_start_lines=self.entry.section_start_lines,
-                                    module=self.config.modulename)
+                                    section_start_lines=self.entry.section_start_lines)
             self.reset_state(ln)
 
         elif doc_content.search(line):
-- 
2.49.0



* [PATCH v3 26/33] scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (24 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 25/33] scripts/kernel-doc.py: move modulename to man class Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 27/33] scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency Mauro Carvalho Chehab
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

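The logic that handles KBUILD_BUILD_TIMESTAMP is wrong: it never
parses the variable's value, and it adds a dependency on a
third-party module (dateutil).

Fix it by parsing the timestamp with datetime.strptime() against a
list of known date formats, falling back to the current date when
the variable is unset or can't be parsed.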

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
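The parsing loop can be tried standalone with the sketch below. The
format list is copied from the diff; the man_date() helper and its
tstamp parameter are assumptions made for the example (the patch reads
the value from the environment inside ManFormat.__init__). Month names
assume the default C locale.

```python
from datetime import datetime

# Same formats as ManFormat.date_formats in the patch.
DATE_FORMATS = [
    "%a %b %d %H:%M:%S %Z %Y",
    "%a %b %d %H:%M:%S %Y",
    "%Y-%m-%d",
    "%b %d %Y",
    "%B %d %Y",
    "%m %d %Y",
]


def man_date(tstamp=None):
    """Return "Month Year" from tstamp, or from now() as a fallback."""
    dt = None
    if tstamp:
        for fmt in DATE_FORMATS:
            try:
                dt = datetime.strptime(tstamp, fmt)
                break
            except ValueError:
                # Not this format: try the next one.
                pass

    if not dt:
        dt = datetime.now()

    return dt.strftime("%B %Y")
```

Only the standard library is needed, which is the point of dropping
dateutil.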
 scripts/lib/kdoc/kdoc_files.py  |  9 ---------
 scripts/lib/kdoc/kdoc_output.py | 28 +++++++++++++++++++++-------
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index 5a6e92e34d05..e52a6d05237e 100644
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -13,9 +13,6 @@ import logging
 import os
 import re
 import sys
-from datetime import datetime
-
-from dateutil import tz
 
 from kdoc_parser import KernelDoc
 from kdoc_output import OutputFormat
@@ -137,12 +134,6 @@ class KernelFiles():
         if out_style is None:
             out_style = OutputFormat()
 
-        dt = datetime.now()
-        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
-            # use UTC TZ
-            to_zone = tz.gettz('UTC')
-            dt = dt.astimezone(to_zone)
-
         if not werror:
             kcflags = os.environ.get("KCFLAGS", None)
             if kcflags:
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index 8be69245c0d0..eb013075da84 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -19,8 +19,6 @@ import os
 import re
 from datetime import datetime
 
-from dateutil import tz
-
 from kdoc_parser import KernelDoc, type_param
 from kdoc_re import Re
 
@@ -586,6 +584,15 @@ class ManFormat(OutputFormat):
     )
     blankline = ""
 
+    date_formats = [
+        "%a %b %d %H:%M:%S %Z %Y",
+        "%a %b %d %H:%M:%S %Y",
+        "%Y-%m-%d",
+        "%b %d %Y",
+        "%B %d %Y",
+        "%m %d %Y",
+    ]
+
     def __init__(self, modulename):
         """
         Creates class variables.
@@ -597,11 +604,18 @@ class ManFormat(OutputFormat):
         super().__init__()
         self.modulename = modulename
 
-        dt = datetime.now()
-        if os.environ.get("KBUILD_BUILD_TIMESTAMP", None):
-            # use UTC TZ
-            to_zone = tz.gettz('UTC')
-            dt = dt.astimezone(to_zone)
+        dt = None
+        tstamp = os.environ.get("KBUILD_BUILD_TIMESTAMP")
+        if tstamp:
+            for fmt in self.date_formats:
+                try:
+                    dt = datetime.strptime(tstamp, fmt)
+                    break
+                except ValueError:
+                    pass
+
+        if not dt:
+            dt = datetime.now()
 
         self.man_date = dt.strftime("%B %Y")
 
-- 
2.49.0



* [PATCH v3 27/33] scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (25 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 26/33] scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 28/33] scripts/kernel-doc.py: Properly handle Werror and exit codes Mauro Carvalho Chehab
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

str.removesuffix() was added in Python 3.9. As we just want to
remove the trailing backslash of a continuation line, rstrip()
does the job here (it strips every trailing backslash, not just
one). It is also shorter.

So, use it.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
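The two calls only differ when a line ends with more than one backslash.
A small sketch, where strip_one() is a hypothetical helper that mimics
removesuffix("\\") on interpreters older than 3.9:

```python
def strip_one(line):
    """Remove a single trailing backslash, like removesuffix("\\")."""
    if line.endswith("\\"):
        return line[:-1]
    return line


single = "#define FOO(x) \\"      # ends with one backslash
double = "#define FOO(x) \\\\"    # ends with two backslashes

same_result = single.rstrip("\\") == strip_one(single)
rstrip_double = double.rstrip("\\")
one_double = strip_one(double)
```

For the common case of a single continuation backslash both give the
same string.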
 scripts/lib/kdoc/kdoc_parser.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index f923600561f8..77e8bfeccc8e 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -1641,7 +1641,7 @@ class KernelDoc:
                     # Group continuation lines on prototypes
                     if self.state == self.STATE_PROTO:
                         if line.endswith("\\"):
-                            prev += line.removesuffix("\\")
+                            prev += line.rstrip("\\")
                             cont = True
 
                             if not prev_ln:
-- 
2.49.0



* [PATCH v3 28/33] scripts/kernel-doc.py: Properly handle Werror and exit codes
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (26 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 27/33] scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 29/33] scripts/kernel-doc: switch to use kernel-doc.py Mauro Carvalho Chehab
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

The original kernel-doc script has logic to treat warnings as
errors, and to report the number of warnings found when in
verbose mode.

Implement both to be fully compatible with the original script.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
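The decision table added to main() can be summarized as a pure function.
This is a sketch for review purposes only: exit_status() is a
hypothetical name, and the printing done in verbose mode is left out
since it does not affect the result.

```python
def exit_status(error_count, werror=False, none=False):
    """Mirror the order of the checks added at the end of main()."""
    if not error_count:
        return 0

    if werror:
        # Warnings are promoted to errors: always report them.
        return error_count

    if none:
        # "-none" without -Werror: warnings don't fail the run.
        return 0

    return error_count
```

Note that POSIX exit statuses are truncated to 8 bits, so the count
seen by the caller is the value modulo 256.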
 scripts/kernel-doc.py           | 18 ++++++++++++++++--
 scripts/lib/kdoc/kdoc_files.py  | 12 ++++++++++--
 scripts/lib/kdoc/kdoc_output.py |  8 +++-----
 scripts/lib/kdoc/kdoc_parser.py | 15 ++++++---------
 4 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 6a6bc81efd31..2f2fad813024 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -78,8 +78,6 @@
 #    Yacine Belkadi <yacine.belkadi.1@gmail.com>
 #    Yujie Liu <yujie.liu@intel.com>
 
-# TODO: implement warning filtering
-
 """
 kernel_doc
 ==========
@@ -295,6 +293,22 @@ def main():
         if msg:
             print(msg)
 
+    error_count = kfiles.errors
+    if not error_count:
+        sys.exit(0)
+
+    if args.werror:
+        print(f"{error_count} warnings as errors")
+        sys.exit(error_count)
+
+    if args.verbose:
+        print(f"{error_count} errors")
+
+    if args.none:
+        sys.exit(0)
+
+    sys.exit(error_count)
+
 
 # Call main method
 if __name__ == "__main__":
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index e52a6d05237e..182d9ed58a72 100644
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -12,7 +12,6 @@ import argparse
 import logging
 import os
 import re
-import sys
 
 from kdoc_parser import KernelDoc
 from kdoc_output import OutputFormat
@@ -109,7 +108,7 @@ class KernelFiles():
                     KernelDoc.process_export(self.config.function_table, line)
 
         except IOError:
-            print(f"Error: Cannot open fname {fname}", fname=sys.stderr)
+            self.config.log.error("Error: Cannot open fname %s", fname)
             self.config.errors += 1
 
     def file_not_found_cb(self, fname):
@@ -262,3 +261,12 @@ class KernelFiles():
                                             fname, ln, dtype)
             if msg:
                 yield fname, msg
+
+    @property
+    def errors(self):
+        """
+        Return a count of the number of warnings found, including
+        the ones displayed while iterating over self.msg.
+        """
+
+        return self.config.errors
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index eb013075da84..e9b4d0093084 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -128,11 +128,9 @@ class OutputFormat:
 
         warnings = args.get('warnings', [])
 
-        for warning, log_msg in warnings:
-            if warning:
-                self.config.log.warning(log_msg)
-            else:
-                self.config.log.info(log_msg)
+        for log_msg in warnings:
+            self.config.log.warning(log_msg)
+            self.config.errors += 1
 
     def check_doc(self, name, args):
         """Check if DOC should be output"""
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index 77e8bfeccc8e..43e6ffbdcc2c 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -137,17 +137,18 @@ class KernelDoc:
 
         log_msg = f"{self.fname}:{ln} {msg}"
 
+        if not warning:
+            self.config.log.info(log_msg)
+            return
+
         if self.entry:
             # Delegate warning output to output logic, as this way it
             # will report warnings/info only for symbols that are output
 
-            self.entry.warnings.append((warning, log_msg))
+            self.entry.warnings.append(log_msg)
             return
 
-        if warning:
-            self.config.log.warning(log_msg)
-        else:
-            self.config.log.info(log_msg)
+        self.config.log.warning(log_msg)
 
     def dump_section(self, start_new=True):
         """
@@ -556,7 +557,6 @@ class KernelDoc:
 
         if not members:
             self.emit_warning(ln, f"{proto} error: Cannot parse struct or union!")
-            self.config.errors += 1
             return
 
         if self.entry.identifier != declaration_name:
@@ -831,7 +831,6 @@ class KernelDoc:
 
         if not members:
             self.emit_warning(ln, f"{proto}: error: Cannot parse enum!")
-            self.config.errors += 1
             return
 
         if self.entry.identifier != declaration_name:
@@ -1132,7 +1131,6 @@ class KernelDoc:
             return
 
         self.emit_warning(ln, "error: Cannot parse typedef!")
-        self.config.errors += 1
 
     @staticmethod
     def process_export(function_table, line):
@@ -1677,4 +1675,3 @@ class KernelDoc:
                         self.process_docblock(ln, line)
         except OSError:
             self.config.log.error(f"Error: Cannot open file {self.fname}")
-            self.config.errors += 1
-- 
2.49.0



* [PATCH v3 29/33] scripts/kernel-doc: switch to use kernel-doc.py
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (27 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 28/33] scripts/kernel-doc.py: Properly handle Werror and exit codes Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 30/33] scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname Mauro Carvalho Chehab
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

Now that all features are in place, change the kernel-doc symlink
to point to kernel-doc.py.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index f175155c1e66..3b6ef807791a 120000
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1 +1 @@
-kernel-doc.pl
\ No newline at end of file
+kernel-doc.py
\ No newline at end of file
-- 
2.49.0



* [PATCH v3 30/33] scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (28 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 29/33] scripts/kernel-doc: switch to use kernel-doc.py Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 31/33] scripts/kernel_doc.py: better handle exported symbols Mauro Carvalho Chehab
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel

For the kerneldoc Sphinx extension, it is useful to display the
parsed results from a single file only. Change the logic in
KernelFiles.msg() to allow such usage.
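
A minimal sketch of the idea, assuming the results are kept in a dict keyed
by file name as this patch does. ResultStore, add() and the sample entries
are hypothetical stand-ins for illustration, not the actual KernelFiles API:

```python
class ResultStore:
    """Hypothetical stand-in for the per-file result cache."""

    def __init__(self):
        # Maps file name -> list of (name, arg) entries
        self.results = {}

    def add(self, fname, entries):
        self.results[fname] = entries

    def msg(self, filenames=None):
        """Yield (fname, text) for the requested files, or for all of them."""
        if not filenames:
            filenames = sorted(self.results.keys())

        for fname in filenames:
            text = "".join(f"{name}: {arg}\n"
                           for name, arg in self.results[fname])
            if text:
                yield fname, text


store = ResultStore()
store.add("b.c", [("func_b", "doc for b")])
store.add("a.c", [("func_a", "doc for a")])

all_files = list(store.msg())                 # every cached file, sorted
only_b = list(store.msg(filenames=["b.c"]))   # a single file on request
```

With a dict, a caller that already parsed many files can ask for the output
of one of them without re-parsing or scanning a list of tuples.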

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/lib/kdoc/kdoc_files.py | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index 182d9ed58a72..527ab9117268 100644
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -95,7 +95,7 @@ class KernelFiles():
         doc = KernelDoc(self.config, fname)
         doc.run()
 
-        return doc
+        return doc.entries
 
     def process_export_file(self, fname):
         """
@@ -173,7 +173,7 @@ class KernelFiles():
         # Initialize internal variables
 
         self.config.errors = 0
-        self.results = []
+        self.results = {}
 
         self.files = set()
         self.export_files = set()
@@ -189,13 +189,9 @@ class KernelFiles():
         # avoid reporting errors multiple times
 
         for fname in glob.parse_files(file_list, self.file_not_found_cb):
-            if fname in self.files:
-                continue
-
-            res = self.parse_file(fname)
-
-            self.results.append((res.fname, res.entries))
-            self.files.add(fname)
+            if fname not in self.files:
+                self.results[fname] = self.parse_file(fname)
+                self.files.add(fname)
 
         # If a list of export files was provided, parse EXPORT_SYMBOL*
         # from files that weren't fully parsed
@@ -226,7 +222,8 @@ class KernelFiles():
         return self.out_style.msg(fname, name, arg)
 
     def msg(self, enable_lineno=False, export=False, internal=False,
-            symbol=None, nosymbol=None, no_doc_sections=False):
+            symbol=None, nosymbol=None, no_doc_sections=False,
+            filenames=None):
         """
         Interacts over the kernel-doc results and output messages,
         returning kernel-doc markups on each interaction
@@ -248,9 +245,12 @@ class KernelFiles():
                                   function_table, enable_lineno,
                                   no_doc_sections)
 
-        for fname, arg_tuple in self.results:
+        if not filenames:
+            filenames = sorted(self.results.keys())
+
+        for fname in filenames:
             msg = ""
-            for name, arg in arg_tuple:
+            for name, arg in self.results[fname]:
                 msg += self.out_msg(fname, name, arg)
 
                 if msg is None:
-- 
2.49.0



* [PATCH v3 31/33] scripts/kernel_doc.py: better handle exported symbols
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (29 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 30/33] scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 32/33] scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe Mauro Carvalho Chehab
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Sean Anderson, linux-kernel

Change the logic that detects internal/external symbols so that
it can be reused when called via the Sphinx extension.
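
As a rough sketch of how a per-file export table can feed the symbol filter:
build_function_table() and the sample file names are hypothetical, and the
use of a local default plus dict.get() is an assumption of this sketch, not
necessarily what the patch does:

```python
def build_function_table(export_table, fname, export=False, internal=False,
                         symbol=None, export_file=None):
    """Collect the symbol names used to filter the output of one file."""
    function_table = set()

    if internal or export:
        # Default to the exports found in the file itself; a local name is
        # used so the default chosen for one file does not leak to the next
        sources = export_file if export_file else [fname]
        for f in sources:
            function_table |= export_table.get(f, set())

    if symbol:
        function_table.update(symbol)

    return function_table


# Hypothetical cache: file name -> set of EXPORT_SYMBOL* names
export_table = {
    "a.c": {"a_init", "a_exit"},
    "b.c": {"b_probe"},
}

table_a = build_function_table(export_table, "a.c", export=True)
table_b = build_function_table(export_table, "b.c", export=True,
                               symbol=["extra"])
```

Keeping the exports per file means the same cached parse can serve several
requests with different export/internal filters.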

While here, remove an unused self.config variable and make it clearer
that self.config variables are read-only. This makes it possible to run
the parser multiple times in parallel, if ever needed.
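
The read-only config idea can be sketched as follows. Files and Parser are
hypothetical names; the only point is that the parser receives a callback
instead of incrementing a shared counter stored in the config:

```python
import logging
from types import SimpleNamespace


class Files:
    """Hypothetical owner of the error counter."""

    def __init__(self):
        self.errors = 0
        log = logging.getLogger("kdoc-sketch")
        # Parsers get the logger and a callback, never the counter itself
        self.config = SimpleNamespace(log=log, warning=self.warning)

    def warning(self, msg):
        """Log a warning and account for it in one place."""
        self.config.log.warning(msg)
        self.errors += 1


class Parser:
    """Hypothetical parser that only reads from config."""

    def __init__(self, config):
        self.config = config

    def run(self, lines):
        for ln, line in enumerate(lines, start=1):
            if not line.strip():
                self.config.warning(f"{ln}: empty line")


files = Files()
Parser(files.config).run(["ok", "", "  "])
```

Because the parser never writes to the config, the counting policy stays in
a single class and could later be made thread-safe there if needed.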

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.py           |   2 +-
 scripts/lib/kdoc/kdoc_files.py  | 142 +++++++++++++++++---------------
 scripts/lib/kdoc/kdoc_output.py |   9 +-
 scripts/lib/kdoc/kdoc_parser.py |  52 ++++++++++--
 4 files changed, 125 insertions(+), 80 deletions(-)

diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index 2f2fad813024..12ae66f40bd7 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -287,7 +287,7 @@ def main():
 
     for t in kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
                         internal=args.internal, symbol=args.symbol,
-                        nosymbol=args.nosymbol,
+                        nosymbol=args.nosymbol, export_file=args.export_file,
                         no_doc_sections=args.no_doc_sections):
         msg = t[1]
         if msg:
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
index 527ab9117268..dd003feefd1b 100644
--- a/scripts/lib/kdoc/kdoc_files.py
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -68,6 +68,9 @@ class GlobSourceFiles:
         handling directories if any
         """
 
+        if not file_list:
+            return
+
         for fname in file_list:
             if self.srctree:
                 f = os.path.join(self.srctree, fname)
@@ -84,40 +87,70 @@ class GlobSourceFiles:
 
 class KernelFiles():
     """
-    Parse lernel-doc tags on multiple kernel source files.
+    Parse kernel-doc tags on multiple kernel source files.
+
+    There are two types of parsers defined here:
+        - self.parse_file(): parses both kernel-doc markups and
+          EXPORT_SYMBOL* macros;
+        - self.process_export_file(): parses only EXPORT_SYMBOL* macros.
     """
 
+    def warning(self, msg):
+        """Ancillary routine to output a warning and increment error count"""
+
+        self.config.log.warning(msg)
+        self.errors += 1
+
+    def error(self, msg):
+        """Ancillary routine to output an error and increment error count"""
+
+        self.config.log.error(msg)
+        self.errors += 1
+
     def parse_file(self, fname):
         """
         Parse a single Kernel source.
         """
 
+        # Prevent parsing the same file twice if results are cached
+        if fname in self.files:
+            return
+
         doc = KernelDoc(self.config, fname)
-        doc.run()
+        export_table, entries = doc.parse_kdoc()
 
-        return doc.entries
+        self.export_table[fname] = export_table
+
+        self.files.add(fname)
+        self.export_files.add(fname)      # parse_kdoc() already checks exports
+
+        self.results[fname] = entries
 
     def process_export_file(self, fname):
         """
         Parses EXPORT_SYMBOL* macros from a single Kernel source file.
         """
-        try:
-            with open(fname, "r", encoding="utf8",
-                      errors="backslashreplace") as fp:
-                for line in fp:
-                    KernelDoc.process_export(self.config.function_table, line)
-
-        except IOError:
-            self.config.log.error("Error: Cannot open fname %s", fname)
-            self.config.errors += 1
+
+        # Prevent parsing the same file twice if results are cached
+        if fname in self.export_files:
+            return
+
+        doc = KernelDoc(self.config, fname)
+        export_table = doc.parse_export()
+
+        if not export_table:
+            self.error(f"Error: Cannot check EXPORT_SYMBOL* on {fname}")
+            export_table = set()
+
+        self.export_table[fname] = export_table
+        self.export_files.add(fname)
 
     def file_not_found_cb(self, fname):
         """
         Callback to warn if a file was not found.
         """
 
-        self.config.log.error("Cannot find file %s", fname)
-        self.config.errors += 1
+        self.error(f"Cannot find file {fname}")
 
     def __init__(self, verbose=False, out_style=None,
                  werror=False, wreturn=False, wshort_desc=False,
@@ -147,7 +180,9 @@ class KernelFiles():
             if kdoc_werror:
                 werror = kdoc_werror
 
-        # Set global config data used on all files
+        # Some variables are global to the parser logic as a whole as they are
+        # used to send control configuration to KernelDoc class. As such,
+        # those variables are read-only inside the KernelDoc.
         self.config = argparse.Namespace
 
         self.config.verbose = verbose
@@ -156,27 +191,25 @@ class KernelFiles():
         self.config.wshort_desc = wshort_desc
         self.config.wcontents_before_sections = wcontents_before_sections
 
-        self.config.function_table = set()
-        self.config.source_map = {}
-
         if not logger:
             self.config.log = logging.getLogger("kernel-doc")
         else:
             self.config.log = logger
 
-        self.config.kernel_version = os.environ.get("KERNELVERSION",
-                                                    "unknown kernel version'")
+        self.config.warning = self.warning
+
         self.config.src_tree = os.environ.get("SRCTREE", None)
 
+        # Initialize variables that are internal to KernelFiles
+
         self.out_style = out_style
 
-        # Initialize internal variables
-
-        self.config.errors = 0
+        self.errors = 0
         self.results = {}
 
         self.files = set()
         self.export_files = set()
+        self.export_table = {}
 
     def parse(self, file_list, export_file=None):
         """
@@ -185,28 +218,11 @@ class KernelFiles():
 
         glob = GlobSourceFiles(srctree=self.config.src_tree)
 
-        # Prevent parsing the same file twice to speedup parsing and
-        # avoid reporting errors multiple times
-
         for fname in glob.parse_files(file_list, self.file_not_found_cb):
-            if fname not in self.files:
-                self.results[fname] = self.parse_file(fname)
-                self.files.add(fname)
-
-        # If a list of export files was provided, parse EXPORT_SYMBOL*
-        # from files that weren't fully parsed
-
-        if not export_file:
-            return
-
-        self.export_files |= self.files
-
-        glob = GlobSourceFiles(srctree=self.config.src_tree)
+            self.parse_file(fname)
 
         for fname in glob.parse_files(export_file, self.file_not_found_cb):
-            if fname not in self.export_files:
-                self.process_export_file(fname)
-                self.export_files.add(fname)
+            self.process_export_file(fname)
 
     def out_msg(self, fname, name, arg):
         """
@@ -223,32 +239,35 @@ class KernelFiles():
 
     def msg(self, enable_lineno=False, export=False, internal=False,
             symbol=None, nosymbol=None, no_doc_sections=False,
-            filenames=None):
+            filenames=None, export_file=None):
         """
         Interacts over the kernel-doc results and output messages,
         returning kernel-doc markups on each interaction
         """
 
-        function_table = self.config.function_table
-
-        if symbol:
-            for s in symbol:
-                function_table.add(s)
-
-        # Output none mode: only warnings will be shown
-        if not self.out_style:
-            return
-
         self.out_style.set_config(self.config)
 
-        self.out_style.set_filter(export, internal, symbol, nosymbol,
-                                  function_table, enable_lineno,
-                                  no_doc_sections)
-
         if not filenames:
             filenames = sorted(self.results.keys())
 
         for fname in filenames:
+            function_table = set()
+
+            if internal or export:
+                if not export_file:
+                    export_file = [fname]
+
+                for f in export_file:
+                    function_table |= self.export_table[f]
+
+            if symbol:
+                for s in symbol:
+                    function_table.add(s)
+
+            self.out_style.set_filter(export, internal, symbol, nosymbol,
+                                      function_table, enable_lineno,
+                                      no_doc_sections)
+
             msg = ""
             for name, arg in self.results[fname]:
                 msg += self.out_msg(fname, name, arg)
@@ -261,12 +280,3 @@ class KernelFiles():
                                             fname, ln, dtype)
             if msg:
                 yield fname, msg
-
-    @property
-    def errors(self):
-        """
-        Return a count of the number of warnings found, including
-        the ones displayed while interacting over self.msg.
-        """
-
-        return self.config.errors
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index e9b4d0093084..c352b7f8d3fd 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -69,7 +69,7 @@ class OutputFormat:
         self.enable_lineno = None
         self.nosymbol = {}
         self.symbol = None
-        self.function_table = set()
+        self.function_table = None
         self.config = None
         self.no_doc_sections = False
 
@@ -94,10 +94,10 @@ class OutputFormat:
 
         self.enable_lineno = enable_lineno
         self.no_doc_sections = no_doc_sections
+        self.function_table = function_table
 
         if symbol:
             self.out_mode = self.OUTPUT_INCLUDE
-            function_table = symbol
         elif export:
             self.out_mode = self.OUTPUT_EXPORTED
         elif internal:
@@ -108,8 +108,6 @@ class OutputFormat:
         if nosymbol:
             self.nosymbol = set(nosymbol)
 
-        if function_table:
-            self.function_table = function_table
 
     def highlight_block(self, block):
         """
@@ -129,8 +127,7 @@ class OutputFormat:
         warnings = args.get('warnings', [])
 
         for log_msg in warnings:
-            self.config.log.warning(log_msg)
-            self.config.errors += 1
+            self.config.warning(log_msg)
 
     def check_doc(self, name, args):
         """Check if DOC should be output"""
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index 43e6ffbdcc2c..33f00c77dd5f 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -1133,21 +1133,25 @@ class KernelDoc:
         self.emit_warning(ln, "error: Cannot parse typedef!")
 
     @staticmethod
-    def process_export(function_table, line):
+    def process_export(function_set, line):
         """
         process EXPORT_SYMBOL* tags
 
-        This method is called both internally and externally, so, it
-        doesn't use self.
+        This method doesn't use any variable from the class, so declare it
+        with a staticmethod decorator.
         """
 
+        # Note: it accepts only one EXPORT_SYMBOL* per line, as having
+        # multiple export lines would violate Kernel coding style.
+
         if export_symbol.search(line):
             symbol = export_symbol.group(2)
-            function_table.add(symbol)
+            function_set.add(symbol)
+            return
 
         if export_symbol_ns.search(line):
             symbol = export_symbol_ns.group(2)
-            function_table.add(symbol)
+            function_set.add(symbol)
 
     def process_normal(self, ln, line):
         """
@@ -1617,17 +1621,39 @@ class KernelDoc:
         elif doc_content.search(line):
             self.entry.contents += doc_content.group(1) + "\n"
 
-    def run(self):
+    def parse_export(self):
+        """
+        Parses EXPORT_SYMBOL* macros from a single Kernel source file.
+        """
+
+        export_table = set()
+
+        try:
+            with open(self.fname, "r", encoding="utf8",
+                      errors="backslashreplace") as fp:
+
+                for line in fp:
+                    self.process_export(export_table, line)
+
+        except IOError:
+            return None
+
+        return export_table
+
+    def parse_kdoc(self):
         """
         Open and process each line of a C source file.
-        he parsing is controlled via a state machine, and the line is passed
+        The parsing is controlled via a state machine, and the line is passed
         to a different process function depending on the state. The process
         function may update the state as needed.
+
+        Besides parsing kernel-doc tags, it also parses export symbols.
         """
 
         cont = False
         prev = ""
         prev_ln = None
+        export_table = set()
 
         try:
             with open(self.fname, "r", encoding="utf8",
@@ -1659,6 +1685,16 @@ class KernelDoc:
                                           self.st_inline_name[self.inline_doc_state],
                                           line)
 
+                    # This is an optimization over the original script.
+                    # There, when export_file was used for the same file,
+                    # it was read twice. Here, we use the already-existing
+                    # loop to parse exported symbols as well.
+                    #
+                    # TODO: It should be noted that not all states are
+                    # needed here. On a future cleanup, process export only
+                    # at the states that aren't handling comment markups.
+                    self.process_export(export_table, line)
+
                     # Hand this line to the appropriate state handler
                     if self.state == self.STATE_NORMAL:
                         self.process_normal(ln, line)
@@ -1675,3 +1711,5 @@ class KernelDoc:
                         self.process_docblock(ln, line)
         except OSError:
             self.config.log.error(f"Error: Cannot open file {self.fname}")
+
+        return export_table, self.entries
-- 
2.49.0



* [PATCH v3 32/33] scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (30 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 31/33] scripts/kernel_doc.py: better handle exported symbols Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-08 10:09 ` [PATCH v3 33/33] scripts: kernel-doc: fix parsing function-like typedefs (again) Mauro Carvalho Chehab
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Gustavo A. R. Silva, Kees Cook,
	Sean Anderson, linux-hardening, linux-kernel

Using just "Re" makes it harder to distinguish from the native
"re" module. So, let's rename it.
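
To illustrate why a distinct name helps, here is a minimal sketch of a regex
helper used side by side with the standard module. The class body below is
an assumption made for illustration (the real kdoc_re.py is not shown in
this patch); only the EXPORT_SYMBOL pattern is taken from kdoc_parser.py:

```python
import re


class KernRe:
    """Hypothetical regex helper that remembers its last match."""

    def __init__(self, pattern, flags=0):
        self.regex = re.compile(pattern, flags)
        self.last_match = None

    def search(self, string):
        """Search the string and keep the match for later group() calls."""
        self.last_match = self.regex.search(string)
        return self.last_match

    def group(self, num):
        """Return a group from the most recent successful match."""
        return self.last_match.group(num)


export_symbol = KernRe(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*')

found = None
if export_symbol.search("EXPORT_SYMBOL_GPL(my_func);"):
    found = export_symbol.group(2)

# The standard module remains usable under its own, clearly distinct name
plain = re.sub(r"\s+", " ", "a   b")
```

With "KernRe(...)" and "re.sub(...)" on nearby lines, it is easy to see at a
glance which calls go through the helper and which use the standard library.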

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/lib/kdoc/kdoc_output.py |  50 +++---
 scripts/lib/kdoc/kdoc_parser.py | 264 ++++++++++++++++----------------
 scripts/lib/kdoc/kdoc_re.py     |   4 +-
 3 files changed, 159 insertions(+), 159 deletions(-)

diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index c352b7f8d3fd..86102e628d91 100755
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -20,31 +20,31 @@ import re
 from datetime import datetime
 
 from kdoc_parser import KernelDoc, type_param
-from kdoc_re import Re
+from kdoc_re import KernRe
 
 
-function_pointer = Re(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
+function_pointer = KernRe(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
 
 # match expressions used to find embedded type information
-type_constant = Re(r"\b``([^\`]+)``\b", cache=False)
-type_constant2 = Re(r"\%([-_*\w]+)", cache=False)
-type_func = Re(r"(\w+)\(\)", cache=False)
-type_param_ref = Re(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+type_constant = KernRe(r"\b``([^\`]+)``\b", cache=False)
+type_constant2 = KernRe(r"\%([-_*\w]+)", cache=False)
+type_func = KernRe(r"(\w+)\(\)", cache=False)
+type_param_ref = KernRe(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
 
 # Special RST handling for func ptr params
-type_fp_param = Re(r"\@(\w+)\(\)", cache=False)
+type_fp_param = KernRe(r"\@(\w+)\(\)", cache=False)
 
 # Special RST handling for structs with func ptr params
-type_fp_param2 = Re(r"\@(\w+->\S+)\(\)", cache=False)
+type_fp_param2 = KernRe(r"\@(\w+->\S+)\(\)", cache=False)
 
-type_env = Re(r"(\$\w+)", cache=False)
-type_enum = Re(r"\&(enum\s*([_\w]+))", cache=False)
-type_struct = Re(r"\&(struct\s*([_\w]+))", cache=False)
-type_typedef = Re(r"\&(typedef\s*([_\w]+))", cache=False)
-type_union = Re(r"\&(union\s*([_\w]+))", cache=False)
-type_member = Re(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
-type_fallback = Re(r"\&([_\w]+)", cache=False)
-type_member_func = type_member + Re(r"\(\)", cache=False)
+type_env = KernRe(r"(\$\w+)", cache=False)
+type_enum = KernRe(r"\&(enum\s*([_\w]+))", cache=False)
+type_struct = KernRe(r"\&(struct\s*([_\w]+))", cache=False)
+type_typedef = KernRe(r"\&(typedef\s*([_\w]+))", cache=False)
+type_union = KernRe(r"\&(union\s*([_\w]+))", cache=False)
+type_member = KernRe(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
+type_fallback = KernRe(r"\&([_\w]+)", cache=False)
+type_member_func = type_member + KernRe(r"\(\)", cache=False)
 
 
 class OutputFormat:
@@ -257,8 +257,8 @@ class RestFormat(OutputFormat):
     ]
     blankline = "\n"
 
-    sphinx_literal = Re(r'^[^.].*::$', cache=False)
-    sphinx_cblock = Re(r'^\.\.\ +code-block::', cache=False)
+    sphinx_literal = KernRe(r'^[^.].*::$', cache=False)
+    sphinx_cblock = KernRe(r'^\.\.\ +code-block::', cache=False)
 
     def __init__(self):
         """
@@ -299,14 +299,14 @@ class RestFormat(OutputFormat):
                     # If this is the first non-blank line in a literal block,
                     # figure out the proper indent.
                     if not litprefix:
-                        r = Re(r'^(\s*)')
+                        r = KernRe(r'^(\s*)')
                         if r.match(line):
                             litprefix = '^' + r.group(1)
                         else:
                             litprefix = ""
 
                         output += line + "\n"
-                    elif not Re(litprefix).match(line):
+                    elif not KernRe(litprefix).match(line):
                         in_literal = False
                     else:
                         output += line + "\n"
@@ -429,7 +429,7 @@ class RestFormat(OutputFormat):
             self.data += f"{self.lineprefix}**Parameters**\n\n"
 
         for parameter in parameterlist:
-            parameter_name = Re(r'\[.*').sub('', parameter)
+            parameter_name = KernRe(r'\[.*').sub('', parameter)
             dtype = args['parametertypes'].get(parameter, "")
 
             if dtype:
@@ -626,7 +626,7 @@ class ManFormat(OutputFormat):
             contents = "\n".join(contents)
 
         for line in contents.strip("\n").split("\n"):
-            line = Re(r"^\s*").sub("", line)
+            line = KernRe(r"^\s*").sub("", line)
             if not line:
                 continue
 
@@ -680,7 +680,7 @@ class ManFormat(OutputFormat):
                 # Pointer-to-function
                 self.data += f'".BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"' + "\n"
             else:
-                dtype = Re(r'([^\*])$').sub(r'\1 ', dtype)
+                dtype = KernRe(r'([^\*])$').sub(r'\1 ', dtype)
 
                 self.data += f'.BI "{parenth}{dtype}"  "{post}"' + "\n"
             count += 1
@@ -727,7 +727,7 @@ class ManFormat(OutputFormat):
         self.data += ".SH Constants\n"
 
         for parameter in parameterlist:
-            parameter_name = Re(r'\[.*').sub('', parameter)
+            parameter_name = KernRe(r'\[.*').sub('', parameter)
             self.data += f'.IP "{parameter}" 12' + "\n"
             self.output_highlight(args['parameterdescs'].get(parameter_name, ""))
 
@@ -769,7 +769,7 @@ class ManFormat(OutputFormat):
 
         # Replace tabs with two spaces and handle newlines
         declaration = definition.replace("\t", "  ")
-        declaration = Re(r"\n").sub('"\n.br\n.BI "', declaration)
+        declaration = KernRe(r"\n").sub('"\n.br\n.BI "', declaration)
 
         self.data += ".SH SYNOPSIS\n"
         self.data += f"{struct_type} {struct_name} " + "{" + "\n.br\n"
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index 33f00c77dd5f..f60722bcc687 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -16,7 +16,7 @@ import argparse
 import re
 from pprint import pformat
 
-from kdoc_re import NestedMatch, Re
+from kdoc_re import NestedMatch, KernRe
 
 
 #
@@ -29,12 +29,12 @@ from kdoc_re import NestedMatch, Re
 #
 
 # Allow whitespace at end of comment start.
-doc_start = Re(r'^/\*\*\s*$', cache=False)
+doc_start = KernRe(r'^/\*\*\s*$', cache=False)
 
-doc_end = Re(r'\*/', cache=False)
-doc_com = Re(r'\s*\*\s*', cache=False)
-doc_com_body = Re(r'\s*\* ?', cache=False)
-doc_decl = doc_com + Re(r'(\w+)', cache=False)
+doc_end = KernRe(r'\*/', cache=False)
+doc_com = KernRe(r'\s*\*\s*', cache=False)
+doc_com_body = KernRe(r'\s*\* ?', cache=False)
+doc_decl = doc_com + KernRe(r'(\w+)', cache=False)
 
 # @params and a strictly limited set of supported section names
 # Specifically:
@@ -44,22 +44,22 @@ doc_decl = doc_com + Re(r'(\w+)', cache=False)
 # while trying to not match literal block starts like "example::"
 #
 doc_sect = doc_com + \
-            Re(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:([^:].*)?$',
+            KernRe(r'\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:([^:].*)?$',
                 flags=re.I, cache=False)
 
-doc_content = doc_com_body + Re(r'(.*)', cache=False)
-doc_block = doc_com + Re(r'DOC:\s*(.*)?', cache=False)
-doc_inline_start = Re(r'^\s*/\*\*\s*$', cache=False)
-doc_inline_sect = Re(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
-doc_inline_end = Re(r'^\s*\*/\s*$', cache=False)
-doc_inline_oneline = Re(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
-attribute = Re(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
+doc_content = doc_com_body + KernRe(r'(.*)', cache=False)
+doc_block = doc_com + KernRe(r'DOC:\s*(.*)?', cache=False)
+doc_inline_start = KernRe(r'^\s*/\*\*\s*$', cache=False)
+doc_inline_sect = KernRe(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
+doc_inline_end = KernRe(r'^\s*\*/\s*$', cache=False)
+doc_inline_oneline = KernRe(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
+attribute = KernRe(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
                flags=re.I | re.S, cache=False)
 
-export_symbol = Re(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
-export_symbol_ns = Re(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
+export_symbol = KernRe(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
+export_symbol_ns = KernRe(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
 
-type_param = Re(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+type_param = KernRe(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
 
 
 class KernelDoc:
@@ -278,10 +278,10 @@ class KernelDoc:
 
         self.entry.anon_struct_union = False
 
-        param = Re(r'[\[\)].*').sub('', param, count=1)
+        param = KernRe(r'[\[\)].*').sub('', param, count=1)
 
         if dtype == "" and param.endswith("..."):
-            if Re(r'\w\.\.\.$').search(param):
+            if KernRe(r'\w\.\.\.$').search(param):
                 # For named variable parameters of the form `x...`,
                 # remove the dots
                 param = param[:-3]
@@ -335,7 +335,7 @@ class KernelDoc:
         # to ignore "[blah" in a parameter string.
 
         self.entry.parameterlist.append(param)
-        org_arg = Re(r'\s\s+').sub(' ', org_arg)
+        org_arg = KernRe(r'\s\s+').sub(' ', org_arg)
         self.entry.parametertypes[param] = org_arg
 
     def save_struct_actual(self, actual):
@@ -344,7 +344,7 @@ class KernelDoc:
         one string item.
         """
 
-        actual = Re(r'\s*').sub("", actual, count=1)
+        actual = KernRe(r'\s*').sub("", actual, count=1)
 
         self.entry.struct_actual += actual + " "
 
@@ -355,20 +355,20 @@ class KernelDoc:
         """
 
         # temporarily replace all commas inside function pointer definition
-        arg_expr = Re(r'(\([^\),]+),')
+        arg_expr = KernRe(r'(\([^\),]+),')
         while arg_expr.search(args):
             args = arg_expr.sub(r"\1#", args)
 
         for arg in args.split(splitter):
             # Strip comments
-            arg = Re(r'\/\*.*\*\/').sub('', arg)
+            arg = KernRe(r'\/\*.*\*\/').sub('', arg)
 
             # Ignore argument attributes
-            arg = Re(r'\sPOS0?\s').sub(' ', arg)
+            arg = KernRe(r'\sPOS0?\s').sub(' ', arg)
 
             # Strip leading/trailing spaces
             arg = arg.strip()
-            arg = Re(r'\s+').sub(' ', arg, count=1)
+            arg = KernRe(r'\s+').sub(' ', arg, count=1)
 
             if arg.startswith('#'):
                 # Treat preprocessor directive as a typeless variable just to fill
@@ -379,63 +379,63 @@ class KernelDoc:
                 self.push_parameter(ln, decl_type, arg, "",
                                     "", declaration_name)
 
-            elif Re(r'\(.+\)\s*\(').search(arg):
+            elif KernRe(r'\(.+\)\s*\(').search(arg):
                 # Pointer-to-function
 
                 arg = arg.replace('#', ',')
 
-                r = Re(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
+                r = KernRe(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
                 if r.match(arg):
                     param = r.group(1)
                 else:
                     self.emit_warning(ln, f"Invalid param: {arg}")
                     param = arg
 
-                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+                dtype = KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
                 self.save_struct_actual(param)
                 self.push_parameter(ln, decl_type, param, dtype,
                                     arg, declaration_name)
 
-            elif Re(r'\(.+\)\s*\[').search(arg):
+            elif KernRe(r'\(.+\)\s*\[').search(arg):
                 # Array-of-pointers
 
                 arg = arg.replace('#', ',')
-                r = Re(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
+                r = KernRe(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
                 if r.match(arg):
                     param = r.group(1)
                 else:
                     self.emit_warning(ln, f"Invalid param: {arg}")
                     param = arg
 
-                dtype = Re(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+                dtype = KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
 
                 self.save_struct_actual(param)
                 self.push_parameter(ln, decl_type, param, dtype,
                                     arg, declaration_name)
 
             elif arg:
-                arg = Re(r'\s*:\s*').sub(":", arg)
-                arg = Re(r'\s*\[').sub('[', arg)
+                arg = KernRe(r'\s*:\s*').sub(":", arg)
+                arg = KernRe(r'\s*\[').sub('[', arg)
 
-                args = Re(r'\s*,\s*').split(arg)
+                args = KernRe(r'\s*,\s*').split(arg)
                 if args[0] and '*' in args[0]:
                     args[0] = re.sub(r'(\*+)\s*', r' \1', args[0])
 
                 first_arg = []
-                r = Re(r'^(.*\s+)(.*?\[.*\].*)$')
+                r = KernRe(r'^(.*\s+)(.*?\[.*\].*)$')
                 if args[0] and r.match(args[0]):
                     args.pop(0)
                     first_arg.extend(r.group(1))
                     first_arg.append(r.group(2))
                 else:
-                    first_arg = Re(r'\s+').split(args.pop(0))
+                    first_arg = KernRe(r'\s+').split(args.pop(0))
 
                 args.insert(0, first_arg.pop())
                 dtype = ' '.join(first_arg)
 
                 for param in args:
-                    if Re(r'^(\*+)\s*(.*)').match(param):
-                        r = Re(r'^(\*+)\s*(.*)')
+                    if KernRe(r'^(\*+)\s*(.*)').match(param):
+                        r = KernRe(r'^(\*+)\s*(.*)')
                         if not r.match(param):
                             self.emit_warning(ln, f"Invalid param: {param}")
                             continue
@@ -447,8 +447,8 @@ class KernelDoc:
                                             f"{dtype} {r.group(1)}",
                                             arg, declaration_name)
 
-                    elif Re(r'(.*?):(\w+)').search(param):
-                        r = Re(r'(.*?):(\w+)')
+                    elif KernRe(r'(.*?):(\w+)').search(param):
+                        r = KernRe(r'(.*?):(\w+)')
                         if not r.match(param):
                             self.emit_warning(ln, f"Invalid param: {param}")
                             continue
@@ -477,7 +477,7 @@ class KernelDoc:
             err = True
             for px in range(len(prms)):               # pylint: disable=C0200
                 prm_clean = prms[px]
-                prm_clean = Re(r'\[.*\]').sub('', prm_clean)
+                prm_clean = KernRe(r'\[.*\]').sub('', prm_clean)
                 prm_clean = attribute.sub('', prm_clean)
 
                 # ignore array size in a parameter string;
@@ -486,7 +486,7 @@ class KernelDoc:
                 # and this appears in @prms as "addr[6" since the
                 # parameter list is split at spaces;
                 # hence just ignore "[..." for the sections check;
-                prm_clean = Re(r'\[.*').sub('', prm_clean)
+                prm_clean = KernRe(r'\[.*').sub('', prm_clean)
 
                 if prm_clean == sects[sx]:
                     err = False
@@ -512,7 +512,7 @@ class KernelDoc:
 
         # Ignore an empty return type (It's a macro)
         # Ignore functions with a "void" return type (but not "void *")
-        if not return_type or Re(r'void\s*\w*\s*$').search(return_type):
+        if not return_type or KernRe(r'void\s*\w*\s*$').search(return_type):
             return
 
         if not self.entry.sections.get("Return", None):
@@ -535,20 +535,20 @@ class KernelDoc:
         ]
 
         definition_body = r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) + ")?"
-        struct_members = Re(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
+        struct_members = KernRe(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
 
         # Extract struct/union definition
         members = None
         declaration_name = None
         decl_type = None
 
-        r = Re(type_pattern + r'\s+(\w+)\s*' + definition_body)
+        r = KernRe(type_pattern + r'\s+(\w+)\s*' + definition_body)
         if r.search(proto):
             decl_type = r.group(1)
             declaration_name = r.group(2)
             members = r.group(3)
         else:
-            r = Re(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
+            r = KernRe(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
 
             if r.search(proto):
                 decl_type = r.group(1)
@@ -567,21 +567,21 @@ class KernelDoc:
         args_pattern = r'([^,)]+)'
 
         sub_prefixes = [
-            (Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), ''),
-            (Re(r'\/\*\s*private:.*', re.S | re.I), ''),
+            (KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), ''),
+            (KernRe(r'\/\*\s*private:.*', re.S | re.I), ''),
 
             # Strip comments
-            (Re(r'\/\*.*?\*\/', re.S), ''),
+            (KernRe(r'\/\*.*?\*\/', re.S), ''),
 
             # Strip attributes
             (attribute, ' '),
-            (Re(r'\s*__aligned\s*\([^;]*\)', re.S), ' '),
-            (Re(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '),
-            (Re(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '),
-            (Re(r'\s*__packed\s*', re.S), ' '),
-            (Re(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '),
-            (Re(r'\s*____cacheline_aligned_in_smp', re.S), ' '),
-            (Re(r'\s*____cacheline_aligned', re.S), ' '),
+            (KernRe(r'\s*__aligned\s*\([^;]*\)', re.S), ' '),
+            (KernRe(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '),
+            (KernRe(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '),
+            (KernRe(r'\s*__packed\s*', re.S), ' '),
+            (KernRe(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '),
+            (KernRe(r'\s*____cacheline_aligned_in_smp', re.S), ' '),
+            (KernRe(r'\s*____cacheline_aligned', re.S), ' '),
 
             # Unwrap struct_group macros based on this definition:
             # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
@@ -616,10 +616,10 @@ class KernelDoc:
             # matched. So, the implementation to drop STRUCT_GROUP() will be
             # handled in separate.
 
-            (Re(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('),
-            (Re(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GROUP('),
-            (Re(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'struct \1 \2; STRUCT_GROUP('),
-            (Re(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP('),
+            (KernRe(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('),
+            (KernRe(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GROUP('),
+            (KernRe(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'struct \1 \2; STRUCT_GROUP('),
+            (KernRe(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP('),
 
             # Replace macros
             #
@@ -628,15 +628,15 @@ class KernelDoc:
             # it is better to also move those to the NestedMatch logic,
             # to ensure that parenthesis will be properly matched.
 
-            (Re(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
-            (Re(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
-            (Re(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'),
-            (Re(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'),
-            (Re(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
-            (Re(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
-            (Re(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\1 \2[]'),
-            (Re(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S), r'dma_addr_t \1'),
-            (Re(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S), r'__u32 \1'),
+            (KernRe(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
+            (KernRe(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
+            (KernRe(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'),
+            (KernRe(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'),
+            (KernRe(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+            (KernRe(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+            (KernRe(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\1 \2[]'),
+            (KernRe(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S), r'dma_addr_t \1'),
+            (KernRe(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S), r'__u32 \1'),
         ]
 
         # Regexes here are guaranteed to have the end limiter matching
@@ -689,8 +689,8 @@ class KernelDoc:
                     s_id = s_id.strip()
 
                     newmember += f"{maintype} {s_id}; "
-                    s_id = Re(r'[:\[].*').sub('', s_id)
-                    s_id = Re(r'^\s*\**(\S+)\s*').sub(r'\1', s_id)
+                    s_id = KernRe(r'[:\[].*').sub('', s_id)
+                    s_id = KernRe(r'^\s*\**(\S+)\s*').sub(r'\1', s_id)
 
                     for arg in content.split(';'):
                         arg = arg.strip()
@@ -698,7 +698,7 @@ class KernelDoc:
                         if not arg:
                             continue
 
-                        r = Re(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)')
+                        r = KernRe(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)')
                         if r.match(arg):
                             # Pointer-to-function
                             dtype = r.group(1)
@@ -717,15 +717,15 @@ class KernelDoc:
                         else:
                             arg = arg.strip()
                             # Handle bitmaps
-                            arg = Re(r':\s*\d+\s*').sub('', arg)
+                            arg = KernRe(r':\s*\d+\s*').sub('', arg)
 
                             # Handle arrays
-                            arg = Re(r'\[.*\]').sub('', arg)
+                            arg = KernRe(r'\[.*\]').sub('', arg)
 
                             # Handle multiple IDs
-                            arg = Re(r'\s*,\s*').sub(',', arg)
+                            arg = KernRe(r'\s*,\s*').sub(',', arg)
 
-                            r = Re(r'(.*)\s+([\S+,]+)')
+                            r = KernRe(r'(.*)\s+([\S+,]+)')
 
                             if r.search(arg):
                                 dtype = r.group(1)
@@ -735,7 +735,7 @@ class KernelDoc:
                                 continue
 
                             for name in names.split(','):
-                                name = Re(r'^\s*\**(\S+)\s*').sub(r'\1', name).strip()
+                                name = KernRe(r'^\s*\**(\S+)\s*').sub(r'\1', name).strip()
 
                                 if not name:
                                     continue
@@ -757,12 +757,12 @@ class KernelDoc:
                             self.entry.sectcheck, self.entry.struct_actual)
 
         # Adjust declaration for better display
-        declaration = Re(r'([\{;])').sub(r'\1\n', declaration)
-        declaration = Re(r'\}\s+;').sub('};', declaration)
+        declaration = KernRe(r'([\{;])').sub(r'\1\n', declaration)
+        declaration = KernRe(r'\}\s+;').sub('};', declaration)
 
         # Better handle inlined enums
         while True:
-            r = Re(r'(enum\s+\{[^\}]+),([^\n])')
+            r = KernRe(r'(enum\s+\{[^\}]+),([^\n])')
             if not r.search(declaration):
                 break
 
@@ -774,7 +774,7 @@ class KernelDoc:
         for clause in def_args:
 
             clause = clause.strip()
-            clause = Re(r'\s+').sub(' ', clause, count=1)
+            clause = KernRe(r'\s+').sub(' ', clause, count=1)
 
             if not clause:
                 continue
@@ -782,7 +782,7 @@ class KernelDoc:
             if '}' in clause and level > 1:
                 level -= 1
 
-            if not Re(r'^\s*#').match(clause):
+            if not KernRe(r'^\s*#').match(clause):
                 declaration += "\t" * level
 
             declaration += "\t" + clause + "\n"
@@ -807,24 +807,24 @@ class KernelDoc:
         """
 
         # Ignore members marked private
-        proto = Re(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=re.S).sub('', proto)
-        proto = Re(r'\/\*\s*private:.*}', flags=re.S).sub('}', proto)
+        proto = KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=re.S).sub('', proto)
+        proto = KernRe(r'\/\*\s*private:.*}', flags=re.S).sub('}', proto)
 
         # Strip comments
-        proto = Re(r'\/\*.*?\*\/', flags=re.S).sub('', proto)
+        proto = KernRe(r'\/\*.*?\*\/', flags=re.S).sub('', proto)
 
         # Strip #define macros inside enums
-        proto = Re(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=re.S).sub('', proto)
+        proto = KernRe(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=re.S).sub('', proto)
 
         members = None
         declaration_name = None
 
-        r = Re(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;')
+        r = KernRe(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;')
         if r.search(proto):
             declaration_name = r.group(2)
             members = r.group(1).rstrip()
         else:
-            r = Re(r'enum\s+(\w*)\s*\{(.*)\}')
+            r = KernRe(r'enum\s+(\w*)\s*\{(.*)\}')
             if r.match(proto):
                 declaration_name = r.group(1)
                 members = r.group(2).rstrip()
@@ -847,12 +847,12 @@ class KernelDoc:
 
         member_set = set()
 
-        members = Re(r'\([^;]*?[\)]').sub('', members)
+        members = KernRe(r'\([^;]*?[\)]').sub('', members)
 
         for arg in members.split(','):
             if not arg:
                 continue
-            arg = Re(r'^\s*(\w+).*').sub(r'\1', arg)
+            arg = KernRe(r'^\s*(\w+).*').sub(r'\1', arg)
             self.entry.parameterlist.append(arg)
             if arg not in self.entry.parameterdescs:
                 self.entry.parameterdescs[arg] = self.undescribed
@@ -947,10 +947,10 @@ class KernelDoc:
         ]
 
         for search, sub, flags in sub_prefixes:
-            prototype = Re(search, flags).sub(sub, prototype)
+            prototype = KernRe(search, flags).sub(sub, prototype)
 
         # Macros are a special case, as they change the prototype format
-        new_proto = Re(r"^#\s*define\s+").sub("", prototype)
+        new_proto = KernRe(r"^#\s*define\s+").sub("", prototype)
         if new_proto != prototype:
             is_define_proto = True
             prototype = new_proto
@@ -987,7 +987,7 @@ class KernelDoc:
         found = False
 
         if is_define_proto:
-            r = Re(r'^()(' + name + r')\s+')
+            r = KernRe(r'^()(' + name + r')\s+')
 
             if r.search(prototype):
                 return_type = ''
@@ -1004,7 +1004,7 @@ class KernelDoc:
             ]
 
             for p in patterns:
-                r = Re(p)
+                r = KernRe(p)
 
                 if r.match(prototype):
 
@@ -1071,11 +1071,11 @@ class KernelDoc:
         typedef_ident = r'\*?\s*(\w\S+)\s*'
         typedef_args = r'\s*\((.*)\);'
 
-        typedef1 = Re(r'typedef' + typedef_type + r'\(' + typedef_ident + r'\)' + typedef_args)
-        typedef2 = Re(r'typedef' + typedef_type + typedef_ident + typedef_args)
+        typedef1 = KernRe(r'typedef' + typedef_type + r'\(' + typedef_ident + r'\)' + typedef_args)
+        typedef2 = KernRe(r'typedef' + typedef_type + typedef_ident + typedef_args)
 
         # Strip comments
-        proto = Re(r'/\*.*?\*/', flags=re.S).sub('', proto)
+        proto = KernRe(r'/\*.*?\*/', flags=re.S).sub('', proto)
 
         # Parse function typedef prototypes
         for r in [typedef1, typedef2]:
@@ -1109,12 +1109,12 @@ class KernelDoc:
             return
 
         # Handle nested parentheses or brackets
-        r = Re(r'(\(*.\)\s*|\[*.\]\s*);$')
+        r = KernRe(r'(\(*.\)\s*|\[*.\]\s*);$')
         while r.search(proto):
             proto = r.sub('', proto)
 
         # Parse simple typedefs
-        r = Re(r'typedef.*\s+(\w+)\s*;')
+        r = KernRe(r'typedef.*\s+(\w+)\s*;')
         if r.match(proto):
             declaration_name = r.group(1)
 
@@ -1195,12 +1195,12 @@ class KernelDoc:
             decl_end = r"(?:[-:].*)"         # end of the name part
 
             # test for pointer declaration type, foo * bar() - desc
-            r = Re(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}?$")
+            r = KernRe(fr"^{decl_start}([\w\s]+?){parenthesis}?\s*{decl_end}?$")
             if r.search(line):
                 self.entry.identifier = r.group(1)
 
             # Test for data declaration
-            r = Re(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)")
+            r = KernRe(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)")
             if r.search(line):
                 self.entry.decl_type = r.group(1)
                 self.entry.identifier = r.group(2)
@@ -1209,15 +1209,15 @@ class KernelDoc:
                 # Look for foo() or static void foo() - description;
                 # or misspelt identifier
 
-                r1 = Re(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s*{decl_end}?$")
-                r2 = Re(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis}\s*{decl_end}$")
+                r1 = KernRe(fr"^{decl_start}{fn_type}(\w+)\s*{parenthesis}\s*{decl_end}?$")
+                r2 = KernRe(fr"^{decl_start}{fn_type}(\w+[^-:]*){parenthesis}\s*{decl_end}$")
 
                 for r in [r1, r2]:
                     if r.search(line):
                         self.entry.identifier = r.group(1)
                         self.entry.decl_type = "function"
 
-                        r = Re(r"define\s+")
+                        r = KernRe(r"define\s+")
                         self.entry.identifier = r.sub("", self.entry.identifier)
                         self.entry.is_kernel_comment = True
                         break
@@ -1230,12 +1230,12 @@ class KernelDoc:
             self.entry.section = self.section_default
             self.entry.new_start_line = ln + 1
 
-            r = Re("[-:](.*)")
+            r = KernRe("[-:](.*)")
             if r.search(line):
                 # strip leading/trailing/multiple spaces
                 self.entry.descr = r.group(1).strip(" ")
 
-                r = Re(r"\s+")
+                r = KernRe(r"\s+")
                 self.entry.descr = r.sub(" ", self.entry.descr)
                 self.entry.declaration_purpose = self.entry.descr
                 self.state = self.STATE_BODY_MAYBE
@@ -1272,7 +1272,7 @@ class KernelDoc:
         """
 
         if self.state == self.STATE_BODY_WITH_BLANK_LINE:
-            r = Re(r"\s*\*\s?\S")
+            r = KernRe(r"\s*\*\s?\S")
             if r.match(line):
                 self.dump_section()
                 self.entry.section = self.section_default
@@ -1318,7 +1318,7 @@ class KernelDoc:
             self.dump_section()
 
             # Look for doc_com + <text> + doc_end:
-            r = Re(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
+            r = KernRe(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
             if r.match(line):
                 self.emit_warning(ln, f"suspicious ending line: {line}")
 
@@ -1351,7 +1351,7 @@ class KernelDoc:
                 self.entry.declaration_purpose = self.entry.declaration_purpose.rstrip()
                 self.entry.declaration_purpose += " " + cont
 
-                r = Re(r"\s+")
+                r = KernRe(r"\s+")
                 self.entry.declaration_purpose = r.sub(' ',
                                                        self.entry.declaration_purpose)
 
@@ -1359,7 +1359,7 @@ class KernelDoc:
                 if self.entry.section.startswith('@') or        \
                    self.entry.section == self.section_context:
                     if self.entry.leading_space is None:
-                        r = Re(r'^(\s+)')
+                        r = KernRe(r'^(\s+)')
                         if r.match(cont):
                             self.entry.leading_space = len(r.group(1))
                         else:
@@ -1436,13 +1436,13 @@ class KernelDoc:
             is_void = True
 
         # Replace SYSCALL_DEFINE with correct return type & function name
-        proto = Re(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto)
+        proto = KernRe(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto)
 
-        r = Re(r'long\s+(sys_.*?),')
+        r = KernRe(r'long\s+(sys_.*?),')
         if r.search(proto):
-            proto = Re(',').sub('(', proto, count=1)
+            proto = KernRe(',').sub('(', proto, count=1)
         elif is_void:
-            proto = Re(r'\)').sub('(void)', proto, count=1)
+            proto = KernRe(r'\)').sub('(void)', proto, count=1)
 
         # Now delete all of the odd-numbered commas in the proto
         # so that argument types & names don't have a comma between them
@@ -1469,22 +1469,22 @@ class KernelDoc:
         tracepointargs = None
 
         # Match tracepoint name based on different patterns
-        r = Re(r'TRACE_EVENT\((.*?),')
+        r = KernRe(r'TRACE_EVENT\((.*?),')
         if r.search(proto):
             tracepointname = r.group(1)
 
-        r = Re(r'DEFINE_SINGLE_EVENT\((.*?),')
+        r = KernRe(r'DEFINE_SINGLE_EVENT\((.*?),')
         if r.search(proto):
             tracepointname = r.group(1)
 
-        r = Re(r'DEFINE_EVENT\((.*?),(.*?),')
+        r = KernRe(r'DEFINE_EVENT\((.*?),(.*?),')
         if r.search(proto):
             tracepointname = r.group(2)
 
         if tracepointname:
             tracepointname = tracepointname.lstrip()
 
-        r = Re(r'TP_PROTO\((.*?)\)')
+        r = KernRe(r'TP_PROTO\((.*?)\)')
         if r.search(proto):
             tracepointargs = r.group(1)
 
@@ -1501,43 +1501,43 @@ class KernelDoc:
         """Ancillary routine to process a function prototype"""
 
         # strip C99-style comments to end of line
-        r = Re(r"\/\/.*$", re.S)
+        r = KernRe(r"\/\/.*$", re.S)
         line = r.sub('', line)
 
-        if Re(r'\s*#\s*define').match(line):
+        if KernRe(r'\s*#\s*define').match(line):
             self.entry.prototype = line
         elif line.startswith('#'):
             # Strip other macros like #ifdef/#ifndef/#endif/...
             pass
         else:
-            r = Re(r'([^\{]*)')
+            r = KernRe(r'([^\{]*)')
             if r.match(line):
                 self.entry.prototype += r.group(1) + " "
 
-        if '{' in line or ';' in line or Re(r'\s*#\s*define').match(line):
+        if '{' in line or ';' in line or KernRe(r'\s*#\s*define').match(line):
             # strip comments
-            r = Re(r'/\*.*?\*/')
+            r = KernRe(r'/\*.*?\*/')
             self.entry.prototype = r.sub('', self.entry.prototype)
 
             # strip newlines/cr's
-            r = Re(r'[\r\n]+')
+            r = KernRe(r'[\r\n]+')
             self.entry.prototype = r.sub(' ', self.entry.prototype)
 
             # strip leading spaces
-            r = Re(r'^\s+')
+            r = KernRe(r'^\s+')
             self.entry.prototype = r.sub('', self.entry.prototype)
 
             # Handle self.entry.prototypes for function pointers like:
             #       int (*pcs_config)(struct foo)
 
-            r = Re(r'^(\S+\s+)\(\s*\*(\S+)\)')
+            r = KernRe(r'^(\S+\s+)\(\s*\*(\S+)\)')
             self.entry.prototype = r.sub(r'\1\2', self.entry.prototype)
 
             if 'SYSCALL_DEFINE' in self.entry.prototype:
                 self.entry.prototype = self.syscall_munge(ln,
                                                           self.entry.prototype)
 
-            r = Re(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT')
+            r = KernRe(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT')
             if r.search(self.entry.prototype):
                 self.entry.prototype = self.tracepoint_munge(ln,
                                                              self.entry.prototype)
@@ -1549,22 +1549,22 @@ class KernelDoc:
         """Ancillary routine to process a type"""
 
         # Strip newlines/cr's.
-        line = Re(r'[\r\n]+', re.S).sub(' ', line)
+        line = KernRe(r'[\r\n]+', re.S).sub(' ', line)
 
         # Strip leading spaces
-        line = Re(r'^\s+', re.S).sub('', line)
+        line = KernRe(r'^\s+', re.S).sub('', line)
 
         # Strip trailing spaces
-        line = Re(r'\s+$', re.S).sub('', line)
+        line = KernRe(r'\s+$', re.S).sub('', line)
 
         # Strip C99-style comments to the end of the line
-        line = Re(r"\/\/.*$", re.S).sub('', line)
+        line = KernRe(r"\/\/.*$", re.S).sub('', line)
 
         # To distinguish preprocessor directive from regular declaration later.
         if line.startswith('#'):
             line += ";"
 
-        r = Re(r'([^\{\};]*)([\{\};])(.*)')
+        r = KernRe(r'([^\{\};]*)([\{\};])(.*)')
         while True:
             if r.search(line):
                 if self.entry.prototype:
diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
index d28485ff94d6..e81695b273bf 100755
--- a/scripts/lib/kdoc/kdoc_re.py
+++ b/scripts/lib/kdoc/kdoc_re.py
@@ -14,7 +14,7 @@ import re
 re_cache = {}
 
 
-class Re:
+class KernRe:
     """
     Helper class to simplify regex declaration and usage,
 
@@ -59,7 +59,7 @@ class Re:
         Allows adding two regular expressions into one.
         """
 
-        return Re(str(self) + str(other), cache=self.cache or other.cache,
+        return KernRe(str(self) + str(other), cache=self.cache or other.cache,
                   flags=self.regex.flags | other.regex.flags)
 
     def match(self, string):
-- 
2.49.0



* [PATCH v3 33/33] scripts: kernel-doc: fix parsing function-like typedefs (again)
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (31 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 32/33] scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe Mauro Carvalho Chehab
@ 2025-04-08 10:09 ` Mauro Carvalho Chehab
  2025-04-09  5:29 ` [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-08 10:09 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: Sean Anderson, Mauro Carvalho Chehab, Russell King, linux-kernel,
	netdev

From: Sean Anderson <sean.anderson@linux.dev>

Typedefs like

    typedef struct phylink_pcs *(*pcs_xlate_t)(const u64 *args);

have a typedef_type that ends with a * and therefore has no word
boundary. Add an extra clause for the final group of the typedef_type so
we only require a word boundary if we match a word.
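
The effect can be checked outside the script. The following is a standalone
sketch that uses plain `re` rather than the KernRe wrapper; the patterns are
the ones from this patch, while the `typedef1()` helper and the variable names
are only illustrative:

```python
import re

# typedef_type before and after this patch
OLD_TYPE = r'((?:\s+[\w\*]+\b){1,8})\s*'
NEW_TYPE = r'((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s*'

# Unchanged parts of the pattern
IDENT = r'\*?\s*(\w\S+)\s*'
ARGS = r'\s*\((.*)\);'


def typedef1(type_pattern):
    """Build the 'typedef <type> (<ident>)(<args>);' pattern."""
    return re.compile(r'typedef' + type_pattern + r'\(' + IDENT + r'\)' + ARGS)


proto = "typedef struct phylink_pcs *(*pcs_xlate_t)(const u64 *args);"

# Old pattern: " *" is followed by "(", so \b never matches after the "*"
old = typedef1(OLD_TYPE).match(proto)

# New pattern: the last group may be a run of "*" with no word boundary
new = typedef1(NEW_TYPE).match(proto)

# A typedef whose return type ends in a word keeps matching either way
simple = "typedef int (*cb_t)(void);"
simple_old = typedef1(OLD_TYPE).match(simple)
simple_new = typedef1(NEW_TYPE).match(simple)
```

With the old pattern `old` is None; with the new one the identifier group
holds `pcs_xlate_t`.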

[mchehab: modify also kernel-doc.py, as we're deprecating the perl version]

Fixes: 7d2c6b1edf79 ("scripts: kernel-doc: fix parsing function-like typedefs")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 scripts/kernel-doc.pl           | 2 +-
 scripts/lib/kdoc/kdoc_parser.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/kernel-doc.pl b/scripts/kernel-doc.pl
index af6cf408b96d..5db23cbf4eb2 100755
--- a/scripts/kernel-doc.pl
+++ b/scripts/kernel-doc.pl
@@ -1325,7 +1325,7 @@ sub dump_enum($$) {
     }
 }
 
-my $typedef_type = qr { ((?:\s+[\w\*]+\b){1,8})\s* }x;
+my $typedef_type = qr { ((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s* }x;
 my $typedef_ident = qr { \*?\s*(\w\S+)\s* }x;
 my $typedef_args = qr { \s*\((.*)\); }x;
 
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index f60722bcc687..4f036c720b36 100755
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -1067,7 +1067,7 @@ class KernelDoc:
         Stores a typedef inside self.entries array.
         """
 
-        typedef_type = r'((?:\s+[\w\*]+\b){1,8})\s*'
+        typedef_type = r'((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s*'
         typedef_ident = r'\*?\s*(\w\S+)\s*'
         typedef_args = r'\s*\((.*)\);'
 
-- 
2.49.0



* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (32 preceding siblings ...)
  2025-04-08 10:09 ` [PATCH v3 33/33] scripts: kernel-doc: fix parsing function-like typedefs (again) Mauro Carvalho Chehab
@ 2025-04-09  5:29 ` Mauro Carvalho Chehab
  2025-04-09 10:16 ` Jani Nikula
  2025-04-09 18:30 ` Jonathan Corbet
  35 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-09  5:29 UTC (permalink / raw)
  To: Linux Doc Mailing List, Jonathan Corbet
  Cc: linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

Jon,

On Tue,  8 Apr 2025 18:09:03 +0800
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> Mauro Carvalho Chehab (32):
>   scripts/kernel-doc: rename it to scripts/kernel-doc.pl
>   scripts/kernel-doc: add a symlink to the Perl version of kernel-doc

I forgot to mention this in the description: after review, it makes sense
to merge those two patches into one.

Keeping them separate is good for review, but merging them makes
sense from a bisectability PoV.

Regards,
Mauro


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (33 preceding siblings ...)
  2025-04-09  5:29 ` [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
@ 2025-04-09 10:16 ` Jani Nikula
  2025-04-09 11:44   ` Mauro Carvalho Chehab
  2025-04-09 18:30 ` Jonathan Corbet
  35 siblings, 1 reply; 56+ messages in thread
From: Jani Nikula @ 2025-04-09 10:16 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Linux Doc Mailing List, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, linux-kernel, Gustavo A. R. Silva,
	Kees Cook, Russell King, linux-hardening, netdev

On Tue, 08 Apr 2025, Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Hi Jon,
>
> This changeset contains the kernel-doc.py script to replace the venerable
> kernel-doc originally written in Perl. It replaces the first version and the
> second series I sent on the top of it.

Yay! Thanks for doing this. I believe this will make contributing to
kernel-doc more accessible in the long run.

> I tried to stay as close as possible to the original Perl implementation
> in the first patch introducing kernel-doc.py, as it helps to double-check
> that each function was properly translated to Python. This has been
> helpful for debugging problems that came up during the conversion.
>
> I worked hard to make it bug-compatible with the original one. Still, its
> output has a couple of differences from the original one:
>
> - The tab expansion works better with the Python script. With that, some
>   outputs that contain tabs at kernel-doc markups are now different;
>
> - The new script works better at stripping blank lines. So, there are a couple
>   of empty new lines that are now stripped with this version;
>
> - There is buggy logic in kernel-doc to strip empty description and
>   return sections. I was not able to replicate the exact behavior, so I ended up
>   adding extra logic to strip empty sections with a different algorithm.
>
> Yet, on my tests, the results are compatible with the venerable script
> output for all .. kernel-doc tags found in Documentation/. I double-checked
> this by adding support to output the kernel-doc commands when V=1, and
> then I ran a diff between kernel-doc.pl and kernel-doc.py for the same
> command lines.
>
> The only patch that doesn't belong to this series is a patch dropping
> kernel-doc.pl. I opted to keep it for now, as it can help to better
> test the new tools.
>
> With such changes, if one wants to build docs with the old script,
> all it is needed is to use KERNELDOC parameter, e.g.:
>
> 	$ make KERNELDOC=scripts/kernel-doc.pl htmldocs

I guess that's good for double checking that the python version
reproduces the output of the old version, warts and all. And it could be
used standalone for comparing the output for .[ch] files directly
instead of going through Sphinx.
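
A minimal sketch of such a standalone comparison, assuming the two
outputs have already been captured as text. The helper name
diff_outputs() and the sample strings are illustrative assumptions,
not part of either script:

```python
import difflib


def diff_outputs(old_text, new_text,
                 old_name="kernel-doc.pl", new_name="kernel-doc.py"):
    """Return a unified diff between two captured kernel-doc outputs."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


# Illustrative data only: the second output drops a trailing blank line.
perl_out = ".. c:function:: int foo (int x)\n\n   Does foo.\n\n"
python_out = ".. c:function:: int foo (int x)\n\n   Does foo.\n"

for line in diff_outputs(perl_out, python_out):
    print(line)
```

An empty result means the two outputs match line by line.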

But once we're reasonably sure the new one works fine, I think the
natural follow-up will be to import the kernel-doc python module from
the kernel-doc Sphinx extension instead of running it with
subprocess.Popen(). It'll bypass an absolutely insane amount of forks,
python interpreter launches and module imports.

It'll also open the door for passing the results in python native
structures instead of text, also making it possible to cache parse
results instead of parsing the source files for every kernel-doc
directive in rst.
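
The caching idea could be sketched roughly as below. This is only an
illustration under stated assumptions: parse_source() is a hypothetical
stand-in for the real parser (it is not the kernel-doc API), and the
returned dictionary layout is made up for the example:

```python
from functools import lru_cache

# Counts how often the (hypothetical) parser actually runs.
stats = {"parse_calls": 0}


def parse_source(path):
    """Placeholder for an expensive parse of one source file."""
    stats["parse_calls"] += 1
    return {"file": path, "symbols": ("foo", "bar")}


@lru_cache(maxsize=None)
def cached_parse(path):
    """Parse each source file at most once per process."""
    return parse_source(path)


def handle_directive(path, symbol):
    """Simulate one kernel-doc directive asking about one symbol."""
    return symbol in cached_parse(path)["symbols"]


# Two directives referencing the same file trigger a single parse.
handle_directive("include/linux/example.h", "foo")
handle_directive("include/linux/example.h", "bar")
print(stats["parse_calls"])
```

Such a cache only helps once the parser runs inside the Sphinx process;
with one subprocess per directive there is nothing to share.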

Another idea regarding code organization, again for future. Maybe we
should have a scripts/python/ directory structure, so we can point
python path there, and be able to import stuff from there? And
reasonably share code between modules. And have linters handle it
recursively, etc, etc.
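
As a rough sketch of the import side of this idea: a script could
prepend a shared directory to sys.path before importing from it. The
helper name add_import_root() and the "python" subdirectory are
assumptions taken from the proposal above, not an existing layout:

```python
import os
import sys


def add_import_root(script_path, rel_dir="python"):
    """Prepend <directory of script>/<rel_dir> to sys.path exactly once."""
    script_dir = os.path.dirname(os.path.realpath(script_path))
    root = os.path.join(script_dir, rel_dir)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# A script would typically pass its own __file__ here.
root = add_import_root(os.path.join("scripts", "kernel-doc.py"))
print(root)
```

Linters can be pointed at the same directory, so that they resolve the
shared modules the same way the scripts do.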

Anyway, I applaud the work, and I regret that I don't have time to
review it in detail. Regardless, I think the matching output is the most
important part.


BR,
Jani.

> ---
>
> v3:
> - rebased on the top of v6.15-rc1;
> - Removed patches that weren't touching kernel-doc and its Sphinx extension;
> - The "Re" class was renamed to "KernRe"
> - It contains one patch from Sean with an additional hunk for the
>   python version.
>
> Mauro Carvalho Chehab (32):
>   scripts/kernel-doc: rename it to scripts/kernel-doc.pl
>   scripts/kernel-doc: add a symlink to the Perl version of kernel-doc
>   scripts/kernel-doc.py: add a Python parser
>   scripts/kernel-doc.py: output warnings the same way as kerneldoc
>   scripts/kernel-doc.py: better handle empty sections
>   scripts/kernel-doc.py: properly handle struct_group macros
>   scripts/kernel-doc.py: move regex methods to a separate file
>   scripts/kernel-doc.py: move KernelDoc class to a separate file
>   scripts/kernel-doc.py: move KernelFiles class to a separate file
>   scripts/kernel-doc.py: move output classes to a separate file
>   scripts/kernel-doc.py: convert message output to an interactor
>   scripts/kernel-doc.py: move file lists to the parser function
>   scripts/kernel-doc.py: implement support for -no-doc-sections
>   scripts/kernel-doc.py: fix line number output
>   scripts/kernel-doc.py: fix handling of doc output check
>   scripts/kernel-doc.py: properly handle out_section for ReST
>   scripts/kernel-doc.py: postpone warnings to the output plugin
>   docs: add a .pylintrc file with sys path for docs scripts
>   docs: sphinx: kerneldoc: verbose kernel-doc command if V=1
>   docs: sphinx: kerneldoc: ignore "\" characters from options
>   docs: sphinx: kerneldoc: use kernel-doc.py script
>   scripts/kernel-doc.py: Set an output format for --none
>   scripts/kernel-doc.py: adjust some coding style issues
>   scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13
>   scripts/kernel-doc.py: move modulename to man class
>   scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP
>   scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency
>   scripts/kernel-doc.py: Properly handle Werror and exit codes
>   scripts/kernel-doc: switch to use kernel-doc.py
>   scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname
>   scripts/kernel_doc.py: better handle exported symbols
>   scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe
>
> Sean Anderson (1):
>   scripts: kernel-doc: fix parsing function-like typedefs (again)
>
>  .pylintrc                         |    2 +
>  Documentation/Makefile            |    2 +-
>  Documentation/conf.py             |    2 +-
>  Documentation/sphinx/kerneldoc.py |   46 +
>  scripts/kernel-doc                | 2440 +----------------------------
>  scripts/kernel-doc.pl             | 2439 ++++++++++++++++++++++++++++
>  scripts/kernel-doc.py             |  315 ++++
>  scripts/lib/kdoc/kdoc_files.py    |  282 ++++
>  scripts/lib/kdoc/kdoc_output.py   |  793 ++++++++++
>  scripts/lib/kdoc/kdoc_parser.py   | 1715 ++++++++++++++++++++
>  scripts/lib/kdoc/kdoc_re.py       |  273 ++++
>  11 files changed, 5868 insertions(+), 2441 deletions(-)
>  create mode 100644 .pylintrc
>  mode change 100755 => 120000 scripts/kernel-doc
>  create mode 100755 scripts/kernel-doc.pl
>  create mode 100755 scripts/kernel-doc.py
>  create mode 100644 scripts/lib/kdoc/kdoc_files.py
>  create mode 100755 scripts/lib/kdoc/kdoc_output.py
>  create mode 100755 scripts/lib/kdoc/kdoc_parser.py
>  create mode 100755 scripts/lib/kdoc/kdoc_re.py

-- 
Jani Nikula, Intel


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-09 10:16 ` Jani Nikula
@ 2025-04-09 11:44   ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-09 11:44 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Linux Doc Mailing List, Jonathan Corbet, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

Em Wed, 09 Apr 2025 13:16:06 +0300
Jani Nikula <jani.nikula@linux.intel.com> escreveu:

> On Tue, 08 Apr 2025, Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > Hi Jon,
> >
> > This changeset contains the kernel-doc.py script to replace the verable
> > kernel-doc originally written in Perl. It replaces the first version and the
> > second series I sent on the top of it.  
> 
> Yay! Thanks for doing this. I believe this will make contributing to
> kernel-doc more accessible in the long run.
> 
> > I tried to stay as close as possible of the original Perl implementation
> > on the first patch introducing kernel-doc.py, as it helps to double check
> > if each function was  properly translated to Python.  This have been 
> > helpful debugging troubles that happened during the conversion.
> >
> > I worked hard to make it bug-compatible with the original one. Still, its
> > output has a couple of differences from the original one:
> >
> > - The tab expansion works better with the Python script. With that, some
> >   outputs that contain tabs at kernel-doc markups are now different;
> >
> > - The new script  works better stripping blank lines. So, there are a couple
> >   of empty new lines that are now stripped with this version;
> >
> > - There is a buggy logic at kernel-doc to strip empty description and
> >   return sections. I was not able to replicate the exact behavior. So, I ended
> >   adding an extra logic to strip empty sections with a different algorithm.
> >
> > Yet, on my tests, the results are compatible with the venerable script
> > output for all .. kernel-doc tags found in Documentation/. I double-checked
> > this by adding support to output the kernel-doc commands when V=1, and
> > then I ran a diff between kernel-doc.pl and kernel-doc.py for the same
> > command lines.
> >
> > The only patch that doesn't belong to this series is a patch dropping
> > kernel-doc.pl. I opted to keep it for now, as it can help to better
> > test the new tools.
> >
> > With such changes, if one wants to build docs with the old script,
> > all it is needed is to use KERNELDOC parameter, e.g.:
> >
> > 	$ make KERNELDOC=scripts/kernel-doc.pl htmldocs  
> 
> I guess that's good for double checking that the python version
> reproduces the output of the old version, warts and all. And it could be
> used standalone for comparing the output for .[ch] files directly
> instead of going through Sphinx.
> 
> But once we're reasonably sure the new one works fine, I think the
> natural follow-up will be to import the kernel-doc python module from
> the kernel-doc Sphinx extension instead of running it with
> subprocess.Popen(). It'll bypass an absolutely insane amount of forks,
> python interpreter launches and module imports.
> 
> It'll also open the door for passing the results in python native
> structures instead of text, also making it possible to cache parse
> results instead of parsing the source files for every kernel-doc
> directive in rst.

Yes, this is in my plans. I already have a patch series for that,
but it still needs some care to ensure that the results are
identical.

> Another idea regarding code organization, again for future. Maybe we
> should have a scripts/python/ directory structure, so we can point
> python path there, and be able to import stuff from there? And
> reasonably share code between modules. And have linters handle it
> recursively, etc, etc.

Sounds like a plan. I did some code reorg already, but surely there
is room for improvement.

> Anyway, I applaud the work, and I regret that I don't have time to
> review it in detail. Regardless, I think the matching output is the most
> important part.

I ran several tests here to check the output and make it match the
output of the Perl version.

> 
> 
> BR,
> Jani.
> 
> > ---
> >
> > v3:
> > - rebased on the top of v6.15-rc1;
> > - Removed patches that weren't touching kernel-doc and its Sphinx extension;
> > - The "Re" class was renamed to "KernRe"
> > - It contains one patch from Sean with an additional hunk for the
> >   python version.
> >
> > Mauro Carvalho Chehab (32):
> >   scripts/kernel-doc: rename it to scripts/kernel-doc.pl
> >   scripts/kernel-doc: add a symlink to the Perl version of kernel-doc
> >   scripts/kernel-doc.py: add a Python parser
> >   scripts/kernel-doc.py: output warnings the same way as kerneldoc
> >   scripts/kernel-doc.py: better handle empty sections
> >   scripts/kernel-doc.py: properly handle struct_group macros
> >   scripts/kernel-doc.py: move regex methods to a separate file
> >   scripts/kernel-doc.py: move KernelDoc class to a separate file
> >   scripts/kernel-doc.py: move KernelFiles class to a separate file
> >   scripts/kernel-doc.py: move output classes to a separate file
> >   scripts/kernel-doc.py: convert message output to an interactor
> >   scripts/kernel-doc.py: move file lists to the parser function
> >   scripts/kernel-doc.py: implement support for -no-doc-sections
> >   scripts/kernel-doc.py: fix line number output
> >   scripts/kernel-doc.py: fix handling of doc output check
> >   scripts/kernel-doc.py: properly handle out_section for ReST
> >   scripts/kernel-doc.py: postpone warnings to the output plugin
> >   docs: add a .pylintrc file with sys path for docs scripts
> >   docs: sphinx: kerneldoc: verbose kernel-doc command if V=1
> >   docs: sphinx: kerneldoc: ignore "\" characters from options
> >   docs: sphinx: kerneldoc: use kernel-doc.py script
> >   scripts/kernel-doc.py: Set an output format for --none
> >   scripts/kernel-doc.py: adjust some coding style issues
> >   scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13
> >   scripts/kernel-doc.py: move modulename to man class
> >   scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP
> >   scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency
> >   scripts/kernel-doc.py: Properly handle Werror and exit codes
> >   scripts/kernel-doc: switch to use kernel-doc.py
> >   scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname
> >   scripts/kernel_doc.py: better handle exported symbols
> >   scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe
> >
> > Sean Anderson (1):
> >   scripts: kernel-doc: fix parsing function-like typedefs (again)
> >
> >  .pylintrc                         |    2 +
> >  Documentation/Makefile            |    2 +-
> >  Documentation/conf.py             |    2 +-
> >  Documentation/sphinx/kerneldoc.py |   46 +
> >  scripts/kernel-doc                | 2440 +----------------------------
> >  scripts/kernel-doc.pl             | 2439 ++++++++++++++++++++++++++++
> >  scripts/kernel-doc.py             |  315 ++++
> >  scripts/lib/kdoc/kdoc_files.py    |  282 ++++
> >  scripts/lib/kdoc/kdoc_output.py   |  793 ++++++++++
> >  scripts/lib/kdoc/kdoc_parser.py   | 1715 ++++++++++++++++++++
> >  scripts/lib/kdoc/kdoc_re.py       |  273 ++++
> >  11 files changed, 5868 insertions(+), 2441 deletions(-)
> >  create mode 100644 .pylintrc
> >  mode change 100755 => 120000 scripts/kernel-doc
> >  create mode 100755 scripts/kernel-doc.pl
> >  create mode 100755 scripts/kernel-doc.py
> >  create mode 100644 scripts/lib/kdoc/kdoc_files.py
> >  create mode 100755 scripts/lib/kdoc/kdoc_output.py
> >  create mode 100755 scripts/lib/kdoc/kdoc_parser.py
> >  create mode 100755 scripts/lib/kdoc/kdoc_re.py  
> 


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
                   ` (34 preceding siblings ...)
  2025-04-09 10:16 ` Jani Nikula
@ 2025-04-09 18:30 ` Jonathan Corbet
  2025-04-14  9:41   ` Andy Shevchenko
  35 siblings, 1 reply; 56+ messages in thread
From: Jonathan Corbet @ 2025-04-09 18:30 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Linux Doc Mailing List
  Cc: Mauro Carvalho Chehab, linux-kernel, Gustavo A. R. Silva,
	Kees Cook, Russell King, linux-hardening, netdev

Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:

> This changeset contains the kernel-doc.py script to replace the verable
> kernel-doc originally written in Perl. It replaces the first version and the
> second series I sent on the top of it.

OK, I've applied it, looked at the (minimal) changes in output, and
concluded that it's good - all this stuff is now in docs-next.  Many
thanks for doing this!

I'm going to hold off on other documentation patches for a day or two
just in case anything turns up.  But it looks awfully good.

Thanks,

jon


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-09 18:30 ` Jonathan Corbet
@ 2025-04-14  9:41   ` Andy Shevchenko
  2025-04-14 15:17     ` Jonathan Corbet
  0 siblings, 1 reply; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-14  9:41 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> 
> > This changeset contains the kernel-doc.py script to replace the verable
> > kernel-doc originally written in Perl. It replaces the first version and the
> > second series I sent on the top of it.
> 
> OK, I've applied it, looked at the (minimal) changes in output, and
> concluded that it's good - all this stuff is now in docs-next.  Many
> thanks for doing this!
> 
> I'm going to hold off on other documentation patches for a day or two
> just in case anything turns up.  But it looks awfully good.

This started well, until it became scripts/lib/kdoc.
Now it makes `make O=...` builds dirty *). Please make sure this doesn't leave
a "disgusting turd" (as Linus put it) in the clean tree.

*) It creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-14  9:41   ` Andy Shevchenko
@ 2025-04-14 15:17     ` Jonathan Corbet
  2025-04-14 15:54       ` Jonathan Corbet
  2025-04-15  7:01       ` Andy Shevchenko
  0 siblings, 2 replies; 56+ messages in thread
From: Jonathan Corbet @ 2025-04-14 15:17 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Mauro Carvalho Chehab, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

Andy Shevchenko <andriy.shevchenko@intel.com> writes:

> On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
>> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
>> 
>> > This changeset contains the kernel-doc.py script to replace the verable
>> > kernel-doc originally written in Perl. It replaces the first version and the
>> > second series I sent on the top of it.
>> 
>> OK, I've applied it, looked at the (minimal) changes in output, and
>> concluded that it's good - all this stuff is now in docs-next.  Many
>> thanks for doing this!
>> 
>> I'm going to hold off on other documentation patches for a day or two
>> just in case anything turns up.  But it looks awfully good.
>
> This started well, until it becomes a scripts/lib/kdoc.
> So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> "disgusting turd" )as said by Linus) in the clean tree.
>
> *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.

If nothing else, "make cleandocs" should clean it up, certainly.

We can also tell CPython to not create that directory at all.  I'll run
some tests to see what the effect is on the documentation build times;
I'm guessing it will not be huge...
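
For reference, a minimal sketch of one way to do that from inside a
script: setting sys.dont_write_bytecode before the imports happen (the
same effect as "python3 -B" or PYTHONDONTWRITEBYTECODE=1). The helper
import_without_pycache() and the throwaway module name are made up for
this demonstration:

```python
import importlib
import os
import sys
import tempfile


def import_without_pycache(module_name, source):
    """Import a throwaway module; report whether __pycache__ appeared."""
    # Must be set before the import: CPython checks it at import time.
    sys.dont_write_bytecode = True
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as src:
            src.write(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module(module_name)
        finally:
            sys.path.remove(tmp)
        return os.path.isdir(os.path.join(tmp, "__pycache__"))


print(import_without_pycache("kdoc_demo_module", "VALUE = 1\n"))
```

The cost is that the modules get recompiled on every run, which is the
build-time effect to be measured.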

Thanks,

jon


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-14 15:17     ` Jonathan Corbet
@ 2025-04-14 15:54       ` Jonathan Corbet
  2025-04-15  7:01       ` Andy Shevchenko
  1 sibling, 0 replies; 56+ messages in thread
From: Jonathan Corbet @ 2025-04-14 15:54 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Mauro Carvalho Chehab, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

> Andy Shevchenko <andriy.shevchenko@intel.com> writes:
>
>> On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
>>> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
>>> 
>>> > This changeset contains the kernel-doc.py script to replace the verable
>>> > kernel-doc originally written in Perl. It replaces the first version and the
>>> > second series I sent on the top of it.
>>> 
>>> OK, I've applied it, looked at the (minimal) changes in output, and
>>> concluded that it's good - all this stuff is now in docs-next.  Many
>>> thanks for doing this!
>>> 
>>> I'm going to hold off on other documentation patches for a day or two
>>> just in case anything turns up.  But it looks awfully good.
>>
>> This started well, until it becomes a scripts/lib/kdoc.
>> So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
>> "disgusting turd" )as said by Linus) in the clean tree.
>>
>> *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.

Actually, I find myself unable to reproduce this; can you tell me how
you get Python to create that directory on your system?  Which version
of Python?

Thanks,

jon


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-14 15:17     ` Jonathan Corbet
  2025-04-14 15:54       ` Jonathan Corbet
@ 2025-04-15  7:01       ` Andy Shevchenko
  2025-04-15  7:03         ` Andy Shevchenko
  1 sibling, 1 reply; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15  7:01 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:
> Andy Shevchenko <andriy.shevchenko@intel.com> writes:
> > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
> >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> >> 
> >> > This changeset contains the kernel-doc.py script to replace the verable
> >> > kernel-doc originally written in Perl. It replaces the first version and the
> >> > second series I sent on the top of it.
> >> 
> >> OK, I've applied it, looked at the (minimal) changes in output, and
> >> concluded that it's good - all this stuff is now in docs-next.  Many
> >> thanks for doing this!
> >> 
> >> I'm going to hold off on other documentation patches for a day or two
> >> just in case anything turns up.  But it looks awfully good.
> >
> > This started well, until it becomes a scripts/lib/kdoc.
> > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > "disgusting turd" )as said by Linus) in the clean tree.
> >
> > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.
> 
> If nothing else, "make cleandocs" should clean it up, certainly.
> 
> We can also tell CPython to not create that directory at all.  I'll run
> some tests to see what the effect is on the documentation build times;
> I'm guessing it will not be huge...

I do not build documentation at all; it's just a regular code build that leaves
the tree dirty.

$ python3 --version
Python 3.13.2

It's a standard Debian testing distribution, with no customisation in the code.

To reproduce:
1) I have just done a fresh build to reduce the churn, so running make again does nothing;
2) The following shell session shows the issue:

$ git clean -xdf
$ git status --ignored
On branch ...
nothing to commit, working tree clean

$ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
make[1]: Entering directory '...'
  GEN     Makefile
  DESCEND objtool
  CALL    .../scripts/checksyscalls.sh
  INSTALL libsubcmd_headers
.pylintrc: warning: ignored by one of the .gitignore files
Kernel: arch/x86/boot/bzImage is ready  (#23)
make[1]: Leaving directory '...'

$ touch drivers/gpio/gpiolib-acpi.c

$ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
make[1]: Entering directory '...'
  GEN     Makefile
  DESCEND objtool
  CALL    .../scripts/checksyscalls.sh
  INSTALL libsubcmd_headers
...
  OBJCOPY arch/x86/boot/setup.bin
  BUILD   arch/x86/boot/bzImage
Kernel: arch/x86/boot/bzImage is ready  (#24)
make[1]: Leaving directory '...'

$ git status --ignored
On branch ...
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	scripts/lib/kdoc/__pycache__/

nothing added to commit but untracked files present (use "git add" to track)

It's 100% reproducible on my side. I am happy to test any patches to fix this.
It's a really annoying "feature" for `make O=...` builds. Also note that,
theoretically, the Git worktree may be located on read-only storage / media,
and this can induce subtle issues.
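
One possible direction, sketched here only as an assumption and not as
a tested fix: CPython 3.8 and later can redirect bytecode caches out of
the source tree via sys.pycache_prefix (or PYTHONPYCACHEPREFIX). The
helper pyc_location() and both paths below are illustrative:

```python
import importlib.util
import sys


def pyc_location(source_path, build_dir=None):
    """Return where CPython would cache bytecode for source_path.

    With build_dir=None the default in-tree __pycache__ is used;
    otherwise the cache is placed under build_dir instead.
    """
    saved = sys.pycache_prefix
    sys.pycache_prefix = build_dir
    try:
        return importlib.util.cache_from_source(source_path)
    finally:
        sys.pycache_prefix = saved


src = "/src/linux/scripts/lib/kdoc/kdoc_re.py"
print(pyc_location(src))                         # in-tree __pycache__
print(pyc_location(src, "/build/out/.pycache"))  # under the build dir
```

Whether the build should keep the cache under O= or disable it
entirely is a separate decision.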

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  7:01       ` Andy Shevchenko
@ 2025-04-15  7:03         ` Andy Shevchenko
  2025-04-15  7:49           ` Jani Nikula
  2025-04-15  8:30           ` Mauro Carvalho Chehab
  0 siblings, 2 replies; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15  7:03 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:
> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:
> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:
> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > >> 
> > >> > This changeset contains the kernel-doc.py script to replace the verable
> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > >> > second series I sent on the top of it.
> > >> 
> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > >> thanks for doing this!
> > >> 
> > >> I'm going to hold off on other documentation patches for a day or two
> > >> just in case anything turns up.  But it looks awfully good.
> > >
> > > This started well, until it becomes a scripts/lib/kdoc.
> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > "disgusting turd" )as said by Linus) in the clean tree.
> > >
> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.
> > 
> > If nothing else, "make cleandocs" should clean it up, certainly.
> > 
> > We can also tell CPython to not create that directory at all.  I'll run
> > some tests to see what the effect is on the documentation build times;
> > I'm guessing it will not be huge...
> 
> I do not build documentation at all, it's just a regular code build that leaves
> tree dirty.
> 
> $ python3 --version
> Python 3.13.2
> 
> It's standard Debian testing distribution, no customisation in the code.
> 
> To reproduce.
> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> 2) The following snippet in shell shows the issue
> 
> $ git clean -xdf
> $ git status --ignored
> On branch ...
> nothing to commit, working tree clean
> 
> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> make[1]: Entering directory '...'
>   GEN     Makefile
>   DESCEND objtool
>   CALL    .../scripts/checksyscalls.sh
>   INSTALL libsubcmd_headers
> .pylintrc: warning: ignored by one of the .gitignore files
> Kernel: arch/x86/boot/bzImage is ready  (#23)
> make[1]: Leaving directory '...'
> 
> $ touch drivers/gpio/gpiolib-acpi.c
> 
> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> make[1]: Entering directory '...'
>   GEN     Makefile
>   DESCEND objtool
>   CALL    .../scripts/checksyscalls.sh
>   INSTALL libsubcmd_headers
> ...
>   OBJCOPY arch/x86/boot/setup.bin
>   BUILD   arch/x86/boot/bzImage
> Kernel: arch/x86/boot/bzImage is ready  (#24)
> make[1]: Leaving directory '...'
> 
> $ git status --ignored
> On branch ...
> Untracked files:
>   (use "git add <file>..." to include in what will be committed)
> 	scripts/lib/kdoc/__pycache__/
> 
> nothing added to commit but untracked files present (use "git add" to track)

FWIW, I repeated this after removing the O=.../out folder completely, so it's
a fully clean build. Still the same issue.

And it appears at the very beginning of the build. You don't actually need to
wait for the kernel to be built.

> It's 100% reproducible on my side. I am happy to test any patches to fix this.
> It's really annoying "feature" for `make O=...` builds. Also note that
> theoretically the Git worktree may be located on read-only storage / media
> and this can induce subtle issues.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  7:03         ` Andy Shevchenko
@ 2025-04-15  7:49           ` Jani Nikula
  2025-04-15  8:17             ` Andy Shevchenko
  2025-04-15  8:30           ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 56+ messages in thread
From: Jani Nikula @ 2025-04-15  7:49 UTC (permalink / raw)
  To: Andy Shevchenko, Jonathan Corbet
  Cc: Mauro Carvalho Chehab, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:
> On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:
>> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:
>> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:
>> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
>> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
>> > >> 
>> > >> > This changeset contains the kernel-doc.py script to replace the verable
>> > >> > kernel-doc originally written in Perl. It replaces the first version and the
>> > >> > second series I sent on the top of it.
>> > >> 
>> > >> OK, I've applied it, looked at the (minimal) changes in output, and
>> > >> concluded that it's good - all this stuff is now in docs-next.  Many
>> > >> thanks for doing this!
>> > >> 
>> > >> I'm going to hold off on other documentation patches for a day or two
>> > >> just in case anything turns up.  But it looks awfully good.
>> > >
>> > > This started well, until it becomes a scripts/lib/kdoc.
>> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
>> > > "disgusting turd" )as said by Linus) in the clean tree.
>> > >
>> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.
>> > 
>> > If nothing else, "make cleandocs" should clean it up, certainly.
>> > 
>> > We can also tell CPython to not create that directory at all.  I'll run
>> > some tests to see what the effect is on the documentation build times;
>> > I'm guessing it will not be huge...
>> 
>> I do not build documentation at all, it's just a regular code build that leaves
>> tree dirty.
>> 
>> $ python3 --version
>> Python 3.13.2
>> 
>> It's standard Debian testing distribution, no customisation in the code.
>> 
>> To reproduce.
>> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
>> 2) The following snippet in shell shows the issue
>> 
>> $ git clean -xdf
>> $ git status --ignored
>> On branch ...
>> nothing to commit, working tree clean
>> 
>> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
>> make[1]: Entering directory '...'
>>   GEN     Makefile
>>   DESCEND objtool
>>   CALL    .../scripts/checksyscalls.sh
>>   INSTALL libsubcmd_headers
>> .pylintrc: warning: ignored by one of the .gitignore files
>> Kernel: arch/x86/boot/bzImage is ready  (#23)
>> make[1]: Leaving directory '...'
>> 
>> $ touch drivers/gpio/gpiolib-acpi.c
>> 
>> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
>> make[1]: Entering directory '...'
>>   GEN     Makefile
>>   DESCEND objtool
>>   CALL    .../scripts/checksyscalls.sh
>>   INSTALL libsubcmd_headers
>> ...
>>   OBJCOPY arch/x86/boot/setup.bin
>>   BUILD   arch/x86/boot/bzImage
>> Kernel: arch/x86/boot/bzImage is ready  (#24)
>> make[1]: Leaving directory '...'
>> 
>> $ git status --ignored
>> On branch ...
>> Untracked files:
>>   (use "git add <file>..." to include in what will be committed)
>> 	scripts/lib/kdoc/__pycache__/
>> 
>> nothing added to commit but untracked files present (use "git add" to track)
>
> FWIW, I repeated this with removing the O=.../out folder completely, so it's
> fully clean build. Still the same issue.
>
> And it appears at the very beginning of the build. You don't need to wait to
> have the kernel to be built actually.

kernel-doc gets run on source files for W=1 builds. See Makefile.build.

BR,
Jani.


>
>> It's 100% reproducible on my side. I am happy to test any patches to fix this.
>> It's really annoying "feature" for `make O=...` builds. Also note that
>> theoretically the Git worktree may be located on read-only storage / media
>> and this can induce subtle issues.

-- 
Jani Nikula, Intel


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  7:49           ` Jani Nikula
@ 2025-04-15  8:17             ` Andy Shevchenko
  2025-04-15  8:19               ` Andy Shevchenko
  0 siblings, 1 reply; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15  8:17 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Jonathan Corbet, Mauro Carvalho Chehab, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:
> On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:
> > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:
> >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:
> >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:
> >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
> >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> >> > >> 
> >> > >> > This changeset contains the kernel-doc.py script to replace the verable
> >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> >> > >> > second series I sent on the top of it.
> >> > >> 
> >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> >> > >> thanks for doing this!
> >> > >> 
> >> > >> I'm going to hold off on other documentation patches for a day or two
> >> > >> just in case anything turns up.  But it looks awfully good.
> >> > >
> >> > > This started well, until it becomes a scripts/lib/kdoc.
> >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> >> > > "disgusting turd" )as said by Linus) in the clean tree.
> >> > >
> >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.
> >> > 
> >> > If nothing else, "make cleandocs" should clean it up, certainly.
> >> > 
> >> > We can also tell CPython to not create that directory at all.  I'll run
> >> > some tests to see what the effect is on the documentation build times;
> >> > I'm guessing it will not be huge...
> >> 
> >> I do not build documentation at all, it's just a regular code build that leaves
> >> tree dirty.
> >> 
> >> $ python3 --version
> >> Python 3.13.2
> >> 
> >> It's standard Debian testing distribution, no customisation in the code.
> >> 
> >> To reproduce.
> >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> >> 2) The following snippet in shell shows the issue
> >> 
> >> $ git clean -xdf
> >> $ git status --ignored
> >> On branch ...
> >> nothing to commit, working tree clean
> >> 
> >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> >> make[1]: Entering directory '...'
> >>   GEN     Makefile
> >>   DESCEND objtool
> >>   CALL    .../scripts/checksyscalls.sh
> >>   INSTALL libsubcmd_headers
> >> .pylintrc: warning: ignored by one of the .gitignore files
> >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> >> make[1]: Leaving directory '...'
> >> 
> >> $ touch drivers/gpio/gpiolib-acpi.c
> >> 
> >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> >> make[1]: Entering directory '...'
> >>   GEN     Makefile
> >>   DESCEND objtool
> >>   CALL    .../scripts/checksyscalls.sh
> >>   INSTALL libsubcmd_headers
> >> ...
> >>   OBJCOPY arch/x86/boot/setup.bin
> >>   BUILD   arch/x86/boot/bzImage
> >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> >> make[1]: Leaving directory '...'
> >> 
> >> $ git status --ignored
> >> On branch ...
> >> Untracked files:
> >>   (use "git add <file>..." to include in what will be committed)
> >> 	scripts/lib/kdoc/__pycache__/
> >> 
> >> nothing added to commit but untracked files present (use "git add" to track)
> >
> > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > fully clean build. Still the same issue.
> >
> > And it appears at the very beginning of the build. You don't need to wait to
> > have the kernel to be built actually.
> 
> kernel-doc gets run on source files for W=1 builds. See Makefile.build.

Thanks for the clarification, so we know that it runs and we know that it has
an issue.

> >> It's 100% reproducible on my side. I am happy to test any patches to fix this.
> >> It's really annoying "feature" for `make O=...` builds. Also note that
> >> theoretically the Git worktree may be located on read-only storage / media
> >> and this can induce subtle issues.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  8:17             ` Andy Shevchenko
@ 2025-04-15  8:19               ` Andy Shevchenko
  2025-04-15  8:40                 ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15  8:19 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Jonathan Corbet, Mauro Carvalho Chehab, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, Apr 15, 2025 at 11:17:12AM +0300, Andy Shevchenko wrote:
> On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:
> > On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:
> > > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:
> > >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:
> > >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:
> > >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:
> > >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > >> > >> 
> > >> > >> > This changeset contains the kernel-doc.py script to replace the verable
> > >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > >> > >> > second series I sent on the top of it.
> > >> > >> 
> > >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > >> > >> thanks for doing this!
> > >> > >> 
> > >> > >> I'm going to hold off on other documentation patches for a day or two
> > >> > >> just in case anything turns up.  But it looks awfully good.
> > >> > >
> > >> > > This started well, until it becomes a scripts/lib/kdoc.
> > >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > >> > > "disgusting turd" )as said by Linus) in the clean tree.
> > >> > >
> > >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.
> > >> > 
> > >> > If nothing else, "make cleandocs" should clean it up, certainly.
> > >> > 
> > >> > We can also tell CPython to not create that directory at all.  I'll run
> > >> > some tests to see what the effect is on the documentation build times;
> > >> > I'm guessing it will not be huge...
> > >> 
> > >> I do not build documentation at all, it's just a regular code build that leaves
> > >> tree dirty.
> > >> 
> > >> $ python3 --version
> > >> Python 3.13.2
> > >> 
> > >> It's standard Debian testing distribution, no customisation in the code.
> > >> 
> > >> To reproduce.
> > >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > >> 2) The following snippet in shell shows the issue
> > >> 
> > >> $ git clean -xdf
> > >> $ git status --ignored
> > >> On branch ...
> > >> nothing to commit, working tree clean
> > >> 
> > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > >> make[1]: Entering directory '...'
> > >>   GEN     Makefile
> > >>   DESCEND objtool
> > >>   CALL    .../scripts/checksyscalls.sh
> > >>   INSTALL libsubcmd_headers
> > >> .pylintrc: warning: ignored by one of the .gitignore files
> > >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> > >> make[1]: Leaving directory '...'
> > >> 
> > >> $ touch drivers/gpio/gpiolib-acpi.c
> > >> 
> > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > >> make[1]: Entering directory '...'
> > >>   GEN     Makefile
> > >>   DESCEND objtool
> > >>   CALL    .../scripts/checksyscalls.sh
> > >>   INSTALL libsubcmd_headers
> > >> ...
> > >>   OBJCOPY arch/x86/boot/setup.bin
> > >>   BUILD   arch/x86/boot/bzImage
> > >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> > >> make[1]: Leaving directory '...'
> > >> 
> > >> $ git status --ignored
> > >> On branch ...
> > >> Untracked files:
> > >>   (use "git add <file>..." to include in what will be committed)
> > >> 	scripts/lib/kdoc/__pycache__/
> > >> 
> > >> nothing added to commit but untracked files present (use "git add" to track)
> > >
> > > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > > fully clean build. Still the same issue.
> > >
> > > And it appears at the very beginning of the build. You don't need to wait to
> > > have the kernel to be built actually.
> > 
> > kernel-doc gets run on source files for W=1 builds. See Makefile.build.
> 
> Thanks for the clarification, so we know that it runs and we know that it has
> an issue.

The ideal solution I would expect is that the cache folder respects
the given O=... argument, or is disabled altogether (but I don't think the
latter is what we want, as it may slow down the build).


> > >> It's 100% reproducible on my side. I am happy to test any patches to fix this.
> > >> It's really annoying "feature" for `make O=...` builds. Also note that
> > >> theoretically the Git worktree may be located on read-only storage / media
> > >> and this can induce subtle issues.

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  7:03         ` Andy Shevchenko
  2025-04-15  7:49           ` Jani Nikula
@ 2025-04-15  8:30           ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-15  8:30 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Jonathan Corbet, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

On Tue, 15 Apr 2025 10:03:47 +0300
Andy Shevchenko <andriy.shevchenko@intel.com> wrote:

> On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:
> > On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:  
> > > Andy Shevchenko <andriy.shevchenko@intel.com> writes:  
> > > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:  
> > > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > > >>   
> > > >> > This changeset contains the kernel-doc.py script to replace the verable
> > > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > > >> > second series I sent on the top of it.  
> > > >> 
> > > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > > >> thanks for doing this!
> > > >> 
> > > >> I'm going to hold off on other documentation patches for a day or two
> > > >> just in case anything turns up.  But it looks awfully good.  
> > > >
> > > > This started well, until it becomes a scripts/lib/kdoc.
> > > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > > "disgusting turd" )as said by Linus) in the clean tree.
> > > >
> > > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.  
> > > 
> > > If nothing else, "make cleandocs" should clean it up, certainly.

Not sure about that, as __pycache__ is completely managed by Python: it
will not only create it for scripts/lib, but also for all Python libraries,
including the Sphinx ones.

IMO, it makes more sense to ensure that __pycache__ is not created in the
source directory when O= is used, and to ignore it if it does get created.

Btw, the same problem should already happen with get_abi.py, as it also
uses "import" from scripts/lib. So, we need a more generic solution. See
below.

> > > 
> > > We can also tell CPython to not create that directory at all.  I'll run
> > > some tests to see what the effect is on the documentation build times;
> > > I'm guessing it will not be huge...  

I doubt it would have much impact for kernel-doc, but it can have some impact
for Sphinx, as preventing Python from storing bytecode would affect it too.

-

Andy, 

Could you please remove __pycache__ and set this env:

	PYTHONDONTWRITEBYTECODE=1

before building the Kernel? If this works, one alternative would be to 
set it when O= is used.
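
A minimal sketch of how the PYTHONDONTWRITEBYTECODE=1 test above could be
checked in isolation, outside the Kernel build. It assumes a throwaway module
(demo_lib.py, a hypothetical stand-in for a library under scripts/lib/) and
imports it in a child interpreter, once with the default environment and once
with the variable set:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def pycache_created(extra_env):
    """Import a throwaway module in a child interpreter and report
    whether a __pycache__ directory appeared next to its source."""
    with tempfile.TemporaryDirectory() as srcdir:
        # Hypothetical stand-in for a library under scripts/lib/.
        Path(srcdir, "demo_lib.py").write_text("VALUE = 1\n")

        env = dict(os.environ)
        # Start from a known state so the caller's settings don't interfere.
        env.pop("PYTHONDONTWRITEBYTECODE", None)
        env.pop("PYTHONPYCACHEPREFIX", None)
        env["PYTHONPATH"] = srcdir
        env.update(extra_env)

        subprocess.run([sys.executable, "-c", "import demo_lib"],
                       env=env, check=True)
        return Path(srcdir, "__pycache__").exists()


print("default environment:      ", pycache_created({}))
print("PYTHONDONTWRITEBYTECODE=1:", pycache_created({"PYTHONDONTWRITEBYTECODE": "1"}))
```

This only shows the interpreter's behaviour for a single import; whether the
variable reaches every kernel-doc invocation in the build is a separate
question that needs testing with a real `make O=...` run.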

> > I do not build documentation at all, it's just a regular code build that leaves
> > tree dirty.
> > 
> > $ python3 --version
> > Python 3.13.2
> > 
> > It's standard Debian testing distribution, no customisation in the code.
> > 
> > To reproduce.
> > 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > 2) The following snippet in shell shows the issue
> > 
> > $ git clean -xdf
> > $ git status --ignored
> > On branch ...
> > nothing to commit, working tree clean
> > 
> > $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > make[1]: Entering directory '...'
> >   GEN     Makefile
> >   DESCEND objtool
> >   CALL    .../scripts/checksyscalls.sh
> >   INSTALL libsubcmd_headers
> > .pylintrc: warning: ignored by one of the .gitignore files
> > Kernel: arch/x86/boot/bzImage is ready  (#23)
> > make[1]: Leaving directory '...'
> > 
> > $ touch drivers/gpio/gpiolib-acpi.c
> > 
> > $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > make[1]: Entering directory '...'
> >   GEN     Makefile
> >   DESCEND objtool
> >   CALL    .../scripts/checksyscalls.sh
> >   INSTALL libsubcmd_headers
> > ...
> >   OBJCOPY arch/x86/boot/setup.bin
> >   BUILD   arch/x86/boot/bzImage
> > Kernel: arch/x86/boot/bzImage is ready  (#24)
> > make[1]: Leaving directory '...'
> > 
> > $ git status --ignored
> > On branch ...
> > Untracked files:
> >   (use "git add <file>..." to include in what will be committed)
> > 	scripts/lib/kdoc/__pycache__/
> > 
> > nothing added to commit but untracked files present (use "git add" to track)  
> 
> FWIW, I repeated this with removing the O=.../out folder completely, so it's
> fully clean build. Still the same issue.
> 
> And it appears at the very beginning of the build. You don't need to wait to
> have the kernel to be built actually.

Depending on your .config, kernel-doc will be called even without building
documentation to check for some problems at kernel-doc tags.

> 
> > It's 100% reproducible on my side. I am happy to test any patches to fix this.
> > It's really annoying "feature" for `make O=...` builds. Also note that
> > theoretically the Git worktree may be located on read-only storage / media
> > and this can induce subtle issues.  

Python's bytecode compiler automatically creates __pycache__ whenever it
encounters an "import" and the *.pyc is older than the script (or doesn't
exist). This happens with external libraries, and also with the internal
ones, like the ones we now have at the Kernel.

I dunno what happens if the FS is read-only. I would expect that the
interpreter would just work as if bytecode creation is disabled.

That said, I have never played with __pycache__ myself.

Yet, I have some raw ideas about how to deal with that. This requires
more tests, though. I can see some possible solutions for that:

1. Assuming that PYTHONDONTWRITEBYTECODE=1 works, the build system could
   set it if O= is used. This would have some performance impact for both
   the documentation build and for Kernel compilation itself (because
   kernel-doc is called to check doc issues). I dunno how much it would
   impact, but this is probably the quickest solution to implement;

2. when O=<targetdir> is used, copy scripts/lib/*/*.py to the target
   directory and change kernel-doc.py to use <targetdir> for library search
   in that case. This way, the __pycache__ would be created in the
   <targetdir>. This might work as well with symlinks. The performance
   impact here would be minimal, but it will require an extra step for
   O= to copy data (or to create symlinks);

3. possibly there is a way to teach Python to place the __pycache__
   at <targetdir>, instead of <sourcedir>. If supported, this would
   be the cleanest solution.
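
For option 3, CPython 3.8 and later expose sys.pycache_prefix, which moves the
bytecode cache into a separate tree. The sketch below only computes where the
*.pyc would be stored, without importing anything. The source path and the
/build/out prefix are hypothetical examples, and POSIX-style paths are
assumed:

```python
import importlib.util
import sys

# Hypothetical locations, for illustration only.
SOURCE = "/src/linux/scripts/lib/kdoc/kdoc_parser.py"
PREFIX = "/build/out"


def cache_path_for(source, prefix=None):
    """Return where CPython would store the bytecode for `source`
    with sys.pycache_prefix temporarily set to `prefix`."""
    saved = sys.pycache_prefix
    sys.pycache_prefix = prefix
    try:
        return importlib.util.cache_from_source(source)
    finally:
        # Restore the interpreter-wide setting.
        sys.pycache_prefix = saved


print("default :", cache_path_for(SOURCE))
print("prefixed:", cache_path_for(SOURCE, PREFIX))
```

With a prefix set, the cache file is placed under the prefix in a directory
tree that mirrors the absolute source path, and no __pycache__ directory is
used at all.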

Regards,
Mauro


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  8:19               ` Andy Shevchenko
@ 2025-04-15  8:40                 ` Mauro Carvalho Chehab
  2025-04-15  8:51                   ` Mauro Carvalho Chehab
  2025-04-15  9:51                   ` Andy Shevchenko
  0 siblings, 2 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-15  8:40 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Jani Nikula, Jonathan Corbet, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, 15 Apr 2025 11:19:26 +0300
Andy Shevchenko <andriy.shevchenko@intel.com> wrote:

> On Tue, Apr 15, 2025 at 11:17:12AM +0300, Andy Shevchenko wrote:
> > On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:  
> > > On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:  
> > > > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:  
> > > >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:  
> > > >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:  
> > > >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:  
> > > >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > > >> > >>   
> > > >> > >> > This changeset contains the kernel-doc.py script to replace the verable
> > > >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > > >> > >> > second series I sent on the top of it.  
> > > >> > >> 
> > > >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > > >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > > >> > >> thanks for doing this!
> > > >> > >> 
> > > >> > >> I'm going to hold off on other documentation patches for a day or two
> > > >> > >> just in case anything turns up.  But it looks awfully good.  
> > > >> > >
> > > >> > > This started well, until it becomes a scripts/lib/kdoc.
> > > >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > >> > > "disgusting turd" )as said by Linus) in the clean tree.
> > > >> > >
> > > >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.  
> > > >> > 
> > > >> > If nothing else, "make cleandocs" should clean it up, certainly.
> > > >> > 
> > > >> > We can also tell CPython to not create that directory at all.  I'll run
> > > >> > some tests to see what the effect is on the documentation build times;
> > > >> > I'm guessing it will not be huge...  
> > > >> 
> > > >> I do not build documentation at all, it's just a regular code build that leaves
> > > >> tree dirty.
> > > >> 
> > > >> $ python3 --version
> > > >> Python 3.13.2
> > > >> 
> > > >> It's standard Debian testing distribution, no customisation in the code.
> > > >> 
> > > >> To reproduce.
> > > >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > > >> 2) The following snippet in shell shows the issue
> > > >> 
> > > >> $ git clean -xdf
> > > >> $ git status --ignored
> > > >> On branch ...
> > > >> nothing to commit, working tree clean
> > > >> 
> > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > >> make[1]: Entering directory '...'
> > > >>   GEN     Makefile
> > > >>   DESCEND objtool
> > > >>   CALL    .../scripts/checksyscalls.sh
> > > >>   INSTALL libsubcmd_headers
> > > >> .pylintrc: warning: ignored by one of the .gitignore files
> > > >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> > > >> make[1]: Leaving directory '...'
> > > >> 
> > > >> $ touch drivers/gpio/gpiolib-acpi.c
> > > >> 
> > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > >> make[1]: Entering directory '...'
> > > >>   GEN     Makefile
> > > >>   DESCEND objtool
> > > >>   CALL    .../scripts/checksyscalls.sh
> > > >>   INSTALL libsubcmd_headers
> > > >> ...
> > > >>   OBJCOPY arch/x86/boot/setup.bin
> > > >>   BUILD   arch/x86/boot/bzImage
> > > >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> > > >> make[1]: Leaving directory '...'
> > > >> 
> > > >> $ git status --ignored
> > > >> On branch ...
> > > >> Untracked files:
> > > >>   (use "git add <file>..." to include in what will be committed)
> > > >> 	scripts/lib/kdoc/__pycache__/
> > > >> 
> > > >> nothing added to commit but untracked files present (use "git add" to track)  
> > > >
> > > > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > > > fully clean build. Still the same issue.
> > > >
> > > > And it appears at the very beginning of the build. You don't need to wait to
> > > > have the kernel to be built actually.  
> > > 
> > > kernel-doc gets run on source files for W=1 builds. See Makefile.build.  
> > 
> > Thanks for the clarification, so we know that it runs and we know that it has
> > an issue.  
> 
> Ideal solution what would I expect is that the cache folder should respect
> the given O=... argument, or disabled at all (but I don't think the latter
> is what we want as it may slow down the build).

From:
	https://github.com/python/cpython/commit/b193fa996a746111252156f11fb14c12fd6267e6
and:
	https://peps.python.org/pep-3147/

It seems that Python 3.8 and above have a way to specify the cache
location, via PYTHONPYCACHEPREFIX env var, and via "-X pycache_prefix=path".

As the current minimal Python version is 3.9, we can safely use it.

So, maybe this would work:

	make O="../out" PYTHONPYCACHEPREFIX="../out"

or a variant of it:

	PYTHONPYCACHEPREFIX="../out" make O="../out" 

If this works, we can adjust the build system to set the PYTHONPYCACHEPREFIX
env var when O= is used.
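
A minimal sketch of checking the PYTHONPYCACHEPREFIX behaviour in isolation,
before touching the build system. It assumes two temporary directories
standing in for the source tree and the O= directory, plus a hypothetical
demo_lib.py module; none of these names come from the Kernel tree:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def import_with_prefix(srcdir, outdir):
    """Import demo_lib from srcdir in a child interpreter with the bytecode
    cache redirected to outdir; return (source tree dirty?, cached files)."""
    Path(srcdir, "demo_lib.py").write_text("VALUE = 1\n")

    env = dict(os.environ)
    env.pop("PYTHONDONTWRITEBYTECODE", None)
    env["PYTHONPATH"] = srcdir
    env["PYTHONPYCACHEPREFIX"] = outdir

    subprocess.run([sys.executable, "-c", "import demo_lib"],
                   env=env, check=True)

    dirty = Path(srcdir, "__pycache__").exists()
    cached = sorted(p.name for p in Path(outdir).rglob("demo_lib*.pyc"))
    return dirty, cached


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
    dirty, cached = import_with_prefix(src, out)
    print("source tree dirty:", dirty)
    print("bytecode under prefix:", cached)
```

The sketch passes an absolute path as the prefix. Python interprets a relative
prefix against the current working directory of the interpreter, so a value
like "../out" may resolve differently depending on where each script is
started.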

Regards,
Mauro


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  8:40                 ` Mauro Carvalho Chehab
@ 2025-04-15  8:51                   ` Mauro Carvalho Chehab
  2025-04-15  9:53                     ` Andy Shevchenko
  2025-04-15  9:51                   ` Andy Shevchenko
  1 sibling, 1 reply; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-15  8:51 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Jani Nikula, Jonathan Corbet, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, 15 Apr 2025 16:40:34 +0800
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> On Tue, 15 Apr 2025 11:19:26 +0300
> Andy Shevchenko <andriy.shevchenko@intel.com> wrote:
> 
> > On Tue, Apr 15, 2025 at 11:17:12AM +0300, Andy Shevchenko wrote:  
> > > On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:    
> > > > On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:    
> > > > > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:    
> > > > >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:    
> > > > >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:    
> > > > >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:    
> > > > >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > > > >> > >>     
> > > > >> > >> > This changeset contains the kernel-doc.py script to replace the verable
> > > > >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > > > >> > >> > second series I sent on the top of it.    
> > > > >> > >> 
> > > > >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > > > >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > > > >> > >> thanks for doing this!
> > > > >> > >> 
> > > > >> > >> I'm going to hold off on other documentation patches for a day or two
> > > > >> > >> just in case anything turns up.  But it looks awfully good.    
> > > > >> > >
> > > > >> > > This started well, until it becomes a scripts/lib/kdoc.
> > > > >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > > >> > > "disgusting turd" )as said by Linus) in the clean tree.
> > > > >> > >
> > > > >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.    
> > > > >> > 
> > > > >> > If nothing else, "make cleandocs" should clean it up, certainly.
> > > > >> > 
> > > > >> > We can also tell CPython to not create that directory at all.  I'll run
> > > > >> > some tests to see what the effect is on the documentation build times;
> > > > >> > I'm guessing it will not be huge...    
> > > > >> 
> > > > >> I do not build documentation at all, it's just a regular code build that leaves
> > > > >> tree dirty.
> > > > >> 
> > > > >> $ python3 --version
> > > > >> Python 3.13.2
> > > > >> 
> > > > >> It's standard Debian testing distribution, no customisation in the code.
> > > > >> 
> > > > >> To reproduce.
> > > > >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > > > >> 2) The following snippet in shell shows the issue
> > > > >> 
> > > > >> $ git clean -xdf
> > > > >> $ git status --ignored
> > > > >> On branch ...
> > > > >> nothing to commit, working tree clean
> > > > >> 
> > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > >> make[1]: Entering directory '...'
> > > > >>   GEN     Makefile
> > > > >>   DESCEND objtool
> > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > >>   INSTALL libsubcmd_headers
> > > > >> .pylintrc: warning: ignored by one of the .gitignore files
> > > > >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> > > > >> make[1]: Leaving directory '...'
> > > > >> 
> > > > >> $ touch drivers/gpio/gpiolib-acpi.c
> > > > >> 
> > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > >> make[1]: Entering directory '...'
> > > > >>   GEN     Makefile
> > > > >>   DESCEND objtool
> > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > >>   INSTALL libsubcmd_headers
> > > > >> ...
> > > > >>   OBJCOPY arch/x86/boot/setup.bin
> > > > >>   BUILD   arch/x86/boot/bzImage
> > > > >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> > > > >> make[1]: Leaving directory '...'
> > > > >> 
> > > > >> $ git status --ignored
> > > > >> On branch ...
> > > > >> Untracked files:
> > > > >>   (use "git add <file>..." to include in what will be committed)
> > > > >> 	scripts/lib/kdoc/__pycache__/
> > > > >> 
> > > > >> nothing added to commit but untracked files present (use "git add" to track)    
> > > > >
> > > > > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > > > > fully clean build. Still the same issue.
> > > > >
> > > > > And it appears at the very beginning of the build. You don't need to wait to
> > > > > have the kernel to be built actually.    
> > > > 
> > > > kernel-doc gets run on source files for W=1 builds. See Makefile.build.    
> > > 
> > > Thanks for the clarification, so we know that it runs and we know that it has
> > > an issue.    
> > 
> > Ideal solution what would I expect is that the cache folder should respect
> > the given O=... argument, or disabled at all (but I don't think the latter
> > is what we want as it may slow down the build).  
> 
> From:
> 	https://github.com/python/cpython/commit/b193fa996a746111252156f11fb14c12fd6267e6
> and:
> 	https://peps.python.org/pep-3147/
> 
> It sounds that Python 3.8 and above have a way to specify the cache
> location, via PYTHONPYCACHEPREFIX env var, and via "-X pycache_prefix=path".
> 
> As the current minimal Python version is 3.9, we can safely use it.
> 
> So, maybe this would work:
> 
> 	make O="../out" PYTHONPYCACHEPREFIX="../out"
> 
> or a variant of it:
> 
> 	PYTHONPYCACHEPREFIX="../out" make O="../out" 
> 
> If this works, we can adjust the building system to fill PYTHONPYCACHEPREFIX
> env var when O= is used.

That's interesting... Sphinx is already called with PYTHONDONTWRITEBYTECODE.
From Documentation/Makefile:

	quiet_cmd_sphinx = SPHINX  $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
	      cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media $2 && \
	        PYTHONDONTWRITEBYTECODE=1 \
	...

It seems that the issue happens only when W=1 is used and kernel-doc
is called outside Sphinx.

Anyway, IMHO, the best would be to change the above to:

	PYTHONPYCACHEPREFIX=$(abspath $(BUILDDIR))

And do the same for the other places where kernel-doc is called:

	include/drm/Makefile:           $(srctree)/scripts/kernel-doc -none $(if $(CONFIG_WERROR)$(CONFIG_DRM_WERROR),-Werror) $<; \
	scripts/Makefile.build:  cmd_checkdoc = $(srctree)/scripts/kernel-doc -none $(KDOCFLAGS) \
	scripts/find-unused-docs.sh:    str=$(scripts/kernel-doc -export "$file" 2>/dev/null)

Comments?

Regards,
Mauro


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  8:40                 ` Mauro Carvalho Chehab
  2025-04-15  8:51                   ` Mauro Carvalho Chehab
@ 2025-04-15  9:51                   ` Andy Shevchenko
  2025-04-15  9:54                     ` Andy Shevchenko
  1 sibling, 1 reply; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15  9:51 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Jani Nikula, Jonathan Corbet, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, Apr 15, 2025 at 04:40:34PM +0800, Mauro Carvalho Chehab wrote:
> On Tue, 15 Apr 2025 11:19:26 +0300
> Andy Shevchenko <andriy.shevchenko@intel.com> wrote:
> > On Tue, Apr 15, 2025 at 11:17:12AM +0300, Andy Shevchenko wrote:
> > > On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:  
> > > > On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:  
> > > > > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:  
> > > > >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:  
> > > > >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:  
> > > > >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:  
> > > > >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > > > >> > >>   
> > > > >> > >> > This changeset contains the kernel-doc.py script to replace the verable
> > > > >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > > > >> > >> > second series I sent on the top of it.  
> > > > >> > >> 
> > > > >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > > > >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > > > >> > >> thanks for doing this!
> > > > >> > >> 
> > > > >> > >> I'm going to hold off on other documentation patches for a day or two
> > > > >> > >> just in case anything turns up.  But it looks awfully good.  
> > > > >> > >
> > > > >> > > This started well, until it becomes a scripts/lib/kdoc.
> > > > >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > > >> > > "disgusting turd" )as said by Linus) in the clean tree.
> > > > >> > >
> > > > >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.  
> > > > >> > 
> > > > >> > If nothing else, "make cleandocs" should clean it up, certainly.
> > > > >> > 
> > > > >> > We can also tell CPython to not create that directory at all.  I'll run
> > > > >> > some tests to see what the effect is on the documentation build times;
> > > > >> > I'm guessing it will not be huge...  
> > > > >> 
> > > > >> I do not build documentation at all, it's just a regular code build that leaves
> > > > >> tree dirty.
> > > > >> 
> > > > >> $ python3 --version
> > > > >> Python 3.13.2
> > > > >> 
> > > > >> It's standard Debian testing distribution, no customisation in the code.
> > > > >> 
> > > > >> To reproduce.
> > > > >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > > > >> 2) The following snippet in shell shows the issue
> > > > >> 
> > > > >> $ git clean -xdf
> > > > >> $ git status --ignored
> > > > >> On branch ...
> > > > >> nothing to commit, working tree clean
> > > > >> 
> > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > >> make[1]: Entering directory '...'
> > > > >>   GEN     Makefile
> > > > >>   DESCEND objtool
> > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > >>   INSTALL libsubcmd_headers
> > > > >> .pylintrc: warning: ignored by one of the .gitignore files
> > > > >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> > > > >> make[1]: Leaving directory '...'
> > > > >> 
> > > > >> $ touch drivers/gpio/gpiolib-acpi.c
> > > > >> 
> > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > >> make[1]: Entering directory '...'
> > > > >>   GEN     Makefile
> > > > >>   DESCEND objtool
> > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > >>   INSTALL libsubcmd_headers
> > > > >> ...
> > > > >>   OBJCOPY arch/x86/boot/setup.bin
> > > > >>   BUILD   arch/x86/boot/bzImage
> > > > >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> > > > >> make[1]: Leaving directory '...'
> > > > >> 
> > > > >> $ git status --ignored
> > > > >> On branch ...
> > > > >> Untracked files:
> > > > >>   (use "git add <file>..." to include in what will be committed)
> > > > >> 	scripts/lib/kdoc/__pycache__/
> > > > >> 
> > > > >> nothing added to commit but untracked files present (use "git add" to track)  
> > > > >
> > > > > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > > > > fully clean build. Still the same issue.
> > > > >
> > > > > And it appears at the very beginning of the build. You don't need to wait to
> > > > > have the kernel to be built actually.  
> > > > 
> > > > kernel-doc gets run on source files for W=1 builds. See Makefile.build.  
> > > 
> > > Thanks for the clarification, so we know that it runs and we know that it has
> > > an issue.  
> > 
> > Ideal solution what would I expect is that the cache folder should respect
> > the given O=... argument, or disabled at all (but I don't think the latter
> > is what we want as it may slow down the build).
> 
> From:
> 	https://github.com/python/cpython/commit/b193fa996a746111252156f11fb14c12fd6267e6
> and:
> 	https://peps.python.org/pep-3147/
> 
> It sounds that Python 3.8 and above have a way to specify the cache
> location, via PYTHONPYCACHEPREFIX env var, and via "-X pycache_prefix=path".
> 
> As the current minimal Python version is 3.9, we can safely use it.
> 
> So, maybe this would work:
> 
> 	make O="../out" PYTHONPYCACHEPREFIX="../out"
> 
> or a variant of it:
> 
> 	PYTHONPYCACHEPREFIX="../out" make O="../out" 
> 
> If this works, we can adjust the building system to fill PYTHONPYCACHEPREFIX
> env var when O= is used.

It works. The problem is that it should be assigned automatically to the
respective folder, so when compiling kdoc, the cache should actually be

$O/scripts/lib/kdoc/__pycache__

and so on for _each_ piece of Python code.
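
To illustrate the layout question: as far as I understand CPython's
behaviour, a cache prefix mirrors the absolute source path below the prefix
and drops the __pycache__ component, which is not quite the layout above.
A minimal sketch, assuming Python 3.8 or newer on a POSIX system; the
/src/linux and /build/out paths and the helper name are made up for
illustration only:

```python
import importlib.util
import sys


def cache_path_with_prefix(source, prefix):
    """Return where CPython would cache 'source' with the given pycache prefix.

    'prefix' plays the role of PYTHONPYCACHEPREFIX; None means the default
    in-tree __pycache__ location.
    """
    saved = sys.pycache_prefix
    sys.pycache_prefix = prefix
    try:
        # cache_from_source() only computes a path; it writes nothing.
        return importlib.util.cache_from_source(source)
    finally:
        # Restore the interpreter-wide setting we temporarily changed.
        sys.pycache_prefix = saved


# Hypothetical locations, not taken from any real build.
src = "/src/linux/scripts/lib/kdoc/kdoc_parser.py"
print(cache_path_with_prefix(src, "/build/out"))
print(cache_path_with_prefix(src, None))
```

The first path printed is rooted at the prefix; the second is the default
location next to the source file.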

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  8:51                   ` Mauro Carvalho Chehab
@ 2025-04-15  9:53                     ` Andy Shevchenko
  0 siblings, 0 replies; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15  9:53 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Jani Nikula, Jonathan Corbet, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, Apr 15, 2025 at 04:51:02PM +0800, Mauro Carvalho Chehab wrote:
> Em Tue, 15 Apr 2025 16:40:34 +0800
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:
> > Em Tue, 15 Apr 2025 11:19:26 +0300
> > Andy Shevchenko <andriy.shevchenko@intel.com> escreveu:
> > > On Tue, Apr 15, 2025 at 11:17:12AM +0300, Andy Shevchenko wrote:  
> > > > On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:    
> > > > > On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:    
> > > > > > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:    
> > > > > >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:    
> > > > > >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:    
> > > > > >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:    
> > > > > >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > > > > >> > >>     
> > > > > >> > >> > This changeset contains the kernel-doc.py script to replace the venerable
> > > > > >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > > > > >> > >> > second series I sent on the top of it.    
> > > > > >> > >> 
> > > > > >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > > > > >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > > > > >> > >> thanks for doing this!
> > > > > >> > >> 
> > > > > >> > >> I'm going to hold off on other documentation patches for a day or two
> > > > > >> > >> just in case anything turns up.  But it looks awfully good.    
> > > > > >> > >
> > > > > >> > > This started well, until it becomes a scripts/lib/kdoc.
> > > > > >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > > > >> > > "disgusting turd" (as said by Linus) in the clean tree.
> > > > > >> > >
> > > > > >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.    
> > > > > >> > 
> > > > > >> > If nothing else, "make cleandocs" should clean it up, certainly.
> > > > > >> > 
> > > > > >> > We can also tell CPython to not create that directory at all.  I'll run
> > > > > >> > some tests to see what the effect is on the documentation build times;
> > > > > >> > I'm guessing it will not be huge...    
> > > > > >> 
> > > > > >> I do not build documentation at all, it's just a regular code build that leaves
> > > > > >> tree dirty.
> > > > > >> 
> > > > > >> $ python3 --version
> > > > > >> Python 3.13.2
> > > > > >> 
> > > > > >> It's standard Debian testing distribution, no customisation in the code.
> > > > > >> 
> > > > > >> To reproduce.
> > > > > >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > > > > >> 2) The following snippet in shell shows the issue
> > > > > >> 
> > > > > >> $ git clean -xdf
> > > > > >> $ git status --ignored
> > > > > >> On branch ...
> > > > > >> nothing to commit, working tree clean
> > > > > >> 
> > > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > > >> make[1]: Entering directory '...'
> > > > > >>   GEN     Makefile
> > > > > >>   DESCEND objtool
> > > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > > >>   INSTALL libsubcmd_headers
> > > > > >> .pylintrc: warning: ignored by one of the .gitignore files
> > > > > >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> > > > > >> make[1]: Leaving directory '...'
> > > > > >> 
> > > > > >> $ touch drivers/gpio/gpiolib-acpi.c
> > > > > >> 
> > > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > > >> make[1]: Entering directory '...'
> > > > > >>   GEN     Makefile
> > > > > >>   DESCEND objtool
> > > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > > >>   INSTALL libsubcmd_headers
> > > > > >> ...
> > > > > >>   OBJCOPY arch/x86/boot/setup.bin
> > > > > >>   BUILD   arch/x86/boot/bzImage
> > > > > >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> > > > > >> make[1]: Leaving directory '...'
> > > > > >> 
> > > > > >> $ git status --ignored
> > > > > >> On branch ...
> > > > > >> Untracked files:
> > > > > >>   (use "git add <file>..." to include in what will be committed)
> > > > > >> 	scripts/lib/kdoc/__pycache__/
> > > > > >> 
> > > > > >> nothing added to commit but untracked files present (use "git add" to track)    
> > > > > >
> > > > > > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > > > > > fully clean build. Still the same issue.
> > > > > >
> > > > > > And it appears at the very beginning of the build. You don't need to wait to
> > > > > > have the kernel to be built actually.    
> > > > > 
> > > > > kernel-doc gets run on source files for W=1 builds. See Makefile.build.    
> > > > 
> > > > Thanks for the clarification, so we know that it runs and we know that it has
> > > > an issue.    
> > > 
> > > Ideal solution what would I expect is that the cache folder should respect
> > > the given O=... argument, or disabled at all (but I don't think the latter
> > > is what we want as it may slow down the build).  
> > 
> > From:
> > 	https://github.com/python/cpython/commit/b193fa996a746111252156f11fb14c12fd6267e6
> > and:
> > 	https://peps.python.org/pep-3147/
> > 
> > It sounds that Python 3.8 and above have a way to specify the cache
> > location, via PYTHONPYCACHEPREFIX env var, and via "-X pycache_prefix=path".
> > 
> > As the current minimal Python version is 3.9, we can safely use it.
> > 
> > So, maybe this would work:
> > 
> > 	make O="../out" PYTHONPYCACHEPREFIX="../out"
> > 
> > or a variant of it:
> > 
> > 	PYTHONPYCACHEPREFIX="../out" make O="../out" 
> > 
> > If this works, we can adjust the building system to fill PYTHONPYCACHEPREFIX
> > env var when O= is used.
> 
> That's interesting... Sphinx is already called with PYTHONDONTWRITEBYTECODE.
> From Documentation/Makefile:
> 
> 	quiet_cmd_sphinx = SPHINX  $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
> 	      cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media $2 && \
> 	        PYTHONDONTWRITEBYTECODE=1 \
> 	...
> 
> It seems that the issue happens only when W=1 is used and kernel-doc
> is called outside Sphinx.
> 
> Anyway, IMHO, the best would be to change the above to:
> 
> 	PYTHONPYCACHEPREFIX=$(abspath $(BUILDDIR))
> 
> And do the same for the other places where kernel-doc is called:
> 
> 	include/drm/Makefile:           $(srctree)/scripts/kernel-doc -none $(if $(CONFIG_WERROR)$(CONFIG_DRM_WERROR),-Werror) $<; \
> 	scripts/Makefile.build:  cmd_checkdoc = $(srctree)/scripts/kernel-doc -none $(KDOCFLAGS) \
> 	scripts/find-unused-docs.sh:    str=$(scripts/kernel-doc -export "$file" 2>/dev/null)
> 
> Comments?

I would like that, but the path has to be formed properly: somewhere we drop
the path within the source tree, so if $O is used as is, it becomes _the_
pycache folder!

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  9:51                   ` Andy Shevchenko
@ 2025-04-15  9:54                     ` Andy Shevchenko
  2025-04-15 10:06                       ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15  9:54 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Jani Nikula, Jonathan Corbet, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, Apr 15, 2025 at 12:51:38PM +0300, Andy Shevchenko wrote:
> On Tue, Apr 15, 2025 at 04:40:34PM +0800, Mauro Carvalho Chehab wrote:
> > Em Tue, 15 Apr 2025 11:19:26 +0300
> > Andy Shevchenko <andriy.shevchenko@intel.com> escreveu:
> > > On Tue, Apr 15, 2025 at 11:17:12AM +0300, Andy Shevchenko wrote:
> > > > On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:  
> > > > > On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:  
> > > > > > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:  
> > > > > >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:  
> > > > > >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:  
> > > > > >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:  
> > > > > >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > > > > >> > >>   
> > > > > >> > >> > This changeset contains the kernel-doc.py script to replace the venerable
> > > > > >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > > > > >> > >> > second series I sent on the top of it.  
> > > > > >> > >> 
> > > > > >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > > > > >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > > > > >> > >> thanks for doing this!
> > > > > >> > >> 
> > > > > >> > >> I'm going to hold off on other documentation patches for a day or two
> > > > > >> > >> just in case anything turns up.  But it looks awfully good.  
> > > > > >> > >
> > > > > >> > > This started well, until it becomes a scripts/lib/kdoc.
> > > > > >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > > > >> > > "disgusting turd" (as said by Linus) in the clean tree.
> > > > > >> > >
> > > > > >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.  
> > > > > >> > 
> > > > > >> > If nothing else, "make cleandocs" should clean it up, certainly.
> > > > > >> > 
> > > > > >> > We can also tell CPython to not create that directory at all.  I'll run
> > > > > >> > some tests to see what the effect is on the documentation build times;
> > > > > >> > I'm guessing it will not be huge...  
> > > > > >> 
> > > > > >> I do not build documentation at all, it's just a regular code build that leaves
> > > > > >> tree dirty.
> > > > > >> 
> > > > > >> $ python3 --version
> > > > > >> Python 3.13.2
> > > > > >> 
> > > > > >> It's standard Debian testing distribution, no customisation in the code.
> > > > > >> 
> > > > > >> To reproduce.
> > > > > >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > > > > >> 2) The following snippet in shell shows the issue
> > > > > >> 
> > > > > >> $ git clean -xdf
> > > > > >> $ git status --ignored
> > > > > >> On branch ...
> > > > > >> nothing to commit, working tree clean
> > > > > >> 
> > > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > > >> make[1]: Entering directory '...'
> > > > > >>   GEN     Makefile
> > > > > >>   DESCEND objtool
> > > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > > >>   INSTALL libsubcmd_headers
> > > > > >> .pylintrc: warning: ignored by one of the .gitignore files
> > > > > >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> > > > > >> make[1]: Leaving directory '...'
> > > > > >> 
> > > > > >> $ touch drivers/gpio/gpiolib-acpi.c
> > > > > >> 
> > > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > > >> make[1]: Entering directory '...'
> > > > > >>   GEN     Makefile
> > > > > >>   DESCEND objtool
> > > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > > >>   INSTALL libsubcmd_headers
> > > > > >> ...
> > > > > >>   OBJCOPY arch/x86/boot/setup.bin
> > > > > >>   BUILD   arch/x86/boot/bzImage
> > > > > >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> > > > > >> make[1]: Leaving directory '...'
> > > > > >> 
> > > > > >> $ git status --ignored
> > > > > >> On branch ...
> > > > > >> Untracked files:
> > > > > >>   (use "git add <file>..." to include in what will be committed)
> > > > > >> 	scripts/lib/kdoc/__pycache__/
> > > > > >> 
> > > > > >> nothing added to commit but untracked files present (use "git add" to track)  
> > > > > >
> > > > > > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > > > > > fully clean build. Still the same issue.
> > > > > >
> > > > > > And it appears at the very beginning of the build. You don't need to wait to
> > > > > > have the kernel to be built actually.  
> > > > > 
> > > > > kernel-doc gets run on source files for W=1 builds. See Makefile.build.  
> > > > 
> > > > Thanks for the clarification, so we know that it runs and we know that it has
> > > > an issue.  
> > > 
> > > Ideal solution what would I expect is that the cache folder should respect
> > > the given O=... argument, or disabled at all (but I don't think the latter
> > > is what we want as it may slow down the build).
> > 
> > From:
> > 	https://github.com/python/cpython/commit/b193fa996a746111252156f11fb14c12fd6267e6
> > and:
> > 	https://peps.python.org/pep-3147/
> > 
> > It sounds that Python 3.8 and above have a way to specify the cache
> > location, via PYTHONPYCACHEPREFIX env var, and via "-X pycache_prefix=path".
> > 
> > As the current minimal Python version is 3.9, we can safely use it.
> > 
> > So, maybe this would work:
> > 
> > 	make O="../out" PYTHONPYCACHEPREFIX="../out"
> > 
> > or a variant of it:
> > 
> > 	PYTHONPYCACHEPREFIX="../out" make O="../out" 
> > 
> > If this works, we can adjust the building system to fill PYTHONPYCACHEPREFIX
> > env var when O= is used.
> 
> It works, the problem is that it should be automatically assigned to the
> respective folder, so when compiling kdoc, it should be actually
> 
> $O/scripts/lib/kdoc/__pycache__
> 
> and so on for _each_ of the python code.

So, bottom line: can we just disable it as a quick fix, and redo it when a
proper solution comes along?

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15  9:54                     ` Andy Shevchenko
@ 2025-04-15 10:06                       ` Mauro Carvalho Chehab
  2025-04-15 11:13                         ` Andy Shevchenko
  2025-04-15 13:34                         ` Jonathan Corbet
  0 siblings, 2 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-15 10:06 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Jani Nikula, Jonathan Corbet, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

Em Tue, 15 Apr 2025 12:54:12 +0300
Andy Shevchenko <andriy.shevchenko@intel.com> escreveu:

> On Tue, Apr 15, 2025 at 12:51:38PM +0300, Andy Shevchenko wrote:
> > On Tue, Apr 15, 2025 at 04:40:34PM +0800, Mauro Carvalho Chehab wrote:  
> > > Em Tue, 15 Apr 2025 11:19:26 +0300
> > > Andy Shevchenko <andriy.shevchenko@intel.com> escreveu:  
> > > > On Tue, Apr 15, 2025 at 11:17:12AM +0300, Andy Shevchenko wrote:  
> > > > > On Tue, Apr 15, 2025 at 10:49:29AM +0300, Jani Nikula wrote:    
> > > > > > On Tue, 15 Apr 2025, Andy Shevchenko <andriy.shevchenko@intel.com> wrote:    
> > > > > > > On Tue, Apr 15, 2025 at 10:01:04AM +0300, Andy Shevchenko wrote:    
> > > > > > >> On Mon, Apr 14, 2025 at 09:17:51AM -0600, Jonathan Corbet wrote:    
> > > > > > >> > Andy Shevchenko <andriy.shevchenko@intel.com> writes:    
> > > > > > >> > > On Wed, Apr 09, 2025 at 12:30:00PM -0600, Jonathan Corbet wrote:    
> > > > > > >> > >> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> > > > > > >> > >>     
> > > > > > >> > >> > This changeset contains the kernel-doc.py script to replace the venerable
> > > > > > >> > >> > kernel-doc originally written in Perl. It replaces the first version and the
> > > > > > >> > >> > second series I sent on the top of it.    
> > > > > > >> > >> 
> > > > > > >> > >> OK, I've applied it, looked at the (minimal) changes in output, and
> > > > > > >> > >> concluded that it's good - all this stuff is now in docs-next.  Many
> > > > > > >> > >> thanks for doing this!
> > > > > > >> > >> 
> > > > > > >> > >> I'm going to hold off on other documentation patches for a day or two
> > > > > > >> > >> just in case anything turns up.  But it looks awfully good.    
> > > > > > >> > >
> > > > > > >> > > This started well, until it becomes a scripts/lib/kdoc.
> > > > > > >> > > So, it makes the `make O=...` builds dirty *). Please make sure this doesn't leave
> > > > > > >> > > "disgusting turd" (as said by Linus) in the clean tree.
> > > > > > >> > >
> > > > > > >> > > *) it creates that __pycache__ disaster. And no, .gitignore IS NOT a solution.    
> > > > > > >> > 
> > > > > > >> > If nothing else, "make cleandocs" should clean it up, certainly.
> > > > > > >> > 
> > > > > > >> > We can also tell CPython to not create that directory at all.  I'll run
> > > > > > >> > some tests to see what the effect is on the documentation build times;
> > > > > > >> > I'm guessing it will not be huge...    
> > > > > > >> 
> > > > > > >> I do not build documentation at all, it's just a regular code build that leaves
> > > > > > >> tree dirty.
> > > > > > >> 
> > > > > > >> $ python3 --version
> > > > > > >> Python 3.13.2
> > > > > > >> 
> > > > > > >> It's standard Debian testing distribution, no customisation in the code.
> > > > > > >> 
> > > > > > >> To reproduce.
> > > > > > >> 1) I have just done a new build to reduce the churn, so, running make again does nothing;
> > > > > > >> 2) The following snippet in shell shows the issue
> > > > > > >> 
> > > > > > >> $ git clean -xdf
> > > > > > >> $ git status --ignored
> > > > > > >> On branch ...
> > > > > > >> nothing to commit, working tree clean
> > > > > > >> 
> > > > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > > > >> make[1]: Entering directory '...'
> > > > > > >>   GEN     Makefile
> > > > > > >>   DESCEND objtool
> > > > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > > > >>   INSTALL libsubcmd_headers
> > > > > > >> .pylintrc: warning: ignored by one of the .gitignore files
> > > > > > >> Kernel: arch/x86/boot/bzImage is ready  (#23)
> > > > > > >> make[1]: Leaving directory '...'
> > > > > > >> 
> > > > > > >> $ touch drivers/gpio/gpiolib-acpi.c
> > > > > > >> 
> > > > > > >> $ make LLVM=-19 O=.../out W=1 C=1 CF=-D__CHECK_ENDIAN__ -j64
> > > > > > >> make[1]: Entering directory '...'
> > > > > > >>   GEN     Makefile
> > > > > > >>   DESCEND objtool
> > > > > > >>   CALL    .../scripts/checksyscalls.sh
> > > > > > >>   INSTALL libsubcmd_headers
> > > > > > >> ...
> > > > > > >>   OBJCOPY arch/x86/boot/setup.bin
> > > > > > >>   BUILD   arch/x86/boot/bzImage
> > > > > > >> Kernel: arch/x86/boot/bzImage is ready  (#24)
> > > > > > >> make[1]: Leaving directory '...'
> > > > > > >> 
> > > > > > >> $ git status --ignored
> > > > > > >> On branch ...
> > > > > > >> Untracked files:
> > > > > > >>   (use "git add <file>..." to include in what will be committed)
> > > > > > >> 	scripts/lib/kdoc/__pycache__/
> > > > > > >> 
> > > > > > >> nothing added to commit but untracked files present (use "git add" to track)    
> > > > > > >
> > > > > > > FWIW, I repeated this with removing the O=.../out folder completely, so it's
> > > > > > > fully clean build. Still the same issue.
> > > > > > >
> > > > > > > And it appears at the very beginning of the build. You don't need to wait to
> > > > > > > have the kernel to be built actually.    
> > > > > > 
> > > > > > kernel-doc gets run on source files for W=1 builds. See Makefile.build.    
> > > > > 
> > > > > Thanks for the clarification, so we know that it runs and we know that it has
> > > > > an issue.    
> > > > 
> > > > Ideal solution what would I expect is that the cache folder should respect
> > > > the given O=... argument, or disabled at all (but I don't think the latter
> > > > is what we want as it may slow down the build).  
> > > 
> > > From:
> > > 	https://github.com/python/cpython/commit/b193fa996a746111252156f11fb14c12fd6267e6
> > > and:
> > > 	https://peps.python.org/pep-3147/
> > > 
> > > It sounds that Python 3.8 and above have a way to specify the cache
> > > location, via PYTHONPYCACHEPREFIX env var, and via "-X pycache_prefix=path".
> > > 
> > > As the current minimal Python version is 3.9, we can safely use it.
> > > 
> > > So, maybe this would work:
> > > 
> > > 	make O="../out" PYTHONPYCACHEPREFIX="../out"
> > > 
> > > or a variant of it:
> > > 
> > > 	PYTHONPYCACHEPREFIX="../out" make O="../out" 
> > > 
> > > If this works, we can adjust the building system to fill PYTHONPYCACHEPREFIX
> > > env var when O= is used.  
> > 
> > It works,

Good!

> > the problem is that it should be automatically assigned to the
> > respective folder, so when compiling kdoc, it should be actually
> > 
> > $O/scripts/lib/kdoc/__pycache__
> > 
> > and so on for _each_ of the python code.  

Yeah, agreed. We need to think about a more generic solution though,
as we may also get scripts/lib/abi/__pycache__ if one runs
get_abi.py, and, in the future, we may have more. Not sure how
hard or easy that would be, though.

> So, the bottom line, can we just disable it for a quick fix and when a proper
> solution comes, it will redo that?

Agreed, this sounds like the best approach.

I'll try to craft a patch during the week to add
PYTHONDONTWRITEBYTECODE=1 to the places where kernel-doc
is called.
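
For reference, the effect of that variable can be checked outside the
kernel tree. A minimal sketch, assuming a Python 3 interpreter; the module
name kdoc_demo and the helper function are hypothetical and exist only for
this demonstration:

```python
import os
import subprocess
import sys
import tempfile


def import_leaves_pycache(extra_env):
    """Import a throwaway module in a child interpreter and report whether
    a __pycache__ directory appeared next to it."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "kdoc_demo.py"), "w") as f:
            f.write("VALUE = 1\n")

        # Start from the current environment, minus settings that would
        # influence bytecode caching, then apply the caller's overrides.
        env = {k: v for k, v in os.environ.items()
               if k not in ("PYTHONDONTWRITEBYTECODE", "PYTHONPYCACHEPREFIX")}
        env["PYTHONPATH"] = tmp
        env.update(extra_env)

        subprocess.run([sys.executable, "-c", "import kdoc_demo"],
                       cwd=tmp, env=env, check=True)
        return os.path.isdir(os.path.join(tmp, "__pycache__"))


print(import_leaves_pycache({}))
print(import_leaves_pycache({"PYTHONDONTWRITEBYTECODE": "1"}))
```

On a stock CPython I would expect the first call to report True and the
second False.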

Regards,
Mauro



* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15 10:06                       ` Mauro Carvalho Chehab
@ 2025-04-15 11:13                         ` Andy Shevchenko
  2025-04-15 13:34                         ` Jonathan Corbet
  1 sibling, 0 replies; 56+ messages in thread
From: Andy Shevchenko @ 2025-04-15 11:13 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Jani Nikula, Jonathan Corbet, Linux Doc Mailing List,
	linux-kernel, Gustavo A. R. Silva, Kees Cook, Russell King,
	linux-hardening, netdev

On Tue, Apr 15, 2025 at 06:06:31PM +0800, Mauro Carvalho Chehab wrote:
> Em Tue, 15 Apr 2025 12:54:12 +0300
> Andy Shevchenko <andriy.shevchenko@intel.com> escreveu:

...

> I'll try to craft a patch along the week to add
> PYTHONDONTWRITEBYTECODE=1 to the places where kernel-doc
> is called.

Cc me and I'll be happy to test. Thank you!

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15 10:06                       ` Mauro Carvalho Chehab
  2025-04-15 11:13                         ` Andy Shevchenko
@ 2025-04-15 13:34                         ` Jonathan Corbet
  2025-04-16  6:44                           ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 56+ messages in thread
From: Jonathan Corbet @ 2025-04-15 13:34 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Andy Shevchenko
  Cc: Jani Nikula, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:

> I'll try to craft a patch along the week to add
> PYTHONDONTWRITEBYTECODE=1 to the places where kernel-doc
> is called.

This may really be all we need.  It will be interesting to do some
build-time tests; I don't really see this as making much of a
difference.

Thanks,

jon


* Re: [PATCH v3 00/33] Implement kernel-doc in Python
  2025-04-15 13:34                         ` Jonathan Corbet
@ 2025-04-16  6:44                           ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 56+ messages in thread
From: Mauro Carvalho Chehab @ 2025-04-16  6:44 UTC (permalink / raw)
  To: Jonathan Corbet, Andy Shevchenko
  Cc: Jani Nikula, Linux Doc Mailing List, linux-kernel,
	Gustavo A. R. Silva, Kees Cook, Russell King, linux-hardening,
	netdev

Em Tue, 15 Apr 2025 07:34:51 -0600
Jonathan Corbet <corbet@lwn.net> escreveu:

> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> 
> > I'll try to craft a patch along the week to add
> > PYTHONDONTWRITEBYTECODE=1 to the places where kernel-doc
> > is called.  
> 
> This may really be all we need.  It will be interesting to do some
> build-time tests; I don't really see this as making much of a
> difference.

Just sent a patch meant to fix it.

Andy,

Please test.

Regards,
Mauro


Thread overview: 56+ messages
2025-04-08 10:09 [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 01/33] scripts/kernel-doc: rename it to scripts/kernel-doc.pl Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 02/33] scripts/kernel-doc: add a symlink to the Perl version of kernel-doc Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 03/33] scripts/kernel-doc.py: add a Python parser Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 04/33] scripts/kernel-doc.py: output warnings the same way as kerneldoc Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 05/33] scripts/kernel-doc.py: better handle empty sections Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 06/33] scripts/kernel-doc.py: properly handle struct_group macros Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 07/33] scripts/kernel-doc.py: move regex methods to a separate file Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 08/33] scripts/kernel-doc.py: move KernelDoc class " Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 09/33] scripts/kernel-doc.py: move KernelFiles " Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 10/33] scripts/kernel-doc.py: move output classes " Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 11/33] scripts/kernel-doc.py: convert message output to an interactor Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 12/33] scripts/kernel-doc.py: move file lists to the parser function Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 13/33] scripts/kernel-doc.py: implement support for -no-doc-sections Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 14/33] scripts/kernel-doc.py: fix line number output Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 15/33] scripts/kernel-doc.py: fix handling of doc output check Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 16/33] scripts/kernel-doc.py: properly handle out_section for ReST Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 17/33] scripts/kernel-doc.py: postpone warnings to the output plugin Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 18/33] docs: add a .pylintrc file with sys path for docs scripts Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 19/33] docs: sphinx: kerneldoc: verbose kernel-doc command if V=1 Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 20/33] docs: sphinx: kerneldoc: ignore "\" characters from options Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 21/33] docs: sphinx: kerneldoc: use kernel-doc.py script Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 22/33] scripts/kernel-doc.py: Set an output format for --none Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 23/33] scripts/kernel-doc.py: adjust some coding style issues Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 24/33] scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13 Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 25/33] scripts/kernel-doc.py: move modulename to man class Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 26/33] scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 27/33] scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 28/33] scripts/kernel-doc.py: Properly handle Werror and exit codes Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 29/33] scripts/kernel-doc: switch to use kernel-doc.py Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 30/33] scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 31/33] scripts/kernel_doc.py: better handle exported symbols Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 32/33] scripts/kernel-doc.py: Rename the kernel doc Re class to KernRe Mauro Carvalho Chehab
2025-04-08 10:09 ` [PATCH v3 33/33] scripts: kernel-doc: fix parsing function-like typedefs (again) Mauro Carvalho Chehab
2025-04-09  5:29 ` [PATCH v3 00/33] Implement kernel-doc in Python Mauro Carvalho Chehab
2025-04-09 10:16 ` Jani Nikula
2025-04-09 11:44   ` Mauro Carvalho Chehab
2025-04-09 18:30 ` Jonathan Corbet
2025-04-14  9:41   ` Andy Shevchenko
2025-04-14 15:17     ` Jonathan Corbet
2025-04-14 15:54       ` Jonathan Corbet
2025-04-15  7:01       ` Andy Shevchenko
2025-04-15  7:03         ` Andy Shevchenko
2025-04-15  7:49           ` Jani Nikula
2025-04-15  8:17             ` Andy Shevchenko
2025-04-15  8:19               ` Andy Shevchenko
2025-04-15  8:40                 ` Mauro Carvalho Chehab
2025-04-15  8:51                   ` Mauro Carvalho Chehab
2025-04-15  9:53                     ` Andy Shevchenko
2025-04-15  9:51                   ` Andy Shevchenko
2025-04-15  9:54                     ` Andy Shevchenko
2025-04-15 10:06                       ` Mauro Carvalho Chehab
2025-04-15 11:13                         ` Andy Shevchenko
2025-04-15 13:34                         ` Jonathan Corbet
2025-04-16  6:44                           ` Mauro Carvalho Chehab
2025-04-15  8:30           ` Mauro Carvalho Chehab
