* [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one
@ 2025-08-14 17:13 Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 1/8] docs/sphinx/kerneldoc.py: Handle new LINENO syntax Peter Maydell
` (9 more replies)
0 siblings, 10 replies; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
Earlier this year, the Linux kernel's kernel-doc script was rewritten
from the old Perl version into a shiny and hopefully more maintainable
Python version. This commit series updates our copy of this script
to the latest kernel version. I have tested it by comparing the
generated HTML documentation and checking that there are no
unexpected changes.
Luckily we are carrying very few local modifications to the Perl
script, so this is fairly straightforward. The structure of the
patchset is:
* a minor update to the kerneldoc.py Sphinx extension so it
will work with both old and new kernel-doc script output
* a fix to a doc comment markup error that I noticed while comparing
the HTML output from the two versions of the script
* import the new Python script, unmodified from the kernel's version
(conveniently the kernel calls it kernel-doc.py, so it doesn't
clash with the existing script)
* make the changes to that library code that correspond to the
two local QEMU-specific changes we carry
* tell sphinx to use the Python version
* delete the Perl script (I have put a diff of our local mods
to the Perl script in the commit message of this commit, for
posterity)
The diffstat looks big, but almost all of it is "import the
kernel's new script that we trust and don't need to review in
detail" and "delete the old script".
My immediate motivation for doing this update is that I noticed
that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
is using a Perl that complains about a construct in the Perl script.
That prompted me to check whether the kernel folks had already fixed
it, and it turned out that they had, by rewriting the whole thing :-)
More generally, if we don't do this update, then we're effectively
going to drift down the same path we did with checkpatch.pl, where
we have our own version that diverges from the kernel's version
and we have to maintain it ourselves.
We should also update the Sphinx plugin itself (i.e.
docs/sphinx/kerneldoc.py), but because I did not need to do
that to update the main kernel-doc script, I have left that as
a separate todo item.
Testing
-------
I compared the HTML output of the old kernel-doc script with that of
the new one, using the following diff command, which mechanically
excludes a couple of "same minor change everywhere" diffs, and then
eyeballed the resulting ~150 lines of diff.
diff -w -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
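Roughly the same filtering can be sketched in Python for anyone who wants to script the comparison. This is only an approximation of the command above, not what was actually run: it drops every line matching one of the three ignore patterns before diffing (whereas diff -I only suppresses hunks whose changed lines all match), and it strips leading and trailing whitespace rather than ignoring all whitespace as -w does. The names IGNORED, filtered_lines and html_diff are made up for this illustration.

```python
import difflib
import re

# The three patterns passed to "diff -I" in the command above.
IGNORED = [re.compile(p) for p in (
    r'^<div class="kernelindent docutils container">$',
    r'^</div>$',
    r'^<p><strong>Definition</strong>',
)]


def filtered_lines(html):
    """Return the stripped lines of html, minus those matching IGNORED."""
    lines = (line.strip() for line in html.splitlines())
    return [l for l in lines if not any(p.search(l) for p in IGNORED)]


def html_diff(old_html, new_html):
    """Unified diff of two HTML strings after filtering (hypothetical helper)."""
    return list(difflib.unified_diff(filtered_lines(old_html),
                                     filtered_lines(new_html),
                                     fromfile="old", tofile="new",
                                     lineterm=""))
```

An empty result from html_diff() means the two pages differ only in the ignored ways; anything else still needs eyeballing.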
The HTML changes are:
(1) some paras now have ID tags, eg:
-<p><strong>Functions operating on arrays of bits</strong></p>
+<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
(2) Some extra named <div>s, eg:
+<div class="kernelindent docutils container">
<p><strong>Parameters</strong></p>
<dl class="simple">
<dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
@@ -144,12 +145,14 @@
<dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
</dd>
</dl>
+</div>
(3) The new version correctly parses the multi-line Return: block for
the memory_translate_iotlb() doc comment. You can see that the
old HTML here had dt/dd markup, and it mis-renders in the HTML at
https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
<p><strong>Return</strong></p>
-<dl class="simple">
-<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr. The MemoryRegion must not be
accessed after rcu_read_unlock.
+<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
+addr. The MemoryRegion must not be accessed after rcu_read_unlock.
On failure, return NULL, setting <strong>errp</strong> with error.</p>
-</dd>
-</dl>
+</div>
(4) "Definition" sections now get output with a trailing colon:
-<p><strong>Definition</strong></p>
+<div class="kernelindent docutils container">
+<p><strong>Definition</strong>:</p>
This seems like it might be a bug in kernel-doc since the Parameters,
Return, etc sections don't get the trailing colon. I don't think it's
important enough to worry about.
thanks
-- PMM
Peter Maydell (8):
docs/sphinx/kerneldoc.py: Handle new LINENO syntax
tests/qtest/libqtest.h: Remove stray space from doc comment
scripts: Import Python kerneldoc from Linux kernel
scripts/kernel-doc: strip QEMU_ from function definitions
scripts/kernel-doc: tweak for QEMU coding standards
scripts/kerneldoc: Switch to the Python kernel-doc script
scripts/kernel-doc: Delete the old Perl kernel-doc script
MAINTAINERS: Put kernel-doc under the "docs build machinery" section
MAINTAINERS | 2 +
docs/conf.py | 4 +-
docs/sphinx/kerneldoc.py | 7 +-
tests/qtest/libqtest.h | 2 +-
.editorconfig | 2 +-
scripts/kernel-doc | 2442 -------------------------------
scripts/kernel-doc.py | 325 ++++
scripts/lib/kdoc/kdoc_files.py | 291 ++++
scripts/lib/kdoc/kdoc_item.py | 42 +
scripts/lib/kdoc/kdoc_output.py | 749 ++++++++++
scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
scripts/lib/kdoc/kdoc_re.py | 270 ++++
12 files changed, 3355 insertions(+), 2451 deletions(-)
delete mode 100755 scripts/kernel-doc
create mode 100755 scripts/kernel-doc.py
create mode 100644 scripts/lib/kdoc/kdoc_files.py
create mode 100644 scripts/lib/kdoc/kdoc_item.py
create mode 100644 scripts/lib/kdoc/kdoc_output.py
create mode 100644 scripts/lib/kdoc/kdoc_parser.py
create mode 100644 scripts/lib/kdoc/kdoc_re.py
--
2.43.0
* [PATCH for-10.2 1/8] docs/sphinx/kerneldoc.py: Handle new LINENO syntax
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-15 9:49 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 2/8] tests/qtest/libqtest.h: Remove stray space from doc comment Peter Maydell
` (8 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
The new upstream kernel-doc that we plan to update to uses a different
syntax for the LINENO directives that the Sphinx extension parses:
instead of
#define LINENO 86
it has
.. LINENO 86
Update the kerneldoc.py extension to handle both syntaxes, so
that it will work with both the old and the new kernel-doc.
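As a quick sanity check of the combined pattern, here is a minimal sketch. The regex is the one from the patch below; the helper name parse_lineno is hypothetical and is not part of kerneldoc.py.

```python
import re

# Same pattern as in the patch: accept either ".. LINENO n" (new Python
# kernel-doc) or "#define LINENO n" (old Perl kernel-doc).
line_regex = re.compile(r"^(?:\.\.|#define) LINENO ([0-9]+)$")


def parse_lineno(line):
    """Return the line number from a LINENO directive, or None (illustrative)."""
    match = line_regex.search(line)
    return int(match.group(1)) if match else None
```

Both directive forms yield the same number, and any other line yields None, which is what lets the extension work with either script.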
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
docs/sphinx/kerneldoc.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/sphinx/kerneldoc.py b/docs/sphinx/kerneldoc.py
index 3aa972f2e89..30bb3431983 100644
--- a/docs/sphinx/kerneldoc.py
+++ b/docs/sphinx/kerneldoc.py
@@ -127,7 +127,7 @@ def run(self):
             result = ViewList()
             lineoffset = 0;
-            line_regex = re.compile("^#define LINENO ([0-9]+)$")
+            line_regex = re.compile(r"^(?:\.\.|#define) LINENO ([0-9]+)$")
             for line in lines:
                 match = line_regex.search(line)
                 if match:
--
2.43.0
* [PATCH for-10.2 2/8] tests/qtest/libqtest.h: Remove stray space from doc comment
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 1/8] docs/sphinx/kerneldoc.py: Handle new LINENO syntax Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-15 9:51 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel Peter Maydell
` (7 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
The doc comment for qtest_cb_for_every_machine has a stray
space at the start of its description, which makes kernel-doc
think that this line is part of the documentation of the
skip_old_versioned argument. The result is that the HTML
doesn't have a "Description" section and the text is instead
put in the wrong place.
Remove the stray space.
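To see why one space matters, consider the following toy model. It is a deliberately simplified illustration and not kernel-doc's actual parser: it assumes only that a line with extra leading whitespace after the " * " prefix is treated as a continuation of the parameter currently being described. The function name classify and the PARAM pattern are made up for this sketch.

```python
import re

# Matches "@name: text" once the comment prefix has been removed.
PARAM = re.compile(r"@(\w+):\s*(.*)")


def classify(comment_lines):
    """Toy split of doc-comment body lines into parameters and description."""
    params, description = {}, []
    current = None  # parameter whose description is still "open"
    for raw in comment_lines:
        # Drop the " *" prefix and at most one following space.
        text = re.sub(r"^\s*\* ?", "", raw)
        m = PARAM.match(text)
        if m:
            current = m.group(1)
            params[current] = m.group(2)
        elif not text.strip():
            continue  # blank comment line; the toy model keeps `current`
        elif text[0].isspace() and current is not None:
            # Extra indentation: treated as continuing the parameter text.
            params[current] += " " + text.strip()
        else:
            current = None
            description.append(text)
    return params, description
```

Under these assumptions, the line with the stray space is appended to the skip_old_versioned text and the description stays empty, while the corrected line becomes the description.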
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
tests/qtest/libqtest.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tests/qtest/libqtest.h b/tests/qtest/libqtest.h
index b3f2e7fbefd..fd27521a9c7 100644
--- a/tests/qtest/libqtest.h
+++ b/tests/qtest/libqtest.h
@@ -977,7 +977,7 @@ void qtest_qmp_fds_assert_success(QTestState *qts, int *fds, size_t nfds,
  * @cb: Pointer to the callback function
  * @skip_old_versioned: true if versioned old machine types should be skipped
  *
- *  Call a callback function for every name of all available machines.
+ * Call a callback function for every name of all available machines.
  */
 void qtest_cb_for_every_machine(void (*cb)(const char *machine),
                                 bool skip_old_versioned);
--
2.43.0
* [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 1/8] docs/sphinx/kerneldoc.py: Handle new LINENO syntax Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 2/8] tests/qtest/libqtest.h: Remove stray space from doc comment Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-15 10:00 ` Mauro Carvalho Chehab
2025-08-15 10:19 ` Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 4/8] scripts/kernel-doc: strip QEMU_ from function definitions Peter Maydell
` (6 subsequent siblings)
9 siblings, 2 replies; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
We last synced our copy of kerneldoc with Linux back in 2020. In the
interim, upstream has entirely rewritten the script in Python, and
the new Python version is split into a main script plus some
libraries in the kernel's scripts/lib/kdoc.
Import all these files. These are the versions as of kernel commit
0cc53520e68be, with no local changes.
We use the same lib/kdoc/ directory as the kernel does here, so we
can avoid having to edit the top-level script just to adjust a
pathname, even though it is probably not the naming we would have
picked if this was a purely QEMU script.
The Sphinx conf.py still points at the Perl version of the script,
so this Python code will not be invoked to build the docs yet.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
scripts/kernel-doc.py | 325 ++++++
scripts/lib/kdoc/kdoc_files.py | 291 ++++++
scripts/lib/kdoc/kdoc_item.py | 42 +
scripts/lib/kdoc/kdoc_output.py | 749 ++++++++++++++
scripts/lib/kdoc/kdoc_parser.py | 1669 +++++++++++++++++++++++++++++++
scripts/lib/kdoc/kdoc_re.py | 270 +++++
6 files changed, 3346 insertions(+)
create mode 100755 scripts/kernel-doc.py
create mode 100644 scripts/lib/kdoc/kdoc_files.py
create mode 100644 scripts/lib/kdoc/kdoc_item.py
create mode 100644 scripts/lib/kdoc/kdoc_output.py
create mode 100644 scripts/lib/kdoc/kdoc_parser.py
create mode 100644 scripts/lib/kdoc/kdoc_re.py
diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
new file mode 100755
index 00000000000..fc3d46ef519
--- /dev/null
+++ b/scripts/kernel-doc.py
@@ -0,0 +1,325 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=C0103,R0915
+#
+# Converted from the kernel-doc script originally written in Perl
+# under GPLv2, copyrighted since 1998 by the following authors:
+#
+# Aditya Srivastava <yashsri421@gmail.com>
+# Akira Yokosawa <akiyks@gmail.com>
+# Alexander A. Klimov <grandmaster@al2klimov.de>
+# Alexander Lobakin <aleksander.lobakin@intel.com>
+# André Almeida <andrealmeid@igalia.com>
+# Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+# Anna-Maria Behnsen <anna-maria@linutronix.de>
+# Armin Kuster <akuster@mvista.com>
+# Bart Van Assche <bart.vanassche@sandisk.com>
+# Ben Hutchings <ben@decadent.org.uk>
+# Borislav Petkov <bbpetkov@yahoo.de>
+# Chen-Yu Tsai <wenst@chromium.org>
+# Coco Li <lixiaoyan@google.com>
+# Conchúr Navid <conchur@web.de>
+# Daniel Santos <daniel.santos@pobox.com>
+# Danilo Cesar Lemes de Paula <danilo.cesar@collabora.co.uk>
+# Dan Luedtke <mail@danrl.de>
+# Donald Hunter <donald.hunter@gmail.com>
+# Gabriel Krisman Bertazi <krisman@collabora.co.uk>
+# Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+# Harvey Harrison <harvey.harrison@gmail.com>
+# Horia Geanta <horia.geanta@freescale.com>
+# Ilya Dryomov <idryomov@gmail.com>
+# Jakub Kicinski <kuba@kernel.org>
+# Jani Nikula <jani.nikula@intel.com>
+# Jason Baron <jbaron@redhat.com>
+# Jason Gunthorpe <jgg@nvidia.com>
+# Jérémy Bobbio <lunar@debian.org>
+# Johannes Berg <johannes.berg@intel.com>
+# Johannes Weiner <hannes@cmpxchg.org>
+# Jonathan Cameron <Jonathan.Cameron@huawei.com>
+# Jonathan Corbet <corbet@lwn.net>
+# Jonathan Neuschäfer <j.neuschaefer@gmx.net>
+# Kamil Rytarowski <n54@gmx.com>
+# Kees Cook <kees@kernel.org>
+# Laurent Pinchart <laurent.pinchart@ideasonboard.com>
+# Levin, Alexander (Sasha Levin) <alexander.levin@verizon.com>
+# Linus Torvalds <torvalds@linux-foundation.org>
+# Lucas De Marchi <lucas.demarchi@profusion.mobi>
+# Mark Rutland <mark.rutland@arm.com>
+# Markus Heiser <markus.heiser@darmarit.de>
+# Martin Waitz <tali@admingilde.org>
+# Masahiro Yamada <masahiroy@kernel.org>
+# Matthew Wilcox <willy@infradead.org>
+# Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+# Michal Wajdeczko <michal.wajdeczko@intel.com>
+# Michael Zucchi
+# Mike Rapoport <rppt@linux.ibm.com>
+# Niklas Söderlund <niklas.soderlund@corigine.com>
+# Nishanth Menon <nm@ti.com>
+# Paolo Bonzini <pbonzini@redhat.com>
+# Pavan Kumar Linga <pavan.kumar.linga@intel.com>
+# Pavel Pisa <pisa@cmp.felk.cvut.cz>
+# Peter Maydell <peter.maydell@linaro.org>
+# Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
+# Randy Dunlap <rdunlap@infradead.org>
+# Richard Kennedy <richard@rsk.demon.co.uk>
+# Rich Walker <rw@shadow.org.uk>
+# Rolf Eike Beer <eike-kernel@sf-tec.de>
+# Sakari Ailus <sakari.ailus@linux.intel.com>
+# Silvio Fricke <silvio.fricke@gmail.com>
+# Simon Huggins
+# Tim Waugh <twaugh@redhat.com>
+# Tomasz Warniełło <tomasz.warniello@gmail.com>
+# Utkarsh Tripathi <utripathi2002@gmail.com>
+# valdis.kletnieks@vt.edu <valdis.kletnieks@vt.edu>
+# Vegard Nossum <vegard.nossum@oracle.com>
+# Will Deacon <will.deacon@arm.com>
+# Yacine Belkadi <yacine.belkadi.1@gmail.com>
+# Yujie Liu <yujie.liu@intel.com>
+
+"""
+kernel_doc
+==========
+
+Print formatted kernel documentation to stdout
+
+Read C language source or header FILEs, extract embedded
+documentation comments, and print formatted documentation
+to standard output.
+
+The documentation comments are identified by the "/**"
+opening comment mark.
+
+See Documentation/doc-guide/kernel-doc.rst for the
+documentation comment syntax.
+"""
+
+import argparse
+import logging
+import os
+import sys
+
+# Import Python modules
+
+LIB_DIR = "lib/kdoc"
+SRC_DIR = os.path.dirname(os.path.realpath(__file__))
+
+sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
+
+from kdoc_files import KernelFiles # pylint: disable=C0413
+from kdoc_output import RestFormat, ManFormat # pylint: disable=C0413
+
+DESC = """
+Read C language source or header FILEs, extract embedded documentation comments,
+and print formatted documentation to standard output.
+
+The documentation comments are identified by the "/**" opening comment mark.
+
+See Documentation/doc-guide/kernel-doc.rst for the documentation comment syntax.
+"""
+
+EXPORT_FILE_DESC = """
+Specify an additional FILE in which to look for EXPORT_SYMBOL information.
+
+May be used multiple times.
+"""
+
+EXPORT_DESC = """
+Only output documentation for the symbols that have been
+exported using EXPORT_SYMBOL() and related macros in any input
+FILE or -export-file FILE.
+"""
+
+INTERNAL_DESC = """
+Only output documentation for the symbols that have NOT been
+exported using EXPORT_SYMBOL() and related macros in any input
+FILE or -export-file FILE.
+"""
+
+FUNCTION_DESC = """
+Only output documentation for the given function or DOC: section
+title. All other functions and DOC: sections are ignored.
+
+May be used multiple times.
+"""
+
+NOSYMBOL_DESC = """
+Exclude the specified symbol from the output documentation.
+
+May be used multiple times.
+"""
+
+FILES_DESC = """
+Header and C source files to be parsed.
+"""
+
+WARN_CONTENTS_BEFORE_SECTIONS_DESC = """
+Warns if there are contents before sections (deprecated).
+
+This option is kept just for backward-compatibility, but it does nothing,
+neither here nor at the original Perl script.
+"""
+
+
+class MsgFormatter(logging.Formatter):
+    """Helper class to format warnings on a similar way to kernel-doc.pl"""
+
+    def format(self, record):
+        record.levelname = record.levelname.capitalize()
+        return logging.Formatter.format(self, record)
+
+def main():
+    """Main program"""
+
+    parser = argparse.ArgumentParser(formatter_class=argparse.RawTextHelpFormatter,
+                                     description=DESC)
+
+    # Normal arguments
+
+    parser.add_argument("-v", "-verbose", "--verbose", action="store_true",
+                        help="Verbose output, more warnings and other information.")
+
+    parser.add_argument("-d", "-debug", "--debug", action="store_true",
+                        help="Enable debug messages")
+
+    parser.add_argument("-M", "-modulename", "--modulename",
+                        default="Kernel API",
+                        help="Allow setting a module name at the output.")
+
+    parser.add_argument("-l", "-enable-lineno", "--enable_lineno",
+                        action="store_true",
+                        help="Enable line number output (only in ReST mode)")
+
+    # Arguments to control the warning behavior
+
+    parser.add_argument("-Wreturn", "--wreturn", action="store_true",
+                        help="Warns about the lack of a return markup on functions.")
+
+    parser.add_argument("-Wshort-desc", "-Wshort-description", "--wshort-desc",
+                        action="store_true",
+                        help="Warns if initial short description is missing")
+
+    parser.add_argument("-Wcontents-before-sections",
+                        "--wcontents-before-sections", action="store_true",
+                        help=WARN_CONTENTS_BEFORE_SECTIONS_DESC)
+
+    parser.add_argument("-Wall", "--wall", action="store_true",
+                        help="Enable all types of warnings")
+
+    parser.add_argument("-Werror", "--werror", action="store_true",
+                        help="Treat warnings as errors.")
+
+    parser.add_argument("-export-file", "--export-file", action='append',
+                        help=EXPORT_FILE_DESC)
+
+    # Output format mutually-exclusive group
+
+    out_group = parser.add_argument_group("Output format selection (mutually exclusive)")
+
+    out_fmt = out_group.add_mutually_exclusive_group()
+
+    out_fmt.add_argument("-m", "-man", "--man", action="store_true",
+                         help="Output troff manual page format.")
+    out_fmt.add_argument("-r", "-rst", "--rst", action="store_true",
+                         help="Output reStructuredText format (default).")
+    out_fmt.add_argument("-N", "-none", "--none", action="store_true",
+                         help="Do not output documentation, only warnings.")
+
+    # Output selection mutually-exclusive group
+
+    sel_group = parser.add_argument_group("Output selection (mutually exclusive)")
+    sel_mut = sel_group.add_mutually_exclusive_group()
+
+    sel_mut.add_argument("-e", "-export", "--export", action='store_true',
+                         help=EXPORT_DESC)
+
+    sel_mut.add_argument("-i", "-internal", "--internal", action='store_true',
+                         help=INTERNAL_DESC)
+
+    sel_mut.add_argument("-s", "-function", "--symbol", action='append',
+                         help=FUNCTION_DESC)
+
+    # Those are valid for all 3 types of filter
+    parser.add_argument("-n", "-nosymbol", "--nosymbol", action='append',
+                        help=NOSYMBOL_DESC)
+
+    parser.add_argument("-D", "-no-doc-sections", "--no-doc-sections",
+                        action='store_true', help="Don't outputt DOC sections")
+
+    parser.add_argument("files", metavar="FILE",
+                        nargs="+", help=FILES_DESC)
+
+    args = parser.parse_args()
+
+    if args.wall:
+        args.wreturn = True
+        args.wshort_desc = True
+        args.wcontents_before_sections = True
+
+    logger = logging.getLogger()
+
+    if not args.debug:
+        logger.setLevel(logging.INFO)
+    else:
+        logger.setLevel(logging.DEBUG)
+
+    formatter = MsgFormatter('%(levelname)s: %(message)s')
+
+    handler = logging.StreamHandler()
+    handler.setFormatter(formatter)
+
+    logger.addHandler(handler)
+
+    python_ver = sys.version_info[:2]
+    if python_ver < (3,6):
+        logger.warning("Python 3.6 or later is required by kernel-doc")
+
+        # Return 0 here to avoid breaking compilation
+        sys.exit(0)
+
+    if python_ver < (3,7):
+        logger.warning("Python 3.7 or later is required for correct results")
+
+    if args.man:
+        out_style = ManFormat(modulename=args.modulename)
+    elif args.none:
+        out_style = None
+    else:
+        out_style = RestFormat()
+
+    kfiles = KernelFiles(verbose=args.verbose,
+                         out_style=out_style, werror=args.werror,
+                         wreturn=args.wreturn, wshort_desc=args.wshort_desc,
+                         wcontents_before_sections=args.wcontents_before_sections)
+
+    kfiles.parse(args.files, export_file=args.export_file)
+
+    for t in kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
+                        internal=args.internal, symbol=args.symbol,
+                        nosymbol=args.nosymbol, export_file=args.export_file,
+                        no_doc_sections=args.no_doc_sections):
+        msg = t[1]
+        if msg:
+            print(msg)
+
+    error_count = kfiles.errors
+    if not error_count:
+        sys.exit(0)
+
+    if args.werror:
+        print(f"{error_count} warnings as errors")
+        sys.exit(error_count)
+
+    if args.verbose:
+        print(f"{error_count} errors")
+
+    if args.none:
+        sys.exit(0)
+
+    sys.exit(error_count)
+
+
+# Call main method
+if __name__ == "__main__":
+    main()
diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
new file mode 100644
index 00000000000..9e09b45b02f
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_files.py
@@ -0,0 +1,291 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=R0903,R0913,R0914,R0917
+
+"""
+Parse lernel-doc tags on multiple kernel source files.
+"""
+
+import argparse
+import logging
+import os
+import re
+
+from kdoc_parser import KernelDoc
+from kdoc_output import OutputFormat
+
+
+class GlobSourceFiles:
+ """
+ Parse C source code file names and directories via an Interactor.
+ """
+
+ def __init__(self, srctree=None, valid_extensions=None):
+ """
+ Initialize valid extensions with a tuple.
+
+ If not defined, assume default C extensions (.c and .h)
+
+ It would be possible to use python's glob function, but it is
+ very slow, and it is not interactive. So, it would wait to read all
+ directories before actually do something.
+
+ So, let's use our own implementation.
+ """
+
+ if not valid_extensions:
+ self.extensions = (".c", ".h")
+ else:
+ self.extensions = valid_extensions
+
+ self.srctree = srctree
+
+ def _parse_dir(self, dirname):
+ """Internal function to parse files recursively"""
+
+ with os.scandir(dirname) as obj:
+ for entry in obj:
+ name = os.path.join(dirname, entry.name)
+
+ if entry.is_dir():
+ yield from self._parse_dir(name)
+
+ if not entry.is_file():
+ continue
+
+ basename = os.path.basename(name)
+
+ if not basename.endswith(self.extensions):
+ continue
+
+ yield name
+
+ def parse_files(self, file_list, file_not_found_cb):
+ """
+ Define an interator to parse all source files from file_list,
+ handling directories if any
+ """
+
+ if not file_list:
+ return
+
+ for fname in file_list:
+ if self.srctree:
+ f = os.path.join(self.srctree, fname)
+ else:
+ f = fname
+
+ if os.path.isdir(f):
+ yield from self._parse_dir(f)
+ elif os.path.isfile(f):
+ yield f
+ elif file_not_found_cb:
+ file_not_found_cb(fname)
+
+
+class KernelFiles():
+ """
+ Parse kernel-doc tags on multiple kernel source files.
+
+ There are two type of parsers defined here:
+ - self.parse_file(): parses both kernel-doc markups and
+ EXPORT_SYMBOL* macros;
+ - self.process_export_file(): parses only EXPORT_SYMBOL* macros.
+ """
+
+ def warning(self, msg):
+ """Ancillary routine to output a warning and increment error count"""
+
+ self.config.log.warning(msg)
+ self.errors += 1
+
+ def error(self, msg):
+ """Ancillary routine to output an error and increment error count"""
+
+ self.config.log.error(msg)
+ self.errors += 1
+
+ def parse_file(self, fname):
+ """
+ Parse a single Kernel source.
+ """
+
+ # Prevent parsing the same file twice if results are cached
+ if fname in self.files:
+ return
+
+ doc = KernelDoc(self.config, fname)
+ export_table, entries = doc.parse_kdoc()
+
+ self.export_table[fname] = export_table
+
+ self.files.add(fname)
+ self.export_files.add(fname) # parse_kdoc() already check exports
+
+ self.results[fname] = entries
+
+ def process_export_file(self, fname):
+ """
+ Parses EXPORT_SYMBOL* macros from a single Kernel source file.
+ """
+
+ # Prevent parsing the same file twice if results are cached
+ if fname in self.export_files:
+ return
+
+ doc = KernelDoc(self.config, fname)
+ export_table = doc.parse_export()
+
+ if not export_table:
+ self.error(f"Error: Cannot check EXPORT_SYMBOL* on {fname}")
+ export_table = set()
+
+ self.export_table[fname] = export_table
+ self.export_files.add(fname)
+
+ def file_not_found_cb(self, fname):
+ """
+ Callback to warn if a file was not found.
+ """
+
+ self.error(f"Cannot find file {fname}")
+
+ def __init__(self, verbose=False, out_style=None,
+ werror=False, wreturn=False, wshort_desc=False,
+ wcontents_before_sections=False,
+ logger=None):
+ """
+ Initialize startup variables and parse all files
+ """
+
+ if not verbose:
+ verbose = bool(os.environ.get("KBUILD_VERBOSE", 0))
+
+ if out_style is None:
+ out_style = OutputFormat()
+
+ if not werror:
+ kcflags = os.environ.get("KCFLAGS", None)
+ if kcflags:
+ match = re.search(r"(\s|^)-Werror(\s|$)/", kcflags)
+ if match:
+ werror = True
+
+ # reading this variable is for backwards compat just in case
+ # someone was calling it with the variable from outside the
+ # kernel's build system
+ kdoc_werror = os.environ.get("KDOC_WERROR", None)
+ if kdoc_werror:
+ werror = kdoc_werror
+
+ # Some variables are global to the parser logic as a whole as they are
+ # used to send control configuration to KernelDoc class. As such,
+ # those variables are read-only inside the KernelDoc.
+ self.config = argparse.Namespace
+
+ self.config.verbose = verbose
+ self.config.werror = werror
+ self.config.wreturn = wreturn
+ self.config.wshort_desc = wshort_desc
+ self.config.wcontents_before_sections = wcontents_before_sections
+
+ if not logger:
+ self.config.log = logging.getLogger("kernel-doc")
+ else:
+ self.config.log = logger
+
+ self.config.warning = self.warning
+
+ self.config.src_tree = os.environ.get("SRCTREE", None)
+
+ # Initialize variables that are internal to KernelFiles
+
+ self.out_style = out_style
+
+ self.errors = 0
+ self.results = {}
+
+ self.files = set()
+ self.export_files = set()
+ self.export_table = {}
+
+ def parse(self, file_list, export_file=None):
+ """
+ Parse all files
+ """
+
+ glob = GlobSourceFiles(srctree=self.config.src_tree)
+
+ for fname in glob.parse_files(file_list, self.file_not_found_cb):
+ self.parse_file(fname)
+
+ for fname in glob.parse_files(export_file, self.file_not_found_cb):
+ self.process_export_file(fname)
+
+ def out_msg(self, fname, name, arg):
+ """
+ Return output messages from a file name using the output style
+ filtering.
+
+ If output type was not handled by the syler, return None.
+ """
+
+ # NOTE: we can add rules here to filter out unwanted parts,
+ # although OutputFormat.msg already does that.
+
+ return self.out_style.msg(fname, name, arg)
+
+ def msg(self, enable_lineno=False, export=False, internal=False,
+ symbol=None, nosymbol=None, no_doc_sections=False,
+ filenames=None, export_file=None):
+ """
+ Interacts over the kernel-doc results and output messages,
+ returning kernel-doc markups on each interaction
+ """
+
+ self.out_style.set_config(self.config)
+
+ if not filenames:
+ filenames = sorted(self.results.keys())
+
+ glob = GlobSourceFiles(srctree=self.config.src_tree)
+
+ for fname in filenames:
+ function_table = set()
+
+ if internal or export:
+ if not export_file:
+ export_file = [fname]
+
+ for f in glob.parse_files(export_file, self.file_not_found_cb):
+ function_table |= self.export_table[f]
+
+ if symbol:
+ for s in symbol:
+ function_table.add(s)
+
+ self.out_style.set_filter(export, internal, symbol, nosymbol,
+ function_table, enable_lineno,
+ no_doc_sections)
+
+ msg = ""
+ if fname not in self.results:
+ self.config.log.warning("No kernel-doc for file %s", fname)
+ continue
+
+ for arg in self.results[fname]:
+ m = self.out_msg(fname, arg.name, arg)
+
+ if m is None:
+ ln = arg.get("ln", 0)
+ dtype = arg.get('type', "")
+
+ self.config.log.warning("%s:%d Can't handle %s",
+ fname, ln, dtype)
+ else:
+ msg += m
+
+ if msg:
+ yield fname, msg
diff --git a/scripts/lib/kdoc/kdoc_item.py b/scripts/lib/kdoc/kdoc_item.py
new file mode 100644
index 00000000000..b3b22576455
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_item.py
@@ -0,0 +1,42 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# A class that will, eventually, encapsulate all of the parsed data that we
+# then pass into the output modules.
+#
+
+class KdocItem:
+    def __init__(self, name, type, start_line, **other_stuff):
+        self.name = name
+        self.type = type
+        self.declaration_start_line = start_line
+        self.sections = {}
+        self.sections_start_lines = {}
+        self.parameterlist = []
+        self.parameterdesc_start_lines = []
+        self.parameterdescs = {}
+        self.parametertypes = {}
+        #
+        # Just save everything else into our own dict so that the output
+        # side can grab it directly as before.  As we move things into more
+        # structured data, this will, hopefully, fade away.
+        #
+        self.other_stuff = other_stuff
+
+    def get(self, key, default = None):
+        return self.other_stuff.get(key, default)
+
+    def __getitem__(self, key):
+        return self.get(key)
+
+    #
+    # Tracking of section and parameter information.
+    #
+    def set_sections(self, sections, start_lines):
+        self.sections = sections
+        self.section_start_lines = start_lines
+
+    def set_params(self, names, descs, types, starts):
+        self.parameterlist = names
+        self.parameterdescs = descs
+        self.parametertypes = types
+        self.parameterdesc_start_lines = starts
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
new file mode 100644
index 00000000000..ea8914537ba
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -0,0 +1,749 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=C0301,R0902,R0911,R0912,R0913,R0914,R0915,R0917
+
+"""
+Implement output filters to print kernel-doc documentation.
+
+The implementation uses a virtual base class (OutputFormat) which
+contains a dispatches to virtual methods, and some code to filter
+out output messages.
+
+The actual implementation is done on one separate class per each type
+of output. Currently, there are output classes for ReST and man/troff.
+"""
+
+import os
+import re
+from datetime import datetime
+
+from kdoc_parser import KernelDoc, type_param
+from kdoc_re import KernRe
+
+
+function_pointer = KernRe(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
+
+# match expressions used to find embedded type information
+type_constant = KernRe(r"\b``([^\`]+)``\b", cache=False)
+type_constant2 = KernRe(r"\%([-_*\w]+)", cache=False)
+type_func = KernRe(r"(\w+)\(\)", cache=False)
+type_param_ref = KernRe(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+
+# Special RST handling for func ptr params
+type_fp_param = KernRe(r"\@(\w+)\(\)", cache=False)
+
+# Special RST handling for structs with func ptr params
+type_fp_param2 = KernRe(r"\@(\w+->\S+)\(\)", cache=False)
+
+type_env = KernRe(r"(\$\w+)", cache=False)
+type_enum = KernRe(r"\&(enum\s*([_\w]+))", cache=False)
+type_struct = KernRe(r"\&(struct\s*([_\w]+))", cache=False)
+type_typedef = KernRe(r"\&(typedef\s*([_\w]+))", cache=False)
+type_union = KernRe(r"\&(union\s*([_\w]+))", cache=False)
+type_member = KernRe(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
+type_fallback = KernRe(r"\&([_\w]+)", cache=False)
+type_member_func = type_member + KernRe(r"\(\)", cache=False)
+
+
+class OutputFormat:
+ """
+ Base class for OutputFormat. If used as-is, it means that only
+ warnings will be displayed.
+ """
+
+ # output mode.
+ OUTPUT_ALL = 0 # output all symbols and doc sections
+ OUTPUT_INCLUDE = 1 # output only specified symbols
+ OUTPUT_EXPORTED = 2 # output exported symbols
+ OUTPUT_INTERNAL = 3 # output non-exported symbols
+
+ # Virtual member to be overridden in the inherited classes
+ highlights = []
+
+ def __init__(self):
+ """Declare internal vars and set mode to OUTPUT_ALL"""
+
+ self.out_mode = self.OUTPUT_ALL
+ self.enable_lineno = None
+ self.nosymbol = {}
+ self.symbol = None
+ self.function_table = None
+ self.config = None
+ self.no_doc_sections = False
+
+ self.data = ""
+
+ def set_config(self, config):
+ """
+ Set up global config variables used by both parser and output.
+ """
+
+ self.config = config
+
+ def set_filter(self, export, internal, symbol, nosymbol, function_table,
+ enable_lineno, no_doc_sections):
+ """
+ Initialize filter variables according to the requested mode.
+
+ Only one choice is valid between export, internal and symbol.
+
+ The nosymbol filter can be used on all modes.
+ """
+
+ self.enable_lineno = enable_lineno
+ self.no_doc_sections = no_doc_sections
+ self.function_table = function_table
+
+ if symbol:
+ self.out_mode = self.OUTPUT_INCLUDE
+ elif export:
+ self.out_mode = self.OUTPUT_EXPORTED
+ elif internal:
+ self.out_mode = self.OUTPUT_INTERNAL
+ else:
+ self.out_mode = self.OUTPUT_ALL
+
+ if nosymbol:
+ self.nosymbol = set(nosymbol)
+
+
+ def highlight_block(self, block):
+ """
+ Apply the RST highlights to a sub-block of text.
+ """
+
+ for r, sub in self.highlights:
+ block = r.sub(sub, block)
+
+ return block
+
+ def out_warnings(self, args):
+ """
+ Output warnings for identifiers that will be displayed.
+ """
+
+ for log_msg in args.warnings:
+ self.config.warning(log_msg)
+
+ def check_doc(self, name, args):
+ """Check if DOC should be output"""
+
+ if self.no_doc_sections:
+ return False
+
+ if name in self.nosymbol:
+ return False
+
+ if self.out_mode == self.OUTPUT_ALL:
+ self.out_warnings(args)
+ return True
+
+ if self.out_mode == self.OUTPUT_INCLUDE:
+ if name in self.function_table:
+ self.out_warnings(args)
+ return True
+
+ return False
+
+ def check_declaration(self, dtype, name, args):
+ """
+ Checks if a declaration should be output or not based on the
+ filtering criteria.
+ """
+
+ if name in self.nosymbol:
+ return False
+
+ if self.out_mode == self.OUTPUT_ALL:
+ self.out_warnings(args)
+ return True
+
+ if self.out_mode in [self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED]:
+ if name in self.function_table:
+ return True
+
+ if self.out_mode == self.OUTPUT_INTERNAL:
+ if dtype != "function":
+ self.out_warnings(args)
+ return True
+
+ if name not in self.function_table:
+ self.out_warnings(args)
+ return True
+
+ return False
+
+ def msg(self, fname, name, args):
+ """
+ Handles a single entry from kernel-doc parser
+ """
+
+ self.data = ""
+
+ dtype = args.type
+
+ if dtype == "doc":
+ self.out_doc(fname, name, args)
+ return self.data
+
+ if not self.check_declaration(dtype, name, args):
+ return self.data
+
+ if dtype == "function":
+ self.out_function(fname, name, args)
+ return self.data
+
+ if dtype == "enum":
+ self.out_enum(fname, name, args)
+ return self.data
+
+ if dtype == "typedef":
+ self.out_typedef(fname, name, args)
+ return self.data
+
+ if dtype in ["struct", "union"]:
+ self.out_struct(fname, name, args)
+ return self.data
+
+ # Warn if some type requires an output logic
+ self.config.log.warning("doesn't know how to output '%s' block",
+ dtype)
+
+ return None
+
+ # Virtual methods to be overridden by inherited classes
+ # At the base class, those do nothing.
+ def out_doc(self, fname, name, args):
+ """Outputs a DOC block"""
+
+ def out_function(self, fname, name, args):
+ """Outputs a function"""
+
+ def out_enum(self, fname, name, args):
+ """Outputs an enum"""
+
+ def out_typedef(self, fname, name, args):
+ """Outputs a typedef"""
+
+ def out_struct(self, fname, name, args):
+ """Outputs a struct"""
+
+
+class RestFormat(OutputFormat):
+ """Consts and functions used by ReST output"""
+
+ highlights = [
+ (type_constant, r"``\1``"),
+ (type_constant2, r"``\1``"),
+
+ # Note: need to escape () to avoid func matching later
+ (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"),
+ (type_member, r":c:type:`\1\2\3 <\1>`"),
+ (type_fp_param, r"**\1\\(\\)**"),
+ (type_fp_param2, r"**\1\\(\\)**"),
+ (type_func, r"\1()"),
+ (type_enum, r":c:type:`\1 <\2>`"),
+ (type_struct, r":c:type:`\1 <\2>`"),
+ (type_typedef, r":c:type:`\1 <\2>`"),
+ (type_union, r":c:type:`\1 <\2>`"),
+
+ # in rst this can refer to any type
+ (type_fallback, r":c:type:`\1`"),
+ (type_param_ref, r"**\1\2**")
+ ]
+ blankline = "\n"
+
+ sphinx_literal = KernRe(r'^[^.].*::$', cache=False)
+ sphinx_cblock = KernRe(r'^\.\.\ +code-block::', cache=False)
+
+ def __init__(self):
+ """
+ Creates class variables.
+
+ Not really mandatory, but it is a good coding style and makes
+ pylint happy.
+ """
+
+ super().__init__()
+ self.lineprefix = ""
+
+ def print_lineno(self, ln):
+ """Outputs a line number"""
+
+ if self.enable_lineno and ln is not None:
+ ln += 1
+ self.data += f".. LINENO {ln}\n"
+
+ def output_highlight(self, args):
+ """
+ Outputs a C symbol that may require being converted to ReST using
+ the self.highlights variable
+ """
+
+ input_text = args
+ output = ""
+ in_literal = False
+ litprefix = ""
+ block = ""
+
+ for line in input_text.strip("\n").split("\n"):
+
+ # If we're in a literal block, see if we should drop out of it.
+ # Otherwise, pass the line straight through unmunged.
+ if in_literal:
+ if line.strip(): # If the line is not blank
+ # If this is the first non-blank line in a literal block,
+ # figure out the proper indent.
+ if not litprefix:
+ r = KernRe(r'^(\s*)')
+ if r.match(line):
+ litprefix = '^' + r.group(1)
+ else:
+ litprefix = ""
+
+ output += line + "\n"
+ elif not KernRe(litprefix).match(line):
+ in_literal = False
+ else:
+ output += line + "\n"
+ else:
+ output += line + "\n"
+
+ # Not in a literal block (or just dropped out)
+ if not in_literal:
+ block += line + "\n"
+ if self.sphinx_literal.match(line) or self.sphinx_cblock.match(line):
+ in_literal = True
+ litprefix = ""
+ output += self.highlight_block(block)
+ block = ""
+
+ # Handle any remaining block
+ if block:
+ output += self.highlight_block(block)
+
+ # Print the output with the line prefix
+ for line in output.strip("\n").split("\n"):
+ self.data += self.lineprefix + line + "\n"
+
+ def out_section(self, args, out_docblock=False):
+ """
+ Outputs a block section.
+
+ This could use some work; it's used to output the DOC: sections, and
+ starts by putting out the name of the doc section itself, but that
+ tends to duplicate a header already in the template file.
+ """
+ for section, text in args.sections.items():
+ # Skip sections that are in the nosymbol_table
+ if section in self.nosymbol:
+ continue
+
+ if out_docblock:
+ if not self.out_mode == self.OUTPUT_INCLUDE:
+ self.data += f".. _{section}:\n\n"
+ self.data += f'{self.lineprefix}**{section}**\n\n'
+ else:
+ self.data += f'{self.lineprefix}**{section}**\n\n'
+
+ self.print_lineno(args.section_start_lines.get(section, 0))
+ self.output_highlight(text)
+ self.data += "\n"
+ self.data += "\n"
+
+ def out_doc(self, fname, name, args):
+ if not self.check_doc(name, args):
+ return
+ self.out_section(args, out_docblock=True)
+
+ def out_function(self, fname, name, args):
+
+ oldprefix = self.lineprefix
+ signature = ""
+
+ func_macro = args.get('func_macro', False)
+ if func_macro:
+ signature = name
+ else:
+ if args.get('functiontype'):
+ signature = args['functiontype'] + " "
+ signature += name + " ("
+
+ ln = args.declaration_start_line
+ count = 0
+ for parameter in args.parameterlist:
+ if count != 0:
+ signature += ", "
+ count += 1
+ dtype = args.parametertypes.get(parameter, "")
+
+ if function_pointer.search(dtype):
+ signature += function_pointer.group(1) + parameter + ") (" + function_pointer.group(2) + ")"
+ else:
+ signature += dtype
+
+ if not func_macro:
+ signature += ")"
+
+ self.print_lineno(ln)
+ if args.get('typedef') or not args.get('functiontype'):
+ self.data += f".. c:macro:: {name}\n\n"
+
+ if args.get('typedef'):
+ self.data += " **Typedef**: "
+ self.lineprefix = ""
+ self.output_highlight(args.get('purpose', ""))
+ self.data += "\n\n**Syntax**\n\n"
+ self.data += f" ``{signature}``\n\n"
+ else:
+ self.data += f"``{signature}``\n\n"
+ else:
+ self.data += f".. c:function:: {signature}\n\n"
+
+ if not args.get('typedef'):
+ self.print_lineno(ln)
+ self.lineprefix = " "
+ self.output_highlight(args.get('purpose', ""))
+ self.data += "\n"
+
+ # Put descriptive text into a container (HTML <div>) to help set
+ # function prototypes apart
+ self.lineprefix = " "
+
+ if args.parameterlist:
+ self.data += ".. container:: kernelindent\n\n"
+ self.data += f"{self.lineprefix}**Parameters**\n\n"
+
+ for parameter in args.parameterlist:
+ parameter_name = KernRe(r'\[.*').sub('', parameter)
+ dtype = args.parametertypes.get(parameter, "")
+
+ if dtype:
+ self.data += f"{self.lineprefix}``{dtype}``\n"
+ else:
+ self.data += f"{self.lineprefix}``{parameter}``\n"
+
+ self.print_lineno(args.parameterdesc_start_lines.get(parameter_name, 0))
+
+ self.lineprefix = " "
+ if parameter_name in args.parameterdescs and \
+ args.parameterdescs[parameter_name] != KernelDoc.undescribed:
+
+ self.output_highlight(args.parameterdescs[parameter_name])
+ self.data += "\n"
+ else:
+ self.data += f"{self.lineprefix}*undescribed*\n\n"
+ self.lineprefix = " "
+
+ self.out_section(args)
+ self.lineprefix = oldprefix
+
+ def out_enum(self, fname, name, args):
+
+ oldprefix = self.lineprefix
+ ln = args.declaration_start_line
+
+ self.data += f"\n\n.. c:enum:: {name}\n\n"
+
+ self.print_lineno(ln)
+ self.lineprefix = " "
+ self.output_highlight(args.get('purpose', ''))
+ self.data += "\n"
+
+ self.data += ".. container:: kernelindent\n\n"
+ outer = self.lineprefix + " "
+ self.lineprefix = outer + " "
+ self.data += f"{outer}**Constants**\n\n"
+
+ for parameter in args.parameterlist:
+ self.data += f"{outer}``{parameter}``\n"
+
+ if args.parameterdescs.get(parameter, '') != KernelDoc.undescribed:
+ self.output_highlight(args.parameterdescs[parameter])
+ else:
+ self.data += f"{self.lineprefix}*undescribed*\n\n"
+ self.data += "\n"
+
+ self.lineprefix = oldprefix
+ self.out_section(args)
+
+ def out_typedef(self, fname, name, args):
+
+ oldprefix = self.lineprefix
+ ln = args.declaration_start_line
+
+ self.data += f"\n\n.. c:type:: {name}\n\n"
+
+ self.print_lineno(ln)
+ self.lineprefix = " "
+
+ self.output_highlight(args.get('purpose', ''))
+
+ self.data += "\n"
+
+ self.lineprefix = oldprefix
+ self.out_section(args)
+
+ def out_struct(self, fname, name, args):
+
+ purpose = args.get('purpose', "")
+ declaration = args.get('definition', "")
+ dtype = args.type
+ ln = args.declaration_start_line
+
+ self.data += f"\n\n.. c:{dtype}:: {name}\n\n"
+
+ self.print_lineno(ln)
+
+ oldprefix = self.lineprefix
+ self.lineprefix += " "
+
+ self.output_highlight(purpose)
+ self.data += "\n"
+
+ self.data += ".. container:: kernelindent\n\n"
+ self.data += f"{self.lineprefix}**Definition**::\n\n"
+
+ self.lineprefix = self.lineprefix + " "
+
+ declaration = declaration.replace("\t", self.lineprefix)
+
+ self.data += f"{self.lineprefix}{dtype} {name}" + ' {' + "\n"
+ self.data += f"{declaration}{self.lineprefix}" + "};\n\n"
+
+ self.lineprefix = " "
+ self.data += f"{self.lineprefix}**Members**\n\n"
+ for parameter in args.parameterlist:
+ if not parameter or parameter.startswith("#"):
+ continue
+
+ parameter_name = parameter.split("[", maxsplit=1)[0]
+
+ if args.parameterdescs.get(parameter_name) == KernelDoc.undescribed:
+ continue
+
+ self.print_lineno(args.parameterdesc_start_lines.get(parameter_name, 0))
+
+ self.data += f"{self.lineprefix}``{parameter}``\n"
+
+ self.lineprefix = " "
+ self.output_highlight(args.parameterdescs[parameter_name])
+ self.lineprefix = " "
+
+ self.data += "\n"
+
+ self.data += "\n"
+
+ self.lineprefix = oldprefix
+ self.out_section(args)
+
+
+class ManFormat(OutputFormat):
+ """Consts and functions used by man pages output"""
+
+ highlights = (
+ (type_constant, r"\1"),
+ (type_constant2, r"\1"),
+ (type_func, r"\\fB\1\\fP"),
+ (type_enum, r"\\fI\1\\fP"),
+ (type_struct, r"\\fI\1\\fP"),
+ (type_typedef, r"\\fI\1\\fP"),
+ (type_union, r"\\fI\1\\fP"),
+ (type_param, r"\\fI\1\\fP"),
+ (type_param_ref, r"\\fI\1\2\\fP"),
+ (type_member, r"\\fI\1\2\3\\fP"),
+ (type_fallback, r"\\fI\1\\fP")
+ )
+ blankline = ""
+
+ date_formats = [
+ "%a %b %d %H:%M:%S %Z %Y",
+ "%a %b %d %H:%M:%S %Y",
+ "%Y-%m-%d",
+ "%b %d %Y",
+ "%B %d %Y",
+ "%m %d %Y",
+ ]
+
+ def __init__(self, modulename):
+ """
+ Creates class variables.
+
+ Not really mandatory, but it is a good coding style and makes
+ pylint happy.
+ """
+
+ super().__init__()
+ self.modulename = modulename
+
+ dt = None
+ tstamp = os.environ.get("KBUILD_BUILD_TIMESTAMP")
+ if tstamp:
+ for fmt in self.date_formats:
+ try:
+ dt = datetime.strptime(tstamp, fmt)
+ break
+ except ValueError:
+ pass
+
+ if not dt:
+ dt = datetime.now()
+
+ self.man_date = dt.strftime("%B %Y")
+
+ def output_highlight(self, block):
+ """
+ Outputs a C symbol that may require being highlighted with
+ self.highlights variable using troff syntax
+ """
+
+ contents = self.highlight_block(block)
+
+ if isinstance(contents, list):
+ contents = "\n".join(contents)
+
+ for line in contents.strip("\n").split("\n"):
+ line = KernRe(r"^\s*").sub("", line)
+ if not line:
+ continue
+
+ if line[0] == ".":
+ self.data += "\\&" + line + "\n"
+ else:
+ self.data += line + "\n"
+
+ def out_doc(self, fname, name, args):
+ if not self.check_doc(name, args):
+ return
+
+ self.data += f'.TH "{self.modulename}" 9 "{self.modulename}" "{self.man_date}" "API Manual" LINUX' + "\n"
+
+ for section, text in args.sections.items():
+ self.data += f'.SH "{section}"' + "\n"
+ self.output_highlight(text)
+
+ def out_function(self, fname, name, args):
+ """output function in man"""
+
+ self.data += f'.TH "{name}" 9 "{name}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n"
+
+ self.data += ".SH NAME\n"
+ self.data += f"{name} \\- {args['purpose']}\n"
+
+ self.data += ".SH SYNOPSIS\n"
+ if args.get('functiontype', ''):
+ self.data += f'.B "{args["functiontype"]}" {name}' + "\n"
+ else:
+ self.data += f'.B "{name}' + "\n"
+
+ count = 0
+ parenth = "("
+ post = ","
+
+ for parameter in args.parameterlist:
+ if count == len(args.parameterlist) - 1:
+ post = ");"
+
+ dtype = args.parametertypes.get(parameter, "")
+ if function_pointer.match(dtype):
+ # Pointer-to-function
+ self.data += f'.BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"' + "\n"
+ else:
+ dtype = KernRe(r'([^\*])$').sub(r'\1 ', dtype)
+
+ self.data += f'.BI "{parenth}{dtype}" "{post}"' + "\n"
+ count += 1
+ parenth = ""
+
+ if args.parameterlist:
+ self.data += ".SH ARGUMENTS\n"
+
+ for parameter in args.parameterlist:
+ parameter_name = re.sub(r'\[.*', '', parameter)
+
+ self.data += f'.IP "{parameter}" 12' + "\n"
+ self.output_highlight(args.parameterdescs.get(parameter_name, ""))
+
+ for section, text in args.sections.items():
+ self.data += f'.SH "{section.upper()}"' + "\n"
+ self.output_highlight(text)
+
+ def out_enum(self, fname, name, args):
+ self.data += f'.TH "{self.modulename}" 9 "enum {name}" "{self.man_date}" "API Manual" LINUX' + "\n"
+
+ self.data += ".SH NAME\n"
+ self.data += f"enum {name} \\- {args['purpose']}\n"
+
+ self.data += ".SH SYNOPSIS\n"
+ self.data += f"enum {name}" + " {\n"
+
+ count = 0
+ for parameter in args.parameterlist:
+ self.data += f'.br\n.BI " {parameter}"' + "\n"
+ if count == len(args.parameterlist) - 1:
+ self.data += "\n};\n"
+ else:
+ self.data += ", \n.br\n"
+
+ count += 1
+
+ self.data += ".SH Constants\n"
+
+ for parameter in args.parameterlist:
+ parameter_name = KernRe(r'\[.*').sub('', parameter)
+ self.data += f'.IP "{parameter}" 12' + "\n"
+ self.output_highlight(args.parameterdescs.get(parameter_name, ""))
+
+ for section, text in args.sections.items():
+ self.data += f'.SH "{section}"' + "\n"
+ self.output_highlight(text)
+
+ def out_typedef(self, fname, name, args):
+ module = self.modulename
+ purpose = args.get('purpose')
+
+ self.data += f'.TH "{module}" 9 "{name}" "{self.man_date}" "API Manual" LINUX' + "\n"
+
+ self.data += ".SH NAME\n"
+ self.data += f"typedef {name} \\- {purpose}\n"
+
+ for section, text in args.sections.items():
+ self.data += f'.SH "{section}"' + "\n"
+ self.output_highlight(text)
+
+ def out_struct(self, fname, name, args):
+ module = self.modulename
+ purpose = args.get('purpose')
+ definition = args.get('definition')
+
+ self.data += f'.TH "{module}" 9 "{args.type} {name}" "{self.man_date}" "API Manual" LINUX' + "\n"
+
+ self.data += ".SH NAME\n"
+ self.data += f"{args.type} {name} \\- {purpose}\n"
+
+ # Replace tabs with two spaces and handle newlines
+ declaration = definition.replace("\t", " ")
+ declaration = KernRe(r"\n").sub('"\n.br\n.BI "', declaration)
+
+ self.data += ".SH SYNOPSIS\n"
+ self.data += f"{args.type} {name} " + "{" + "\n.br\n"
+ self.data += f'.BI "{declaration}\n' + "};\n.br\n\n"
+
+ self.data += ".SH Members\n"
+ for parameter in args.parameterlist:
+ if parameter.startswith("#"):
+ continue
+
+ parameter_name = re.sub(r"\[.*", "", parameter)
+
+ if args.parameterdescs.get(parameter_name) == KernelDoc.undescribed:
+ continue
+
+ self.data += f'.IP "{parameter}" 12' + "\n"
+ self.output_highlight(args.parameterdescs.get(parameter_name))
+
+ for section, text in args.sections.items():
+ self.data += f'.SH "{section}"' + "\n"
+ self.output_highlight(text)
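
For readers who want to see what the highlights tables in
kdoc_output.py do in practice, here is a minimal sketch of the
ordered-substitution idea. It uses the standard re module as a
stand-in for KernRe (an assumption: KernRe.sub is taken to behave
like re.sub for this purpose), and it copies only two of the
patterns from the file. The sample sentence is made up.

```python
import re

# Two patterns copied from the module-level definitions in kdoc_output.py.
type_func = re.compile(r"(\w+)\(\)")
type_struct = re.compile(r"\&(struct\s*([_\w]+))")

# Ordered (pattern, replacement) pairs, in the style of RestFormat.highlights.
highlights = [
    (type_func, r"\1()"),
    (type_struct, r":c:type:`\1 <\2>`"),
]


def highlight_block(block):
    """Apply each substitution in order to a block of text."""
    for pattern, replacement in highlights:
        block = pattern.sub(replacement, block)
    return block


result = highlight_block("See &struct foo and bar() for details.")
```

Since each substitution runs over the output of the previous one, the
order of the table entries matters; that is presumably why the ReST
table escapes "()" in the member-function entries before type_func
gets a chance to match.
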
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
new file mode 100644
index 00000000000..fe730099eca
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -0,0 +1,1669 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+#
+# pylint: disable=C0301,C0302,R0904,R0912,R0913,R0914,R0915,R0917,R1702
+
+"""
+kdoc_parser
+===========
+
+Read a C language source or header FILE and extract embedded
+documentation comments
+"""
+
+import sys
+import re
+from pprint import pformat
+
+from kdoc_re import NestedMatch, KernRe
+from kdoc_item import KdocItem
+
+#
+# Regular expressions used to parse kernel-doc markups at KernelDoc class.
+#
+# Let's declare them in lowercase outside any class to make it easier to
+# convert from the Perl script.
+#
+# As those are evaluated at the beginning, no need to cache them
+#
+
+# Allow whitespace at end of comment start.
+doc_start = KernRe(r'^/\*\*\s*$', cache=False)
+
+doc_end = KernRe(r'\*/', cache=False)
+doc_com = KernRe(r'\s*\*\s*', cache=False)
+doc_com_body = KernRe(r'\s*\* ?', cache=False)
+doc_decl = doc_com + KernRe(r'(\w+)', cache=False)
+
+# @params and a strictly limited set of supported section names
+# Specifically:
+# Match @word:
+# @...:
+# @{section-name}:
+# while trying to not match literal block starts like "example::"
+#
+known_section_names = 'description|context|returns?|notes?|examples?'
+known_sections = KernRe(known_section_names, flags = re.I)
+doc_sect = doc_com + \
+ KernRe(r'\s*(\@[.\w]+|\@\.\.\.|' + known_section_names + r')\s*:([^:].*)?$',
+ flags=re.I, cache=False)
+
+doc_content = doc_com_body + KernRe(r'(.*)', cache=False)
+doc_inline_start = KernRe(r'^\s*/\*\*\s*$', cache=False)
+doc_inline_sect = KernRe(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
+doc_inline_end = KernRe(r'^\s*\*/\s*$', cache=False)
+doc_inline_oneline = KernRe(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
+attribute = KernRe(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
+ flags=re.I | re.S, cache=False)
+
+export_symbol = KernRe(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
+export_symbol_ns = KernRe(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
+
+type_param = KernRe(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
+
+#
+# Tests for the beginning of a kerneldoc block in its various forms.
+#
+doc_block = doc_com + KernRe(r'DOC:\s*(.*)?', cache=False)
+doc_begin_data = KernRe(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)", cache = False)
+doc_begin_func = KernRe(str(doc_com) + # initial " * "
+ r"(?:\w+\s*\*\s*)?" + # type (not captured)
+ r'(?:define\s+)?' + # possible "define" (not captured)
+ r'(\w+)\s*(?:\(\w*\))?\s*' + # name and optional "(...)"
+ r'(?:[-:].*)?$', # description (not captured)
+ cache = False)
+
+#
+# A little helper to get rid of excess white space
+#
+multi_space = KernRe(r'\s\s+')
+def trim_whitespace(s):
+ return multi_space.sub(' ', s.strip())
+
+class state:
+ """
+ State machine enums
+ """
+
+ # Parser states
+ NORMAL = 0 # normal code
+ NAME = 1 # looking for function name
+ DECLARATION = 2 # We have seen a declaration which might not be done
+ BODY = 3 # the body of the comment
+ SPECIAL_SECTION = 4 # doc section ending with a blank line
+ PROTO = 5 # scanning prototype
+ DOCBLOCK = 6 # documentation block
+ INLINE_NAME = 7 # gathering doc outside main block
+ INLINE_TEXT = 8 # reading the body of inline docs
+
+ name = [
+ "NORMAL",
+ "NAME",
+ "DECLARATION",
+ "BODY",
+ "SPECIAL_SECTION",
+ "PROTO",
+ "DOCBLOCK",
+ "INLINE_NAME",
+ "INLINE_TEXT",
+ ]
+
+
+SECTION_DEFAULT = "Description" # default section
+
+class KernelEntry:
+
+ def __init__(self, config, ln):
+ self.config = config
+
+ self._contents = []
+ self.prototype = ""
+
+ self.warnings = []
+
+ self.parameterlist = []
+ self.parameterdescs = {}
+ self.parametertypes = {}
+ self.parameterdesc_start_lines = {}
+
+ self.section_start_lines = {}
+ self.sections = {}
+
+ self.anon_struct_union = False
+
+ self.leading_space = None
+
+ # State flags
+ self.brcount = 0
+ self.declaration_start_line = ln + 1
+
+ #
+ # Management of section contents
+ #
+ def add_text(self, text):
+ self._contents.append(text)
+
+ def contents(self):
+ return '\n'.join(self._contents) + '\n'
+
+ # TODO: rename to emit_message after removal of kernel-doc.pl
+ def emit_msg(self, log_msg, warning=True):
+ """Emit a message"""
+
+ if not warning:
+ self.config.log.info(log_msg)
+ return
+
+ # Delegate warning output to output logic, as this way it
+ # will report warnings/info only for symbols that are output
+
+ self.warnings.append(log_msg)
+ return
+
+ #
+ # Begin a new section.
+ #
+ def begin_section(self, line_no, title = SECTION_DEFAULT, dump = False):
+ if dump:
+ self.dump_section(start_new = True)
+ self.section = title
+ self.new_start_line = line_no
+
+ def dump_section(self, start_new=True):
+ """
+ Dumps section contents to arrays/hashes intended for that purpose.
+ """
+ #
+ # If we have accumulated no contents in the default ("description")
+ # section, don't bother.
+ #
+ if self.section == SECTION_DEFAULT and not self._contents:
+ return
+ name = self.section
+ contents = self.contents()
+
+ if type_param.match(name):
+ name = type_param.group(1)
+
+ self.parameterdescs[name] = contents
+ self.parameterdesc_start_lines[name] = self.new_start_line
+
+ self.new_start_line = 0
+
+ else:
+ if name in self.sections and self.sections[name] != "":
+ # Only warn on user-specified duplicate section names
+ if name != SECTION_DEFAULT:
+ self.emit_msg(f"line {self.new_start_line}: "
+ f"duplicate section name '{name}'")
+ # Treat as a new paragraph - add a blank line
+ self.sections[name] += '\n' + contents
+ else:
+ self.sections[name] = contents
+ self.section_start_lines[name] = self.new_start_line
+ self.new_start_line = 0
+
+# self.config.log.debug("Section: %s : %s", name, pformat(vars(self)))
+
+ if start_new:
+ self.section = SECTION_DEFAULT
+ self._contents = []
+
+
+class KernelDoc:
+ """
+ Read a C language source or header FILE and extract embedded
+ documentation comments.
+ """
+
+ # Section names
+
+ section_context = "Context"
+ section_return = "Return"
+
+ undescribed = "-- undescribed --"
+
+ def __init__(self, config, fname):
+ """Initialize internal variables"""
+
+ self.fname = fname
+ self.config = config
+
+ # Initial state for the state machines
+ self.state = state.NORMAL
+
+ # Store entry currently being processed
+ self.entry = None
+
+ # Place all potential outputs into an array
+ self.entries = []
+
+ #
+ # We need Python 3.7 for its "dicts remember the insertion
+ # order" guarantee
+ #
+ if sys.version_info.major == 3 and sys.version_info.minor < 7:
+ self.emit_msg(0,
+ 'Python 3.7 or later is required for correct results')
+
+ def emit_msg(self, ln, msg, warning=True):
+ """Emit a message"""
+
+ log_msg = f"{self.fname}:{ln} {msg}"
+
+ if self.entry:
+ self.entry.emit_msg(log_msg, warning)
+ return
+
+ if warning:
+ self.config.log.warning(log_msg)
+ else:
+ self.config.log.info(log_msg)
+
+ def dump_section(self, start_new=True):
+ """
+ Dumps section contents to arrays/hashes intended for that purpose.
+ """
+
+ if self.entry:
+ self.entry.dump_section(start_new)
+
+ # TODO: rename it to store_declaration after removal of kernel-doc.pl
+ def output_declaration(self, dtype, name, **args):
+ """
+ Stores the entry into an entry array.
+
+ The actual output and output filters will be handled elsewhere
+ """
+
+ item = KdocItem(name, dtype, self.entry.declaration_start_line, **args)
+ item.warnings = self.entry.warnings
+
+ # Drop empty sections
+ # TODO: improve empty sections logic to emit warnings
+ sections = self.entry.sections
+ for section in ["Description", "Return"]:
+ if section in sections and not sections[section].rstrip():
+ del sections[section]
+ item.set_sections(sections, self.entry.section_start_lines)
+ item.set_params(self.entry.parameterlist, self.entry.parameterdescs,
+ self.entry.parametertypes,
+ self.entry.parameterdesc_start_lines)
+ self.entries.append(item)
+
+ self.config.log.debug("Output: %s:%s = %s", dtype, name, pformat(args))
+
+ def reset_state(self, ln):
+ """
+ Ancillary routine to create a new entry. It initializes all
+ variables used by the state machine.
+ """
+
+ self.entry = KernelEntry(self.config, ln)
+
+ # State flags
+ self.state = state.NORMAL
+
+ def push_parameter(self, ln, decl_type, param, dtype,
+ org_arg, declaration_name):
+ """
+ Store parameters and their descriptions at self.entry.
+ """
+
+ if self.entry.anon_struct_union and dtype == "" and param == "}":
+ return # Ignore the ending }; from anonymous struct/union
+
+ self.entry.anon_struct_union = False
+
+ param = KernRe(r'[\[\)].*').sub('', param, count=1)
+
+ if dtype == "" and param.endswith("..."):
+ if KernRe(r'\w\.\.\.$').search(param):
+ # For named variable parameters of the form `x...`,
+ # remove the dots
+ param = param[:-3]
+ else:
+ # Handles unnamed variable parameters
+ param = "..."
+
+ if param not in self.entry.parameterdescs or \
+ not self.entry.parameterdescs[param]:
+
+ self.entry.parameterdescs[param] = "variable arguments"
+
+ elif dtype == "" and (not param or param == "void"):
+ param = "void"
+ self.entry.parameterdescs[param] = "no arguments"
+
+ elif dtype == "" and param in ["struct", "union"]:
+ # Handle unnamed (anonymous) union or struct
+ dtype = param
+ param = "{unnamed_" + param + "}"
+ self.entry.parameterdescs[param] = "anonymous\n"
+ self.entry.anon_struct_union = True
+
+ # Handle cache group enforcing variables: they do not need
+ # to be described in header files
+ elif "__cacheline_group" in param:
+ # Ignore __cacheline_group_begin and __cacheline_group_end
+ return
+
+ # Warn if parameter has no description
+ # (but ignore ones starting with # as these are not parameters
+ # but inline preprocessor statements)
+ if param not in self.entry.parameterdescs and not param.startswith("#"):
+ self.entry.parameterdescs[param] = self.undescribed
+
+ if "." not in param:
+ if decl_type == 'function':
+ dname = f"{decl_type} parameter"
+ else:
+ dname = f"{decl_type} member"
+
+ self.emit_msg(ln,
+ f"{dname} '{param}' not described in '{declaration_name}'")
+
+ # Strip spaces from param so that it is one continuous string on
+ # parameterlist. This fixes a problem where check_sections()
+ # cannot find a parameter like "addr[6 + 2]" because it actually
+ # appears as "addr[6", "+", "2]" on the parameter list.
+ # However, it's better to maintain the param string unchanged for
+ # output, so just weaken the string compare in check_sections()
+ # to ignore "[blah" in a parameter string.
+
+ self.entry.parameterlist.append(param)
+ org_arg = KernRe(r'\s\s+').sub(' ', org_arg)
+ self.entry.parametertypes[param] = org_arg
+
+
+ def create_parameter_list(self, ln, decl_type, args,
+ splitter, declaration_name):
+ """
+ Creates a list of parameters, storing them at self.entry.
+ """
+
+ # temporarily replace all commas inside function pointer definition
+ arg_expr = KernRe(r'(\([^\),]+),')
+ while arg_expr.search(args):
+ args = arg_expr.sub(r"\1#", args)
+
+ for arg in args.split(splitter):
+ # Strip comments
+ arg = KernRe(r'\/\*.*\*\/').sub('', arg)
+
+ # Ignore argument attributes
+ arg = KernRe(r'\sPOS0?\s').sub(' ', arg)
+
+ # Strip leading/trailing spaces
+ arg = arg.strip()
+ arg = KernRe(r'\s+').sub(' ', arg, count=1)
+
+ if arg.startswith('#'):
+ # Treat preprocessor directive as a typeless variable just to fill
+ # corresponding data structures "correctly". Catch it later in
+ # output_* subs.
+
+ # Treat preprocessor directive as a typeless variable
+ self.push_parameter(ln, decl_type, arg, "",
+ "", declaration_name)
+
+ elif KernRe(r'\(.+\)\s*\(').search(arg):
+ # Pointer-to-function
+
+ arg = arg.replace('#', ',')
+
+ r = KernRe(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
+ if r.match(arg):
+ param = r.group(1)
+ else:
+ self.emit_msg(ln, f"Invalid param: {arg}")
+ param = arg
+
+ dtype = KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+ self.push_parameter(ln, decl_type, param, dtype,
+ arg, declaration_name)
+
+ elif KernRe(r'\(.+\)\s*\[').search(arg):
+ # Array-of-pointers
+
+ arg = arg.replace('#', ',')
+ r = KernRe(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
+ if r.match(arg):
+ param = r.group(1)
+ else:
+ self.emit_msg(ln, f"Invalid param: {arg}")
+ param = arg
+
+ dtype = KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
+
+ self.push_parameter(ln, decl_type, param, dtype,
+ arg, declaration_name)
+
+ elif arg:
+ arg = KernRe(r'\s*:\s*').sub(":", arg)
+ arg = KernRe(r'\s*\[').sub('[', arg)
+
+ args = KernRe(r'\s*,\s*').split(arg)
+ if args[0] and '*' in args[0]:
+ args[0] = re.sub(r'(\*+)\s*', r' \1', args[0])
+
+ first_arg = []
+ r = KernRe(r'^(.*\s+)(.*?\[.*\].*)$')
+ if args[0] and r.match(args[0]):
+ args.pop(0)
+ first_arg.extend(r.group(1).split())
+ first_arg.append(r.group(2))
+ else:
+ first_arg = KernRe(r'\s+').split(args.pop(0))
+
+ args.insert(0, first_arg.pop())
+ dtype = ' '.join(first_arg)
+
+ for param in args:
+ if KernRe(r'^(\*+)\s*(.*)').match(param):
+ r = KernRe(r'^(\*+)\s*(.*)')
+ if not r.match(param):
+ self.emit_msg(ln, f"Invalid param: {param}")
+ continue
+
+ param = r.group(1)
+
+ self.push_parameter(ln, decl_type, r.group(2),
+ f"{dtype} {r.group(1)}",
+ arg, declaration_name)
+
+ elif KernRe(r'(.*?):(\w+)').search(param):
+ r = KernRe(r'(.*?):(\w+)')
+ if not r.match(param):
+ self.emit_msg(ln, f"Invalid param: {param}")
+ continue
+
+ if dtype != "": # Skip unnamed bit-fields
+ self.push_parameter(ln, decl_type, r.group(1),
+ f"{dtype}:{r.group(2)}",
+ arg, declaration_name)
+ else:
+ self.push_parameter(ln, decl_type, param, dtype,
+ arg, declaration_name)
+
+ def check_sections(self, ln, decl_name, decl_type):
+ """
+ Check for errors inside sections, emitting warnings when a description
+ exists for a parameter or member that was not found in the declaration.
+ """
+ for section in self.entry.sections:
+ if section not in self.entry.parameterlist and \
+ not known_sections.search(section):
+ if decl_type == 'function':
+ dname = f"{decl_type} parameter"
+ else:
+ dname = f"{decl_type} member"
+ self.emit_msg(ln,
+ f"Excess {dname} '{section}' description in '{decl_name}'")
+
+ def check_return_section(self, ln, declaration_name, return_type):
+ """
+ If the function doesn't return void, warns about the lack of a
+ return description.
+ """
+
+ if not self.config.wreturn:
+ return
+
+ # Ignore an empty return type (It's a macro)
+ # Ignore functions with a "void" return type (but not "void *")
+ if not return_type or KernRe(r'void\s*\w*\s*$').search(return_type):
+ return
+
+ if not self.entry.sections.get("Return", None):
+ self.emit_msg(ln,
+ f"No description found for return value of '{declaration_name}'")
+
+ def dump_struct(self, ln, proto):
+ """
+ Store an entry for a struct or union
+ """
+
+ type_pattern = r'(struct|union)'
+
+ qualifiers = [
+ "__attribute__",
+ "__packed",
+ "__aligned",
+ "____cacheline_aligned_in_smp",
+ "____cacheline_aligned",
+ ]
+
+ definition_body = r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) + ")?"
+ struct_members = KernRe(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
+
+ # Extract struct/union definition
+ members = None
+ declaration_name = None
+ decl_type = None
+
+ r = KernRe(type_pattern + r'\s+(\w+)\s*' + definition_body)
+ if r.search(proto):
+ decl_type = r.group(1)
+ declaration_name = r.group(2)
+ members = r.group(3)
+ else:
+ r = KernRe(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
+
+ if r.search(proto):
+ decl_type = r.group(1)
+ declaration_name = r.group(3)
+ members = r.group(2)
+
+ if not members:
+ self.emit_msg(ln, f"{proto} error: Cannot parse struct or union!")
+ return
+
+ if self.entry.identifier != declaration_name:
+ self.emit_msg(ln,
+ f"expecting prototype for {decl_type} {self.entry.identifier}. Prototype was for {decl_type} {declaration_name} instead\n")
+ return
+
+ args_pattern = r'([^,)]+)'
+
+ sub_prefixes = [
+ (KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), ''),
+ (KernRe(r'\/\*\s*private:.*', re.S | re.I), ''),
+
+ # Strip comments
+ (KernRe(r'\/\*.*?\*\/', re.S), ''),
+
+ # Strip attributes
+ (attribute, ' '),
+ (KernRe(r'\s*__aligned\s*\([^;]*\)', re.S), ' '),
+ (KernRe(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '),
+ (KernRe(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '),
+ (KernRe(r'\s*__packed\s*', re.S), ' '),
+ (KernRe(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '),
+ (KernRe(r'\s*____cacheline_aligned_in_smp', re.S), ' '),
+ (KernRe(r'\s*____cacheline_aligned', re.S), ' '),
+
+ # Unwrap struct_group macros based on this definition:
+ # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
+ # which has variants like: struct_group(NAME, MEMBERS...)
+ # Only MEMBERS arguments require documentation.
+ #
+ # Parsing them happens in two steps:
+ #
+ # 1. drop struct group arguments that aren't MEMBERS,
+ # storing them as STRUCT_GROUP(MEMBERS)
+ #
+ # 2. remove STRUCT_GROUP() ancillary macro.
+ #
+ # The original logic used to remove STRUCT_GROUP() using an
+ # advanced regex:
+ #
+ # \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;
+ #
+ # with two patterns that are incompatible with
+ # Python re module, as it has:
+ #
+ # - a recursive pattern: (?1)
+ # - an atomic grouping: (?>...)
+ #
+ # I tried a simpler version, but it didn't work either:
+ # \bSTRUCT_GROUP\(([^\)]+)\)[^;]*;
+ #
+ # as it doesn't properly match the closing parenthesis in some cases.
+ #
+ # So, a better solution was crafted: there's now a NestedMatch
+ # class that ensures that delimiters after a search are properly
+ # matched. So, the implementation to drop STRUCT_GROUP() is
+ # handled separately.
+
+ (KernRe(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('),
+ (KernRe(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GROUP('),
+ (KernRe(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'struct \1 \2; STRUCT_GROUP('),
+ (KernRe(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP('),
+
+ # Replace macros
+ #
+ # TODO: use NestedMatch for FOO($1, $2, ...) matches
+ #
+ # it is better to also move those to the NestedMatch logic,
+ # to ensure that parenthesis will be properly matched.
+
+ (KernRe(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
+ (KernRe(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
+ (KernRe(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'),
+ (KernRe(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'),
+ (KernRe(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+ (KernRe(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
+ (KernRe(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\1 \2[]'),
+ (KernRe(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S), r'dma_addr_t \1'),
+ (KernRe(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S), r'__u32 \1'),
+ (KernRe(r'VIRTIO_DECLARE_FEATURES\s*\(' + args_pattern + r'\)', re.S), r'u64 \1; u64 \1_array[VIRTIO_FEATURES_DWORDS]'),
+ ]
+
+ # Regexes here are guaranteed to have the end delimiter matching
+ # the start delimiter. Yet, right now, only one replace group
+ # is allowed.
+
+ sub_nested_prefixes = [
+ (re.compile(r'\bSTRUCT_GROUP\('), r'\1'),
+ ]
+
+ for search, sub in sub_prefixes:
+ members = search.sub(sub, members)
+
+ nested = NestedMatch()
+
+ for search, sub in sub_nested_prefixes:
+ members = nested.sub(search, sub, members)
+
+ # Keeps the original declaration as-is
+ declaration = members
+
+ # Split nested struct/union elements
+ #
+ # This loop was simpler in the original kernel-doc Perl version, as
+ # while ($members =~ m/$struct_members/) { ... }
+ # re-reads the 'members' string on each iteration.
+ #
+ # Python behavior is different: it parses 'members' only once,
+ # creating a list of tuples from the first iteration.
+ #
+ # In other words, this won't get nested structs.
+ #
+ # So, we need an extra loop in Python to work around this
+ # re limitation.
+
+ while True:
+ tuples = struct_members.findall(members)
+ if not tuples:
+ break
+
+ for t in tuples:
+ newmember = ""
+ maintype = t[0]
+ s_ids = t[5]
+ content = t[3]
+
+ oldmember = "".join(t)
+
+ for s_id in s_ids.split(','):
+ s_id = s_id.strip()
+
+ newmember += f"{maintype} {s_id}; "
+ s_id = KernRe(r'[:\[].*').sub('', s_id)
+ s_id = KernRe(r'^\s*\**(\S+)\s*').sub(r'\1', s_id)
+
+ for arg in content.split(';'):
+ arg = arg.strip()
+
+ if not arg:
+ continue
+
+ r = KernRe(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)')
+ if r.match(arg):
+ # Pointer-to-function
+ dtype = r.group(1)
+ name = r.group(2)
+ extra = r.group(3)
+
+ if not name:
+ continue
+
+ if not s_id:
+ # Anonymous struct/union
+ newmember += f"{dtype}{name}{extra}; "
+ else:
+ newmember += f"{dtype}{s_id}.{name}{extra}; "
+
+ else:
+ arg = arg.strip()
+ # Handle bitmaps
+ arg = KernRe(r':\s*\d+\s*').sub('', arg)
+
+ # Handle arrays
+ arg = KernRe(r'\[.*\]').sub('', arg)
+
+ # Handle multiple IDs
+ arg = KernRe(r'\s*,\s*').sub(',', arg)
+
+ r = KernRe(r'(.*)\s+([\S+,]+)')
+
+ if r.search(arg):
+ dtype = r.group(1)
+ names = r.group(2)
+ else:
+ newmember += f"{arg}; "
+ continue
+
+ for name in names.split(','):
+ name = KernRe(r'^\s*\**(\S+)\s*').sub(r'\1', name).strip()
+
+ if not name:
+ continue
+
+ if not s_id:
+ # Anonymous struct/union
+ newmember += f"{dtype} {name}; "
+ else:
+ newmember += f"{dtype} {s_id}.{name}; "
+
+ members = members.replace(oldmember, newmember)
+
+ # Ignore other nested elements, like enums
+ members = re.sub(r'(\{[^\{\}]*\})', '', members)
+
+ self.create_parameter_list(ln, decl_type, members, ';',
+ declaration_name)
+ self.check_sections(ln, declaration_name, decl_type)
+
+ # Adjust declaration for better display
+ declaration = KernRe(r'([\{;])').sub(r'\1\n', declaration)
+ declaration = KernRe(r'\}\s+;').sub('};', declaration)
+
+ # Better handle inlined enums
+ while True:
+ r = KernRe(r'(enum\s+\{[^\}]+),([^\n])')
+ if not r.search(declaration):
+ break
+
+ declaration = r.sub(r'\1,\n\2', declaration)
+
+ def_args = declaration.split('\n')
+ level = 1
+ declaration = ""
+ for clause in def_args:
+
+ clause = clause.strip()
+ clause = KernRe(r'\s+').sub(' ', clause, count=1)
+
+ if not clause:
+ continue
+
+ if '}' in clause and level > 1:
+ level -= 1
+
+ if not KernRe(r'^\s*#').match(clause):
+ declaration += "\t" * level
+
+ declaration += "\t" + clause + "\n"
+ if "{" in clause and "}" not in clause:
+ level += 1
+
+ self.output_declaration(decl_type, declaration_name,
+ definition=declaration,
+ purpose=self.entry.declaration_purpose)
+
+ def dump_enum(self, ln, proto):
+ """
+ Stores an enum inside self.entries array.
+ """
+
+ # Ignore members marked private
+ proto = KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=re.S).sub('', proto)
+ proto = KernRe(r'\/\*\s*private:.*}', flags=re.S).sub('}', proto)
+
+ # Strip comments
+ proto = KernRe(r'\/\*.*?\*\/', flags=re.S).sub('', proto)
+
+ # Strip #define macros inside enums
+ proto = KernRe(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=re.S).sub('', proto)
+
+ #
+ # Parse out the name and members of the enum. Typedef form first.
+ #
+ r = KernRe(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;')
+ if r.search(proto):
+ declaration_name = r.group(2)
+ members = r.group(1).rstrip()
+ #
+ # Failing that, look for a straight enum
+ #
+ else:
+ r = KernRe(r'enum\s+(\w*)\s*\{(.*)\}')
+ if r.match(proto):
+ declaration_name = r.group(1)
+ members = r.group(2).rstrip()
+ #
+ # OK, this isn't going to work.
+ #
+ else:
+ self.emit_msg(ln, f"{proto}: error: Cannot parse enum!")
+ return
+ #
+ # Make sure we found what we were expecting.
+ #
+ if self.entry.identifier != declaration_name:
+ if self.entry.identifier == "":
+ self.emit_msg(ln,
+ f"{proto}: wrong kernel-doc identifier on prototype")
+ else:
+ self.emit_msg(ln,
+ f"expecting prototype for enum {self.entry.identifier}. "
+ f"Prototype was for enum {declaration_name} instead")
+ return
+
+ if not declaration_name:
+ declaration_name = "(anonymous)"
+ #
+ # Parse out the name of each enum member, and verify that we
+ # have a description for it.
+ #
+ member_set = set()
+ members = KernRe(r'\([^;)]*\)').sub('', members)
+ for arg in members.split(','):
+ if not arg:
+ continue
+ arg = KernRe(r'^\s*(\w+).*').sub(r'\1', arg)
+ self.entry.parameterlist.append(arg)
+ if arg not in self.entry.parameterdescs:
+ self.entry.parameterdescs[arg] = self.undescribed
+ self.emit_msg(ln,
+ f"Enum value '{arg}' not described in enum '{declaration_name}'")
+ member_set.add(arg)
+ #
+ # Ensure that every described member actually exists in the enum.
+ #
+ for k in self.entry.parameterdescs:
+ if k not in member_set:
+ self.emit_msg(ln,
+ f"Excess enum value '%{k}' description in '{declaration_name}'")
+
+ self.output_declaration('enum', declaration_name,
+ purpose=self.entry.declaration_purpose)
+
+ def dump_declaration(self, ln, prototype):
+ """
+ Stores a data declaration inside self.entries array.
+ """
+
+ if self.entry.decl_type == "enum":
+ self.dump_enum(ln, prototype)
+ elif self.entry.decl_type == "typedef":
+ self.dump_typedef(ln, prototype)
+ elif self.entry.decl_type in ["union", "struct"]:
+ self.dump_struct(ln, prototype)
+ else:
+ # This would be a bug
+ self.emit_msg(ln, f'Unknown declaration type: {self.entry.decl_type}')
+
+ def dump_function(self, ln, prototype):
+ """
+ Stores a function or function macro inside self.entries array.
+ """
+
+ func_macro = False
+ return_type = ''
+ decl_type = 'function'
+
+ # Prefixes that would be removed
+ sub_prefixes = [
+ (r"^static +", "", 0),
+ (r"^extern +", "", 0),
+ (r"^asmlinkage +", "", 0),
+ (r"^inline +", "", 0),
+ (r"^__inline__ +", "", 0),
+ (r"^__inline +", "", 0),
+ (r"^__always_inline +", "", 0),
+ (r"^noinline +", "", 0),
+ (r"^__FORTIFY_INLINE +", "", 0),
+ (r"__init +", "", 0),
+ (r"__init_or_module +", "", 0),
+ (r"__deprecated +", "", 0),
+ (r"__flatten +", "", 0),
+ (r"__meminit +", "", 0),
+ (r"__must_check +", "", 0),
+ (r"__weak +", "", 0),
+ (r"__sched +", "", 0),
+ (r"_noprof", "", 0),
+ (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0),
+ (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", 0),
+ (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0),
+ (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2", 0),
+ (r"__attribute_const__ +", "", 0),
+
+ # It seems that Python support for re.X is broken:
+ # At least for me (Python 3.13), this didn't work
+# (r"""
+# __attribute__\s*\(\(
+# (?:
+# [\w\s]+ # attribute name
+# (?:\([^)]*\))? # attribute arguments
+# \s*,? # optional comma at the end
+# )+
+# \)\)\s+
+# """, "", re.X),
+
+ # So, remove whitespaces and comments from it
+ (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+", "", 0),
+ ]
+
+ for search, sub, flags in sub_prefixes:
+ prototype = KernRe(search, flags).sub(sub, prototype)
+
+ # Macros are a special case, as they change the prototype format
+ new_proto = KernRe(r"^#\s*define\s+").sub("", prototype)
+ if new_proto != prototype:
+ is_define_proto = True
+ prototype = new_proto
+ else:
+ is_define_proto = False
+
+ # Yes, this truly is vile. We are looking for:
+ # 1. Return type (may be nothing if we're looking at a macro)
+ # 2. Function name
+ # 3. Function parameters.
+ #
+ # All the while we have to watch out for function pointer parameters
+ # (which IIRC is what the two sections are for), C types (these
+ # regexps don't even start to express all the possibilities), and
+ # so on.
+ #
+ # If you mess with these regexps, it's a good idea to check that
+ # the following functions' documentation still comes out right:
+ # - parport_register_device (function pointer parameters)
+ # - atomic_set (macro)
+ # - pci_match_device, __copy_to_user (long return type)
+
+ name = r'[a-zA-Z0-9_~:]+'
+ prototype_end1 = r'[^\(]*'
+ prototype_end2 = r'[^\{]*'
+ prototype_end = fr'\(({prototype_end1}|{prototype_end2})\)'
+
+ # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing group.
+ # So, this needs to be mapped in Python with (?:...)? or (?:...)+
+
+ type1 = r'(?:[\w\s]+)?'
+ type2 = r'(?:[\w\s]+\*+)+'
+
+ found = False
+
+ if is_define_proto:
+ r = KernRe(r'^()(' + name + r')\s+')
+
+ if r.search(prototype):
+ return_type = ''
+ declaration_name = r.group(2)
+ func_macro = True
+
+ found = True
+
+ if not found:
+ patterns = [
+ rf'^()({name})\s*{prototype_end}',
+ rf'^({type1})\s+({name})\s*{prototype_end}',
+ rf'^({type2})\s*({name})\s*{prototype_end}',
+ ]
+
+ for p in patterns:
+ r = KernRe(p)
+
+ if r.match(prototype):
+
+ return_type = r.group(1)
+ declaration_name = r.group(2)
+ args = r.group(3)
+
+ self.create_parameter_list(ln, decl_type, args, ',',
+ declaration_name)
+
+ found = True
+ break
+ if not found:
+ self.emit_msg(ln,
+ f"cannot understand function prototype: '{prototype}'")
+ return
+
+ if self.entry.identifier != declaration_name:
+ self.emit_msg(ln,
+ f"expecting prototype for {self.entry.identifier}(). Prototype was for {declaration_name}() instead")
+ return
+
+ self.check_sections(ln, declaration_name, "function")
+
+ self.check_return_section(ln, declaration_name, return_type)
+
+ if 'typedef' in return_type:
+ self.output_declaration(decl_type, declaration_name,
+ typedef=True,
+ functiontype=return_type,
+ purpose=self.entry.declaration_purpose,
+ func_macro=func_macro)
+ else:
+ self.output_declaration(decl_type, declaration_name,
+ typedef=False,
+ functiontype=return_type,
+ purpose=self.entry.declaration_purpose,
+ func_macro=func_macro)
+
+ def dump_typedef(self, ln, proto):
+ """
+ Stores a typedef inside self.entries array.
+ """
+
+ typedef_type = r'((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s*'
+ typedef_ident = r'\*?\s*(\w\S+)\s*'
+ typedef_args = r'\s*\((.*)\);'
+
+ typedef1 = KernRe(r'typedef' + typedef_type + r'\(' + typedef_ident + r'\)' + typedef_args)
+ typedef2 = KernRe(r'typedef' + typedef_type + typedef_ident + typedef_args)
+
+ # Strip comments
+ proto = KernRe(r'/\*.*?\*/', flags=re.S).sub('', proto)
+
+ # Parse function typedef prototypes
+ for r in [typedef1, typedef2]:
+ if not r.match(proto):
+ continue
+
+ return_type = r.group(1).strip()
+ declaration_name = r.group(2)
+ args = r.group(3)
+
+ if self.entry.identifier != declaration_name:
+ self.emit_msg(ln,
+ f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
+ return
+
+ decl_type = 'function'
+ self.create_parameter_list(ln, decl_type, args, ',', declaration_name)
+
+ self.output_declaration(decl_type, declaration_name,
+ typedef=True,
+ functiontype=return_type,
+ purpose=self.entry.declaration_purpose)
+ return
+
+ # Handle nested parentheses or brackets
+ r = KernRe(r'(\(*.\)\s*|\[*.\]\s*);$')
+ while r.search(proto):
+ proto = r.sub('', proto)
+
+ # Parse simple typedefs
+ r = KernRe(r'typedef.*\s+(\w+)\s*;')
+ if r.match(proto):
+ declaration_name = r.group(1)
+
+ if self.entry.identifier != declaration_name:
+ self.emit_msg(ln,
+ f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
+ return
+
+ self.output_declaration('typedef', declaration_name,
+ purpose=self.entry.declaration_purpose)
+ return
+
+ self.emit_msg(ln, "error: Cannot parse typedef!")
+
+ @staticmethod
+ def process_export(function_set, line):
+ """
+ process EXPORT_SYMBOL* tags
+
+ This method doesn't use any variable from the class, so declare it
+ with a staticmethod decorator.
+ """
+
+ # We support documenting some exported symbols with different
+ # names. A horrible hack.
+ suffixes = [ '_noprof' ]
+
+ # Note: it accepts only one EXPORT_SYMBOL* per line, as having
+ # multiple export lines would violate Kernel coding style.
+
+ if export_symbol.search(line):
+ symbol = export_symbol.group(2)
+ elif export_symbol_ns.search(line):
+ symbol = export_symbol_ns.group(2)
+ else:
+ return False
+ #
+ # Found an export, trim out any special suffixes
+ #
+ for suffix in suffixes:
+ # Be backward compatible with Python < 3.9
+ if symbol.endswith(suffix):
+ symbol = symbol[:-len(suffix)]
+ function_set.add(symbol)
+ return True
+
+ def process_normal(self, ln, line):
+ """
+ STATE_NORMAL: looking for the /** to begin everything.
+ """
+
+ if not doc_start.match(line):
+ return
+
+ # start a new entry
+ self.reset_state(ln)
+
+ # next line is always the function name
+ self.state = state.NAME
+
+ def process_name(self, ln, line):
+ """
+ STATE_NAME: Looking for the "name - description" line
+ """
+ #
+ # Check for a DOC: block and handle them specially.
+ #
+ if doc_block.search(line):
+
+ if not doc_block.group(1):
+ self.entry.begin_section(ln, "Introduction")
+ else:
+ self.entry.begin_section(ln, doc_block.group(1))
+
+ self.entry.identifier = self.entry.section
+ self.state = state.DOCBLOCK
+ #
+ # Otherwise we're looking for a normal kerneldoc declaration line.
+ #
+ elif doc_decl.search(line):
+ self.entry.identifier = doc_decl.group(1)
+
+ # Test for data declaration
+ if doc_begin_data.search(line):
+ self.entry.decl_type = doc_begin_data.group(1)
+ self.entry.identifier = doc_begin_data.group(2)
+ #
+ # Look for a function description
+ #
+ elif doc_begin_func.search(line):
+ self.entry.identifier = doc_begin_func.group(1)
+ self.entry.decl_type = "function"
+ #
+ # We struck out.
+ #
+ else:
+ self.emit_msg(ln,
+ f"This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst\n{line}")
+ self.state = state.NORMAL
+ return
+ #
+ # OK, set up for a new kerneldoc entry.
+ #
+ self.state = state.BODY
+ self.entry.identifier = self.entry.identifier.strip(" ")
+ # if there's no @param blocks need to set up default section here
+ self.entry.begin_section(ln + 1)
+ #
+ # Find the description portion, which *should* be there but
+ # isn't always.
+ # (We should be able to capture this from the previous parsing - someday)
+ #
+ r = KernRe("[-:](.*)")
+ if r.search(line):
+ self.entry.declaration_purpose = trim_whitespace(r.group(1))
+ self.state = state.DECLARATION
+ else:
+ self.entry.declaration_purpose = ""
+
+ if not self.entry.declaration_purpose and self.config.wshort_desc:
+ self.emit_msg(ln,
+ f"missing initial short description on line:\n{line}")
+
+ if not self.entry.identifier and self.entry.decl_type != "enum":
+ self.emit_msg(ln,
+ f"wrong kernel-doc identifier on line:\n{line}")
+ self.state = state.NORMAL
+
+ if self.config.verbose:
+ self.emit_msg(ln,
+ f"Scanning doc for {self.entry.decl_type} {self.entry.identifier}",
+ warning=False)
+ #
+ # Failed to find an identifier. Emit a warning
+ #
+ else:
+ self.emit_msg(ln, f"Cannot find identifier on line:\n{line}")
+
+ #
+ # Helper function to determine if a new section is being started.
+ #
+ def is_new_section(self, ln, line):
+ if doc_sect.search(line):
+ self.state = state.BODY
+ #
+ # Pick out the name of our new section, tweaking it if need be.
+ #
+ newsection = doc_sect.group(1)
+ if newsection.lower() == 'description':
+ newsection = 'Description'
+ elif newsection.lower() == 'context':
+ newsection = 'Context'
+ self.state = state.SPECIAL_SECTION
+ elif newsection.lower() in ["@return", "@returns",
+ "return", "returns"]:
+ newsection = "Return"
+ self.state = state.SPECIAL_SECTION
+ elif newsection[0] == '@':
+ self.state = state.SPECIAL_SECTION
+ #
+ # Initialize the contents, and get the new section going.
+ #
+ newcontents = doc_sect.group(2)
+ if not newcontents:
+ newcontents = ""
+ self.dump_section()
+ self.entry.begin_section(ln, newsection)
+ self.entry.leading_space = None
+
+ self.entry.add_text(newcontents.lstrip())
+ return True
+ return False
+
+ #
+ # Helper function to detect (and effect) the end of a kerneldoc comment.
+ #
+ def is_comment_end(self, ln, line):
+ if doc_end.search(line):
+ self.dump_section()
+
+ # Look for doc_com + <text> + doc_end:
+ r = KernRe(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
+ if r.match(line):
+ self.emit_msg(ln, f"suspicious ending line: {line}")
+
+ self.entry.prototype = ""
+ self.entry.new_start_line = ln + 1
+
+ self.state = state.PROTO
+ return True
+ return False
+
+
+ def process_decl(self, ln, line):
+ """
+ STATE_DECLARATION: We've seen the beginning of a declaration
+ """
+ if self.is_new_section(ln, line) or self.is_comment_end(ln, line):
+ return
+ #
+ # Look for anything with the " * " line beginning.
+ #
+ if doc_content.search(line):
+ cont = doc_content.group(1)
+ #
+ # A blank line means that we have moved out of the declaration
+ # part of the comment (without any "special section" parameter
+ # descriptions).
+ #
+ if cont == "":
+ self.state = state.BODY
+ #
+ # Otherwise we have more of the declaration section to soak up.
+ #
+ else:
+ self.entry.declaration_purpose = \
+ trim_whitespace(self.entry.declaration_purpose + ' ' + cont)
+ else:
+ # Unknown line, ignore
+ self.emit_msg(ln, f"bad line: {line}")
+
+
+ def process_special(self, ln, line):
+ """
+ STATE_SPECIAL_SECTION: a section ending with a blank line
+ """
+ #
+ # If we have hit a blank line (only the " * " marker), then this
+ # section is done.
+ #
+ if KernRe(r"\s*\*\s*$").match(line):
+ self.entry.begin_section(ln, dump = True)
+ self.state = state.BODY
+ return
+ #
+ # Not a blank line, look for the other ways to end the section.
+ #
+ if self.is_new_section(ln, line) or self.is_comment_end(ln, line):
+ return
+ #
+ # OK, we should have a continuation of the text for this section.
+ #
+ if doc_content.search(line):
+ cont = doc_content.group(1)
+ #
+ # If the lines of text after the first in a special section have
+ # leading white space, we need to trim it out or Sphinx will get
+ # confused. For the second line (the None case), see what we
+ # find there and remember it.
+ #
+ if self.entry.leading_space is None:
+ r = KernRe(r'^(\s+)')
+ if r.match(cont):
+ self.entry.leading_space = len(r.group(1))
+ else:
+ self.entry.leading_space = 0
+ #
+ # Otherwise, before trimming any leading chars, be *sure*
+ # that they are white space. We should maybe warn if this
+ # isn't the case.
+ #
+ for i in range(0, self.entry.leading_space):
+ if cont[i] != " ":
+ self.entry.leading_space = i
+ break
+ #
+ # Add the trimmed result to the section and we're done.
+ #
+ self.entry.add_text(cont[self.entry.leading_space:])
+ else:
+ # Unknown line, ignore
+ self.emit_msg(ln, f"bad line: {line}")
+
+ def process_body(self, ln, line):
+ """
+ STATE_BODY: the bulk of a kerneldoc comment.
+ """
+ if self.is_new_section(ln, line) or self.is_comment_end(ln, line):
+ return
+
+ if doc_content.search(line):
+ cont = doc_content.group(1)
+ self.entry.add_text(cont)
+ else:
+ # Unknown line, ignore
+ self.emit_msg(ln, f"bad line: {line}")
+
+ def process_inline_name(self, ln, line):
+ """STATE_INLINE_NAME: beginning of docbook comments within a prototype."""
+
+ if doc_inline_sect.search(line):
+ self.entry.begin_section(ln, doc_inline_sect.group(1))
+ self.entry.add_text(doc_inline_sect.group(2).lstrip())
+ self.state = state.INLINE_TEXT
+ elif doc_inline_end.search(line):
+ self.dump_section()
+ self.state = state.PROTO
+ elif doc_content.search(line):
+ self.emit_msg(ln, f"Incorrect use of kernel-doc format: {line}")
+ self.state = state.PROTO
+ # else ... ??
+
+ def process_inline_text(self, ln, line):
+ """STATE_INLINE_TEXT: docbook comments within a prototype."""
+
+ if doc_inline_end.search(line):
+ self.dump_section()
+ self.state = state.PROTO
+ elif doc_content.search(line):
+ self.entry.add_text(doc_content.group(1))
+ # else ... ??
+
+ def syscall_munge(self, ln, proto): # pylint: disable=W0613
+ """
+ Handle syscall definitions
+ """
+
+ is_void = False
+
+ # Strip newlines/CR's
+ proto = re.sub(r'[\r\n]+', ' ', proto)
+
+ # Check if it's a SYSCALL_DEFINE0
+ if 'SYSCALL_DEFINE0' in proto:
+ is_void = True
+
+ # Replace SYSCALL_DEFINE with correct return type & function name
+ proto = KernRe(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto)
+
+ r = KernRe(r'long\s+(sys_.*?),')
+ if r.search(proto):
+ proto = KernRe(',').sub('(', proto, count=1)
+ elif is_void:
+ proto = KernRe(r'\)').sub('(void)', proto, count=1)
+
+ # Now delete all of the odd-numbered commas in the proto
+ # so that argument types & names don't have a comma between them
+ count = 0
+ length = len(proto)
+
+ if is_void:
+ length = 0 # skip the loop if is_void
+
+ for ix in range(length):
+ if proto[ix] == ',':
+ count += 1
+ if count % 2 == 1:
+ proto = proto[:ix] + ' ' + proto[ix + 1:]
+
+ return proto
+
+ def tracepoint_munge(self, ln, proto):
+ """
+ Handle tracepoint definitions
+ """
+
+ tracepointname = None
+ tracepointargs = None
+
+ # Match tracepoint name based on different patterns
+ r = KernRe(r'TRACE_EVENT\((.*?),')
+ if r.search(proto):
+ tracepointname = r.group(1)
+
+ r = KernRe(r'DEFINE_SINGLE_EVENT\((.*?),')
+ if r.search(proto):
+ tracepointname = r.group(1)
+
+ r = KernRe(r'DEFINE_EVENT\((.*?),(.*?),')
+ if r.search(proto):
+ tracepointname = r.group(2)
+
+ if tracepointname:
+ tracepointname = tracepointname.lstrip()
+
+ r = KernRe(r'TP_PROTO\((.*?)\)')
+ if r.search(proto):
+ tracepointargs = r.group(1)
+
+ if not tracepointname or not tracepointargs:
+ self.emit_msg(ln,
+ f"Unrecognized tracepoint format:\n{proto}\n")
+ else:
+ proto = f"static inline void trace_{tracepointname}({tracepointargs})"
+ self.entry.identifier = f"trace_{self.entry.identifier}"
+
+ return proto
+
+ def process_proto_function(self, ln, line):
+ """Ancillary routine to process a function prototype"""
+
+ # strip C99-style comments to end of line
+ line = KernRe(r"\/\/.*$", re.S).sub('', line)
+ #
+ # Soak up the line's worth of prototype text, stopping at { or ; if present.
+ #
+ if KernRe(r'\s*#\s*define').match(line):
+ self.entry.prototype = line
+ elif not line.startswith('#'): # skip other preprocessor stuff
+ r = KernRe(r'([^\{]*)')
+ if r.match(line):
+ self.entry.prototype += r.group(1) + " "
+ #
+ # If we now have the whole prototype, clean it up and declare victory.
+ #
+ if '{' in line or ';' in line or KernRe(r'\s*#\s*define').match(line):
+ # strip comments and surrounding spaces
+ self.entry.prototype = KernRe(r'/\*.*\*/').sub('', self.entry.prototype).strip()
+ #
+ # Handle self.entry.prototypes for function pointers like:
+ # int (*pcs_config)(struct foo)
+ # by turning it into
+ # int pcs_config(struct foo)
+ #
+ r = KernRe(r'^(\S+\s+)\(\s*\*(\S+)\)')
+ self.entry.prototype = r.sub(r'\1\2', self.entry.prototype)
+ #
+ # Handle special declaration syntaxes
+ #
+ if 'SYSCALL_DEFINE' in self.entry.prototype:
+ self.entry.prototype = self.syscall_munge(ln,
+ self.entry.prototype)
+ else:
+ r = KernRe(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT')
+ if r.search(self.entry.prototype):
+ self.entry.prototype = self.tracepoint_munge(ln,
+ self.entry.prototype)
+ #
+ # ... and we're done
+ #
+ self.dump_function(ln, self.entry.prototype)
+ self.reset_state(ln)
+
+ def process_proto_type(self, ln, line):
+ """Ancillary routine to process a type"""
+
+ # Strip C99-style comments and surrounding whitespace
+ line = KernRe(r"//.*$", re.S).sub('', line).strip()
+ if not line:
+ return # nothing to see here
+
+ # To distinguish preprocessor directive from regular declaration later.
+ if line.startswith('#'):
+ line += ";"
+ #
+ # Split the declaration on any of { } or ;, and accumulate pieces
+ # until we hit a semicolon while not inside {brackets}
+ #
+ r = KernRe(r'(.*?)([{};])')
+ for chunk in r.split(line):
+ if chunk: # Ignore empty matches
+ self.entry.prototype += chunk
+ #
+ # This cries out for a match statement ... someday after we can
+ # drop Python 3.9 ...
+ #
+ if chunk == '{':
+ self.entry.brcount += 1
+ elif chunk == '}':
+ self.entry.brcount -= 1
+ elif chunk == ';' and self.entry.brcount <= 0:
+ self.dump_declaration(ln, self.entry.prototype)
+ self.reset_state(ln)
+ return
+ #
+ # We hit the end of the line while still in the declaration; put
+ # in a space to represent the newline.
+ #
+ self.entry.prototype += ' '
+
+ def process_proto(self, ln, line):
+ """STATE_PROTO: reading a function/whatever prototype."""
+
+ if doc_inline_oneline.search(line):
+ self.entry.begin_section(ln, doc_inline_oneline.group(1))
+ self.entry.add_text(doc_inline_oneline.group(2))
+ self.dump_section()
+
+ elif doc_inline_start.search(line):
+ self.state = state.INLINE_NAME
+
+ elif self.entry.decl_type == 'function':
+ self.process_proto_function(ln, line)
+
+ else:
+ self.process_proto_type(ln, line)
+
+ def process_docblock(self, ln, line):
+ """STATE_DOCBLOCK: within a DOC: block."""
+
+ if doc_end.search(line):
+ self.dump_section()
+ self.output_declaration("doc", self.entry.identifier)
+ self.reset_state(ln)
+
+ elif doc_content.search(line):
+ self.entry.add_text(doc_content.group(1))
+
+ def parse_export(self):
+ """
+ Parses EXPORT_SYMBOL* macros from a single Kernel source file.
+ """
+
+ export_table = set()
+
+ try:
+ with open(self.fname, "r", encoding="utf8",
+ errors="backslashreplace") as fp:
+
+ for line in fp:
+ self.process_export(export_table, line)
+
+ except IOError:
+ return None
+
+ return export_table
+
+ #
+ # The state/action table telling us which function to invoke in
+ # each state.
+ #
+ state_actions = {
+ state.NORMAL: process_normal,
+ state.NAME: process_name,
+ state.BODY: process_body,
+ state.DECLARATION: process_decl,
+ state.SPECIAL_SECTION: process_special,
+ state.INLINE_NAME: process_inline_name,
+ state.INLINE_TEXT: process_inline_text,
+ state.PROTO: process_proto,
+ state.DOCBLOCK: process_docblock,
+ }
+
+ def parse_kdoc(self):
+ """
+ Open and process each line of a C source file.
+ The parsing is controlled via a state machine, and the line is passed
+ to a different process function depending on the state. The process
+ function may update the state as needed.
+
+ Besides parsing kernel-doc tags, it also parses export symbols.
+ """
+
+ prev = ""
+ prev_ln = None
+ export_table = set()
+
+ try:
+ with open(self.fname, "r", encoding="utf8",
+ errors="backslashreplace") as fp:
+ for ln, line in enumerate(fp):
+
+ line = line.expandtabs().strip("\n")
+
+ # Group continuation lines on prototypes
+ if self.state == state.PROTO:
+ if line.endswith("\\"):
+ prev += line.rstrip("\\")
+ if not prev_ln:
+ prev_ln = ln
+ continue
+
+ if prev:
+ ln = prev_ln
+ line = prev + line
+ prev = ""
+ prev_ln = None
+
+ self.config.log.debug("%d %s: %s",
+ ln, state.name[self.state],
+ line)
+
+ # This is an optimization over the original script.
+ # There, when export_file was used for the same file,
+ # it was read twice. Here, we use the already-existing
+ # loop to parse exported symbols as well.
+ #
+ if (self.state != state.NORMAL) or \
+ not self.process_export(export_table, line):
+ # Hand this line to the appropriate state handler
+ self.state_actions[self.state](self, ln, line)
+
+ except OSError:
+ self.config.log.error(f"Error: Cannot open file {self.fname}")
+
+ return export_table, self.entries
diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
new file mode 100644
index 00000000000..612223e1e72
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_re.py
@@ -0,0 +1,270 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
+
+"""
+Regular expression ancillary classes.
+
+Those help caching regular expressions and do matching for kernel-doc.
+"""
+
+import re
+
+# Local cache for regular expressions
+re_cache = {}
+
+
+class KernRe:
+ """
+ Helper class to simplify regex declaration and usage,
+
+ It calls re.compile for a given pattern. It also allows adding
+ regular expressions and define sub at class init time.
+
+ Regular expressions can be cached via an argument, helping to speedup
+ searches.
+ """
+
+ def _add_regex(self, string, flags):
+ """
+ Adds a new regex or re-use it from the cache.
+ """
+ self.regex = re_cache.get(string, None)
+ if not self.regex:
+ self.regex = re.compile(string, flags=flags)
+ if self.cache:
+ re_cache[string] = self.regex
+
+ def __init__(self, string, cache=True, flags=0):
+ """
+ Compile a regular expression and initialize internal vars.
+ """
+
+ self.cache = cache
+ self.last_match = None
+
+ self._add_regex(string, flags)
+
+ def __str__(self):
+ """
+ Return the regular expression pattern.
+ """
+ return self.regex.pattern
+
+ def __add__(self, other):
+ """
+ Allows adding two regular expressions into one.
+ """
+
+ return KernRe(str(self) + str(other), cache=self.cache or other.cache,
+ flags=self.regex.flags | other.regex.flags)
+
+ def match(self, string):
+ """
+ Handles a re.match storing its results
+ """
+
+ self.last_match = self.regex.match(string)
+ return self.last_match
+
+ def search(self, string):
+ """
+ Handles a re.search storing its results
+ """
+
+ self.last_match = self.regex.search(string)
+ return self.last_match
+
+ def findall(self, string):
+ """
+ Alias to re.findall
+ """
+
+ return self.regex.findall(string)
+
+ def split(self, string):
+ """
+ Alias to re.split
+ """
+
+ return self.regex.split(string)
+
+ def sub(self, sub, string, count=0):
+ """
+ Alias to re.sub
+ """
+
+ return self.regex.sub(sub, string, count=count)
+
+ def group(self, num):
+ """
+ Returns the group results of the last match
+ """
+
+ return self.last_match.group(num)
+
+
+class NestedMatch:
+ """
+ Finding nested delimiters is hard with regular expressions. It is
+ even harder on Python with its normal re module, as there are several
+ advanced regular expressions that are missing.
+
+ This is the case of this pattern:
+
+ '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;'
+
+ which is used to properly match open/close parenthesis of the
+ string search STRUCT_GROUP(),
+
+ Add a class that counts pairs of delimiters, using it to match and
+ replace nested expressions.
+
+ The original approach was suggested by:
+ https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
+
+ Although I re-implemented it to make it more generic and match 3 types
+ of delimiters. The logic checks if delimiters are paired. If not, it
+ will ignore the search string.
+ """
+
+ # TODO: make NestedMatch handle multiple match groups
+ #
+ # Right now, regular expressions to match it are defined only up to
+ # the start delimiter, e.g.:
+ #
+ # \bSTRUCT_GROUP\(
+ #
+ # is similar to: STRUCT_GROUP\((.*)\)
+ # except that the content inside the match group is delimiter's aligned.
+ #
+ # The content inside parenthesis are converted into a single replace
+ # group (e.g. r`\1').
+ #
+ # It would be nice to change such definition to support multiple
+ # match groups, allowing a regex equivalent to.
+ #
+ # FOO\((.*), (.*), (.*)\)
+ #
+ # it is probably easier to define it not as a regular expression, but
+ # with some lexical definition like:
+ #
+ # FOO(arg1, arg2, arg3)
+
+ DELIMITER_PAIRS = {
+ '{': '}',
+ '(': ')',
+ '[': ']',
+ }
+
+ RE_DELIM = re.compile(r'[\{\}\[\]\(\)]')
+
+ def _search(self, regex, line):
+ """
+ Finds paired blocks for a regex that ends with a delimiter.
+
+ The suggestion of using finditer to match pairs came from:
+ https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
+ but I ended using a different implementation to align all three types
+ of delimiters and seek for an initial regular expression.
+
+ The algorithm seeks for open/close paired delimiters and place them
+ into a stack, yielding a start/stop position of each match when the
+ stack is zeroed.
+
+ The algorithm should work fine for properly paired lines, but will
+ silently ignore end delimiters that precede a start delimiter.
+ This should be OK for kernel-doc parser, as unaligned delimiters
+ would cause compilation errors. So, we don't need to raise exceptions
+ to cover such issues.
+ """
+
+ stack = []
+
+ for match_re in regex.finditer(line):
+ start = match_re.start()
+ offset = match_re.end()
+
+ d = line[offset - 1]
+ if d not in self.DELIMITER_PAIRS:
+ continue
+
+ end = self.DELIMITER_PAIRS[d]
+ stack.append(end)
+
+ for match in self.RE_DELIM.finditer(line[offset:]):
+ pos = match.start() + offset
+
+ d = line[pos]
+
+ if d in self.DELIMITER_PAIRS:
+ end = self.DELIMITER_PAIRS[d]
+
+ stack.append(end)
+ continue
+
+ # Does the end delimiter match what it is expected?
+ if stack and d == stack[-1]:
+ stack.pop()
+
+ if not stack:
+ yield start, offset, pos + 1
+ break
+
+ def search(self, regex, line):
+ """
+ This is similar to re.search:
+
+ It matches a regex that it is followed by a delimiter,
+ returning occurrences only if all delimiters are paired.
+ """
+
+ for t in self._search(regex, line):
+
+ yield line[t[0]:t[2]]
+
+ def sub(self, regex, sub, line, count=0):
+ """
+ This is similar to re.sub:
+
+ It matches a regex that it is followed by a delimiter,
+ replacing occurrences only if all delimiters are paired.
+
+ if r'\1' is used, it works just like re: it places there the
+ matched paired data with the delimiter stripped.
+
+ If count is different than zero, it will replace at most count
+ items.
+ """
+ out = ""
+
+ cur_pos = 0
+ n = 0
+
+ for start, end, pos in self._search(regex, line):
+ out += line[cur_pos:start]
+
+ # Value, ignoring start/end delimiters
+ value = line[end:pos - 1]
+
+ # replaces \1 at the sub string, if \1 is used there
+ new_sub = sub
+ new_sub = new_sub.replace(r'\1', value)
+
+ out += new_sub
+
+ # Drop end ';' if any
+ if line[pos] == ';':
+ pos += 1
+
+ cur_pos = pos
+ n += 1
+
+ if count and n >= count:
+ break
+
+ # Append the remaining string
+ l = len(line)
+ out += line[cur_pos:l]
+
+ return out
--
2.43.0
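For readers who want to see the delimiter-counting idea from the
NestedMatch docstring in isolation, here is a minimal standalone sketch.
It is not the imported code: balanced_spans() and the sample line are
made up for illustration, and it only assumes the standard re module.

```python
import re

# Opening delimiter -> the closing delimiter we then expect.
PAIRS = {'(': ')', '[': ']', '{': '}'}
DELIM = re.compile(r'[(){}\[\]]')


def balanced_spans(prefix, line):
    """Yield (start, inner_start, end) for each match of `prefix` whose
    delimiters are properly paired. `prefix` must end with an opening
    delimiter, e.g. r'\bFOO\('."""
    for m in re.finditer(prefix, line):
        opener = line[m.end() - 1]
        if opener not in PAIRS:
            continue
        stack = [PAIRS[opener]]
        # Walk the delimiters after the prefix, tracking what must close.
        for d in DELIM.finditer(line, m.end()):
            ch = d.group()
            if ch in PAIRS:
                stack.append(PAIRS[ch])
            elif stack and ch == stack[-1]:
                stack.pop()
                if not stack:
                    yield m.start(), m.end(), d.end()
                    break


line = "struct_group(a, f(b, c[2])); int x;"
spans = list(balanced_spans(r'\bstruct_group\(', line))
matched = [line[s:e] for s, _, e in spans]
inner = [line[i:e - 1] for _, i, e in spans]
unbalanced = list(balanced_spans(r'foo\(', "foo(a, (b"))
```

With the sample line, matched holds the whole struct_group(...) call
and inner holds its argument text; the unbalanced input yields nothing.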
* [PATCH for-10.2 4/8] scripts/kernel-doc: strip QEMU_ from function definitions
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
` (2 preceding siblings ...)
2025-08-14 17:13 ` [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-15 10:01 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 5/8] scripts/kernel-doc: tweak for QEMU coding standards Peter Maydell
` (5 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
This commit is the Python version of our older commit
b30df2751e5 ("scripts/kernel-doc: strip QEMU_ from function definitions").
Some versions of Sphinx get confused if function attributes are
left on the C code from kernel-doc; strip out any QEMU_* prefixes
from function prototypes.
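As a rough illustration of what the added pattern does, here is a
standalone sketch using plain re. The prototype string and the helper
name strip_qemu_attrs() are invented for the example; only the regex
itself is taken from the patch.

```python
import re

# Same pattern as the entry added to the substitution list in the patch.
QEMU_ATTR = re.compile(r"QEMU_[A-Z_]+ +")


def strip_qemu_attrs(prototype):
    """Remove QEMU_* annotations (and the spaces after them)."""
    return QEMU_ATTR.sub("", prototype)


# Hypothetical prototype, only here to exercise the pattern.
proto = "int QEMU_WARN_UNUSED_RESULT qemu_frob(int fd)"
stripped = strip_qemu_attrs(proto)
```

Note that the pattern requires at least one space after the annotation,
so an annotation at the very end of a line is left alone.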
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
scripts/lib/kdoc/kdoc_parser.py | 1 +
1 file changed, 1 insertion(+)
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index fe730099eca..32b43562929 100644
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -907,6 +907,7 @@ def dump_function(self, ln, prototype):
(r"^__always_inline +", "", 0),
(r"^noinline +", "", 0),
(r"^__FORTIFY_INLINE +", "", 0),
+ (r"QEMU_[A-Z_]+ +", "", 0),
(r"__init +", "", 0),
(r"__init_or_module +", "", 0),
(r"__deprecated +", "", 0),
--
2.43.0
* [PATCH for-10.2 5/8] scripts/kernel-doc: tweak for QEMU coding standards
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
` (3 preceding siblings ...)
2025-08-14 17:13 ` [PATCH for-10.2 4/8] scripts/kernel-doc: strip QEMU_ from function definitions Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-15 10:34 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 6/8] scripts/kerneldoc: Switch to the Python kernel-doc script Peter Maydell
` (4 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
This commit makes the equivalent changes to the Python script that we
had for the old Perl script in commit 4cf41794411f ("docs: tweak
kernel-doc for QEMU coding standards"). To repeat the rationale from
that commit:
Surprisingly, QEMU does have a pretty consistent doc comment style and
it is not very different from the Linux kernel's. Of the documentation
"sigils", only "#" separates the QEMU doc comment style from Linux's,
and it has 200+ instances vs. 6 for the kernel's '&struct foo' (all in
accel/tcg/translate-all.c), so it's clear that the two standards are
different in this respect. In addition, our structs are typedefed and
recognized by CamelCase names.
Note that in 4cf41794411f we used '(?!)' as our type_fallback regex;
strictly speaking, this is not quite a replacement for the upstream
'\&([_\w]+)', because the latter includes a group that can later be
matched with \1, and the former does not. The old Perl script did
not care about this, but the Python version does, so we must include
the extra set of brackets to ensure we have a group.
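The difference can be seen with plain re (a sketch, not the kdoc code:
try_sub() is a made-up helper and the replacement string is only
representative of the kind of \1 template used for RST output). Python
validates group references in the replacement template against the
pattern, so a \1 with no group 1 is rejected even though the pattern can
never match:

```python
import re

no_group = re.compile(r"(?!)")      # never matches, has no groups
with_group = re.compile(r"((?!))")  # never matches, but has group 1


def try_sub(regex, text):
    """Apply a replacement that refers to group 1; report re errors."""
    try:
        return regex.sub(r":c:type:`\1`", text)
    except re.error as exc:
        return f"error: {exc}"


bad = try_sub(no_group, "see #Foo")
good = try_sub(with_group, "see #Foo")
```

Here bad is an error string about an invalid group reference, while
good is the input text unchanged.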
This commit does not include all the same changes that 4cf41794411f
did. Of the missing pieces, some had already gone in an earlier
kernel-doc update; the parts we still had but do not include here are:
@@ -2057,7 +2060,7 @@
}
elsif (/$doc_decl/o) {
$identifier = $1;
- if (/\s*([\w\s]+?)(\(\))?\s*-/) {
+ if (/\s*([\w\s]+?)(\s*-|:)/) {
$identifier = $1;
}
@@ -2067,7 +2070,7 @@
$contents = "";
$section = $section_default;
$new_start_line = $. + 1;
- if (/-(.*)/) {
+ if (/[-:](.*)/) {
# strip leading/trailing/multiple spaces
$descr= $1;
$descr =~ s/^\s*//;
The second of these is already in the upstream version: the line r =
KernRe("[-:](.*)") in process_name() matches the regex we have. The
first change has been refactored into the doc_begin_data and
doc_begin_func changes. Since the output HTML for QEMU's
documentation has no relevant changes with the new kerneldoc, we
assume that this too has been handled upstream.
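For reference, a small sketch of how a "[-:](.*)" pattern accepts both
separators. This uses plain re rather than KernRe, and
short_description() and the sample comment lines are hypothetical:

```python
import re

# Same pattern as quoted above from process_name().
descr_re = re.compile(r"[-:](.*)")


def short_description(line):
    """Return the text after the first '-' or ':' separator, if any."""
    m = descr_re.search(line)
    return m.group(1).strip() if m else None


kernel_style = short_description(" * qemu_frob - frobnicate a device")
qemu_style = short_description(" * qemu_frob: frobnicate a device")
no_descr = short_description(" * qemu_frob")
```

Both the kernel-style " - " and the QEMU-style ":" forms produce the
same description text.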
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
scripts/lib/kdoc/kdoc_output.py | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index ea8914537ba..39fa872dfca 100644
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -38,12 +38,12 @@
type_fp_param2 = KernRe(r"\@(\w+->\S+)\(\)", cache=False)
type_env = KernRe(r"(\$\w+)", cache=False)
-type_enum = KernRe(r"\&(enum\s*([_\w]+))", cache=False)
-type_struct = KernRe(r"\&(struct\s*([_\w]+))", cache=False)
-type_typedef = KernRe(r"\&(typedef\s*([_\w]+))", cache=False)
-type_union = KernRe(r"\&(union\s*([_\w]+))", cache=False)
-type_member = KernRe(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
-type_fallback = KernRe(r"\&([_\w]+)", cache=False)
+type_enum = KernRe(r"#(enum\s*([_\w]+))", cache=False)
+type_struct = KernRe(r"#(struct\s*([_\w]+))", cache=False)
+type_typedef = KernRe(r"#(([A-Z][_\w]*))", cache=False)
+type_union = KernRe(r"#(union\s*([_\w]+))", cache=False)
+type_member = KernRe(r"#([_\w]+)(\.|->)([_\w]+)", cache=False)
+type_fallback = KernRe(r"((?!))", cache=False) # this never matches
type_member_func = type_member + KernRe(r"\(\)", cache=False)
--
2.43.0
* [PATCH for-10.2 6/8] scripts/kerneldoc: Switch to the Python kernel-doc script
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
` (4 preceding siblings ...)
2025-08-14 17:13 ` [PATCH for-10.2 5/8] scripts/kernel-doc: tweak for QEMU coding standards Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 7/8] scripts/kernel-doc: Delete the old Perl " Peter Maydell
` (3 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
Change the Sphinx config to run the new Python kernel-doc script
instead of the Perl one. The only difference between the two is that
the new script does not handle the -sphinx-version option, instead
assuming that Sphinx is always at least version 3: so we must
delete the code that passes that option to avoid the Python
script complaining about an unknown option.
QEMU's minimum Sphinx version is already 3.4.3, so this doesn't
change the set of versions we can handle.
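The resulting command line can be sketched as below. This is only an
illustration: kerneldoc_command() is a made-up helper, not part of
docs/conf.py or the Sphinx extension, and the docdir argument stands in
for qemu_docdir.

```python
import os
import sys


def kerneldoc_command(docdir, extra_args=()):
    """Build the argv used to run kernel-doc.py with the current Python."""
    script = os.path.join(docdir, '../scripts/kernel-doc.py')
    # sys.executable is the interpreter running Sphinx itself, so we do
    # not depend on whatever "python3" env would find via the #! line.
    return [sys.executable, script, '-rst', '-enable-lineno', *extra_args]


cmd = kerneldoc_command('/src/qemu/docs', ['-Werror'])
```

Using sys.executable keeps the script on the same interpreter (and
virtual environment) as the Sphinx build that invokes it.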
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
For quite a long time during development of this patch series
I was still letting Sphinx invoke the new Python script
by calling perl, which will read the #! line and invoke
env which then finds python3 and runs it...
---
docs/conf.py | 4 +++-
docs/sphinx/kerneldoc.py | 5 -----
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/docs/conf.py b/docs/conf.py
index f892a6e1da3..e09769e5f83 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -341,7 +341,9 @@
# We use paths starting from qemu_docdir here so that you can run
# sphinx-build from anywhere and the kerneldoc extension can still
# find everything.
-kerneldoc_bin = ['perl', os.path.join(qemu_docdir, '../scripts/kernel-doc')]
+# Since kernel-doc is now a Python script, we should run it with whatever
+# Python this sphinx is using (rather than letting it find one via env)
+kerneldoc_bin = [sys.executable, os.path.join(qemu_docdir, '../scripts/kernel-doc.py')]
kerneldoc_srctree = os.path.join(qemu_docdir, '..')
hxtool_srctree = os.path.join(qemu_docdir, '..')
qapidoc_srctree = os.path.join(qemu_docdir, '..')
diff --git a/docs/sphinx/kerneldoc.py b/docs/sphinx/kerneldoc.py
index 30bb3431983..9721072e476 100644
--- a/docs/sphinx/kerneldoc.py
+++ b/docs/sphinx/kerneldoc.py
@@ -63,11 +63,6 @@ def run(self):
env = self.state.document.settings.env
cmd = env.config.kerneldoc_bin + ['-rst', '-enable-lineno']
- # Pass the version string to kernel-doc, as it needs to use a different
- # dialect, depending what the C domain supports for each specific
- # Sphinx versions
- cmd += ['-sphinx-version', sphinx.__version__]
-
# Pass through the warnings-as-errors flag
if env.config.kerneldoc_werror:
cmd += ['-Werror']
--
2.43.0
* [PATCH for-10.2 7/8] scripts/kernel-doc: Delete the old Perl kernel-doc script
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
` (5 preceding siblings ...)
2025-08-14 17:13 ` [PATCH for-10.2 6/8] scripts/kerneldoc: Switch to the Python kernel-doc script Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-15 10:35 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 8/8] MAINTAINERS: Put kernel-doc under the "docs build machinery" section Peter Maydell
` (2 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
We can now delete the old Perl kernel-doc script. For posterity,
this is a complete diff of the local changes that we were carrying
between the kernel's Perl script as of kernel commit 72b97d0b911872ba
(the last time we synced it) and our local copy:
--- /tmp/kdoc 2025-08-14 10:42:47.620331939 +0100
+++ scripts/kernel-doc 2025-02-17 10:44:34.528421457 +0000
@@ -1,5 +1,5 @@
#!/usr/bin/env perl
-# SPDX-License-Identifier: GPL-2.0
+# SPDX-License-Identifier: GPL-2.0-only
use warnings;
use strict;
@@ -224,12 +224,12 @@
my $type_fp_param = '\@(\w+)\(\)'; # Special RST handling for func ptr params
my $type_fp_param2 = '\@(\w+->\S+)\(\)'; # Special RST handling for structs with func ptr params
my $type_env = '(\$\w+)';
-my $type_enum = '\&(enum\s*([_\w]+))';
-my $type_struct = '\&(struct\s*([_\w]+))';
-my $type_typedef = '\&(typedef\s*([_\w]+))';
-my $type_union = '\&(union\s*([_\w]+))';
-my $type_member = '\&([_\w]+)(\.|->)([_\w]+)';
-my $type_fallback = '\&([_\w]+)';
+my $type_enum = '#(enum\s*([_\w]+))';
+my $type_struct = '#(struct\s*([_\w]+))';
+my $type_typedef = '#(([A-Z][_\w]*))';
+my $type_union = '#(union\s*([_\w]+))';
+my $type_member = '#([_\w]+)(\.|->)([_\w]+)';
+my $type_fallback = '(?!)'; # this never matches
my $type_member_func = $type_member . '\(\)';
# Output conversion substitutions.
@@ -1745,6 +1745,9 @@
)+
\)\)\s+//x;
+ # Strip QEMU specific compiler annotations
+ $prototype =~ s/QEMU_[A-Z_]+ +//;
+
# Yes, this truly is vile. We are looking for:
# 1. Return type (may be nothing if we're looking at a macro)
# 2. Function name
@@ -2057,7 +2060,7 @@
}
elsif (/$doc_decl/o) {
$identifier = $1;
- if (/\s*([\w\s]+?)(\(\))?\s*-/) {
+ if (/\s*([\w\s]+?)(\s*-|:)/) {
$identifier = $1;
}
@@ -2067,7 +2070,7 @@
$contents = "";
$section = $section_default;
$new_start_line = $. + 1;
- if (/-(.*)/) {
+ if (/[-:](.*)/) {
# strip leading/trailing/multiple spaces
$descr= $1;
$descr =~ s/^\s*//;
These changes correspond to:
06e2329636f license: Update deprecated SPDX tag GPL-2.0 to GPL-2.0-only
(a bulk change which we won't bother to re-apply to this third-party script)
b30df2751e5 scripts/kernel-doc: strip QEMU_ from function definitions
4cf41794411 docs: tweak kernel-doc for QEMU coding standards
We have already applied the equivalent of these changes to the
Python code in scripts/lib/kdoc/ in the preceding commits.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
.editorconfig | 2 +-
scripts/kernel-doc | 2442 --------------------------------------------
2 files changed, 1 insertion(+), 2443 deletions(-)
delete mode 100755 scripts/kernel-doc
diff --git a/.editorconfig b/.editorconfig
index a04cb9054cb..258d41ab485 100644
--- a/.editorconfig
+++ b/.editorconfig
@@ -55,7 +55,7 @@ indent_size = 4
emacs_mode = perl
# but user kernel "style" for imported scripts
-[scripts/{kernel-doc,get_maintainer.pl,checkpatch.pl}]
+[scripts/{get_maintainer.pl,checkpatch.pl}]
indent_style = tab
indent_size = 8
emacs_mode = perl
diff --git a/scripts/kernel-doc b/scripts/kernel-doc
deleted file mode 100755
index fec83f53eda..00000000000
--- a/scripts/kernel-doc
+++ /dev/null
@@ -1,2442 +0,0 @@
-#!/usr/bin/env perl
-# SPDX-License-Identifier: GPL-2.0-only
-
-use warnings;
-use strict;
-
-## Copyright (c) 1998 Michael Zucchi, All Rights Reserved ##
-## Copyright (C) 2000, 1 Tim Waugh <twaugh@redhat.com> ##
-## Copyright (C) 2001 Simon Huggins ##
-## Copyright (C) 2005-2012 Randy Dunlap ##
-## Copyright (C) 2012 Dan Luedtke ##
-## ##
-## #define enhancements by Armin Kuster <akuster@mvista.com> ##
-## Copyright (c) 2000 MontaVista Software, Inc. ##
-## ##
-## This software falls under the GNU General Public License. ##
-## Please read the COPYING file for more information ##
-
-# 18/01/2001 - Cleanups
-# Functions prototyped as foo(void) same as foo()
-# Stop eval'ing where we don't need to.
-# -- huggie@earth.li
-
-# 27/06/2001 - Allowed whitespace after initial "/**" and
-# allowed comments before function declarations.
-# -- Christian Kreibich <ck@whoop.org>
-
-# Still to do:
-# - add perldoc documentation
-# - Look more closely at some of the scarier bits :)
-
-# 26/05/2001 - Support for separate source and object trees.
-# Return error code.
-# Keith Owens <kaos@ocs.com.au>
-
-# 23/09/2001 - Added support for typedefs, structs, enums and unions
-# Support for Context section; can be terminated using empty line
-# Small fixes (like spaces vs. \s in regex)
-# -- Tim Jansen <tim@tjansen.de>
-
-# 25/07/2012 - Added support for HTML5
-# -- Dan Luedtke <mail@danrl.de>
-
-sub usage {
- my $message = <<"EOF";
-Usage: $0 [OPTION ...] FILE ...
-
-Read C language source or header FILEs, extract embedded documentation comments,
-and print formatted documentation to standard output.
-
-The documentation comments are identified by "/**" opening comment mark. See
-Documentation/doc-guide/kernel-doc.rst for the documentation comment syntax.
-
-Output format selection (mutually exclusive):
- -man Output troff manual page format. This is the default.
- -rst Output reStructuredText format.
- -none Do not output documentation, only warnings.
-
-Output format selection modifier (affects only ReST output):
-
- -sphinx-version Use the ReST C domain dialect compatible with an
- specific Sphinx Version.
- If not specified, kernel-doc will auto-detect using
- the sphinx-build version found on PATH.
-
-Output selection (mutually exclusive):
- -export Only output documentation for symbols that have been
- exported using EXPORT_SYMBOL() or EXPORT_SYMBOL_GPL()
- in any input FILE or -export-file FILE.
- -internal Only output documentation for symbols that have NOT been
- exported using EXPORT_SYMBOL() or EXPORT_SYMBOL_GPL()
- in any input FILE or -export-file FILE.
- -function NAME Only output documentation for the given function(s)
- or DOC: section title(s). All other functions and DOC:
- sections are ignored. May be specified multiple times.
- -nosymbol NAME Exclude the specified symbols from the output
- documentation. May be specified multiple times.
-
-Output selection modifiers:
- -no-doc-sections Do not output DOC: sections.
- -enable-lineno Enable output of #define LINENO lines. Only works with
- reStructuredText format.
- -export-file FILE Specify an additional FILE in which to look for
- EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL(). To be used with
- -export or -internal. May be specified multiple times.
-
-Other parameters:
- -v Verbose output, more warnings and other information.
- -h Print this help.
- -Werror Treat warnings as errors.
-
-EOF
- print $message;
- exit 1;
-}
-
-#
-# format of comments.
-# In the following table, (...)? signifies optional structure.
-# (...)* signifies 0 or more structure elements
-# /**
-# * function_name(:)? (- short description)?
-# (* @parameterx: (description of parameter x)?)*
-# (* a blank line)?
-# * (Description:)? (Description of function)?
-# * (section header: (section description)? )*
-# (*)?*/
-#
-# So .. the trivial example would be:
-#
-# /**
-# * my_function
-# */
-#
-# If the Description: header tag is omitted, then there must be a blank line
-# after the last parameter specification.
-# e.g.
-# /**
-# * my_function - does my stuff
-# * @my_arg: its mine damnit
-# *
-# * Does my stuff explained.
-# */
-#
-# or, could also use:
-# /**
-# * my_function - does my stuff
-# * @my_arg: its mine damnit
-# * Description: Does my stuff explained.
-# */
-# etc.
-#
-# Besides functions you can also write documentation for structs, unions,
-# enums and typedefs. Instead of the function name you must write the name
-# of the declaration; the struct/union/enum/typedef must always precede
-# the name. Nesting of declarations is not supported.
-# Use the argument mechanism to document members or constants.
-# e.g.
-# /**
-# * struct my_struct - short description
-# * @a: first member
-# * @b: second member
-# *
-# * Longer description
-# */
-# struct my_struct {
-# int a;
-# int b;
-# /* private: */
-# int c;
-# };
-#
-# All descriptions can be multiline, except the short function description.
-#
-# For really longs structs, you can also describe arguments inside the
-# body of the struct.
-# eg.
-# /**
-# * struct my_struct - short description
-# * @a: first member
-# * @b: second member
-# *
-# * Longer description
-# */
-# struct my_struct {
-# int a;
-# int b;
-# /**
-# * @c: This is longer description of C
-# *
-# * You can use paragraphs to describe arguments
-# * using this method.
-# */
-# int c;
-# };
-#
-# This should be use only for struct/enum members.
-#
-# You can also add additional sections. When documenting kernel functions you
-# should document the "Context:" of the function, e.g. whether the functions
-# can be called form interrupts. Unlike other sections you can end it with an
-# empty line.
-# A non-void function should have a "Return:" section describing the return
-# value(s).
-# Example-sections should contain the string EXAMPLE so that they are marked
-# appropriately in DocBook.
-#
-# Example:
-# /**
-# * user_function - function that can only be called in user context
-# * @a: some argument
-# * Context: !in_interrupt()
-# *
-# * Some description
-# * Example:
-# * user_function(22);
-# */
-# ...
-#
-#
-# All descriptive text is further processed, scanning for the following special
-# patterns, which are highlighted appropriately.
-#
-# 'funcname()' - function
-# '$ENVVAR' - environmental variable
-# '&struct_name' - name of a structure (up to two words including 'struct')
-# '&struct_name.member' - name of a structure member
-# '@parameter' - name of a parameter
-# '%CONST' - name of a constant.
-# '``LITERAL``' - literal string without any spaces on it.
-
-## init lots of data
-
-my $errors = 0;
-my $warnings = 0;
-my $anon_struct_union = 0;
-
-# match expressions used to find embedded type information
-my $type_constant = '\b``([^\`]+)``\b';
-my $type_constant2 = '\%([-_\w]+)';
-my $type_func = '(\w+)\(\)';
-my $type_param = '\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)';
-my $type_param_ref = '([\!]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)';
-my $type_fp_param = '\@(\w+)\(\)'; # Special RST handling for func ptr params
-my $type_fp_param2 = '\@(\w+->\S+)\(\)'; # Special RST handling for structs with func ptr params
-my $type_env = '(\$\w+)';
-my $type_enum = '#(enum\s*([_\w]+))';
-my $type_struct = '#(struct\s*([_\w]+))';
-my $type_typedef = '#(([A-Z][_\w]*))';
-my $type_union = '#(union\s*([_\w]+))';
-my $type_member = '#([_\w]+)(\.|->)([_\w]+)';
-my $type_fallback = '(?!)'; # this never matches
-my $type_member_func = $type_member . '\(\)';
-
-# Output conversion substitutions.
-# One for each output format
-
-# these are pretty rough
-my @highlights_man = (
- [$type_constant, "\$1"],
- [$type_constant2, "\$1"],
- [$type_func, "\\\\fB\$1\\\\fP"],
- [$type_enum, "\\\\fI\$1\\\\fP"],
- [$type_struct, "\\\\fI\$1\\\\fP"],
- [$type_typedef, "\\\\fI\$1\\\\fP"],
- [$type_union, "\\\\fI\$1\\\\fP"],
- [$type_param, "\\\\fI\$1\\\\fP"],
- [$type_param_ref, "\\\\fI\$1\$2\\\\fP"],
- [$type_member, "\\\\fI\$1\$2\$3\\\\fP"],
- [$type_fallback, "\\\\fI\$1\\\\fP"]
- );
-my $blankline_man = "";
-
-# rst-mode
-my @highlights_rst = (
- [$type_constant, "``\$1``"],
- [$type_constant2, "``\$1``"],
- # Note: need to escape () to avoid func matching later
- [$type_member_func, "\\:c\\:type\\:`\$1\$2\$3\\\\(\\\\) <\$1>`"],
- [$type_member, "\\:c\\:type\\:`\$1\$2\$3 <\$1>`"],
- [$type_fp_param, "**\$1\\\\(\\\\)**"],
- [$type_fp_param2, "**\$1\\\\(\\\\)**"],
- [$type_func, "\$1()"],
- [$type_enum, "\\:c\\:type\\:`\$1 <\$2>`"],
- [$type_struct, "\\:c\\:type\\:`\$1 <\$2>`"],
- [$type_typedef, "\\:c\\:type\\:`\$1 <\$2>`"],
- [$type_union, "\\:c\\:type\\:`\$1 <\$2>`"],
- # in rst this can refer to any type
- [$type_fallback, "\\:c\\:type\\:`\$1`"],
- [$type_param_ref, "**\$1\$2**"]
- );
-my $blankline_rst = "\n";
-
-# read arguments
-if ($#ARGV == -1) {
- usage();
-}
-
-my $kernelversion;
-my ($sphinx_major, $sphinx_minor, $sphinx_patch);
-
-my $dohighlight = "";
-
-my $verbose = 0;
-my $Werror = 0;
-my $output_mode = "rst";
-my $output_preformatted = 0;
-my $no_doc_sections = 0;
-my $enable_lineno = 0;
-my @highlights = @highlights_rst;
-my $blankline = $blankline_rst;
-my $modulename = "Kernel API";
-
-use constant {
- OUTPUT_ALL => 0, # output all symbols and doc sections
- OUTPUT_INCLUDE => 1, # output only specified symbols
- OUTPUT_EXPORTED => 2, # output exported symbols
- OUTPUT_INTERNAL => 3, # output non-exported symbols
-};
-my $output_selection = OUTPUT_ALL;
-my $show_not_found = 0; # No longer used
-
-my @export_file_list;
-
-my @build_time;
-if (defined($ENV{'KBUILD_BUILD_TIMESTAMP'}) &&
- (my $seconds = `date -d"${ENV{'KBUILD_BUILD_TIMESTAMP'}}" +%s`) ne '') {
- @build_time = gmtime($seconds);
-} else {
- @build_time = localtime;
-}
-
-my $man_date = ('January', 'February', 'March', 'April', 'May', 'June',
- 'July', 'August', 'September', 'October',
- 'November', 'December')[$build_time[4]] .
- " " . ($build_time[5]+1900);
-
-# Essentially these are globals.
-# They probably want to be tidied up, made more localised or something.
-# CAVEAT EMPTOR! Some of the others I localised may not want to be, which
-# could cause "use of undefined value" or other bugs.
-my ($function, %function_table, %parametertypes, $declaration_purpose);
-my %nosymbol_table = ();
-my $declaration_start_line;
-my ($type, $declaration_name, $return_type);
-my ($newsection, $newcontents, $prototype, $brcount, %source_map);
-
-if (defined($ENV{'KBUILD_VERBOSE'})) {
- $verbose = "$ENV{'KBUILD_VERBOSE'}";
-}
-
-if (defined($ENV{'KDOC_WERROR'})) {
- $Werror = "$ENV{'KDOC_WERROR'}";
-}
-
-if (defined($ENV{'KCFLAGS'})) {
- my $kcflags = "$ENV{'KCFLAGS'}";
-
- if ($kcflags =~ /Werror/) {
- $Werror = 1;
- }
-}
-
-# Generated docbook code is inserted in a template at a point where
-# docbook v3.1 requires a non-zero sequence of RefEntry's; see:
-# https://www.oasis-open.org/docbook/documentation/reference/html/refentry.html
-# We keep track of number of generated entries and generate a dummy
-# if needs be to ensure the expanded template can be postprocessed
-# into html.
-my $section_counter = 0;
-
-my $lineprefix="";
-
-# Parser states
-use constant {
- STATE_NORMAL => 0, # normal code
- STATE_NAME => 1, # looking for function name
- STATE_BODY_MAYBE => 2, # body - or maybe more description
- STATE_BODY => 3, # the body of the comment
- STATE_BODY_WITH_BLANK_LINE => 4, # the body, which has a blank line
- STATE_PROTO => 5, # scanning prototype
- STATE_DOCBLOCK => 6, # documentation block
- STATE_INLINE => 7, # gathering doc outside main block
-};
-my $state;
-my $in_doc_sect;
-my $leading_space;
-
-# Inline documentation state
-use constant {
- STATE_INLINE_NA => 0, # not applicable ($state != STATE_INLINE)
- STATE_INLINE_NAME => 1, # looking for member name (@foo:)
- STATE_INLINE_TEXT => 2, # looking for member documentation
- STATE_INLINE_END => 3, # done
- STATE_INLINE_ERROR => 4, # error - Comment without header was found.
- # Spit a warning as it's not
- # proper kernel-doc and ignore the rest.
-};
-my $inline_doc_state;
-
-#declaration types: can be
-# 'function', 'struct', 'union', 'enum', 'typedef'
-my $decl_type;
-
-my $doc_start = '^/\*\*\s*$'; # Allow whitespace at end of comment start.
-my $doc_end = '\*/';
-my $doc_com = '\s*\*\s*';
-my $doc_com_body = '\s*\* ?';
-my $doc_decl = $doc_com . '(\w+)';
-# @params and a strictly limited set of supported section names
-my $doc_sect = $doc_com .
- '\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:(.*)';
-my $doc_content = $doc_com_body . '(.*)';
-my $doc_block = $doc_com . 'DOC:\s*(.*)?';
-my $doc_inline_start = '^\s*/\*\*\s*$';
-my $doc_inline_sect = '\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)';
-my $doc_inline_end = '^\s*\*/\s*$';
-my $doc_inline_oneline = '^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$';
-my $export_symbol = '^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*;';
-
-my %parameterdescs;
-my %parameterdesc_start_lines;
-my @parameterlist;
-my %sections;
-my @sectionlist;
-my %section_start_lines;
-my $sectcheck;
-my $struct_actual;
-
-my $contents = "";
-my $new_start_line = 0;
-
-# the canonical section names. see also $doc_sect above.
-my $section_default = "Description"; # default section
-my $section_intro = "Introduction";
-my $section = $section_default;
-my $section_context = "Context";
-my $section_return = "Return";
-
-my $undescribed = "-- undescribed --";
-
-reset_state();
-
-while ($ARGV[0] =~ m/^--?(.*)/) {
- my $cmd = $1;
- shift @ARGV;
- if ($cmd eq "man") {
- $output_mode = "man";
- @highlights = @highlights_man;
- $blankline = $blankline_man;
- } elsif ($cmd eq "rst") {
- $output_mode = "rst";
- @highlights = @highlights_rst;
- $blankline = $blankline_rst;
- } elsif ($cmd eq "none") {
- $output_mode = "none";
- } elsif ($cmd eq "module") { # not needed for XML, inherits from calling document
- $modulename = shift @ARGV;
- } elsif ($cmd eq "function") { # to only output specific functions
- $output_selection = OUTPUT_INCLUDE;
- $function = shift @ARGV;
- $function_table{$function} = 1;
- } elsif ($cmd eq "nosymbol") { # Exclude specific symbols
- my $symbol = shift @ARGV;
- $nosymbol_table{$symbol} = 1;
- } elsif ($cmd eq "export") { # only exported symbols
- $output_selection = OUTPUT_EXPORTED;
- %function_table = ();
- } elsif ($cmd eq "internal") { # only non-exported symbols
- $output_selection = OUTPUT_INTERNAL;
- %function_table = ();
- } elsif ($cmd eq "export-file") {
- my $file = shift @ARGV;
- push(@export_file_list, $file);
- } elsif ($cmd eq "v") {
- $verbose = 1;
- } elsif ($cmd eq "Werror") {
- $Werror = 1;
- } elsif (($cmd eq "h") || ($cmd eq "help")) {
- usage();
- } elsif ($cmd eq 'no-doc-sections') {
- $no_doc_sections = 1;
- } elsif ($cmd eq 'enable-lineno') {
- $enable_lineno = 1;
- } elsif ($cmd eq 'show-not-found') {
- $show_not_found = 1; # A no-op but don't fail
- } elsif ($cmd eq "sphinx-version") {
- my $ver_string = shift @ARGV;
- if ($ver_string =~ m/^(\d+)(\.\d+)?(\.\d+)?/) {
- $sphinx_major = $1;
- if (defined($2)) {
- $sphinx_minor = substr($2,1);
- } else {
- $sphinx_minor = 0;
- }
- if (defined($3)) {
- $sphinx_patch = substr($3,1)
- } else {
- $sphinx_patch = 0;
- }
- } else {
- die "Sphinx version should either major.minor or major.minor.patch format\n";
- }
- } else {
- # Unknown argument
- usage();
- }
-}
-
-# continue execution near EOF;
-
-# The C domain dialect changed on Sphinx 3. So, we need to check the
-# version in order to produce the right tags.
-sub findprog($)
-{
- foreach(split(/:/, $ENV{PATH})) {
- return "$_/$_[0]" if(-x "$_/$_[0]");
- }
-}
-
-sub get_sphinx_version()
-{
- my $ver;
-
- my $cmd = "sphinx-build";
- if (!findprog($cmd)) {
- my $cmd = "sphinx-build3";
- if (!findprog($cmd)) {
- $sphinx_major = 1;
- $sphinx_minor = 2;
- $sphinx_patch = 0;
- printf STDERR "Warning: Sphinx version not found. Using default (Sphinx version %d.%d.%d)\n",
- $sphinx_major, $sphinx_minor, $sphinx_patch;
- return;
- }
- }
-
- open IN, "$cmd --version 2>&1 |";
- while (<IN>) {
- if (m/^\s*sphinx-build\s+([\d]+)\.([\d\.]+)(\+\/[\da-f]+)?$/) {
- $sphinx_major = $1;
- $sphinx_minor = $2;
- $sphinx_patch = $3;
- last;
- }
- # Sphinx 1.2.x uses a different format
- if (m/^\s*Sphinx.*\s+([\d]+)\.([\d\.]+)$/) {
- $sphinx_major = $1;
- $sphinx_minor = $2;
- $sphinx_patch = $3;
- last;
- }
- }
- close IN;
-}
-
-# get kernel version from env
-sub get_kernel_version() {
- my $version = 'unknown kernel version';
-
- if (defined($ENV{'KERNELVERSION'})) {
- $version = $ENV{'KERNELVERSION'};
- }
- return $version;
-}
-
-#
-sub print_lineno {
- my $lineno = shift;
- if ($enable_lineno && defined($lineno)) {
- print "#define LINENO " . $lineno . "\n";
- }
-}
-##
-# dumps section contents to arrays/hashes intended for that purpose.
-#
-sub dump_section {
- my $file = shift;
- my $name = shift;
- my $contents = join "\n", @_;
-
- if ($name =~ m/$type_param/) {
- $name = $1;
- $parameterdescs{$name} = $contents;
- $sectcheck = $sectcheck . $name . " ";
- $parameterdesc_start_lines{$name} = $new_start_line;
- $new_start_line = 0;
- } elsif ($name eq "@\.\.\.") {
- $name = "...";
- $parameterdescs{$name} = $contents;
- $sectcheck = $sectcheck . $name . " ";
- $parameterdesc_start_lines{$name} = $new_start_line;
- $new_start_line = 0;
- } else {
- if (defined($sections{$name}) && ($sections{$name} ne "")) {
- # Only warn on user specified duplicate section names.
- if ($name ne $section_default) {
- print STDERR "${file}:$.: warning: duplicate section name '$name'\n";
- ++$warnings;
- }
- $sections{$name} .= $contents;
- } else {
- $sections{$name} = $contents;
- push @sectionlist, $name;
- $section_start_lines{$name} = $new_start_line;
- $new_start_line = 0;
- }
- }
-}
-
-##
-# dump DOC: section after checking that it should go out
-#
-sub dump_doc_section {
- my $file = shift;
- my $name = shift;
- my $contents = join "\n", @_;
-
- if ($no_doc_sections) {
- return;
- }
-
- return if (defined($nosymbol_table{$name}));
-
- if (($output_selection == OUTPUT_ALL) ||
- (($output_selection == OUTPUT_INCLUDE) &&
- defined($function_table{$name})))
- {
- dump_section($file, $name, $contents);
- output_blockhead({'sectionlist' => \@sectionlist,
- 'sections' => \%sections,
- 'module' => $modulename,
- 'content-only' => ($output_selection != OUTPUT_ALL), });
- }
-}
-
-##
-# output function
-#
-# parameterdescs, a hash.
-# function => "function name"
-# parameterlist => @list of parameters
-# parameterdescs => %parameter descriptions
-# sectionlist => @list of sections
-# sections => %section descriptions
-#
-
-sub output_highlight {
- my $contents = join "\n",@_;
- my $line;
-
-# DEBUG
-# if (!defined $contents) {
-# use Carp;
-# confess "output_highlight got called with no args?\n";
-# }
-
-# print STDERR "contents b4:$contents\n";
- eval $dohighlight;
- die $@ if $@;
-# print STDERR "contents af:$contents\n";
-
- foreach $line (split "\n", $contents) {
- if (! $output_preformatted) {
- $line =~ s/^\s*//;
- }
- if ($line eq ""){
- if (! $output_preformatted) {
- print $lineprefix, $blankline;
- }
- } else {
- if ($output_mode eq "man" && substr($line, 0, 1) eq ".") {
- print "\\&$line";
- } else {
- print $lineprefix, $line;
- }
- }
- print "\n";
- }
-}
-
-##
-# output function in man
-sub output_function_man(%) {
- my %args = %{$_[0]};
- my ($parameter, $section);
- my $count;
-
- print ".TH \"$args{'function'}\" 9 \"$args{'function'}\" \"$man_date\" \"Kernel Hacker's Manual\" LINUX\n";
-
- print ".SH NAME\n";
- print $args{'function'} . " \\- " . $args{'purpose'} . "\n";
-
- print ".SH SYNOPSIS\n";
- if ($args{'functiontype'} ne "") {
- print ".B \"" . $args{'functiontype'} . "\" " . $args{'function'} . "\n";
- } else {
- print ".B \"" . $args{'function'} . "\n";
- }
- $count = 0;
- my $parenth = "(";
- my $post = ",";
- foreach my $parameter (@{$args{'parameterlist'}}) {
- if ($count == $#{$args{'parameterlist'}}) {
- $post = ");";
- }
- $type = $args{'parametertypes'}{$parameter};
- if ($type =~ m/([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)/) {
- # pointer-to-function
- print ".BI \"" . $parenth . $1 . "\" " . " \") (" . $2 . ")" . $post . "\"\n";
- } else {
- $type =~ s/([^\*])$/$1 /;
- print ".BI \"" . $parenth . $type . "\" " . " \"" . $post . "\"\n";
- }
- $count++;
- $parenth = "";
- }
-
- print ".SH ARGUMENTS\n";
- foreach $parameter (@{$args{'parameterlist'}}) {
- my $parameter_name = $parameter;
- $parameter_name =~ s/\[.*//;
-
- print ".IP \"" . $parameter . "\" 12\n";
- output_highlight($args{'parameterdescs'}{$parameter_name});
- }
- foreach $section (@{$args{'sectionlist'}}) {
- print ".SH \"", uc $section, "\"\n";
- output_highlight($args{'sections'}{$section});
- }
-}
-
-##
-# output enum in man
-sub output_enum_man(%) {
- my %args = %{$_[0]};
- my ($parameter, $section);
- my $count;
-
- print ".TH \"$args{'module'}\" 9 \"enum $args{'enum'}\" \"$man_date\" \"API Manual\" LINUX\n";
-
- print ".SH NAME\n";
- print "enum " . $args{'enum'} . " \\- " . $args{'purpose'} . "\n";
-
- print ".SH SYNOPSIS\n";
- print "enum " . $args{'enum'} . " {\n";
- $count = 0;
- foreach my $parameter (@{$args{'parameterlist'}}) {
- print ".br\n.BI \" $parameter\"\n";
- if ($count == $#{$args{'parameterlist'}}) {
- print "\n};\n";
- last;
- }
- else {
- print ", \n.br\n";
- }
- $count++;
- }
-
- print ".SH Constants\n";
- foreach $parameter (@{$args{'parameterlist'}}) {
- my $parameter_name = $parameter;
- $parameter_name =~ s/\[.*//;
-
- print ".IP \"" . $parameter . "\" 12\n";
- output_highlight($args{'parameterdescs'}{$parameter_name});
- }
- foreach $section (@{$args{'sectionlist'}}) {
- print ".SH \"$section\"\n";
- output_highlight($args{'sections'}{$section});
- }
-}
-
-##
-# output struct in man
-sub output_struct_man(%) {
- my %args = %{$_[0]};
- my ($parameter, $section);
-
- print ".TH \"$args{'module'}\" 9 \"" . $args{'type'} . " " . $args{'struct'} . "\" \"$man_date\" \"API Manual\" LINUX\n";
-
- print ".SH NAME\n";
- print $args{'type'} . " " . $args{'struct'} . " \\- " . $args{'purpose'} . "\n";
-
- my $declaration = $args{'definition'};
- $declaration =~ s/\t/ /g;
- $declaration =~ s/\n/"\n.br\n.BI \"/g;
- print ".SH SYNOPSIS\n";
- print $args{'type'} . " " . $args{'struct'} . " {\n.br\n";
- print ".BI \"$declaration\n};\n.br\n\n";
-
- print ".SH Members\n";
- foreach $parameter (@{$args{'parameterlist'}}) {
- ($parameter =~ /^#/) && next;
-
- my $parameter_name = $parameter;
- $parameter_name =~ s/\[.*//;
-
- ($args{'parameterdescs'}{$parameter_name} ne $undescribed) || next;
- print ".IP \"" . $parameter . "\" 12\n";
- output_highlight($args{'parameterdescs'}{$parameter_name});
- }
- foreach $section (@{$args{'sectionlist'}}) {
- print ".SH \"$section\"\n";
- output_highlight($args{'sections'}{$section});
- }
-}
-
-##
-# output typedef in man
-sub output_typedef_man(%) {
- my %args = %{$_[0]};
- my ($parameter, $section);
-
- print ".TH \"$args{'module'}\" 9 \"$args{'typedef'}\" \"$man_date\" \"API Manual\" LINUX\n";
-
- print ".SH NAME\n";
- print "typedef " . $args{'typedef'} . " \\- " . $args{'purpose'} . "\n";
-
- foreach $section (@{$args{'sectionlist'}}) {
- print ".SH \"$section\"\n";
- output_highlight($args{'sections'}{$section});
- }
-}
-
-sub output_blockhead_man(%) {
- my %args = %{$_[0]};
- my ($parameter, $section);
- my $count;
-
- print ".TH \"$args{'module'}\" 9 \"$args{'module'}\" \"$man_date\" \"API Manual\" LINUX\n";
-
- foreach $section (@{$args{'sectionlist'}}) {
- print ".SH \"$section\"\n";
- output_highlight($args{'sections'}{$section});
- }
-}
-
-##
-# output in restructured text
-#
-
-#
-# This could use some work; it's used to output the DOC: sections, and
-# starts by putting out the name of the doc section itself, but that tends
-# to duplicate a header already in the template file.
-#
-sub output_blockhead_rst(%) {
- my %args = %{$_[0]};
- my ($parameter, $section);
-
- foreach $section (@{$args{'sectionlist'}}) {
- next if (defined($nosymbol_table{$section}));
-
- if ($output_selection != OUTPUT_INCLUDE) {
- print "**$section**\n\n";
- }
- print_lineno($section_start_lines{$section});
- output_highlight_rst($args{'sections'}{$section});
- print "\n";
- }
-}
-
-#
-# Apply the RST highlights to a sub-block of text.
-#
-sub highlight_block($) {
- # The dohighlight kludge requires the text be called $contents
- my $contents = shift;
- eval $dohighlight;
- die $@ if $@;
- return $contents;
-}
-
-#
-# Regexes used only here.
-#
-my $sphinx_literal = '^[^.].*::$';
-my $sphinx_cblock = '^\.\.\ +code-block::';
-
-sub output_highlight_rst {
- my $input = join "\n",@_;
- my $output = "";
- my $line;
- my $in_literal = 0;
- my $litprefix;
- my $block = "";
-
- foreach $line (split "\n",$input) {
- #
- # If we're in a literal block, see if we should drop out
- # of it. Otherwise pass the line straight through unmunged.
- #
- if ($in_literal) {
- if (! ($line =~ /^\s*$/)) {
- #
- # If this is the first non-blank line in a literal
- # block we need to figure out what the proper indent is.
- #
- if ($litprefix eq "") {
- $line =~ /^(\s*)/;
- $litprefix = '^' . $1;
- $output .= $line . "\n";
- } elsif (! ($line =~ /$litprefix/)) {
- $in_literal = 0;
- } else {
- $output .= $line . "\n";
- }
- } else {
- $output .= $line . "\n";
- }
- }
- #
- # Not in a literal block (or just dropped out)
- #
- if (! $in_literal) {
- $block .= $line . "\n";
- if (($line =~ /$sphinx_literal/) || ($line =~ /$sphinx_cblock/)) {
- $in_literal = 1;
- $litprefix = "";
- $output .= highlight_block($block);
- $block = ""
- }
- }
- }
-
- if ($block) {
- $output .= highlight_block($block);
- }
- foreach $line (split "\n", $output) {
- print $lineprefix . $line . "\n";
- }
-}
-
-sub output_function_rst(%) {
- my %args = %{$_[0]};
- my ($parameter, $section);
- my $oldprefix = $lineprefix;
- my $start = "";
- my $is_macro = 0;
-
- if ($sphinx_major < 3) {
- if ($args{'typedef'}) {
- print ".. c:type:: ". $args{'function'} . "\n\n";
- print_lineno($declaration_start_line);
- print " **Typedef**: ";
- $lineprefix = "";
- output_highlight_rst($args{'purpose'});
- $start = "\n\n**Syntax**\n\n ``";
- $is_macro = 1;
- } else {
- print ".. c:function:: ";
- }
- } else {
- if ($args{'typedef'} || $args{'functiontype'} eq "") {
- $is_macro = 1;
- print ".. c:macro:: ". $args{'function'} . "\n\n";
- } else {
- print ".. c:function:: ";
- }
-
- if ($args{'typedef'}) {
- print_lineno($declaration_start_line);
- print " **Typedef**: ";
- $lineprefix = "";
- output_highlight_rst($args{'purpose'});
- $start = "\n\n**Syntax**\n\n ``";
- } else {
- print "``" if ($is_macro);
- }
- }
- if ($args{'functiontype'} ne "") {
- $start .= $args{'functiontype'} . " " . $args{'function'} . " (";
- } else {
- $start .= $args{'function'} . " (";
- }
- print $start;
-
- my $count = 0;
- foreach my $parameter (@{$args{'parameterlist'}}) {
- if ($count ne 0) {
- print ", ";
- }
- $count++;
- $type = $args{'parametertypes'}{$parameter};
-
- if ($type =~ m/([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)/) {
- # pointer-to-function
- print $1 . $parameter . ") (" . $2 . ")";
- } else {
- print $type;
- }
- }
- if ($is_macro) {
- print ")``\n\n";
- } else {
- print ")\n\n";
- }
- if (!$args{'typedef'}) {
- print_lineno($declaration_start_line);
- $lineprefix = " ";
- output_highlight_rst($args{'purpose'});
- print "\n";
- }
-
- print "**Parameters**\n\n";
- $lineprefix = " ";
- foreach $parameter (@{$args{'parameterlist'}}) {
- my $parameter_name = $parameter;
- $parameter_name =~ s/\[.*//;
- $type = $args{'parametertypes'}{$parameter};
-
- if ($type ne "") {
- print "``$type``\n";
- } else {
- print "``$parameter``\n";
- }
-
- print_lineno($parameterdesc_start_lines{$parameter_name});
-
- if (defined($args{'parameterdescs'}{$parameter_name}) &&
- $args{'parameterdescs'}{$parameter_name} ne $undescribed) {
- output_highlight_rst($args{'parameterdescs'}{$parameter_name});
- } else {
- print " *undescribed*\n";
- }
- print "\n";
- }
-
- $lineprefix = $oldprefix;
- output_section_rst(@_);
-}
-
-sub output_section_rst(%) {
- my %args = %{$_[0]};
- my $section;
- my $oldprefix = $lineprefix;
- $lineprefix = "";
-
- foreach $section (@{$args{'sectionlist'}}) {
- print "**$section**\n\n";
- print_lineno($section_start_lines{$section});
- output_highlight_rst($args{'sections'}{$section});
- print "\n";
- }
- print "\n";
- $lineprefix = $oldprefix;
-}
-
-sub output_enum_rst(%) {
- my %args = %{$_[0]};
- my ($parameter);
- my $oldprefix = $lineprefix;
- my $count;
-
- if ($sphinx_major < 3) {
- my $name = "enum " . $args{'enum'};
- print "\n\n.. c:type:: " . $name . "\n\n";
- } else {
- my $name = $args{'enum'};
- print "\n\n.. c:enum:: " . $name . "\n\n";
- }
- print_lineno($declaration_start_line);
- $lineprefix = " ";
- output_highlight_rst($args{'purpose'});
- print "\n";
-
- print "**Constants**\n\n";
- $lineprefix = " ";
- foreach $parameter (@{$args{'parameterlist'}}) {
- print "``$parameter``\n";
- if ($args{'parameterdescs'}{$parameter} ne $undescribed) {
- output_highlight_rst($args{'parameterdescs'}{$parameter});
- } else {
- print " *undescribed*\n";
- }
- print "\n";
- }
-
- $lineprefix = $oldprefix;
- output_section_rst(@_);
-}
-
-sub output_typedef_rst(%) {
- my %args = %{$_[0]};
- my ($parameter);
- my $oldprefix = $lineprefix;
- my $name;
-
- if ($sphinx_major < 3) {
- $name = "typedef " . $args{'typedef'};
- } else {
- $name = $args{'typedef'};
- }
- print "\n\n.. c:type:: " . $name . "\n\n";
- print_lineno($declaration_start_line);
- $lineprefix = " ";
- output_highlight_rst($args{'purpose'});
- print "\n";
-
- $lineprefix = $oldprefix;
- output_section_rst(@_);
-}
-
-sub output_struct_rst(%) {
- my %args = %{$_[0]};
- my ($parameter);
- my $oldprefix = $lineprefix;
-
- if ($sphinx_major < 3) {
- my $name = $args{'type'} . " " . $args{'struct'};
- print "\n\n.. c:type:: " . $name . "\n\n";
- } else {
- my $name = $args{'struct'};
- if ($args{'type'} eq 'union') {
- print "\n\n.. c:union:: " . $name . "\n\n";
- } else {
- print "\n\n.. c:struct:: " . $name . "\n\n";
- }
- }
- print_lineno($declaration_start_line);
- $lineprefix = " ";
- output_highlight_rst($args{'purpose'});
- print "\n";
-
- print "**Definition**\n\n";
- print "::\n\n";
- my $declaration = $args{'definition'};
- $declaration =~ s/\t/ /g;
- print " " . $args{'type'} . " " . $args{'struct'} . " {\n$declaration };\n\n";
-
- print "**Members**\n\n";
- $lineprefix = " ";
- foreach $parameter (@{$args{'parameterlist'}}) {
- ($parameter =~ /^#/) && next;
-
- my $parameter_name = $parameter;
- $parameter_name =~ s/\[.*//;
-
- ($args{'parameterdescs'}{$parameter_name} ne $undescribed) || next;
- $type = $args{'parametertypes'}{$parameter};
- print_lineno($parameterdesc_start_lines{$parameter_name});
- print "``" . $parameter . "``\n";
- output_highlight_rst($args{'parameterdescs'}{$parameter_name});
- print "\n";
- }
- print "\n";
-
- $lineprefix = $oldprefix;
- output_section_rst(@_);
-}
-
-## none mode output functions
-
-sub output_function_none(%) {
-}
-
-sub output_enum_none(%) {
-}
-
-sub output_typedef_none(%) {
-}
-
-sub output_struct_none(%) {
-}
-
-sub output_blockhead_none(%) {
-}
-
-##
-# generic output function for all types (function, struct/union, typedef, enum);
-# calls the generated, variable output_ function name based on
-# functype and output_mode
-sub output_declaration {
- no strict 'refs';
- my $name = shift;
- my $functype = shift;
- my $func = "output_${functype}_$output_mode";
-
- return if (defined($nosymbol_table{$name}));
-
- if (($output_selection == OUTPUT_ALL) ||
- (($output_selection == OUTPUT_INCLUDE ||
- $output_selection == OUTPUT_EXPORTED) &&
- defined($function_table{$name})) ||
- ($output_selection == OUTPUT_INTERNAL &&
- !($functype eq "function" && defined($function_table{$name}))))
- {
- &$func(@_);
- $section_counter++;
- }
-}
-
-##
-# generic output function - calls the right one based on current output mode.
-sub output_blockhead {
- no strict 'refs';
- my $func = "output_blockhead_" . $output_mode;
- &$func(@_);
- $section_counter++;
-}
-
-##
-# takes a declaration (struct, union, enum, typedef) and
-# invokes the right handler. NOT called for functions.
-sub dump_declaration($$) {
- no strict 'refs';
- my ($prototype, $file) = @_;
- my $func = "dump_" . $decl_type;
- &$func(@_);
-}
-
-sub dump_union($$) {
- dump_struct(@_);
-}
-
-sub dump_struct($$) {
- my $x = shift;
- my $file = shift;
-
- if ($x =~ /(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/) {
- my $decl_type = $1;
- $declaration_name = $2;
- my $members = $3;
-
- # ignore members marked private:
- $members =~ s/\/\*\s*private:.*?\/\*\s*public:.*?\*\///gosi;
- $members =~ s/\/\*\s*private:.*//gosi;
- # strip comments:
- $members =~ s/\/\*.*?\*\///gos;
- # strip attributes
- $members =~ s/\s*__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)/ /gi;
- $members =~ s/\s*__aligned\s*\([^;]*\)/ /gos;
- $members =~ s/\s*__packed\s*/ /gos;
- $members =~ s/\s*CRYPTO_MINALIGN_ATTR/ /gos;
- $members =~ s/\s*____cacheline_aligned_in_smp/ /gos;
- $members =~ s/\s*____cacheline_aligned/ /gos;
-
- # replace DECLARE_BITMAP
- $members =~ s/__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)/DECLARE_BITMAP($1, __ETHTOOL_LINK_MODE_MASK_NBITS)/gos;
- $members =~ s/DECLARE_BITMAP\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[BITS_TO_LONGS($2)\]/gos;
- # replace DECLARE_HASHTABLE
- $members =~ s/DECLARE_HASHTABLE\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[1 << (($2) - 1)\]/gos;
- # replace DECLARE_KFIFO
- $members =~ s/DECLARE_KFIFO\s*\(([^,)]+),\s*([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
- # replace DECLARE_KFIFO_PTR
- $members =~ s/DECLARE_KFIFO_PTR\s*\(([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
-
- my $declaration = $members;
-
- # Split nested struct/union elements as newer ones
- while ($members =~ m/(struct|union)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;/) {
- my $newmember;
- my $maintype = $1;
- my $ids = $4;
- my $content = $3;
- foreach my $id(split /,/, $ids) {
- $newmember .= "$maintype $id; ";
-
- $id =~ s/[:\[].*//;
- $id =~ s/^\s*\**(\S+)\s*/$1/;
- foreach my $arg (split /;/, $content) {
- next if ($arg =~ m/^\s*$/);
- if ($arg =~ m/^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)/) {
- # pointer-to-function
- my $type = $1;
- my $name = $2;
- my $extra = $3;
- next if (!$name);
- if ($id =~ m/^\s*$/) {
- # anonymous struct/union
- $newmember .= "$type$name$extra; ";
- } else {
- $newmember .= "$type$id.$name$extra; ";
- }
- } else {
- my $type;
- my $names;
- $arg =~ s/^\s+//;
- $arg =~ s/\s+$//;
- # Handle bitmaps
- $arg =~ s/:\s*\d+\s*//g;
- # Handle arrays
- $arg =~ s/\[.*\]//g;
- # The type may have multiple words,
- # and multiple IDs can be defined, like:
- # const struct foo, *bar, foobar
- # So, we remove spaces when parsing the
- # names, in order to match just names
- # and commas for the names
- $arg =~ s/\s*,\s*/,/g;
- if ($arg =~ m/(.*)\s+([\S+,]+)/) {
- $type = $1;
- $names = $2;
- } else {
- $newmember .= "$arg; ";
- next;
- }
- foreach my $name (split /,/, $names) {
- $name =~ s/^\s*\**(\S+)\s*/$1/;
- next if (($name =~ m/^\s*$/));
- if ($id =~ m/^\s*$/) {
- # anonymous struct/union
- $newmember .= "$type $name; ";
- } else {
- $newmember .= "$type $id.$name; ";
- }
- }
- }
- }
- }
- $members =~ s/(struct|union)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;/$newmember/;
- }
-
- # Ignore other nested elements, like enums
- $members =~ s/(\{[^\{\}]*\})//g;
-
- create_parameterlist($members, ';', $file, $declaration_name);
- check_sections($file, $declaration_name, $decl_type, $sectcheck, $struct_actual);
-
- # Adjust declaration for better display
- $declaration =~ s/([\{;])/$1\n/g;
- $declaration =~ s/\}\s+;/};/g;
- # Better handle inlined enums
- do {} while ($declaration =~ s/(enum\s+\{[^\}]+),([^\n])/$1,\n$2/);
-
- my @def_args = split /\n/, $declaration;
- my $level = 1;
- $declaration = "";
- foreach my $clause (@def_args) {
- $clause =~ s/^\s+//;
- $clause =~ s/\s+$//;
- $clause =~ s/\s+/ /;
- next if (!$clause);
- $level-- if ($clause =~ m/(\})/ && $level > 1);
- if (!($clause =~ m/^\s*#/)) {
- $declaration .= "\t" x $level;
- }
- $declaration .= "\t" . $clause . "\n";
- $level++ if ($clause =~ m/(\{)/ && !($clause =~m/\}/));
- }
- output_declaration($declaration_name,
- 'struct',
- {'struct' => $declaration_name,
- 'module' => $modulename,
- 'definition' => $declaration,
- 'parameterlist' => \@parameterlist,
- 'parameterdescs' => \%parameterdescs,
- 'parametertypes' => \%parametertypes,
- 'sectionlist' => \@sectionlist,
- 'sections' => \%sections,
- 'purpose' => $declaration_purpose,
- 'type' => $decl_type
- });
- }
- else {
- print STDERR "${file}:$.: error: Cannot parse struct or union!\n";
- ++$errors;
- }
-}
-
-
-sub show_warnings($$) {
- my $functype = shift;
- my $name = shift;
-
- return 0 if (defined($nosymbol_table{$name}));
-
- return 1 if ($output_selection == OUTPUT_ALL);
-
- if ($output_selection == OUTPUT_EXPORTED) {
- if (defined($function_table{$name})) {
- return 1;
- } else {
- return 0;
- }
- }
- if ($output_selection == OUTPUT_INTERNAL) {
- if (!($functype eq "function" && defined($function_table{$name}))) {
- return 1;
- } else {
- return 0;
- }
- }
- if ($output_selection == OUTPUT_INCLUDE) {
- if (defined($function_table{$name})) {
- return 1;
- } else {
- return 0;
- }
- }
- die("Please add the new output type at show_warnings()");
-}
-
-sub dump_enum($$) {
- my $x = shift;
- my $file = shift;
- my $members;
-
-
- $x =~ s@/\*.*?\*/@@gos; # strip comments.
- # strip #define macros inside enums
- $x =~ s@#\s*((define|ifdef)\s+|endif)[^;]*;@@gos;
-
- if ($x =~ /typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;/) {
- $declaration_name = $2;
- $members = $1;
- } elsif ($x =~ /enum\s+(\w*)\s*\{(.*)\}/) {
- $declaration_name = $1;
- $members = $2;
- }
-
- if ($declaration_name) {
- my %_members;
-
- $members =~ s/\s+$//;
-
- foreach my $arg (split ',', $members) {
- $arg =~ s/^\s*(\w+).*/$1/;
- push @parameterlist, $arg;
- if (!$parameterdescs{$arg}) {
- $parameterdescs{$arg} = $undescribed;
- if (show_warnings("enum", $declaration_name)) {
- print STDERR "${file}:$.: warning: Enum value '$arg' not described in enum '$declaration_name'\n";
- }
- }
- $_members{$arg} = 1;
- }
-
- while (my ($k, $v) = each %parameterdescs) {
- if (!exists($_members{$k})) {
- if (show_warnings("enum", $declaration_name)) {
- print STDERR "${file}:$.: warning: Excess enum value '$k' description in '$declaration_name'\n";
- }
- }
- }
-
- output_declaration($declaration_name,
- 'enum',
- {'enum' => $declaration_name,
- 'module' => $modulename,
- 'parameterlist' => \@parameterlist,
- 'parameterdescs' => \%parameterdescs,
- 'sectionlist' => \@sectionlist,
- 'sections' => \%sections,
- 'purpose' => $declaration_purpose
- });
- } else {
- print STDERR "${file}:$.: error: Cannot parse enum!\n";
- ++$errors;
- }
-}
-
-my $typedef_type = qr { ((?:\s+[\w\*]+){1,8})\s* }x;
-my $typedef_ident = qr { \*?\s*(\w\S+)\s* }x;
-my $typedef_args = qr { \s*\((.*)\); }x;
-
-my $typedef1 = qr { typedef$typedef_type\($typedef_ident\)$typedef_args }x;
-my $typedef2 = qr { typedef$typedef_type$typedef_ident$typedef_args }x;
-
-sub dump_typedef($$) {
- my $x = shift;
- my $file = shift;
-
- $x =~ s@/\*.*?\*/@@gos; # strip comments.
-
- # Parse function typedef prototypes
- if ($x =~ $typedef1 || $x =~ $typedef2) {
- $return_type = $1;
- $declaration_name = $2;
- my $args = $3;
- $return_type =~ s/^\s+//;
-
- create_parameterlist($args, ',', $file, $declaration_name);
-
- output_declaration($declaration_name,
- 'function',
- {'function' => $declaration_name,
- 'typedef' => 1,
- 'module' => $modulename,
- 'functiontype' => $return_type,
- 'parameterlist' => \@parameterlist,
- 'parameterdescs' => \%parameterdescs,
- 'parametertypes' => \%parametertypes,
- 'sectionlist' => \@sectionlist,
- 'sections' => \%sections,
- 'purpose' => $declaration_purpose
- });
- return;
- }
-
- while (($x =~ /\(*.\)\s*;$/) || ($x =~ /\[*.\]\s*;$/)) {
- $x =~ s/\(*.\)\s*;$/;/;
- $x =~ s/\[*.\]\s*;$/;/;
- }
-
- if ($x =~ /typedef.*\s+(\w+)\s*;/) {
- $declaration_name = $1;
-
- output_declaration($declaration_name,
- 'typedef',
- {'typedef' => $declaration_name,
- 'module' => $modulename,
- 'sectionlist' => \@sectionlist,
- 'sections' => \%sections,
- 'purpose' => $declaration_purpose
- });
- }
- else {
- print STDERR "${file}:$.: error: Cannot parse typedef!\n";
- ++$errors;
- }
-}
-
-sub save_struct_actual($) {
- my $actual = shift;
-
- # strip all spaces from the actual param so that it looks like one string item
- $actual =~ s/\s*//g;
- $struct_actual = $struct_actual . $actual . " ";
-}
-
-sub create_parameterlist($$$$) {
- my $args = shift;
- my $splitter = shift;
- my $file = shift;
- my $declaration_name = shift;
- my $type;
- my $param;
-
- # temporarily replace commas inside function pointer definition
- while ($args =~ /(\([^\),]+),/) {
- $args =~ s/(\([^\),]+),/$1#/g;
- }
-
- foreach my $arg (split($splitter, $args)) {
- # strip comments
- $arg =~ s/\/\*.*\*\///;
- # strip leading/trailing spaces
- $arg =~ s/^\s*//;
- $arg =~ s/\s*$//;
- $arg =~ s/\s+/ /;
-
- if ($arg =~ /^#/) {
- # Treat preprocessor directive as a typeless variable just to fill
- # corresponding data structures "correctly". Catch it later in
- # output_* subs.
- push_parameter($arg, "", "", $file);
- } elsif ($arg =~ m/\(.+\)\s*\(/) {
- # pointer-to-function
- $arg =~ tr/#/,/;
- $arg =~ m/[^\(]+\(\*?\s*([\w\.]*)\s*\)/;
- $param = $1;
- $type = $arg;
- $type =~ s/([^\(]+\(\*?)\s*$param/$1/;
- save_struct_actual($param);
- push_parameter($param, $type, $arg, $file, $declaration_name);
- } elsif ($arg) {
- $arg =~ s/\s*:\s*/:/g;
- $arg =~ s/\s*\[/\[/g;
-
- my @args = split('\s*,\s*', $arg);
- if ($args[0] =~ m/\*/) {
- $args[0] =~ s/(\*+)\s*/ $1/;
- }
-
- my @first_arg;
- if ($args[0] =~ /^(.*\s+)(.*?\[.*\].*)$/) {
- shift @args;
- push(@first_arg, split('\s+', $1));
- push(@first_arg, $2);
- } else {
- @first_arg = split('\s+', shift @args);
- }
-
- unshift(@args, pop @first_arg);
- $type = join " ", @first_arg;
-
- foreach $param (@args) {
- if ($param =~ m/^(\*+)\s*(.*)/) {
- save_struct_actual($2);
-
- push_parameter($2, "$type $1", $arg, $file, $declaration_name);
- }
- elsif ($param =~ m/(.*?):(\d+)/) {
- if ($type ne "") { # skip unnamed bit-fields
- save_struct_actual($1);
- push_parameter($1, "$type:$2", $arg, $file, $declaration_name)
- }
- }
- else {
- save_struct_actual($param);
- push_parameter($param, $type, $arg, $file, $declaration_name);
- }
- }
- }
- }
-}
-
-sub push_parameter($$$$$) {
- my $param = shift;
- my $type = shift;
- my $org_arg = shift;
- my $file = shift;
- my $declaration_name = shift;
-
- if (($anon_struct_union == 1) && ($type eq "") &&
- ($param eq "}")) {
- return; # ignore the ending }; from anon. struct/union
- }
-
- $anon_struct_union = 0;
- $param =~ s/[\[\)].*//;
-
- if ($type eq "" && $param =~ /\.\.\.$/)
- {
- if (!$param =~ /\w\.\.\.$/) {
- # handles unnamed variable parameters
- $param = "...";
- }
- elsif ($param =~ /\w\.\.\.$/) {
- # for named variable parameters of the form `x...`, remove the dots
- $param =~ s/\.\.\.$//;
- }
- if (!defined $parameterdescs{$param} || $parameterdescs{$param} eq "") {
- $parameterdescs{$param} = "variable arguments";
- }
- }
- elsif ($type eq "" && ($param eq "" or $param eq "void"))
- {
- $param="void";
- $parameterdescs{void} = "no arguments";
- }
- elsif ($type eq "" && ($param eq "struct" or $param eq "union"))
- # handle unnamed (anonymous) union or struct:
- {
- $type = $param;
- $param = "{unnamed_" . $param . "}";
- $parameterdescs{$param} = "anonymous\n";
- $anon_struct_union = 1;
- }
-
- # warn if parameter has no description
- # (but ignore ones starting with # as these are not parameters
- # but inline preprocessor statements);
- # Note: It will also ignore void params and unnamed structs/unions
- if (!defined $parameterdescs{$param} && $param !~ /^#/) {
- $parameterdescs{$param} = $undescribed;
-
- if (show_warnings($type, $declaration_name) && $param !~ /\./) {
- print STDERR
- "${file}:$.: warning: Function parameter or member '$param' not described in '$declaration_name'\n";
- ++$warnings;
- }
- }
-
- # strip spaces from $param so that it is one continuous string
- # on @parameterlist;
- # this fixes a problem where check_sections() cannot find
- # a parameter like "addr[6 + 2]" because it actually appears
- # as "addr[6", "+", "2]" on the parameter list;
- # but it's better to maintain the param string unchanged for output,
- # so just weaken the string compare in check_sections() to ignore
- # "[blah" in a parameter string;
- ###$param =~ s/\s*//g;
- push @parameterlist, $param;
- $org_arg =~ s/\s\s+/ /g;
- $parametertypes{$param} = $org_arg;
-}
-
-sub check_sections($$$$$) {
- my ($file, $decl_name, $decl_type, $sectcheck, $prmscheck) = @_;
- my @sects = split ' ', $sectcheck;
- my @prms = split ' ', $prmscheck;
- my $err;
- my ($px, $sx);
- my $prm_clean; # strip trailing "[array size]" and/or beginning "*"
-
- foreach $sx (0 .. $#sects) {
- $err = 1;
- foreach $px (0 .. $#prms) {
- $prm_clean = $prms[$px];
- $prm_clean =~ s/\[.*\]//;
- $prm_clean =~ s/__attribute__\s*\(\([a-z,_\*\s\(\)]*\)\)//i;
- # ignore array size in a parameter string;
- # however, the original param string may contain
- # spaces, e.g.: addr[6 + 2]
- # and this appears in @prms as "addr[6" since the
- # parameter list is split at spaces;
- # hence just ignore "[..." for the sections check;
- $prm_clean =~ s/\[.*//;
-
- ##$prm_clean =~ s/^\**//;
- if ($prm_clean eq $sects[$sx]) {
- $err = 0;
- last;
- }
- }
- if ($err) {
- if ($decl_type eq "function") {
- print STDERR "${file}:$.: warning: " .
- "Excess function parameter " .
- "'$sects[$sx]' " .
- "description in '$decl_name'\n";
- ++$warnings;
- }
- }
- }
-}
-
-##
-# Checks the section describing the return value of a function.
-sub check_return_section {
- my $file = shift;
- my $declaration_name = shift;
- my $return_type = shift;
-
- # Ignore an empty return type (It's a macro)
- # Ignore functions with a "void" return type. (But don't ignore "void *")
- if (($return_type eq "") || ($return_type =~ /void\s*\w*\s*$/)) {
- return;
- }
-
- if (!defined($sections{$section_return}) ||
- $sections{$section_return} eq "") {
- print STDERR "${file}:$.: warning: " .
- "No description found for return value of " .
- "'$declaration_name'\n";
- ++$warnings;
- }
-}
-
-##
-# takes a function prototype and the name of the current file being
-# processed and spits out all the details stored in the global
-# arrays/hashes.
-sub dump_function($$) {
- my $prototype = shift;
- my $file = shift;
- my $noret = 0;
-
- print_lineno($new_start_line);
-
- $prototype =~ s/^static +//;
- $prototype =~ s/^extern +//;
- $prototype =~ s/^asmlinkage +//;
- $prototype =~ s/^inline +//;
- $prototype =~ s/^__inline__ +//;
- $prototype =~ s/^__inline +//;
- $prototype =~ s/^__always_inline +//;
- $prototype =~ s/^noinline +//;
- $prototype =~ s/__init +//;
- $prototype =~ s/__init_or_module +//;
- $prototype =~ s/__meminit +//;
- $prototype =~ s/__must_check +//;
- $prototype =~ s/__weak +//;
- $prototype =~ s/__sched +//;
- $prototype =~ s/__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +//;
- my $define = $prototype =~ s/^#\s*define\s+//; #ak added
- $prototype =~ s/__attribute__\s*\(\(
- (?:
- [\w\s]++ # attribute name
- (?:\([^)]*+\))? # attribute arguments
- \s*+,? # optional comma at the end
- )+
- \)\)\s+//x;
-
- # Strip QEMU specific compiler annotations
- $prototype =~ s/QEMU_[A-Z_]+ +//;
-
- # Yes, this truly is vile. We are looking for:
- # 1. Return type (may be nothing if we're looking at a macro)
- # 2. Function name
- # 3. Function parameters.
- #
- # All the while we have to watch out for function pointer parameters
- # (which IIRC is what the two sections are for), C types (these
- # regexps don't even start to express all the possibilities), and
- # so on.
- #
- # If you mess with these regexps, it's a good idea to check that
- # the following functions' documentation still comes out right:
- # - parport_register_device (function pointer parameters)
- # - atomic_set (macro)
- # - pci_match_device, __copy_to_user (long return type)
-
- if ($define && $prototype =~ m/^()([a-zA-Z0-9_~:]+)\s+/) {
- # This is an object-like macro, it has no return type and no parameter
- # list.
- # Function-like macros are not allowed to have spaces between
- # declaration_name and opening parenthesis (notice the \s+).
- $return_type = $1;
- $declaration_name = $2;
- $noret = 1;
- } elsif ($prototype =~ m/^()([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
- $prototype =~ m/^(\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
- $prototype =~ m/^(\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
- $prototype =~ m/^()([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
- $prototype =~ m/^(\w+\s+\w+\s*\*+\s*\w+\s*\*+\s*)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/) {
- $return_type = $1;
- $declaration_name = $2;
- my $args = $3;
-
- create_parameterlist($args, ',', $file, $declaration_name);
- } else {
- print STDERR "${file}:$.: warning: cannot understand function prototype: '$prototype'\n";
- return;
- }
-
- my $prms = join " ", @parameterlist;
- check_sections($file, $declaration_name, "function", $sectcheck, $prms);
-
- # This check emits a lot of warnings at the moment, because many
- # functions don't have a 'Return' doc section. So until the number
- # of warnings goes sufficiently down, the check is only performed in
- # verbose mode.
- # TODO: always perform the check.
- if ($verbose && !$noret) {
- check_return_section($file, $declaration_name, $return_type);
- }
-
- # The function parser can be called with a typedef parameter.
- # Handle it.
- if ($return_type =~ /typedef/) {
- output_declaration($declaration_name,
- 'function',
- {'function' => $declaration_name,
- 'typedef' => 1,
- 'module' => $modulename,
- 'functiontype' => $return_type,
- 'parameterlist' => \@parameterlist,
- 'parameterdescs' => \%parameterdescs,
- 'parametertypes' => \%parametertypes,
- 'sectionlist' => \@sectionlist,
- 'sections' => \%sections,
- 'purpose' => $declaration_purpose
- });
- } else {
- output_declaration($declaration_name,
- 'function',
- {'function' => $declaration_name,
- 'module' => $modulename,
- 'functiontype' => $return_type,
- 'parameterlist' => \@parameterlist,
- 'parameterdescs' => \%parameterdescs,
- 'parametertypes' => \%parametertypes,
- 'sectionlist' => \@sectionlist,
- 'sections' => \%sections,
- 'purpose' => $declaration_purpose
- });
- }
-}
-
-sub reset_state {
- $function = "";
- %parameterdescs = ();
- %parametertypes = ();
- @parameterlist = ();
- %sections = ();
- @sectionlist = ();
- $sectcheck = "";
- $struct_actual = "";
- $prototype = "";
-
- $state = STATE_NORMAL;
- $inline_doc_state = STATE_INLINE_NA;
-}
-
-sub tracepoint_munge($) {
- my $file = shift;
- my $tracepointname = 0;
- my $tracepointargs = 0;
-
- if ($prototype =~ m/TRACE_EVENT\((.*?),/) {
- $tracepointname = $1;
- }
- if ($prototype =~ m/DEFINE_SINGLE_EVENT\((.*?),/) {
- $tracepointname = $1;
- }
- if ($prototype =~ m/DEFINE_EVENT\((.*?),(.*?),/) {
- $tracepointname = $2;
- }
- $tracepointname =~ s/^\s+//; #strip leading whitespace
- if ($prototype =~ m/TP_PROTO\((.*?)\)/) {
- $tracepointargs = $1;
- }
- if (($tracepointname eq 0) || ($tracepointargs eq 0)) {
- print STDERR "${file}:$.: warning: Unrecognized tracepoint format: \n".
- "$prototype\n";
- } else {
- $prototype = "static inline void trace_$tracepointname($tracepointargs)";
- }
-}
-
-sub syscall_munge() {
- my $void = 0;
-
- $prototype =~ s@[\r\n]+@ @gos; # strip newlines/CR's
-## if ($prototype =~ m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
- if ($prototype =~ m/SYSCALL_DEFINE0/) {
- $void = 1;
-## $prototype = "long sys_$1(void)";
- }
-
- $prototype =~ s/SYSCALL_DEFINE.*\(/long sys_/; # fix return type & func name
- if ($prototype =~ m/long (sys_.*?),/) {
- $prototype =~ s/,/\(/;
- } elsif ($void) {
- $prototype =~ s/\)/\(void\)/;
- }
-
- # now delete all of the odd-number commas in $prototype
- # so that arg types & arg names don't have a comma between them
- my $count = 0;
- my $len = length($prototype);
- if ($void) {
- $len = 0; # skip the for-loop
- }
- for (my $ix = 0; $ix < $len; $ix++) {
- if (substr($prototype, $ix, 1) eq ',') {
- $count++;
- if ($count % 2 == 1) {
- substr($prototype, $ix, 1) = ' ';
- }
- }
- }
-}
-
-sub process_proto_function($$) {
- my $x = shift;
- my $file = shift;
-
- $x =~ s@\/\/.*$@@gos; # strip C99-style comments to end of line
-
- if ($x =~ m#\s*/\*\s+MACDOC\s*#io || ($x =~ /^#/ && $x !~ /^#\s*define/)) {
- # do nothing
- }
- elsif ($x =~ /([^\{]*)/) {
- $prototype .= $1;
- }
-
- if (($x =~ /\{/) || ($x =~ /\#\s*define/) || ($x =~ /;/)) {
- $prototype =~ s@/\*.*?\*/@@gos; # strip comments.
- $prototype =~ s@[\r\n]+@ @gos; # strip newlines/cr's.
- $prototype =~ s@^\s+@@gos; # strip leading spaces
-
- # Handle prototypes for function pointers like:
- # int (*pcs_config)(struct foo)
- $prototype =~ s@^(\S+\s+)\(\s*\*(\S+)\)@$1$2@gos;
-
- if ($prototype =~ /SYSCALL_DEFINE/) {
- syscall_munge();
- }
- if ($prototype =~ /TRACE_EVENT/ || $prototype =~ /DEFINE_EVENT/ ||
- $prototype =~ /DEFINE_SINGLE_EVENT/)
- {
- tracepoint_munge($file);
- }
- dump_function($prototype, $file);
- reset_state();
- }
-}
-
-sub process_proto_type($$) {
- my $x = shift;
- my $file = shift;
-
- $x =~ s@[\r\n]+@ @gos; # strip newlines/cr's.
- $x =~ s@^\s+@@gos; # strip leading spaces
- $x =~ s@\s+$@@gos; # strip trailing spaces
- $x =~ s@\/\/.*$@@gos; # strip C99-style comments to end of line
-
- if ($x =~ /^#/) {
- # To distinguish preprocessor directive from regular declaration later.
- $x .= ";";
- }
-
- while (1) {
- if ( $x =~ /([^\{\};]*)([\{\};])(.*)/ ) {
- if( length $prototype ) {
- $prototype .= " "
- }
- $prototype .= $1 . $2;
- ($2 eq '{') && $brcount++;
- ($2 eq '}') && $brcount--;
- if (($2 eq ';') && ($brcount == 0)) {
- dump_declaration($prototype, $file);
- reset_state();
- last;
- }
- $x = $3;
- } else {
- $prototype .= $x;
- last;
- }
- }
-}
-
-
-sub map_filename($) {
- my $file;
- my ($orig_file) = @_;
-
- if (defined($ENV{'SRCTREE'})) {
- $file = "$ENV{'SRCTREE'}" . "/" . $orig_file;
- } else {
- $file = $orig_file;
- }
-
- if (defined($source_map{$file})) {
- $file = $source_map{$file};
- }
-
- return $file;
-}
-
-sub process_export_file($) {
- my ($orig_file) = @_;
- my $file = map_filename($orig_file);
-
- if (!open(IN,"<$file")) {
- print STDERR "Error: Cannot open file $file\n";
- ++$errors;
- return;
- }
-
- while (<IN>) {
- if (/$export_symbol/) {
- next if (defined($nosymbol_table{$2}));
- $function_table{$2} = 1;
- }
- }
-
- close(IN);
-}
-
-#
-# Parsers for the various processing states.
-#
-# STATE_NORMAL: looking for the /** to begin everything.
-#
-sub process_normal() {
- if (/$doc_start/o) {
- $state = STATE_NAME; # next line is always the function name
- $in_doc_sect = 0;
- $declaration_start_line = $. + 1;
- }
-}
-
-#
-# STATE_NAME: Looking for the "name - description" line
-#
-sub process_name($$) {
- my $file = shift;
- my $identifier;
- my $descr;
-
- if (/$doc_block/o) {
- $state = STATE_DOCBLOCK;
- $contents = "";
- $new_start_line = $.;
-
- if ( $1 eq "" ) {
- $section = $section_intro;
- } else {
- $section = $1;
- }
- }
- elsif (/$doc_decl/o) {
- $identifier = $1;
- if (/\s*([\w\s]+?)(\s*-|:)/) {
- $identifier = $1;
- }
-
- $state = STATE_BODY;
- # if there's no @param blocks need to set up default section
- # here
- $contents = "";
- $section = $section_default;
- $new_start_line = $. + 1;
- if (/[-:](.*)/) {
- # strip leading/trailing/multiple spaces
- $descr= $1;
- $descr =~ s/^\s*//;
- $descr =~ s/\s*$//;
- $descr =~ s/\s+/ /g;
- $declaration_purpose = $descr;
- $state = STATE_BODY_MAYBE;
- } else {
- $declaration_purpose = "";
- }
-
- if (($declaration_purpose eq "") && $verbose) {
- print STDERR "${file}:$.: warning: missing initial short description on line:\n";
- print STDERR $_;
- ++$warnings;
- }
-
- if ($identifier =~ m/^struct\b/) {
- $decl_type = 'struct';
- } elsif ($identifier =~ m/^union\b/) {
- $decl_type = 'union';
- } elsif ($identifier =~ m/^enum\b/) {
- $decl_type = 'enum';
- } elsif ($identifier =~ m/^typedef\b/) {
- $decl_type = 'typedef';
- } else {
- $decl_type = 'function';
- }
-
- if ($verbose) {
- print STDERR "${file}:$.: info: Scanning doc for $identifier\n";
- }
- } else {
- print STDERR "${file}:$.: warning: Cannot understand $_ on line $.",
- " - I thought it was a doc line\n";
- ++$warnings;
- $state = STATE_NORMAL;
- }
-}
-
-
-#
-# STATE_BODY and STATE_BODY_MAYBE: the bulk of a kerneldoc comment.
-#
-sub process_body($$) {
- my $file = shift;
-
- # Until all named variable macro parameters are
- # documented using the bare name (`x`) rather than with
- # dots (`x...`), strip the dots:
- if ($section =~ /\w\.\.\.$/) {
- $section =~ s/\.\.\.$//;
-
- if ($verbose) {
- print STDERR "${file}:$.: warning: Variable macro arguments should be documented without dots\n";
- ++$warnings;
- }
- }
-
- if ($state == STATE_BODY_WITH_BLANK_LINE && /^\s*\*\s?\S/) {
- dump_section($file, $section, $contents);
- $section = $section_default;
- $new_start_line = $.;
- $contents = "";
- }
-
- if (/$doc_sect/i) { # case insensitive for supported section names
- $newsection = $1;
- $newcontents = $2;
-
- # map the supported section names to the canonical names
- if ($newsection =~ m/^description$/i) {
- $newsection = $section_default;
- } elsif ($newsection =~ m/^context$/i) {
- $newsection = $section_context;
- } elsif ($newsection =~ m/^returns?$/i) {
- $newsection = $section_return;
- } elsif ($newsection =~ m/^\@return$/) {
- # special: @return is a section, not a param description
- $newsection = $section_return;
- }
-
- if (($contents ne "") && ($contents ne "\n")) {
- if (!$in_doc_sect && $verbose) {
- print STDERR "${file}:$.: warning: contents before sections\n";
- ++$warnings;
- }
- dump_section($file, $section, $contents);
- $section = $section_default;
- }
-
- $in_doc_sect = 1;
- $state = STATE_BODY;
- $contents = $newcontents;
- $new_start_line = $.;
- while (substr($contents, 0, 1) eq " ") {
- $contents = substr($contents, 1);
- }
- if ($contents ne "") {
- $contents .= "\n";
- }
- $section = $newsection;
- $leading_space = undef;
- } elsif (/$doc_end/) {
- if (($contents ne "") && ($contents ne "\n")) {
- dump_section($file, $section, $contents);
- $section = $section_default;
- $contents = "";
- }
- # look for doc_com + <text> + doc_end:
- if ($_ =~ m'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') {
- print STDERR "${file}:$.: warning: suspicious ending line: $_";
- ++$warnings;
- }
-
- $prototype = "";
- $state = STATE_PROTO;
- $brcount = 0;
- $new_start_line = $. + 1;
- } elsif (/$doc_content/) {
- if ($1 eq "") {
- if ($section eq $section_context) {
- dump_section($file, $section, $contents);
- $section = $section_default;
- $contents = "";
- $new_start_line = $.;
- $state = STATE_BODY;
- } else {
- if ($section ne $section_default) {
- $state = STATE_BODY_WITH_BLANK_LINE;
- } else {
- $state = STATE_BODY;
- }
- $contents .= "\n";
- }
- } elsif ($state == STATE_BODY_MAYBE) {
- # Continued declaration purpose
- chomp($declaration_purpose);
- $declaration_purpose .= " " . $1;
- $declaration_purpose =~ s/\s+/ /g;
- } else {
- my $cont = $1;
- if ($section =~ m/^@/ || $section eq $section_context) {
- if (!defined $leading_space) {
- if ($cont =~ m/^(\s+)/) {
- $leading_space = $1;
- } else {
- $leading_space = "";
- }
- }
- $cont =~ s/^$leading_space//;
- }
- $contents .= $cont . "\n";
- }
- } else {
- # i dont know - bad line? ignore.
- print STDERR "${file}:$.: warning: bad line: $_";
- ++$warnings;
- }
-}
-
-
-#
-# STATE_PROTO: reading a function/whatever prototype.
-#
-sub process_proto($$) {
- my $file = shift;
-
- if (/$doc_inline_oneline/) {
- $section = $1;
- $contents = $2;
- if ($contents ne "") {
- $contents .= "\n";
- dump_section($file, $section, $contents);
- $section = $section_default;
- $contents = "";
- }
- } elsif (/$doc_inline_start/) {
- $state = STATE_INLINE;
- $inline_doc_state = STATE_INLINE_NAME;
- } elsif ($decl_type eq 'function') {
- process_proto_function($_, $file);
- } else {
- process_proto_type($_, $file);
- }
-}
-
-#
-# STATE_DOCBLOCK: within a DOC: block.
-#
-sub process_docblock($$) {
- my $file = shift;
-
- if (/$doc_end/) {
- dump_doc_section($file, $section, $contents);
- $section = $section_default;
- $contents = "";
- $function = "";
- %parameterdescs = ();
- %parametertypes = ();
- @parameterlist = ();
- %sections = ();
- @sectionlist = ();
- $prototype = "";
- $state = STATE_NORMAL;
- } elsif (/$doc_content/) {
- if ( $1 eq "" ) {
- $contents .= $blankline;
- } else {
- $contents .= $1 . "\n";
- }
- }
-}
-
-#
-# STATE_INLINE: docbook comments within a prototype.
-#
-sub process_inline($$) {
- my $file = shift;
-
- # First line (state 1) needs to be a @parameter
- if ($inline_doc_state == STATE_INLINE_NAME && /$doc_inline_sect/o) {
- $section = $1;
- $contents = $2;
- $new_start_line = $.;
- if ($contents ne "") {
- while (substr($contents, 0, 1) eq " ") {
- $contents = substr($contents, 1);
- }
- $contents .= "\n";
- }
- $inline_doc_state = STATE_INLINE_TEXT;
- # Documentation block end */
- } elsif (/$doc_inline_end/) {
- if (($contents ne "") && ($contents ne "\n")) {
- dump_section($file, $section, $contents);
- $section = $section_default;
- $contents = "";
- }
- $state = STATE_PROTO;
- $inline_doc_state = STATE_INLINE_NA;
- # Regular text
- } elsif (/$doc_content/) {
- if ($inline_doc_state == STATE_INLINE_TEXT) {
- $contents .= $1 . "\n";
- # nuke leading blank lines
- if ($contents =~ /^\s*$/) {
- $contents = "";
- }
- } elsif ($inline_doc_state == STATE_INLINE_NAME) {
- $inline_doc_state = STATE_INLINE_ERROR;
- print STDERR "${file}:$.: warning: ";
- print STDERR "Incorrect use of kernel-doc format: $_";
- ++$warnings;
- }
- }
-}
-
-
-sub process_file($) {
- my $file;
- my $initial_section_counter = $section_counter;
- my ($orig_file) = @_;
-
- $file = map_filename($orig_file);
-
- if (!open(IN_FILE,"<$file")) {
- print STDERR "Error: Cannot open file $file\n";
- ++$errors;
- return;
- }
-
- $. = 1;
-
- $section_counter = 0;
- while (<IN_FILE>) {
- while (s/\\\s*$//) {
- $_ .= <IN_FILE>;
- }
- # Replace tabs by spaces
- while ($_ =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e) {};
- # Hand this line to the appropriate state handler
- if ($state == STATE_NORMAL) {
- process_normal();
- } elsif ($state == STATE_NAME) {
- process_name($file, $_);
- } elsif ($state == STATE_BODY || $state == STATE_BODY_MAYBE ||
- $state == STATE_BODY_WITH_BLANK_LINE) {
- process_body($file, $_);
- } elsif ($state == STATE_INLINE) { # scanning for inline parameters
- process_inline($file, $_);
- } elsif ($state == STATE_PROTO) {
- process_proto($file, $_);
- } elsif ($state == STATE_DOCBLOCK) {
- process_docblock($file, $_);
- }
- }
-
- # Make sure we got something interesting.
- if ($initial_section_counter == $section_counter && $
- output_mode ne "none") {
- if ($output_selection == OUTPUT_INCLUDE) {
- print STDERR "${file}:1: warning: '$_' not found\n"
- for keys %function_table;
- }
- else {
- print STDERR "${file}:1: warning: no structured comments found\n";
- }
- }
- close IN_FILE;
-}
-
-
-if ($output_mode eq "rst") {
- get_sphinx_version() if (!$sphinx_major);
-}
-
-$kernelversion = get_kernel_version();
-
-# generate a sequence of code that will splice in highlighting information
-# using the s// operator.
-for (my $k = 0; $k < @highlights; $k++) {
- my $pattern = $highlights[$k][0];
- my $result = $highlights[$k][1];
-# print STDERR "scanning pattern:$pattern, highlight:($result)\n";
- $dohighlight .= "\$contents =~ s:$pattern:$result:gs;\n";
-}
-
-# Read the file that maps relative names to absolute names for
-# separate source and object directories and for shadow trees.
-if (open(SOURCE_MAP, "<.tmp_filelist.txt")) {
- my ($relname, $absname);
- while(<SOURCE_MAP>) {
- chop();
- ($relname, $absname) = (split())[0..1];
- $relname =~ s:^/+::;
- $source_map{$relname} = $absname;
- }
- close(SOURCE_MAP);
-}
-
-if ($output_selection == OUTPUT_EXPORTED ||
- $output_selection == OUTPUT_INTERNAL) {
-
- push(@export_file_list, @ARGV);
-
- foreach (@export_file_list) {
- chomp;
- process_export_file($_);
- }
-}
-
-foreach (@ARGV) {
- chomp;
- process_file($_);
-}
-if ($verbose && $errors) {
- print STDERR "$errors errors\n";
-}
-if ($verbose && $warnings) {
- print STDERR "$warnings warnings\n";
-}
-
-if ($Werror && $warnings) {
- print STDERR "$warnings warnings as Errors\n";
- exit($warnings);
-} else {
- exit($output_mode eq "none" ? 0 : $errors)
-}
--
2.43.0
* [PATCH for-10.2 8/8] MAINTAINERS: Put kernel-doc under the "docs build machinery" section
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
` (6 preceding siblings ...)
2025-08-14 17:13 ` [PATCH for-10.2 7/8] scripts/kernel-doc: Delete the old Perl " Peter Maydell
@ 2025-08-14 17:13 ` Peter Maydell
2025-08-15 10:40 ` Mauro Carvalho Chehab
2025-08-15 9:11 ` [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Jonathan Cameron via
2025-08-19 10:34 ` Paolo Bonzini
9 siblings, 1 reply; 24+ messages in thread
From: Peter Maydell @ 2025-08-14 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
We never had a MAINTAINERS entry for the old kernel-doc script; add
the files for the new Python kernel-doc under "Sphinx documentation
configuration and build machinery", as the most appropriate
subsection.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
MAINTAINERS | 2 ++
1 file changed, 2 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index a07086ed762..efa59ce7c36 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4436,6 +4436,8 @@ F: docs/sphinx/
F: docs/_templates/
F: docs/devel/docs.rst
F: docs/devel/qapi-domain.rst
+F: scripts/kernel-doc
+F: scripts/lib/kdoc/
Rust build system integration
M: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
--
2.43.0
* Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
` (7 preceding siblings ...)
2025-08-14 17:13 ` [PATCH for-10.2 8/8] MAINTAINERS: Put kernel-doc under the "docs build machinery" section Peter Maydell
@ 2025-08-15 9:11 ` Jonathan Cameron via
2025-08-15 9:39 ` Mauro Carvalho Chehab
2025-08-19 10:34 ` Paolo Bonzini
9 siblings, 1 reply; 24+ messages in thread
From: Jonathan Cameron via @ 2025-08-15 9:11 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow, Mauro Carvalho Chehab
On Thu, 14 Aug 2025 18:13:15 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:
> Earlier this year, the Linux kernel's kernel-doc script was rewritten
> from the old Perl version into a shiny and hopefully more maintainable
> Python version. This commit series updates our copy of this script
> to the latest kernel version. I have tested it by comparing the
> generated HTML documentation and checking that there are no
> unexpected changes.
>
> Luckily we are carrying very few local modifications to the Perl
> script, so this is fairly straightforward. The structure of the
> patchset is:
> * a minor update to the kerneldoc.py Sphinx extension so it
> will work with both old and new kernel-doc script output
> * a fix to a doc comment markup error that I noticed while comparing
> the HTML output from the two versions of the script
> * import the new Python script, unmodified from the kernel's version
> (conveniently the kernel calls it kernel-doc.py, so it doesn't
> clash with the existing script)
> * make the changes to that library code that correspond to the
> two local QEMU-specific changes we carry
> * tell sphinx to use the Python version
> * delete the Perl script (I have put a diff of our local mods
> to the Perl script in the commit message of this commit, for
> posterity)
>
> The diffstat looks big, but almost all of it is "import the
> kernel's new script that we trust and don't need to review in
> detail" and "delete the old script".
Given Mauro is somewhat active in qemu as well, +CC for information
if nothing else.
Jonathan
>
> My immediate motivation for doing this update is that I noticed
> that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
> is using a Perl that complains about a construct in the perl script,
> which prompted me to check if the kernel folks had already fixed
> it, which it turned out that they had, by rewriting the whole thing :-)
> More generally, if we don't do this update, then we're effectively
> going to drift down the same path we did with checkpatch.pl, where
> we have our own version that diverges from the kernel's version
> and we have to maintain it ourselves.
>
> We should also update the Sphinx plugin itself (i.e.
> docs/sphinx/kerneldoc.py), but because I did not need to do
> that to update the main kernel-doc script, I have left that as
> a separate todo item.
>
> Testing
> -------
>
> I looked at the HTML output of the old kernel-doc script versus the
> new one, using the following diff command which mechanically excludes
> a couple of "same minor change" everywhere diffs, and eyeballing the
> resulting ~150 lines of diff.
>
> diff -w -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
>
> The HTML changes are:
>
> (1) some paras now have ID tags, eg:
> -<p><strong>Functions operating on arrays of bits</strong></p>
> +<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
>
> (2) Some extra named <div>s, eg:
> +<div class="kernelindent docutils container">
> <p><strong>Parameters</strong></p>
> <dl class="simple">
> <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
> @@ -144,12 +145,14 @@
> <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
> </dd>
> </dl>
> +</div>
>
> (3) The new version correctly parses the multi-line Return: block for
> the memory_translate_iotlb() doc comment. You can see that the
> old HTML here had dt/dd markup, and it mis-renders in the HTML at
> https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
>
> <p><strong>Return</strong></p>
> -<dl class="simple">
> -<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr. The MemoryRegion must not be
> accessed after rcu_read_unlock.
> +<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
> +addr. The MemoryRegion must not be accessed after rcu_read_unlock.
> On failure, return NULL, setting <strong>errp</strong> with error.</p>
> -</dd>
> -</dl>
> +</div>
>
> "Definition" sections now get output with a trailing colon:
>
> -<p><strong>Definition</strong></p>
> +<div class="kernelindent docutils container">
> +<p><strong>Definition</strong>:</p>
>
> This seems like it might be a bug in kernel-doc since the Parameters,
> Return, etc sections don't get the trailing colon. I don't think it's
> important enough to worry about.
>
> thanks
> -- PMM
>
> Peter Maydell (8):
> docs/sphinx/kerneldoc.py: Handle new LINENO syntax
> tests/qtest/libqtest.h: Remove stray space from doc comment
> scripts: Import Python kerneldoc from Linux kernel
> scripts/kernel-doc: strip QEMU_ from function definitions
> scripts/kernel-doc: tweak for QEMU coding standards
> scripts/kerneldoc: Switch to the Python kernel-doc script
> scripts/kernel-doc: Delete the old Perl kernel-doc script
> MAINTAINERS: Put kernel-doc under the "docs build machinery" section
>
> MAINTAINERS | 2 +
> docs/conf.py | 4 +-
> docs/sphinx/kerneldoc.py | 7 +-
> tests/qtest/libqtest.h | 2 +-
> .editorconfig | 2 +-
> scripts/kernel-doc | 2442 -------------------------------
> scripts/kernel-doc.py | 325 ++++
> scripts/lib/kdoc/kdoc_files.py | 291 ++++
> scripts/lib/kdoc/kdoc_item.py | 42 +
> scripts/lib/kdoc/kdoc_output.py | 749 ++++++++++
> scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
> scripts/lib/kdoc/kdoc_re.py | 270 ++++
> 12 files changed, 3355 insertions(+), 2451 deletions(-)
> delete mode 100755 scripts/kernel-doc
> create mode 100755 scripts/kernel-doc.py
> create mode 100644 scripts/lib/kdoc/kdoc_files.py
> create mode 100644 scripts/lib/kdoc/kdoc_item.py
> create mode 100644 scripts/lib/kdoc/kdoc_output.py
> create mode 100644 scripts/lib/kdoc/kdoc_parser.py
> create mode 100644 scripts/lib/kdoc/kdoc_re.py
>
* Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one
2025-08-15 9:11 ` [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Jonathan Cameron via
@ 2025-08-15 9:39 ` Mauro Carvalho Chehab
2025-08-15 10:10 ` Peter Maydell
0 siblings, 1 reply; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 9:39 UTC (permalink / raw)
To: Jonathan Cameron; +Cc: Peter Maydell, qemu-devel, Paolo Bonzini, John Snow
Hi Peter/Jonathan,
On Fri, 15 Aug 2025 10:11:09 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> On Thu, 14 Aug 2025 18:13:15 +0100
> Peter Maydell <peter.maydell@linaro.org> wrote:
>
> > Earlier this year, the Linux kernel's kernel-doc script was rewritten
> > from the old Perl version into a shiny and hopefully more maintainable
> > Python version. This commit series updates our copy of this script
> > to the latest kernel version. I have tested it by comparing the
> > generated HTML documentation and checking that there are no
> > unexpected changes.
Nice! Yeah, I had a branch here doing something similar for QEMU,
but got sidetracked by other things and didn't have time to address
a couple of issues. I'm glad you found the time for it.
> > Luckily we are carrying very few local modifications to the Perl
> > script, so this is fairly straightforward. The structure of the
> > patchset is:
> > * a minor update to the kerneldoc.py Sphinx extension so it
> > will work with both old and new kernel-doc script output
> > * a fix to a doc comment markup error that I noticed while comparing
> > the HTML output from the two versions of the script
> > * import the new Python script, unmodified from the kernel's version
> > (conveniently the kernel calls it kernel-doc.py, so it doesn't
> > clash with the existing script)
> > * make the changes to that library code that correspond to the
> > two local QEMU-specific changes we carry
To make this easier to maintain and keep in sync with kernel upstream,
perhaps we could change the upstream code so that QEMU can provide a
class override for the kdoc parser. QEMU could then simply sync with
Linux upstream while keeping its own set of rules in a separate file.
An RFC along those lines would be welcome. Otherwise, I'll try to spare
some time to think of a good way to do that.
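To illustrate the idea, here is a rough sketch of what such an override
could look like. Everything in it is hypothetical: BaseParser,
fixup_prototype() and parse_prototype() are stand-ins invented for this
example, not the actual kdoc_parser API, and the QEMU_* stripping rule is
only an assumed approximation of the local QEMU change.

```python
import re


class BaseParser:
    """Stand-in for an upstream parser class (hypothetical, not the kdoc API)."""

    def fixup_prototype(self, proto):
        """Hook that a project-specific subclass may override."""
        return proto

    def parse_prototype(self, proto):
        """Split a simple function prototype into (return type, name)."""
        proto = self.fixup_prototype(proto)
        match = re.match(r"^(.+?)\s+(\w+)\s*\(", proto)
        if not match:
            return None
        return match.group(1), match.group(2)


class QemuParser(BaseParser):
    """Project rules kept in a separate file from the synced upstream code."""

    def fixup_prototype(self, proto):
        # Assumed QEMU rule, for illustration only: drop leading QEMU_* macros.
        return re.sub(r"^(?:QEMU_[A-Z_]+\s+)+", "", proto)


if __name__ == "__main__":
    proto = "QEMU_WARN_UNUSED_RESULT int foo(void)"
    print(BaseParser().parse_prototype(proto))
    print(QemuParser().parse_prototype(proto))
```

With a hook like this, the upstream files could be synced verbatim and
only the small subclass would carry the project-specific rules.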
> > * tell sphinx to use the Python version
> > * delete the Perl script (I have put a diff of our local mods
> > to the Perl script in the commit message of this commit, for
> > posterity)
> >
> > The diffstat looks big, but almost all of it is "import the
> > kernel's new script that we trust and don't need to review in
> > detail" and "delete the old script".
One thing worth noting is that Jonathan Corbet is currently doing
several cleanups to the Python script: simplifying some regular
expressions, avoiding them where str.replace() does the job, and
adding comments. The end goal is to make the code easier for developers
to understand and help maintain.
So, it is probably worth backporting the Linux upstream changes after
the end of the kernel 6.17 cycle.
>
> Given Mauro is somewhat active in qemu as well, +CC for information
> if nothing else.
>
> Jonathan
>
>
>
> >
> > My immediate motivation for doing this update is that I noticed
> > that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
> > is using a Perl that complains about a construct in the perl script,
> > which prompted me to check if the kernel folks had already fixed
> > it, which it turned out that they had, by rewriting the whole thing :-)
> > More generally, if we don't do this update, then we're effectively
> > going to drift down the same path we did with checkpatch.pl, where
> > we have our own version that diverges from the kernel's version
> > and we have to maintain it ourselves.
> >
> > We should also update the Sphinx plugin itself (i.e.
> > docs/sphinx/kerneldoc.py), but because I did not need to do
> > that to update the main kernel-doc script, I have left that as
> > a separate todo item.
The kernel Sphinx plugin after the change is IMHO (*) a lot cleaner
than before, and handles kernel-doc warnings better, as they now
go through the Sphinx logger class.
(*) I'm a little bit biased when talking about it, as I made the
changes there too ;-)
-
Btw, one important point to note: if you picked the latest version
of kernel-doc, it currently requires at least Python 3.6 (3.7 is the
recommended minimum). It does check that: with Python < 3.6 it logs a
warning and bails out with exit status 0.
With Python 3.6, it emits a warning, as the parameter order for
structs and functions might not match the original order: the script
assumes the 3.7+ dict behavior where insertion order is preserved.
So, in the QEMU build instructions, I would add a note asking for at
least Python 3.7 to build the docs.
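As a minimal sketch of the version rules described above (the function
name and the status strings are made up for this example; the thresholds
are the ones the script itself checks):

```python
import sys


def kernel_doc_python_status(version=None):
    """Classify a (major, minor) Python version against kernel-doc's checks."""
    ver = tuple(version or sys.version_info[:2])
    if ver < (3, 6):
        # kernel-doc logs a warning and exits with status 0
        return "unsupported"
    if ver < (3, 7):
        # kernel-doc runs, but relies on dict insertion order being preserved
        return "degraded"
    return "ok"


if __name__ == "__main__":
    print(kernel_doc_python_status())
```

A docs build could use a check like this to refuse to run, or to print a
clear message, when the interpreter is older than 3.7.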
> >
> > Testing
> > -------
> >
> > I looked at the HTML output of the old kernel-doc script versus the
> > new one, using the following diff command which mechanically excludes
> > a couple of "same minor change" everywhere diffs, and eyeballing the
> > resulting ~150 lines of diff.
> >
> > diff -w -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
> >
> > The HTML changes are:
> >
> > (1) some paras now have ID tags, eg:
> > -<p><strong>Functions operating on arrays of bits</strong></p>
> > +<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
> >
> > (2) Some extra named <div>s, eg:
> > +<div class="kernelindent docutils container">
> > <p><strong>Parameters</strong></p>
> > <dl class="simple">
> > <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
> > @@ -144,12 +145,14 @@
> > <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
> > </dd>
> > </dl>
> > +</div>
> >
> > (3) The new version correctly parses the multi-line Return: block for
> > the memory_translate_iotlb() doc comment. You can see that the
> > old HTML here had dt/dd markup, and it mis-renders in the HTML at
> > https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
> >
> > <p><strong>Return</strong></p>
> > -<dl class="simple">
> > -<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr. The MemoryRegion must not be
> > accessed after rcu_read_unlock.
> > +<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
> > +addr. The MemoryRegion must not be accessed after rcu_read_unlock.
> > On failure, return NULL, setting <strong>errp</strong> with error.</p>
> > -</dd>
> > -</dl>
> > +</div>
> >
> > "Definition" sections now get output with a trailing colon:
> >
> > -<p><strong>Definition</strong></p>
> > +<div class="kernelindent docutils container">
> > +<p><strong>Definition</strong>:</p>
> >
> > This seems like it might be a bug in kernel-doc since the Parameters,
> > Return, etc sections don't get the trailing colon. I don't think it's
> > important enough to worry about.
> >
> > thanks
> > -- PMM
> >
> > Peter Maydell (8):
> > docs/sphinx/kerneldoc.py: Handle new LINENO syntax
> > tests/qtest/libqtest.h: Remove stray space from doc comment
> > scripts: Import Python kerneldoc from Linux kernel
> > scripts/kernel-doc: strip QEMU_ from function definitions
> > scripts/kernel-doc: tweak for QEMU coding standards
> > scripts/kerneldoc: Switch to the Python kernel-doc script
> > scripts/kernel-doc: Delete the old Perl kernel-doc script
> > MAINTAINERS: Put kernel-doc under the "docs build machinery" section
I'll review the actual patches later.
> >
> > MAINTAINERS | 2 +
> > docs/conf.py | 4 +-
> > docs/sphinx/kerneldoc.py | 7 +-
> > tests/qtest/libqtest.h | 2 +-
> > .editorconfig | 2 +-
> > scripts/kernel-doc | 2442 -------------------------------
> > scripts/kernel-doc.py | 325 ++++
> > scripts/lib/kdoc/kdoc_files.py | 291 ++++
> > scripts/lib/kdoc/kdoc_item.py | 42 +
> > scripts/lib/kdoc/kdoc_output.py | 749 ++++++++++
> > scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
> > scripts/lib/kdoc/kdoc_re.py | 270 ++++
> > 12 files changed, 3355 insertions(+), 2451 deletions(-)
> > delete mode 100755 scripts/kernel-doc
> > create mode 100755 scripts/kernel-doc.py
> > create mode 100644 scripts/lib/kdoc/kdoc_files.py
> > create mode 100644 scripts/lib/kdoc/kdoc_item.py
> > create mode 100644 scripts/lib/kdoc/kdoc_output.py
> > create mode 100644 scripts/lib/kdoc/kdoc_parser.py
> > create mode 100644 scripts/lib/kdoc/kdoc_re.py
> >
>
Thanks,
Mauro
* Re: [PATCH for-10.2 1/8] docs/sphinx/kerneldoc.py: Handle new LINENO syntax
2025-08-14 17:13 ` [PATCH for-10.2 1/8] docs/sphinx/kerneldoc.py: Handle new LINENO syntax Peter Maydell
@ 2025-08-15 9:49 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 9:49 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow
On Thu, 14 Aug 2025 18:13:16 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:
> The new upstream kernel-doc that we plan to update to uses a different
> syntax for the LINENO directives that the Sphinx extension parses:
> instead of
> #define LINENO 86
> it has
> .. LINENO 86
>
> Update the kerneldoc.py extension to handle both syntaxes, so
> that it will work with both the old and the new kernel-doc.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
LGTM.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> docs/sphinx/kerneldoc.py | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/docs/sphinx/kerneldoc.py b/docs/sphinx/kerneldoc.py
> index 3aa972f2e89..30bb3431983 100644
> --- a/docs/sphinx/kerneldoc.py
> +++ b/docs/sphinx/kerneldoc.py
> @@ -127,7 +127,7 @@ def run(self):
> result = ViewList()
>
> lineoffset = 0;
> - line_regex = re.compile("^#define LINENO ([0-9]+)$")
> + line_regex = re.compile(r"^(?:\.\.|#define) LINENO ([0-9]+)$")
> for line in lines:
> match = line_regex.search(line)
> if match:
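For reference, a quick standalone check of the new pattern against both
directive syntaxes. The regex is the one from the patch; the lineno_of()
helper and the sample lines are only there for illustration and are not
part of the extension.

```python
import re

# Same pattern as in the patch: accepts both the old "#define LINENO n"
# and the new ".. LINENO n" directive.
line_regex = re.compile(r"^(?:\.\.|#define) LINENO ([0-9]+)$")


def lineno_of(line):
    """Return the number carried by a LINENO directive, or None."""
    match = line_regex.search(line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    samples = [
        "#define LINENO 86",
        ".. LINENO 86",
        ".. c:function:: void foo(void)",
    ]
    for sample in samples:
        print(repr(sample), "->", lineno_of(sample))
```

The non-capturing group keeps the line number in group 1 for both
syntaxes, so the rest of the loop does not need to change.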
Thanks,
Mauro
* Re: [PATCH for-10.2 2/8] tests/qtest/libqtest.h: Remove stray space from doc comment
2025-08-14 17:13 ` [PATCH for-10.2 2/8] tests/qtest/libqtest.h: Remove stray space from doc comment Peter Maydell
@ 2025-08-15 9:51 ` Mauro Carvalho Chehab
2025-08-15 10:14 ` Peter Maydell
0 siblings, 1 reply; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 9:51 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow
On Thu, 14 Aug 2025 18:13:17 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:
> The doc comment for qtest_cb_for_every_machine has a stray
> space at the start of its description, which makes kernel-doc
> think that this line is part of the documentation of the
> skip_old_versioned argument. The result is that the HTML
> doesn't have a "Description" section and the text is instead
> put in the wrong place.
>
> Remove the stray space.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
LGTM. Even the previous version should have handled this wrongly
(if not, that is a bug there, or perhaps QEMU was using a very old
kernel-doc.pl version).
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
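To make the failure mode concrete, here is a toy model of the rule
involved. It is a deliberately simplified sketch written for this
example, not the real kernel-doc parser: after a blank comment line, a
line with extra indentation is treated as a continuation of the last
parameter, while a normally indented line starts the description.

```python
import re


def split_comment(lines):
    """Toy model (not the real kernel-doc parser) of param vs. description."""
    params, description = {}, []
    current = None       # parameter whose text is being collected
    seen_blank = False   # True once a blank " *" line follows a parameter
    for raw in lines:
        text = re.sub(r"^\s*\*", "", raw)   # drop the leading " *"
        if text.strip() == "":
            seen_blank = True
            continue
        match = re.match(r"\s*@(\w+):\s*(.*)", text)
        if match:
            current = match.group(1)
            params[current] = match.group(2)
            seen_blank = False
        elif current and seen_blank and not text.startswith("  "):
            # Normal indentation after a blank line: the description starts.
            current = None
            description.append(text.strip())
        elif current:
            # Extra indentation, or no blank line: parameter continuation.
            params[current] += " " + text.strip()
        else:
            description.append(text.strip())
    return params, description


if __name__ == "__main__":
    header = [" * @cb: Pointer to the callback function",
              " * @skip_old_versioned: true if old types should be skipped",
              " *"]
    print(split_comment(header + [" *  Call a callback function."]))
    print(split_comment(header + [" * Call a callback function."]))
```

In the first call the text ends up attached to skip_old_versioned; in
the second it becomes the description, which is what the patch restores.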
> ---
> tests/qtest/libqtest.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tests/qtest/libqtest.h b/tests/qtest/libqtest.h
> index b3f2e7fbefd..fd27521a9c7 100644
> --- a/tests/qtest/libqtest.h
> +++ b/tests/qtest/libqtest.h
> @@ -977,7 +977,7 @@ void qtest_qmp_fds_assert_success(QTestState *qts, int *fds, size_t nfds,
> * @cb: Pointer to the callback function
> * @skip_old_versioned: true if versioned old machine types should be skipped
> *
> - *  Call a callback function for every name of all available machines.
> + * Call a callback function for every name of all available machines.
> */
> void qtest_cb_for_every_machine(void (*cb)(const char *machine),
> bool skip_old_versioned);
Thanks,
Mauro
* Re: [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel
2025-08-14 17:13 ` [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel Peter Maydell
@ 2025-08-15 10:00 ` Mauro Carvalho Chehab
2025-08-15 10:19 ` Peter Maydell
1 sibling, 0 replies; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 10:00 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow
On Thu, 14 Aug 2025 18:13:18 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:
> We last synced our copy of kerneldoc with Linux back in 2020. In the
> interim, upstream has entirely rewritten the script in Python, and
> the new Python version is split into a main script plus some
> libraries in the kernel's scripts/lib/kdoc.
>
> Import all these files. These are the versions as of kernel commit
> 0cc53520e68be, with no local changes.
I would put the commit subject here, as it makes it easier to identify
where this was forked from:
0cc53520e68b ("Merge tag 'probes-fixes-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace")
Btw, as I pointed out on patch 0/8, docs-next is bringing several
cleanups to it:
$ git diff 0cc53520e68b scripts/lib/kdoc scripts/kernel-doc.py|diffstat -p1
scripts/kernel-doc.py | 34 +-
scripts/lib/kdoc/kdoc_parser.py | 499 ++++++++++++++++++++--------------------
2 files changed, 282 insertions(+), 251 deletions(-)
Better to backport them once v6.17 is released.
> We use the same lib/kdoc/ directory as the kernel does here, so we
> can avoid having to edit the top-level script just to adjust a
> pathname, even though it is probably not the naming we would have
> picked if this was a purely QEMU script.
>
> The Sphinx conf.py still points at the Perl version of the script,
> so this Python code will not be invoked to build the docs yet.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I didn't apply it locally to check whether the files match upstream...
> ---
> scripts/kernel-doc.py | 325 ++++++
> scripts/lib/kdoc/kdoc_files.py | 291 ++++++
> scripts/lib/kdoc/kdoc_item.py | 42 +
> scripts/lib/kdoc/kdoc_output.py | 749 ++++++++++++++
> scripts/lib/kdoc/kdoc_parser.py | 1669 +++++++++++++++++++++++++++++++
> scripts/lib/kdoc/kdoc_re.py | 270 +++++
However:
$ git diff v6.15..0cc53520e68b scripts/lib/kdoc scripts/kernel-doc.py|diffstat -p1
scripts/kernel-doc.py | 325 +++++++
scripts/lib/kdoc/kdoc_files.py | 291 ++++++
scripts/lib/kdoc/kdoc_item.py | 42 +
scripts/lib/kdoc/kdoc_output.py | 749 +++++++++++++++++
scripts/lib/kdoc/kdoc_parser.py | 1669 ++++++++++++++++++++++++++++++++++++++++
scripts/lib/kdoc/kdoc_re.py | 270 ++++++
diffstat is identical. So:
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> 6 files changed, 3346 insertions(+)
> create mode 100755 scripts/kernel-doc.py
> create mode 100644 scripts/lib/kdoc/kdoc_files.py
> create mode 100644 scripts/lib/kdoc/kdoc_item.py
> create mode 100644 scripts/lib/kdoc/kdoc_output.py
> create mode 100644 scripts/lib/kdoc/kdoc_parser.py
> create mode 100644 scripts/lib/kdoc/kdoc_re.py
>
> diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
> new file mode 100755
> index 00000000000..fc3d46ef519
> --- /dev/null
> +++ b/scripts/kernel-doc.py
> @@ -0,0 +1,325 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +#
> +# pylint: disable=C0103,R0915
> +#
> +# Converted from the kernel-doc script originally written in Perl
> +# under GPLv2, copyrighted since 1998 by the following authors:
> +#
> +# Aditya Srivastava <yashsri421@gmail.com>
> +# Akira Yokosawa <akiyks@gmail.com>
> +# Alexander A. Klimov <grandmaster@al2klimov.de>
> +# Alexander Lobakin <aleksander.lobakin@intel.com>
> +# André Almeida <andrealmeid@igalia.com>
> +# Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> +# Anna-Maria Behnsen <anna-maria@linutronix.de>
> +# Armin Kuster <akuster@mvista.com>
> +# Bart Van Assche <bart.vanassche@sandisk.com>
> +# Ben Hutchings <ben@decadent.org.uk>
> +# Borislav Petkov <bbpetkov@yahoo.de>
> +# Chen-Yu Tsai <wenst@chromium.org>
> +# Coco Li <lixiaoyan@google.com>
> +# Conchúr Navid <conchur@web.de>
> +# Daniel Santos <daniel.santos@pobox.com>
> +# Danilo Cesar Lemes de Paula <danilo.cesar@collabora.co.uk>
> +# Dan Luedtke <mail@danrl.de>
> +# Donald Hunter <donald.hunter@gmail.com>
> +# Gabriel Krisman Bertazi <krisman@collabora.co.uk>
> +# Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> +# Harvey Harrison <harvey.harrison@gmail.com>
> +# Horia Geanta <horia.geanta@freescale.com>
> +# Ilya Dryomov <idryomov@gmail.com>
> +# Jakub Kicinski <kuba@kernel.org>
> +# Jani Nikula <jani.nikula@intel.com>
> +# Jason Baron <jbaron@redhat.com>
> +# Jason Gunthorpe <jgg@nvidia.com>
> +# Jérémy Bobbio <lunar@debian.org>
> +# Johannes Berg <johannes.berg@intel.com>
> +# Johannes Weiner <hannes@cmpxchg.org>
> +# Jonathan Cameron <Jonathan.Cameron@huawei.com>
> +# Jonathan Corbet <corbet@lwn.net>
> +# Jonathan Neuschäfer <j.neuschaefer@gmx.net>
> +# Kamil Rytarowski <n54@gmx.com>
> +# Kees Cook <kees@kernel.org>
> +# Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> +# Levin, Alexander (Sasha Levin) <alexander.levin@verizon.com>
> +# Linus Torvalds <torvalds@linux-foundation.org>
> +# Lucas De Marchi <lucas.demarchi@profusion.mobi>
> +# Mark Rutland <mark.rutland@arm.com>
> +# Markus Heiser <markus.heiser@darmarit.de>
> +# Martin Waitz <tali@admingilde.org>
> +# Masahiro Yamada <masahiroy@kernel.org>
> +# Matthew Wilcox <willy@infradead.org>
> +# Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> +# Michal Wajdeczko <michal.wajdeczko@intel.com>
> +# Michael Zucchi
> +# Mike Rapoport <rppt@linux.ibm.com>
> +# Niklas Söderlund <niklas.soderlund@corigine.com>
> +# Nishanth Menon <nm@ti.com>
> +# Paolo Bonzini <pbonzini@redhat.com>
> +# Pavan Kumar Linga <pavan.kumar.linga@intel.com>
> +# Pavel Pisa <pisa@cmp.felk.cvut.cz>
> +# Peter Maydell <peter.maydell@linaro.org>
> +# Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> +# Randy Dunlap <rdunlap@infradead.org>
> +# Richard Kennedy <richard@rsk.demon.co.uk>
> +# Rich Walker <rw@shadow.org.uk>
> +# Rolf Eike Beer <eike-kernel@sf-tec.de>
> +# Sakari Ailus <sakari.ailus@linux.intel.com>
> +# Silvio Fricke <silvio.fricke@gmail.com>
> +# Simon Huggins
> +# Tim Waugh <twaugh@redhat.com>
> +# Tomasz Warniełło <tomasz.warniello@gmail.com>
> +# Utkarsh Tripathi <utripathi2002@gmail.com>
> +# valdis.kletnieks@vt.edu <valdis.kletnieks@vt.edu>
> +# Vegard Nossum <vegard.nossum@oracle.com>
> +# Will Deacon <will.deacon@arm.com>
> +# Yacine Belkadi <yacine.belkadi.1@gmail.com>
> +# Yujie Liu <yujie.liu@intel.com>
> +
> +"""
> +kernel_doc
> +==========
> +
> +Print formatted kernel documentation to stdout
> +
> +Read C language source or header FILEs, extract embedded
> +documentation comments, and print formatted documentation
> +to standard output.
> +
> +The documentation comments are identified by the "/**"
> +opening comment mark.
> +
> +See Documentation/doc-guide/kernel-doc.rst for the
> +documentation comment syntax.
> +"""
> +
> +import argparse
> +import logging
> +import os
> +import sys
> +
> +# Import Python modules
> +
> +LIB_DIR = "lib/kdoc"
> +SRC_DIR = os.path.dirname(os.path.realpath(__file__))
> +
> +sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
> +
> +from kdoc_files import KernelFiles # pylint: disable=C0413
> +from kdoc_output import RestFormat, ManFormat # pylint: disable=C0413
> +
> +DESC = """
> +Read C language source or header FILEs, extract embedded documentation comments,
> +and print formatted documentation to standard output.
> +
> +The documentation comments are identified by the "/**" opening comment mark.
> +
> +See Documentation/doc-guide/kernel-doc.rst for the documentation comment syntax.
> +"""
> +
> +EXPORT_FILE_DESC = """
> +Specify an additional FILE in which to look for EXPORT_SYMBOL information.
> +
> +May be used multiple times.
> +"""
> +
> +EXPORT_DESC = """
> +Only output documentation for the symbols that have been
> +exported using EXPORT_SYMBOL() and related macros in any input
> +FILE or -export-file FILE.
> +"""
> +
> +INTERNAL_DESC = """
> +Only output documentation for the symbols that have NOT been
> +exported using EXPORT_SYMBOL() and related macros in any input
> +FILE or -export-file FILE.
> +"""
> +
> +FUNCTION_DESC = """
> +Only output documentation for the given function or DOC: section
> +title. All other functions and DOC: sections are ignored.
> +
> +May be used multiple times.
> +"""
> +
> +NOSYMBOL_DESC = """
> +Exclude the specified symbol from the output documentation.
> +
> +May be used multiple times.
> +"""
> +
> +FILES_DESC = """
> +Header and C source files to be parsed.
> +"""
> +
> +WARN_CONTENTS_BEFORE_SECTIONS_DESC = """
> +Warns if there are contents before sections (deprecated).
> +
> +This option is kept just for backward-compatibility, but it does nothing,
> +neither here nor at the original Perl script.
> +"""
> +
> +
> +class MsgFormatter(logging.Formatter):
> + """Helper class to format warnings on a similar way to kernel-doc.pl"""
> +
> + def format(self, record):
> + record.levelname = record.levelname.capitalize()
> + return logging.Formatter.format(self, record)
> +
> +def main():
> + """Main program"""
> +
> + parser = argparse.ArgumentParser(formatter_class=argparse.RawTextHelpFormatter,
> + description=DESC)
> +
> + # Normal arguments
> +
> + parser.add_argument("-v", "-verbose", "--verbose", action="store_true",
> + help="Verbose output, more warnings and other information.")
> +
> + parser.add_argument("-d", "-debug", "--debug", action="store_true",
> + help="Enable debug messages")
> +
> + parser.add_argument("-M", "-modulename", "--modulename",
> + default="Kernel API",
> + help="Allow setting a module name at the output.")
> +
> + parser.add_argument("-l", "-enable-lineno", "--enable_lineno",
> + action="store_true",
> + help="Enable line number output (only in ReST mode)")
> +
> + # Arguments to control the warning behavior
> +
> + parser.add_argument("-Wreturn", "--wreturn", action="store_true",
> + help="Warns about the lack of a return markup on functions.")
> +
> + parser.add_argument("-Wshort-desc", "-Wshort-description", "--wshort-desc",
> + action="store_true",
> + help="Warns if initial short description is missing")
> +
> + parser.add_argument("-Wcontents-before-sections",
> + "--wcontents-before-sections", action="store_true",
> + help=WARN_CONTENTS_BEFORE_SECTIONS_DESC)
> +
> + parser.add_argument("-Wall", "--wall", action="store_true",
> + help="Enable all types of warnings")
> +
> + parser.add_argument("-Werror", "--werror", action="store_true",
> + help="Treat warnings as errors.")
> +
> + parser.add_argument("-export-file", "--export-file", action='append',
> + help=EXPORT_FILE_DESC)
> +
> + # Output format mutually-exclusive group
> +
> + out_group = parser.add_argument_group("Output format selection (mutually exclusive)")
> +
> + out_fmt = out_group.add_mutually_exclusive_group()
> +
> + out_fmt.add_argument("-m", "-man", "--man", action="store_true",
> + help="Output troff manual page format.")
> + out_fmt.add_argument("-r", "-rst", "--rst", action="store_true",
> + help="Output reStructuredText format (default).")
> + out_fmt.add_argument("-N", "-none", "--none", action="store_true",
> + help="Do not output documentation, only warnings.")
> +
> + # Output selection mutually-exclusive group
> +
> + sel_group = parser.add_argument_group("Output selection (mutually exclusive)")
> + sel_mut = sel_group.add_mutually_exclusive_group()
> +
> + sel_mut.add_argument("-e", "-export", "--export", action='store_true',
> + help=EXPORT_DESC)
> +
> + sel_mut.add_argument("-i", "-internal", "--internal", action='store_true',
> + help=INTERNAL_DESC)
> +
> + sel_mut.add_argument("-s", "-function", "--symbol", action='append',
> + help=FUNCTION_DESC)
> +
> + # Those are valid for all 3 types of filter
> + parser.add_argument("-n", "-nosymbol", "--nosymbol", action='append',
> + help=NOSYMBOL_DESC)
> +
> + parser.add_argument("-D", "-no-doc-sections", "--no-doc-sections",
> + action='store_true', help="Don't outputt DOC sections")
> +
> + parser.add_argument("files", metavar="FILE",
> + nargs="+", help=FILES_DESC)
> +
> + args = parser.parse_args()
> +
> + if args.wall:
> + args.wreturn = True
> + args.wshort_desc = True
> + args.wcontents_before_sections = True
> +
> + logger = logging.getLogger()
> +
> + if not args.debug:
> + logger.setLevel(logging.INFO)
> + else:
> + logger.setLevel(logging.DEBUG)
> +
> + formatter = MsgFormatter('%(levelname)s: %(message)s')
> +
> + handler = logging.StreamHandler()
> + handler.setFormatter(formatter)
> +
> + logger.addHandler(handler)
> +
> + python_ver = sys.version_info[:2]
> + if python_ver < (3,6):
> + logger.warning("Python 3.6 or later is required by kernel-doc")
> +
> + # Return 0 here to avoid breaking compilation
> + sys.exit(0)
> +
> + if python_ver < (3,7):
> + logger.warning("Python 3.7 or later is required for correct results")
> +
> + if args.man:
> + out_style = ManFormat(modulename=args.modulename)
> + elif args.none:
> + out_style = None
> + else:
> + out_style = RestFormat()
> +
> + kfiles = KernelFiles(verbose=args.verbose,
> + out_style=out_style, werror=args.werror,
> + wreturn=args.wreturn, wshort_desc=args.wshort_desc,
> + wcontents_before_sections=args.wcontents_before_sections)
> +
> + kfiles.parse(args.files, export_file=args.export_file)
> +
> + for t in kfiles.msg(enable_lineno=args.enable_lineno, export=args.export,
> + internal=args.internal, symbol=args.symbol,
> + nosymbol=args.nosymbol, export_file=args.export_file,
> + no_doc_sections=args.no_doc_sections):
> + msg = t[1]
> + if msg:
> + print(msg)
> +
> + error_count = kfiles.errors
> + if not error_count:
> + sys.exit(0)
> +
> + if args.werror:
> + print(f"{error_count} warnings as errors")
> + sys.exit(error_count)
> +
> + if args.verbose:
> + print(f"{error_count} errors")
> +
> + if args.none:
> + sys.exit(0)
> +
> + sys.exit(error_count)
> +
> +
> +# Call main method
> +if __name__ == "__main__":
> + main()
> diff --git a/scripts/lib/kdoc/kdoc_files.py b/scripts/lib/kdoc/kdoc_files.py
> new file mode 100644
> index 00000000000..9e09b45b02f
> --- /dev/null
> +++ b/scripts/lib/kdoc/kdoc_files.py
> @@ -0,0 +1,291 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +#
> +# pylint: disable=R0903,R0913,R0914,R0917
> +
> +"""
> +Parse lernel-doc tags on multiple kernel source files.
> +"""
> +
> +import argparse
> +import logging
> +import os
> +import re
> +
> +from kdoc_parser import KernelDoc
> +from kdoc_output import OutputFormat
> +
> +
> +class GlobSourceFiles:
> + """
> + Parse C source code file names and directories via an Interactor.
> + """
> +
> + def __init__(self, srctree=None, valid_extensions=None):
> + """
> + Initialize valid extensions with a tuple.
> +
> + If not defined, assume default C extensions (.c and .h)
> +
> + It would be possible to use python's glob function, but it is
> + very slow, and it is not interactive. So, it would wait to read all
> + directories before actually do something.
> +
> + So, let's use our own implementation.
> + """
> +
> + if not valid_extensions:
> + self.extensions = (".c", ".h")
> + else:
> + self.extensions = valid_extensions
> +
> + self.srctree = srctree
> +
> + def _parse_dir(self, dirname):
> + """Internal function to parse files recursively"""
> +
> + with os.scandir(dirname) as obj:
> + for entry in obj:
> + name = os.path.join(dirname, entry.name)
> +
> + if entry.is_dir():
> + yield from self._parse_dir(name)
> +
> + if not entry.is_file():
> + continue
> +
> + basename = os.path.basename(name)
> +
> + if not basename.endswith(self.extensions):
> + continue
> +
> + yield name
> +
> + def parse_files(self, file_list, file_not_found_cb):
> + """
> + Define an interator to parse all source files from file_list,
> + handling directories if any
> + """
> +
> + if not file_list:
> + return
> +
> + for fname in file_list:
> + if self.srctree:
> + f = os.path.join(self.srctree, fname)
> + else:
> + f = fname
> +
> + if os.path.isdir(f):
> + yield from self._parse_dir(f)
> + elif os.path.isfile(f):
> + yield f
> + elif file_not_found_cb:
> + file_not_found_cb(fname)
> +
> +
> +class KernelFiles():
> + """
> + Parse kernel-doc tags on multiple kernel source files.
> +
> + There are two type of parsers defined here:
> + - self.parse_file(): parses both kernel-doc markups and
> + EXPORT_SYMBOL* macros;
> + - self.process_export_file(): parses only EXPORT_SYMBOL* macros.
> + """
> +
> + def warning(self, msg):
> + """Ancillary routine to output a warning and increment error count"""
> +
> + self.config.log.warning(msg)
> + self.errors += 1
> +
> + def error(self, msg):
> + """Ancillary routine to output an error and increment error count"""
> +
> + self.config.log.error(msg)
> + self.errors += 1
> +
> + def parse_file(self, fname):
> + """
> + Parse a single Kernel source.
> + """
> +
> + # Prevent parsing the same file twice if results are cached
> + if fname in self.files:
> + return
> +
> + doc = KernelDoc(self.config, fname)
> + export_table, entries = doc.parse_kdoc()
> +
> + self.export_table[fname] = export_table
> +
> + self.files.add(fname)
> + self.export_files.add(fname) # parse_kdoc() already check exports
> +
> + self.results[fname] = entries
> +
> + def process_export_file(self, fname):
> + """
> + Parses EXPORT_SYMBOL* macros from a single Kernel source file.
> + """
> +
> + # Prevent parsing the same file twice if results are cached
> + if fname in self.export_files:
> + return
> +
> + doc = KernelDoc(self.config, fname)
> + export_table = doc.parse_export()
> +
> + if not export_table:
> + self.error(f"Error: Cannot check EXPORT_SYMBOL* on {fname}")
> + export_table = set()
> +
> + self.export_table[fname] = export_table
> + self.export_files.add(fname)
> +
> + def file_not_found_cb(self, fname):
> + """
> + Callback to warn if a file was not found.
> + """
> +
> + self.error(f"Cannot find file {fname}")
> +
> + def __init__(self, verbose=False, out_style=None,
> + werror=False, wreturn=False, wshort_desc=False,
> + wcontents_before_sections=False,
> + logger=None):
> + """
> + Initialize startup variables and parse all files
> + """
> +
> + if not verbose:
> + verbose = bool(os.environ.get("KBUILD_VERBOSE", 0))
> +
> + if out_style is None:
> + out_style = OutputFormat()
> +
> + if not werror:
> + kcflags = os.environ.get("KCFLAGS", None)
> + if kcflags:
> + match = re.search(r"(\s|^)-Werror(\s|$)/", kcflags)
> + if match:
> + werror = True
> +
> + # reading this variable is for backwards compat just in case
> + # someone was calling it with the variable from outside the
> + # kernel's build system
> + kdoc_werror = os.environ.get("KDOC_WERROR", None)
> + if kdoc_werror:
> + werror = kdoc_werror
> +
> + # Some variables are global to the parser logic as a whole as they are
> + # used to send control configuration to KernelDoc class. As such,
> + # those variables are read-only inside the KernelDoc.
> + self.config = argparse.Namespace
> +
> + self.config.verbose = verbose
> + self.config.werror = werror
> + self.config.wreturn = wreturn
> + self.config.wshort_desc = wshort_desc
> + self.config.wcontents_before_sections = wcontents_before_sections
> +
> + if not logger:
> + self.config.log = logging.getLogger("kernel-doc")
> + else:
> + self.config.log = logger
> +
> + self.config.warning = self.warning
> +
> + self.config.src_tree = os.environ.get("SRCTREE", None)
> +
> + # Initialize variables that are internal to KernelFiles
> +
> + self.out_style = out_style
> +
> + self.errors = 0
> + self.results = {}
> +
> + self.files = set()
> + self.export_files = set()
> + self.export_table = {}
> +
> + def parse(self, file_list, export_file=None):
> + """
> + Parse all files
> + """
> +
> + glob = GlobSourceFiles(srctree=self.config.src_tree)
> +
> + for fname in glob.parse_files(file_list, self.file_not_found_cb):
> + self.parse_file(fname)
> +
> + for fname in glob.parse_files(export_file, self.file_not_found_cb):
> + self.process_export_file(fname)
> +
> + def out_msg(self, fname, name, arg):
> + """
> + Return output messages from a file name using the output style
> + filtering.
> +
> + If output type was not handled by the syler, return None.
> + """
> +
> + # NOTE: we can add rules here to filter out unwanted parts,
> + # although OutputFormat.msg already does that.
> +
> + return self.out_style.msg(fname, name, arg)
> +
> + def msg(self, enable_lineno=False, export=False, internal=False,
> + symbol=None, nosymbol=None, no_doc_sections=False,
> + filenames=None, export_file=None):
> + """
> + Iterates over the kernel-doc results and outputs messages,
> + returning kernel-doc markup on each iteration
> + """
> +
> + self.out_style.set_config(self.config)
> +
> + if not filenames:
> + filenames = sorted(self.results.keys())
> +
> + glob = GlobSourceFiles(srctree=self.config.src_tree)
> +
> + for fname in filenames:
> + function_table = set()
> +
> + if internal or export:
> + if not export_file:
> + export_file = [fname]
> +
> + for f in glob.parse_files(export_file, self.file_not_found_cb):
> + function_table |= self.export_table[f]
> +
> + if symbol:
> + for s in symbol:
> + function_table.add(s)
> +
> + self.out_style.set_filter(export, internal, symbol, nosymbol,
> + function_table, enable_lineno,
> + no_doc_sections)
> +
> + msg = ""
> + if fname not in self.results:
> + self.config.log.warning("No kernel-doc for file %s", fname)
> + continue
> +
> + for arg in self.results[fname]:
> + m = self.out_msg(fname, arg.name, arg)
> +
> + if m is None:
> + ln = arg.get("ln", 0)
> + dtype = arg.get('type', "")
> +
> + self.config.log.warning("%s:%d Can't handle %s",
> + fname, ln, dtype)
> + else:
> + msg += m
> +
> + if msg:
> + yield fname, msg
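For anyone wiring this into the Sphinx extension: msg() is a generator that yields one (fname, text) pair per file, and files that produce no text are skipped. A rough stand-in for that loop, assuming a hypothetical render() callback in place of out_style.msg() and plain lists in place of KdocItem objects:

```python
def iter_messages(results, render):
    """Yield (fname, text) per file, mirroring the shape of the loop in msg()."""
    for fname in sorted(results):
        text = ""
        for item in results[fname]:
            rendered = render(item)
            if rendered is None:
                # The real code logs a "Can't handle" warning here.
                continue
            text += rendered
        if text:
            yield fname, text


# Hypothetical data: two files with entries, one file with none.
results = {"b.h": ["foo"], "a.h": ["bar", None], "empty.h": []}
output = list(iter_messages(
    results, lambda item: item and f".. c:function:: {item}\n"))
```

The caller only ever sees files that have something to print, in sorted order when no explicit file list is given.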
> diff --git a/scripts/lib/kdoc/kdoc_item.py b/scripts/lib/kdoc/kdoc_item.py
> new file mode 100644
> index 00000000000..b3b22576455
> --- /dev/null
> +++ b/scripts/lib/kdoc/kdoc_item.py
> @@ -0,0 +1,42 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# A class that will, eventually, encapsulate all of the parsed data that we
> +# then pass into the output modules.
> +#
> +
> +class KdocItem:
> + def __init__(self, name, type, start_line, **other_stuff):
> + self.name = name
> + self.type = type
> + self.declaration_start_line = start_line
> + self.sections = {}
> + self.section_start_lines = {}
> + self.parameterlist = []
> + self.parameterdesc_start_lines = {}
> + self.parameterdescs = {}
> + self.parametertypes = {}
> + #
> + # Just save everything else into our own dict so that the output
> + # side can grab it directly as before. As we move things into more
> + # structured data, this will, hopefully, fade away.
> + #
> + self.other_stuff = other_stuff
> +
> + def get(self, key, default = None):
> + return self.other_stuff.get(key, default)
> +
> + def __getitem__(self, key):
> + return self.get(key)
> +
> + #
> + # Tracking of section and parameter information.
> + #
> + def set_sections(self, sections, start_lines):
> + self.sections = sections
> + self.section_start_lines = start_lines
> +
> + def set_params(self, names, descs, types, starts):
> + self.parameterlist = names
> + self.parameterdescs = descs
> + self.parametertypes = types
> + self.parameterdesc_start_lines = starts
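Worth keeping in mind when reading the output classes: item[key] goes through get(), so a missing key gives None instead of raising KeyError. A trimmed stand-in (the Item class below is illustrative and keeps only the other_stuff lookup):

```python
class Item:
    """Trimmed stand-in for KdocItem: only the other_stuff lookup."""

    def __init__(self, name, **other_stuff):
        self.name = name
        self.other_stuff = other_stuff

    def get(self, key, default=None):
        return self.other_stuff.get(key, default)

    def __getitem__(self, key):
        return self.get(key)


item = Item("foo", purpose="do a thing")
present = item["purpose"]
missing = item["functiontype"]          # None, not KeyError
line = f"{item.name} - {item['nope']}"  # the None ends up in the text
```

So an f-string that indexes a key the parser did not set will print the literal text "None" rather than fail.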
> diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
> new file mode 100644
> index 00000000000..ea8914537ba
> --- /dev/null
> +++ b/scripts/lib/kdoc/kdoc_output.py
> @@ -0,0 +1,749 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +#
> +# pylint: disable=C0301,R0902,R0911,R0912,R0913,R0914,R0915,R0917
> +
> +"""
> +Implement output filters to print kernel-doc documentation.
> +
> +The implementation uses a virtual base class (OutputFormat) which
> +contains a dispatcher to virtual methods, and some code to filter
> +out output messages.
> +
> +The actual implementation is done in a separate class for each type
> +of output. Currently, there are output classes for ReST and man/troff.
> +"""
> +
> +import os
> +import re
> +from datetime import datetime
> +
> +from kdoc_parser import KernelDoc, type_param
> +from kdoc_re import KernRe
> +
> +
> +function_pointer = KernRe(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)", cache=False)
> +
> +# match expressions used to find embedded type information
> +type_constant = KernRe(r"\b``([^\`]+)``\b", cache=False)
> +type_constant2 = KernRe(r"\%([-_*\w]+)", cache=False)
> +type_func = KernRe(r"(\w+)\(\)", cache=False)
> +type_param_ref = KernRe(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
> +
> +# Special RST handling for func ptr params
> +type_fp_param = KernRe(r"\@(\w+)\(\)", cache=False)
> +
> +# Special RST handling for structs with func ptr params
> +type_fp_param2 = KernRe(r"\@(\w+->\S+)\(\)", cache=False)
> +
> +type_env = KernRe(r"(\$\w+)", cache=False)
> +type_enum = KernRe(r"\&(enum\s*([_\w]+))", cache=False)
> +type_struct = KernRe(r"\&(struct\s*([_\w]+))", cache=False)
> +type_typedef = KernRe(r"\&(typedef\s*([_\w]+))", cache=False)
> +type_union = KernRe(r"\&(union\s*([_\w]+))", cache=False)
> +type_member = KernRe(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
> +type_fallback = KernRe(r"\&([_\w]+)", cache=False)
> +type_member_func = type_member + KernRe(r"\(\)", cache=False)
> +
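To make the markup patterns above a little more concrete, here is a small sketch using plain re instead of KernRe (I am assuming KernRe is a thin wrapper around re for this purpose). The replacement strings are the ones RestFormat uses further down; the sample sentence is made up:

```python
import re

# Same patterns as in the patch, compiled with plain re.
type_constant2 = re.compile(r"\%([-_*\w]+)")
type_struct = re.compile(r"\&(struct\s*([_\w]+))")
type_param_ref = re.compile(r"([\!~\*]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)")

text = "Fills &struct foo from @bar, or returns %NULL."

# Apply the substitutions in the same relative order as RestFormat.highlights.
text = type_constant2.sub(r"``\1``", text)
text = type_struct.sub(r":c:type:`\1 <\2>`", text)
text = type_param_ref.sub(r"**\1\2**", text)
```

The result is "Fills :c:type:`struct foo <foo>` from **bar**, or returns ``NULL``."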
> +
> +class OutputFormat:
> + """
> + Base class for OutputFormat. If used as-is, it means that only
> + warnings will be displayed.
> + """
> +
> + # output mode.
> + OUTPUT_ALL = 0 # output all symbols and doc sections
> + OUTPUT_INCLUDE = 1 # output only specified symbols
> + OUTPUT_EXPORTED = 2 # output exported symbols
> + OUTPUT_INTERNAL = 3 # output non-exported symbols
> +
> + # Virtual member to be overridden in the inherited classes
> + highlights = []
> +
> + def __init__(self):
> + """Declare internal vars and set mode to OUTPUT_ALL"""
> +
> + self.out_mode = self.OUTPUT_ALL
> + self.enable_lineno = None
> + self.nosymbol = {}
> + self.symbol = None
> + self.function_table = None
> + self.config = None
> + self.no_doc_sections = False
> +
> + self.data = ""
> +
> + def set_config(self, config):
> + """
> + Set up global config variables used by both parser and output.
> + """
> +
> + self.config = config
> +
> + def set_filter(self, export, internal, symbol, nosymbol, function_table,
> + enable_lineno, no_doc_sections):
> + """
> + Initialize filter variables according to the requested mode.
> +
> + Only one choice is valid between export, internal and symbol.
> +
> + The nosymbol filter can be used on all modes.
> + """
> +
> + self.enable_lineno = enable_lineno
> + self.no_doc_sections = no_doc_sections
> + self.function_table = function_table
> +
> + if symbol:
> + self.out_mode = self.OUTPUT_INCLUDE
> + elif export:
> + self.out_mode = self.OUTPUT_EXPORTED
> + elif internal:
> + self.out_mode = self.OUTPUT_INTERNAL
> + else:
> + self.out_mode = self.OUTPUT_ALL
> +
> + if nosymbol:
> + self.nosymbol = set(nosymbol)
> +
> +
> + def highlight_block(self, block):
> + """
> + Apply the RST highlights to a sub-block of text.
> + """
> +
> + for r, sub in self.highlights:
> + block = r.sub(sub, block)
> +
> + return block
> +
> + def out_warnings(self, args):
> + """
> + Output warnings for identifiers that will be displayed.
> + """
> +
> + for log_msg in args.warnings:
> + self.config.warning(log_msg)
> +
> + def check_doc(self, name, args):
> + """Check if DOC should be output"""
> +
> + if self.no_doc_sections:
> + return False
> +
> + if name in self.nosymbol:
> + return False
> +
> + if self.out_mode == self.OUTPUT_ALL:
> + self.out_warnings(args)
> + return True
> +
> + if self.out_mode == self.OUTPUT_INCLUDE:
> + if name in self.function_table:
> + self.out_warnings(args)
> + return True
> +
> + return False
> +
> + def check_declaration(self, dtype, name, args):
> + """
> + Checks if a declaration should be output or not based on the
> + filtering criteria.
> + """
> +
> + if name in self.nosymbol:
> + return False
> +
> + if self.out_mode == self.OUTPUT_ALL:
> + self.out_warnings(args)
> + return True
> +
> + if self.out_mode in [self.OUTPUT_INCLUDE, self.OUTPUT_EXPORTED]:
> + if name in self.function_table:
> + return True
> +
> + if self.out_mode == self.OUTPUT_INTERNAL:
> + if dtype != "function":
> + self.out_warnings(args)
> + return True
> +
> + if name not in self.function_table:
> + self.out_warnings(args)
> + return True
> +
> + return False
> +
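The filtering rules in check_declaration() are easier to see as a pure function. This is a sketch only: should_output() is a hypothetical name, and it leaves out the out_warnings() side effects of the real method.

```python
OUTPUT_ALL, OUTPUT_INCLUDE, OUTPUT_EXPORTED, OUTPUT_INTERNAL = range(4)


def should_output(mode, dtype, name, function_table, nosymbol=()):
    """Return True if a declaration passes the filter (warnings omitted)."""
    if name in nosymbol:
        return False
    if mode == OUTPUT_ALL:
        return True
    if mode in (OUTPUT_INCLUDE, OUTPUT_EXPORTED):
        # Only symbols that were requested or exported.
        return name in function_table
    if mode == OUTPUT_INTERNAL:
        # Non-functions always pass; functions pass only if not exported.
        return dtype != "function" or name not in function_table
    return False


exported = {"qemu_foo"}
```

With that table, should_output(OUTPUT_EXPORTED, "function", "qemu_foo", exported) is True, while the same call with OUTPUT_INTERNAL is False; a struct passes in internal mode regardless of the table.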
> + def msg(self, fname, name, args):
> + """
> + Handles a single entry from kernel-doc parser
> + """
> +
> + self.data = ""
> +
> + dtype = args.type
> +
> + if dtype == "doc":
> + self.out_doc(fname, name, args)
> + return self.data
> +
> + if not self.check_declaration(dtype, name, args):
> + return self.data
> +
> + if dtype == "function":
> + self.out_function(fname, name, args)
> + return self.data
> +
> + if dtype == "enum":
> + self.out_enum(fname, name, args)
> + return self.data
> +
> + if dtype == "typedef":
> + self.out_typedef(fname, name, args)
> + return self.data
> +
> + if dtype in ["struct", "union"]:
> + self.out_struct(fname, name, args)
> + return self.data
> +
> + # Warn if some type lacks output logic
> + self.config.log.warning("don't know how to output '%s' block",
> + dtype)
> +
> + return None
> +
> + # Virtual methods to be overridden by inherited classes
> + # At the base class, those do nothing.
> + def out_doc(self, fname, name, args):
> + """Outputs a DOC block"""
> +
> + def out_function(self, fname, name, args):
> + """Outputs a function"""
> +
> + def out_enum(self, fname, name, args):
> + """Outputs an enum"""
> +
> + def out_typedef(self, fname, name, args):
> + """Outputs a typedef"""
> +
> + def out_struct(self, fname, name, args):
> + """Outputs a struct"""
> +
> +
> +class RestFormat(OutputFormat):
> + """Consts and functions used by ReST output"""
> +
> + highlights = [
> + (type_constant, r"``\1``"),
> + (type_constant2, r"``\1``"),
> +
> + # Note: need to escape () to avoid func matching later
> + (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"),
> + (type_member, r":c:type:`\1\2\3 <\1>`"),
> + (type_fp_param, r"**\1\\(\\)**"),
> + (type_fp_param2, r"**\1\\(\\)**"),
> + (type_func, r"\1()"),
> + (type_enum, r":c:type:`\1 <\2>`"),
> + (type_struct, r":c:type:`\1 <\2>`"),
> + (type_typedef, r":c:type:`\1 <\2>`"),
> + (type_union, r":c:type:`\1 <\2>`"),
> +
> + # in rst this can refer to any type
> + (type_fallback, r":c:type:`\1`"),
> + (type_param_ref, r"**\1\2**")
> + ]
> + blankline = "\n"
> +
> + sphinx_literal = KernRe(r'^[^.].*::$', cache=False)
> + sphinx_cblock = KernRe(r'^\.\.\ +code-block::', cache=False)
> +
> + def __init__(self):
> + """
> + Creates class variables.
> +
> + Not really mandatory, but it is a good coding style and makes
> + pylint happy.
> + """
> +
> + super().__init__()
> + self.lineprefix = ""
> +
> + def print_lineno(self, ln):
> + """Outputs a line number"""
> +
> + if self.enable_lineno and ln is not None:
> + ln += 1
> + self.data += f".. LINENO {ln}\n"
> +
> + def output_highlight(self, args):
> + """
> + Outputs a C symbol that may require being converted to ReST using
> + the self.highlights variable
> + """
> +
> + input_text = args
> + output = ""
> + in_literal = False
> + litprefix = ""
> + block = ""
> +
> + for line in input_text.strip("\n").split("\n"):
> +
> + # If we're in a literal block, see if we should drop out of it.
> + # Otherwise, pass the line straight through unmunged.
> + if in_literal:
> + if line.strip(): # If the line is not blank
> + # If this is the first non-blank line in a literal block,
> + # figure out the proper indent.
> + if not litprefix:
> + r = KernRe(r'^(\s*)')
> + if r.match(line):
> + litprefix = '^' + r.group(1)
> + else:
> + litprefix = ""
> +
> + output += line + "\n"
> + elif not KernRe(litprefix).match(line):
> + in_literal = False
> + else:
> + output += line + "\n"
> + else:
> + output += line + "\n"
> +
> + # Not in a literal block (or just dropped out)
> + if not in_literal:
> + block += line + "\n"
> + if self.sphinx_literal.match(line) or self.sphinx_cblock.match(line):
> + in_literal = True
> + litprefix = ""
> + output += self.highlight_block(block)
> + block = ""
> +
> + # Handle any remaining block
> + if block:
> + output += self.highlight_block(block)
> +
> + # Print the output with the line prefix
> + for line in output.strip("\n").split("\n"):
> + self.data += self.lineprefix + line + "\n"
> +
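The literal-block handling in output_highlight() hinges on the two class-level patterns. A quick sketch of which lines switch it into literal mode, again with plain re standing in for KernRe (starts_literal() is an illustrative helper, not part of the patch):

```python
import re

sphinx_literal = re.compile(r'^[^.].*::$')
sphinx_cblock = re.compile(r'^\.\.\ +code-block::')


def starts_literal(line):
    """True if this line would start a literal block in output_highlight()."""
    return bool(sphinx_literal.match(line) or sphinx_cblock.match(line))


checks = {
    "Example::": starts_literal("Example::"),
    ".. code-block:: c": starts_literal(".. code-block:: c"),
    ".. note::": starts_literal(".. note::"),
    "plain text": starts_literal("plain text"),
}
```

A paragraph ending in "::" and an explicit code-block directive both start a literal block; other directives such as ".. note::" do not, because the first pattern rejects lines beginning with a dot.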
> + def out_section(self, args, out_docblock=False):
> + """
> + Outputs a block section.
> +
> + This could use some work; it's used to output the DOC: sections, and
> + starts by putting out the name of the doc section itself, but that
> + tends to duplicate a header already in the template file.
> + """
> + for section, text in args.sections.items():
> + # Skip sections that are in the nosymbol_table
> + if section in self.nosymbol:
> + continue
> +
> + if out_docblock:
> + if not self.out_mode == self.OUTPUT_INCLUDE:
> + self.data += f".. _{section}:\n\n"
> + self.data += f'{self.lineprefix}**{section}**\n\n'
> + else:
> + self.data += f'{self.lineprefix}**{section}**\n\n'
> +
> + self.print_lineno(args.section_start_lines.get(section, 0))
> + self.output_highlight(text)
> + self.data += "\n"
> + self.data += "\n"
> +
> + def out_doc(self, fname, name, args):
> + if not self.check_doc(name, args):
> + return
> + self.out_section(args, out_docblock=True)
> +
> + def out_function(self, fname, name, args):
> +
> + oldprefix = self.lineprefix
> + signature = ""
> +
> + func_macro = args.get('func_macro', False)
> + if func_macro:
> + signature = name
> + else:
> + if args.get('functiontype'):
> + signature = args['functiontype'] + " "
> + signature += name + " ("
> +
> + ln = args.declaration_start_line
> + count = 0
> + for parameter in args.parameterlist:
> + if count != 0:
> + signature += ", "
> + count += 1
> + dtype = args.parametertypes.get(parameter, "")
> +
> + if function_pointer.search(dtype):
> + signature += function_pointer.group(1) + parameter + ") (" + function_pointer.group(2) + ")"
> + else:
> + signature += dtype
> +
> + if not func_macro:
> + signature += ")"
> +
> + self.print_lineno(ln)
> + if args.get('typedef') or not args.get('functiontype'):
> + self.data += f".. c:macro:: {name}\n\n"
> +
> + if args.get('typedef'):
> + self.data += " **Typedef**: "
> + self.lineprefix = ""
> + self.output_highlight(args.get('purpose', ""))
> + self.data += "\n\n**Syntax**\n\n"
> + self.data += f" ``{signature}``\n\n"
> + else:
> + self.data += f"``{signature}``\n\n"
> + else:
> + self.data += f".. c:function:: {signature}\n\n"
> +
> + if not args.get('typedef'):
> + self.print_lineno(ln)
> + self.lineprefix = " "
> + self.output_highlight(args.get('purpose', ""))
> + self.data += "\n"
> +
> + # Put descriptive text into a container (HTML <div>) to help set
> + # function prototypes apart
> + self.lineprefix = " "
> +
> + if args.parameterlist:
> + self.data += ".. container:: kernelindent\n\n"
> + self.data += f"{self.lineprefix}**Parameters**\n\n"
> +
> + for parameter in args.parameterlist:
> + parameter_name = KernRe(r'\[.*').sub('', parameter)
> + dtype = args.parametertypes.get(parameter, "")
> +
> + if dtype:
> + self.data += f"{self.lineprefix}``{dtype}``\n"
> + else:
> + self.data += f"{self.lineprefix}``{parameter}``\n"
> +
> + self.print_lineno(args.parameterdesc_start_lines.get(parameter_name, 0))
> +
> + self.lineprefix = " "
> + if parameter_name in args.parameterdescs and \
> + args.parameterdescs[parameter_name] != KernelDoc.undescribed:
> +
> + self.output_highlight(args.parameterdescs[parameter_name])
> + self.data += "\n"
> + else:
> + self.data += f"{self.lineprefix}*undescribed*\n\n"
> + self.lineprefix = " "
> +
> + self.out_section(args)
> + self.lineprefix = oldprefix
> +
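On the function-pointer branch in out_function(): the function_pointer pattern defines two capture groups, so only group(1) and group(2) are available. A sketch with plain re, using a made-up parameter type and name; the way the signature is glued together here is my assumption about the intended result, not something the patch states:

```python
import re

function_pointer = re.compile(r"([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)")

m = function_pointer.search("void (*)(int irq, void *dev)")
head, params = m.group(1), m.group(2)

# Re-insert the parameter name between the "(*" and the argument list.
signature = head + "handler" + ") (" + params + ")"

# Asking for a group the pattern does not define raises IndexError.
try:
    m.group(3)
    third_group_exists = True
except IndexError:
    third_group_exists = False
```

Here head is "void (*" and params is "int irq, void *dev", giving "void (*handler) (int irq, void *dev)".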
> + def out_enum(self, fname, name, args):
> +
> + oldprefix = self.lineprefix
> + ln = args.declaration_start_line
> +
> + self.data += f"\n\n.. c:enum:: {name}\n\n"
> +
> + self.print_lineno(ln)
> + self.lineprefix = " "
> + self.output_highlight(args.get('purpose', ''))
> + self.data += "\n"
> +
> + self.data += ".. container:: kernelindent\n\n"
> + outer = self.lineprefix + " "
> + self.lineprefix = outer + " "
> + self.data += f"{outer}**Constants**\n\n"
> +
> + for parameter in args.parameterlist:
> + self.data += f"{outer}``{parameter}``\n"
> +
> + if args.parameterdescs.get(parameter, '') != KernelDoc.undescribed:
> + self.output_highlight(args.parameterdescs[parameter])
> + else:
> + self.data += f"{self.lineprefix}*undescribed*\n\n"
> + self.data += "\n"
> +
> + self.lineprefix = oldprefix
> + self.out_section(args)
> +
> + def out_typedef(self, fname, name, args):
> +
> + oldprefix = self.lineprefix
> + ln = args.declaration_start_line
> +
> + self.data += f"\n\n.. c:type:: {name}\n\n"
> +
> + self.print_lineno(ln)
> + self.lineprefix = " "
> +
> + self.output_highlight(args.get('purpose', ''))
> +
> + self.data += "\n"
> +
> + self.lineprefix = oldprefix
> + self.out_section(args)
> +
> + def out_struct(self, fname, name, args):
> +
> + purpose = args.get('purpose', "")
> + declaration = args.get('definition', "")
> + dtype = args.type
> + ln = args.declaration_start_line
> +
> + self.data += f"\n\n.. c:{dtype}:: {name}\n\n"
> +
> + self.print_lineno(ln)
> +
> + oldprefix = self.lineprefix
> + self.lineprefix += " "
> +
> + self.output_highlight(purpose)
> + self.data += "\n"
> +
> + self.data += ".. container:: kernelindent\n\n"
> + self.data += f"{self.lineprefix}**Definition**::\n\n"
> +
> + self.lineprefix = self.lineprefix + " "
> +
> + declaration = declaration.replace("\t", self.lineprefix)
> +
> + self.data += f"{self.lineprefix}{dtype} {name}" + ' {' + "\n"
> + self.data += f"{declaration}{self.lineprefix}" + "};\n\n"
> +
> + self.lineprefix = " "
> + self.data += f"{self.lineprefix}**Members**\n\n"
> + for parameter in args.parameterlist:
> + if not parameter or parameter.startswith("#"):
> + continue
> +
> + parameter_name = parameter.split("[", maxsplit=1)[0]
> +
> + if args.parameterdescs.get(parameter_name) == KernelDoc.undescribed:
> + continue
> +
> + self.print_lineno(args.parameterdesc_start_lines.get(parameter_name, 0))
> +
> + self.data += f"{self.lineprefix}``{parameter}``\n"
> +
> + self.lineprefix = " "
> + self.output_highlight(args.parameterdescs[parameter_name])
> + self.lineprefix = " "
> +
> + self.data += "\n"
> +
> + self.data += "\n"
> +
> + self.lineprefix = oldprefix
> + self.out_section(args)
> +
> +
> +class ManFormat(OutputFormat):
> + """Consts and functions used by man pages output"""
> +
> + highlights = (
> + (type_constant, r"\1"),
> + (type_constant2, r"\1"),
> + (type_func, r"\\fB\1\\fP"),
> + (type_enum, r"\\fI\1\\fP"),
> + (type_struct, r"\\fI\1\\fP"),
> + (type_typedef, r"\\fI\1\\fP"),
> + (type_union, r"\\fI\1\\fP"),
> + (type_param, r"\\fI\1\\fP"),
> + (type_param_ref, r"\\fI\1\2\\fP"),
> + (type_member, r"\\fI\1\2\3\\fP"),
> + (type_fallback, r"\\fI\1\\fP")
> + )
> + blankline = ""
> +
> + date_formats = [
> + "%a %b %d %H:%M:%S %Z %Y",
> + "%a %b %d %H:%M:%S %Y",
> + "%Y-%m-%d",
> + "%b %d %Y",
> + "%B %d %Y",
> + "%m %d %Y",
> + ]
> +
> + def __init__(self, modulename):
> + """
> + Creates class variables.
> +
> + Not really mandatory, but it is a good coding style and makes
> + pylint happy.
> + """
> +
> + super().__init__()
> + self.modulename = modulename
> +
> + dt = None
> + tstamp = os.environ.get("KBUILD_BUILD_TIMESTAMP")
> + if tstamp:
> + for fmt in self.date_formats:
> + try:
> + dt = datetime.strptime(tstamp, fmt)
> + break
> + except ValueError:
> + pass
> +
> + if not dt:
> + dt = datetime.now()
> +
> + self.man_date = dt.strftime("%B %Y")
> +
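The timestamp handling in ManFormat.__init__() is a try-each-format loop with a fallback to the current time. A self-contained sketch with a shortened format list; the man_date() helper and its "now" parameter are hypothetical and only there to make the fallback testable:

```python
from datetime import datetime

# Subset of the formats from the patch (the %Z variant is left out here).
DATE_FORMATS = ["%a %b %d %H:%M:%S %Y", "%Y-%m-%d", "%b %d %Y"]


def man_date(tstamp, now=None):
    """Return 'Month Year' from tstamp, or from the current time if unparsable."""
    dt = None
    if tstamp:
        for fmt in DATE_FORMATS:
            try:
                dt = datetime.strptime(tstamp, fmt)
                break
            except ValueError:
                pass
    if not dt:
        dt = now or datetime.now()
    return dt.strftime("%B %Y")


iso = man_date("2025-08-14")
fallback = man_date("not a date", now=datetime(2024, 1, 2))
```

We only build the ReST output, so this matters for QEMU only if someone runs the script by hand with the man format and KBUILD_BUILD_TIMESTAMP set.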
> + def output_highlight(self, block):
> + """
> + Outputs a C symbol that may require being highlighted with
> + self.highlights variable using troff syntax
> + """
> +
> + contents = self.highlight_block(block)
> +
> + if isinstance(contents, list):
> + contents = "\n".join(contents)
> +
> + for line in contents.strip("\n").split("\n"):
> + line = KernRe(r"^\s*").sub("", line)
> + if not line:
> + continue
> +
> + if line[0] == ".":
> + self.data += "\\&" + line + "\n"
> + else:
> + self.data += line + "\n"
> +
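The man-page output_highlight() strips leading whitespace, drops blank lines, and protects any line that starts with a dot so troff does not read it as a request. A sketch of just that part, with the highlight substitutions left out (troff_lines() is an illustrative name):

```python
import re


def troff_lines(text):
    """Strip indentation, skip blank lines, escape lines starting with '.'."""
    out = ""
    for line in text.strip("\n").split("\n"):
        line = re.sub(r"^\s*", "", line)
        if not line:
            continue
        if line[0] == ".":
            # "\&" is a zero-width character, so the dot is no longer first.
            out += "\\&" + line + "\n"
        else:
            out += line + "\n"
    return out


result = troff_lines("  first line\n\n.config is read here\n")
```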
> + def out_doc(self, fname, name, args):
> + if not self.check_doc(name, args):
> + return
> +
> + self.data += f'.TH "{self.modulename}" 9 "{self.modulename}" "{self.man_date}" "API Manual" LINUX' + "\n"
> +
> + for section, text in args.sections.items():
> + self.data += f'.SH "{section}"' + "\n"
> + self.output_highlight(text)
> +
> + def out_function(self, fname, name, args):
> + """output function in man"""
> +
> + self.data += f'.TH "{name}" 9 "{name}" "{self.man_date}" "Kernel Hacker\'s Manual" LINUX' + "\n"
> +
> + self.data += ".SH NAME\n"
> + self.data += f"{name} \\- {args['purpose']}\n"
> +
> + self.data += ".SH SYNOPSIS\n"
> + if args.get('functiontype', ''):
> + self.data += f'.B "{args["functiontype"]}" {name}' + "\n"
> + else:
> + self.data += f'.B "{name}' + "\n"
> +
> + count = 0
> + parenth = "("
> + post = ","
> +
> + for parameter in args.parameterlist:
> + if count == len(args.parameterlist) - 1:
> + post = ");"
> +
> + dtype = args.parametertypes.get(parameter, "")
> + if function_pointer.match(dtype):
> + # Pointer-to-function
> + self.data += f'".BI "{parenth}{function_pointer.group(1)}" " ") ({function_pointer.group(2)}){post}"' + "\n"
> + else:
> + dtype = KernRe(r'([^\*])$').sub(r'\1 ', dtype)
> +
> + self.data += f'.BI "{parenth}{dtype}" "{post}"' + "\n"
> + count += 1
> + parenth = ""
> +
> + if args.parameterlist:
> + self.data += ".SH ARGUMENTS\n"
> +
> + for parameter in args.parameterlist:
> + parameter_name = re.sub(r'\[.*', '', parameter)
> +
> + self.data += f'.IP "{parameter}" 12' + "\n"
> + self.output_highlight(args.parameterdescs.get(parameter_name, ""))
> +
> + for section, text in args.sections.items():
> + self.data += f'.SH "{section.upper()}"' + "\n"
> + self.output_highlight(text)
> +
> + def out_enum(self, fname, name, args):
> + self.data += f'.TH "{self.modulename}" 9 "enum {name}" "{self.man_date}" "API Manual" LINUX' + "\n"
> +
> + self.data += ".SH NAME\n"
> + self.data += f"enum {name} \\- {args['purpose']}\n"
> +
> + self.data += ".SH SYNOPSIS\n"
> + self.data += f"enum {name}" + " {\n"
> +
> + count = 0
> + for parameter in args.parameterlist:
> + self.data += f'.br\n.BI " {parameter}"' + "\n"
> + if count == len(args.parameterlist) - 1:
> + self.data += "\n};\n"
> + else:
> + self.data += ", \n.br\n"
> +
> + count += 1
> +
> + self.data += ".SH Constants\n"
> +
> + for parameter in args.parameterlist:
> + parameter_name = KernRe(r'\[.*').sub('', parameter)
> + self.data += f'.IP "{parameter}" 12' + "\n"
> + self.output_highlight(args.parameterdescs.get(parameter_name, ""))
> +
> + for section, text in args.sections.items():
> + self.data += f'.SH "{section}"' + "\n"
> + self.output_highlight(text)
> +
> + def out_typedef(self, fname, name, args):
> + module = self.modulename
> + purpose = args.get('purpose')
> +
> + self.data += f'.TH "{module}" 9 "{name}" "{self.man_date}" "API Manual" LINUX' + "\n"
> +
> + self.data += ".SH NAME\n"
> + self.data += f"typedef {name} \\- {purpose}\n"
> +
> + for section, text in args.sections.items():
> + self.data += f'.SH "{section}"' + "\n"
> + self.output_highlight(text)
> +
> + def out_struct(self, fname, name, args):
> + module = self.modulename
> + purpose = args.get('purpose')
> + definition = args.get('definition')
> +
> + self.data += f'.TH "{module}" 9 "{args.type} {name}" "{self.man_date}" "API Manual" LINUX' + "\n"
> +
> + self.data += ".SH NAME\n"
> + self.data += f"{args.type} {name} \\- {purpose}\n"
> +
> + # Replace tabs with two spaces and handle newlines
> + declaration = definition.replace("\t", " ")
> + declaration = KernRe(r"\n").sub('"\n.br\n.BI "', declaration)
> +
> + self.data += ".SH SYNOPSIS\n"
> + self.data += f"{args.type} {name} " + "{" + "\n.br\n"
> + self.data += f'.BI "{declaration}\n' + "};\n.br\n\n"
> +
> + self.data += ".SH Members\n"
> + for parameter in args.parameterlist:
> + if parameter.startswith("#"):
> + continue
> +
> + parameter_name = re.sub(r"\[.*", "", parameter)
> +
> + if args.parameterdescs.get(parameter_name) == KernelDoc.undescribed:
> + continue
> +
> + self.data += f'.IP "{parameter}" 12' + "\n"
> + self.output_highlight(args.parameterdescs.get(parameter_name))
> +
> + for section, text in args.sections.items():
> + self.data += f'.SH "{section}"' + "\n"
> + self.output_highlight(text)
> diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
> new file mode 100644
> index 00000000000..fe730099eca
> --- /dev/null
> +++ b/scripts/lib/kdoc/kdoc_parser.py
> @@ -0,0 +1,1669 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +#
> +# pylint: disable=C0301,C0302,R0904,R0912,R0913,R0914,R0915,R0917,R1702
> +
> +"""
> +kdoc_parser
> +===========
> +
> +Read a C language source or header FILE and extract embedded
> +documentation comments
> +"""
> +
> +import sys
> +import re
> +from pprint import pformat
> +
> +from kdoc_re import NestedMatch, KernRe
> +from kdoc_item import KdocItem
> +
> +#
> +# Regular expressions used to parse kernel-doc markups at KernelDoc class.
> +#
> +# Let's declare them in lowercase outside any class to make it easier to
> +# convert from the Perl script.
> +#
> +# As those are evaluated at the beginning, no need to cache them
> +#
> +
> +# Allow whitespace at end of comment start.
> +doc_start = KernRe(r'^/\*\*\s*$', cache=False)
> +
> +doc_end = KernRe(r'\*/', cache=False)
> +doc_com = KernRe(r'\s*\*\s*', cache=False)
> +doc_com_body = KernRe(r'\s*\* ?', cache=False)
> +doc_decl = doc_com + KernRe(r'(\w+)', cache=False)
> +
> +# @params and a strictly limited set of supported section names
> +# Specifically:
> +# Match @word:
> +# @...:
> +# @{section-name}:
> +# while trying to not match literal block starts like "example::"
> +#
> +known_section_names = 'description|context|returns?|notes?|examples?'
> +known_sections = KernRe(known_section_names, flags = re.I)
> +doc_sect = doc_com + \
> + KernRe(r'\s*(\@[.\w]+|\@\.\.\.|' + known_section_names + r')\s*:([^:].*)?$',
> + flags=re.I, cache=False)
> +
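A quick way to convince yourself what doc_sect accepts, using plain re and string concatenation in place of KernRe's "+" operator (which I am assuming simply joins the two patterns). The section_of() helper is illustrative:

```python
import re

known_section_names = 'description|context|returns?|notes?|examples?'
doc_sect = re.compile(
    r'\s*\*\s*'  # doc_com: the leading " * " of a comment line
    r'\s*(\@[.\w]+|\@\.\.\.|' + known_section_names + r')\s*:([^:].*)?$',
    flags=re.I)


def section_of(line):
    """Return the section or @param name a comment line opens, else None."""
    m = doc_sect.match(line)
    return m.group(1) if m else None


param = section_of(" * @foo: the foo argument")
ret = section_of(" * Return: 0 on success")
literal = section_of(" * example::")
other = section_of(" * Something else: text")
```

The "[^:]" after the colon is what keeps a ReST literal-block introducer such as "example::" from being taken as a section header.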
> +doc_content = doc_com_body + KernRe(r'(.*)', cache=False)
> +doc_inline_start = KernRe(r'^\s*/\*\*\s*$', cache=False)
> +doc_inline_sect = KernRe(r'\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)', cache=False)
> +doc_inline_end = KernRe(r'^\s*\*/\s*$', cache=False)
> +doc_inline_oneline = KernRe(r'^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$', cache=False)
> +attribute = KernRe(r"__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)",
> + flags=re.I | re.S, cache=False)
> +
> +export_symbol = KernRe(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*', cache=False)
> +export_symbol_ns = KernRe(r'^\s*EXPORT_SYMBOL_NS(_GPL)?\s*\(\s*(\w+)\s*,\s*"\S+"\)\s*', cache=False)
> +
> +type_param = KernRe(r"\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)", cache=False)
> +
> +#
> +# Tests for the beginning of a kerneldoc block in its various forms.
> +#
> +doc_block = doc_com + KernRe(r'DOC:\s*(.*)?', cache=False)
> +doc_begin_data = KernRe(r"^\s*\*?\s*(struct|union|enum|typedef)\b\s*(\w*)", cache = False)
> +doc_begin_func = KernRe(str(doc_com) + # initial " * "
> + r"(?:\w+\s*\*\s*)?" + # type (not captured)
> + r'(?:define\s+)?' + # possible "define" (not captured)
> + r'(\w+)\s*(?:\(\w*\))?\s*' + # name and optional "(...)"
> + r'(?:[-:].*)?$', # description (not captured)
> + cache = False)
> +
> +#
> +# A little helper to get rid of excess white space
> +#
> +multi_space = KernRe(r'\s\s+')
> +def trim_whitespace(s):
> + return multi_space.sub(' ', s.strip())
> +
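For reference, trim_whitespace() collapses runs of two or more whitespace characters into one space and strips both ends; a single tab or newline on its own is left alone. The same helper with plain re:

```python
import re

multi_space = re.compile(r'\s\s+')


def trim_whitespace(s):
    """Collapse runs of 2+ whitespace characters into a single space."""
    return multi_space.sub(' ', s.strip())


proto = trim_whitespace("  int   foo(\n    void) ")
single_tab = trim_whitespace("a\tb")
```

Here proto becomes "int foo( void)" and single_tab stays "a\tb".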
> +class state:
> + """
> + State machine enums
> + """
> +
> + # Parser states
> + NORMAL = 0 # normal code
> + NAME = 1 # looking for function name
> + DECLARATION = 2 # We have seen a declaration which might not be done
> + BODY = 3 # the body of the comment
> + SPECIAL_SECTION = 4 # doc section ending with a blank line
> + PROTO = 5 # scanning prototype
> + DOCBLOCK = 6 # documentation block
> + INLINE_NAME = 7 # gathering doc outside main block
> + INLINE_TEXT = 8 # reading the body of inline docs
> +
> + name = [
> + "NORMAL",
> + "NAME",
> + "DECLARATION",
> + "BODY",
> + "SPECIAL_SECTION",
> + "PROTO",
> + "DOCBLOCK",
> + "INLINE_NAME",
> + "INLINE_TEXT",
> + ]
> +
> +
> +SECTION_DEFAULT = "Description" # default section
> +
> +class KernelEntry:
> +
> + def __init__(self, config, ln):
> + self.config = config
> +
> + self._contents = []
> + self.prototype = ""
> +
> + self.warnings = []
> +
> + self.parameterlist = []
> + self.parameterdescs = {}
> + self.parametertypes = {}
> + self.parameterdesc_start_lines = {}
> +
> + self.section_start_lines = {}
> + self.sections = {}
> +
> + self.anon_struct_union = False
> +
> + self.leading_space = None
> +
> + # State flags
> + self.brcount = 0
> + self.declaration_start_line = ln + 1
> +
> + #
> + # Management of section contents
> + #
> + def add_text(self, text):
> + self._contents.append(text)
> +
> + def contents(self):
> + return '\n'.join(self._contents) + '\n'
> +
> + # TODO: rename to emit_message after removal of kernel-doc.pl
> + def emit_msg(self, log_msg, warning=True):
> + """Emit a message"""
> +
> + if not warning:
> + self.config.log.info(log_msg)
> + return
> +
> + # Delegate warning output to output logic, as this way it
> + # will report warnings/info only for symbols that are output
> +
> + self.warnings.append(log_msg)
> + return
> +
> + #
> + # Begin a new section.
> + #
> + def begin_section(self, line_no, title = SECTION_DEFAULT, dump = False):
> + if dump:
> + self.dump_section(start_new = True)
> + self.section = title
> + self.new_start_line = line_no
> +
> + def dump_section(self, start_new=True):
> + """
> + Dumps section contents to arrays/hashes intended for that purpose.
> + """
> + #
> + # If we have accumulated no contents in the default ("description")
> + # section, don't bother.
> + #
> + if self.section == SECTION_DEFAULT and not self._contents:
> + return
> + name = self.section
> + contents = self.contents()
> +
> + if type_param.match(name):
> + name = type_param.group(1)
> +
> + self.parameterdescs[name] = contents
> + self.parameterdesc_start_lines[name] = self.new_start_line
> +
> + self.new_start_line = 0
> +
> + else:
> + if name in self.sections and self.sections[name] != "":
> + # Only warn on user-specified duplicate section names
> + if name != SECTION_DEFAULT:
> + self.emit_msg(f"{self.new_start_line} "
> + f"duplicate section name '{name}'")
> + # Treat as a new paragraph - add a blank line
> + self.sections[name] += '\n' + contents
> + else:
> + self.sections[name] = contents
> + self.section_start_lines[name] = self.new_start_line
> + self.new_start_line = 0
> +
> +# self.config.log.debug("Section: %s : %s", name, pformat(vars(self)))
> +
> + if start_new:
> + self.section = SECTION_DEFAULT
> + self._contents = []
> +
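The non-parameter branch of dump_section() merges a repeated section into the existing one as a new paragraph instead of overwriting it. A reduced sketch of that rule; store_section() and its boolean return are illustrative, and the real method also records start lines and emits a warning for non-default sections:

```python
def store_section(sections, name, contents):
    """Store contents under name; return True if it was merged as a duplicate."""
    if sections.get(name):
        # Treat the repeat as a new paragraph: blank line, then the text.
        sections[name] += '\n' + contents
        return True
    sections[name] = contents
    return False


sections = {}
first = store_section(sections, "Return", "0 on success.\n")
second = store_section(sections, "Return", "Negative errno on failure.\n")
```

After both calls sections["Return"] holds the two paragraphs separated by an empty line.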
> +
> +class KernelDoc:
> + """
> + Read a C language source or header FILE and extract embedded
> + documentation comments.
> + """
> +
> + # Section names
> +
> + section_context = "Context"
> + section_return = "Return"
> +
> + undescribed = "-- undescribed --"
> +
> + def __init__(self, config, fname):
> + """Initialize internal variables"""
> +
> + self.fname = fname
> + self.config = config
> +
> + # Initial state for the state machines
> + self.state = state.NORMAL
> +
> + # Store entry currently being processed
> + self.entry = None
> +
> + # Place all potential outputs into an array
> + self.entries = []
> +
> + #
> + # We need Python 3.7 for its "dicts remember the insertion
> + # order" guarantee
> + #
> + if sys.version_info.major == 3 and sys.version_info.minor < 7:
> + self.emit_msg(0,
> + 'Python 3.7 or later is required for correct results')
> +
> + def emit_msg(self, ln, msg, warning=True):
> + """Emit a message"""
> +
> + log_msg = f"{self.fname}:{ln} {msg}"
> +
> + if self.entry:
> + self.entry.emit_msg(log_msg, warning)
> + return
> +
> + if warning:
> + self.config.log.warning(log_msg)
> + else:
> + self.config.log.info(log_msg)
> +
> + def dump_section(self, start_new=True):
> + """
> + Dumps section contents to arrays/hashes intended for that purpose.
> + """
> +
> + if self.entry:
> + self.entry.dump_section(start_new)
> +
> + # TODO: rename it to store_declaration after removal of kernel-doc.pl
> + def output_declaration(self, dtype, name, **args):
> + """
> + Stores the entry into an entry array.
> +
> + The actual output and output filters will be handled elsewhere
> + """
> +
> + item = KdocItem(name, dtype, self.entry.declaration_start_line, **args)
> + item.warnings = self.entry.warnings
> +
> + # Drop empty sections
> + # TODO: improve empty sections logic to emit warnings
> + sections = self.entry.sections
> + for section in ["Description", "Return"]:
> + if section in sections and not sections[section].rstrip():
> + del sections[section]
> + item.set_sections(sections, self.entry.section_start_lines)
> + item.set_params(self.entry.parameterlist, self.entry.parameterdescs,
> + self.entry.parametertypes,
> + self.entry.parameterdesc_start_lines)
> + self.entries.append(item)
> +
> + self.config.log.debug("Output: %s:%s = %s", dtype, name, pformat(args))
> +
> + def reset_state(self, ln):
> + """
> + Ancillary routine to create a new entry. It initializes all
> + variables used by the state machine.
> + """
> +
> + self.entry = KernelEntry(self.config, ln)
> +
> + # State flags
> + self.state = state.NORMAL
> +
> + def push_parameter(self, ln, decl_type, param, dtype,
> + org_arg, declaration_name):
> + """
> + Store parameters and their descriptions at self.entry.
> + """
> +
> + if self.entry.anon_struct_union and dtype == "" and param == "}":
> + return # Ignore the ending }; from anonymous struct/union
> +
> + self.entry.anon_struct_union = False
> +
> + param = KernRe(r'[\[\)].*').sub('', param, count=1)
> +
> + if dtype == "" and param.endswith("..."):
> + if KernRe(r'\w\.\.\.$').search(param):
> + # For named variable parameters of the form `x...`,
> + # remove the dots
> + param = param[:-3]
> + else:
> + # Handles unnamed variable parameters
> + param = "..."
> +
> + if param not in self.entry.parameterdescs or \
> + not self.entry.parameterdescs[param]:
> +
> + self.entry.parameterdescs[param] = "variable arguments"
> +
> + elif dtype == "" and (not param or param == "void"):
> + param = "void"
> + self.entry.parameterdescs[param] = "no arguments"
> +
> + elif dtype == "" and param in ["struct", "union"]:
> + # Handle unnamed (anonymous) union or struct
> + dtype = param
> + param = "{unnamed_" + param + "}"
> + self.entry.parameterdescs[param] = "anonymous\n"
> + self.entry.anon_struct_union = True
> +
> + # Handle cache group enforcing variables: they do not need
> + # to be described in header files
> + elif "__cacheline_group" in param:
> + # Ignore __cacheline_group_begin and __cacheline_group_end
> + return
> +
> + # Warn if parameter has no description
> + # (but ignore ones starting with # as these are not parameters
> + # but inline preprocessor statements)
> + if param not in self.entry.parameterdescs and not param.startswith("#"):
> + self.entry.parameterdescs[param] = self.undescribed
> +
> + if "." not in param:
> + if decl_type == 'function':
> + dname = f"{decl_type} parameter"
> + else:
> + dname = f"{decl_type} member"
> +
> + self.emit_msg(ln,
> + f"{dname} '{param}' not described in '{declaration_name}'")
> +
> + # Strip spaces from param so that it is one continuous string on
> + # parameterlist. This fixes a problem where check_sections()
> + # cannot find a parameter like "addr[6 + 2]" because it actually
> + # appears as "addr[6", "+", "2]" on the parameter list.
> + # However, it's better to maintain the param string unchanged for
> + # output, so just weaken the string compare in check_sections()
> + # to ignore "[blah" in a parameter string.
> +
> + self.entry.parameterlist.append(param)
> + org_arg = KernRe(r'\s\s+').sub(' ', org_arg)
> + self.entry.parametertypes[param] = org_arg
> +
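
To check my reading of the name normalisation at the top of
push_parameter(), I wrote the following minimal sketch. It uses plain
`re` instead of KernRe, and `normalize_param` is a hypothetical helper
name of mine, not something in the patch. It only covers the
"strip array/paren tail" and "variadic" cases:

```python
import re


def normalize_param(param, dtype=""):
    """Return the name a parameter would be stored under (sketch only)."""
    # Drop everything from the first '[' or ')' onwards: 'addr[6 + 2]' -> 'addr'
    param = re.sub(r'[\[\)].*', '', param, count=1)

    if dtype == "" and param.endswith("..."):
        if re.search(r'\w\.\.\.$', param):
            # Named variadic parameter: 'args...' -> 'args'
            return param[:-3]
        # Unnamed variadic parameter stays as '...'
        return "..."

    return param
```

So `args...` is documented as `@args` while a bare `...` is documented
as `@...`, which matches what the old Perl script did as far as I can
tell.
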
> +
> + def create_parameter_list(self, ln, decl_type, args,
> + splitter, declaration_name):
> + """
> + Creates a list of parameters, storing them at self.entry.
> + """
> +
> + # temporarily replace all commas inside function pointer definition
> + arg_expr = KernRe(r'(\([^\),]+),')
> + while arg_expr.search(args):
> + args = arg_expr.sub(r"\1#", args)
> +
> + for arg in args.split(splitter):
> + # Strip comments
> + arg = KernRe(r'\/\*.*\*\/').sub('', arg)
> +
> + # Ignore argument attributes
> + arg = KernRe(r'\sPOS0?\s').sub(' ', arg)
> +
> + # Strip leading/trailing spaces
> + arg = arg.strip()
> + arg = KernRe(r'\s+').sub(' ', arg, count=1)
> +
> + if arg.startswith('#'):
> + # Treat preprocessor directive as a typeless variable just to fill
> + # corresponding data structures "correctly". Catch it later in
> + # output_* subs.
> + self.push_parameter(ln, decl_type, arg, "",
> + "", declaration_name)
> +
> + elif KernRe(r'\(.+\)\s*\(').search(arg):
> + # Pointer-to-function
> +
> + arg = arg.replace('#', ',')
> +
> + r = KernRe(r'[^\(]+\(\*?\s*([\w\[\]\.]*)\s*\)')
> + if r.match(arg):
> + param = r.group(1)
> + else:
> + self.emit_msg(ln, f"Invalid param: {arg}")
> + param = arg
> +
> + dtype = KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
> + self.push_parameter(ln, decl_type, param, dtype,
> + arg, declaration_name)
> +
> + elif KernRe(r'\(.+\)\s*\[').search(arg):
> + # Array-of-pointers
> +
> + arg = arg.replace('#', ',')
> + r = KernRe(r'[^\(]+\(\s*\*\s*([\w\[\]\.]*?)\s*(\s*\[\s*[\w]+\s*\]\s*)*\)')
> + if r.match(arg):
> + param = r.group(1)
> + else:
> + self.emit_msg(ln, f"Invalid param: {arg}")
> + param = arg
> +
> + dtype = KernRe(r'([^\(]+\(\*?)\s*' + re.escape(param)).sub(r'\1', arg)
> +
> + self.push_parameter(ln, decl_type, param, dtype,
> + arg, declaration_name)
> +
> + elif arg:
> + arg = KernRe(r'\s*:\s*').sub(":", arg)
> + arg = KernRe(r'\s*\[').sub('[', arg)
> +
> + args = KernRe(r'\s*,\s*').split(arg)
> + if args[0] and '*' in args[0]:
> + args[0] = re.sub(r'(\*+)\s*', r' \1', args[0])
> +
> + first_arg = []
> + r = KernRe(r'^(.*\s+)(.*?\[.*\].*)$')
> + if args[0] and r.match(args[0]):
> + args.pop(0)
> + first_arg.extend(r.group(1).split())
> + first_arg.append(r.group(2))
> + else:
> + first_arg = KernRe(r'\s+').split(args.pop(0))
> +
> + args.insert(0, first_arg.pop())
> + dtype = ' '.join(first_arg)
> +
> + for param in args:
> + if KernRe(r'^(\*+)\s*(.*)').match(param):
> + r = KernRe(r'^(\*+)\s*(.*)')
> + if not r.match(param):
> + self.emit_msg(ln, f"Invalid param: {param}")
> + continue
> +
> + param = r.group(1)
> +
> + self.push_parameter(ln, decl_type, r.group(2),
> + f"{dtype} {r.group(1)}",
> + arg, declaration_name)
> +
> + elif KernRe(r'(.*?):(\w+)').search(param):
> + r = KernRe(r'(.*?):(\w+)')
> + if not r.match(param):
> + self.emit_msg(ln, f"Invalid param: {param}")
> + continue
> +
> + if dtype != "": # Skip unnamed bit-fields
> + self.push_parameter(ln, decl_type, r.group(1),
> + f"{dtype}:{r.group(2)}",
> + arg, declaration_name)
> + else:
> + self.push_parameter(ln, decl_type, param, dtype,
> + arg, declaration_name)
> +
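
The "temporarily replace all commas inside function pointer definition"
trick at the start of create_parameter_list() took me a moment to
follow, so here is a small standalone sketch of it. It uses plain `re`
rather than KernRe, and `split_args` is a hypothetical name. Unlike the
real code it restores every '#' unconditionally, so it would mangle
preprocessor lines; it is only meant to show the loop:

```python
import re

# A '(' followed by characters that are neither ')' nor ',', then a ','
FUNC_PTR_COMMA = re.compile(r'(\([^\),]+),')


def split_args(args):
    """Split a C argument list on commas outside parentheses (sketch)."""
    # Each pass hides one more comma per parenthesised group, so loop
    # until no unprotected comma is left inside parentheses.
    while FUNC_PTR_COMMA.search(args):
        args = FUNC_PTR_COMMA.sub(r'\1#', args)

    # Now a plain split is safe; put the hidden commas back afterwards.
    return [arg.strip().replace('#', ',') for arg in args.split(',')]
```

The loop is needed because a single sub() pass only protects the first
comma of each parenthesised group.
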
> + def check_sections(self, ln, decl_name, decl_type):
> + """
> + Check for errors inside sections, emitting warnings for described
> + parameters that are not found in the declaration.
> + """
> + for section in self.entry.sections:
> + if section not in self.entry.parameterlist and \
> + not known_sections.search(section):
> + if decl_type == 'function':
> + dname = f"{decl_type} parameter"
> + else:
> + dname = f"{decl_type} member"
> + self.emit_msg(ln,
> + f"Excess {dname} '{section}' description in '{decl_name}'")
> +
> + def check_return_section(self, ln, declaration_name, return_type):
> + """
> + If the function doesn't return void, warns about the lack of a
> + return description.
> + """
> +
> + if not self.config.wreturn:
> + return
> +
> + # Ignore an empty return type (It's a macro)
> + # Ignore functions with a "void" return type (but not "void *")
> + if not return_type or KernRe(r'void\s*\w*\s*$').search(return_type):
> + return
> +
> + if not self.entry.sections.get("Return", None):
> + self.emit_msg(ln,
> + f"No description found for return value of '{declaration_name}'")
> +
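
For reference, this is how I understand the "void but not void *" test
in check_return_section(). A minimal sketch with plain `re`;
`needs_return_doc` is a hypothetical name and the wreturn config check
is left out:

```python
import re


def needs_return_doc(return_type):
    """True if a function with this return type should document 'Return'."""
    # Empty return type: we are looking at a macro
    if not return_type:
        return False

    # 'void' at the end of the type means nothing is returned; 'void *'
    # does not match because '*' is neither \w nor whitespace.
    return re.search(r'void\s*\w*\s*$', return_type) is None
```
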
> + def dump_struct(self, ln, proto):
> + """
> + Store an entry for a struct or union
> + """
> +
> + type_pattern = r'(struct|union)'
> +
> + qualifiers = [
> + "__attribute__",
> + "__packed",
> + "__aligned",
> + "____cacheline_aligned_in_smp",
> + "____cacheline_aligned",
> + ]
> +
> + definition_body = r'\{(.*)\}\s*' + "(?:" + '|'.join(qualifiers) + ")?"
> + struct_members = KernRe(type_pattern + r'([^\{\};]+)(\{)([^\{\}]*)(\})([^\{\}\;]*)(\;)')
> +
> + # Extract struct/union definition
> + members = None
> + declaration_name = None
> + decl_type = None
> +
> + r = KernRe(type_pattern + r'\s+(\w+)\s*' + definition_body)
> + if r.search(proto):
> + decl_type = r.group(1)
> + declaration_name = r.group(2)
> + members = r.group(3)
> + else:
> + r = KernRe(r'typedef\s+' + type_pattern + r'\s*' + definition_body + r'\s*(\w+)\s*;')
> +
> + if r.search(proto):
> + decl_type = r.group(1)
> + declaration_name = r.group(3)
> + members = r.group(2)
> +
> + if not members:
> + self.emit_msg(ln, f"{proto} error: Cannot parse struct or union!")
> + return
> +
> + if self.entry.identifier != declaration_name:
> + self.emit_msg(ln,
> + f"expecting prototype for {decl_type} {self.entry.identifier}. Prototype was for {decl_type} {declaration_name} instead\n")
> + return
> +
> + args_pattern = r'([^,)]+)'
> +
> + sub_prefixes = [
> + (KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', re.S | re.I), ''),
> + (KernRe(r'\/\*\s*private:.*', re.S | re.I), ''),
> +
> + # Strip comments
> + (KernRe(r'\/\*.*?\*\/', re.S), ''),
> +
> + # Strip attributes
> + (attribute, ' '),
> + (KernRe(r'\s*__aligned\s*\([^;]*\)', re.S), ' '),
> + (KernRe(r'\s*__counted_by\s*\([^;]*\)', re.S), ' '),
> + (KernRe(r'\s*__counted_by_(le|be)\s*\([^;]*\)', re.S), ' '),
> + (KernRe(r'\s*__packed\s*', re.S), ' '),
> + (KernRe(r'\s*CRYPTO_MINALIGN_ATTR', re.S), ' '),
> + (KernRe(r'\s*____cacheline_aligned_in_smp', re.S), ' '),
> + (KernRe(r'\s*____cacheline_aligned', re.S), ' '),
> +
> + # Unwrap struct_group macros based on this definition:
> + # __struct_group(TAG, NAME, ATTRS, MEMBERS...)
> + # which has variants like: struct_group(NAME, MEMBERS...)
> + # Only MEMBERS arguments require documentation.
> + #
> + # Parsing them happens in two steps:
> + #
> + # 1. drop struct group arguments that aren't MEMBERS,
> + # storing them as STRUCT_GROUP(MEMBERS)
> + #
> + # 2. remove STRUCT_GROUP() ancillary macro.
> + #
> + # The original logic used to remove STRUCT_GROUP() using an
> + # advanced regex:
> + #
> + # \bSTRUCT_GROUP(\(((?:(?>[^)(]+)|(?1))*)\))[^;]*;
> + #
> + # with two patterns that are incompatible with
> + # Python re module, as it has:
> + #
> + # - a recursive pattern: (?1)
> + # - an atomic grouping: (?>...)
> + #
> + # A simpler version was also tried, but it didn't work either:
> + # \bSTRUCT_GROUP\(([^\)]+)\)[^;]*;
> + #
> + # as it doesn't properly match the closing parenthesis in some cases.
> + #
> + # So, a better solution was crafted: there's now a NestedMatch
> + # class that ensures that delimiters after a search are properly
> + # matched. So, the implementation to drop STRUCT_GROUP() is
> + # handled separately.
> +
> + (KernRe(r'\bstruct_group\s*\(([^,]*,)', re.S), r'STRUCT_GROUP('),
> + (KernRe(r'\bstruct_group_attr\s*\(([^,]*,){2}', re.S), r'STRUCT_GROUP('),
> + (KernRe(r'\bstruct_group_tagged\s*\(([^,]*),([^,]*),', re.S), r'struct \1 \2; STRUCT_GROUP('),
> + (KernRe(r'\b__struct_group\s*\(([^,]*,){3}', re.S), r'STRUCT_GROUP('),
> +
> + # Replace macros
> + #
> + # TODO: use NestedMatch for FOO($1, $2, ...) matches
> + #
> + # it is better to also move those to the NestedMatch logic,
> + # to ensure that parenthesis will be properly matched.
> +
> + (KernRe(r'__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, __ETHTOOL_LINK_MODE_MASK_NBITS)'),
> + (KernRe(r'DECLARE_PHY_INTERFACE_MASK\s*\(([^\)]+)\)', re.S), r'DECLARE_BITMAP(\1, PHY_INTERFACE_MODE_MAX)'),
> + (KernRe(r'DECLARE_BITMAP\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[BITS_TO_LONGS(\2)]'),
> + (KernRe(r'DECLARE_HASHTABLE\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'unsigned long \1[1 << ((\2) - 1)]'),
> + (KernRe(r'DECLARE_KFIFO\s*\(' + args_pattern + r',\s*' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
> + (KernRe(r'DECLARE_KFIFO_PTR\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\2 *\1'),
> + (KernRe(r'(?:__)?DECLARE_FLEX_ARRAY\s*\(' + args_pattern + r',\s*' + args_pattern + r'\)', re.S), r'\1 \2[]'),
> + (KernRe(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + args_pattern + r'\)', re.S), r'dma_addr_t \1'),
> + (KernRe(r'DEFINE_DMA_UNMAP_LEN\s*\(' + args_pattern + r'\)', re.S), r'__u32 \1'),
> + (KernRe(r'VIRTIO_DECLARE_FEATURES\s*\(' + args_pattern + r'\)', re.S), r'u64 \1; u64 \1_array[VIRTIO_FEATURES_DWORDS]'),
> + ]
> +
> + # Regexes here are guaranteed to have the end delimiter matching
> + # the start delimiter. Yet, right now, only one replace group
> + # is allowed.
> +
> + sub_nested_prefixes = [
> + (re.compile(r'\bSTRUCT_GROUP\('), r'\1'),
> + ]
> +
> + for search, sub in sub_prefixes:
> + members = search.sub(sub, members)
> +
> + nested = NestedMatch()
> +
> + for search, sub in sub_nested_prefixes:
> + members = nested.sub(search, sub, members)
> +
> + # Keeps the original declaration as-is
> + declaration = members
> +
> + # Split nested struct/union elements
> + #
> + # This loop was simpler in the original kernel-doc Perl version, as
> + # while ($members =~ m/$struct_members/) { ... }
> + # re-reads the 'members' string on each iteration.
> + #
> + # Python behavior is different: findall() parses 'members' only once,
> + # creating a list of tuples from that single pass.
> + #
> + # In other words, a single pass won't get nested structs.
> + #
> + # So, we need an extra loop in Python to work around this
> + # re limitation.
> +
> + while True:
> + tuples = struct_members.findall(members)
> + if not tuples:
> + break
> +
> + for t in tuples:
> + newmember = ""
> + maintype = t[0]
> + s_ids = t[5]
> + content = t[3]
> +
> + oldmember = "".join(t)
> +
> + for s_id in s_ids.split(','):
> + s_id = s_id.strip()
> +
> + newmember += f"{maintype} {s_id}; "
> + s_id = KernRe(r'[:\[].*').sub('', s_id)
> + s_id = KernRe(r'^\s*\**(\S+)\s*').sub(r'\1', s_id)
> +
> + for arg in content.split(';'):
> + arg = arg.strip()
> +
> + if not arg:
> + continue
> +
> + r = KernRe(r'^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)')
> + if r.match(arg):
> + # Pointer-to-function
> + dtype = r.group(1)
> + name = r.group(2)
> + extra = r.group(3)
> +
> + if not name:
> + continue
> +
> + if not s_id:
> + # Anonymous struct/union
> + newmember += f"{dtype}{name}{extra}; "
> + else:
> + newmember += f"{dtype}{s_id}.{name}{extra}; "
> +
> + else:
> + arg = arg.strip()
> + # Handle bitmaps
> + arg = KernRe(r':\s*\d+\s*').sub('', arg)
> +
> + # Handle arrays
> + arg = KernRe(r'\[.*\]').sub('', arg)
> +
> + # Handle multiple IDs
> + arg = KernRe(r'\s*,\s*').sub(',', arg)
> +
> + r = KernRe(r'(.*)\s+([\S+,]+)')
> +
> + if r.search(arg):
> + dtype = r.group(1)
> + names = r.group(2)
> + else:
> + newmember += f"{arg}; "
> + continue
> +
> + for name in names.split(','):
> + name = KernRe(r'^\s*\**(\S+)\s*').sub(r'\1', name).strip()
> +
> + if not name:
> + continue
> +
> + if not s_id:
> + # Anonymous struct/union
> + newmember += f"{dtype} {name}; "
> + else:
> + newmember += f"{dtype} {s_id}.{name}; "
> +
> + members = members.replace(oldmember, newmember)
> +
> + # Ignore other nested elements, like enums
> + members = re.sub(r'(\{[^\{\}]*\})', '', members)
> +
> + self.create_parameter_list(ln, decl_type, members, ';',
> + declaration_name)
> + self.check_sections(ln, declaration_name, decl_type)
> +
> + # Adjust declaration for better display
> + declaration = KernRe(r'([\{;])').sub(r'\1\n', declaration)
> + declaration = KernRe(r'\}\s+;').sub('};', declaration)
> +
> + # Better handle inlined enums
> + while True:
> + r = KernRe(r'(enum\s+\{[^\}]+),([^\n])')
> + if not r.search(declaration):
> + break
> +
> + declaration = r.sub(r'\1,\n\2', declaration)
> +
> + def_args = declaration.split('\n')
> + level = 1
> + declaration = ""
> + for clause in def_args:
> +
> + clause = clause.strip()
> + clause = KernRe(r'\s+').sub(' ', clause, count=1)
> +
> + if not clause:
> + continue
> +
> + if '}' in clause and level > 1:
> + level -= 1
> +
> + if not KernRe(r'^\s*#').match(clause):
> + declaration += "\t" * level
> +
> + declaration += "\t" + clause + "\n"
> + if "{" in clause and "}" not in clause:
> + level += 1
> +
> + self.output_declaration(decl_type, declaration_name,
> + definition=declaration,
> + purpose=self.entry.declaration_purpose)
> +
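
The "extra loop" for nested structs in dump_struct() is the part I
wanted to convince myself about. Below is a heavily simplified sketch
of the idea, assuming plain data members only (no function pointers,
bitfields, arrays or multiple identifiers, all of which the real code
handles). `flatten_nested` is a hypothetical name:

```python
import re

# Innermost 'struct/union { ... } ident;' (the body holds no braces)
INNER = re.compile(r'(struct|union)([^{};]+)(\{)([^{}]*)(\})([^{};]*)(;)')


def flatten_nested(members):
    """Rewrite nested struct/union bodies as dotted member names (sketch)."""
    while True:
        found = INNER.findall(members)
        if not found:
            break  # nothing left with braces: fully flattened

        for t in found:
            kind, body, ident = t[0], t[3], t[5].strip()
            new = f"{kind} {ident}; "
            for field in body.split(';'):
                field = field.strip()
                if not field:
                    continue
                dtype, _, name = field.rpartition(' ')
                # Anonymous struct/union members keep their plain name
                new += f"{dtype} {ident}.{name}; " if ident else f"{dtype} {name}; "

            # The seven groups are contiguous, so joining them gives
            # back exactly the text that matched.
            members = members.replace("".join(t), new)

    return members
```

Each pass can only see the innermost brace pairs, so a struct nested
two levels deep needs two passes, which is why a single findall() is
not enough.
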
> + def dump_enum(self, ln, proto):
> + """
> + Stores an enum inside self.entries array.
> + """
> +
> + # Ignore members marked private
> + proto = KernRe(r'\/\*\s*private:.*?\/\*\s*public:.*?\*\/', flags=re.S).sub('', proto)
> + proto = KernRe(r'\/\*\s*private:.*}', flags=re.S).sub('}', proto)
> +
> + # Strip comments
> + proto = KernRe(r'\/\*.*?\*\/', flags=re.S).sub('', proto)
> +
> + # Strip #define macros inside enums
> + proto = KernRe(r'#\s*((define|ifdef|if)\s+|endif)[^;]*;', flags=re.S).sub('', proto)
> +
> + #
> + # Parse out the name and members of the enum. Typedef form first.
> + #
> + r = KernRe(r'typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;')
> + if r.search(proto):
> + declaration_name = r.group(2)
> + members = r.group(1).rstrip()
> + #
> + # Failing that, look for a straight enum
> + #
> + else:
> + r = KernRe(r'enum\s+(\w*)\s*\{(.*)\}')
> + if r.match(proto):
> + declaration_name = r.group(1)
> + members = r.group(2).rstrip()
> + #
> + # OK, this isn't going to work.
> + #
> + else:
> + self.emit_msg(ln, f"{proto}: error: Cannot parse enum!")
> + return
> + #
> + # Make sure we found what we were expecting.
> + #
> + if self.entry.identifier != declaration_name:
> + if self.entry.identifier == "":
> + self.emit_msg(ln,
> + f"{proto}: wrong kernel-doc identifier on prototype")
> + else:
> + self.emit_msg(ln,
> + f"expecting prototype for enum {self.entry.identifier}. "
> + f"Prototype was for enum {declaration_name} instead")
> + return
> +
> + if not declaration_name:
> + declaration_name = "(anonymous)"
> + #
> + # Parse out the name of each enum member, and verify that we
> + # have a description for it.
> + #
> + member_set = set()
> + members = KernRe(r'\([^;)]*\)').sub('', members)
> + for arg in members.split(','):
> + if not arg:
> + continue
> + arg = KernRe(r'^\s*(\w+).*').sub(r'\1', arg)
> + self.entry.parameterlist.append(arg)
> + if arg not in self.entry.parameterdescs:
> + self.entry.parameterdescs[arg] = self.undescribed
> + self.emit_msg(ln,
> + f"Enum value '{arg}' not described in enum '{declaration_name}'")
> + member_set.add(arg)
> + #
> + # Ensure that every described member actually exists in the enum.
> + #
> + for k in self.entry.parameterdescs:
> + if k not in member_set:
> + self.emit_msg(ln,
> + f"Excess enum value '%{k}' description in '{declaration_name}'")
> +
> + self.output_declaration('enum', declaration_name,
> + purpose=self.entry.declaration_purpose)
> +
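
A small sketch of the enum member cross-check in dump_enum(), as I read
it. It uses plain `re`, `check_enum` is a hypothetical name, and it
skips whitespace-only chunks where the patch only skips empty ones:

```python
import re


def check_enum(members, described):
    """Return (names, missing, excess) for an enum body (sketch only)."""
    # Drop parenthesised expressions so 'BIT(2)' style initialisers
    # cannot confuse the name extraction.
    members = re.sub(r'\([^;)]*\)', '', members)

    names = []
    for arg in members.split(','):
        if not arg.strip():
            continue  # e.g. after a trailing comma
        # Keep only the leading identifier: ' BAR = BIT' -> 'BAR'
        names.append(re.sub(r'^\s*(\w+).*', r'\1', arg))

    missing = [n for n in names if n not in described]
    excess = [d for d in described if d not in names]
    return names, missing, excess
```

`missing` corresponds to the "Enum value ... not described" warnings
and `excess` to the "Excess enum value" ones.
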
> + def dump_declaration(self, ln, prototype):
> + """
> + Stores a data declaration inside self.entries array.
> + """
> +
> + if self.entry.decl_type == "enum":
> + self.dump_enum(ln, prototype)
> + elif self.entry.decl_type == "typedef":
> + self.dump_typedef(ln, prototype)
> + elif self.entry.decl_type in ["union", "struct"]:
> + self.dump_struct(ln, prototype)
> + else:
> + # This would be a bug
> + self.emit_msg(ln, f'Unknown declaration type: {self.entry.decl_type}')
> +
> + def dump_function(self, ln, prototype):
> + """
> + Stores a function or function macro inside self.entries array.
> + """
> +
> + func_macro = False
> + return_type = ''
> + decl_type = 'function'
> +
> + # Prefixes that would be removed
> + sub_prefixes = [
> + (r"^static +", "", 0),
> + (r"^extern +", "", 0),
> + (r"^asmlinkage +", "", 0),
> + (r"^inline +", "", 0),
> + (r"^__inline__ +", "", 0),
> + (r"^__inline +", "", 0),
> + (r"^__always_inline +", "", 0),
> + (r"^noinline +", "", 0),
> + (r"^__FORTIFY_INLINE +", "", 0),
> + (r"__init +", "", 0),
> + (r"__init_or_module +", "", 0),
> + (r"__deprecated +", "", 0),
> + (r"__flatten +", "", 0),
> + (r"__meminit +", "", 0),
> + (r"__must_check +", "", 0),
> + (r"__weak +", "", 0),
> + (r"__sched +", "", 0),
> + (r"_noprof", "", 0),
> + (r"__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +", "", 0),
> + (r"__(?:re)?alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\) +", "", 0),
> + (r"__diagnose_as\s*\(\s*\S+\s*(?:,\s*\d+\s*)*\) +", "", 0),
> + (r"DECL_BUCKET_PARAMS\s*\(\s*(\S+)\s*,\s*(\S+)\s*\)", r"\1, \2", 0),
> + (r"__attribute_const__ +", "", 0),
> +
> + # It seems that Python support for re.X is broken:
> + # At least for me (Python 3.13), this didn't work
> +# (r"""
> +# __attribute__\s*\(\(
> +# (?:
> +# [\w\s]+ # attribute name
> +# (?:\([^)]*\))? # attribute arguments
> +# \s*,? # optional comma at the end
> +# )+
> +# \)\)\s+
> +# """, "", re.X),
> +
> + # So, remove whitespaces and comments from it
> + (r"__attribute__\s*\(\((?:[\w\s]+(?:\([^)]*\))?\s*,?)+\)\)\s+", "", 0),
> + ]
> +
> + for search, sub, flags in sub_prefixes:
> + prototype = KernRe(search, flags).sub(sub, prototype)
> +
> + # Macros are a special case, as they change the prototype format
> + new_proto = KernRe(r"^#\s*define\s+").sub("", prototype)
> + if new_proto != prototype:
> + is_define_proto = True
> + prototype = new_proto
> + else:
> + is_define_proto = False
> +
> + # Yes, this truly is vile. We are looking for:
> + # 1. Return type (may be nothing if we're looking at a macro)
> + # 2. Function name
> + # 3. Function parameters.
> + #
> + # All the while we have to watch out for function pointer parameters
> + # (which IIRC is what the two sections are for), C types (these
> + # regexps don't even start to express all the possibilities), and
> + # so on.
> + #
> + # If you mess with these regexps, it's a good idea to check that
> + # the following functions' documentation still comes out right:
> + # - parport_register_device (function pointer parameters)
> + # - atomic_set (macro)
> + # - pci_match_device, __copy_to_user (long return type)
> +
> + name = r'[a-zA-Z0-9_~:]+'
> + prototype_end1 = r'[^\(]*'
> + prototype_end2 = r'[^\{]*'
> + prototype_end = fr'\(({prototype_end1}|{prototype_end2})\)'
> +
> + # Besides compiling, Perl qr{[\w\s]+} works as a non-capturing group.
> + # So, this needs to be mapped in Python with (?:...)? or (?:...)+
> +
> + type1 = r'(?:[\w\s]+)?'
> + type2 = r'(?:[\w\s]+\*+)+'
> +
> + found = False
> +
> + if is_define_proto:
> + r = KernRe(r'^()(' + name + r')\s+')
> +
> + if r.search(prototype):
> + return_type = ''
> + declaration_name = r.group(2)
> + func_macro = True
> +
> + found = True
> +
> + if not found:
> + patterns = [
> + rf'^()({name})\s*{prototype_end}',
> + rf'^({type1})\s+({name})\s*{prototype_end}',
> + rf'^({type2})\s*({name})\s*{prototype_end}',
> + ]
> +
> + for p in patterns:
> + r = KernRe(p)
> +
> + if r.match(prototype):
> +
> + return_type = r.group(1)
> + declaration_name = r.group(2)
> + args = r.group(3)
> +
> + self.create_parameter_list(ln, decl_type, args, ',',
> + declaration_name)
> +
> + found = True
> + break
> + if not found:
> + self.emit_msg(ln,
> + f"cannot understand function prototype: '{prototype}'")
> + return
> +
> + if self.entry.identifier != declaration_name:
> + self.emit_msg(ln,
> + f"expecting prototype for {self.entry.identifier}(). Prototype was for {declaration_name}() instead")
> + return
> +
> + self.check_sections(ln, declaration_name, "function")
> +
> + self.check_return_section(ln, declaration_name, return_type)
> +
> + if 'typedef' in return_type:
> + self.output_declaration(decl_type, declaration_name,
> + typedef=True,
> + functiontype=return_type,
> + purpose=self.entry.declaration_purpose,
> + func_macro=func_macro)
> + else:
> + self.output_declaration(decl_type, declaration_name,
> + typedef=False,
> + functiontype=return_type,
> + purpose=self.entry.declaration_purpose,
> + func_macro=func_macro)
> +
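
To see which of the three prototype patterns in dump_function() fires
for which kind of declaration, I tried them in isolation. This is a
sketch with plain `re`: the regex fragments are copied from the patch,
only a few of the prefixes are included, and `parse_prototype` is a
hypothetical name:

```python
import re

PREFIXES = [r"^static +", r"^extern +", r"^inline +"]  # subset only

NAME = r'[a-zA-Z0-9_~:]+'
PROTO_END = r'\(([^\(]*|[^\{]*)\)'
TYPE1 = r'(?:[\w\s]+)?'
TYPE2 = r'(?:[\w\s]+\*+)+'

PATTERNS = [
    rf'^()({NAME})\s*{PROTO_END}',          # no return type (macro-like)
    rf'^({TYPE1})\s+({NAME})\s*{PROTO_END}',  # plain return type
    rf'^({TYPE2})\s*({NAME})\s*{PROTO_END}',  # pointer return type
]


def parse_prototype(prototype):
    """Return (return_type, name, args) or None if nothing matches."""
    for prefix in PREFIXES:
        prototype = re.sub(prefix, "", prototype)

    for pattern in PATTERNS:
        m = re.match(pattern, prototype)
        if m:
            return m.group(1), m.group(2), m.group(3)
    return None
```

The order matters: the first pattern that matches wins, so the
"no return type" form has to fail before the typed forms are tried.
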
> + def dump_typedef(self, ln, proto):
> + """
> + Stores a typedef inside self.entries array.
> + """
> +
> + typedef_type = r'((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s*'
> + typedef_ident = r'\*?\s*(\w\S+)\s*'
> + typedef_args = r'\s*\((.*)\);'
> +
> + typedef1 = KernRe(r'typedef' + typedef_type + r'\(' + typedef_ident + r'\)' + typedef_args)
> + typedef2 = KernRe(r'typedef' + typedef_type + typedef_ident + typedef_args)
> +
> + # Strip comments
> + proto = KernRe(r'/\*.*?\*/', flags=re.S).sub('', proto)
> +
> + # Parse function typedef prototypes
> + for r in [typedef1, typedef2]:
> + if not r.match(proto):
> + continue
> +
> + return_type = r.group(1).strip()
> + declaration_name = r.group(2)
> + args = r.group(3)
> +
> + if self.entry.identifier != declaration_name:
> + self.emit_msg(ln,
> + f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
> + return
> +
> + decl_type = 'function'
> + self.create_parameter_list(ln, decl_type, args, ',', declaration_name)
> +
> + self.output_declaration(decl_type, declaration_name,
> + typedef=True,
> + functiontype=return_type,
> + purpose=self.entry.declaration_purpose)
> + return
> +
> + # Handle nested parentheses or brackets
> + r = KernRe(r'(\(*.\)\s*|\[*.\]\s*);$')
> + while r.search(proto):
> + proto = r.sub('', proto)
> +
> + # Parse simple typedefs
> + r = KernRe(r'typedef.*\s+(\w+)\s*;')
> + if r.match(proto):
> + declaration_name = r.group(1)
> +
> + if self.entry.identifier != declaration_name:
> + self.emit_msg(ln,
> + f"expecting prototype for typedef {self.entry.identifier}. Prototype was for typedef {declaration_name} instead\n")
> + return
> +
> + self.output_declaration('typedef', declaration_name,
> + purpose=self.entry.declaration_purpose)
> + return
> +
> + self.emit_msg(ln, "error: Cannot parse typedef!")
> +
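
Similarly for dump_typedef(), here is a sketch of the first
function-typedef pattern on its own. The regex fragments are copied
from the patch; `parse_func_typedef` is a hypothetical name and plain
`re` stands in for KernRe:

```python
import re

TYPEDEF_TYPE = r'((?:\s+[\w\*]+\b){0,7}\s+(?:\w+\b|\*+))\s*'
TYPEDEF_IDENT = r'\*?\s*(\w\S+)\s*'
TYPEDEF_ARGS = r'\s*\((.*)\);'

# typedef <type> (*<ident>)(<args>);
TYPEDEF1 = re.compile(r'typedef' + TYPEDEF_TYPE + r'\(' + TYPEDEF_IDENT
                      + r'\)' + TYPEDEF_ARGS)


def parse_func_typedef(proto):
    """Return (return_type, name, args) for a function typedef, else None."""
    m = TYPEDEF1.match(proto)
    if not m:
        return None
    return m.group(1).strip(), m.group(2), m.group(3)
```

A simple `typedef int myint;` falls through this pattern, which is why
the method has the separate "Parse simple typedefs" step afterwards.
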
> + @staticmethod
> + def process_export(function_set, line):
> + """
> + process EXPORT_SYMBOL* tags
> +
> + This method doesn't use any variable from the class, so declare it
> + with a staticmethod decorator.
> + """
> +
> + # We support documenting some exported symbols with different
> + # names. A horrible hack.
> + suffixes = [ '_noprof' ]
> +
> + # Note: it accepts only one EXPORT_SYMBOL* per line, as having
> + # multiple export lines would violate Kernel coding style.
> +
> + if export_symbol.search(line):
> + symbol = export_symbol.group(2)
> + elif export_symbol_ns.search(line):
> + symbol = export_symbol_ns.group(2)
> + else:
> + return False
> + #
> + # Found an export, trim out any special suffixes
> + #
> + for suffix in suffixes:
> + # Be backward compatible with Python < 3.9
> + if symbol.endswith(suffix):
> + symbol = symbol[:-len(suffix)]
> + function_set.add(symbol)
> + return True
> +
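
For process_export(), a standalone sketch of the suffix trimming. Note
that the export_symbol regex is defined outside the hunk quoted here,
so `EXPORT_RE` below is my own guess at something equivalent and should
be treated as an assumption, not as the pattern the script really uses:

```python
import re

# Assumed pattern, for illustration only
EXPORT_RE = re.compile(r'^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*;?')

SUFFIXES = ['_noprof']


def process_export(function_set, line):
    """Record the symbol exported on this line, if any (sketch only)."""
    m = EXPORT_RE.search(line)
    if not m:
        return False

    symbol = m.group(2)
    for suffix in SUFFIXES:
        # str.removesuffix() would be neater but needs Python >= 3.9
        if symbol.endswith(suffix):
            symbol = symbol[:-len(suffix)]

    function_set.add(symbol)
    return True
```
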
> + def process_normal(self, ln, line):
> + """
> + STATE_NORMAL: looking for the /** to begin everything.
> + """
> +
> + if not doc_start.match(line):
> + return
> +
> + # start a new entry
> + self.reset_state(ln)
> +
> + # next line is always the function name
> + self.state = state.NAME
> +
> + def process_name(self, ln, line):
> + """
> + STATE_NAME: Looking for the "name - description" line
> + """
> + #
> + # Check for a DOC: block and handle them specially.
> + #
> + if doc_block.search(line):
> +
> + if not doc_block.group(1):
> + self.entry.begin_section(ln, "Introduction")
> + else:
> + self.entry.begin_section(ln, doc_block.group(1))
> +
> + self.entry.identifier = self.entry.section
> + self.state = state.DOCBLOCK
> + #
> + # Otherwise we're looking for a normal kerneldoc declaration line.
> + #
> + elif doc_decl.search(line):
> + self.entry.identifier = doc_decl.group(1)
> +
> + # Test for data declaration
> + if doc_begin_data.search(line):
> + self.entry.decl_type = doc_begin_data.group(1)
> + self.entry.identifier = doc_begin_data.group(2)
> + #
> + # Look for a function description
> + #
> + elif doc_begin_func.search(line):
> + self.entry.identifier = doc_begin_func.group(1)
> + self.entry.decl_type = "function"
> + #
> + # We struck out.
> + #
> + else:
> + self.emit_msg(ln,
> + f"This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst\n{line}")
> + self.state = state.NORMAL
> + return
> + #
> + # OK, set up for a new kerneldoc entry.
> + #
> + self.state = state.BODY
> + self.entry.identifier = self.entry.identifier.strip(" ")
> + # If there are no @param blocks, the default section must be set up here
> + self.entry.begin_section(ln + 1)
> + #
> + # Find the description portion, which *should* be there but
> + # isn't always.
> + # (We should be able to capture this from the previous parsing - someday)
> + #
> + r = KernRe("[-:](.*)")
> + if r.search(line):
> + self.entry.declaration_purpose = trim_whitespace(r.group(1))
> + self.state = state.DECLARATION
> + else:
> + self.entry.declaration_purpose = ""
> +
> + if not self.entry.declaration_purpose and self.config.wshort_desc:
> + self.emit_msg(ln,
> + f"missing initial short description on line:\n{line}")
> +
> + if not self.entry.identifier and self.entry.decl_type != "enum":
> + self.emit_msg(ln,
> + f"wrong kernel-doc identifier on line:\n{line}")
> + self.state = state.NORMAL
> +
> + if self.config.verbose:
> + self.emit_msg(ln,
> + f"Scanning doc for {self.entry.decl_type} {self.entry.identifier}",
> + warning=False)
> + #
> + # Failed to find an identifier. Emit a warning
> + #
> + else:
> + self.emit_msg(ln, f"Cannot find identifier on line:\n{line}")
> +
> + #
> + # Helper function to determine if a new section is being started.
> + #
> + def is_new_section(self, ln, line):
> + if doc_sect.search(line):
> + self.state = state.BODY
> + #
> + # Pick out the name of our new section, tweaking it if need be.
> + #
> + newsection = doc_sect.group(1)
> + if newsection.lower() == 'description':
> + newsection = 'Description'
> + elif newsection.lower() == 'context':
> + newsection = 'Context'
> + self.state = state.SPECIAL_SECTION
> + elif newsection.lower() in ["@return", "@returns",
> + "return", "returns"]:
> + newsection = "Return"
> + self.state = state.SPECIAL_SECTION
> + elif newsection[0] == '@':
> + self.state = state.SPECIAL_SECTION
> + #
> + # Initialize the contents, and get the new section going.
> + #
> + newcontents = doc_sect.group(2)
> + if not newcontents:
> + newcontents = ""
> + self.dump_section()
> + self.entry.begin_section(ln, newsection)
> + self.entry.leading_space = None
> +
> + self.entry.add_text(newcontents.lstrip())
> + return True
> + return False
> +
> + #
> + # Helper function to detect (and effect) the end of a kerneldoc comment.
> + #
> + def is_comment_end(self, ln, line):
> + if doc_end.search(line):
> + self.dump_section()
> +
> + # Look for doc_com + <text> + doc_end:
> + r = KernRe(r'\s*\*\s*[a-zA-Z_0-9:\.]+\*/')
> + if r.match(line):
> + self.emit_msg(ln, f"suspicious ending line: {line}")
> +
> + self.entry.prototype = ""
> + self.entry.new_start_line = ln + 1
> +
> + self.state = state.PROTO
> + return True
> + return False
> +
> +
> + def process_decl(self, ln, line):
> + """
> + STATE_DECLARATION: We've seen the beginning of a declaration
> + """
> + if self.is_new_section(ln, line) or self.is_comment_end(ln, line):
> + return
> + #
> + # Look for anything with the " * " line beginning.
> + #
> + if doc_content.search(line):
> + cont = doc_content.group(1)
> + #
> + # A blank line means that we have moved out of the declaration
> + # part of the comment (without any "special section" parameter
> + # descriptions).
> + #
> + if cont == "":
> + self.state = state.BODY
> + #
> + # Otherwise we have more of the declaration section to soak up.
> + #
> + else:
> + self.entry.declaration_purpose = \
> + trim_whitespace(self.entry.declaration_purpose + ' ' + cont)
> + else:
> + # Unknown line, ignore
> + self.emit_msg(ln, f"bad line: {line}")
> +
> +
> + def process_special(self, ln, line):
> + """
> + STATE_SPECIAL_SECTION: a section ending with a blank line
> + """
> + #
> + # If we have hit a blank line (only the " * " marker), then this
> + # section is done.
> + #
> + if KernRe(r"\s*\*\s*$").match(line):
> + self.entry.begin_section(ln, dump = True)
> + self.state = state.BODY
> + return
> + #
> + # Not a blank line, look for the other ways to end the section.
> + #
> + if self.is_new_section(ln, line) or self.is_comment_end(ln, line):
> + return
> + #
> + # OK, we should have a continuation of the text for this section.
> + #
> + if doc_content.search(line):
> + cont = doc_content.group(1)
> + #
> + # If the lines of text after the first in a special section have
> + # leading white space, we need to trim it out or Sphinx will get
> + # confused. For the second line (the None case), see what we
> + # find there and remember it.
> + #
> + if self.entry.leading_space is None:
> + r = KernRe(r'^(\s+)')
> + if r.match(cont):
> + self.entry.leading_space = len(r.group(1))
> + else:
> + self.entry.leading_space = 0
> + #
> + # Otherwise, before trimming any leading chars, be *sure*
> + # that they are white space. We should maybe warn if this
> + # isn't the case.
> + #
> + for i in range(0, self.entry.leading_space):
> + if cont[i] != " ":
> + self.entry.leading_space = i
> + break
> + #
> + # Add the trimmed result to the section and we're done.
> + #
> + self.entry.add_text(cont[self.entry.leading_space:])
> + else:
> + # Unknown line, ignore
> + self.emit_msg(ln, f"bad line: {line}")
> +
> + def process_body(self, ln, line):
> + """
> + STATE_BODY: the bulk of a kerneldoc comment.
> + """
> + if self.is_new_section(ln, line) or self.is_comment_end(ln, line):
> + return
> +
> + if doc_content.search(line):
> + cont = doc_content.group(1)
> + self.entry.add_text(cont)
> + else:
> + # Unknown line, ignore
> + self.emit_msg(ln, f"bad line: {line}")
> +
> + def process_inline_name(self, ln, line):
> + """STATE_INLINE_NAME: beginning of docbook comments within a prototype."""
> +
> + if doc_inline_sect.search(line):
> + self.entry.begin_section(ln, doc_inline_sect.group(1))
> + self.entry.add_text(doc_inline_sect.group(2).lstrip())
> + self.state = state.INLINE_TEXT
> + elif doc_inline_end.search(line):
> + self.dump_section()
> + self.state = state.PROTO
> + elif doc_content.search(line):
> + self.emit_msg(ln, f"Incorrect use of kernel-doc format: {line}")
> + self.state = state.PROTO
> + # else ... ??
> +
> + def process_inline_text(self, ln, line):
> + """STATE_INLINE_TEXT: docbook comments within a prototype."""
> +
> + if doc_inline_end.search(line):
> + self.dump_section()
> + self.state = state.PROTO
> + elif doc_content.search(line):
> + self.entry.add_text(doc_content.group(1))
> + # else ... ??
> +
> + def syscall_munge(self, ln, proto): # pylint: disable=W0613
> + """
> + Handle syscall definitions
> + """
> +
> + is_void = False
> +
> + # Strip newlines/CR's
> + proto = re.sub(r'[\r\n]+', ' ', proto)
> +
> + # Check if it's a SYSCALL_DEFINE0
> + if 'SYSCALL_DEFINE0' in proto:
> + is_void = True
> +
> + # Replace SYSCALL_DEFINE with correct return type & function name
> + proto = KernRe(r'SYSCALL_DEFINE.*\(').sub('long sys_', proto)
> +
> + r = KernRe(r'long\s+(sys_.*?),')
> + if r.search(proto):
> + proto = KernRe(',').sub('(', proto, count=1)
> + elif is_void:
> + proto = KernRe(r'\)').sub('(void)', proto, count=1)
> +
> + # Now delete all of the odd-numbered commas in the proto
> + # so that argument types & names don't have a comma between them
> + count = 0
> + length = len(proto)
> +
> + if is_void:
> + length = 0 # skip the loop if is_void
> +
> + for ix in range(length):
> + if proto[ix] == ',':
> + count += 1
> + if count % 2 == 1:
> + proto = proto[:ix] + ' ' + proto[ix + 1:]
> +
> + return proto
> +
> + def tracepoint_munge(self, ln, proto):
> + """
> + Handle tracepoint definitions
> + """
> +
> + tracepointname = None
> + tracepointargs = None
> +
> + # Match tracepoint name based on different patterns
> + r = KernRe(r'TRACE_EVENT\((.*?),')
> + if r.search(proto):
> + tracepointname = r.group(1)
> +
> + r = KernRe(r'DEFINE_SINGLE_EVENT\((.*?),')
> + if r.search(proto):
> + tracepointname = r.group(1)
> +
> + r = KernRe(r'DEFINE_EVENT\((.*?),(.*?),')
> + if r.search(proto):
> + tracepointname = r.group(2)
> +
> + if tracepointname:
> + tracepointname = tracepointname.lstrip()
> +
> + r = KernRe(r'TP_PROTO\((.*?)\)')
> + if r.search(proto):
> + tracepointargs = r.group(1)
> +
> + if not tracepointname or not tracepointargs:
> + self.emit_msg(ln,
> + f"Unrecognized tracepoint format:\n{proto}\n")
> + else:
> + proto = f"static inline void trace_{tracepointname}({tracepointargs})"
> + self.entry.identifier = f"trace_{self.entry.identifier}"
> +
> + return proto
> +
> + def process_proto_function(self, ln, line):
> + """Ancillary routine to process a function prototype"""
> +
> + # strip C99-style comments to end of line
> + line = KernRe(r"\/\/.*$", re.S).sub('', line)
> + #
> + # Soak up the line's worth of prototype text, stopping at { or ; if present.
> + #
> + if KernRe(r'\s*#\s*define').match(line):
> + self.entry.prototype = line
> + elif not line.startswith('#'): # skip other preprocessor stuff
> + r = KernRe(r'([^\{]*)')
> + if r.match(line):
> + self.entry.prototype += r.group(1) + " "
> + #
> + # If we now have the whole prototype, clean it up and declare victory.
> + #
> + if '{' in line or ';' in line or KernRe(r'\s*#\s*define').match(line):
> + # strip comments and surrounding spaces
> + self.entry.prototype = KernRe(r'/\*.*\*/').sub('', self.entry.prototype).strip()
> + #
> + # Handle self.entry.prototypes for function pointers like:
> + # int (*pcs_config)(struct foo)
> + # by turning it into
> + # int pcs_config(struct foo)
> + #
> + r = KernRe(r'^(\S+\s+)\(\s*\*(\S+)\)')
> + self.entry.prototype = r.sub(r'\1\2', self.entry.prototype)
> + #
> + # Handle special declaration syntaxes
> + #
> + if 'SYSCALL_DEFINE' in self.entry.prototype:
> + self.entry.prototype = self.syscall_munge(ln,
> + self.entry.prototype)
> + else:
> + r = KernRe(r'TRACE_EVENT|DEFINE_EVENT|DEFINE_SINGLE_EVENT')
> + if r.search(self.entry.prototype):
> + self.entry.prototype = self.tracepoint_munge(ln,
> + self.entry.prototype)
> + #
> + # ... and we're done
> + #
> + self.dump_function(ln, self.entry.prototype)
> + self.reset_state(ln)
> +
> + def process_proto_type(self, ln, line):
> + """Ancillary routine to process a type"""
> +
> + # Strip C99-style comments and surrounding whitespace
> + line = KernRe(r"//.*$", re.S).sub('', line).strip()
> + if not line:
> + return # nothing to see here
> +
> + # To distinguish preprocessor directive from regular declaration later.
> + if line.startswith('#'):
> + line += ";"
> + #
> + # Split the declaration on any of { } or ;, and accumulate pieces
> + # until we hit a semicolon while not inside {brackets}
> + #
> + r = KernRe(r'(.*?)([{};])')
> + for chunk in r.split(line):
> + if chunk: # Ignore empty matches
> + self.entry.prototype += chunk
> + #
> + # This cries out for a match statement ... someday after we can
> + # drop Python 3.9 ...
> + #
> + if chunk == '{':
> + self.entry.brcount += 1
> + elif chunk == '}':
> + self.entry.brcount -= 1
> + elif chunk == ';' and self.entry.brcount <= 0:
> + self.dump_declaration(ln, self.entry.prototype)
> + self.reset_state(ln)
> + return
> + #
> + # We hit the end of the line while still in the declaration; put
> + # in a space to represent the newline.
> + #
> + self.entry.prototype += ' '
> +
> + def process_proto(self, ln, line):
> + """STATE_PROTO: reading a function/whatever prototype."""
> +
> + if doc_inline_oneline.search(line):
> + self.entry.begin_section(ln, doc_inline_oneline.group(1))
> + self.entry.add_text(doc_inline_oneline.group(2))
> + self.dump_section()
> +
> + elif doc_inline_start.search(line):
> + self.state = state.INLINE_NAME
> +
> + elif self.entry.decl_type == 'function':
> + self.process_proto_function(ln, line)
> +
> + else:
> + self.process_proto_type(ln, line)
> +
> + def process_docblock(self, ln, line):
> + """STATE_DOCBLOCK: within a DOC: block."""
> +
> + if doc_end.search(line):
> + self.dump_section()
> + self.output_declaration("doc", self.entry.identifier)
> + self.reset_state(ln)
> +
> + elif doc_content.search(line):
> + self.entry.add_text(doc_content.group(1))
> +
> + def parse_export(self):
> + """
> + Parses EXPORT_SYMBOL* macros from a single Kernel source file.
> + """
> +
> + export_table = set()
> +
> + try:
> + with open(self.fname, "r", encoding="utf8",
> + errors="backslashreplace") as fp:
> +
> + for line in fp:
> + self.process_export(export_table, line)
> +
> + except IOError:
> + return None
> +
> + return export_table
> +
> + #
> + # The state/action table telling us which function to invoke in
> + # each state.
> + #
> + state_actions = {
> + state.NORMAL: process_normal,
> + state.NAME: process_name,
> + state.BODY: process_body,
> + state.DECLARATION: process_decl,
> + state.SPECIAL_SECTION: process_special,
> + state.INLINE_NAME: process_inline_name,
> + state.INLINE_TEXT: process_inline_text,
> + state.PROTO: process_proto,
> + state.DOCBLOCK: process_docblock,
> + }
> +
> + def parse_kdoc(self):
> + """
> + Open and process each line of a C source file.
> + The parsing is controlled via a state machine, and the line is passed
> + to a different process function depending on the state. The process
> + function may update the state as needed.
> +
> + Besides parsing kernel-doc tags, it also parses export symbols.
> + """
> +
> + prev = ""
> + prev_ln = None
> + export_table = set()
> +
> + try:
> + with open(self.fname, "r", encoding="utf8",
> + errors="backslashreplace") as fp:
> + for ln, line in enumerate(fp):
> +
> + line = line.expandtabs().strip("\n")
> +
> + # Group continuation lines on prototypes
> + if self.state == state.PROTO:
> + if line.endswith("\\"):
> + prev += line.rstrip("\\")
> + if not prev_ln:
> + prev_ln = ln
> + continue
> +
> + if prev:
> + ln = prev_ln
> + line = prev + line
> + prev = ""
> + prev_ln = None
> +
> + self.config.log.debug("%d %s: %s",
> + ln, state.name[self.state],
> + line)
> +
> + # This is an optimization over the original script.
> + # There, when export_file was used for the same file,
> + # it was read twice. Here, we use the already-existing
> + # loop to parse exported symbols as well.
> + #
> + if (self.state != state.NORMAL) or \
> + not self.process_export(export_table, line):
> + # Hand this line to the appropriate state handler
> + self.state_actions[self.state](self, ln, line)
> +
> + except OSError:
> + self.config.log.error(f"Error: Cannot open file {self.fname}")
> +
> + return export_table, self.entries
> diff --git a/scripts/lib/kdoc/kdoc_re.py b/scripts/lib/kdoc/kdoc_re.py
> new file mode 100644
> index 00000000000..612223e1e72
> --- /dev/null
> +++ b/scripts/lib/kdoc/kdoc_re.py
> @@ -0,0 +1,270 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +
> +"""
> +Regular expression ancillary classes.
> +
> +Those help caching regular expressions and do matching for kernel-doc.
> +"""
> +
> +import re
> +
> +# Local cache for regular expressions
> +re_cache = {}
> +
> +
> +class KernRe:
> + """
> + Helper class to simplify regex declaration and usage,
> +
> + It calls re.compile for a given pattern. It also allows adding
> + regular expressions and define sub at class init time.
> +
> + Regular expressions can be cached via an argument, helping to speedup
> + searches.
> + """
> +
> + def _add_regex(self, string, flags):
> + """
> + Adds a new regex or re-use it from the cache.
> + """
> + self.regex = re_cache.get(string, None)
> + if not self.regex:
> + self.regex = re.compile(string, flags=flags)
> + if self.cache:
> + re_cache[string] = self.regex
> +
> + def __init__(self, string, cache=True, flags=0):
> + """
> + Compile a regular expression and initialize internal vars.
> + """
> +
> + self.cache = cache
> + self.last_match = None
> +
> + self._add_regex(string, flags)
> +
> + def __str__(self):
> + """
> + Return the regular expression pattern.
> + """
> + return self.regex.pattern
> +
> + def __add__(self, other):
> + """
> + Allows adding two regular expressions into one.
> + """
> +
> + return KernRe(str(self) + str(other), cache=self.cache or other.cache,
> + flags=self.regex.flags | other.regex.flags)
> +
> + def match(self, string):
> + """
> + Handles a re.match storing its results
> + """
> +
> + self.last_match = self.regex.match(string)
> + return self.last_match
> +
> + def search(self, string):
> + """
> + Handles a re.search storing its results
> + """
> +
> + self.last_match = self.regex.search(string)
> + return self.last_match
> +
> + def findall(self, string):
> + """
> + Alias to re.findall
> + """
> +
> + return self.regex.findall(string)
> +
> + def split(self, string):
> + """
> + Alias to re.split
> + """
> +
> + return self.regex.split(string)
> +
> + def sub(self, sub, string, count=0):
> + """
> + Alias to re.sub
> + """
> +
> + return self.regex.sub(sub, string, count=count)
> +
> + def group(self, num):
> + """
> + Returns the group results of the last match
> + """
> +
> + return self.last_match.group(num)
> +
> +
> +class NestedMatch:
> + """
> + Finding nested delimiters is hard with regular expressions. It is
> + even harder on Python with its normal re module, as there are several
> + advanced regular expressions that are missing.
> +
> + This is the case of this pattern:
> +
> + '\\bSTRUCT_GROUP(\\(((?:(?>[^)(]+)|(?1))*)\\))[^;]*;'
> +
> + which is used to properly match open/close parenthesis of the
> + string search STRUCT_GROUP(),
> +
> + Add a class that counts pairs of delimiters, using it to match and
> + replace nested expressions.
> +
> + The original approach was suggested by:
> + https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
> +
> + Although I re-implemented it to make it more generic and match 3 types
> + of delimiters. The logic checks if delimiters are paired. If not, it
> + will ignore the search string.
> + """
> +
> + # TODO: make NestedMatch handle multiple match groups
> + #
> + # Right now, regular expressions to match it are defined only up to
> + # the start delimiter, e.g.:
> + #
> + # \bSTRUCT_GROUP\(
> + #
> + # is similar to: STRUCT_GROUP\((.*)\)
> + # except that the content inside the match group is delimiter's aligned.
> + #
> + # The content inside parenthesis are converted into a single replace
> + # group (e.g. r`\1').
> + #
> + # It would be nice to change such definition to support multiple
> + # match groups, allowing a regex equivalent to.
> + #
> + # FOO\((.*), (.*), (.*)\)
> + #
> + # it is probably easier to define it not as a regular expression, but
> + # with some lexical definition like:
> + #
> + # FOO(arg1, arg2, arg3)
> +
> + DELIMITER_PAIRS = {
> + '{': '}',
> + '(': ')',
> + '[': ']',
> + }
> +
> + RE_DELIM = re.compile(r'[\{\}\[\]\(\)]')
> +
> + def _search(self, regex, line):
> + """
> + Finds paired blocks for a regex that ends with a delimiter.
> +
> + The suggestion of using finditer to match pairs came from:
> + https://stackoverflow.com/questions/5454322/python-how-to-match-nested-parentheses-with-regex
> + but I ended using a different implementation to align all three types
> + of delimiters and seek for an initial regular expression.
> +
> + The algorithm seeks for open/close paired delimiters and place them
> + into a stack, yielding a start/stop position of each match when the
> + stack is zeroed.
> +
> + The algorithm should work fine for properly paired lines, but will
> + silently ignore end delimiters that precede a start delimiter.
> + This should be OK for kernel-doc parser, as unaligned delimiters
> + would cause compilation errors. So, we don't need to raise exceptions
> + to cover such issues.
> + """
> +
> + stack = []
> +
> + for match_re in regex.finditer(line):
> + start = match_re.start()
> + offset = match_re.end()
> +
> + d = line[offset - 1]
> + if d not in self.DELIMITER_PAIRS:
> + continue
> +
> + end = self.DELIMITER_PAIRS[d]
> + stack.append(end)
> +
> + for match in self.RE_DELIM.finditer(line[offset:]):
> + pos = match.start() + offset
> +
> + d = line[pos]
> +
> + if d in self.DELIMITER_PAIRS:
> + end = self.DELIMITER_PAIRS[d]
> +
> + stack.append(end)
> + continue
> +
> + # Does the end delimiter match what it is expected?
> + if stack and d == stack[-1]:
> + stack.pop()
> +
> + if not stack:
> + yield start, offset, pos + 1
> + break
> +
> + def search(self, regex, line):
> + """
> + This is similar to re.search:
> +
> + It matches a regex that it is followed by a delimiter,
> + returning occurrences only if all delimiters are paired.
> + """
> +
> + for t in self._search(regex, line):
> +
> + yield line[t[0]:t[2]]
> +
> + def sub(self, regex, sub, line, count=0):
> + """
> + This is similar to re.sub:
> +
> + It matches a regex that it is followed by a delimiter,
> + replacing occurrences only if all delimiters are paired.
> +
> + if r'\1' is used, it works just like re: it places there the
> + matched paired data with the delimiter stripped.
> +
> + If count is different than zero, it will replace at most count
> + items.
> + """
> + out = ""
> +
> + cur_pos = 0
> + n = 0
> +
> + for start, end, pos in self._search(regex, line):
> + out += line[cur_pos:start]
> +
> + # Value, ignoring start/end delimiters
> + value = line[end:pos - 1]
> +
> + # replaces \1 at the sub string, if \1 is used there
> + new_sub = sub
> + new_sub = new_sub.replace(r'\1', value)
> +
> + out += new_sub
> +
> + # Drop end ';' if any
> + if line[pos] == ';':
> + pos += 1
> +
> + cur_pos = pos
> + n += 1
> +
> + if count and count >= n:
> + break
> +
> + # Append the remaining string
> + l = len(line)
> + out += line[cur_pos:l]
> +
> + return out
Thanks,
Mauro
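The delimiter-pairing idea described in the NestedMatch docstring above can be sketched in isolation. This is a simplified illustration, not the kernel's code: the function name find_balanced and the sample input are hypothetical, and only the stack-based pairing is reproduced.

```python
import re

# Opening delimiter -> the closing delimiter it expects.
PAIRS = {"(": ")", "[": "]", "{": "}"}
DELIM = re.compile(r"[\{\}\[\]\(\)]")


def find_balanced(prefix, line):
    """Yield (start, body_start, end) for each prefix match that closes.

    prefix must be a regex ending in an opening delimiter, for example
    r"\bstruct_group\(". Matches whose delimiters never pair up yield
    nothing.
    """
    for m in re.finditer(prefix, line):
        opener = line[m.end() - 1]
        if opener not in PAIRS:
            continue
        stack = [PAIRS[opener]]
        # Walk every delimiter after the prefix, tracking nesting depth.
        for d in DELIM.finditer(line, m.end()):
            ch = d.group()
            if ch in PAIRS:
                stack.append(PAIRS[ch])
            elif stack and ch == stack[-1]:
                stack.pop()
                if not stack:
                    yield m.start(), m.end(), d.end()
                    break


line = "struct_group(a, fn(b, c[2]), d); int x;"
spans = list(find_balanced(r"\bstruct_group\(", line))
matched = [line[s:e] for s, _, e in spans]
bodies = [line[b:e - 1] for _, b, e in spans]
print(matched)
print(bodies)
```

A plain regex such as struct_group\((.*)\) would have to guess which closing parenthesis ends the call; counting pairs removes the guesswork.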
* Re: [PATCH for-10.2 4/8] scripts/kernel-doc: strip QEMU_ from function definitions
2025-08-14 17:13 ` [PATCH for-10.2 4/8] scripts/kernel-doc: strip QEMU_ from function definitions Peter Maydell
@ 2025-08-15 10:01 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 10:01 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow
Em Thu, 14 Aug 2025 18:13:19 +0100
Peter Maydell <peter.maydell@linaro.org> escreveu:
> This commit is the Python version of our older commit
> b30df2751e5 ("scripts/kernel-doc: strip QEMU_ from function definitions").
>
> Some versions of Sphinx get confused if function attributes are
> left on the C code from kernel-doc; strip out any QEMU_* prefixes
> from function prototypes.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Yeah, I saw this difference before when I did a quick test meant
to port it to QEMU.
So,
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> scripts/lib/kdoc/kdoc_parser.py | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
> index fe730099eca..32b43562929 100644
> --- a/scripts/lib/kdoc/kdoc_parser.py
> +++ b/scripts/lib/kdoc/kdoc_parser.py
> @@ -907,6 +907,7 @@ def dump_function(self, ln, prototype):
> (r"^__always_inline +", "", 0),
> (r"^noinline +", "", 0),
> (r"^__FORTIFY_INLINE +", "", 0),
> + (r"QEMU_[A-Z_]+ +", "", 0),
> (r"__init +", "", 0),
> (r"__init_or_module +", "", 0),
> (r"__deprecated +", "", 0),
Thanks,
Mauro
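The effect of the added pattern can be checked on its own with the standard re module. This is a minimal sketch: the prototype string and the function name example_fn are made up for illustration, and the surrounding substitution list from kdoc_parser.py is not reproduced.

```python
import re

# The pattern added by the patch: an all-caps QEMU_ macro name followed
# by one or more spaces.
QEMU_ATTR = re.compile(r"QEMU_[A-Z_]+ +")

# Hypothetical prototype carrying a QEMU_* function attribute.
proto = "int QEMU_WARN_UNUSED_RESULT example_fn(int fd)"
stripped = QEMU_ATTR.sub("", proto)
print(stripped)
```

Because the pattern requires trailing spaces after an all-caps name, a function-like macro use such as QEMU_FOO(8) would be left in place.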
* Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one
2025-08-15 9:39 ` Mauro Carvalho Chehab
@ 2025-08-15 10:10 ` Peter Maydell
2025-08-15 11:12 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 24+ messages in thread
From: Peter Maydell @ 2025-08-15 10:10 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Jonathan Cameron, qemu-devel, Paolo Bonzini, John Snow
On Fri, 15 Aug 2025 at 10:39, Mauro Carvalho Chehab
<mchehab+huawei@kernel.org> wrote:
>
> Hi Peter/Jonathan,
>
> Em Fri, 15 Aug 2025 10:11:09 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> escreveu:
>
> > On Thu, 14 Aug 2025 18:13:15 +0100
> > Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > > Earlier this year, the Linux kernel's kernel-doc script was rewritten
> > > from the old Perl version into a shiny and hopefully more maintainable
> > > Python version. This commit series updates our copy of this script
> > > to the latest kernel version. I have tested it by comparing the
> > > generated HTML documentation and checking that there are no
> > > unexpected changes.
>
> Nice! Yeah, I had a branch here doing something similar for QEMU,
> but got sidetracked by other things and didn't have time to address
> a couple of issues. I'm glad you found the time for it.
>
> > > Luckily we are carrying very few local modifications to the Perl
> > > script, so this is fairly straightforward. The structure of the
> > > patchset is:
> > > * a minor update to the kerneldoc.py Sphinx extension so it
> > > will work with both old and new kernel-doc script output
> > > * a fix to a doc comment markup error that I noticed while comparing
> > > the HTML output from the two versions of the script
> > > * import the new Python script, unmodified from the kernel's version
> > > (conveniently the kernel calls it kernel-doc.py, so it doesn't
> > > clash with the existing script)
>
> > > * make the changes to that library code that correspond to the
> > > two local QEMU-specific changes we carry
>
> To make it easier to maintain and keep in sync with Kernel upstream,
> perhaps we can try to change Kernel upstream to make it easier for QEMU
> to have a class override for the kdoc parser, allowing it to just
> sync with Linux upstream, while having its own set of rules on a
> separate file.
Mmm, this would certainly be nice, but at least so far we haven't
needed to make extensive changes, luckily (you can see how small
our local adjustments are here).
> > > * tell sphinx to use the Python version
> > > * delete the Perl script (I have put a diff of our local mods
> > > to the Perl script in the commit message of this commit, for
> > > posterity)
> > >
> > > The diffstat looks big, but almost all of it is "import the
> > > kernel's new script that we trust and don't need to review in
> > > detail" and "delete the old script".
>
> One thing that should be noticed is that Jonathan Corbet is currently
> doing several cleanups at the Python script, simplifying some
> regular expressions, avoiding them when str.replace() does the job
> and adding comments. The end goal is to make it easier for developers
> to understand and help maintain its code.
>
> So, it is probably worth backporting Linux upstream changes after
> the end of Kernel 6.17 cycle.
Thanks for the heads-up on that one. A further sync should
be straightforward after this one, I expect.
> > > We should also update the Sphinx plugin itself (i.e.
> > > docs/sphinx/kerneldoc.py), but because I did not need to do
> > > that to update the main kernel-doc script, I have left that as
> > > a separate todo item.
>
> The Kernel Sphinx plugin after the change is IMHO (*) a lot cleaner
> than before, and handles kernel-doc warnings better, as they now
> use the Sphinx logger class.
Also as much as anything else it's just nice for us not to
diverge if we can avoid it.
Incidentally, I'm curious if the kernel docs see problems
with docutils 0.22 -- we had a report about problems there,
at least some of which seem to be because the way kerneldoc.py
adds its rST output is triggering the new docutils to complain
if the added code doesn't have a consistent title style
hierarchy: https://sourceforge.net/p/docutils/bugs/508/
(It looks like they're trying to address this on the docutils side;
we might or might not adjust on our side too by fixing up the
title styles if that's not too awkward for us.)
> Btw, one important point to notice: if you picked the latest version
> of kernel-doc, it currently requires at least Python 3.6 (3.7 is the
> recommended minimal one). It does check that, silently bailing out
> if Python < 3.6.
QEMU already requires Python 3.9 or better; our configure checks:
check_py_version() {
# We require python >= 3.9.
# NB: a True python conditional creates a non-zero return code (Failure)
"$1" -c 'import sys; sys.exit(sys.version_info < (3,9))'
}
Thanks for the confirmation that the kernel is being more
conservative on python requirements than we are; I did
wonder about this but merely assumed you probably were
rather than specifically checking :-)
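The configure check above relies on Python comparing version tuples element by element, and on sys.exit() turning a True result into a non-zero exit status. A small sketch of the same comparison, where the helper name version_ok is hypothetical:

```python
import sys

MIN_VERSION = (3, 9)


def version_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) pair is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


# The shell snippet passes a bool to sys.exit(); True becomes exit status 1
# (failure) and False becomes exit status 0 (success).
too_old_status = int((3, 8, 10) < MIN_VERSION)
new_enough_status = int((3, 12, 1) < MIN_VERSION)
print(too_old_status, new_enough_status)
```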
On this minor output change:
> > > "Definition" sections now get output with a trailing colon:
> > >
> > > -<p><strong>Definition</strong></p>
> > > +<div class="kernelindent docutils container">
> > > +<p><strong>Definition</strong>:</p>
> > >
> > > This seems like it might be a bug in kernel-doc since the Parameters,
> > > Return, etc sections don't get the trailing colon. I don't think it's
> > > important enough to worry about.
is the extra colon intentional, or do you agree that it's
a bug? You can see it in the kernel docs output at e.g.
https://docs.kernel.org/core-api/workqueue.html#c.workqueue_attrs
where in the documentation of struct workqueue_attrs,
"Definition:" gets a colon but the corresponding "Members"
and "Description" don't. (Also "Description" is out-dented
there when it probably should not be, but that's separate.)
thanks
-- PMM
* Re: [PATCH for-10.2 2/8] tests/qtest/libqtest.h: Remove stray space from doc comment
2025-08-15 9:51 ` Mauro Carvalho Chehab
@ 2025-08-15 10:14 ` Peter Maydell
0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2025-08-15 10:14 UTC (permalink / raw)
To: Mauro Carvalho Chehab; +Cc: qemu-devel, Paolo Bonzini, John Snow
On Fri, 15 Aug 2025 at 10:51, Mauro Carvalho Chehab
<mchehab+huawei@kernel.org> wrote:
>
> Em Thu, 14 Aug 2025 18:13:17 +0100
> Peter Maydell <peter.maydell@linaro.org> escreveu:
>
> > The doc comment for qtest_cb_for_every_machine has a stray
> > space at the start of its description, which makes kernel-doc
> > think that this line is part of the documentation of the
> > skip_old_versioned argument. The result is that the HTML
> > doesn't have a "Description" section and the text is instead
> > put in the wrong place.
> >
> > Remove the stray space.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>
> LGTM. Even the previous version should have handled it wrong here
> (if not, it is a bug there - or perhaps QEMU version was using
> a very old kernel-doc.pl version).
Yes, the documentation comes out looking wrong on the
old version too -- I only noticed this because I was
examining the diffs of the HTML for before and after
and the exact way it's rendered changed, so it showed up.
Easiest way to reduce the diff was to fix our markup
error :-)
You can see how the old version outputs it at:
https://www.qemu.org/docs/master/devel/testing/qtest.html#c.qtest_cb_for_every_machine
-- PMM
* Re: [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel
2025-08-14 17:13 ` [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel Peter Maydell
2025-08-15 10:00 ` Mauro Carvalho Chehab
@ 2025-08-15 10:19 ` Peter Maydell
1 sibling, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2025-08-15 10:19 UTC (permalink / raw)
To: qemu-devel; +Cc: Paolo Bonzini, John Snow
On Thu, 14 Aug 2025 at 18:13, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> We last synced our copy of kerneldoc with Linux back in 2020. In the
> interim, upstream has entirely rewritten the script in Python, and
> the new Python version is split into a main script plus some
> libraries in the kernel's scripts/lib/kdoc.
>
> Import all these files. These are the versions as of kernel commit
> 0cc53520e68be, with no local changes.
>
> We use the same lib/kdoc/ directory as the kernel does here, so we
> can avoid having to edit the top-level script just to adjust a
> pathname, even though it is probably not the naming we would have
> picked if this was a purely QEMU script.
> scripts/kernel-doc.py | 325 ++++++
> scripts/lib/kdoc/kdoc_files.py | 291 ++++++
> scripts/lib/kdoc/kdoc_item.py | 42 +
> scripts/lib/kdoc/kdoc_output.py | 749 ++++++++++++++
> scripts/lib/kdoc/kdoc_parser.py | 1669 +++++++++++++++++++++++++++++++
> scripts/lib/kdoc/kdoc_re.py | 270 +++++
We could alternatively put the whole thing into a subdir, e.g.:
scripts/kdoc/kernel-doc.py
scripts/kdoc/lib/kdoc/kdoc_files.py
scripts/kdoc/lib/kdoc/kdoc_item.py
etc
if we don't like having lib/ at the top level of our
scripts/ directory. The thing the script cares about
is that the relative path from kernel-doc.py to the
lib/kdoc files is the same.
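To illustrate why only the relative layout matters, here is a sketch of a script locating its library directory from its own path. It assumes the script derives the directory this way; the helper name kdoc_lib_dir and the /src paths are hypothetical.

```python
import os


def kdoc_lib_dir(script_path):
    """Return the lib/kdoc directory that sits next to the given script."""
    script_dir = os.path.dirname(script_path)
    return os.path.join(script_dir, "lib", "kdoc")


# Both layouts from the discussion: the library is found either way,
# because it is always resolved relative to the script itself.
top = kdoc_lib_dir("/src/scripts/kernel-doc.py")
sub = kdoc_lib_dir("/src/scripts/kdoc/kernel-doc.py")
print(top)
print(sub)
```

In a real script the argument would typically come from __file__, and the resulting directory would be added to sys.path before importing the library modules.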
thanks
-- PMM
* Re: [PATCH for-10.2 5/8] scripts/kernel-doc: tweak for QEMU coding standards
2025-08-14 17:13 ` [PATCH for-10.2 5/8] scripts/kernel-doc: tweak for QEMU coding standards Peter Maydell
@ 2025-08-15 10:34 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 10:34 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow
Em Thu, 14 Aug 2025 18:13:20 +0100
Peter Maydell <peter.maydell@linaro.org> escreveu:
> This commit makes the equivalent changes to the Python script that we
> had for the old Perl script in commit 4cf41794411f ("docs: tweak
> kernel-doc for QEMU coding standards"). To repeat the rationale from
> that commit:
>
> Surprisingly, QEMU does have a pretty consistent doc comment style and
> it is not very different from the Linux kernel's. Of the documentation
> "sigils", only "#" separates the QEMU doc comment style from Linux's,
> and it has 200+ instances vs. 6 for the kernel's '&struct foo' (all in
> accel/tcg/translate-all.c), so it's clear that the two standards are
> different in this respect. In addition, our structs are typedefed and
> recognized by CamelCase names.
>
> Note that in 4cf41794411f we used '(?!)' as our type_fallback regex;
> this is strictly not quite a replacement for the upstream
> '\&([_\w]+)', because the latter includes a group that can later be
> matched with \1, and the former does not. The old perl script did
> not care about this, but the python version does, so we must include
> the extra set of brackets to ensure we have a group.
>
> This commit does not include all the same changes that 4cf41794411f
> did. Of the missing pieces, some had already gone in an earlier
> kernel-doc update; the parts we still had but do not include here are:
>
> @@ -2057,7 +2060,7 @@
> }
> elsif (/$doc_decl/o) {
> $identifier = $1;
> - if (/\s*([\w\s]+?)(\(\))?\s*-/) {
> + if (/\s*([\w\s]+?)(\s*-|:)/) {
> $identifier = $1;
> }
>
> @@ -2067,7 +2070,7 @@
> $contents = "";
> $section = $section_default;
> $new_start_line = $. + 1;
> - if (/-(.*)/) {
> + if (/[-:](.*)/) {
> # strip leading/trailing/multiple spaces
> $descr= $1;
> $descr =~ s/^\s*//;
>
> The second of these is already in the upstream version: the line r =
> KernRe("[-:](.*)") in process_name() matches the regex we have.
Yes. If I recall correctly, we added this one to solve some issues on a
couple of files that were full of ":" as separator. They violate what
is documented as a valid kernel-doc markup, but it didn't hurt adding
support for such variant.
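The two regex points from the commit message can be checked with plain re, without the KernRe wrapper. This is only a sketch: the sample doc-comment lines and the name qemu_foo are invented for illustration.

```python
import re

# "[-:](.*)" accepts either "-" or ":" between the name and its description.
name_sep = re.compile(r"[-:](.*)")
samples = ("qemu_foo - do a thing", "qemu_foo: do a thing")
descrs = [name_sep.search(text).group(1).strip() for text in samples]
print(descrs)

# "(?!)" is an empty negative lookahead, so it can never match. Wrapping it
# in brackets keeps that behaviour but gives the pattern one capture group,
# which a replacement template can then refer to as \1.
never_plain = re.compile(r"(?!)")
never_grouped = re.compile(r"((?!))")
print(never_plain.groups, never_grouped.groups)
print(never_grouped.search("#Foo and &bar"))
```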
> The
> first change has been refactored into the doc_begin_data and
> doc_begin_func changes. Since the output HTML for QEMU's
> documentation has no relevant changes with the new kerneldoc, we
> assume that this too has been handled upstream.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
LGTM, but see my notes below.
Anyway:
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> scripts/lib/kdoc/kdoc_output.py | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
> index ea8914537ba..39fa872dfca 100644
> --- a/scripts/lib/kdoc/kdoc_output.py
> +++ b/scripts/lib/kdoc/kdoc_output.py
> @@ -38,12 +38,12 @@
> type_fp_param2 = KernRe(r"\@(\w+->\S+)\(\)", cache=False)
>
> type_env = KernRe(r"(\$\w+)", cache=False)
> -type_enum = KernRe(r"\&(enum\s*([_\w]+))", cache=False)
> -type_struct = KernRe(r"\&(struct\s*([_\w]+))", cache=False)
> -type_typedef = KernRe(r"\&(typedef\s*([_\w]+))", cache=False)
> -type_union = KernRe(r"\&(union\s*([_\w]+))", cache=False)
> -type_member = KernRe(r"\&([_\w]+)(\.|->)([_\w]+)", cache=False)
> -type_fallback = KernRe(r"\&([_\w]+)", cache=False)
> +type_enum = KernRe(r"#(enum\s*([_\w]+))", cache=False)
> +type_struct = KernRe(r"#(struct\s*([_\w]+))", cache=False)
> +type_typedef = KernRe(r"#(([A-Z][_\w]*))", cache=False)
> +type_union = KernRe(r"#(union\s*([_\w]+))", cache=False)
> +type_member = KernRe(r"#([_\w]+)(\.|->)([_\w]+)", cache=False)
> +type_fallback = KernRe(r"((?!))", cache=False) # this never matches
> type_member_func = type_member + KernRe(r"\(\)", cache=False)
That seems like something a class override would address better.
Basically, you can do something like:
    type_enum = KernRe(r"#(enum\s*([_\w]+))", cache=False)
    type_struct = KernRe(r"#(struct\s*([_\w]+))", cache=False)
    type_typedef = KernRe(r"#(([A-Z][_\w]*))", cache=False)
    type_union = KernRe(r"#(union\s*([_\w]+))", cache=False)
    type_member = KernRe(r"#([_\w]+)(\.|->)([_\w]+)", cache=False)
    type_fallback = KernRe(r"((?!))", cache=False) # this never matches
    ...
(either keep the other types or add an __init__ that would append
or replace only the above elements)
    class QemuRestFormat(RestFormatOutput):
        highlights = [
            (type_constant, r"``\1``"),
            (type_constant2, r"``\1``"),
            # Note: need to escape () to avoid func matching later
            (type_member_func, r":c:type:`\1\2\3\\(\\) <\1>`"),
            (type_member, r":c:type:`\1\2\3 <\1>`"),
            (type_fp_param, r"**\1\\(\\)**"),
            (type_fp_param2, r"**\1\\(\\)**"),
            (type_func, r"\1()"),
            (type_enum, r":c:type:`\1 <\2>`"),
            (type_struct, r":c:type:`\1 <\2>`"),
            (type_typedef, r":c:type:`\1 <\2>`"),
            (type_union, r":c:type:`\1 <\2>`"),
            # in rst this can refer to any type
            (type_fallback, r":c:type:`\1`"),
            (type_param_ref, r"**\1\2**")
        ]
where the regexes above are the QEMU-specific ones.
Then, when creating a KernelFiles() instance in the kerneldoc.py Sphinx
extension:
    def setup_kfiles(app):
        global kfiles
        out_style = QemuRestFormat()
        kfiles = KernelFiles(out_style=out_style, logger=logger)
keeping the remaining code of the kernel version of kerneldoc.py.
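To show what such a highlights table does to a doc comment, here is a self-contained sketch under a few assumptions: it uses plain re instead of KernRe, it covers only four of the patterns, and HIGHLIGHTS and apply_highlights are hypothetical names for this example rather than the actual RestFormatOutput interface:

```python
import re

# QEMU-style patterns from the patch, compiled with plain `re`
# (the real code uses the KernRe wrapper with cache=False).
HIGHLIGHTS = [
    (re.compile(r"#(enum\s*([_\w]+))"), r":c:type:`\1 <\2>`"),
    (re.compile(r"#(struct\s*([_\w]+))"), r":c:type:`\1 <\2>`"),
    (re.compile(r"#(([A-Z][_\w]*))"), r":c:type:`\1 <\2>`"),
    (re.compile(r"#(union\s*([_\w]+))"), r":c:type:`\1 <\2>`"),
]


def apply_highlights(text):
    """Apply each (pattern, template) pair in order, as a highlights table would."""
    for pattern, template in HIGHLIGHTS:
        text = pattern.sub(template, text)
    return text


if __name__ == "__main__":
    print(apply_highlights("Uses #MemoryRegion and #struct foo"))
    print(apply_highlights("see &struct foo"))
```

The '#' forms become :c:type: references, while the kernel-style '&struct foo' is left alone, which is the behaviour the QEMU-specific regexes are meant to give.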
Thanks,
Mauro
* Re: [PATCH for-10.2 7/8] scripts/kernel-doc: Delete the old Perl kernel-doc script
2025-08-14 17:13 ` [PATCH for-10.2 7/8] scripts/kernel-doc: Delete the old Perl " Peter Maydell
@ 2025-08-15 10:35 ` Mauro Carvalho Chehab
From: Mauro Carvalho Chehab @ 2025-08-15 10:35 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow
On Thu, 14 Aug 2025 18:13:22 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:
> We can now delete the old Perl kernel-doc script. For posterity,
> this is a complete diff of the local changes that we were carrying
> between the kernel's Perl script as of kernel commit 72b97d0b911872ba
> (the last time we synced it) and our local copy:
>
> --- /tmp/kdoc 2025-08-14 10:42:47.620331939 +0100
> +++ scripts/kernel-doc 2025-02-17 10:44:34.528421457 +0000
> @@ -1,5 +1,5 @@
> #!/usr/bin/env perl
> -# SPDX-License-Identifier: GPL-2.0
> +# SPDX-License-Identifier: GPL-2.0-only
>
> use warnings;
> use strict;
> @@ -224,12 +224,12 @@
> my $type_fp_param = '\@(\w+)\(\)'; # Special RST handling for func ptr params
> my $type_fp_param2 = '\@(\w+->\S+)\(\)'; # Special RST handling for structs with func ptr params
> my $type_env = '(\$\w+)';
> -my $type_enum = '\&(enum\s*([_\w]+))';
> -my $type_struct = '\&(struct\s*([_\w]+))';
> -my $type_typedef = '\&(typedef\s*([_\w]+))';
> -my $type_union = '\&(union\s*([_\w]+))';
> -my $type_member = '\&([_\w]+)(\.|->)([_\w]+)';
> -my $type_fallback = '\&([_\w]+)';
> +my $type_enum = '#(enum\s*([_\w]+))';
> +my $type_struct = '#(struct\s*([_\w]+))';
> +my $type_typedef = '#(([A-Z][_\w]*))';
> +my $type_union = '#(union\s*([_\w]+))';
> +my $type_member = '#([_\w]+)(\.|->)([_\w]+)';
> +my $type_fallback = '(?!)'; # this never matches
> my $type_member_func = $type_member . '\(\)';
>
> # Output conversion substitutions.
> @@ -1745,6 +1745,9 @@
> )+
> \)\)\s+//x;
>
> + # Strip QEMU specific compiler annotations
> + $prototype =~ s/QEMU_[A-Z_]+ +//;
> +
> # Yes, this truly is vile. We are looking for:
> # 1. Return type (may be nothing if we're looking at a macro)
> # 2. Function name
> @@ -2057,7 +2060,7 @@
> }
> elsif (/$doc_decl/o) {
> $identifier = $1;
> - if (/\s*([\w\s]+?)(\(\))?\s*-/) {
> + if (/\s*([\w\s]+?)(\s*-|:)/) {
> $identifier = $1;
> }
>
> @@ -2067,7 +2070,7 @@
> $contents = "";
> $section = $section_default;
> $new_start_line = $. + 1;
> - if (/-(.*)/) {
> + if (/[-:](.*)/) {
> # strip leading/trailing/multiple spaces
> $descr= $1;
> $descr =~ s/^\s*//;
>
> These changes correspond to:
> 06e2329636f license: Update deprecated SPDX tag GPL-2.0 to GPL-2.0-only
> (a bulk change which we won't bother to re-apply to this third-party script)
> b30df2751e5 scripts/kernel-doc: strip QEMU_ from function definitions
> 4cf41794411 docs: tweak kernel-doc for QEMU coding standards
>
> We have already applied the equivalent of these changes to the
> Python code in scripts/lib/kdoc/ in the preceding commits.
>
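For reference, the QEMU_ stripping in the diff above translates to Python roughly as follows. This is only a sketch with plain re; strip_qemu_annotation and the sample prototype are hypothetical and are not taken from the kdoc library:

```python
import re

# Python rendering of the Perl s/QEMU_[A-Z_]+ +//; like the Perl
# substitution without /g, it removes only the first occurrence.
QEMU_ANNOTATION = re.compile(r"QEMU_[A-Z_]+ +")


def strip_qemu_annotation(prototype):
    """Drop the first QEMU_-prefixed compiler annotation from a prototype."""
    return QEMU_ANNOTATION.sub("", prototype, count=1)


if __name__ == "__main__":
    print(strip_qemu_annotation("int QEMU_WARN_UNUSED_RESULT foo(int a)"))
```

The annotation and its trailing spaces are removed, so the prototype parser sees an ordinary return type followed by the function name.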
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
LGTM.
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> .editorconfig | 2 +-
> scripts/kernel-doc | 2442 --------------------------------------------
> 2 files changed, 1 insertion(+), 2443 deletions(-)
> delete mode 100755 scripts/kernel-doc
>
> diff --git a/.editorconfig b/.editorconfig
> index a04cb9054cb..258d41ab485 100644
> --- a/.editorconfig
> +++ b/.editorconfig
> @@ -55,7 +55,7 @@ indent_size = 4
> emacs_mode = perl
>
> # but user kernel "style" for imported scripts
> -[scripts/{kernel-doc,get_maintainer.pl,checkpatch.pl}]
> +[scripts/{get_maintainer.pl,checkpatch.pl}]
> indent_style = tab
> indent_size = 8
> emacs_mode = perl
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> deleted file mode 100755
> index fec83f53eda..00000000000
> --- a/scripts/kernel-doc
> +++ /dev/null
> @@ -1,2442 +0,0 @@
> -#!/usr/bin/env perl
> -# SPDX-License-Identifier: GPL-2.0-only
> -
> -use warnings;
> -use strict;
> -
> -## Copyright (c) 1998 Michael Zucchi, All Rights Reserved ##
> -## Copyright (C) 2000, 1 Tim Waugh <twaugh@redhat.com> ##
> -## Copyright (C) 2001 Simon Huggins ##
> -## Copyright (C) 2005-2012 Randy Dunlap ##
> -## Copyright (C) 2012 Dan Luedtke ##
> -## ##
> -## #define enhancements by Armin Kuster <akuster@mvista.com> ##
> -## Copyright (c) 2000 MontaVista Software, Inc. ##
> -## ##
> -## This software falls under the GNU General Public License. ##
> -## Please read the COPYING file for more information ##
> -
> -# 18/01/2001 - Cleanups
> -# Functions prototyped as foo(void) same as foo()
> -# Stop eval'ing where we don't need to.
> -# -- huggie@earth.li
> -
> -# 27/06/2001 - Allowed whitespace after initial "/**" and
> -# allowed comments before function declarations.
> -# -- Christian Kreibich <ck@whoop.org>
> -
> -# Still to do:
> -# - add perldoc documentation
> -# - Look more closely at some of the scarier bits :)
> -
> -# 26/05/2001 - Support for separate source and object trees.
> -# Return error code.
> -# Keith Owens <kaos@ocs.com.au>
> -
> -# 23/09/2001 - Added support for typedefs, structs, enums and unions
> -# Support for Context section; can be terminated using empty line
> -# Small fixes (like spaces vs. \s in regex)
> -# -- Tim Jansen <tim@tjansen.de>
> -
> -# 25/07/2012 - Added support for HTML5
> -# -- Dan Luedtke <mail@danrl.de>
> -
> -sub usage {
> - my $message = <<"EOF";
> -Usage: $0 [OPTION ...] FILE ...
> -
> -Read C language source or header FILEs, extract embedded documentation comments,
> -and print formatted documentation to standard output.
> -
> -The documentation comments are identified by "/**" opening comment mark. See
> -Documentation/doc-guide/kernel-doc.rst for the documentation comment syntax.
> -
> -Output format selection (mutually exclusive):
> - -man Output troff manual page format. This is the default.
> - -rst Output reStructuredText format.
> - -none Do not output documentation, only warnings.
> -
> -Output format selection modifier (affects only ReST output):
> -
> - -sphinx-version Use the ReST C domain dialect compatible with an
> - specific Sphinx Version.
> - If not specified, kernel-doc will auto-detect using
> - the sphinx-build version found on PATH.
> -
> -Output selection (mutually exclusive):
> - -export Only output documentation for symbols that have been
> - exported using EXPORT_SYMBOL() or EXPORT_SYMBOL_GPL()
> - in any input FILE or -export-file FILE.
> - -internal Only output documentation for symbols that have NOT been
> - exported using EXPORT_SYMBOL() or EXPORT_SYMBOL_GPL()
> - in any input FILE or -export-file FILE.
> - -function NAME Only output documentation for the given function(s)
> - or DOC: section title(s). All other functions and DOC:
> - sections are ignored. May be specified multiple times.
> - -nosymbol NAME Exclude the specified symbols from the output
> - documentation. May be specified multiple times.
> -
> -Output selection modifiers:
> - -no-doc-sections Do not output DOC: sections.
> - -enable-lineno Enable output of #define LINENO lines. Only works with
> - reStructuredText format.
> - -export-file FILE Specify an additional FILE in which to look for
> - EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL(). To be used with
> - -export or -internal. May be specified multiple times.
> -
> -Other parameters:
> - -v Verbose output, more warnings and other information.
> - -h Print this help.
> - -Werror Treat warnings as errors.
> -
> -EOF
> - print $message;
> - exit 1;
> -}
> -
> -#
> -# format of comments.
> -# In the following table, (...)? signifies optional structure.
> -# (...)* signifies 0 or more structure elements
> -# /**
> -# * function_name(:)? (- short description)?
> -# (* @parameterx: (description of parameter x)?)*
> -# (* a blank line)?
> -# * (Description:)? (Description of function)?
> -# * (section header: (section description)? )*
> -# (*)?*/
> -#
> -# So .. the trivial example would be:
> -#
> -# /**
> -# * my_function
> -# */
> -#
> -# If the Description: header tag is omitted, then there must be a blank line
> -# after the last parameter specification.
> -# e.g.
> -# /**
> -# * my_function - does my stuff
> -# * @my_arg: its mine damnit
> -# *
> -# * Does my stuff explained.
> -# */
> -#
> -# or, could also use:
> -# /**
> -# * my_function - does my stuff
> -# * @my_arg: its mine damnit
> -# * Description: Does my stuff explained.
> -# */
> -# etc.
> -#
> -# Besides functions you can also write documentation for structs, unions,
> -# enums and typedefs. Instead of the function name you must write the name
> -# of the declaration; the struct/union/enum/typedef must always precede
> -# the name. Nesting of declarations is not supported.
> -# Use the argument mechanism to document members or constants.
> -# e.g.
> -# /**
> -# * struct my_struct - short description
> -# * @a: first member
> -# * @b: second member
> -# *
> -# * Longer description
> -# */
> -# struct my_struct {
> -# int a;
> -# int b;
> -# /* private: */
> -# int c;
> -# };
> -#
> -# All descriptions can be multiline, except the short function description.
> -#
> -# For really longs structs, you can also describe arguments inside the
> -# body of the struct.
> -# eg.
> -# /**
> -# * struct my_struct - short description
> -# * @a: first member
> -# * @b: second member
> -# *
> -# * Longer description
> -# */
> -# struct my_struct {
> -# int a;
> -# int b;
> -# /**
> -# * @c: This is longer description of C
> -# *
> -# * You can use paragraphs to describe arguments
> -# * using this method.
> -# */
> -# int c;
> -# };
> -#
> -# This should be use only for struct/enum members.
> -#
> -# You can also add additional sections. When documenting kernel functions you
> -# should document the "Context:" of the function, e.g. whether the functions
> -# can be called form interrupts. Unlike other sections you can end it with an
> -# empty line.
> -# A non-void function should have a "Return:" section describing the return
> -# value(s).
> -# Example-sections should contain the string EXAMPLE so that they are marked
> -# appropriately in DocBook.
> -#
> -# Example:
> -# /**
> -# * user_function - function that can only be called in user context
> -# * @a: some argument
> -# * Context: !in_interrupt()
> -# *
> -# * Some description
> -# * Example:
> -# * user_function(22);
> -# */
> -# ...
> -#
> -#
> -# All descriptive text is further processed, scanning for the following special
> -# patterns, which are highlighted appropriately.
> -#
> -# 'funcname()' - function
> -# '$ENVVAR' - environmental variable
> -# '&struct_name' - name of a structure (up to two words including 'struct')
> -# '&struct_name.member' - name of a structure member
> -# '@parameter' - name of a parameter
> -# '%CONST' - name of a constant.
> -# '``LITERAL``' - literal string without any spaces on it.
> -
> -## init lots of data
> -
> -my $errors = 0;
> -my $warnings = 0;
> -my $anon_struct_union = 0;
> -
> -# match expressions used to find embedded type information
> -my $type_constant = '\b``([^\`]+)``\b';
> -my $type_constant2 = '\%([-_\w]+)';
> -my $type_func = '(\w+)\(\)';
> -my $type_param = '\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)';
> -my $type_param_ref = '([\!]?)\@(\w*((\.\w+)|(->\w+))*(\.\.\.)?)';
> -my $type_fp_param = '\@(\w+)\(\)'; # Special RST handling for func ptr params
> -my $type_fp_param2 = '\@(\w+->\S+)\(\)'; # Special RST handling for structs with func ptr params
> -my $type_env = '(\$\w+)';
> -my $type_enum = '#(enum\s*([_\w]+))';
> -my $type_struct = '#(struct\s*([_\w]+))';
> -my $type_typedef = '#(([A-Z][_\w]*))';
> -my $type_union = '#(union\s*([_\w]+))';
> -my $type_member = '#([_\w]+)(\.|->)([_\w]+)';
> -my $type_fallback = '(?!)'; # this never matches
> -my $type_member_func = $type_member . '\(\)';
> -
> -# Output conversion substitutions.
> -# One for each output format
> -
> -# these are pretty rough
> -my @highlights_man = (
> - [$type_constant, "\$1"],
> - [$type_constant2, "\$1"],
> - [$type_func, "\\\\fB\$1\\\\fP"],
> - [$type_enum, "\\\\fI\$1\\\\fP"],
> - [$type_struct, "\\\\fI\$1\\\\fP"],
> - [$type_typedef, "\\\\fI\$1\\\\fP"],
> - [$type_union, "\\\\fI\$1\\\\fP"],
> - [$type_param, "\\\\fI\$1\\\\fP"],
> - [$type_param_ref, "\\\\fI\$1\$2\\\\fP"],
> - [$type_member, "\\\\fI\$1\$2\$3\\\\fP"],
> - [$type_fallback, "\\\\fI\$1\\\\fP"]
> - );
> -my $blankline_man = "";
> -
> -# rst-mode
> -my @highlights_rst = (
> - [$type_constant, "``\$1``"],
> - [$type_constant2, "``\$1``"],
> - # Note: need to escape () to avoid func matching later
> - [$type_member_func, "\\:c\\:type\\:`\$1\$2\$3\\\\(\\\\) <\$1>`"],
> - [$type_member, "\\:c\\:type\\:`\$1\$2\$3 <\$1>`"],
> - [$type_fp_param, "**\$1\\\\(\\\\)**"],
> - [$type_fp_param2, "**\$1\\\\(\\\\)**"],
> - [$type_func, "\$1()"],
> - [$type_enum, "\\:c\\:type\\:`\$1 <\$2>`"],
> - [$type_struct, "\\:c\\:type\\:`\$1 <\$2>`"],
> - [$type_typedef, "\\:c\\:type\\:`\$1 <\$2>`"],
> - [$type_union, "\\:c\\:type\\:`\$1 <\$2>`"],
> - # in rst this can refer to any type
> - [$type_fallback, "\\:c\\:type\\:`\$1`"],
> - [$type_param_ref, "**\$1\$2**"]
> - );
> -my $blankline_rst = "\n";
> -
> -# read arguments
> -if ($#ARGV == -1) {
> - usage();
> -}
> -
> -my $kernelversion;
> -my ($sphinx_major, $sphinx_minor, $sphinx_patch);
> -
> -my $dohighlight = "";
> -
> -my $verbose = 0;
> -my $Werror = 0;
> -my $output_mode = "rst";
> -my $output_preformatted = 0;
> -my $no_doc_sections = 0;
> -my $enable_lineno = 0;
> -my @highlights = @highlights_rst;
> -my $blankline = $blankline_rst;
> -my $modulename = "Kernel API";
> -
> -use constant {
> - OUTPUT_ALL => 0, # output all symbols and doc sections
> - OUTPUT_INCLUDE => 1, # output only specified symbols
> - OUTPUT_EXPORTED => 2, # output exported symbols
> - OUTPUT_INTERNAL => 3, # output non-exported symbols
> -};
> -my $output_selection = OUTPUT_ALL;
> -my $show_not_found = 0; # No longer used
> -
> -my @export_file_list;
> -
> -my @build_time;
> -if (defined($ENV{'KBUILD_BUILD_TIMESTAMP'}) &&
> - (my $seconds = `date -d"${ENV{'KBUILD_BUILD_TIMESTAMP'}}" +%s`) ne '') {
> - @build_time = gmtime($seconds);
> -} else {
> - @build_time = localtime;
> -}
> -
> -my $man_date = ('January', 'February', 'March', 'April', 'May', 'June',
> - 'July', 'August', 'September', 'October',
> - 'November', 'December')[$build_time[4]] .
> - " " . ($build_time[5]+1900);
> -
> -# Essentially these are globals.
> -# They probably want to be tidied up, made more localised or something.
> -# CAVEAT EMPTOR! Some of the others I localised may not want to be, which
> -# could cause "use of undefined value" or other bugs.
> -my ($function, %function_table, %parametertypes, $declaration_purpose);
> -my %nosymbol_table = ();
> -my $declaration_start_line;
> -my ($type, $declaration_name, $return_type);
> -my ($newsection, $newcontents, $prototype, $brcount, %source_map);
> -
> -if (defined($ENV{'KBUILD_VERBOSE'})) {
> - $verbose = "$ENV{'KBUILD_VERBOSE'}";
> -}
> -
> -if (defined($ENV{'KDOC_WERROR'})) {
> - $Werror = "$ENV{'KDOC_WERROR'}";
> -}
> -
> -if (defined($ENV{'KCFLAGS'})) {
> - my $kcflags = "$ENV{'KCFLAGS'}";
> -
> - if ($kcflags =~ /Werror/) {
> - $Werror = 1;
> - }
> -}
> -
> -# Generated docbook code is inserted in a template at a point where
> -# docbook v3.1 requires a non-zero sequence of RefEntry's; see:
> -# https://www.oasis-open.org/docbook/documentation/reference/html/refentry.html
> -# We keep track of number of generated entries and generate a dummy
> -# if needs be to ensure the expanded template can be postprocessed
> -# into html.
> -my $section_counter = 0;
> -
> -my $lineprefix="";
> -
> -# Parser states
> -use constant {
> - STATE_NORMAL => 0, # normal code
> - STATE_NAME => 1, # looking for function name
> - STATE_BODY_MAYBE => 2, # body - or maybe more description
> - STATE_BODY => 3, # the body of the comment
> - STATE_BODY_WITH_BLANK_LINE => 4, # the body, which has a blank line
> - STATE_PROTO => 5, # scanning prototype
> - STATE_DOCBLOCK => 6, # documentation block
> - STATE_INLINE => 7, # gathering doc outside main block
> -};
> -my $state;
> -my $in_doc_sect;
> -my $leading_space;
> -
> -# Inline documentation state
> -use constant {
> - STATE_INLINE_NA => 0, # not applicable ($state != STATE_INLINE)
> - STATE_INLINE_NAME => 1, # looking for member name (@foo:)
> - STATE_INLINE_TEXT => 2, # looking for member documentation
> - STATE_INLINE_END => 3, # done
> - STATE_INLINE_ERROR => 4, # error - Comment without header was found.
> - # Spit a warning as it's not
> - # proper kernel-doc and ignore the rest.
> -};
> -my $inline_doc_state;
> -
> -#declaration types: can be
> -# 'function', 'struct', 'union', 'enum', 'typedef'
> -my $decl_type;
> -
> -my $doc_start = '^/\*\*\s*$'; # Allow whitespace at end of comment start.
> -my $doc_end = '\*/';
> -my $doc_com = '\s*\*\s*';
> -my $doc_com_body = '\s*\* ?';
> -my $doc_decl = $doc_com . '(\w+)';
> -# @params and a strictly limited set of supported section names
> -my $doc_sect = $doc_com .
> - '\s*(\@[.\w]+|\@\.\.\.|description|context|returns?|notes?|examples?)\s*:(.*)';
> -my $doc_content = $doc_com_body . '(.*)';
> -my $doc_block = $doc_com . 'DOC:\s*(.*)?';
> -my $doc_inline_start = '^\s*/\*\*\s*$';
> -my $doc_inline_sect = '\s*\*\s*(@\s*[\w][\w\.]*\s*):(.*)';
> -my $doc_inline_end = '^\s*\*/\s*$';
> -my $doc_inline_oneline = '^\s*/\*\*\s*(@[\w\s]+):\s*(.*)\s*\*/\s*$';
> -my $export_symbol = '^\s*EXPORT_SYMBOL(_GPL)?\s*\(\s*(\w+)\s*\)\s*;';
> -
> -my %parameterdescs;
> -my %parameterdesc_start_lines;
> -my @parameterlist;
> -my %sections;
> -my @sectionlist;
> -my %section_start_lines;
> -my $sectcheck;
> -my $struct_actual;
> -
> -my $contents = "";
> -my $new_start_line = 0;
> -
> -# the canonical section names. see also $doc_sect above.
> -my $section_default = "Description"; # default section
> -my $section_intro = "Introduction";
> -my $section = $section_default;
> -my $section_context = "Context";
> -my $section_return = "Return";
> -
> -my $undescribed = "-- undescribed --";
> -
> -reset_state();
> -
> -while ($ARGV[0] =~ m/^--?(.*)/) {
> - my $cmd = $1;
> - shift @ARGV;
> - if ($cmd eq "man") {
> - $output_mode = "man";
> - @highlights = @highlights_man;
> - $blankline = $blankline_man;
> - } elsif ($cmd eq "rst") {
> - $output_mode = "rst";
> - @highlights = @highlights_rst;
> - $blankline = $blankline_rst;
> - } elsif ($cmd eq "none") {
> - $output_mode = "none";
> - } elsif ($cmd eq "module") { # not needed for XML, inherits from calling document
> - $modulename = shift @ARGV;
> - } elsif ($cmd eq "function") { # to only output specific functions
> - $output_selection = OUTPUT_INCLUDE;
> - $function = shift @ARGV;
> - $function_table{$function} = 1;
> - } elsif ($cmd eq "nosymbol") { # Exclude specific symbols
> - my $symbol = shift @ARGV;
> - $nosymbol_table{$symbol} = 1;
> - } elsif ($cmd eq "export") { # only exported symbols
> - $output_selection = OUTPUT_EXPORTED;
> - %function_table = ();
> - } elsif ($cmd eq "internal") { # only non-exported symbols
> - $output_selection = OUTPUT_INTERNAL;
> - %function_table = ();
> - } elsif ($cmd eq "export-file") {
> - my $file = shift @ARGV;
> - push(@export_file_list, $file);
> - } elsif ($cmd eq "v") {
> - $verbose = 1;
> - } elsif ($cmd eq "Werror") {
> - $Werror = 1;
> - } elsif (($cmd eq "h") || ($cmd eq "help")) {
> - usage();
> - } elsif ($cmd eq 'no-doc-sections') {
> - $no_doc_sections = 1;
> - } elsif ($cmd eq 'enable-lineno') {
> - $enable_lineno = 1;
> - } elsif ($cmd eq 'show-not-found') {
> - $show_not_found = 1; # A no-op but don't fail
> - } elsif ($cmd eq "sphinx-version") {
> - my $ver_string = shift @ARGV;
> - if ($ver_string =~ m/^(\d+)(\.\d+)?(\.\d+)?/) {
> - $sphinx_major = $1;
> - if (defined($2)) {
> - $sphinx_minor = substr($2,1);
> - } else {
> - $sphinx_minor = 0;
> - }
> - if (defined($3)) {
> - $sphinx_patch = substr($3,1)
> - } else {
> - $sphinx_patch = 0;
> - }
> - } else {
> - die "Sphinx version should either major.minor or major.minor.patch format\n";
> - }
> - } else {
> - # Unknown argument
> - usage();
> - }
> -}
> -
> -# continue execution near EOF;
> -
> -# The C domain dialect changed on Sphinx 3. So, we need to check the
> -# version in order to produce the right tags.
> -sub findprog($)
> -{
> - foreach(split(/:/, $ENV{PATH})) {
> - return "$_/$_[0]" if(-x "$_/$_[0]");
> - }
> -}
> -
> -sub get_sphinx_version()
> -{
> - my $ver;
> -
> - my $cmd = "sphinx-build";
> - if (!findprog($cmd)) {
> - my $cmd = "sphinx-build3";
> - if (!findprog($cmd)) {
> - $sphinx_major = 1;
> - $sphinx_minor = 2;
> - $sphinx_patch = 0;
> - printf STDERR "Warning: Sphinx version not found. Using default (Sphinx version %d.%d.%d)\n",
> - $sphinx_major, $sphinx_minor, $sphinx_patch;
> - return;
> - }
> - }
> -
> - open IN, "$cmd --version 2>&1 |";
> - while (<IN>) {
> - if (m/^\s*sphinx-build\s+([\d]+)\.([\d\.]+)(\+\/[\da-f]+)?$/) {
> - $sphinx_major = $1;
> - $sphinx_minor = $2;
> - $sphinx_patch = $3;
> - last;
> - }
> - # Sphinx 1.2.x uses a different format
> - if (m/^\s*Sphinx.*\s+([\d]+)\.([\d\.]+)$/) {
> - $sphinx_major = $1;
> - $sphinx_minor = $2;
> - $sphinx_patch = $3;
> - last;
> - }
> - }
> - close IN;
> -}
> -
> -# get kernel version from env
> -sub get_kernel_version() {
> - my $version = 'unknown kernel version';
> -
> - if (defined($ENV{'KERNELVERSION'})) {
> - $version = $ENV{'KERNELVERSION'};
> - }
> - return $version;
> -}
> -
> -#
> -sub print_lineno {
> - my $lineno = shift;
> - if ($enable_lineno && defined($lineno)) {
> - print "#define LINENO " . $lineno . "\n";
> - }
> -}
> -##
> -# dumps section contents to arrays/hashes intended for that purpose.
> -#
> -sub dump_section {
> - my $file = shift;
> - my $name = shift;
> - my $contents = join "\n", @_;
> -
> - if ($name =~ m/$type_param/) {
> - $name = $1;
> - $parameterdescs{$name} = $contents;
> - $sectcheck = $sectcheck . $name . " ";
> - $parameterdesc_start_lines{$name} = $new_start_line;
> - $new_start_line = 0;
> - } elsif ($name eq "@\.\.\.") {
> - $name = "...";
> - $parameterdescs{$name} = $contents;
> - $sectcheck = $sectcheck . $name . " ";
> - $parameterdesc_start_lines{$name} = $new_start_line;
> - $new_start_line = 0;
> - } else {
> - if (defined($sections{$name}) && ($sections{$name} ne "")) {
> - # Only warn on user specified duplicate section names.
> - if ($name ne $section_default) {
> - print STDERR "${file}:$.: warning: duplicate section name '$name'\n";
> - ++$warnings;
> - }
> - $sections{$name} .= $contents;
> - } else {
> - $sections{$name} = $contents;
> - push @sectionlist, $name;
> - $section_start_lines{$name} = $new_start_line;
> - $new_start_line = 0;
> - }
> - }
> -}
> -
> -##
> -# dump DOC: section after checking that it should go out
> -#
> -sub dump_doc_section {
> - my $file = shift;
> - my $name = shift;
> - my $contents = join "\n", @_;
> -
> - if ($no_doc_sections) {
> - return;
> - }
> -
> - return if (defined($nosymbol_table{$name}));
> -
> - if (($output_selection == OUTPUT_ALL) ||
> - (($output_selection == OUTPUT_INCLUDE) &&
> - defined($function_table{$name})))
> - {
> - dump_section($file, $name, $contents);
> - output_blockhead({'sectionlist' => \@sectionlist,
> - 'sections' => \%sections,
> - 'module' => $modulename,
> - 'content-only' => ($output_selection != OUTPUT_ALL), });
> - }
> -}
> -
> -##
> -# output function
> -#
> -# parameterdescs, a hash.
> -# function => "function name"
> -# parameterlist => @list of parameters
> -# parameterdescs => %parameter descriptions
> -# sectionlist => @list of sections
> -# sections => %section descriptions
> -#
> -
> -sub output_highlight {
> - my $contents = join "\n",@_;
> - my $line;
> -
> -# DEBUG
> -# if (!defined $contents) {
> -# use Carp;
> -# confess "output_highlight got called with no args?\n";
> -# }
> -
> -# print STDERR "contents b4:$contents\n";
> - eval $dohighlight;
> - die $@ if $@;
> -# print STDERR "contents af:$contents\n";
> -
> - foreach $line (split "\n", $contents) {
> - if (! $output_preformatted) {
> - $line =~ s/^\s*//;
> - }
> - if ($line eq ""){
> - if (! $output_preformatted) {
> - print $lineprefix, $blankline;
> - }
> - } else {
> - if ($output_mode eq "man" && substr($line, 0, 1) eq ".") {
> - print "\\&$line";
> - } else {
> - print $lineprefix, $line;
> - }
> - }
> - print "\n";
> - }
> -}
> -
> -##
> -# output function in man
> -sub output_function_man(%) {
> - my %args = %{$_[0]};
> - my ($parameter, $section);
> - my $count;
> -
> - print ".TH \"$args{'function'}\" 9 \"$args{'function'}\" \"$man_date\" \"Kernel Hacker's Manual\" LINUX\n";
> -
> - print ".SH NAME\n";
> - print $args{'function'} . " \\- " . $args{'purpose'} . "\n";
> -
> - print ".SH SYNOPSIS\n";
> - if ($args{'functiontype'} ne "") {
> - print ".B \"" . $args{'functiontype'} . "\" " . $args{'function'} . "\n";
> - } else {
> - print ".B \"" . $args{'function'} . "\n";
> - }
> - $count = 0;
> - my $parenth = "(";
> - my $post = ",";
> - foreach my $parameter (@{$args{'parameterlist'}}) {
> - if ($count == $#{$args{'parameterlist'}}) {
> - $post = ");";
> - }
> - $type = $args{'parametertypes'}{$parameter};
> - if ($type =~ m/([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)/) {
> - # pointer-to-function
> - print ".BI \"" . $parenth . $1 . "\" " . " \") (" . $2 . ")" . $post . "\"\n";
> - } else {
> - $type =~ s/([^\*])$/$1 /;
> - print ".BI \"" . $parenth . $type . "\" " . " \"" . $post . "\"\n";
> - }
> - $count++;
> - $parenth = "";
> - }
> -
> - print ".SH ARGUMENTS\n";
> - foreach $parameter (@{$args{'parameterlist'}}) {
> - my $parameter_name = $parameter;
> - $parameter_name =~ s/\[.*//;
> -
> - print ".IP \"" . $parameter . "\" 12\n";
> - output_highlight($args{'parameterdescs'}{$parameter_name});
> - }
> - foreach $section (@{$args{'sectionlist'}}) {
> - print ".SH \"", uc $section, "\"\n";
> - output_highlight($args{'sections'}{$section});
> - }
> -}
> -
> -##
> -# output enum in man
> -sub output_enum_man(%) {
> - my %args = %{$_[0]};
> - my ($parameter, $section);
> - my $count;
> -
> - print ".TH \"$args{'module'}\" 9 \"enum $args{'enum'}\" \"$man_date\" \"API Manual\" LINUX\n";
> -
> - print ".SH NAME\n";
> - print "enum " . $args{'enum'} . " \\- " . $args{'purpose'} . "\n";
> -
> - print ".SH SYNOPSIS\n";
> - print "enum " . $args{'enum'} . " {\n";
> - $count = 0;
> - foreach my $parameter (@{$args{'parameterlist'}}) {
> - print ".br\n.BI \" $parameter\"\n";
> - if ($count == $#{$args{'parameterlist'}}) {
> - print "\n};\n";
> - last;
> - }
> - else {
> - print ", \n.br\n";
> - }
> - $count++;
> - }
> -
> - print ".SH Constants\n";
> - foreach $parameter (@{$args{'parameterlist'}}) {
> - my $parameter_name = $parameter;
> - $parameter_name =~ s/\[.*//;
> -
> - print ".IP \"" . $parameter . "\" 12\n";
> - output_highlight($args{'parameterdescs'}{$parameter_name});
> - }
> - foreach $section (@{$args{'sectionlist'}}) {
> - print ".SH \"$section\"\n";
> - output_highlight($args{'sections'}{$section});
> - }
> -}
> -
> -##
> -# output struct in man
> -sub output_struct_man(%) {
> - my %args = %{$_[0]};
> - my ($parameter, $section);
> -
> - print ".TH \"$args{'module'}\" 9 \"" . $args{'type'} . " " . $args{'struct'} . "\" \"$man_date\" \"API Manual\" LINUX\n";
> -
> - print ".SH NAME\n";
> - print $args{'type'} . " " . $args{'struct'} . " \\- " . $args{'purpose'} . "\n";
> -
> - my $declaration = $args{'definition'};
> - $declaration =~ s/\t/ /g;
> - $declaration =~ s/\n/"\n.br\n.BI \"/g;
> - print ".SH SYNOPSIS\n";
> - print $args{'type'} . " " . $args{'struct'} . " {\n.br\n";
> - print ".BI \"$declaration\n};\n.br\n\n";
> -
> - print ".SH Members\n";
> - foreach $parameter (@{$args{'parameterlist'}}) {
> - ($parameter =~ /^#/) && next;
> -
> - my $parameter_name = $parameter;
> - $parameter_name =~ s/\[.*//;
> -
> - ($args{'parameterdescs'}{$parameter_name} ne $undescribed) || next;
> - print ".IP \"" . $parameter . "\" 12\n";
> - output_highlight($args{'parameterdescs'}{$parameter_name});
> - }
> - foreach $section (@{$args{'sectionlist'}}) {
> - print ".SH \"$section\"\n";
> - output_highlight($args{'sections'}{$section});
> - }
> -}
> -
> -##
> -# output typedef in man
> -sub output_typedef_man(%) {
> - my %args = %{$_[0]};
> - my ($parameter, $section);
> -
> - print ".TH \"$args{'module'}\" 9 \"$args{'typedef'}\" \"$man_date\" \"API Manual\" LINUX\n";
> -
> - print ".SH NAME\n";
> - print "typedef " . $args{'typedef'} . " \\- " . $args{'purpose'} . "\n";
> -
> - foreach $section (@{$args{'sectionlist'}}) {
> - print ".SH \"$section\"\n";
> - output_highlight($args{'sections'}{$section});
> - }
> -}
> -
> -sub output_blockhead_man(%) {
> - my %args = %{$_[0]};
> - my ($parameter, $section);
> - my $count;
> -
> - print ".TH \"$args{'module'}\" 9 \"$args{'module'}\" \"$man_date\" \"API Manual\" LINUX\n";
> -
> - foreach $section (@{$args{'sectionlist'}}) {
> - print ".SH \"$section\"\n";
> - output_highlight($args{'sections'}{$section});
> - }
> -}
> -
> -##
> -# output in restructured text
> -#
> -
> -#
> -# This could use some work; it's used to output the DOC: sections, and
> -# starts by putting out the name of the doc section itself, but that tends
> -# to duplicate a header already in the template file.
> -#
> -sub output_blockhead_rst(%) {
> - my %args = %{$_[0]};
> - my ($parameter, $section);
> -
> - foreach $section (@{$args{'sectionlist'}}) {
> - next if (defined($nosymbol_table{$section}));
> -
> - if ($output_selection != OUTPUT_INCLUDE) {
> - print "**$section**\n\n";
> - }
> - print_lineno($section_start_lines{$section});
> - output_highlight_rst($args{'sections'}{$section});
> - print "\n";
> - }
> -}
> -
> -#
> -# Apply the RST highlights to a sub-block of text.
> -#
> -sub highlight_block($) {
> - # The dohighlight kludge requires the text be called $contents
> - my $contents = shift;
> - eval $dohighlight;
> - die $@ if $@;
> - return $contents;
> -}
> -
> -#
> -# Regexes used only here.
> -#
> -my $sphinx_literal = '^[^.].*::$';
> -my $sphinx_cblock = '^\.\.\ +code-block::';
> -
> -sub output_highlight_rst {
> - my $input = join "\n",@_;
> - my $output = "";
> - my $line;
> - my $in_literal = 0;
> - my $litprefix;
> - my $block = "";
> -
> - foreach $line (split "\n",$input) {
> - #
> - # If we're in a literal block, see if we should drop out
> - # of it. Otherwise pass the line straight through unmunged.
> - #
> - if ($in_literal) {
> - if (! ($line =~ /^\s*$/)) {
> - #
> - # If this is the first non-blank line in a literal
> - # block we need to figure out what the proper indent is.
> - #
> - if ($litprefix eq "") {
> - $line =~ /^(\s*)/;
> - $litprefix = '^' . $1;
> - $output .= $line . "\n";
> - } elsif (! ($line =~ /$litprefix/)) {
> - $in_literal = 0;
> - } else {
> - $output .= $line . "\n";
> - }
> - } else {
> - $output .= $line . "\n";
> - }
> - }
> - #
> - # Not in a literal block (or just dropped out)
> - #
> - if (! $in_literal) {
> - $block .= $line . "\n";
> - if (($line =~ /$sphinx_literal/) || ($line =~ /$sphinx_cblock/)) {
> - $in_literal = 1;
> - $litprefix = "";
> - $output .= highlight_block($block);
> - $block = ""
> - }
> - }
> - }
> -
> - if ($block) {
> - $output .= highlight_block($block);
> - }
> - foreach $line (split "\n", $output) {
> - print $lineprefix . $line . "\n";
> - }
> -}
> -
> -sub output_function_rst(%) {
> - my %args = %{$_[0]};
> - my ($parameter, $section);
> - my $oldprefix = $lineprefix;
> - my $start = "";
> - my $is_macro = 0;
> -
> - if ($sphinx_major < 3) {
> - if ($args{'typedef'}) {
> - print ".. c:type:: ". $args{'function'} . "\n\n";
> - print_lineno($declaration_start_line);
> - print " **Typedef**: ";
> - $lineprefix = "";
> - output_highlight_rst($args{'purpose'});
> - $start = "\n\n**Syntax**\n\n ``";
> - $is_macro = 1;
> - } else {
> - print ".. c:function:: ";
> - }
> - } else {
> - if ($args{'typedef'} || $args{'functiontype'} eq "") {
> - $is_macro = 1;
> - print ".. c:macro:: ". $args{'function'} . "\n\n";
> - } else {
> - print ".. c:function:: ";
> - }
> -
> - if ($args{'typedef'}) {
> - print_lineno($declaration_start_line);
> - print " **Typedef**: ";
> - $lineprefix = "";
> - output_highlight_rst($args{'purpose'});
> - $start = "\n\n**Syntax**\n\n ``";
> - } else {
> - print "``" if ($is_macro);
> - }
> - }
> - if ($args{'functiontype'} ne "") {
> - $start .= $args{'functiontype'} . " " . $args{'function'} . " (";
> - } else {
> - $start .= $args{'function'} . " (";
> - }
> - print $start;
> -
> - my $count = 0;
> - foreach my $parameter (@{$args{'parameterlist'}}) {
> - if ($count ne 0) {
> - print ", ";
> - }
> - $count++;
> - $type = $args{'parametertypes'}{$parameter};
> -
> - if ($type =~ m/([^\(]*\(\*)\s*\)\s*\(([^\)]*)\)/) {
> - # pointer-to-function
> - print $1 . $parameter . ") (" . $2 . ")";
> - } else {
> - print $type;
> - }
> - }
> - if ($is_macro) {
> - print ")``\n\n";
> - } else {
> - print ")\n\n";
> - }
> - if (!$args{'typedef'}) {
> - print_lineno($declaration_start_line);
> - $lineprefix = " ";
> - output_highlight_rst($args{'purpose'});
> - print "\n";
> - }
> -
> - print "**Parameters**\n\n";
> - $lineprefix = " ";
> - foreach $parameter (@{$args{'parameterlist'}}) {
> - my $parameter_name = $parameter;
> - $parameter_name =~ s/\[.*//;
> - $type = $args{'parametertypes'}{$parameter};
> -
> - if ($type ne "") {
> - print "``$type``\n";
> - } else {
> - print "``$parameter``\n";
> - }
> -
> - print_lineno($parameterdesc_start_lines{$parameter_name});
> -
> - if (defined($args{'parameterdescs'}{$parameter_name}) &&
> - $args{'parameterdescs'}{$parameter_name} ne $undescribed) {
> - output_highlight_rst($args{'parameterdescs'}{$parameter_name});
> - } else {
> - print " *undescribed*\n";
> - }
> - print "\n";
> - }
> -
> - $lineprefix = $oldprefix;
> - output_section_rst(@_);
> -}
> -
> -sub output_section_rst(%) {
> - my %args = %{$_[0]};
> - my $section;
> - my $oldprefix = $lineprefix;
> - $lineprefix = "";
> -
> - foreach $section (@{$args{'sectionlist'}}) {
> - print "**$section**\n\n";
> - print_lineno($section_start_lines{$section});
> - output_highlight_rst($args{'sections'}{$section});
> - print "\n";
> - }
> - print "\n";
> - $lineprefix = $oldprefix;
> -}
> -
> -sub output_enum_rst(%) {
> - my %args = %{$_[0]};
> - my ($parameter);
> - my $oldprefix = $lineprefix;
> - my $count;
> -
> - if ($sphinx_major < 3) {
> - my $name = "enum " . $args{'enum'};
> - print "\n\n.. c:type:: " . $name . "\n\n";
> - } else {
> - my $name = $args{'enum'};
> - print "\n\n.. c:enum:: " . $name . "\n\n";
> - }
> - print_lineno($declaration_start_line);
> - $lineprefix = " ";
> - output_highlight_rst($args{'purpose'});
> - print "\n";
> -
> - print "**Constants**\n\n";
> - $lineprefix = " ";
> - foreach $parameter (@{$args{'parameterlist'}}) {
> - print "``$parameter``\n";
> - if ($args{'parameterdescs'}{$parameter} ne $undescribed) {
> - output_highlight_rst($args{'parameterdescs'}{$parameter});
> - } else {
> - print " *undescribed*\n";
> - }
> - print "\n";
> - }
> -
> - $lineprefix = $oldprefix;
> - output_section_rst(@_);
> -}
> -
> -sub output_typedef_rst(%) {
> - my %args = %{$_[0]};
> - my ($parameter);
> - my $oldprefix = $lineprefix;
> - my $name;
> -
> - if ($sphinx_major < 3) {
> - $name = "typedef " . $args{'typedef'};
> - } else {
> - $name = $args{'typedef'};
> - }
> - print "\n\n.. c:type:: " . $name . "\n\n";
> - print_lineno($declaration_start_line);
> - $lineprefix = " ";
> - output_highlight_rst($args{'purpose'});
> - print "\n";
> -
> - $lineprefix = $oldprefix;
> - output_section_rst(@_);
> -}
> -
> -sub output_struct_rst(%) {
> - my %args = %{$_[0]};
> - my ($parameter);
> - my $oldprefix = $lineprefix;
> -
> - if ($sphinx_major < 3) {
> - my $name = $args{'type'} . " " . $args{'struct'};
> - print "\n\n.. c:type:: " . $name . "\n\n";
> - } else {
> - my $name = $args{'struct'};
> - if ($args{'type'} eq 'union') {
> - print "\n\n.. c:union:: " . $name . "\n\n";
> - } else {
> - print "\n\n.. c:struct:: " . $name . "\n\n";
> - }
> - }
> - print_lineno($declaration_start_line);
> - $lineprefix = " ";
> - output_highlight_rst($args{'purpose'});
> - print "\n";
> -
> - print "**Definition**\n\n";
> - print "::\n\n";
> - my $declaration = $args{'definition'};
> - $declaration =~ s/\t/ /g;
> - print " " . $args{'type'} . " " . $args{'struct'} . " {\n$declaration };\n\n";
> -
> - print "**Members**\n\n";
> - $lineprefix = " ";
> - foreach $parameter (@{$args{'parameterlist'}}) {
> - ($parameter =~ /^#/) && next;
> -
> - my $parameter_name = $parameter;
> - $parameter_name =~ s/\[.*//;
> -
> - ($args{'parameterdescs'}{$parameter_name} ne $undescribed) || next;
> - $type = $args{'parametertypes'}{$parameter};
> - print_lineno($parameterdesc_start_lines{$parameter_name});
> - print "``" . $parameter . "``\n";
> - output_highlight_rst($args{'parameterdescs'}{$parameter_name});
> - print "\n";
> - }
> - print "\n";
> -
> - $lineprefix = $oldprefix;
> - output_section_rst(@_);
> -}
> -
> -## none mode output functions
> -
> -sub output_function_none(%) {
> -}
> -
> -sub output_enum_none(%) {
> -}
> -
> -sub output_typedef_none(%) {
> -}
> -
> -sub output_struct_none(%) {
> -}
> -
> -sub output_blockhead_none(%) {
> -}
> -
> -##
> -# generic output function for all types (function, struct/union, typedef, enum);
> -# calls the generated, variable output_ function name based on
> -# functype and output_mode
> -sub output_declaration {
> - no strict 'refs';
> - my $name = shift;
> - my $functype = shift;
> - my $func = "output_${functype}_$output_mode";
> -
> - return if (defined($nosymbol_table{$name}));
> -
> - if (($output_selection == OUTPUT_ALL) ||
> - (($output_selection == OUTPUT_INCLUDE ||
> - $output_selection == OUTPUT_EXPORTED) &&
> - defined($function_table{$name})) ||
> - ($output_selection == OUTPUT_INTERNAL &&
> - !($functype eq "function" && defined($function_table{$name}))))
> - {
> - &$func(@_);
> - $section_counter++;
> - }
> -}
> -
> -##
> -# generic output function - calls the right one based on current output mode.
> -sub output_blockhead {
> - no strict 'refs';
> - my $func = "output_blockhead_" . $output_mode;
> - &$func(@_);
> - $section_counter++;
> -}
> -
> -##
> -# takes a declaration (struct, union, enum, typedef) and
> -# invokes the right handler. NOT called for functions.
> -sub dump_declaration($$) {
> - no strict 'refs';
> - my ($prototype, $file) = @_;
> - my $func = "dump_" . $decl_type;
> - &$func(@_);
> -}
> -
> -sub dump_union($$) {
> - dump_struct(@_);
> -}
> -
> -sub dump_struct($$) {
> - my $x = shift;
> - my $file = shift;
> -
> - if ($x =~ /(struct|union)\s+(\w+)\s*\{(.*)\}(\s*(__packed|__aligned|____cacheline_aligned_in_smp|____cacheline_aligned|__attribute__\s*\(\([a-z0-9,_\s\(\)]*\)\)))*/) {
> - my $decl_type = $1;
> - $declaration_name = $2;
> - my $members = $3;
> -
> - # ignore members marked private:
> - $members =~ s/\/\*\s*private:.*?\/\*\s*public:.*?\*\///gosi;
> - $members =~ s/\/\*\s*private:.*//gosi;
> - # strip comments:
> - $members =~ s/\/\*.*?\*\///gos;
> - # strip attributes
> - $members =~ s/\s*__attribute__\s*\(\([a-z0-9,_\*\s\(\)]*\)\)/ /gi;
> - $members =~ s/\s*__aligned\s*\([^;]*\)/ /gos;
> - $members =~ s/\s*__packed\s*/ /gos;
> - $members =~ s/\s*CRYPTO_MINALIGN_ATTR/ /gos;
> - $members =~ s/\s*____cacheline_aligned_in_smp/ /gos;
> - $members =~ s/\s*____cacheline_aligned/ /gos;
> -
> - # replace DECLARE_BITMAP
> - $members =~ s/__ETHTOOL_DECLARE_LINK_MODE_MASK\s*\(([^\)]+)\)/DECLARE_BITMAP($1, __ETHTOOL_LINK_MODE_MASK_NBITS)/gos;
> - $members =~ s/DECLARE_BITMAP\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[BITS_TO_LONGS($2)\]/gos;
> - # replace DECLARE_HASHTABLE
> - $members =~ s/DECLARE_HASHTABLE\s*\(([^,)]+),\s*([^,)]+)\)/unsigned long $1\[1 << (($2) - 1)\]/gos;
> - # replace DECLARE_KFIFO
> - $members =~ s/DECLARE_KFIFO\s*\(([^,)]+),\s*([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
> - # replace DECLARE_KFIFO_PTR
> - $members =~ s/DECLARE_KFIFO_PTR\s*\(([^,)]+),\s*([^,)]+)\)/$2 \*$1/gos;
> -
> - my $declaration = $members;
> -
> - # Split nested struct/union elements as newer ones
> - while ($members =~ m/(struct|union)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;/) {
> - my $newmember;
> - my $maintype = $1;
> - my $ids = $4;
> - my $content = $3;
> - foreach my $id(split /,/, $ids) {
> - $newmember .= "$maintype $id; ";
> -
> - $id =~ s/[:\[].*//;
> - $id =~ s/^\s*\**(\S+)\s*/$1/;
> - foreach my $arg (split /;/, $content) {
> - next if ($arg =~ m/^\s*$/);
> - if ($arg =~ m/^([^\(]+\(\*?\s*)([\w\.]*)(\s*\).*)/) {
> - # pointer-to-function
> - my $type = $1;
> - my $name = $2;
> - my $extra = $3;
> - next if (!$name);
> - if ($id =~ m/^\s*$/) {
> - # anonymous struct/union
> - $newmember .= "$type$name$extra; ";
> - } else {
> - $newmember .= "$type$id.$name$extra; ";
> - }
> - } else {
> - my $type;
> - my $names;
> - $arg =~ s/^\s+//;
> - $arg =~ s/\s+$//;
> - # Handle bitmaps
> - $arg =~ s/:\s*\d+\s*//g;
> - # Handle arrays
> - $arg =~ s/\[.*\]//g;
> - # The type may have multiple words,
> - # and multiple IDs can be defined, like:
> - # const struct foo, *bar, foobar
> - # So, we remove spaces when parsing the
> - # names, in order to match just names
> - # and commas for the names
> - $arg =~ s/\s*,\s*/,/g;
> - if ($arg =~ m/(.*)\s+([\S+,]+)/) {
> - $type = $1;
> - $names = $2;
> - } else {
> - $newmember .= "$arg; ";
> - next;
> - }
> - foreach my $name (split /,/, $names) {
> - $name =~ s/^\s*\**(\S+)\s*/$1/;
> - next if (($name =~ m/^\s*$/));
> - if ($id =~ m/^\s*$/) {
> - # anonymous struct/union
> - $newmember .= "$type $name; ";
> - } else {
> - $newmember .= "$type $id.$name; ";
> - }
> - }
> - }
> - }
> - }
> - $members =~ s/(struct|union)([^\{\};]+)\{([^\{\}]*)\}([^\{\}\;]*)\;/$newmember/;
> - }
> -
> - # Ignore other nested elements, like enums
> - $members =~ s/(\{[^\{\}]*\})//g;
> -
> - create_parameterlist($members, ';', $file, $declaration_name);
> - check_sections($file, $declaration_name, $decl_type, $sectcheck, $struct_actual);
> -
> - # Adjust declaration for better display
> - $declaration =~ s/([\{;])/$1\n/g;
> - $declaration =~ s/\}\s+;/};/g;
> - # Better handle inlined enums
> - do {} while ($declaration =~ s/(enum\s+\{[^\}]+),([^\n])/$1,\n$2/);
> -
> - my @def_args = split /\n/, $declaration;
> - my $level = 1;
> - $declaration = "";
> - foreach my $clause (@def_args) {
> - $clause =~ s/^\s+//;
> - $clause =~ s/\s+$//;
> - $clause =~ s/\s+/ /;
> - next if (!$clause);
> - $level-- if ($clause =~ m/(\})/ && $level > 1);
> - if (!($clause =~ m/^\s*#/)) {
> - $declaration .= "\t" x $level;
> - }
> - $declaration .= "\t" . $clause . "\n";
> - $level++ if ($clause =~ m/(\{)/ && !($clause =~m/\}/));
> - }
> - output_declaration($declaration_name,
> - 'struct',
> - {'struct' => $declaration_name,
> - 'module' => $modulename,
> - 'definition' => $declaration,
> - 'parameterlist' => \@parameterlist,
> - 'parameterdescs' => \%parameterdescs,
> - 'parametertypes' => \%parametertypes,
> - 'sectionlist' => \@sectionlist,
> - 'sections' => \%sections,
> - 'purpose' => $declaration_purpose,
> - 'type' => $decl_type
> - });
> - }
> - else {
> - print STDERR "${file}:$.: error: Cannot parse struct or union!\n";
> - ++$errors;
> - }
> -}
> -
> -
> -sub show_warnings($$) {
> - my $functype = shift;
> - my $name = shift;
> -
> - return 0 if (defined($nosymbol_table{$name}));
> -
> - return 1 if ($output_selection == OUTPUT_ALL);
> -
> - if ($output_selection == OUTPUT_EXPORTED) {
> - if (defined($function_table{$name})) {
> - return 1;
> - } else {
> - return 0;
> - }
> - }
> - if ($output_selection == OUTPUT_INTERNAL) {
> - if (!($functype eq "function" && defined($function_table{$name}))) {
> - return 1;
> - } else {
> - return 0;
> - }
> - }
> - if ($output_selection == OUTPUT_INCLUDE) {
> - if (defined($function_table{$name})) {
> - return 1;
> - } else {
> - return 0;
> - }
> - }
> - die("Please add the new output type at show_warnings()");
> -}
> -
> -sub dump_enum($$) {
> - my $x = shift;
> - my $file = shift;
> - my $members;
> -
> -
> - $x =~ s@/\*.*?\*/@@gos; # strip comments.
> - # strip #define macros inside enums
> - $x =~ s@#\s*((define|ifdef)\s+|endif)[^;]*;@@gos;
> -
> - if ($x =~ /typedef\s+enum\s*\{(.*)\}\s*(\w*)\s*;/) {
> - $declaration_name = $2;
> - $members = $1;
> - } elsif ($x =~ /enum\s+(\w*)\s*\{(.*)\}/) {
> - $declaration_name = $1;
> - $members = $2;
> - }
> -
> - if ($declaration_name) {
> - my %_members;
> -
> - $members =~ s/\s+$//;
> -
> - foreach my $arg (split ',', $members) {
> - $arg =~ s/^\s*(\w+).*/$1/;
> - push @parameterlist, $arg;
> - if (!$parameterdescs{$arg}) {
> - $parameterdescs{$arg} = $undescribed;
> - if (show_warnings("enum", $declaration_name)) {
> - print STDERR "${file}:$.: warning: Enum value '$arg' not described in enum '$declaration_name'\n";
> - }
> - }
> - $_members{$arg} = 1;
> - }
> -
> - while (my ($k, $v) = each %parameterdescs) {
> - if (!exists($_members{$k})) {
> - if (show_warnings("enum", $declaration_name)) {
> - print STDERR "${file}:$.: warning: Excess enum value '$k' description in '$declaration_name'\n";
> - }
> - }
> - }
> -
> - output_declaration($declaration_name,
> - 'enum',
> - {'enum' => $declaration_name,
> - 'module' => $modulename,
> - 'parameterlist' => \@parameterlist,
> - 'parameterdescs' => \%parameterdescs,
> - 'sectionlist' => \@sectionlist,
> - 'sections' => \%sections,
> - 'purpose' => $declaration_purpose
> - });
> - } else {
> - print STDERR "${file}:$.: error: Cannot parse enum!\n";
> - ++$errors;
> - }
> -}
> -
> -my $typedef_type = qr { ((?:\s+[\w\*]+){1,8})\s* }x;
> -my $typedef_ident = qr { \*?\s*(\w\S+)\s* }x;
> -my $typedef_args = qr { \s*\((.*)\); }x;
> -
> -my $typedef1 = qr { typedef$typedef_type\($typedef_ident\)$typedef_args }x;
> -my $typedef2 = qr { typedef$typedef_type$typedef_ident$typedef_args }x;
> -
> -sub dump_typedef($$) {
> - my $x = shift;
> - my $file = shift;
> -
> - $x =~ s@/\*.*?\*/@@gos; # strip comments.
> -
> - # Parse function typedef prototypes
> - if ($x =~ $typedef1 || $x =~ $typedef2) {
> - $return_type = $1;
> - $declaration_name = $2;
> - my $args = $3;
> - $return_type =~ s/^\s+//;
> -
> - create_parameterlist($args, ',', $file, $declaration_name);
> -
> - output_declaration($declaration_name,
> - 'function',
> - {'function' => $declaration_name,
> - 'typedef' => 1,
> - 'module' => $modulename,
> - 'functiontype' => $return_type,
> - 'parameterlist' => \@parameterlist,
> - 'parameterdescs' => \%parameterdescs,
> - 'parametertypes' => \%parametertypes,
> - 'sectionlist' => \@sectionlist,
> - 'sections' => \%sections,
> - 'purpose' => $declaration_purpose
> - });
> - return;
> - }
> -
> - while (($x =~ /\(*.\)\s*;$/) || ($x =~ /\[*.\]\s*;$/)) {
> - $x =~ s/\(*.\)\s*;$/;/;
> - $x =~ s/\[*.\]\s*;$/;/;
> - }
> -
> - if ($x =~ /typedef.*\s+(\w+)\s*;/) {
> - $declaration_name = $1;
> -
> - output_declaration($declaration_name,
> - 'typedef',
> - {'typedef' => $declaration_name,
> - 'module' => $modulename,
> - 'sectionlist' => \@sectionlist,
> - 'sections' => \%sections,
> - 'purpose' => $declaration_purpose
> - });
> - }
> - else {
> - print STDERR "${file}:$.: error: Cannot parse typedef!\n";
> - ++$errors;
> - }
> -}
> -
> -sub save_struct_actual($) {
> - my $actual = shift;
> -
> - # strip all spaces from the actual param so that it looks like one string item
> - $actual =~ s/\s*//g;
> - $struct_actual = $struct_actual . $actual . " ";
> -}
> -
> -sub create_parameterlist($$$$) {
> - my $args = shift;
> - my $splitter = shift;
> - my $file = shift;
> - my $declaration_name = shift;
> - my $type;
> - my $param;
> -
> - # temporarily replace commas inside function pointer definition
> - while ($args =~ /(\([^\),]+),/) {
> - $args =~ s/(\([^\),]+),/$1#/g;
> - }
> -
> - foreach my $arg (split($splitter, $args)) {
> - # strip comments
> - $arg =~ s/\/\*.*\*\///;
> - # strip leading/trailing spaces
> - $arg =~ s/^\s*//;
> - $arg =~ s/\s*$//;
> - $arg =~ s/\s+/ /;
> -
> - if ($arg =~ /^#/) {
> - # Treat preprocessor directive as a typeless variable just to fill
> - # corresponding data structures "correctly". Catch it later in
> - # output_* subs.
> - push_parameter($arg, "", "", $file);
> - } elsif ($arg =~ m/\(.+\)\s*\(/) {
> - # pointer-to-function
> - $arg =~ tr/#/,/;
> - $arg =~ m/[^\(]+\(\*?\s*([\w\.]*)\s*\)/;
> - $param = $1;
> - $type = $arg;
> - $type =~ s/([^\(]+\(\*?)\s*$param/$1/;
> - save_struct_actual($param);
> - push_parameter($param, $type, $arg, $file, $declaration_name);
> - } elsif ($arg) {
> - $arg =~ s/\s*:\s*/:/g;
> - $arg =~ s/\s*\[/\[/g;
> -
> - my @args = split('\s*,\s*', $arg);
> - if ($args[0] =~ m/\*/) {
> - $args[0] =~ s/(\*+)\s*/ $1/;
> - }
> -
> - my @first_arg;
> - if ($args[0] =~ /^(.*\s+)(.*?\[.*\].*)$/) {
> - shift @args;
> - push(@first_arg, split('\s+', $1));
> - push(@first_arg, $2);
> - } else {
> - @first_arg = split('\s+', shift @args);
> - }
> -
> - unshift(@args, pop @first_arg);
> - $type = join " ", @first_arg;
> -
> - foreach $param (@args) {
> - if ($param =~ m/^(\*+)\s*(.*)/) {
> - save_struct_actual($2);
> -
> - push_parameter($2, "$type $1", $arg, $file, $declaration_name);
> - }
> - elsif ($param =~ m/(.*?):(\d+)/) {
> - if ($type ne "") { # skip unnamed bit-fields
> - save_struct_actual($1);
> - push_parameter($1, "$type:$2", $arg, $file, $declaration_name)
> - }
> - }
> - else {
> - save_struct_actual($param);
> - push_parameter($param, $type, $arg, $file, $declaration_name);
> - }
> - }
> - }
> - }
> -}
> -
> -sub push_parameter($$$$$) {
> - my $param = shift;
> - my $type = shift;
> - my $org_arg = shift;
> - my $file = shift;
> - my $declaration_name = shift;
> -
> - if (($anon_struct_union == 1) && ($type eq "") &&
> - ($param eq "}")) {
> - return; # ignore the ending }; from anon. struct/union
> - }
> -
> - $anon_struct_union = 0;
> - $param =~ s/[\[\)].*//;
> -
> - if ($type eq "" && $param =~ /\.\.\.$/)
> - {
> - if (!$param =~ /\w\.\.\.$/) {
> - # handles unnamed variable parameters
> - $param = "...";
> - }
> - elsif ($param =~ /\w\.\.\.$/) {
> - # for named variable parameters of the form `x...`, remove the dots
> - $param =~ s/\.\.\.$//;
> - }
> - if (!defined $parameterdescs{$param} || $parameterdescs{$param} eq "") {
> - $parameterdescs{$param} = "variable arguments";
> - }
> - }
> - elsif ($type eq "" && ($param eq "" or $param eq "void"))
> - {
> - $param="void";
> - $parameterdescs{void} = "no arguments";
> - }
> - elsif ($type eq "" && ($param eq "struct" or $param eq "union"))
> - # handle unnamed (anonymous) union or struct:
> - {
> - $type = $param;
> - $param = "{unnamed_" . $param . "}";
> - $parameterdescs{$param} = "anonymous\n";
> - $anon_struct_union = 1;
> - }
> -
> - # warn if parameter has no description
> - # (but ignore ones starting with # as these are not parameters
> - # but inline preprocessor statements);
> - # Note: It will also ignore void params and unnamed structs/unions
> - if (!defined $parameterdescs{$param} && $param !~ /^#/) {
> - $parameterdescs{$param} = $undescribed;
> -
> - if (show_warnings($type, $declaration_name) && $param !~ /\./) {
> - print STDERR
> - "${file}:$.: warning: Function parameter or member '$param' not described in '$declaration_name'\n";
> - ++$warnings;
> - }
> - }
> -
> - # strip spaces from $param so that it is one continuous string
> - # on @parameterlist;
> - # this fixes a problem where check_sections() cannot find
> - # a parameter like "addr[6 + 2]" because it actually appears
> - # as "addr[6", "+", "2]" on the parameter list;
> - # but it's better to maintain the param string unchanged for output,
> - # so just weaken the string compare in check_sections() to ignore
> - # "[blah" in a parameter string;
> - ###$param =~ s/\s*//g;
> - push @parameterlist, $param;
> - $org_arg =~ s/\s\s+/ /g;
> - $parametertypes{$param} = $org_arg;
> -}
> -
> -sub check_sections($$$$$) {
> - my ($file, $decl_name, $decl_type, $sectcheck, $prmscheck) = @_;
> - my @sects = split ' ', $sectcheck;
> - my @prms = split ' ', $prmscheck;
> - my $err;
> - my ($px, $sx);
> - my $prm_clean; # strip trailing "[array size]" and/or beginning "*"
> -
> - foreach $sx (0 .. $#sects) {
> - $err = 1;
> - foreach $px (0 .. $#prms) {
> - $prm_clean = $prms[$px];
> - $prm_clean =~ s/\[.*\]//;
> - $prm_clean =~ s/__attribute__\s*\(\([a-z,_\*\s\(\)]*\)\)//i;
> - # ignore array size in a parameter string;
> - # however, the original param string may contain
> - # spaces, e.g.: addr[6 + 2]
> - # and this appears in @prms as "addr[6" since the
> - # parameter list is split at spaces;
> - # hence just ignore "[..." for the sections check;
> - $prm_clean =~ s/\[.*//;
> -
> - ##$prm_clean =~ s/^\**//;
> - if ($prm_clean eq $sects[$sx]) {
> - $err = 0;
> - last;
> - }
> - }
> - if ($err) {
> - if ($decl_type eq "function") {
> - print STDERR "${file}:$.: warning: " .
> - "Excess function parameter " .
> - "'$sects[$sx]' " .
> - "description in '$decl_name'\n";
> - ++$warnings;
> - }
> - }
> - }
> -}
> -
> -##
> -# Checks the section describing the return value of a function.
> -sub check_return_section {
> - my $file = shift;
> - my $declaration_name = shift;
> - my $return_type = shift;
> -
> - # Ignore an empty return type (It's a macro)
> - # Ignore functions with a "void" return type. (But don't ignore "void *")
> - if (($return_type eq "") || ($return_type =~ /void\s*\w*\s*$/)) {
> - return;
> - }
> -
> - if (!defined($sections{$section_return}) ||
> - $sections{$section_return} eq "") {
> - print STDERR "${file}:$.: warning: " .
> - "No description found for return value of " .
> - "'$declaration_name'\n";
> - ++$warnings;
> - }
> -}
> -
> -##
> -# takes a function prototype and the name of the current file being
> -# processed and spits out all the details stored in the global
> -# arrays/hashes.
> -sub dump_function($$) {
> - my $prototype = shift;
> - my $file = shift;
> - my $noret = 0;
> -
> - print_lineno($new_start_line);
> -
> - $prototype =~ s/^static +//;
> - $prototype =~ s/^extern +//;
> - $prototype =~ s/^asmlinkage +//;
> - $prototype =~ s/^inline +//;
> - $prototype =~ s/^__inline__ +//;
> - $prototype =~ s/^__inline +//;
> - $prototype =~ s/^__always_inline +//;
> - $prototype =~ s/^noinline +//;
> - $prototype =~ s/__init +//;
> - $prototype =~ s/__init_or_module +//;
> - $prototype =~ s/__meminit +//;
> - $prototype =~ s/__must_check +//;
> - $prototype =~ s/__weak +//;
> - $prototype =~ s/__sched +//;
> - $prototype =~ s/__printf\s*\(\s*\d*\s*,\s*\d*\s*\) +//;
> - my $define = $prototype =~ s/^#\s*define\s+//; #ak added
> - $prototype =~ s/__attribute__\s*\(\(
> - (?:
> - [\w\s]++ # attribute name
> - (?:\([^)]*+\))? # attribute arguments
> - \s*+,? # optional comma at the end
> - )+
> - \)\)\s+//x;
> -
> - # Strip QEMU specific compiler annotations
> - $prototype =~ s/QEMU_[A-Z_]+ +//;
> -
> - # Yes, this truly is vile. We are looking for:
> - # 1. Return type (may be nothing if we're looking at a macro)
> - # 2. Function name
> - # 3. Function parameters.
> - #
> - # All the while we have to watch out for function pointer parameters
> - # (which IIRC is what the two sections are for), C types (these
> - # regexps don't even start to express all the possibilities), and
> - # so on.
> - #
> - # If you mess with these regexps, it's a good idea to check that
> - # the following functions' documentation still comes out right:
> - # - parport_register_device (function pointer parameters)
> - # - atomic_set (macro)
> - # - pci_match_device, __copy_to_user (long return type)
> -
> - if ($define && $prototype =~ m/^()([a-zA-Z0-9_~:]+)\s+/) {
> - # This is an object-like macro, it has no return type and no parameter
> - # list.
> - # Function-like macros are not allowed to have spaces between
> - # declaration_name and opening parenthesis (notice the \s+).
> - $return_type = $1;
> - $declaration_name = $2;
> - $noret = 1;
> - } elsif ($prototype =~ m/^()([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
> - $prototype =~ m/^(\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
> - $prototype =~ m/^(\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\(]*)\)/ ||
> - $prototype =~ m/^()([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+)\s+([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s+\w+\s+\w+\s*\*+)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/ ||
> - $prototype =~ m/^(\w+\s+\w+\s*\*+\s*\w+\s*\*+\s*)\s*([a-zA-Z0-9_~:]+)\s*\(([^\{]*)\)/) {
> - $return_type = $1;
> - $declaration_name = $2;
> - my $args = $3;
> -
> - create_parameterlist($args, ',', $file, $declaration_name);
> - } else {
> - print STDERR "${file}:$.: warning: cannot understand function prototype: '$prototype'\n";
> - return;
> - }
> -
> - my $prms = join " ", @parameterlist;
> - check_sections($file, $declaration_name, "function", $sectcheck, $prms);
> -
> - # This check emits a lot of warnings at the moment, because many
> - # functions don't have a 'Return' doc section. So until the number
> - # of warnings goes sufficiently down, the check is only performed in
> - # verbose mode.
> - # TODO: always perform the check.
> - if ($verbose && !$noret) {
> - check_return_section($file, $declaration_name, $return_type);
> - }
> -
> - # The function parser can be called with a typedef parameter.
> - # Handle it.
> - if ($return_type =~ /typedef/) {
> - output_declaration($declaration_name,
> - 'function',
> - {'function' => $declaration_name,
> - 'typedef' => 1,
> - 'module' => $modulename,
> - 'functiontype' => $return_type,
> - 'parameterlist' => \@parameterlist,
> - 'parameterdescs' => \%parameterdescs,
> - 'parametertypes' => \%parametertypes,
> - 'sectionlist' => \@sectionlist,
> - 'sections' => \%sections,
> - 'purpose' => $declaration_purpose
> - });
> - } else {
> - output_declaration($declaration_name,
> - 'function',
> - {'function' => $declaration_name,
> - 'module' => $modulename,
> - 'functiontype' => $return_type,
> - 'parameterlist' => \@parameterlist,
> - 'parameterdescs' => \%parameterdescs,
> - 'parametertypes' => \%parametertypes,
> - 'sectionlist' => \@sectionlist,
> - 'sections' => \%sections,
> - 'purpose' => $declaration_purpose
> - });
> - }
> -}
> -
> -sub reset_state {
> - $function = "";
> - %parameterdescs = ();
> - %parametertypes = ();
> - @parameterlist = ();
> - %sections = ();
> - @sectionlist = ();
> - $sectcheck = "";
> - $struct_actual = "";
> - $prototype = "";
> -
> - $state = STATE_NORMAL;
> - $inline_doc_state = STATE_INLINE_NA;
> -}
> -
> -sub tracepoint_munge($) {
> - my $file = shift;
> - my $tracepointname = 0;
> - my $tracepointargs = 0;
> -
> - if ($prototype =~ m/TRACE_EVENT\((.*?),/) {
> - $tracepointname = $1;
> - }
> - if ($prototype =~ m/DEFINE_SINGLE_EVENT\((.*?),/) {
> - $tracepointname = $1;
> - }
> - if ($prototype =~ m/DEFINE_EVENT\((.*?),(.*?),/) {
> - $tracepointname = $2;
> - }
> - $tracepointname =~ s/^\s+//; #strip leading whitespace
> - if ($prototype =~ m/TP_PROTO\((.*?)\)/) {
> - $tracepointargs = $1;
> - }
> - if (($tracepointname eq 0) || ($tracepointargs eq 0)) {
> - print STDERR "${file}:$.: warning: Unrecognized tracepoint format: \n".
> - "$prototype\n";
> - } else {
> - $prototype = "static inline void trace_$tracepointname($tracepointargs)";
> - }
> -}
> -
> -sub syscall_munge() {
> - my $void = 0;
> -
> - $prototype =~ s@[\r\n]+@ @gos; # strip newlines/CR's
> -## if ($prototype =~ m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
> - if ($prototype =~ m/SYSCALL_DEFINE0/) {
> - $void = 1;
> -## $prototype = "long sys_$1(void)";
> - }
> -
> - $prototype =~ s/SYSCALL_DEFINE.*\(/long sys_/; # fix return type & func name
> - if ($prototype =~ m/long (sys_.*?),/) {
> - $prototype =~ s/,/\(/;
> - } elsif ($void) {
> - $prototype =~ s/\)/\(void\)/;
> - }
> -
> - # now delete all of the odd-number commas in $prototype
> - # so that arg types & arg names don't have a comma between them
> - my $count = 0;
> - my $len = length($prototype);
> - if ($void) {
> - $len = 0; # skip the for-loop
> - }
> - for (my $ix = 0; $ix < $len; $ix++) {
> - if (substr($prototype, $ix, 1) eq ',') {
> - $count++;
> - if ($count % 2 == 1) {
> - substr($prototype, $ix, 1) = ' ';
> - }
> - }
> - }
> -}
> -
> -sub process_proto_function($$) {
> - my $x = shift;
> - my $file = shift;
> -
> - $x =~ s@\/\/.*$@@gos; # strip C99-style comments to end of line
> -
> - if ($x =~ m#\s*/\*\s+MACDOC\s*#io || ($x =~ /^#/ && $x !~ /^#\s*define/)) {
> - # do nothing
> - }
> - elsif ($x =~ /([^\{]*)/) {
> - $prototype .= $1;
> - }
> -
> - if (($x =~ /\{/) || ($x =~ /\#\s*define/) || ($x =~ /;/)) {
> - $prototype =~ s@/\*.*?\*/@@gos; # strip comments.
> - $prototype =~ s@[\r\n]+@ @gos; # strip newlines/cr's.
> - $prototype =~ s@^\s+@@gos; # strip leading spaces
> -
> - # Handle prototypes for function pointers like:
> - # int (*pcs_config)(struct foo)
> - $prototype =~ s@^(\S+\s+)\(\s*\*(\S+)\)@$1$2@gos;
> -
> - if ($prototype =~ /SYSCALL_DEFINE/) {
> - syscall_munge();
> - }
> - if ($prototype =~ /TRACE_EVENT/ || $prototype =~ /DEFINE_EVENT/ ||
> - $prototype =~ /DEFINE_SINGLE_EVENT/)
> - {
> - tracepoint_munge($file);
> - }
> - dump_function($prototype, $file);
> - reset_state();
> - }
> -}
> -
> -sub process_proto_type($$) {
> - my $x = shift;
> - my $file = shift;
> -
> - $x =~ s@[\r\n]+@ @gos; # strip newlines/cr's.
> - $x =~ s@^\s+@@gos; # strip leading spaces
> - $x =~ s@\s+$@@gos; # strip trailing spaces
> - $x =~ s@\/\/.*$@@gos; # strip C99-style comments to end of line
> -
> - if ($x =~ /^#/) {
> - # To distinguish preprocessor directive from regular declaration later.
> - $x .= ";";
> - }
> -
> - while (1) {
> - if ( $x =~ /([^\{\};]*)([\{\};])(.*)/ ) {
> - if( length $prototype ) {
> - $prototype .= " "
> - }
> - $prototype .= $1 . $2;
> - ($2 eq '{') && $brcount++;
> - ($2 eq '}') && $brcount--;
> - if (($2 eq ';') && ($brcount == 0)) {
> - dump_declaration($prototype, $file);
> - reset_state();
> - last;
> - }
> - $x = $3;
> - } else {
> - $prototype .= $x;
> - last;
> - }
> - }
> -}
> -
> -
> -sub map_filename($) {
> - my $file;
> - my ($orig_file) = @_;
> -
> - if (defined($ENV{'SRCTREE'})) {
> - $file = "$ENV{'SRCTREE'}" . "/" . $orig_file;
> - } else {
> - $file = $orig_file;
> - }
> -
> - if (defined($source_map{$file})) {
> - $file = $source_map{$file};
> - }
> -
> - return $file;
> -}
> -
> -sub process_export_file($) {
> - my ($orig_file) = @_;
> - my $file = map_filename($orig_file);
> -
> - if (!open(IN,"<$file")) {
> - print STDERR "Error: Cannot open file $file\n";
> - ++$errors;
> - return;
> - }
> -
> - while (<IN>) {
> - if (/$export_symbol/) {
> - next if (defined($nosymbol_table{$2}));
> - $function_table{$2} = 1;
> - }
> - }
> -
> - close(IN);
> -}
> -
> -#
> -# Parsers for the various processing states.
> -#
> -# STATE_NORMAL: looking for the /** to begin everything.
> -#
> -sub process_normal() {
> - if (/$doc_start/o) {
> - $state = STATE_NAME; # next line is always the function name
> - $in_doc_sect = 0;
> - $declaration_start_line = $. + 1;
> - }
> -}
> -
> -#
> -# STATE_NAME: Looking for the "name - description" line
> -#
> -sub process_name($$) {
> - my $file = shift;
> - my $identifier;
> - my $descr;
> -
> - if (/$doc_block/o) {
> - $state = STATE_DOCBLOCK;
> - $contents = "";
> - $new_start_line = $.;
> -
> - if ( $1 eq "" ) {
> - $section = $section_intro;
> - } else {
> - $section = $1;
> - }
> - }
> - elsif (/$doc_decl/o) {
> - $identifier = $1;
> - if (/\s*([\w\s]+?)(\s*-|:)/) {
> - $identifier = $1;
> - }
> -
> - $state = STATE_BODY;
> - # if there's no @param blocks need to set up default section
> - # here
> - $contents = "";
> - $section = $section_default;
> - $new_start_line = $. + 1;
> - if (/[-:](.*)/) {
> - # strip leading/trailing/multiple spaces
> - $descr= $1;
> - $descr =~ s/^\s*//;
> - $descr =~ s/\s*$//;
> - $descr =~ s/\s+/ /g;
> - $declaration_purpose = $descr;
> - $state = STATE_BODY_MAYBE;
> - } else {
> - $declaration_purpose = "";
> - }
> -
> - if (($declaration_purpose eq "") && $verbose) {
> - print STDERR "${file}:$.: warning: missing initial short description on line:\n";
> - print STDERR $_;
> - ++$warnings;
> - }
> -
> - if ($identifier =~ m/^struct\b/) {
> - $decl_type = 'struct';
> - } elsif ($identifier =~ m/^union\b/) {
> - $decl_type = 'union';
> - } elsif ($identifier =~ m/^enum\b/) {
> - $decl_type = 'enum';
> - } elsif ($identifier =~ m/^typedef\b/) {
> - $decl_type = 'typedef';
> - } else {
> - $decl_type = 'function';
> - }
> -
> - if ($verbose) {
> - print STDERR "${file}:$.: info: Scanning doc for $identifier\n";
> - }
> - } else {
> - print STDERR "${file}:$.: warning: Cannot understand $_ on line $.",
> - " - I thought it was a doc line\n";
> - ++$warnings;
> - $state = STATE_NORMAL;
> - }
> -}
> -
> -
> -#
> -# STATE_BODY and STATE_BODY_MAYBE: the bulk of a kerneldoc comment.
> -#
> -sub process_body($$) {
> - my $file = shift;
> -
> - # Until all named variable macro parameters are
> - # documented using the bare name (`x`) rather than with
> - # dots (`x...`), strip the dots:
> - if ($section =~ /\w\.\.\.$/) {
> - $section =~ s/\.\.\.$//;
> -
> - if ($verbose) {
> - print STDERR "${file}:$.: warning: Variable macro arguments should be documented without dots\n";
> - ++$warnings;
> - }
> - }
> -
> - if ($state == STATE_BODY_WITH_BLANK_LINE && /^\s*\*\s?\S/) {
> - dump_section($file, $section, $contents);
> - $section = $section_default;
> - $new_start_line = $.;
> - $contents = "";
> - }
> -
> - if (/$doc_sect/i) { # case insensitive for supported section names
> - $newsection = $1;
> - $newcontents = $2;
> -
> - # map the supported section names to the canonical names
> - if ($newsection =~ m/^description$/i) {
> - $newsection = $section_default;
> - } elsif ($newsection =~ m/^context$/i) {
> - $newsection = $section_context;
> - } elsif ($newsection =~ m/^returns?$/i) {
> - $newsection = $section_return;
> - } elsif ($newsection =~ m/^\@return$/) {
> - # special: @return is a section, not a param description
> - $newsection = $section_return;
> - }
> -
> - if (($contents ne "") && ($contents ne "\n")) {
> - if (!$in_doc_sect && $verbose) {
> - print STDERR "${file}:$.: warning: contents before sections\n";
> - ++$warnings;
> - }
> - dump_section($file, $section, $contents);
> - $section = $section_default;
> - }
> -
> - $in_doc_sect = 1;
> - $state = STATE_BODY;
> - $contents = $newcontents;
> - $new_start_line = $.;
> - while (substr($contents, 0, 1) eq " ") {
> - $contents = substr($contents, 1);
> - }
> - if ($contents ne "") {
> - $contents .= "\n";
> - }
> - $section = $newsection;
> - $leading_space = undef;
> - } elsif (/$doc_end/) {
> - if (($contents ne "") && ($contents ne "\n")) {
> - dump_section($file, $section, $contents);
> - $section = $section_default;
> - $contents = "";
> - }
> - # look for doc_com + <text> + doc_end:
> - if ($_ =~ m'\s*\*\s*[a-zA-Z_0-9:\.]+\*/') {
> - print STDERR "${file}:$.: warning: suspicious ending line: $_";
> - ++$warnings;
> - }
> -
> - $prototype = "";
> - $state = STATE_PROTO;
> - $brcount = 0;
> - $new_start_line = $. + 1;
> - } elsif (/$doc_content/) {
> - if ($1 eq "") {
> - if ($section eq $section_context) {
> - dump_section($file, $section, $contents);
> - $section = $section_default;
> - $contents = "";
> - $new_start_line = $.;
> - $state = STATE_BODY;
> - } else {
> - if ($section ne $section_default) {
> - $state = STATE_BODY_WITH_BLANK_LINE;
> - } else {
> - $state = STATE_BODY;
> - }
> - $contents .= "\n";
> - }
> - } elsif ($state == STATE_BODY_MAYBE) {
> - # Continued declaration purpose
> - chomp($declaration_purpose);
> - $declaration_purpose .= " " . $1;
> - $declaration_purpose =~ s/\s+/ /g;
> - } else {
> - my $cont = $1;
> - if ($section =~ m/^@/ || $section eq $section_context) {
> - if (!defined $leading_space) {
> - if ($cont =~ m/^(\s+)/) {
> - $leading_space = $1;
> - } else {
> - $leading_space = "";
> - }
> - }
> - $cont =~ s/^$leading_space//;
> - }
> - $contents .= $cont . "\n";
> - }
> - } else {
> - # i dont know - bad line? ignore.
> - print STDERR "${file}:$.: warning: bad line: $_";
> - ++$warnings;
> - }
> -}
> -
> -
> -#
> -# STATE_PROTO: reading a function/whatever prototype.
> -#
> -sub process_proto($$) {
> - my $file = shift;
> -
> - if (/$doc_inline_oneline/) {
> - $section = $1;
> - $contents = $2;
> - if ($contents ne "") {
> - $contents .= "\n";
> - dump_section($file, $section, $contents);
> - $section = $section_default;
> - $contents = "";
> - }
> - } elsif (/$doc_inline_start/) {
> - $state = STATE_INLINE;
> - $inline_doc_state = STATE_INLINE_NAME;
> - } elsif ($decl_type eq 'function') {
> - process_proto_function($_, $file);
> - } else {
> - process_proto_type($_, $file);
> - }
> -}
> -
> -#
> -# STATE_DOCBLOCK: within a DOC: block.
> -#
> -sub process_docblock($$) {
> - my $file = shift;
> -
> - if (/$doc_end/) {
> - dump_doc_section($file, $section, $contents);
> - $section = $section_default;
> - $contents = "";
> - $function = "";
> - %parameterdescs = ();
> - %parametertypes = ();
> - @parameterlist = ();
> - %sections = ();
> - @sectionlist = ();
> - $prototype = "";
> - $state = STATE_NORMAL;
> - } elsif (/$doc_content/) {
> - if ( $1 eq "" ) {
> - $contents .= $blankline;
> - } else {
> - $contents .= $1 . "\n";
> - }
> - }
> -}
> -
> -#
> -# STATE_INLINE: docbook comments within a prototype.
> -#
> -sub process_inline($$) {
> - my $file = shift;
> -
> - # First line (state 1) needs to be a @parameter
> - if ($inline_doc_state == STATE_INLINE_NAME && /$doc_inline_sect/o) {
> - $section = $1;
> - $contents = $2;
> - $new_start_line = $.;
> - if ($contents ne "") {
> - while (substr($contents, 0, 1) eq " ") {
> - $contents = substr($contents, 1);
> - }
> - $contents .= "\n";
> - }
> - $inline_doc_state = STATE_INLINE_TEXT;
> - # Documentation block end */
> - } elsif (/$doc_inline_end/) {
> - if (($contents ne "") && ($contents ne "\n")) {
> - dump_section($file, $section, $contents);
> - $section = $section_default;
> - $contents = "";
> - }
> - $state = STATE_PROTO;
> - $inline_doc_state = STATE_INLINE_NA;
> - # Regular text
> - } elsif (/$doc_content/) {
> - if ($inline_doc_state == STATE_INLINE_TEXT) {
> - $contents .= $1 . "\n";
> - # nuke leading blank lines
> - if ($contents =~ /^\s*$/) {
> - $contents = "";
> - }
> - } elsif ($inline_doc_state == STATE_INLINE_NAME) {
> - $inline_doc_state = STATE_INLINE_ERROR;
> - print STDERR "${file}:$.: warning: ";
> - print STDERR "Incorrect use of kernel-doc format: $_";
> - ++$warnings;
> - }
> - }
> -}
> -
> -
> -sub process_file($) {
> - my $file;
> - my $initial_section_counter = $section_counter;
> - my ($orig_file) = @_;
> -
> - $file = map_filename($orig_file);
> -
> - if (!open(IN_FILE,"<$file")) {
> - print STDERR "Error: Cannot open file $file\n";
> - ++$errors;
> - return;
> - }
> -
> - $. = 1;
> -
> - $section_counter = 0;
> - while (<IN_FILE>) {
> - while (s/\\\s*$//) {
> - $_ .= <IN_FILE>;
> - }
> - # Replace tabs by spaces
> - while ($_ =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e) {};
> - # Hand this line to the appropriate state handler
> - if ($state == STATE_NORMAL) {
> - process_normal();
> - } elsif ($state == STATE_NAME) {
> - process_name($file, $_);
> - } elsif ($state == STATE_BODY || $state == STATE_BODY_MAYBE ||
> - $state == STATE_BODY_WITH_BLANK_LINE) {
> - process_body($file, $_);
> - } elsif ($state == STATE_INLINE) { # scanning for inline parameters
> - process_inline($file, $_);
> - } elsif ($state == STATE_PROTO) {
> - process_proto($file, $_);
> - } elsif ($state == STATE_DOCBLOCK) {
> - process_docblock($file, $_);
> - }
> - }
> -
> - # Make sure we got something interesting.
> - if ($initial_section_counter == $section_counter && $
> - output_mode ne "none") {
> - if ($output_selection == OUTPUT_INCLUDE) {
> - print STDERR "${file}:1: warning: '$_' not found\n"
> - for keys %function_table;
> - }
> - else {
> - print STDERR "${file}:1: warning: no structured comments found\n";
> - }
> - }
> - close IN_FILE;
> -}
> -
> -
> -if ($output_mode eq "rst") {
> - get_sphinx_version() if (!$sphinx_major);
> -}
> -
> -$kernelversion = get_kernel_version();
> -
> -# generate a sequence of code that will splice in highlighting information
> -# using the s// operator.
> -for (my $k = 0; $k < @highlights; $k++) {
> - my $pattern = $highlights[$k][0];
> - my $result = $highlights[$k][1];
> -# print STDERR "scanning pattern:$pattern, highlight:($result)\n";
> - $dohighlight .= "\$contents =~ s:$pattern:$result:gs;\n";
> -}
> -
> -# Read the file that maps relative names to absolute names for
> -# separate source and object directories and for shadow trees.
> -if (open(SOURCE_MAP, "<.tmp_filelist.txt")) {
> - my ($relname, $absname);
> - while(<SOURCE_MAP>) {
> - chop();
> - ($relname, $absname) = (split())[0..1];
> - $relname =~ s:^/+::;
> - $source_map{$relname} = $absname;
> - }
> - close(SOURCE_MAP);
> -}
> -
> -if ($output_selection == OUTPUT_EXPORTED ||
> - $output_selection == OUTPUT_INTERNAL) {
> -
> - push(@export_file_list, @ARGV);
> -
> - foreach (@export_file_list) {
> - chomp;
> - process_export_file($_);
> - }
> -}
> -
> -foreach (@ARGV) {
> - chomp;
> - process_file($_);
> -}
> -if ($verbose && $errors) {
> - print STDERR "$errors errors\n";
> -}
> -if ($verbose && $warnings) {
> - print STDERR "$warnings warnings\n";
> -}
> -
> -if ($Werror && $warnings) {
> - print STDERR "$warnings warnings as Errors\n";
> - exit($warnings);
> -} else {
> - exit($output_mode eq "none" ? 0 : $errors)
> -}
Thanks,
Mauro
* Re: [PATCH for-10.2 8/8] MAINTAINERS: Put kernel-doc under the "docs build machinery" section
2025-08-14 17:13 ` [PATCH for-10.2 8/8] MAINTAINERS: Put kernel-doc under the "docs build machinery" section Peter Maydell
@ 2025-08-15 10:40 ` Mauro Carvalho Chehab
2025-08-26 10:36 ` Peter Maydell
0 siblings, 1 reply; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 10:40 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel, Paolo Bonzini, John Snow
On Thu, 14 Aug 2025 18:13:23 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:
> We never had a MAINTAINERS entry for the old kernel-doc script; add
> the files for the new Python kernel-doc under "Sphinx documentation
> configuration and build machinery", as the most appropriate
> subsection.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> MAINTAINERS | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a07086ed762..efa59ce7c36 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4436,6 +4436,8 @@ F: docs/sphinx/
> F: docs/_templates/
> F: docs/devel/docs.rst
> F: docs/devel/qapi-domain.rst
> +F: scripts/kernel-doc
> +F: scripts/lib/kdoc/
If you want, feel free to add me there either as maintainer or
reviewer. I can gladly help you maintain it.
Either way:
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
-
PS.: On a side note, there is an RFC still under discussion about
moving the script location upstream to tools/docs. Nothing is decided
yet. One of the points we're still unsure about is where we would place
the library directory. So, this may end up slipping to the next
kernel cycle.
>
> Rust build system integration
> M: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Thanks,
Mauro
* Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one
2025-08-15 10:10 ` Peter Maydell
@ 2025-08-15 11:12 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 24+ messages in thread
From: Mauro Carvalho Chehab @ 2025-08-15 11:12 UTC (permalink / raw)
To: Peter Maydell; +Cc: Jonathan Cameron, qemu-devel, Paolo Bonzini, John Snow
On Fri, 15 Aug 2025 11:10:05 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:
> On Fri, 15 Aug 2025 at 10:39, Mauro Carvalho Chehab
> <mchehab+huawei@kernel.org> wrote:
> >
> > Hi Peter/Jonathan,
> >
> > On Fri, 15 Aug 2025 10:11:09 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >
> > > On Thu, 14 Aug 2025 18:13:15 +0100
> > > Peter Maydell <peter.maydell@linaro.org> wrote:
> > >
> > > > Earlier this year, the Linux kernel's kernel-doc script was rewritten
> > > > from the old Perl version into a shiny and hopefully more maintainable
> > > > Python version. This commit series updates our copy of this script
> > > > to the latest kernel version. I have tested it by comparing the
> > > > generated HTML documentation and checking that there are no
> > > > unexpected changes.
> >
> > Nice! Yeah, I had a branch here doing something similar for QEMU,
> > but got sidetracked by other things and didn't have time to address
> > a couple of issues. I'm glad you found the time for it.
> >
> > > > Luckily we are carrying very few local modifications to the Perl
> > > > script, so this is fairly straightforward. The structure of the
> > > > patchset is:
> > > > * a minor update to the kerneldoc.py Sphinx extension so it
> > > > will work with both old and new kernel-doc script output
> > > > * a fix to a doc comment markup error that I noticed while comparing
> > > > the HTML output from the two versions of the script
> > > > * import the new Python script, unmodified from the kernel's version
> > > > (conveniently the kernel calls it kernel-doc.py, so it doesn't
> > > > clash with the existing script)
> >
> > > > * make the changes to that library code that correspond to the
> > > > two local QEMU-specific changes we carry
> >
> > To make it easier to maintain and keep in sync with Kernel upstream,
> > perhaps we can try to change Kernel upstream to make it easier for QEMU
> > to have a class override for the kdoc parser, allowing it to just
> > sync with Linux upstream, while having its own set of rules in a
> > separate file.
>
> Mmm, this would certainly be nice, but at least so far we haven't
> needed to make extensive changes, luckily (you can see how small
> our local adjustments are here).
I just reviewed the series. IMO, if you create a class override for
RestOutput, as I suggested, there will be just a single line
of difference:
(r"QEMU_[A-Z_]+ +", "", 0),
Not sure about others, but, from my side, I don't mind picking a
patch like that at Kernel upstream, if it doesn't cause any
regressions there (unlikely, but I didn't check).
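
As a rough illustration of what that one-line entry does, here is a
minimal sketch that applies a (pattern, replacement, flags) tuple to a C
prototype with re.sub(). Only the QEMU_ tuple itself comes from this
thread; the table name, the helper function and the sample prototypes
are hypothetical, and the real parser may apply its substitutions
differently.

```python
import re

# Hypothetical substitution table in the (pattern, replacement, flags)
# shape quoted above; only the QEMU_ entry comes from this thread.
QEMU_PREFIX_SUBS = [
    (r"QEMU_[A-Z_]+ +", "", 0),
]


def strip_qemu_macros(prototype, subs=QEMU_PREFIX_SUBS):
    """Apply each (pattern, replacement, flags) entry to a C prototype."""
    for pattern, replacement, flags in subs:
        prototype = re.sub(pattern, replacement, prototype, flags=flags)
    return prototype


if __name__ == "__main__":
    # Attribute-style macros are dropped, the rest is left untouched.
    print(strip_qemu_macros("QEMU_WARN_UNUSED_RESULT int foo(void)"))
    print(strip_qemu_macros("static QEMU_ALWAYS_INLINE int bar(int x)"))
```

Keeping the rule as data rather than code is what would let a QEMU
subclass differ from the kernel version by a single table entry.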
> > > > * tell sphinx to use the Python version
> > > > * delete the Perl script (I have put a diff of our local mods
> > > > to the Perl script in the commit message of this commit, for
> > > > posterity)
> > > >
> > > > The diffstat looks big, but almost all of it is "import the
> > > > kernel's new script that we trust and don't need to review in
> > > > detail" and "delete the old script".
> >
> > One thing that should be noted is that Jonathan Corbet is currently
> > doing several cleanups to the Python script, simplifying some
> > regular expressions, avoiding them when str.replace() does the job
> > and adding comments. The end goal is to make it easier for developers
> > to understand and help maintain its code.
> >
> > So, it is probably worth backporting Linux upstream changes after
> > the end of Kernel 6.17 cycle.
>
> Thanks for the heads-up on that one. A further sync should
> be straightforward after this one, I expect.
Yeah, it sounds so.
> > > > We should also update the Sphinx plugin itself (i.e.
> > > > docs/sphinx/kerneldoc.py), but because I did not need to do
> > > > that to update the main kernel-doc script, I have left that as
> > > > a separate todo item.
> >
> > The Kernel Sphinx plugin after the change is IMHO (*) a lot cleaner
> > than before, and handles kernel-doc warnings better, as they now
> > use the Sphinx logger class.
>
> Also as much as anything else it's just nice for us not to
> diverge if we can avoid it.
>
> Incidentally, I'm curious if the kernel docs see problems
> with docutils 0.22 -- we had a report about problems there,
> at least some of which seem to be because the way kerneldoc.py
> adds its rST output is triggering the new docutils to complain
> if the added code doesn't have a consistent title style
> hierarchy: https://sourceforge.net/p/docutils/bugs/508/
> (It looks like they're trying to address this on the docutils side;
> we might or might not adjust on our side too by fixing up the
> title styles if that's not too awkward for us.)
I only tested builds with docutils 0.17 up to 0.21.2. It worked fine
for all of them. Now, 0.22 was released on 2025-07-29. I haven't
tested it yet, nor am I aware of anyone complaining about it on the
Kernel MLs yet.
Btw, I wrote an upstream script to test building docs with different
Sphinx and docutils versions.
It is under:
scripts/test_doc_build.py
It probably makes sense to port it to QEMU and add it to CI. Most of
the logic is independent of the Kernel. The only part that would
require adjustments is the logic in _handle_version() that creates
make commands to clean docs and build html.
>
> > Btw, one important point to notice: if you picked the latest version
> > of kernel-doc, it currently requires at least Python 3.6 (3.7 is the
> > recommended minimal one). It does check that, silently bailing out
> > if Python < 3.6.
>
> QEMU already requires Python 3.9 or better; our configure checks:
>
> check_py_version() {
> # We require python >= 3.9.
> # NB: a True python conditional creates a non-zero return code (Failure)
> "$1" -c 'import sys; sys.exit(sys.version_info < (3,9))'
> }
Great!
> Thanks for the confirmation that the kernel is being more
> conservative on python requirements than we are; I did
> wonder about this but merely assumed you probably were
> rather than specifically checking :-)
Heh, an early change in the 6.17 cycle inadvertently made it require
3.9 ;-)
We ended up changing it to preserve 3.7+ support, as we wanted to
ensure it would build on openSUSE Leap.
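
For readers comparing the two minimums (3.9 for QEMU's configure, 3.7+
for the kernel script), the check boils down to a tuple comparison on
sys.version_info. A minimal sketch, with a hypothetical helper name that
is not taken from either project:

```python
import sys


def meets_minimum(version_info, minimum):
    """Return True if (major, minor) of version_info is >= minimum."""
    # Only major and minor are compared; tuples compare element by element.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # QEMU's configure requires 3.9; the kernel script aims to keep 3.7+.
    for minimum in ((3, 9), (3, 7)):
        ok = meets_minimum(sys.version_info, minimum)
        print("Python >= %d.%d: %s" % (minimum[0], minimum[1], ok))
```

Note that the configure snippet quoted above passes the inverted
comparison to sys.exit(), so a True result (version too old) becomes a
non-zero, i.e. failing, exit status.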
> On this minor output change:
>
> > > > "Definition" sections now get output with a trailing colon:
> > > >
> > > > -<p><strong>Definition</strong></p>
> > > > +<div class="kernelindent docutils container">
> > > > +<p><strong>Definition</strong>:</p>
> > > >
> > > > This seems like it might be a bug in kernel-doc since the Parameters,
> > > > Return, etc sections don't get the trailing colon. I don't think it's
> > > > important enough to worry about.
>
> is the extra colon intentional, or do you agree that it's
> a bug? You can see it in the kernel docs output at e.g.
> https://docs.kernel.org/core-api/workqueue.html#c.workqueue_attrs
>
> where in the documentation of struct workqueue_attrs,
> "Definition:" gets a colon but the corresponding "Members"
> and "Description" don't.
This one predates kernel-doc.py, as it exists in the Perl version:
$ grep Definition scripts/kernel-doc.pl
print $lineprefix . "**Definition**::\n\n";
It seems this was added in this upstream commit:
commit eaf710ceb5ae284778a87c0d0f2348c19e3e4751
Author: Jonathan Corbet <corbet@lwn.net>
Date: Fri Sep 30 11:52:09 2022 -0600
docs: improve the HTML formatting of kerneldoc comments
Make a few changes to cause functions documented by kerneldoc to stand out
better in the rendered documentation. Specifically, change kernel-doc to
put the description section into a ".. container::" section, then add a bit
of CSS to indent that section relative to the function prototype (or struct
or enum definition). Tweak a few other CSS parameters while in the
neighborhood to improve the formatting.
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
While I don't mind much either way, IMO the best option would be to
drop the extra ":" at the end.
Feel free to submit a Kernel patch upstream dropping it from
scripts/lib/kdoc/kdoc_output.py.
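
For context on where the visible colon comes from: in reStructuredText a
paragraph ending in "::" introduces a literal block, and when text
immediately precedes the "::" one colon is kept in the output. The
sketch below only emulates that rule for illustration; it is not
docutils code, the function name is made up, and it assumes the Python
output code emits the same "**Definition**::" string as the Perl line
quoted above.

```python
def visible_paragraph_text(paragraph):
    """Emulate how rST renders a paragraph that ends in '::'."""
    if not paragraph.endswith("::"):
        return paragraph
    body = paragraph[:-2]
    if body == "" or body[-1].isspace():
        # "::" alone or preceded by whitespace: both colons disappear.
        return body.rstrip()
    # Text immediately before "::": one colon stays visible.
    return body + ":"


if __name__ == "__main__":
    print(visible_paragraph_text("**Definition**::"))
    print(visible_paragraph_text("**Definition** ::"))
```

Under that assumption, emitting "**Definition** ::" (or putting the "::"
on a line of its own) would be one way to keep the literal block while
dropping the colon from the rendered heading.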
> (Also "Description" is out-dented
> there when it probably should not be, but that's separate.)
Yeah, indenting Description makes sense to me.
Thanks,
Mauro
* Re: [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
` (8 preceding siblings ...)
2025-08-15 9:11 ` [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Jonathan Cameron via
@ 2025-08-19 10:34 ` Paolo Bonzini
9 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2025-08-19 10:34 UTC (permalink / raw)
To: Peter Maydell, qemu-devel; +Cc: John Snow
On 8/14/25 19:13, Peter Maydell wrote:
> Earlier this year, the Linux kernel's kernel-doc script was rewritten
> from the old Perl version into a shiny and hopefully more maintainable
> Python version. This commit series updates our copy of this script
> to the latest kernel version. I have tested it by comparing the
> generated HTML documentation and checking that there are no
> unexpected changes.
>
> Luckily we are carrying very few local modifications to the Perl
> script, so this is fairly straightforward. The structure of the
> patchset is:
> * a minor update to the kerneldoc.py Sphinx extension so it
> will work with both old and new kernel-doc script output
> * a fix to a doc comment markup error that I noticed while comparing
> the HTML output from the two versions of the script
> * import the new Python script, unmodified from the kernel's version
> (conveniently the kernel calls it kernel-doc.py, so it doesn't
> clash with the existing script)
> * make the changes to that library code that correspond to the
> two local QEMU-specific changes we carry
> * tell sphinx to use the Python version
> * delete the Perl script (I have put a diff of our local mods
> to the Perl script in the commit message of this commit, for
> posterity)
>
> The diffstat looks big, but almost all of it is "import the
> kernel's new script that we trust and don't need to review in
> detail" and "delete the old script".
>
> My immediate motivation for doing this update is that I noticed
> that the submitter of https://gitlab.com/qemu-project/qemu/-/issues/3077
> is using a Perl that complains about a construct in the perl script,
> which prompted me to check if the kernel folks had already fixed
> it, which it turned out that they had, by rewriting the whole thing :-)
> More generally, if we don't do this update, then we're effectively
> going to drift down the same path we did with checkpatch.pl, where
> we have our own version that diverges from the kernel's version
> and we have to maintain it ourselves.
Yep - for checkpatch.pl that makes sense, since we have more differences
in what we test and we have backported the most pressing parser fixes,
but kerneldoc has no reason to diverge.
Thanks for doing this! For the whole series...
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo
> We should also update the Sphinx plugin itself (i.e.
> docs/sphinx/kerneldoc.py), but because I did not need to do
> that to update the main kernel-doc script, I have left that as
> a separate todo item.
>
> Testing
> -------
>
> I looked at the HTML output of the old kernel-doc script versus the
> new one, using the following diff command which mechanically excludes
> a couple of "same minor change" everywhere diffs, and eyeballing the
> resulting ~150 lines of diff.
>
> diff -w -I '^<div class="kernelindent docutils container">$' -I '^</div>$' -I '^<p><strong>Definition</strong>' -r -u -x searchindex.js build/x86/docs-old-kerneldoc/manual build/x86/docs/manual
>
> The HTML changes are:
>
> (1) some paras now have ID tags, eg:
> -<p><strong>Functions operating on arrays of bits</strong></p>
> +<p id="functions-operating-on-arrays-of-bits"><strong>Functions operating on arrays of bits</strong></p>
>
> (2) Some extra named <div>s, eg:
> +<div class="kernelindent docutils container">
> <p><strong>Parameters</strong></p>
> <dl class="simple">
> <dt><code class="docutils literal notranslate"><span class="pre">long</span> <span class="pre">nr</span></code></dt><dd><p>the bit to set</p>
> @@ -144,12 +145,14 @@
> <dt><code class="docutils literal notranslate"><span class="pre">unsigned</span> <span class="pre">long</span> <span class="pre">*addr</span></code></dt><dd><p>the address to start counting from</p>
> </dd>
> </dl>
> +</div>
>
> (3) The new version correctly parses the multi-line Return: block for
> the memory_translate_iotlb() doc comment. You can see that the
> old HTML here had dt/dd markup, and it mis-renders in the HTML at
> https://www.qemu.org/docs/master/devel/memory.html#c.memory_translate_iotlb
>
> <p><strong>Return</strong></p>
> -<dl class="simple">
> -<dt>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated</dt><dd><p>addr. The MemoryRegion must not be
> accessed after rcu_read_unlock.
> +<p>On success, return the MemoryRegion containing the <strong>iotlb</strong> translated
> +addr. The MemoryRegion must not be accessed after rcu_read_unlock.
> On failure, return NULL, setting <strong>errp</strong> with error.</p>
> -</dd>
> -</dl>
> +</div>
>
> "Definition" sections now get output with a trailing colon:
>
> -<p><strong>Definition</strong></p>
> +<div class="kernelindent docutils container">
> +<p><strong>Definition</strong>:</p>
>
> This seems like it might be a bug in kernel-doc since the Parameters,
> Return, etc sections don't get the trailing colon. I don't think it's
> important enough to worry about.
>
> thanks
> -- PMM
>
> Peter Maydell (8):
> docs/sphinx/kerneldoc.py: Handle new LINENO syntax
> tests/qtest/libqtest.h: Remove stray space from doc comment
> scripts: Import Python kerneldoc from Linux kernel
> scripts/kernel-doc: strip QEMU_ from function definitions
> scripts/kernel-doc: tweak for QEMU coding standards
> scripts/kerneldoc: Switch to the Python kernel-doc script
> scripts/kernel-doc: Delete the old Perl kernel-doc script
> MAINTAINERS: Put kernel-doc under the "docs build machinery" section
>
> MAINTAINERS | 2 +
> docs/conf.py | 4 +-
> docs/sphinx/kerneldoc.py | 7 +-
> tests/qtest/libqtest.h | 2 +-
> .editorconfig | 2 +-
> scripts/kernel-doc | 2442 -------------------------------
> scripts/kernel-doc.py | 325 ++++
> scripts/lib/kdoc/kdoc_files.py | 291 ++++
> scripts/lib/kdoc/kdoc_item.py | 42 +
> scripts/lib/kdoc/kdoc_output.py | 749 ++++++++++
> scripts/lib/kdoc/kdoc_parser.py | 1670 +++++++++++++++++++++
> scripts/lib/kdoc/kdoc_re.py | 270 ++++
> 12 files changed, 3355 insertions(+), 2451 deletions(-)
> delete mode 100755 scripts/kernel-doc
> create mode 100755 scripts/kernel-doc.py
> create mode 100644 scripts/lib/kdoc/kdoc_files.py
> create mode 100644 scripts/lib/kdoc/kdoc_item.py
> create mode 100644 scripts/lib/kdoc/kdoc_output.py
> create mode 100644 scripts/lib/kdoc/kdoc_parser.py
> create mode 100644 scripts/lib/kdoc/kdoc_re.py
>
* Re: [PATCH for-10.2 8/8] MAINTAINERS: Put kernel-doc under the "docs build machinery" section
2025-08-15 10:40 ` Mauro Carvalho Chehab
@ 2025-08-26 10:36 ` Peter Maydell
0 siblings, 0 replies; 24+ messages in thread
From: Peter Maydell @ 2025-08-26 10:36 UTC (permalink / raw)
To: Mauro Carvalho Chehab; +Cc: qemu-devel, Paolo Bonzini, John Snow
On Fri, 15 Aug 2025 at 11:40, Mauro Carvalho Chehab
<mchehab+huawei@kernel.org> wrote:
>
> On Thu, 14 Aug 2025 18:13:23 +0100
> Peter Maydell <peter.maydell@linaro.org> wrote:
>
> > We never had a MAINTAINERS entry for the old kernel-doc script; add
> > the files for the new Python kernel-doc under "Sphinx documentation
> > configuration and build machinery", as the most appropriate
> > subsection.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> > ---
> > MAINTAINERS | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index a07086ed762..efa59ce7c36 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -4436,6 +4436,8 @@ F: docs/sphinx/
> > F: docs/_templates/
> > F: docs/devel/docs.rst
> > F: docs/devel/qapi-domain.rst
> > +F: scripts/kernel-doc
> > +F: scripts/lib/kdoc/
>
> If you want, feel free to add me there either as maintainer or
> reviewer. I can gladly help you maintain it.
Thanks, that would be very helpful. I've added a line for you to
this "Sphinx documentation configuration and build machinery"
section:
M: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
(and since this whole series is now reviewed I'll take
it via target-arm.next with that tweak.)
-- PMM
end of thread, other threads:[~2025-08-26 10:38 UTC | newest]
Thread overview: 24+ messages
2025-08-14 17:13 [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 1/8] docs/sphinx/kerneldoc.py: Handle new LINENO syntax Peter Maydell
2025-08-15 9:49 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 2/8] tests/qtest/libqtest.h: Remove stray space from doc comment Peter Maydell
2025-08-15 9:51 ` Mauro Carvalho Chehab
2025-08-15 10:14 ` Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 3/8] scripts: Import Python kerneldoc from Linux kernel Peter Maydell
2025-08-15 10:00 ` Mauro Carvalho Chehab
2025-08-15 10:19 ` Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 4/8] scripts/kernel-doc: strip QEMU_ from function definitions Peter Maydell
2025-08-15 10:01 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 5/8] scripts/kernel-doc: tweak for QEMU coding standards Peter Maydell
2025-08-15 10:34 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 6/8] scripts/kerneldoc: Switch to the Python kernel-doc script Peter Maydell
2025-08-14 17:13 ` [PATCH for-10.2 7/8] scripts/kernel-doc: Delete the old Perl " Peter Maydell
2025-08-15 10:35 ` Mauro Carvalho Chehab
2025-08-14 17:13 ` [PATCH for-10.2 8/8] MAINTAINERS: Put kernel-doc under the "docs build machinery" section Peter Maydell
2025-08-15 10:40 ` Mauro Carvalho Chehab
2025-08-26 10:36 ` Peter Maydell
2025-08-15 9:11 ` [PATCH for-10.2 0/8] docs: Update our kernel-doc script to the kernel's new Python one Jonathan Cameron via
2025-08-15 9:39 ` Mauro Carvalho Chehab
2025-08-15 10:10 ` Peter Maydell
2025-08-15 11:12 ` Mauro Carvalho Chehab
2025-08-19 10:34 ` Paolo Bonzini