From: Jonathan Corbet <corbet@lwn.net>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
Linux Doc Mailing List <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
"Gustavo A. R. Silva" <mchehab+huawei@kernel.org>,
Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
Kees Cook <mchehab+huawei@kernel.org>,
linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 09/39] scripts/kernel-doc.py: add a Python parser
Date: Mon, 24 Feb 2025 16:38:58 -0700 [thread overview]
Message-ID: <87v7sy29rh.fsf@trenco.lwn.net> (raw)
In-Reply-To: <3905b7386d5f1bfa76639cdf1108a46f0bccbbea.1740387599.git.mchehab+huawei@kernel.org>
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> Maintaining kernel-doc has been a challenge, as there aren't many
> perl developers among maintainers. Also, the logic there is too
> complex. Having lots of global variables and using pure functions
> doesn't help.
>
> Rewrite the script in Python, placing most global variables
> inside classes. This should help maintaining the script in long
> term.
[...]
> diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
> new file mode 100755
> index 000000000000..5cf5ed63f215
> --- /dev/null
> +++ b/scripts/kernel-doc.py
> @@ -0,0 +1,2757 @@
> +#!/usr/bin/env python3
> +# pylint: disable=R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,R1702
> +# pylint: disable=C0302,C0103,C0301
> +# pylint: disable=C0116,C0115,W0511,W0613
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +# SPDX-License-Identifier: GPL-2.0
The SPDX tag is supposed to be up top, right under the shebang
I also think you should give consideration to preserving the other
copyright notices in the Perl version. A language translation doesn't
remove existing copyrights...who knows how much creativity went into
some of those regexes?
> +# TODO: implement warning filtering
> +
> +"""
> +kernel_doc
> +==========
> +
> +Print formatted kernel documentation to stdout
> +
> +Read C language source or header FILEs, extract embedded
> +documentation comments, and print formatted documentation
> +to standard output.
> +
> +The documentation comments are identified by the "/**"
> +opening comment mark.
> +
> +See Documentation/doc-guide/kernel-doc.rst for the
> +documentation comment syntax.
> +"""
> +
> +import argparse
> +import logging
> +import os
> +import re
> +import sys
> +
> +from datetime import datetime
> +from pprint import pformat
> +
> +from dateutil import tz
> +
> +# Local cache for regular expressions
> +re_cache = {}
> +
> +
> +class Re:
So I have to say this bugs me a bit ... the class is fine, but the
one-letter case-only difference from the standard "re" class is just
going to make the code harder for others to approach. "kern_re" or
something like that? Or even "kre" if you really want it to be as short
as possible.
> + """
> + Helper class to simplify regex declaration and usage,
> +
> + It calls re.compile for a given pattern. It also allows adding
> + regular expressions and define sub at class init time.
> +
> + Regular expressions can be cached via an argument, helping to speedup
> + searches.
> + """
[...]
> +
> +class KernelDoc:
> + # Parser states
> + STATE_NORMAL = 0 # normal code
> + STATE_NAME = 1 # looking for function name
> + STATE_BODY_MAYBE = 2 # body - or maybe more description
> + STATE_BODY = 3 # the body of the comment
> + STATE_BODY_WITH_BLANK_LINE = 4 # the body which has a blank line
> + STATE_PROTO = 5 # scanning prototype
> + STATE_DOCBLOCK = 6 # documentation block
> + STATE_INLINE = 7 # gathering doc outside main block
> +
> + st_name = [
> + "NORMAL",
> + "NAME",
> + "BODY_MAYBE",
> + "BODY",
> + "BODY_WITH_BLANK_LINE",
> + "PROTO",
> + "DOCBLOCK",
> + "INLINE",
> + ]
So these ... kind of look like enums?
That's kind of it for nits ... I do have one wish that will kind of hard
to grant overall ... for the long-term maintenance of this code, it
would be really nice if every non-trivial regex were described by a
comment explaining what it is trying to do. It's not reasonable to
expect that as a condition for accepting this rewrite, but it sure would
be a nice goal to be working toward.
Thanks,
jon
next prev parent reply other threads:[~2025-02-24 23:39 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-24 9:08 [PATCH v2 00/39] Implement kernel-doc in Python Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 01/39] include/asm-generic/io.h: fix kerneldoc markup Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 02/39] drivers: media: intel-ipu3.h: fix identation on a kernel-doc markup Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 03/39] drivers: firewire: firewire-cdev.h: " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 04/39] docs: driver-api/infiniband.rst: fix Kerneldoc markup Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 05/39] scripts/kernel-doc: don't add not needed new lines Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 06/39] scripts/kernel-doc: drop dead code for Wcontents_before_sections Mauro Carvalho Chehab
2025-03-04 16:52 ` Jonathan Corbet
2025-02-24 9:08 ` [PATCH v2 07/39] scripts/kernel-doc: rename it to scripts/kernel-doc.pl Mauro Carvalho Chehab
2025-02-24 23:23 ` Jonathan Corbet
2025-02-25 6:26 ` Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 08/39] scripts/kernel-doc: add a symlink to the Perl version of kernel-doc Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 09/39] scripts/kernel-doc.py: add a Python parser Mauro Carvalho Chehab
2025-02-24 23:38 ` Jonathan Corbet [this message]
2025-02-25 7:38 ` Mauro Carvalho Chehab
2025-02-25 20:10 ` Jonathan Corbet
2025-02-26 6:56 ` Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 10/39] scripts/kernel-doc.py: output warnings the same way as kerneldoc Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 11/39] scripts/kernel-doc.py: better handle empty sections Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 12/39] scripts/kernel-doc.py: properly handle struct_group macros Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 13/39] scripts/kernel-doc.py: move regex methods to a separate file Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 14/39] scripts/kernel-doc.py: move KernelDoc class " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 15/39] scripts/kernel-doc.py: move KernelFiles " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 16/39] scripts/kernel-doc.py: move output classes " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 17/39] scripts/kernel-doc.py: convert message output to an interactor Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 18/39] scripts/kernel-doc.py: move file lists to the parser function Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 19/39] scripts/kernel-doc.py: implement support for -no-doc-sections Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 20/39] scripts/kernel-doc.py: fix line number output Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 21/39] scripts/kernel-doc.py: fix handling of doc output check Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 22/39] scripts/kernel-doc.py: properly handle out_section for ReST Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 23/39] scripts/kernel-doc.py: postpone warnings to the output plugin Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 24/39] docs: add a .pylintrc file with sys path for docs scripts Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 25/39] docs: sphinx: kerneldoc: verbose kernel-doc command if V=1 Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 26/39] docs: sphinx: kerneldoc: ignore "\" characters from options Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 27/39] docs: sphinx: kerneldoc: use kernel-doc.py script Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 28/39] scripts/kernel-doc.py: Set an output format for --none Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 29/39] scripts/kernel-doc.py: adjust some coding style issues Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 30/39] scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13 Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 31/39] scripts/kernel-doc.py: move modulename to man class Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 32/39] scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 33/39] scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 34/39] scripts/kernel-doc.py: Properly handle Werror and exit codes Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 35/39] scripts/kernel-doc.py: some coding style cleanups Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 36/39] scripts/kernel-doc: switch to use kernel-doc.py Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 37/39] scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 38/39] scripts/kernel_doc.py: better handle exported symbols Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 39/39] docs: sphinx: kerneldoc: Use python class if available Mauro Carvalho Chehab
2025-02-24 23:49 ` [PATCH v2 00/39] Implement kernel-doc in Python Jonathan Corbet
2025-02-25 7:54 ` Mauro Carvalho Chehab
2025-02-25 14:33 ` Jonathan Corbet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87v7sy29rh.fsf@trenco.lwn.net \
--to=corbet@lwn.net \
--cc=linux-doc@vger.kernel.org \
--cc=linux-hardening@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mchehab+huawei@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox