From: Jonathan Corbet <corbet@lwn.net>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
Linux Doc Mailing List <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
"Gustavo A. R. Silva" <mchehab+huawei@kernel.org>,
Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
Kees Cook <mchehab+huawei@kernel.org>,
linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 09/39] scripts/kernel-doc.py: add a Python parser
Date: Mon, 24 Feb 2025 16:38:58 -0700 [thread overview]
Message-ID: <87v7sy29rh.fsf@trenco.lwn.net> (raw)
In-Reply-To: <3905b7386d5f1bfa76639cdf1108a46f0bccbbea.1740387599.git.mchehab+huawei@kernel.org>
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> Maintaining kernel-doc has been a challenge, as there aren't many
> perl developers among maintainers. Also, the logic there is too
> complex. Having lots of global variables and using pure functions
> doesn't help.
>
> Rewrite the script in Python, placing most global variables
> inside classes. This should help maintaining the script in long
> term.
[...]
> diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
> new file mode 100755
> index 000000000000..5cf5ed63f215
> --- /dev/null
> +++ b/scripts/kernel-doc.py
> @@ -0,0 +1,2757 @@
> +#!/usr/bin/env python3
> +# pylint: disable=R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,R1702
> +# pylint: disable=C0302,C0103,C0301
> +# pylint: disable=C0116,C0115,W0511,W0613
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +# SPDX-License-Identifier: GPL-2.0
The SPDX tag is supposed to be up top, right under the shebang
I also think you should give consideration to preserving the other
copyright notices in the Perl version. A language translation doesn't
remove existing copyrights...who knows how much creativity went into
some of those regexes?
> +# TODO: implement warning filtering
> +
> +"""
> +kernel_doc
> +==========
> +
> +Print formatted kernel documentation to stdout
> +
> +Read C language source or header FILEs, extract embedded
> +documentation comments, and print formatted documentation
> +to standard output.
> +
> +The documentation comments are identified by the "/**"
> +opening comment mark.
> +
> +See Documentation/doc-guide/kernel-doc.rst for the
> +documentation comment syntax.
> +"""
> +
> +import argparse
> +import logging
> +import os
> +import re
> +import sys
> +
> +from datetime import datetime
> +from pprint import pformat
> +
> +from dateutil import tz
> +
> +# Local cache for regular expressions
> +re_cache = {}
> +
> +
> +class Re:
So I have to say this bugs me a bit ... the class is fine, but the
one-letter case-only difference from the standard "re" class is just
going to make the code harder for others to approach. "kern_re" or
something like that? Or even "kre" if you really want it to be as short
as possible.
> + """
> + Helper class to simplify regex declaration and usage,
> +
> + It calls re.compile for a given pattern. It also allows adding
> + regular expressions and define sub at class init time.
> +
> + Regular expressions can be cached via an argument, helping to speedup
> + searches.
> + """
[...]
> +
> +class KernelDoc:
> + # Parser states
> + STATE_NORMAL = 0 # normal code
> + STATE_NAME = 1 # looking for function name
> + STATE_BODY_MAYBE = 2 # body - or maybe more description
> + STATE_BODY = 3 # the body of the comment
> + STATE_BODY_WITH_BLANK_LINE = 4 # the body which has a blank line
> + STATE_PROTO = 5 # scanning prototype
> + STATE_DOCBLOCK = 6 # documentation block
> + STATE_INLINE = 7 # gathering doc outside main block
> +
> + st_name = [
> + "NORMAL",
> + "NAME",
> + "BODY_MAYBE",
> + "BODY",
> + "BODY_WITH_BLANK_LINE",
> + "PROTO",
> + "DOCBLOCK",
> + "INLINE",
> + ]
So these ... kind of look like enums?
That's kind of it for nits ... I do have one wish that will kind of hard
to grant overall ... for the long-term maintenance of this code, it
would be really nice if every non-trivial regex were described by a
comment explaining what it is trying to do. It's not reasonable to
expect that as a condition for accepting this rewrite, but it sure would
be a nice goal to be working toward.
Thanks,
jon
next prev parent reply other threads:[~2025-02-24 23:39 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-24 9:08 [PATCH v2 00/39] Implement kernel-doc in Python Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 01/39] include/asm-generic/io.h: fix kerneldoc markup Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 02/39] drivers: media: intel-ipu3.h: fix identation on a kernel-doc markup Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 03/39] drivers: firewire: firewire-cdev.h: " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 04/39] docs: driver-api/infiniband.rst: fix Kerneldoc markup Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 05/39] scripts/kernel-doc: don't add not needed new lines Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 06/39] scripts/kernel-doc: drop dead code for Wcontents_before_sections Mauro Carvalho Chehab
2025-03-04 16:52 ` Jonathan Corbet
2025-02-24 9:08 ` [PATCH v2 07/39] scripts/kernel-doc: rename it to scripts/kernel-doc.pl Mauro Carvalho Chehab
2025-02-24 23:23 ` Jonathan Corbet
2025-02-25 6:26 ` Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 08/39] scripts/kernel-doc: add a symlink to the Perl version of kernel-doc Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 09/39] scripts/kernel-doc.py: add a Python parser Mauro Carvalho Chehab
2025-02-24 23:38 ` Jonathan Corbet [this message]
2025-02-25 7:38 ` Mauro Carvalho Chehab
2025-02-25 20:10 ` Jonathan Corbet
2025-02-26 6:56 ` Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 10/39] scripts/kernel-doc.py: output warnings the same way as kerneldoc Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 11/39] scripts/kernel-doc.py: better handle empty sections Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 12/39] scripts/kernel-doc.py: properly handle struct_group macros Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 13/39] scripts/kernel-doc.py: move regex methods to a separate file Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 14/39] scripts/kernel-doc.py: move KernelDoc class " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 15/39] scripts/kernel-doc.py: move KernelFiles " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 16/39] scripts/kernel-doc.py: move output classes " Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 17/39] scripts/kernel-doc.py: convert message output to an interactor Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 18/39] scripts/kernel-doc.py: move file lists to the parser function Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 19/39] scripts/kernel-doc.py: implement support for -no-doc-sections Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 20/39] scripts/kernel-doc.py: fix line number output Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 21/39] scripts/kernel-doc.py: fix handling of doc output check Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 22/39] scripts/kernel-doc.py: properly handle out_section for ReST Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 23/39] scripts/kernel-doc.py: postpone warnings to the output plugin Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 24/39] docs: add a .pylintrc file with sys path for docs scripts Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 25/39] docs: sphinx: kerneldoc: verbose kernel-doc command if V=1 Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 26/39] docs: sphinx: kerneldoc: ignore "\" characters from options Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 27/39] docs: sphinx: kerneldoc: use kernel-doc.py script Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 28/39] scripts/kernel-doc.py: Set an output format for --none Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 29/39] scripts/kernel-doc.py: adjust some coding style issues Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 30/39] scripts/lib/kdoc/kdoc_parser.py: fix Python compat with < v3.13 Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 31/39] scripts/kernel-doc.py: move modulename to man class Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 32/39] scripts/kernel-doc.py: properly handle KBUILD_BUILD_TIMESTAMP Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 33/39] scripts/lib/kdoc/kdoc_parser.py: remove a python 3.9 dependency Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 34/39] scripts/kernel-doc.py: Properly handle Werror and exit codes Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 35/39] scripts/kernel-doc.py: some coding style cleanups Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 36/39] scripts/kernel-doc: switch to use kernel-doc.py Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 37/39] scripts/lib/kdoc/kdoc_files.py: allow filtering output per fname Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 38/39] scripts/kernel_doc.py: better handle exported symbols Mauro Carvalho Chehab
2025-02-24 9:08 ` [PATCH v2 39/39] docs: sphinx: kerneldoc: Use python class if available Mauro Carvalho Chehab
2025-02-24 23:49 ` [PATCH v2 00/39] Implement kernel-doc in Python Jonathan Corbet
2025-02-25 7:54 ` Mauro Carvalho Chehab
2025-02-25 14:33 ` Jonathan Corbet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87v7sy29rh.fsf@trenco.lwn.net \
--to=corbet@lwn.net \
--cc=linux-doc@vger.kernel.org \
--cc=linux-hardening@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mchehab+huawei@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.