From: Jonathan Corbet <corbet@lwn.net>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
	"Gustavo A. R. Silva" <mchehab+huawei@kernel.org>,
	Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
	Kees Cook <mchehab+huawei@kernel.org>,
	linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 09/39] scripts/kernel-doc.py: add a Python parser
Date: Mon, 24 Feb 2025 16:38:58 -0700
Message-ID: <87v7sy29rh.fsf@trenco.lwn.net>
In-Reply-To: <3905b7386d5f1bfa76639cdf1108a46f0bccbbea.1740387599.git.mchehab+huawei@kernel.org>

Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:

> Maintaining kernel-doc has been a challenge, as there aren't many
> Perl developers among maintainers. Also, the logic there is too
> complex. Having lots of global variables and using pure functions
> doesn't help.
>
> Rewrite the script in Python, placing most global variables
> inside classes. This should help maintain the script in the long
> term.

[...]

> diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
> new file mode 100755
> index 000000000000..5cf5ed63f215
> --- /dev/null
> +++ b/scripts/kernel-doc.py
> @@ -0,0 +1,2757 @@
> +#!/usr/bin/env python3
> +# pylint: disable=R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R0917,R1702
> +# pylint: disable=C0302,C0103,C0301
> +# pylint: disable=C0116,C0115,W0511,W0613
> +# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
> +# SPDX-License-Identifier: GPL-2.0

The SPDX tag is supposed to be up top, right under the shebang.

I also think you should give consideration to preserving the other
copyright notices in the Perl version.  A language translation doesn't
remove existing copyrights...who knows how much creativity went into
some of those regexes?
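
To make the ordering concrete, here is a minimal sketch (not part of the
patch) of the header layout being asked for, next to the layout as posted.
The helper name spdx_is_second_line() and the placeholder copyright line
are assumptions made up for illustration; the real notices from the Perl
script are not reproduced here.

```python
def spdx_is_second_line(source: str) -> bool:
    """Return True if line 1 is a shebang and line 2 carries the SPDX tag."""
    lines = source.splitlines()
    return (len(lines) >= 2
            and lines[0].startswith("#!")
            and lines[1].startswith("# SPDX-License-Identifier:"))


# Suggested ordering. The last line is a placeholder, not a real notice.
SUGGESTED = """\
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
# <copyright notices carried over from the Perl version>
"""

# Ordering as posted (only one of the pylint lines kept, for brevity).
AS_POSTED = """\
#!/usr/bin/env python3
# pylint: disable=C0302,C0103,C0301
# Copyright(c) 2025: Mauro Carvalho Chehab <mchehab@kernel.org>.
# SPDX-License-Identifier: GPL-2.0
"""

print(spdx_is_second_line(SUGGESTED))   # suggested layout passes the check
print(spdx_is_second_line(AS_POSTED))   # posted layout does not
```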

> +# TODO: implement warning filtering
> +
> +"""
> +kernel_doc
> +==========
> +
> +Print formatted kernel documentation to stdout
> +
> +Read C language source or header FILEs, extract embedded
> +documentation comments, and print formatted documentation
> +to standard output.
> +
> +The documentation comments are identified by the "/**"
> +opening comment mark.
> +
> +See Documentation/doc-guide/kernel-doc.rst for the
> +documentation comment syntax.
> +"""
> +
> +import argparse
> +import logging
> +import os
> +import re
> +import sys
> +
> +from datetime import datetime
> +from pprint import pformat
> +
> +from dateutil import tz
> +
> +# Local cache for regular expressions
> +re_cache = {}
> +
> +
> +class Re:

So I have to say this bugs me a bit ... the class is fine, but the
one-letter case-only difference from the standard "re" module is just
going to make the code harder for others to approach.  "kern_re" or
something like that?  Or even "kre" if you really want it to be as short
as possible.

> +    """
> +    Helper class to simplify regex declaration and usage.
> +
> +    It calls re.compile for a given pattern. It also allows adding
> +    regular expressions and defining sub at class init time.
> +
> +    Regular expressions can be cached via an argument, helping to speed up
> +    searches.
> +    """
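
Going only by that docstring, a renamed helper might look roughly like the
sketch below. This is an illustration, not the class from the patch: the
name KernRe, the cache and flags parameters, and the _kern_re_cache
dictionary are all assumptions, and only the compile-plus-cache behaviour
the docstring describes is modelled.

```python
import re

# Module-level cache, playing the role of the patch's re_cache dictionary.
_kern_re_cache = {}


class KernRe:
    """Thin wrapper around re.compile() with optional caching."""

    def __init__(self, pattern, cache=True, flags=0):
        key = (pattern, flags)
        if cache and key in _kern_re_cache:
            # Reuse the already-compiled pattern.
            self.regex = _kern_re_cache[key]
        else:
            self.regex = re.compile(pattern, flags)
            if cache:
                _kern_re_cache[key] = self.regex

    def search(self, string):
        """Same semantics as re.Pattern.search()."""
        return self.regex.search(string)

    def sub(self, repl, string, count=0):
        """Same semantics as re.Pattern.sub()."""
        return self.regex.sub(repl, string, count=count)


# Usage: the name no longer differs from the "re" module by case alone.
spaces = KernRe(r"\s+")
print(spaces.sub(" ", "int   foo(void)"))
```

With a distinct name, a reader who sees KernRe(...) next to re.compile(...)
can tell at a glance which one is the local helper.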

[...]

> +
> +class KernelDoc:
> +    # Parser states
> +    STATE_NORMAL        = 0        # normal code
> +    STATE_NAME          = 1        # looking for function name
> +    STATE_BODY_MAYBE    = 2        # body - or maybe more description
> +    STATE_BODY          = 3        # the body of the comment
> +    STATE_BODY_WITH_BLANK_LINE = 4 # the body which has a blank line
> +    STATE_PROTO         = 5        # scanning prototype
> +    STATE_DOCBLOCK      = 6        # documentation block
> +    STATE_INLINE        = 7        # gathering doc outside main block
> +
> +    st_name = [
> +        "NORMAL",
> +        "NAME",
> +        "BODY_MAYBE",
> +        "BODY",
> +        "BODY_WITH_BLANK_LINE",
> +        "PROTO",
> +        "DOCBLOCK",
> +        "INLINE",
> +    ]

So these ... kind of look like enums?
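
A minimal sketch of what that could look like with the standard enum
module, assuming the same names and values as the constants quoted above.
The class name State is made up for illustration. Each member carries its
own name, so the parallel st_name list would no longer be needed.

```python
from enum import IntEnum


class State(IntEnum):
    """Parser states, with the same values as the STATE_* constants."""
    NORMAL = 0                # normal code
    NAME = 1                  # looking for function name
    BODY_MAYBE = 2            # body - or maybe more description
    BODY = 3                  # the body of the comment
    BODY_WITH_BLANK_LINE = 4  # the body which has a blank line
    PROTO = 5                 # scanning prototype
    DOCBLOCK = 6              # documentation block
    INLINE = 7                # gathering doc outside main block


state = State.BODY_MAYBE
label = state.name            # replaces a lookup such as st_name[state]
print(label, int(state))
```

IntEnum is used here so that members still compare equal to the plain
integers; a plain Enum would work too if nothing depends on the numeric
values.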

That's kind of it for nits ... I do have one wish that will be kind of
hard to grant overall ... for the long-term maintenance of this code, it
would be really nice if every non-trivial regex were described by a
comment explaining what it is trying to do.  It's not reasonable to
expect that as a condition for accepting this rewrite, but it sure would
be a nice goal to be working toward.
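
As one possible shape for such comments, here is a small sketch using
re.VERBOSE, which lets the explanation sit inside the pattern itself. The
regex is an illustrative guess at matching the "/**" opening mark that the
module docstring mentions; it is not copied from the patch, and the name
DOC_START is an assumption.

```python
import re

# Matches a line that opens a kernel-doc comment: "/**" at the start of
# the line, followed by nothing but optional whitespace.
DOC_START = re.compile(r"""
    ^/\*\*      # the literal "/**" opening comment mark
    \s*$        # only whitespace may follow on the line
""", re.VERBOSE)

for line in ("/**", "/* plain comment */", "/*** banner ***/"):
    print(repr(line), bool(DOC_START.match(line)))
```

A plain comment line above a one-line pattern does the same job; the
verbose form mostly pays off for the longer expressions.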

Thanks,

jon
