linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Akira Yokosawa <akiyks@gmail.com>
Subject: Re: [PATCH 10/12] docs: kdoc: further rewrite_struct_members() cleanup
Date: Mon, 4 Aug 2025 15:15:11 +0200	[thread overview]
Message-ID: <20250804151511.73ffb949@foz.lan> (raw)
In-Reply-To: <87v7n6pscu.fsf@trenco.lwn.net>

Em Fri, 01 Aug 2025 16:52:33 -0600
Jonathan Corbet <corbet@lwn.net> escreveu:

> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:
> 
> > Em Thu, 31 Jul 2025 18:13:24 -0600
> > Jonathan Corbet <corbet@lwn.net> escreveu:
> >  
> >> Get rid of some single-use variables and redundant checks, and generally
> >> tighten up the code; no logical change.
> >> 
> >> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
> >> ---
> >>  scripts/lib/kdoc/kdoc_parser.py | 89 ++++++++++++++++-----------------
> >>  1 file changed, 42 insertions(+), 47 deletions(-)
> >> 
> >> diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
> >> index 20e0a2abe13b..2b7d7e646367 100644
> >> --- a/scripts/lib/kdoc/kdoc_parser.py
> >> +++ b/scripts/lib/kdoc/kdoc_parser.py
> >> @@ -673,73 +673,68 @@ class KernelDoc:
> >>          while tuples:
> >>              for t in tuples:
> >>                  newmember = ""
> >> -                maintype = t[0]
> >> -                _ids = t[5]s
> >> -                content = t[3]  
> >
> > The reason I opted for this particular approach...  
> >> -
> >> -                oldmember = "".join(t)
> >> -
> >> -                for s_id in s_ids.split(','):
> >> +                oldmember = "".join(t) # Reconstruct the original formatting
> >> +                #
> >> +                # Pass through each field name, normalizing the form and formatting.
> >> +                #
> >> +                for s_id in t[5].split(','):  
> >
> > ... is that it is easier to understand and to maintain:
> >
> > 	for s_id in s_ids.split(','):
> >
> > than when magic numbers like this are used:
> >
> > 	for s_id in t[5].split(','):  
> 
> Coming into this code, I had a different experience, and found the
> variables to just be a layer of indirection I had to pass through to get
> to the capture groups and see what was really going on.  That was part
> of why I put the group numbers in the comments next to that gnarly
> regex, to make that mapping more direct and easier to understand.
> 
> I will not insist on this change either - at least not indefinitely :)
> I do feel, though, that adding a step between the regex and its use just
> serves to obscure things.
> 
> (And yes, I don't really think that named groups make things better.
> I've found those useful in situations where multiple regexes are in use
> and the ordering of the groups may vary, but otherwise have generally
> avoided them).

I'd say that, when the magic number is within up to 3-lines hunk
distance - e.g. if all of them will appear at the same hunk, it is 
probably safe to use, but when it gets far away, it makes more harm 
than good.

Perhaps one alternative would do something like:

	tuples = struct_members.findall(members)
        if not tuples:
            break

	maintype, -, -, content, -, s_ids = tuples

(assuming that we don't need t[1], t[2] and t[4] here)

Btw, on this specific case, better to use non-capture group matches
to avoid those "empty" spaces, e.g. (if I got it right):

	# Curly Brackets are not captured
        struct_members = KernRe(type_pattern +	        # Capture main type
				r'([^\{\};]+)' +
				r'(?:\{)' +
				r'(?:[^\{\}]*)' +	# Capture content
				r'(?:\})' +
				r'([^\{\}\;]*)(\;)')	# Capture IDs
	...
	tuples = struct_members.findall(members)
        if not tuples:
            break

	maintype, content, s_ids = tuples

Btw, a cleanup like the above is, IMHO, another good reason why not using
magic numbers: people may end fixing match groups to use non-capture
matches, but end forgetting to fix some hidden magic numbers. It is hard
for a reviewer to see it if the affected magic numbers aren't within the
3 lines range of a default unified diff, and may introduce hidden bugs.

Thanks,
Mauro

  reply	other threads:[~2025-08-04 13:15 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-01  0:13 [PATCH 00/12] docs: kdoc: thrash up dump_struct() Jonathan Corbet
2025-08-01  0:13 ` [PATCH 01/12] docs: kdoc: consolidate the stripping of private struct/union members Jonathan Corbet
2025-08-01  5:29   ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 02/12] docs: kdoc: Move a regex line in dump_struct() Jonathan Corbet
2025-08-01  5:29   ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 03/12] docs: kdoc: backslashectomy in kdoc_parser Jonathan Corbet
2025-08-01  4:27   ` Mauro Carvalho Chehab
2025-08-01 14:21     ` Jonathan Corbet
2025-08-04 12:58       ` Mauro Carvalho Chehab
2025-08-04 16:00         ` Mauro Carvalho Chehab
2025-08-04 18:29           ` Jonathan Corbet
2025-08-01  0:13 ` [PATCH 04/12] docs: kdoc: move the prefix transforms out of dump_struct() Jonathan Corbet
2025-08-01  5:28   ` Mauro Carvalho Chehab
2025-08-01  5:35     ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 05/12] docs: kdoc: split top-level prototype parsing " Jonathan Corbet
2025-08-01  5:34   ` Mauro Carvalho Chehab
2025-08-01 14:10     ` Jonathan Corbet
2025-08-04 12:20       ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 06/12] docs: kdoc: split struct-member rewriting " Jonathan Corbet
2025-08-01  5:37   ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 07/12] docs: kdoc: rework the rewrite_struct_members() main loop Jonathan Corbet
2025-08-01  5:42   ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 08/12] docs: kdoc: remove an extraneous strip() call Jonathan Corbet
2025-08-01  5:45   ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 09/12] docs: kdoc: Some rewrite_struct_members() commenting Jonathan Corbet
2025-08-01  5:50   ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 10/12] docs: kdoc: further rewrite_struct_members() cleanup Jonathan Corbet
2025-08-01  6:07   ` Mauro Carvalho Chehab
2025-08-01 22:52     ` Jonathan Corbet
2025-08-04 13:15       ` Mauro Carvalho Chehab [this message]
2025-08-05 22:46         ` Jonathan Corbet
2025-08-06  9:05           ` Mauro Carvalho Chehab
2025-08-06 13:00             ` Jonathan Corbet
2025-08-06 21:27               ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 11/12] docs: kdoc: extract output formatting from dump_struct() Jonathan Corbet
2025-08-01  6:09   ` Mauro Carvalho Chehab
2025-08-01  0:13 ` [PATCH 12/12] docs: kdoc: a few final dump_struct() touches Jonathan Corbet
2025-08-01  6:10   ` Mauro Carvalho Chehab
2025-08-01  6:23 ` [PATCH 00/12] docs: kdoc: thrash up dump_struct() Mauro Carvalho Chehab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250804151511.73ffb949@foz.lan \
    --to=mchehab+huawei@kernel.org \
    --cc=akiyks@gmail.com \
    --cc=corbet@lwn.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).