From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Jonathan Corbet <corbet@lwn.net>,
Linux Doc Mailing List <linux-doc@vger.kernel.org>,
Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org,
Shuah Khan <skhan@linuxfoundation.org>
Subject: [PATCH 3/9] docs: maintainers_include.py: split state machine on multiple funcs
Date: Mon, 4 May 2026 17:51:12 +0200
Message-ID: <7cdfae61b68c7613663ddd528020f6b4a4ccf8ec.1777908711.git.mchehab+huawei@kernel.org>
In-Reply-To: <cover.1777908711.git.mchehab+huawei@kernel.org>

Instead of keeping all the code in one big __init__, split
MaintainersParser so that the state machine remains in __init__,
while the actual parsers for descriptions and subsystems are moved
to separate functions.

To make the parser simpler, instead of storing parsed results in a
list, append them directly to a string.

That gave a 15% performance increase (*) with Python 3.14 and
made the logic simpler.

(*) Measured by creating a new directory under Documentation/,
    placing just maintainers.rst and an index file there, and
    building it via sphinx-build-wrapper.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
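The shape of the split described above can be sketched as follows. This
is a minimal, hypothetical illustration and not the code from
maintainers_include.py: TinyParser, its in_descriptions/in_subsystems
flags and the trivial per-line formatting are made-up names and
simplifications. It only shows the loop in __init__ choosing a state,
per-state methods doing the work, and results being appended to a
single string.

```python
class TinyParser:
    """Toy parser: __init__ only tracks state; handlers format lines."""

    def __init__(self, lines):
        # Results are accumulated directly on a string, not on a list.
        self.output = ""
        self.in_descriptions = False
        self.in_subsystems = False
        prev = ""

        for line in lines:
            # Dispatch to the handler for the current state.
            if self.in_descriptions:
                self.parse_descriptions(line)
            elif self.in_subsystems:
                self.parse_subsystems(line)
            else:
                self.output += line + "\n"

            # Heading underlines drive the state transitions.
            if line.startswith("-----"):
                if prev.startswith("Descriptions"):
                    self.in_descriptions = True
                elif prev.startswith("Maintainers"):
                    self.in_subsystems = True
            prev = line

        self.output = self.output.rstrip()

    def parse_descriptions(self, line):
        """Emit description lines as a line block until the next heading."""
        if line.startswith("Maintainers"):
            self.in_descriptions = False
            self.output += "\n" + line + "\n"
            return
        self.output += "| " + line + "\n"

    def parse_subsystems(self, line):
        """Emit non-empty subsystem lines in title case."""
        if line:
            self.output += line.title() + "\n"
```

Feeding it a list of already-stripped lines and reading the output
attribute afterwards is enough to exercise both handlers and both
state transitions.
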
Documentation/sphinx/maintainers_include.py | 299 +++++++++++---------
1 file changed, 159 insertions(+), 140 deletions(-)
diff --git a/Documentation/sphinx/maintainers_include.py b/Documentation/sphinx/maintainers_include.py
index e679acf0633d..8867ecc0aad3 100755
--- a/Documentation/sphinx/maintainers_include.py
+++ b/Documentation/sphinx/maintainers_include.py
@@ -47,168 +47,187 @@ class MaintainersParser:
self.profile_toc = set()
self.profile_entries = {}
- result = list()
- result.append(".. _maintainers:")
- result.append("")
+ self.output = ".. _maintainers:\n\n"
# Poor man's state machine.
- descriptions = False
- maintainers = False
- subsystems = False
+ self.descriptions = False
+ self.maintainers = False
+ self.subsystems = False
# Field letter to field name mapping.
- field_letter = None
- fields = dict()
+ self.field_letter = None
+ self.fields = dict()
+
+ self.field_prev = ""
+ self.field_content = ""
+ self.subsystem_name = None
+
+ self.app_dir = app_dir
+ self.base_dir, self.doc_dir, self.sphinx_dir = app_dir.partition("Documentation")
+
+ self.re_doc = re.compile(r'(Documentation/([^\s\?\*]*)\.rst)')
prev = None
- field_prev = ""
- field_content = ""
- subsystem_name = None
-
- base_dir, doc_dir, sphinx_dir = app_dir.partition("Documentation")
-
for line in open(path):
- # Have we reached the end of the preformatted Descriptions text?
- if descriptions and line.startswith('Maintainers'):
- descriptions = False
- # Ensure a blank line following the last "|"-prefixed line.
- result.append("")
-
- # Start subsystem processing? This is to skip processing the text
- # between the Maintainers heading and the first subsystem name.
- if maintainers and not subsystems:
+ if self.descriptions:
+ self.parse_descriptions(line)
+ elif self.maintainers and not self.subsystems:
if re.search('^[A-Z0-9]', line):
- subsystems = True
-
- # Drop needless input whitespace.
- line = line.rstrip()
-
- #
- # Handle profile entries - either as files or as https refs
- #
- match = re.match(rf"P:\s*({doc_dir})(/\S+)\.rst", line)
- if match:
- name = "".join(match.groups())
- entry = os.path.relpath(base_dir + name, app_dir)
-
- full_name = os.path.join(base_dir, name)
- path = os.path.relpath(full_name, app_dir)
- #
- # When SPHINXDIRS is used, it will try to reference files
- # outside srctree, causing warnings. To avoid that, point
- # to the latest official documentation
- #
- if path.startswith("../"):
- entry = KERNELDOC_URL + match.group(2) + ".html"
+ self.subsystems = True
+ self.parse_subsystems(line)
else:
- entry = "/" + entry
-
- if "*" in entry:
- for e in glob(entry):
- self.profile_toc.add(e)
- self.profile_entries[subsystem_name] = e
- else:
- self.profile_toc.add(entry)
- self.profile_entries[subsystem_name] = entry
- else:
- match = re.match(r"P:\s*(https?://.*)", line)
- if match:
- entry = match.group(1).strip()
- self.profile_entries[subsystem_name] = entry
-
- # Linkify all non-wildcard refs to ReST files in Documentation/.
- pat = r'(Documentation/([^\s\?\*]*)\.rst)'
- m = re.search(pat, line)
- if m:
- # maintainers.rst is in a subdirectory, so include "../".
- line = re.sub(pat, ':doc:`%s <../%s>`' % (m.group(2), m.group(2)), line)
-
- # Check state machine for output rendering behavior.
- output = None
- if descriptions:
- # Escape the escapes in preformatted text.
- output = "| %s" % (line.replace("\\", "\\\\")
- .replace("**", "\\**"))
- # Look for and record field letter to field name mappings:
- # R: Designated *reviewer*: FullName <address@domain>
- m = re.search(r"\s(\S):\s", line)
- if m:
- field_letter = m.group(1)
- if field_letter and not field_letter in fields:
- m = re.search(r"\*([^\*]+)\*", line)
- if m:
- fields[field_letter] = m.group(1)
- elif subsystems:
- # Skip empty lines: subsystem parser adds them as needed.
- if len(line) == 0:
- continue
- # Subsystem fields are batched into "field_content"
- if line[1] != ':':
- # Render a subsystem entry as:
- # SUBSYSTEM NAME
- # ~~~~~~~~~~~~~~
-
- # Flush pending field content.
- output = field_content + "\n\n"
- field_content = ""
-
- subsystem_name = line.title()
-
- # Collapse whitespace in subsystem name.
- heading = re.sub(r"\s+", " ", line)
- output = output + "%s\n%s" % (heading, "~" * len(heading))
- field_prev = ""
- else:
- # Render a subsystem field as:
- # :Field: entry
- # entry...
- field, details = line.split(':', 1)
- details = details.strip()
-
- # Mark paths (and regexes) as literal text for improved
- # readability and to escape any escapes.
- if field in ['F', 'N', 'X', 'K']:
- # But only if not already marked :)
- if not ':doc:' in details:
- details = '``%s``' % (details)
-
- # Comma separate email field continuations.
- if field == field_prev and field_prev in ['M', 'R', 'L']:
- field_content = field_content + ","
-
- # Do not repeat field names, so that field entries
- # will be collapsed together.
- if field != field_prev:
- output = field_content + "\n"
- field_content = ":%s:" % (fields.get(field, field))
- field_content = field_content + "\n\t%s" % (details)
- field_prev = field
+ self.output += line
+ elif self.subsystems:
+ self.parse_subsystems(line)
else:
- output = line
-
- # Re-split on any added newlines in any above parsing.
- if output != None:
- for separated in output.split('\n'):
- result.append(separated)
+ self.output += line
# Update the state machine when we find heading separators.
if line.startswith('----------'):
if prev.startswith('Descriptions'):
- descriptions = True
+ self.descriptions = True
if prev.startswith('Maintainers'):
- maintainers = True
+ self.maintainers = True
# Retain previous line for state machine transitions.
prev = line
# Flush pending field contents.
- if field_content != "":
- for separated in field_content.split('\n'):
- result.append(separated)
+ if self.field_content:
+ self.output += self.field_content + "\n\n"
- self.output = "\n".join(result)
+ self.output = self.output.rstrip()
+
+ def parse_descriptions(self, line):
+ """Handle contents of the descriptions section."""
+
+ # Have we reached the end of the preformatted Descriptions text?
+ if line.startswith('Maintainers'):
+ self.descriptions = False
+ self.output += "\n" + line
+ return
+
+ # Linkify all non-wildcard refs to ReST files in Documentation/.
+ m = self.re_doc.search(line)
+ if m:
+ # maintainers.rst is in a subdirectory, so include "../".
+ line = self.re_doc.sub(':doc:`%s <../%s>`' % (m.group(2), m.group(2)), line)
+
+ # Escape the escapes in preformatted text.
+ output = "| %s" % (line.replace("\\", "\\\\")
+ .replace("**", "\\**"))
+
+ # Look for and record field letter to field name mappings:
+ # R: Designated *reviewer*: FullName <address@domain>
+ m = re.search(r"\s(\S):\s", line)
+ if m:
+ self.field_letter = m.group(1)
+
+ if self.field_letter and self.field_letter not in self.fields:
+ m = re.search(r"\*([^\*]+)\*", line)
+ if m:
+ self.fields[self.field_letter] = m.group(1)
+
+ # Append parsed content to self.output
+ self.output += output
+
+ def parse_subsystems(self, line):
+ """Handle contents of the per-subsystem sections."""
+
+ # Drop needless input whitespace.
+ line = line.rstrip()
+
+ #
+ # Handle profile entries - either as files or as https refs
+ #
+ match = re.match(rf"P:\s*({self.doc_dir})(/\S+)\.rst", line)
+ if match:
+ name = "".join(match.groups())
+ entry = os.path.relpath(self.base_dir + name, self.app_dir)
+
+ full_name = os.path.join(self.base_dir, name)
+ path = os.path.relpath(full_name, self.app_dir)
+ #
+ # When SPHINXDIRS is used, it will try to reference files
+ # outside srctree, causing warnings. To avoid that, point
+ # to the latest official documentation
+ #
+ if path.startswith("../"):
+ entry = KERNELDOC_URL + match.group(2) + ".html"
+ else:
+ entry = "/" + entry
+
+ if "*" in entry:
+ for e in glob(entry):
+ self.profile_toc.add(e)
+ self.profile_entries[self.subsystem_name] = e
+ else:
+ self.profile_toc.add(entry)
+ self.profile_entries[self.subsystem_name] = entry
+ else:
+ match = re.match(r"P:\s*(https?://.*)", line)
+ if match:
+ entry = match.group(1).strip()
+ self.profile_entries[self.subsystem_name] = entry
+
+ # Linkify all non-wildcard refs to ReST files in Documentation/.
+ m = self.re_doc.search(line)
+ if m:
+ # maintainers.rst is in a subdirectory, so include "../".
+ line = self.re_doc.sub(':doc:`%s <../%s>`' % (m.group(2), m.group(2)), line)
+
+ # Check state machine for output rendering behavior.
+ output = None
+ if self.subsystems:
+ # Skip empty lines: subsystem parser adds them as needed.
+ if len(line) == 0:
+ return
+ # Subsystem fields are batched into "field_content"
+ if line[1] != ':':
+ # Render a subsystem entry as:
+ # SUBSYSTEM NAME
+ # ~~~~~~~~~~~~~~
+ # Flush pending field content.
+ output = self.field_content + "\n\n"
+ self.field_content = ""
+
+ self.subsystem_name = line.title()
+
+ # Collapse whitespace in subsystem name.
+ heading = re.sub(r"\s+", " ", line)
+ output = output + "%s\n%s" % (heading, "~" * len(heading))
+ self.field_prev = ""
+ else:
+ # Render a subsystem field as:
+ # :Field: entry
+ # entry...
+ field, details = line.split(':', 1)
+ details = details.strip()
+
+ # Mark paths (and regexes) as literal text for improved
+ # readability and to escape any escapes.
+ if field in ['F', 'N', 'X', 'K']:
+ # But only if not already marked :)
+ if not ':doc:' in details:
+ details = '``%s``' % (details)
+
+ # Comma separate email field continuations.
+ if field == self.field_prev and self.field_prev in ['M', 'R', 'L']:
+ self.field_content = self.field_content + ","
+
+ # Do not repeat field names, so that field entries
+ # will be collapsed together.
+ if field != self.field_prev:
+ output = self.field_content + "\n"
+ self.field_content = ":%s:" % (self.fields.get(field, field))
+ self.field_content = self.field_content + "\n\t%s" % (details)
+ self.field_prev = field
+ elif not self.descriptions:
+ output = line
+
+ if output is not None:
+ self.output += output + "\n"
- # Create a TOC class
class MaintainersInclude(Include):
"""MaintainersInclude (``maintainers-include``) directive"""
--
2.54.0