From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>,
Alexander Lobakin <aleksander.lobakin@intel.com>,
Kees Cook <kees@kernel.org>,
Mauro Carvalho Chehab <mchehab@kernel.org>,
intel-wired-lan@lists.osuosl.org, linux-doc@vger.kernel.org,
linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
Aleksandr Loktionov <aleksandr.loktionov@intel.com>,
Randy Dunlap <rdunlap@infradead.org>,
Shuah Khan <skhan@linuxfoundation.org>
Subject: Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc
Date: Fri, 13 Mar 2026 11:48:45 +0100 [thread overview]
Message-ID: <20260313114845.53eb8611@localhost> (raw)
In-Reply-To: <352c3f9f8ffd2d031c86a476e532a8ea6ffcf1ed@intel.com>
On Wed, 04 Mar 2026 12:07:45 +0200
Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Mon, 23 Feb 2026, Jonathan Corbet <corbet@lwn.net> wrote:
> > Jani Nikula <jani.nikula@linux.intel.com> writes:
> >
> >> There's always the question, if you're putting a lot of effort into
> >> making kernel-doc closer to an actual C parser, why not put all that
> >> effort into using and adapting to, you know, an actual C parser?
> >
> > Not speaking to the current effort but ... in the past, when I have
> > contemplated this (using, say, tree-sitter), the real problem is that
> > those parsers simply strip out the comments. Kerneldoc without comments
> > ... doesn't work very well. If there were a parser without those
> > problems, and which could be made to do the right thing with all of our
> > weird macro usage, it would certainly be worth considering.
>
> I think e.g. libclang and its Python bindings can be made to work. The
> main problems with that are passing proper compiler options (because
> it'll need to include stuff to know about types etc. because it is a
> proper parser), preprocessing everything is going to take time, you need
> to invest a bunch into it to know how slow exactly compared to the
> current thing and whether it's prohitive, and it introduces an extra
> dependency.
>
> So yeah, there are definitely tradeoffs there. But it's not like this
> constant patching of kernel-doc is exactly burden free either.
On my tests with a simple C tokenizer:
https://lore.kernel.org/linux-doc/cover.1773326442.git.mchehab+huawei@kernel.org/
The tokenizer is working fine and didn't make it much slow: it
increases the time to pass the entire Kernel tree from 37s to 47s
for man pages generation, but should not change much the time for
htmldocs, as right now only ~4 seconds is needed to read files
pointed by Documentation kernel-doc tags and parse them.
The code can still be cleaned up, as there are still some things
hardcoded on the various dump_* functions that could be better
implemented (*).
The advantage of the approach I'm using is that it allows to
gradually migrate to rely at the tokenized code, as it can be done
incrementally.
(*) for instance, __attribute__ and a couple of other macros are parsed
twice at dump_struct() logic, on different places.
> I don't
> know, is it just me, but I'd like to think as a profession we'd be past
> writing ad hoc C parsers by now.
Probably not, but I don't think we need a C parser, as kernel-doc
just needs to understand data types (enum, struct, typedef, union,
vars) and function/macro prototypes.
For such purpose, a tokenizer sounds enough.
Now, there is the code that it is now inside:
https://github.com/mchehab/linux/blob/tokenizer-v5/tools/lib/python/kdoc/xforms_lists.py
which contains a list of C/gcc/clang keywords that will
be ignored, like:
__attribute__
static
extern
inline
Together with a sanitized version of the kernel macros it needs
to handle or ignore:
DECLARE_BITMAP
DECLARE_HASHTABLE
__acquires
__init
__exit
struct_group
...
Once we finish cleaning up kdoc_parser.py to rely only
on it for prototype transformations, this will be the only file
that will require changes when more macros start affecting
kernel-doc.
As this is complex, and may require manual adjustments, it
is probably better to not try to auto-generate xforms list
in runtime. A better approach is, IMO, to have a C pre-processor
code to help periodically update it, like using a target like:
make kdoc-xforms
that would use either cpp or clang to generate a patch to
update xforms_list content after adding new macros that
affect docs generation.
--
Thanks,
Mauro
next prev parent reply other threads:[~2026-03-13 10:48 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-18 10:12 [PATCH 00/38] docs: several improvements to kernel-doc Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 01/38] docs: kdoc_re: add support for groups() Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 02/38] docs: kdoc_re: don't go past the end of a line Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 03/38] docs: kdoc_parser: move var transformers to the beginning Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 04/38] docs: kdoc_parser: don't mangle with function defines Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 05/38] docs: kdoc_parser: add functions support for NestedMatch Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 06/38] docs: kdoc_parser: use NestedMatch to handle __attribute__ on functions Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 07/38] docs: kdoc_parser: fix variable regexes to work with size_t Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 08/38] docs: kdoc_parser: fix the default_value logic for variables Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 09/38] docs: kdoc_parser: add some debug for variable parsing Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 10/38] docs: kdoc_parser: don't exclude defaults from prototype Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 11/38] docs: kdoc_parser: fix parser to support multi-word types Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 12/38] docs: kdoc_parser: ignore context analysis and lock attributes Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 13/38] docs: kdoc_parser: add support for LIST_HEAD Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 14/38] docs: kdoc_parser: handle struct member macro VIRTIO_DECLARE_FEATURES(name) Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 15/38] docs: kdoc_re: properly handle strings and escape chars on it Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 16/38] docs: kdoc_re: better show KernRe() at documentation Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 17/38] docs: kdoc_re: don't recompile NestedMatch regex every time Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 18/38] docs: kdoc_re: Change NestedMath args replacement to \0 Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 19/38] docs: kdoc_re: make NestedMatch use KernRe Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 20/38] docs: kdoc_re: add support on NestedMatch for argument replacement Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 21/38] docs: kdoc_parser: better handle struct_group macros Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 22/38] docs: kdoc_re: fix a parse bug on struct page_pool_params Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 23/38] docs: kdoc_re: add a helper class to declare C function matches Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 24/38] docs: kdoc_parser: use the new CFunction class Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 25/38] docs: kdoc_parser: minimize differences with struct_group_tagged Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 26/38] docs: kdoc_parser: move transform lists to a separate file Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 27/38] docs: kdoc_re: don't remove the trailing ";" with NestedMatch Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 28/38] docs: kdoc_re: prevent adding whitespaces on sub replacements Mauro Carvalho Chehab
2026-02-18 10:12 ` [PATCH 29/38] docs: xforms_lists.py: use CFuntion to handle all function macros Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 30/38] docs: kdoc_files: allows the caller to use a different xforms class Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 31/38] docs: kdoc_re: Fix NestedMatch.sub() which causes PDF builds to break Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 32/38] docs: kdoc_files: document KernelFiles() ABI Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 33/38] docs: kdoc_output: add optional args to ManOutput class Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 34/38] docs: sphinx-build-wrapper: better handle troff .TH markups Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 35/38] docs: kdoc_output: use a more standard order for .TH on man pages Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 36/38] docs: sphinx-build-wrapper: don't allow "/" on file names Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 37/38] docs: kdoc_output: describe the class init parameters Mauro Carvalho Chehab
2026-02-18 10:13 ` [PATCH 38/38] docs: kdoc_output: pick a better default for modulename Mauro Carvalho Chehab
2026-02-21 1:24 ` [PATCH 00/38] docs: several improvements to kernel-doc Randy Dunlap
2026-02-22 1:24 ` Randy Dunlap
2026-02-23 13:47 ` Jani Nikula
2026-02-23 15:02 ` Jonathan Corbet
2026-02-24 13:25 ` [Intel-wired-lan] " Mauro Carvalho Chehab
2026-03-04 10:07 ` Jani Nikula
2026-03-04 12:20 ` [Intel-wired-lan] " Mauro Carvalho Chehab
2026-03-04 22:34 ` Jonathan Corbet
2026-03-13 10:48 ` Mauro Carvalho Chehab [this message]
2026-03-03 14:53 ` Mauro Carvalho Chehab
2026-03-03 15:12 ` Loktionov, Aleksandr
2026-03-03 16:09 ` [Intel-wired-lan] " Mauro Carvalho Chehab
2026-03-04 9:51 ` Jani Nikula
2026-02-23 21:58 ` Jonathan Corbet
2026-03-02 15:54 ` [Intel-wired-lan] " Mauro Carvalho Chehab
2026-03-02 16:14 ` Jonathan Corbet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260313114845.53eb8611@localhost \
--to=mchehab+huawei@kernel.org \
--cc=aleksander.lobakin@intel.com \
--cc=aleksandr.loktionov@intel.com \
--cc=corbet@lwn.net \
--cc=gustavoars@kernel.org \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=jani.nikula@linux.intel.com \
--cc=kees@kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-hardening@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mchehab@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=rdunlap@infradead.org \
--cc=skhan@linuxfoundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox