linux-doc.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] docs: Make automarkup ready for Sphinx 3.1+
@ 2020-10-08 13:54 Nícolas F. R. A. Prado
  2020-10-09  5:53 ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 10+ messages in thread
From: Nícolas F. R. A. Prado @ 2020-10-08 13:54 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, linux-doc
  Cc: Jonathan Corbet, linux-kernel, lkcamp, andrealmeid

On Thu Oct 8, 2020 at 2:27 AM -03, Mauro Carvalho Chehab wrote:
>
> Hi Nícolas,
>
> Em Wed, 07 Oct 2020 23:12:25 +0000
> Nícolas F. R. A. Prado <nfraprado@protonmail.com> escreveu:
>
> > While Sphinx 2 used a single c:type role for struct, union, enum and
> > typedef, Sphinx 3 uses a specific role for each one.
> > To keep backward compatibility, detect the Sphinx version and use the
> > correct roles for that version.
> >
> > Also, Sphinx 3 is more strict with its C domain and generated warnings,
> > exposing issues in the parsing.
> > To fix the warnings, make the C regexes use ASCII, ensure the
> > expressions only match the beginning of words and skip trying to
> > cross-reference C reserved words.
> >
> > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@protonmail.com>
> > ---
> >
> > Hi,
> >
> > after Mauro's series making everything ready for Sphinx 3.1, only the automarkup
> > was left to be ported.
> > This patch makes the needed changes to automarkup so that we can soon flip the
> > switch to Sphinx 3.1.
> >
> > This change was tested both with Sphinx 2.4.4 and Sphinx 3.1.
> >
> > This change doesn't add any warnings to the Documentation build.
> > I tested it with Mauro's series but later rebased it to docs-next, and it can be
> > accepted independently of that series.
> >
> > I ended up doing more than one thing in this single patch, but since it was all
> > changing the same lines and for the same purpose, I felt it would be better to
> > keep it as a single commit.
> >
>
> Thanks for doing this! That was the last missing part on fully
> supporting
> Sphinx 3.1+.
>
> > Mauro,
> > if this patch is ok, the 3rd patch in your series, which disables automarkup for
> > sphinx 3, should be dropped.
>
> Yeah, sure.
>
> > Although I'm not sure what the implications of your patches adding namespaces
> > and using the c:macro for functions are.
>
> With regards to namespaces:
>
> Currently, only the media docs use namespaces, and it declares it at the
> beginning of each file that needs it, without overriding it later[1].
>
> [1] btw, the cdomain.py backward compat code doesn't support namespace
> changes - as it parses namespaces before handling the C domain tags.
> I doubt that we'll need to have a single .rst file using more than
> one namespace anyway.
>
> The main usage is to avoid conflicts for uAPI documentation for
> syscalls - actually for libc userspace wrappers to syscalls. It
> documents
> things like: open, close, read, write, ioctl, poll, select.

If it's mainly for that, I think automarkup could skip handling namespaces.
From automarkup.py:

#
# Many places in the docs refer to common system calls.  It is
# pointless to try to cross-reference them and, as has been known
# to happen, somebody defining a function by these names can lead
# to the creation of incorrect and confusing cross references.  So
# just don't even try with these names.
#
Skipfuncs = [ 'open', 'close', 'read', 'write', 'fcntl', 'mmap',
              'select', 'poll', 'fork', 'execve', 'clone', 'ioctl',
	      'socket' ]

So unless I'm confusing things and the namespaces actually sidestep that issue,
the namespace handling could be left out of automarkup.

>
> I'm not sure if the automarkup should be aware of it, or if the c.py
> code
> at Sphinx 3.x will add the namespace automatically, but I suspect that
> automarkup will need to handle it as well.
>
> One file you could use for checking it is this one:
>
> Documentation/userspace-api/media/v4l/hist-v4l2.rst
>
> It contains a namespace directive and documents what changed without
> using any explicit reference (after my patch series + linux-next).
>
> With regards to c:macro vs c:function:
>
> I suspect that automarkup should test both when trying to do
> cross-references for function-like calls. E. g. test first if
> there is a :c:function, falling back to check for :c:macro.
>
> I would add a "sphinx3_c_func_ref" function that would handle
> such special case, e. g. something like:
>
> markup_func_sphinx3 = {RE_doc: markup_doc_ref,
> RE_function: sphinx3_c_func_ref,
> RE_struct: markup_c_ref,
> RE_union: markup_c_ref,
> RE_enum: markup_c_ref,
> RE_typedef: markup_c_ref}

Sounds good.

I'll make this patch into a series and add that function/macro handling as a new
patch, and the namespace handling depending on your answer on the above comment,
for v2.

>
> > All I did here was use the specific roles for sphinx 3 and fix the warnings, but
> > that was enough to get correct cross-references even after your series.
> >
> > Thanks,
> > Nícolas
>
> >
> >
> >  Documentation/sphinx/automarkup.py | 69 ++++++++++++++++++++++++++----
> >  1 file changed, 60 insertions(+), 9 deletions(-)
> >
> > diff --git a/Documentation/sphinx/automarkup.py b/Documentation/sphinx/automarkup.py
> > index a1b0f554cd82..fd1e927408ad 100644
> > --- a/Documentation/sphinx/automarkup.py
> > +++ b/Documentation/sphinx/automarkup.py
> > @@ -22,13 +22,34 @@ from itertools import chain
> >  # :c:func: block (i.e. ":c:func:`mmap()`s" flakes out), so the last
> >  # bit tries to restrict matches to things that won't create trouble.
> >  #
> > -RE_function = re.compile(r'(([\w_][\w\d_]+)\(\))')
> > -RE_type = re.compile(r'(struct|union|enum|typedef)\s+([\w_][\w\d_]+)')
> > +RE_function = re.compile(r'\b(([a-zA-Z_]\w+)\(\))', flags=re.ASCII)
> > +
> > +#
> > +# Sphinx 2 uses the same :c:type role for struct, union, enum and typedef
> > +#
> > +RE_generic_type = re.compile(r'\b(struct|union|enum|typedef)\s+([a-zA-Z_]\w+)',
> > +                             flags=re.ASCII)
> > +
> > +#
> > +# Sphinx 3 uses a different C role for each one of struct, union, enum and
> > +# typedef
> > +#
> > +RE_struct = re.compile(r'\b(struct)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> > +RE_union = re.compile(r'\b(union)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> > +RE_enum = re.compile(r'\b(enum)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> > +RE_typedef = re.compile(r'\b(typedef)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
> > +
> >  #
> >  # Detects a reference to a documentation page of the form Documentation/... with
> >  # an optional extension
> >  #
> > -RE_doc = re.compile(r'Documentation(/[\w\-_/]+)(\.\w+)*')
> > +RE_doc = re.compile(r'\bDocumentation(/[\w\-_/]+)(\.\w+)*')
> > +
> > +#
> > +# Reserved C words that we should skip when cross-referencing
> > +#
> > +Skipnames = [ 'for', 'if', 'register', 'sizeof', 'struct', 'unsigned' ]
> > +
> >
> >  #
> >  # Many places in the docs refer to common system calls.  It is
> > @@ -48,9 +69,22 @@ def markup_refs(docname, app, node):
> >      #
> >      # Associate each regex with the function that will markup its matches
> >      #
> > -    markup_func = {RE_type: markup_c_ref,
> > -                   RE_function: markup_c_ref,
> > -                   RE_doc: markup_doc_ref}
> > +    markup_func_sphinx2 = {RE_doc: markup_doc_ref,
> > +                           RE_function: markup_c_ref,
> > +                           RE_generic_type: markup_c_ref}
> > +
> > +    markup_func_sphinx3 = {RE_doc: markup_doc_ref,
> > +                           RE_function: markup_c_ref,
> > +                           RE_struct: markup_c_ref,
> > +                           RE_union: markup_c_ref,
> > +                           RE_enum: markup_c_ref,
> > +                           RE_typedef: markup_c_ref}
> > +
> > +    if sphinx.__version__[0] == '3':
> > +        markup_func = markup_func_sphinx3
> > +    else:
> > +        markup_func = markup_func_sphinx2
> > +
> >      match_iterators = [regex.finditer(t) for regex in markup_func]
> >      #
> >      # Sort all references by the starting position in text
> > @@ -79,8 +113,24 @@ def markup_refs(docname, app, node):
> >  # type_name) with an appropriate cross reference.
> >  #
> >  def markup_c_ref(docname, app, match):
> > -    class_str = {RE_function: 'c-func', RE_type: 'c-type'}
> > -    reftype_str = {RE_function: 'function', RE_type: 'type'}
> > +    class_str = {RE_function: 'c-func',
> > +                 # Sphinx 2 only
> > +                 RE_generic_type: 'c-type',
> > +                 # Sphinx 3+ only
> > +                 RE_struct: 'c-struct',
> > +                 RE_union: 'c-union',
> > +                 RE_enum: 'c-enum',
> > +                 RE_typedef: 'c-type',
> > +                 }
> > +    reftype_str = {RE_function: 'function',
> > +                   # Sphinx 2 only
> > +                   RE_generic_type: 'type',
> > +                   # Sphinx 3+ only
> > +                   RE_struct: 'struct',
> > +                   RE_union: 'union',
> > +                   RE_enum: 'enum',
> > +                   RE_typedef: 'type',
> > +                   }
> >
> >      cdom = app.env.domains['c']
> >      #
> > @@ -89,7 +139,8 @@ def markup_c_ref(docname, app, match):
> >      target = match.group(2)
> >      target_text = nodes.Text(match.group(0))
> >      xref = None
> > -    if not (match.re == RE_function and target in Skipfuncs):
> > +    if not ((match.re == RE_function and target in Skipfuncs)
> > +            or (target in Skipnames)):
> >          lit_text = nodes.literal(classes=['xref', 'c', class_str[match.re]])
> >          lit_text += target_text
> >          pxref = addnodes.pending_xref('', refdomain = 'c',
>
>
>
> Thanks,
> Mauro

Thanks,
Nícolas


^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] docs: Make automarkup ready for Sphinx 3.1+
@ 2020-10-08  2:15 Nícolas F. R. A. Prado
  2020-10-08  2:47 ` Matthew Wilcox
  0 siblings, 1 reply; 10+ messages in thread
From: Nícolas F. R. A. Prado @ 2020-10-08  2:15 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jonathan Corbet, Mauro Carvalho Chehab, linux-doc, linux-kernel,
	lkcamp, andrealmeid

On Wed Oct 7, 2020 at 8:40 PM -03, Matthew Wilcox wrote:
>
> On Wed, Oct 07, 2020 at 11:12:25PM +0000, Nícolas F. R. A. Prado wrote:
> > While Sphinx 2 used a single c:type role for struct, union, enum and
> > typedef, Sphinx 3 uses a specific role for each one.
> > To keep backward compatibility, detect the Sphinx version and use the
> > correct roles for that version.
> >
> > Also, Sphinx 3 is more strict with its C domain and generated warnings,
> > exposing issues in the parsing.
> > To fix the warnings, make the C regexes use ASCII, ensure the
> > expressions only match the beginning of words and skip trying to
> > cross-reference C reserved words.
>
> Thank you for doing this!
>
> I have a feature request ... could you automarkup NULL as being
> :c:macro?
> Or maybe just anything matching \<[[:upper:]_[:digit:]]*\>
> (i may have my regex syntax confused ... a word composed of any
> arrangement of upper-case, digits and underscores.)

I think what you are suggesting are two separate things.

For NULL, what you're interested in is that it appears in a monospaced font, as
if written ``NULL``, right? As I don't think a cross-reference to "the NULL
macro definition" would make much sense.

While "anything containing only upper-case, digits and underscores" would
actually be for cross-referencing to the definition of the macro symbol in
question, right?

At the moment, this automarkup script is being used only for cross-referencing,
but it is indeed a generic automarkup script, and could be used for the
formatting of NULL.  But we also can't just make every upper-case word written
in monospaced font, as that doesn't always makes sense.

So if I understood your two requests correctly, I think we could:
1. Always automatically format NULL using a literal ``.
2. Try to cross-reference every upper-case word with the macro definition using
:c:macro, but if the cross-reference doesn't exist, format it normally, since
it's just normal text (this is what we're doing for C references at the moment).

What do you think?

Thanks,
Nícolas


^ permalink raw reply	[flat|nested] 10+ messages in thread
* [PATCH] docs: Make automarkup ready for Sphinx 3.1+
@ 2020-10-07 23:12 Nícolas F. R. A. Prado
  2020-10-07 23:40 ` Matthew Wilcox
  2020-10-08  5:27 ` Mauro Carvalho Chehab
  0 siblings, 2 replies; 10+ messages in thread
From: Nícolas F. R. A. Prado @ 2020-10-07 23:12 UTC (permalink / raw)
  To: Jonathan Corbet, Mauro Carvalho Chehab
  Cc: linux-doc, linux-kernel, lkcamp, andrealmeid

While Sphinx 2 used a single c:type role for struct, union, enum and
typedef, Sphinx 3 uses a specific role for each one.
To keep backward compatibility, detect the Sphinx version and use the
correct roles for that version.

Also, Sphinx 3 is more strict with its C domain and generated warnings,
exposing issues in the parsing.
To fix the warnings, make the C regexes use ASCII, ensure the
expressions only match the beginning of words and skip trying to
cross-reference C reserved words.

Signed-off-by: Nícolas F. R. A. Prado <nfraprado@protonmail.com>
---

Hi,

after Mauro's series making everything ready for Sphinx 3.1, only the automarkup
was left to be ported.
This patch makes the needed changes to automarkup so that we can soon flip the
switch to Sphinx 3.1.

This change was tested both with Sphinx 2.4.4 and Sphinx 3.1.

This change doesn't add any warnings to the Documentation build.
I tested it with Mauro's series but later rebased it to docs-next, and it can be
accepted independently of that series.

I ended up doing more than one thing in this single patch, but since it was all
changing the same lines and for the same purpose, I felt it would be better to
keep it as a single commit.

Mauro,
if this patch is ok, the 3rd patch in your series, which disables automarkup for
sphinx 3, should be dropped.
Although I'm not sure what the implications of your patches adding namespaces
and using the c:macro for functions are.
All I did here was use the specific roles for sphinx 3 and fix the warnings, but
that was enough to get correct cross-references even after your series.

Thanks,
Nícolas


 Documentation/sphinx/automarkup.py | 69 ++++++++++++++++++++++++++----
 1 file changed, 60 insertions(+), 9 deletions(-)

diff --git a/Documentation/sphinx/automarkup.py b/Documentation/sphinx/automarkup.py
index a1b0f554cd82..fd1e927408ad 100644
--- a/Documentation/sphinx/automarkup.py
+++ b/Documentation/sphinx/automarkup.py
@@ -22,13 +22,34 @@ from itertools import chain
 # :c:func: block (i.e. ":c:func:`mmap()`s" flakes out), so the last
 # bit tries to restrict matches to things that won't create trouble.
 #
-RE_function = re.compile(r'(([\w_][\w\d_]+)\(\))')
-RE_type = re.compile(r'(struct|union|enum|typedef)\s+([\w_][\w\d_]+)')
+RE_function = re.compile(r'\b(([a-zA-Z_]\w+)\(\))', flags=re.ASCII)
+
+#
+# Sphinx 2 uses the same :c:type role for struct, union, enum and typedef
+#
+RE_generic_type = re.compile(r'\b(struct|union|enum|typedef)\s+([a-zA-Z_]\w+)',
+                             flags=re.ASCII)
+
+#
+# Sphinx 3 uses a different C role for each one of struct, union, enum and
+# typedef
+#
+RE_struct = re.compile(r'\b(struct)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
+RE_union = re.compile(r'\b(union)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
+RE_enum = re.compile(r'\b(enum)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
+RE_typedef = re.compile(r'\b(typedef)\s+([a-zA-Z_]\w+)', flags=re.ASCII)
+
 #
 # Detects a reference to a documentation page of the form Documentation/... with
 # an optional extension
 #
-RE_doc = re.compile(r'Documentation(/[\w\-_/]+)(\.\w+)*')
+RE_doc = re.compile(r'\bDocumentation(/[\w\-_/]+)(\.\w+)*')
+
+#
+# Reserved C words that we should skip when cross-referencing
+#
+Skipnames = [ 'for', 'if', 'register', 'sizeof', 'struct', 'unsigned' ]
+
 
 #
 # Many places in the docs refer to common system calls.  It is
@@ -48,9 +69,22 @@ def markup_refs(docname, app, node):
     #
     # Associate each regex with the function that will markup its matches
     #
-    markup_func = {RE_type: markup_c_ref,
-                   RE_function: markup_c_ref,
-                   RE_doc: markup_doc_ref}
+    markup_func_sphinx2 = {RE_doc: markup_doc_ref,
+                           RE_function: markup_c_ref,
+                           RE_generic_type: markup_c_ref}
+
+    markup_func_sphinx3 = {RE_doc: markup_doc_ref,
+                           RE_function: markup_c_ref,
+                           RE_struct: markup_c_ref,
+                           RE_union: markup_c_ref,
+                           RE_enum: markup_c_ref,
+                           RE_typedef: markup_c_ref}
+
+    if sphinx.__version__[0] == '3':
+        markup_func = markup_func_sphinx3
+    else:
+        markup_func = markup_func_sphinx2
+
     match_iterators = [regex.finditer(t) for regex in markup_func]
     #
     # Sort all references by the starting position in text
@@ -79,8 +113,24 @@ def markup_refs(docname, app, node):
 # type_name) with an appropriate cross reference.
 #
 def markup_c_ref(docname, app, match):
-    class_str = {RE_function: 'c-func', RE_type: 'c-type'}
-    reftype_str = {RE_function: 'function', RE_type: 'type'}
+    class_str = {RE_function: 'c-func',
+                 # Sphinx 2 only
+                 RE_generic_type: 'c-type',
+                 # Sphinx 3+ only
+                 RE_struct: 'c-struct',
+                 RE_union: 'c-union',
+                 RE_enum: 'c-enum',
+                 RE_typedef: 'c-type',
+                 }
+    reftype_str = {RE_function: 'function',
+                   # Sphinx 2 only
+                   RE_generic_type: 'type',
+                   # Sphinx 3+ only
+                   RE_struct: 'struct',
+                   RE_union: 'union',
+                   RE_enum: 'enum',
+                   RE_typedef: 'type',
+                   }
 
     cdom = app.env.domains['c']
     #
@@ -89,7 +139,8 @@ def markup_c_ref(docname, app, match):
     target = match.group(2)
     target_text = nodes.Text(match.group(0))
     xref = None
-    if not (match.re == RE_function and target in Skipfuncs):
+    if not ((match.re == RE_function and target in Skipfuncs)
+            or (target in Skipnames)):
         lit_text = nodes.literal(classes=['xref', 'c', class_str[match.re]])
         lit_text += target_text
         pxref = addnodes.pending_xref('', refdomain = 'c',
-- 
2.28.0



^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-10-09  5:54 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-10-08 13:54 [PATCH] docs: Make automarkup ready for Sphinx 3.1+ Nícolas F. R. A. Prado
2020-10-09  5:53 ` Mauro Carvalho Chehab
  -- strict thread matches above, loose matches on Subject: below --
2020-10-08  2:15 Nícolas F. R. A. Prado
2020-10-08  2:47 ` Matthew Wilcox
2020-10-08  6:03   ` Mauro Carvalho Chehab
2020-10-08 11:31     ` Matthew Wilcox
2020-10-08 12:25       ` Mauro Carvalho Chehab
2020-10-07 23:12 Nícolas F. R. A. Prado
2020-10-07 23:40 ` Matthew Wilcox
2020-10-08  5:27 ` Mauro Carvalho Chehab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).