* [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
@ 2025-06-02 21:07 Lorenzo Stoakes
2025-06-02 21:38 ` Jonathan Corbet
2025-06-02 22:25 ` Jann Horn
0 siblings, 2 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-02 21:07 UTC (permalink / raw)
To: Andrew Morton
Cc: Suren Baghdasaryan, Liam R . Howlett, Vlastimil Babka,
Shakeel Butt, Jonathan Corbet, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
The process addresses documentation already contains a great deal of
information about mmap/VMA locking and page table traversal and
manipulation.
However it waves it hands about non-VMA traversal. Add a section for this
and explain the caveats around this kind of traversal.
Additionally, commit 6375e95f381e ("mm: pgtable: reclaim empty PTE page in
madvise(MADV_DONTNEED)") caused zapping to also free empty PTE page
tables. Highlight this and reference how this impacts ptdump non-VMA
traversal of userland mappings.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
Documentation/mm/process_addrs.rst | 58 ++++++++++++++++++++++++++----
1 file changed, 52 insertions(+), 6 deletions(-)
diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index e6756e78b476..83166c2b47dc 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
1. **Traversing** page tables - Simply reading page tables in order to traverse
them. This only requires that the VMA is kept stable, so a lock which
establishes this suffices for traversal (there are also lockless variants
- which eliminate even this requirement, such as :c:func:`!gup_fast`).
+ which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
+ also a special case of page table traversal for non-VMA regions which we
+ consider separately below.
2. **Installing** page table mappings - Whether creating a new mapping or
modifying an existing one in such a way as to change its identity. This
requires that the VMA is kept stable via an mmap or VMA lock (explicitly not
@@ -335,15 +337,14 @@ ahead and perform these operations on page tables (though internally, kernel
operations that perform writes also acquire internal page table locks to
serialise - see the page table implementation detail section for more details).
+.. note:: Since v6.14 and commit 6375e95f381e ("mm: pgtable: reclaim empty PTE
+ page in madvise (MADV_DONTNEED)"), we now also free empty PTE tables
+ on zap. This does not change zapping locking requirements.
+
When **installing** page table entries, the mmap or VMA lock must be held to
keep the VMA stable. We explore why this is in the page table locking details
section below.
-.. warning:: Page tables are normally only traversed in regions covered by VMAs.
- If you want to traverse page tables in areas that might not be
- covered by VMAs, heavier locking is required.
- See :c:func:`!walk_page_range_novma` for details.
-
**Freeing** page tables is an entirely internal memory management operation and
has special requirements (see the page freeing section below for more details).
@@ -355,6 +356,47 @@ has special requirements (see the page freeing section below for more details).
from the reverse mappings, but no other VMAs can be permitted to be
accessible and span the specified range.
+Traversing non-VMA page tables
+------------------------------
+
+We've focused above on traversal of page tables belonging to VMAs. It is also
+possible to traverse page tables which are not represented by VMAs.
+
+Primarily this is used to traverse kernel page table mappings. In which case one
+must hold an mmap **read** lock on the :c:macro:`!init_mm` kernel instantiation
+of the :c:struct:`!struct mm_struct` metadata object, as performed in
+:c:func:`walk_page_range_novma`.
+
+This is generally sufficient to preclude other page table walkers (excluding
+vmalloc regions and memory hot plug) as the intermediate kernel page tables are
+not usually freed.
+
+For cases where they might be then the caller has to acquire the appropriate
+additional locks.
+
+The truly unusual case is the traversal of non-VMA ranges in **userland**
+ranges.
+
+This has only one user - the general page table dumping logic (implemented in
+:c:macro:`!mm/ptdump.c`) - which seeks to expose all mappings for debug purposes
+even if they are highly unusual (possibly architecture-specific) and are not
+backed by a VMA.
+
+We must take great care in this case, as the :c:func:`!munmap` implementation
+detaches VMAs under an mmap write lock before tearing down page tables under a
+downgraded mmap read lock.
+
+This means such an operation could race with this, and thus an mmap **write**
+lock is required.
+
+.. warning:: A racing zap operation is problematic if it is performed without an
+ exclusive lock held - since v6.14 and commit 6375e95f381e PTEs may
+ be freed upon zap, so if this occurs the traversal might encounter
+ the same issue seen due to :c:func:`!munmap`'s use of a downgraded
+ mmap lock.
+
+ In this instance, additional appropriate locking is required.
+
Lock ordering
-------------
@@ -461,6 +503,10 @@ Locking Implementation Details
Page table locking details
--------------------------
+.. note:: This section explores page table locking requirements for page tables
+ encompassed by a VMA. See the above section on non-VMA page table
+ traversal for details on how we handle that case.
+
In addition to the locks described in the terminology section above, we have
additional locks dedicated to page tables:
--
2.49.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-02 21:07 [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal Lorenzo Stoakes
@ 2025-06-02 21:38 ` Jonathan Corbet
2025-06-03 10:56 ` Lorenzo Stoakes
2025-06-02 22:25 ` Jann Horn
1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Corbet @ 2025-06-02 21:38 UTC (permalink / raw)
To: Lorenzo Stoakes, Andrew Morton
Cc: Suren Baghdasaryan, Liam R . Howlett, Vlastimil Babka,
Shakeel Butt, Jann Horn, Qi Zheng, linux-mm, linux-doc,
linux-kernel
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
> --- a/Documentation/mm/process_addrs.rst
> +++ b/Documentation/mm/process_addrs.rst
> @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
> 1. **Traversing** page tables - Simply reading page tables in order to traverse
> them. This only requires that the VMA is kept stable, so a lock which
> establishes this suffices for traversal (there are also lockless variants
> - which eliminate even this requirement, such as :c:func:`!gup_fast`).
> + which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
> + also a special case of page table traversal for non-VMA regions which we
The "!gup_fast" caught my attention - I was unaware that Sphinx had such
a thing. Its purpose would be to appear to suppress the generation of the
link that turns the cross reference into a cross reference.
The MM docs are full of these, do we know why?
I would recommend removing them unless there's some reason I don't see
for doing this. Also get rid of the :c:func: noise entirely - just
saying gup_fast() will do the right thing.
> +.. note:: Since v6.14 and commit 6375e95f381e ("mm: pgtable: reclaim empty
> PTE + page in madvise (MADV_DONTNEED)"), we now also free empty PTE tables
> + on zap. This does not change zapping locking requirements.
As a general rule, the docs should represent the current state of
affairs; people wanting documentation for older kernels are best advised
to look at those kernels. Or so it seems to me, anyway. So I'm not
sure we need the "since..." stuff.
Thanks,
jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-02 21:07 [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal Lorenzo Stoakes
2025-06-02 21:38 ` Jonathan Corbet
@ 2025-06-02 22:25 ` Jann Horn
2025-06-03 10:45 ` Lorenzo Stoakes
1 sibling, 1 reply; 18+ messages in thread
From: Jann Horn @ 2025-06-02 22:25 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jonathan Corbet, Qi Zheng,
linux-mm, linux-doc, linux-kernel
On Mon, Jun 2, 2025 at 11:07 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> The process addresses documentation already contains a great deal of
> information about mmap/VMA locking and page table traversal and
> manipulation.
>
> However it waves it hands about non-VMA traversal. Add a section for this
> and explain the caveats around this kind of traversal.
>
> Additionally, commit 6375e95f381e ("mm: pgtable: reclaim empty PTE page in
> madvise(MADV_DONTNEED)") caused zapping to also free empty PTE page
> tables. Highlight this and reference how this impacts ptdump non-VMA
> traversal of userland mappings.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> Documentation/mm/process_addrs.rst | 58 ++++++++++++++++++++++++++----
> 1 file changed, 52 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> index e6756e78b476..83166c2b47dc 100644
> --- a/Documentation/mm/process_addrs.rst
> +++ b/Documentation/mm/process_addrs.rst
> @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
> 1. **Traversing** page tables - Simply reading page tables in order to traverse
> them. This only requires that the VMA is kept stable, so a lock which
> establishes this suffices for traversal (there are also lockless variants
> - which eliminate even this requirement, such as :c:func:`!gup_fast`).
> + which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
> + also a special case of page table traversal for non-VMA regions which we
> + consider separately below.
> 2. **Installing** page table mappings - Whether creating a new mapping or
> modifying an existing one in such a way as to change its identity. This
> requires that the VMA is kept stable via an mmap or VMA lock (explicitly not
> @@ -335,15 +337,14 @@ ahead and perform these operations on page tables (though internally, kernel
> operations that perform writes also acquire internal page table locks to
> serialise - see the page table implementation detail section for more details).
>
> +.. note:: Since v6.14 and commit 6375e95f381e ("mm: pgtable: reclaim empty PTE
> + page in madvise (MADV_DONTNEED)"), we now also free empty PTE tables
> + on zap. This does not change zapping locking requirements.
> +
> When **installing** page table entries, the mmap or VMA lock must be held to
> keep the VMA stable. We explore why this is in the page table locking details
> section below.
>
> -.. warning:: Page tables are normally only traversed in regions covered by VMAs.
> - If you want to traverse page tables in areas that might not be
> - covered by VMAs, heavier locking is required.
> - See :c:func:`!walk_page_range_novma` for details.
> -
> **Freeing** page tables is an entirely internal memory management operation and
> has special requirements (see the page freeing section below for more details).
>
> @@ -355,6 +356,47 @@ has special requirements (see the page freeing section below for more details).
> from the reverse mappings, but no other VMAs can be permitted to be
> accessible and span the specified range.
>
> +Traversing non-VMA page tables
> +------------------------------
> +
> +We've focused above on traversal of page tables belonging to VMAs. It is also
> +possible to traverse page tables which are not represented by VMAs.
> +
> +Primarily this is used to traverse kernel page table mappings. In which case one
> +must hold an mmap **read** lock on the :c:macro:`!init_mm` kernel instantiation
> +of the :c:struct:`!struct mm_struct` metadata object, as performed in
> +:c:func:`walk_page_range_novma`.
My understanding is that kernel page tables are walked with no MM
locks held all the time. See for example:
- vmalloc_to_page()
- vmap()
- KASAN's shadow_mapped()
- apply_to_page_range() called from kasan_populate_vmalloc() or
arm64's set_direct_map_invalid_noflush()
This is possible because kernel-internal page tables are used for
allocations managed by kernel-internal users, and so things like the
lifetimes of page tables can be guaranteed by higher-level logic.
(Like "I own a vmalloc allocation in this range, so the page tables
can't change until I call vfree().")
The one way in which I think this is currently kinda yolo/broken is
that vmap_try_huge_pud() can end up freeing page tables via
pud_free_pmd_page(), while holding no MM locks AFAICS, so that could
race with the ptdump debug logic such that ptdump walks into freed
page tables?
I think the current rules for kernel page tables can be summarized as
"every kernel subsystem can make up its own rules for its regions of
virtual address space", which makes ptdump buggy because it can't
follow the different rules of all subsystems; and we should probably
change the rules to "every kernel subsystem can make up its own rules
except please take the init_mm's mmap lock when you delete page
tables".
> +This is generally sufficient to preclude other page table walkers (excluding
> +vmalloc regions and memory hot plug) as the intermediate kernel page tables are
> +not usually freed.
> +
> +For cases where they might be then the caller has to acquire the appropriate
> +additional locks.
> +
> +The truly unusual case is the traversal of non-VMA ranges in **userland**
> +ranges.
> +
> +This has only one user - the general page table dumping logic (implemented in
> +:c:macro:`!mm/ptdump.c`) - which seeks to expose all mappings for debug purposes
> +even if they are highly unusual (possibly architecture-specific) and are not
> +backed by a VMA.
> +
> +We must take great care in this case, as the :c:func:`!munmap` implementation
> +detaches VMAs under an mmap write lock before tearing down page tables under a
> +downgraded mmap read lock.
> +
> +This means such an operation could race with this, and thus an mmap **write**
> +lock is required.
> +
> +.. warning:: A racing zap operation is problematic if it is performed without an
> + exclusive lock held - since v6.14 and commit 6375e95f381e PTEs may
> + be freed upon zap, so if this occurs the traversal might encounter
> + the same issue seen due to :c:func:`!munmap`'s use of a downgraded
> + mmap lock.
> +
> + In this instance, additional appropriate locking is required.
(I think we should take all the vma locks in that ptdump code and get
rid of this weird exception instead of documenting it.)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-02 22:25 ` Jann Horn
@ 2025-06-03 10:45 ` Lorenzo Stoakes
2025-06-03 18:36 ` Jann Horn
0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 10:45 UTC (permalink / raw)
To: Jann Horn
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jonathan Corbet, Qi Zheng,
linux-mm, linux-doc, linux-kernel
On Tue, Jun 03, 2025 at 12:25:36AM +0200, Jann Horn wrote:
> On Mon, Jun 2, 2025 at 11:07 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> > The process addresses documentation already contains a great deal of
> > information about mmap/VMA locking and page table traversal and
> > manipulation.
> >
> > However it waves it hands about non-VMA traversal. Add a section for this
> > and explain the caveats around this kind of traversal.
> >
> > Additionally, commit 6375e95f381e ("mm: pgtable: reclaim empty PTE page in
> > madvise(MADV_DONTNEED)") caused zapping to also free empty PTE page
> > tables. Highlight this and reference how this impacts ptdump non-VMA
> > traversal of userland mappings.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > ---
> > Documentation/mm/process_addrs.rst | 58 ++++++++++++++++++++++++++----
> > 1 file changed, 52 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
> > index e6756e78b476..83166c2b47dc 100644
> > --- a/Documentation/mm/process_addrs.rst
> > +++ b/Documentation/mm/process_addrs.rst
> > @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
> > 1. **Traversing** page tables - Simply reading page tables in order to traverse
> > them. This only requires that the VMA is kept stable, so a lock which
> > establishes this suffices for traversal (there are also lockless variants
> > - which eliminate even this requirement, such as :c:func:`!gup_fast`).
> > + which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
> > + also a special case of page table traversal for non-VMA regions which we
> > + consider separately below.
> > 2. **Installing** page table mappings - Whether creating a new mapping or
> > modifying an existing one in such a way as to change its identity. This
> > requires that the VMA is kept stable via an mmap or VMA lock (explicitly not
> > @@ -335,15 +337,14 @@ ahead and perform these operations on page tables (though internally, kernel
> > operations that perform writes also acquire internal page table locks to
> > serialise - see the page table implementation detail section for more details).
> >
> > +.. note:: Since v6.14 and commit 6375e95f381e ("mm: pgtable: reclaim empty PTE
> > + page in madvise (MADV_DONTNEED)"), we now also free empty PTE tables
> > + on zap. This does not change zapping locking requirements.
> > +
> > When **installing** page table entries, the mmap or VMA lock must be held to
> > keep the VMA stable. We explore why this is in the page table locking details
> > section below.
> >
> > -.. warning:: Page tables are normally only traversed in regions covered by VMAs.
> > - If you want to traverse page tables in areas that might not be
> > - covered by VMAs, heavier locking is required.
> > - See :c:func:`!walk_page_range_novma` for details.
> > -
> > **Freeing** page tables is an entirely internal memory management operation and
> > has special requirements (see the page freeing section below for more details).
> >
> > @@ -355,6 +356,47 @@ has special requirements (see the page freeing section below for more details).
> > from the reverse mappings, but no other VMAs can be permitted to be
> > accessible and span the specified range.
> >
> > +Traversing non-VMA page tables
> > +------------------------------
> > +
> > +We've focused above on traversal of page tables belonging to VMAs. It is also
> > +possible to traverse page tables which are not represented by VMAs.
> > +
> > +Primarily this is used to traverse kernel page table mappings. In which case one
> > +must hold an mmap **read** lock on the :c:macro:`!init_mm` kernel instantiation
> > +of the :c:struct:`!struct mm_struct` metadata object, as performed in
> > +:c:func:`walk_page_range_novma`.
>
> My understanding is that kernel page tables are walked with no MM
> locks held all the time. See for example:
>
> - vmalloc_to_page()
> - vmap()
> - KASAN's shadow_mapped()
> - apply_to_page_range() called from kasan_populate_vmalloc() or
> arm64's set_direct_map_invalid_noflush()
I explicitly mention vmalloc :) and the others are I guess the
'exceptions'.
Perhaps the better way of phrasing this though is to say
'when using this inteface, we stabilise on the mmap read lock, however
there's no guarantees anything else in the kernel will behave like this, we
rely on these page tables not being freed. In cases where they might be -
make sure you have the right locks'.
Which is what it's _kinda_ saying kinda. But maybe not clearly enough...
>
> This is possible because kernel-internal page tables are used for
> allocations managed by kernel-internal users, and so things like the
> lifetimes of page tables can be guaranteed by higher-level logic.
> (Like "I own a vmalloc allocation in this range, so the page tables
> can't change until I call vfree().")
Right.
>
> The one way in which I think this is currently kinda yolo/broken is
> that vmap_try_huge_pud() can end up freeing page tables via
> pud_free_pmd_page(), while holding no MM locks AFAICS, so that could
> race with the ptdump debug logic such that ptdump walks into freed
> page tables?
But those mappings would be kernel mappings? How could ptdump walk into
those?
>
> I think the current rules for kernel page tables can be summarized as
> "every kernel subsystem can make up its own rules for its regions of
> virtual address space", which makes ptdump buggy because it can't
> follow the different rules of all subsystems; and we should probably
> change the rules to "every kernel subsystem can make up its own rules
> except please take the init_mm's mmap lock when you delete page
> tables".
>
> > +This is generally sufficient to preclude other page table walkers (excluding
> > +vmalloc regions and memory hot plug) as the intermediate kernel page tables are
> > +not usually freed.
> > +
> > +For cases where they might be then the caller has to acquire the appropriate
> > +additional locks.
> > +
> > +The truly unusual case is the traversal of non-VMA ranges in **userland**
> > +ranges.
> > +
> > +This has only one user - the general page table dumping logic (implemented in
> > +:c:macro:`!mm/ptdump.c`) - which seeks to expose all mappings for debug purposes
> > +even if they are highly unusual (possibly architecture-specific) and are not
> > +backed by a VMA.
> > +
> > +We must take great care in this case, as the :c:func:`!munmap` implementation
> > +detaches VMAs under an mmap write lock before tearing down page tables under a
> > +downgraded mmap read lock.
> > +
> > +This means such an operation could race with this, and thus an mmap **write**
> > +lock is required.
> > +
> > +.. warning:: A racing zap operation is problematic if it is performed without an
> > + exclusive lock held - since v6.14 and commit 6375e95f381e PTEs may
> > + be freed upon zap, so if this occurs the traversal might encounter
> > + the same issue seen due to :c:func:`!munmap`'s use of a downgraded
> > + mmap lock.
> > +
> > + In this instance, additional appropriate locking is required.
>
> (I think we should take all the vma locks in that ptdump code and get
> rid of this weird exception instead of documenting it.)
We really need to be sure that there aren't some weird architectures doing
weird things or circumstances where this is meaningful.
I mean people went to great lengths to make this possible, I find it hard
to believe there aren't some _weird_ real world use cases.
At any rate my soon-to-be-coming patch will separate the kernel use from
this weirdo use so we can fully isolate it.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-02 21:38 ` Jonathan Corbet
@ 2025-06-03 10:56 ` Lorenzo Stoakes
2025-06-03 11:24 ` Lorenzo Stoakes
2025-06-03 14:08 ` Jonathan Corbet
0 siblings, 2 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 10:56 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
On Mon, Jun 02, 2025 at 03:38:55PM -0600, Jonathan Corbet wrote:
> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>
> > --- a/Documentation/mm/process_addrs.rst
> > +++ b/Documentation/mm/process_addrs.rst
> > @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
> > 1. **Traversing** page tables - Simply reading page tables in order to traverse
> > them. This only requires that the VMA is kept stable, so a lock which
> > establishes this suffices for traversal (there are also lockless variants
> > - which eliminate even this requirement, such as :c:func:`!gup_fast`).
> > + which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
> > + also a special case of page table traversal for non-VMA regions which we
>
> The "!gup_fast" caught my attention - I was unaware that Sphinx had such
> a thing. Its purpose would be to appear to suppress the generation of the
> link that turns the cross reference into a cross reference.
>
> The MM docs are full of these, do we know why?
Removing it from the struct vm_area_struct struct immediately give:
/home/lorenzo/kerndev/kernels/mm/Documentation/mm/process_addrs.rst:11: WARNING: Unparseable C cross-reference: 'struct vm_area_struct'
Invalid C declaration: Expected identifier in nested name, got keyword: struct [error at 6]
struct vm_area_struct
And given C's weirdness with typing I really prefer to be explicit in
referencing a struct vs. e.g. a typedef.
At any rate I'm not sure it's all that useful to cross-reference these?
Any such change would need to be a separate patch anyway or otherwise this
becomes a 'add additional documentation and drop cross-refs'.
>
> I would recommend removing them unless there's some reason I don't see
> for doing this. Also get rid of the :c:func: noise entirely - just
> saying gup_fast() will do the right thing.
Re: the c:func: stuff -
Well, the right thing is making function + type names clearly discernable, and
it just putting in the function name like that absolutely does not do the right
thing in that respect.
I feel strongly on this, as I've tried it both ways and it's a _really_ big
difference in how readable the document is.
I spent a lot of time trying to make it as readable as possible (given the
complexity) so would really rather not do anything that would hurt that.
>
> > +.. note:: Since v6.14 and commit 6375e95f381e ("mm: pgtable: reclaim empty
> > PTE + page in madvise (MADV_DONTNEED)"), we now also free empty PTE tables
> > + on zap. This does not change zapping locking requirements.
>
> As a general rule, the docs should represent the current state of
> affairs; people wanting documentation for older kernels are best advised
> to look at those kernels. Or so it seems to me, anyway. So I'm not
> sure we need the "since..." stuff.
Sure, I will drop this.
>
> Thanks,
>
> jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 10:56 ` Lorenzo Stoakes
@ 2025-06-03 11:24 ` Lorenzo Stoakes
2025-06-03 14:01 ` Jonathan Corbet
2025-06-03 14:08 ` Jonathan Corbet
1 sibling, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 11:24 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
On Tue, Jun 03, 2025 at 11:56:37AM +0100, Lorenzo Stoakes wrote:
> On Mon, Jun 02, 2025 at 03:38:55PM -0600, Jonathan Corbet wrote:
> > Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
> >
> > > --- a/Documentation/mm/process_addrs.rst
> > > +++ b/Documentation/mm/process_addrs.rst
> > > @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
> > > 1. **Traversing** page tables - Simply reading page tables in order to traverse
> > > them. This only requires that the VMA is kept stable, so a lock which
> > > establishes this suffices for traversal (there are also lockless variants
> > > - which eliminate even this requirement, such as :c:func:`!gup_fast`).
> > > + which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
> > > + also a special case of page table traversal for non-VMA regions which we
> >
> > The "!gup_fast" caught my attention - I was unaware that Sphinx had such
> > a thing. Its purpose would be to appear to suppress the generation of the
> > link that turns the cross reference into a cross reference.
> >
> > The MM docs are full of these, do we know why?
>
> Removing it from the struct vm_area_struct struct immediately give:
>
> /home/lorenzo/kerndev/kernels/mm/Documentation/mm/process_addrs.rst:11: WARNING: Unparseable C cross-reference: 'struct vm_area_struct'
> Invalid C declaration: Expected identifier in nested name, got keyword: struct [error at 6]
> struct vm_area_struct
>
> And given C's weirdness with typing I really prefer to be explicit in
> referencing a struct vs. e.g. a typedef.
>
> At any rate I'm not sure it's all that useful to cross-reference these?
>
> Any such change would need to be a separate patch anyway or otherwise this
> becomes a 'add additional documentation and drop cross-refs'.
>
> >
> > I would recommend removing them unless there's some reason I don't see
> > for doing this. Also get rid of the :c:func: noise entirely - just
> > saying gup_fast() will do the right thing.
>
> Re: the c:func: stuff -
>
> Well, the right thing is making function + type names clearly discernable, and
> it just putting in the function name like that absolutely does not do the right
> thing in that respect.
>
> I feel strongly on this, as I've tried it both ways and it's a _really_ big
> difference in how readable the document is.
>
> I spent a lot of time trying to make it as readable as possible (given the
> complexity) so would really rather not do anything that would hurt that.
>
Somebody told me that in _other_ .rst's, seemingly, it does figure out xxx() ->
function and highlights it like this.
But for me, it does not... :)
In case that's something you assumed would happen here.
This is against me building locally with:
make SPHINXDIRS=mm htmldocs
> >
> > > +.. note:: Since v6.14 and commit 6375e95f381e ("mm: pgtable: reclaim empty
> > > PTE + page in madvise (MADV_DONTNEED)"), we now also free empty PTE tables
> > > + on zap. This does not change zapping locking requirements.
> >
> > As a general rule, the docs should represent the current state of
> > affairs; people wanting documentation for older kernels are best advised
> > to look at those kernels. Or so it seems to me, anyway. So I'm not
> > sure we need the "since..." stuff.
>
> Sure, I will drop this.
>
> >
> > Thanks,
> >
> > jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 11:24 ` Lorenzo Stoakes
@ 2025-06-03 14:01 ` Jonathan Corbet
2025-06-03 14:11 ` Lorenzo Stoakes
0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Corbet @ 2025-06-03 14:01 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>> Re: the c:func: stuff -
>>
>> Well, the right thing is making function + type names clearly discernable, and
>> it just putting in the function name like that absolutely does not do the right
>> thing in that respect.
>>
>> I feel strongly on this, as I've tried it both ways and it's a _really_ big
>> difference in how readable the document is.
>>
>> I spent a lot of time trying to make it as readable as possible (given the
>> complexity) so would really rather not do anything that would hurt that.
>>
>
> Somebody told me that in _other_ .rst's, seemingly, it does figure out xxx() ->
> function and highlights it like this.
>
> But for me, it does not... :)
OK ... If you look at what's going on, some of the functions will be
marked, others not. The difference is that there is no markup for
functions where a cross-reference cannot be made (because they are
undocumented).
We could easily change the automarkup code to always do the markup; the
problem with that (which is also a problem with the existing markup
under Documentation/mm) is you'll have rendered text that looks like a
cross-reference link, but which is not. We also lose a clue as to which
functions are still in need of documentation.
The right answer might be to mark them up differently, I guess.
jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 10:56 ` Lorenzo Stoakes
2025-06-03 11:24 ` Lorenzo Stoakes
@ 2025-06-03 14:08 ` Jonathan Corbet
2025-06-03 14:24 ` Lorenzo Stoakes
1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Corbet @ 2025-06-03 14:08 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
> On Mon, Jun 02, 2025 at 03:38:55PM -0600, Jonathan Corbet wrote:
>> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>>
>> > --- a/Documentation/mm/process_addrs.rst
>> > +++ b/Documentation/mm/process_addrs.rst
>> > @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
>> > 1. **Traversing** page tables - Simply reading page tables in order to traverse
>> > them. This only requires that the VMA is kept stable, so a lock which
>> > establishes this suffices for traversal (there are also lockless variants
>> > - which eliminate even this requirement, such as :c:func:`!gup_fast`).
>> > + which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
>> > + also a special case of page table traversal for non-VMA regions which we
>>
>> The "!gup_fast" caught my attention - I was unaware that Sphinx had such
>> a thing. Its purpose would be to appear to suppress the generation of the
>> link that turns the cross reference into a cross reference.
>>
>> The MM docs are full of these, do we know why?
>
> Removing it from the struct vm_area_struct struct immediately give:
>
> /home/lorenzo/kerndev/kernels/mm/Documentation/mm/process_addrs.rst:11: WARNING: Unparseable C cross-reference: 'struct vm_area_struct'
> Invalid C declaration: Expected identifier in nested name, got keyword: struct [error at 6]
> struct vm_area_struct
>
> And given C's weirdness with typing I really prefer to be explicit in
> referencing a struct vs. e.g. a typedef.
That's because the :c:struct: markup doesn't want the word "struct" in
there. In this case, the "!" is being used, essentially, to hide the
fact that the Sphinx markup is being entirely misused here. You would
be far better off just saying:
**struct vm_area_struct**
and avoiding the uglier markup in this case.
Once again, taking out the markup entirely will cause the automarkup
code to do the right thing, with the proviso that undocumented
structures (which, tragically, includes struct vm_area_struct) won't be
marked up in the current implementation. By far the best solution here
is to remove all of the markup and add a kerneldoc comment for this
rather important structure.
> At any rate I'm not sure it's all that useful to cross-reference these?
Why would you *not* want to cross-reference something and make life easier
for your reader?
Thanks,
jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 14:01 ` Jonathan Corbet
@ 2025-06-03 14:11 ` Lorenzo Stoakes
2025-06-03 14:33 ` Jonathan Corbet
0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 14:11 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
On Tue, Jun 03, 2025 at 08:01:02AM -0600, Jonathan Corbet wrote:
> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>
> >> Re: the c:func: stuff -
> >>
> >> Well, the right thing is making function + type names clearly discernable, and
> >> it just putting in the function name like that absolutely does not do the right
> >> thing in that respect.
> >>
> >> I feel strongly on this, as I've tried it both ways and it's a _really_ big
> >> difference in how readable the document is.
> >>
> >> I spent a lot of time trying to make it as readable as possible (given the
> >> complexity) so would really rather not do anything that would hurt that.
> >>
> >
> > Somebody told me that in _other_ .rst's, seemingly, it does figure out xxx() ->
> > function and highlights it like this.
> >
> > But for me, it does not... :)
>
> OK ... If you look at what's going on, some of the functions will be
> marked, others not. The difference is that there is no markup for
> functions where a cross-reference cannot be made (because they are
> undocumented).
>
> We could easily change the automarkup code to always do the markup; the
> problem with that (which is also a problem with the existing markup
> under Documentation/mm) is you'll have rendered text that looks like a
> cross-reference link, but which is not. We also lose a clue as to which
> functions are still in need of documentation.
Isn't it a pretty egregious requirement to require documentation of every
referenced function?
I mean if that were a known requirement I'd simply not have written this
document at all, frankly.
And it's one I feel is really quite important, since this behaviour is
complicated, confusing and has led to bugs, including security flaws.
I really think we have to be careful about having barriers in the way of
people writing documentation as much as possible.
>
> The right answer might be to mark them up differently, I guess.
But... what I'm doing here, and what mm does elsewhere works perfectly fine? Why
do we need something new?
Surely this cross-referencing stuff is more useful for API documentation
that explicitly intends to describe functions like this?
>
> jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 14:08 ` Jonathan Corbet
@ 2025-06-03 14:24 ` Lorenzo Stoakes
2025-06-03 14:37 ` Jonathan Corbet
0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 14:24 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
On Tue, Jun 03, 2025 at 08:08:22AM -0600, Jonathan Corbet wrote:
> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>
> > On Mon, Jun 02, 2025 at 03:38:55PM -0600, Jonathan Corbet wrote:
> >> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
> >>
> >> > --- a/Documentation/mm/process_addrs.rst
> >> > +++ b/Documentation/mm/process_addrs.rst
> >> > @@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
> >> > 1. **Traversing** page tables - Simply reading page tables in order to traverse
> >> > them. This only requires that the VMA is kept stable, so a lock which
> >> > establishes this suffices for traversal (there are also lockless variants
> >> > - which eliminate even this requirement, such as :c:func:`!gup_fast`).
> >> > + which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
> >> > + also a special case of page table traversal for non-VMA regions which we
> >>
> >> The "!gup_fast" caught my attention - I was unaware that Sphinx had such
> >> a thing. Its purpose would be to appear to suppress the generation of the
> >> link that turns the cross reference into a cross reference.
> >>
> >> The MM docs are full of these, do we know why?
> >
> > Removing it from the struct vm_area_struct struct immediately give:
> >
> > /home/lorenzo/kerndev/kernels/mm/Documentation/mm/process_addrs.rst:11: WARNING: Unparseable C cross-reference: 'struct vm_area_struct'
> > Invalid C declaration: Expected identifier in nested name, got keyword: struct [error at 6]
> > struct vm_area_struct
> >
> > And given C's weirdness with typing I really prefer to be explicit in
> > referencing a struct vs. e.g. a typedef.
>
> That's because the :c:struct: markup doesn't want the word "struct" in
> there. In this case, the "!" is being used, essentially, to hide the
> fact that the Sphinx markup is being entirely misused here. You would
> be far better off just saying:
>
> **struct vm_area_struct**
>
> and avoiding the uglier markup in this case.
I can go change that.
But to repeat - 'given C's weirdness with typing I really prefer to be
explicit in referencing a struct vs. e.g. a typedef.'
I'm not doing this with an intent of Sphix misuse, it's with intent of
being as clear as possible.
>
> Once again, taking out the markup entirely will cause the automarkup
> code to do the right thing, with the proviso that undocumented
> structures (which, tragically, includes struct vm_area_struct) won't be
> marked up in the current implementation. By far the best solution here
> is to remove all of the markup and add a kerneldoc comment for this
> rather important structure.
>
> > At any rate I'm not sure it's all that useful to cross-reference these?
>
> Why would you *not* want to cross-reference something and make life easier
> for your reader?
Because it apparently requires me to document every function I reference?
Unless I'm missing something?
I may be misunderstanding you.
If not then fine, I can delay this patch, go off and do a 'cleanup' patch
first, that will drop the '!'s and come back to this.
But if I need to document every referenced function that just isn't
feasible for me with my current workload.
Please clarify!
>
> Thanks,
>
> jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 14:11 ` Lorenzo Stoakes
@ 2025-06-03 14:33 ` Jonathan Corbet
0 siblings, 0 replies; 18+ messages in thread
From: Jonathan Corbet @ 2025-06-03 14:33 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>> OK ... If you look at what's going on, some of the functions will be
>> marked, others not. The difference is that there is no markup for
>> functions where a cross-reference cannot be made (because they are
>> undocumented).
>>
>> We could easily change the automarkup code to always do the markup; the
>> problem with that (which is also a problem with the existing markup
>> under Documentation/mm) is you'll have rendered text that looks like a
>> cross-reference link, but which is not. We also lose a clue as to which
>> functions are still in need of documentation.
>
> Isn't it a pretty egregious requirement to require documentation of every
> referenced function?
>
> I mean if that were a known requirement I'd simply not have written this
> document at all, frankly.
Who said anything about it being a requirement? I think we have gone
way off track here.
Certainly it would be *nice* to have all of that stuff documented, and I
think there is value in giving a visual clue for stuff that lacks
documentation, but I never said anything about requirements.
>> The right answer might be to mark them up differently, I guess.
>
> But... what I'm doing here, and what mm does elsewhere works perfectly fine? Why
> do we need something new?
Because you're thinking in terms of the rendered docs, and not the
people who read the plain text. "function()" is far more readable than
":c:func:`!function()`", don't you agree? And rather easier to write as
well.
> Surely this cross-referencing stuff is more useful for API documentation
> that explicitly intends to describe functions like this?
The cross-referencing is *to* the API documentation. Again, you are
mentioning a function, why would you *not* want to let the system
automatically link to that function's documentation? Especially since
you're using markup that is explicitly provided for exactly that
purpose? Does the link cause some sort of harm?
Thanks,
jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 14:24 ` Lorenzo Stoakes
@ 2025-06-03 14:37 ` Jonathan Corbet
2025-06-03 14:52 ` Lorenzo Stoakes
0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Corbet @ 2025-06-03 14:37 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
> But to repeat - 'given C's weirdness with typing I really prefer to be
> explicit in referencing a struct vs. e.g. a typedef.'
...and I think that makes perfect sense.
>> Why would you *not* want to cross-reference something and make life easier
>> for your reader?
>
> Because it apparently requires me to document every function I reference?
> Unless I'm missing something?
>
> I may be misunderstanding you.
>
> If not then fine, I can delay this patch, go off and do a 'cleanup' patch
> first, that will drop the '!'s and come back to this.
>
> But if I need to document every referenced function that just isn't
> feasible for me with my current workload.
>
> Please clarify!
Hopefully I already have - I'm in no position to enforce such a
requirement, even if I thought it would be a good thing -- and I don't.
It's hard enough to get documentation written as it is, I certainly
don't want to make it harder.
My suggestion would be: proceed with your changes for now, it was never
my purpose to put obstacles there. I'll look at having automarkup do
something a bit more useful for references that lack documentation, then
maybe I'll do a cleanup pass on some of the mm docs if nobody else gets
there first.
Thanks,
jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 14:37 ` Jonathan Corbet
@ 2025-06-03 14:52 ` Lorenzo Stoakes
2025-06-03 15:05 ` Jonathan Corbet
0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 14:52 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
Let me reply in one place as we're currently having 2 largely similar
conversations in parallel which is unhelpful... :)
On Tue, Jun 03, 2025 at 08:37:23AM -0600, Jonathan Corbet wrote:
> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>
> > But to repeat - 'given C's weirdness with typing I really prefer to be
> > explicit in referencing a struct vs. e.g. a typedef.'
>
> ...and I think that makes perfect sense.
>
> >> Why would you *not* want to cross-reference something and make life easier
> >> for your reader?
> >
> > Because it apparently requires me to document every function I reference?
> > Unless I'm missing something?
> >
> > I may be misunderstanding you.
> >
> > If not then fine, I can delay this patch, go off and do a 'cleanup' patch
> > first, that will drop the '!'s and come back to this.
> >
> > But if I need to document every referenced function that just isn't
> > feasible for me with my current workload.
> >
> > Please clarify!
>
> Hopefully I already have - I'm in no position to enforce such a
> requirement, even if I thought it would be a good thing -- and I don't.
I think Andrew would think otherwise :)
Everybody respects you a great deal, myself included, so your opinion
counts for a LOT.
> It's hard enough to get documentation written as it is, I certainly
> don't want to make it harder.
Yeah, I mean I take your points and it's important to me to make sure I've
done this as well as I can, but this is my main concern here. Often times
this stuff feels rather thankless... so we must maintain a balance here I
feel.
Anyway, I think we can figure out a good solution here that should
hopefully be satisfactory for all (see below...)
>
> My suggestion would be: proceed with your changes for now, it was never
> my purpose to put obstacles there. I'll look at having automarkup do
> something a bit more useful for references that lack documentation, then
> maybe I'll do a cleanup pass on some of the mm docs if nobody else gets
> there first.
>
> Thanks,
>
> jon
Thanks, I appreciate that. So I want to address your concerns as well as I
can. I think I have misunderstood you a little bit here too (text is a poor
medium, yada yada) so let me try to nail down what I feel is the sensible
way forward:
1. Once I am confident I have correctly addressed Jann's feedback I'll
respin a v2 with the various 'sins' in place for the time being.
2. I will also drop the 'since v6.14' stuff you rightly raised in this
respin.
3. I will create a follow-up series to address these issues in this file
-in general-:
- Drop '!' from every reference so we get automated cross-referencing - I
think now I understand the point (hopefully!) that Sphinx with
automagically link every unique reference to a function/struct/etc. to
one another.
- Perhaps hack in a **struct ** prefix so we get the 'best of both worlds'
on this for types...?
I think my misapprehension about defining functions was not realising that
by doing :c:func:etc without the ! would automatically provide that
definition upon first reference to that function/struct/etc.?
Is that correct/sensible?
Would you want me to only use the :c:func: stuff in the _first_ mention of
a function and then to not use it from then on?
I wonder if the _appropriate_ use of :c:func:...: is in the actual
definition, but since it's not really practical to do that right now* is
simply doing it upon first mention a sensible 'least worst' approach here?
Let me know what makes sense!
Thanks,
Lorenzo
*I could add to my TODO to ensure we have at least kerneldoc descriptions
for every referenced function, and gradually burn these down as I add
them, I just can't guarantee you this will happen any time soon :)
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 14:52 ` Lorenzo Stoakes
@ 2025-06-03 15:05 ` Jonathan Corbet
2025-06-03 15:14 ` Lorenzo Stoakes
0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Corbet @ 2025-06-03 15:05 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
> Thanks, I appreciate that. So I want to address your concerns as well as I
> can. I think I have misunderstood you a little bit here too (text is a poor
> medium, yada yada) so let me try to nail down what I feel is the sensible
> way forward:
>
> 1. Once I am confident I have correctly addressed Jann's feedback I'll
> respin a v2 with the various 'sins' in place for the time being.
>
> 2. I will also drop the 'since v6.14' stuff you rightly raised in this
> respin.
So far so good
> 3. I will create a follow-up series to address these issues in this file
> -in general-:
>
> - Drop '!' from every reference so we get automated cross-referencing - I
> think now I understand the point (hopefully!) that Sphinx with
> automagically link every unique reference to a function/struct/etc. to
> one another.
If you just drop the "!" you'll run into the "struct" problem you
mentioned before. You'll need to take out "struct" as well if you go
this route...
> - Perhaps hack in a **struct ** prefix so we get the 'best of both worlds'
> on this for types...?
...so yes you'd need to do that.
> I think my misapprehension about defining functions was not realising that
> by doing :c:func:etc without the ! would automatically provide that
> definition upon first reference to that function/struct/etc.?
>
> Is that correct/sensible?
>
> Would you want me to only use the :c:func: stuff in the _first_ mention of
> a function and then to not use it from then on?
>
> I wonder if the _appropriate_ use of :c:func:...: is in the actual
> definition, but since it's not really practical to do that right now* is
> simply doing it upon first mention a sensible 'least worst' approach here?
Here, I think, we've gone a bit off track again. The goal of the
automarkup code was to *never* need to use the :c:func: markup. Let's
just say that ... certain members of our community ... found that markup
entirely intolerable - and, in truth, it is ugly. So I wrote the
initial automarkup extension; now, any time that the docs build sees
function(), it looks for documentation for that function and creates a
cross-reference if that documentation is found.
The goal is that you should never need the :c:gunk: ever.
Thanks,
jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 15:05 ` Jonathan Corbet
@ 2025-06-03 15:14 ` Lorenzo Stoakes
2025-06-03 15:28 ` Jonathan Corbet
0 siblings, 1 reply; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 15:14 UTC (permalink / raw)
To: Jonathan Corbet
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
On Tue, Jun 03, 2025 at 09:05:32AM -0600, Jonathan Corbet wrote:
> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
>
> > Thanks, I appreciate that. So I want to address your concerns as well as I
> > can. I think I have misunderstood you a little bit here too (text is a poor
> > medium, yada yada) so let me try to nail down what I feel is the sensible
> > way forward:
> >
> > 1. Once I am confident I have correctly addressed Jann's feedback I'll
> > respin a v2 with the various 'sins' in place for the time being.
> >
> > 2. I will also drop the 'since v6.14' stuff you rightly raised in this
> > respin.
>
> So far so good
>
> > 3. I will create a follow-up series to address these issues in this file
> > -in general-:
> >
> > - Drop '!' from every reference so we get automated cross-referencing - I
> > think now I understand the point (hopefully!) that Sphinx with
> > automagically link every unique reference to a function/struct/etc. to
> > one another.
>
> If you just drop the "!" you'll run into the "struct" problem you
> mentioned before. You'll need to take out "struct" as well if you go
> this route...
Yeah I will do so...
>
> > - Perhaps hack in a **struct ** prefix so we get the 'best of both worlds'
> > on this for types...?
>
> ...so yes you'd need to do that.
...with this hack as needed :)
>
> > I think my misapprehension about defining functions was not realising that
> > by doing :c:func:etc without the ! would automatically provide that
> > definition upon first reference to that function/struct/etc.?
> >
> > Is that correct/sensible?
> >
> > Would you want me to only use the :c:func: stuff in the _first_ mention of
> > a function and then to not use it from then on?
> >
> > I wonder if the _appropriate_ use of :c:func:...: is in the actual
> > definition, but since it's not really practical to do that right now* is
> > simply doing it upon first mention a sensible 'least worst' approach here?
>
> Here, I think, we've gone a bit off track again. The goal of the
> automarkup code was to *never* need to use the :c:func: markup. Let's
> just say that ... certain members of our community ... found that markup
> entirely intolerable - and, in truth, it is ugly. So I wrote the
> initial automarkup extension; now, any time that the docs build sees
> function(), it looks for documentation for that function and creates a
> cross-reference if that documentation is found.
>
> The goal is that you should never need the :c:gunk: ever.
>
> Thanks,
>
> jon
OK thanks for clarifying, so let's do a take 2 of the action items:
1. Once I am confident I have correctly addressed Jann's feedback I'll
respin a v2 with the various 'sins' in place for the time being.
2. I will also drop the 'since v6.14' stuff you rightly raised in this
respin.
3. I will create a follow-up series to address these issues in this file
-in general-.
4. Drop '!' from every reference so we get automated cross-referencing (with the
** struct ** hack as needed).
5. Where possible see if we have functions documented, and if so avoid the
:c:... noise. If we can't avoid it for now, note down the functions and add
to todo to get documented. We can remove the gunk as we go...
A couple questions on point 5:
- When you say 'documentation', do you mean the /** kernel-doc stuff?
- Does running `make SPHINXDIRS=mm htmldocs` suffice to have this script run? As
this is how I've been previewing my changes so far!
Thanks,
Lorenzo
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 15:14 ` Lorenzo Stoakes
@ 2025-06-03 15:28 ` Jonathan Corbet
0 siblings, 0 replies; 18+ messages in thread
From: Jonathan Corbet @ 2025-06-03 15:28 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jann Horn, Qi Zheng, linux-mm,
linux-doc, linux-kernel
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> writes:
> OK thanks for clarifying, so let's do a take 2 of the action items:
>
> 1. Once I am confident I have correctly addressed Jann's feedback I'll
> respin a v2 with the various 'sins' in place for the time being.
>
> 2. I will also drop the 'since v6.14' stuff you rightly raised in this
> respin.
>
> 3. I will create a follow-up series to address these issues in this file
> -in general-.
>
> 4. Drop '!' from every reference so we get automated cross-referencing (with the
> ** struct ** hack as needed).
>
> 5. Where possible see if we have functions documented, and if so avoid the
> :c:... noise. If we can't avoid it for now, note down the functions and add
> to todo to get documented. We can remove the gunk as we go...
This one I don't get - what is the noise you are talking about? Again,
you shouldn't need :c:func: at all...?
There will surely be functions (and structs) that are not documented;
trying to do them all probably leads to a point of diminishing returns.
But forward progress on the more important ones is always good.
> A couple questions on point 5:
>
> - When you say 'documentation', do you mean the /** kernel-doc stuff?
Yes, that is the documentation that will be cross-referenced.
> - Does running `make SPHINXDIRS=mm htmldocs` suffice to have this script run? As
> this is how I've been previewing my changes so far!
Yes, the automarkup extension runs with any Sphinx build.
Thanks,
jon
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 10:45 ` Lorenzo Stoakes
@ 2025-06-03 18:36 ` Jann Horn
2025-06-03 18:52 ` Lorenzo Stoakes
0 siblings, 1 reply; 18+ messages in thread
From: Jann Horn @ 2025-06-03 18:36 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jonathan Corbet, Qi Zheng,
linux-mm, linux-doc, linux-kernel
On Tue, Jun 3, 2025 at 12:45 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> On Tue, Jun 03, 2025 at 12:25:36AM +0200, Jann Horn wrote:
> > The one way in which I think this is currently kinda yolo/broken is
> > that vmap_try_huge_pud() can end up freeing page tables via
> > pud_free_pmd_page(), while holding no MM locks AFAICS, so that could
> > race with the ptdump debug logic such that ptdump walks into freed
> > page tables?
>
> But those mappings would be kernel mappings? How could ptdump walk into
> those?
/sys/kernel/debug/page_tables/kernel dumps kernel page tables. And I
think /sys/kernel/debug/page_tables/current_kernel dumps page tables
for the entire address space including both userspace and kernel.
> > (I think we should take all the vma locks in that ptdump code and get
> > rid of this weird exception instead of documenting it.)
>
> We really need to be sure that there aren't some weird architectures doing
> weird things or circumstances where this is meaningful.
>
> I mean people went to great lengths to make this possible, I find it hard
> to believe there aren't some _weird_ real world use cases.
FWIW, looking through the git logs for the x86 version of it, it seems
to mainly be used by developers of arch-specific code trying to
debug/validate kernel behavior.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal
2025-06-03 18:36 ` Jann Horn
@ 2025-06-03 18:52 ` Lorenzo Stoakes
0 siblings, 0 replies; 18+ messages in thread
From: Lorenzo Stoakes @ 2025-06-03 18:52 UTC (permalink / raw)
To: Jann Horn
Cc: Andrew Morton, Suren Baghdasaryan, Liam R . Howlett,
Vlastimil Babka, Shakeel Butt, Jonathan Corbet, Qi Zheng,
linux-mm, linux-doc, linux-kernel
On Tue, Jun 03, 2025 at 08:36:13PM +0200, Jann Horn wrote:
> On Tue, Jun 3, 2025 at 12:45 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> > On Tue, Jun 03, 2025 at 12:25:36AM +0200, Jann Horn wrote:
> > > The one way in which I think this is currently kinda yolo/broken is
> > > that vmap_try_huge_pud() can end up freeing page tables via
> > > pud_free_pmd_page(), while holding no MM locks AFAICS, so that could
> > > race with the ptdump debug logic such that ptdump walks into freed
> > > page tables?
> >
> > But those mappings would be kernel mappings? How could ptdump walk into
> > those?
>
> /sys/kernel/debug/page_tables/kernel dumps kernel page tables. And I
> think /sys/kernel/debug/page_tables/current_kernel dumps page tables
> for the entire address space including both userspace and kernel.
Ugh god, so it's checking kernel stuff while passing an mm_struct for the
process?? Or does it pass init_mm in that case? I mean that breaks my proposed
patch right away if so...
And... yes it does. Brilliant.
ptdump_walk_pgd_level_core(NULL, &init_mm, pgd, true, false);
God almighty.
>
> > > (I think we should take all the vma locks in that ptdump code and get
> > > rid of this weird exception instead of documenting it.)
> >
> > We really need to be sure that there aren't some weird architectures doing
> > weird things or circumstances where this is meaningful.
> >
> > I mean people went to great lengths to make this possible, I find it hard
> > to believe there aren't some _weird_ real world use cases.
>
> FWIW, looking through the git logs for the x86 version of it, it seems
> to mainly be used by developers of arch-specific code trying to
> debug/validate kernel behavior.
So yeah they just want something that can traverse both I guess? But the
comments in the code seem to strongly suggest that we must lock carefully...
Maybe this whole thing was literally to make life easier to have the same
function check both kernel and non-kernel mappings. I mean...
OK let me reply to my own series to say 'nope'.
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-06-03 18:52 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-02 21:07 [PATCH] docs/mm: expand vma doc to highlight pte freeing, non-vma traversal Lorenzo Stoakes
2025-06-02 21:38 ` Jonathan Corbet
2025-06-03 10:56 ` Lorenzo Stoakes
2025-06-03 11:24 ` Lorenzo Stoakes
2025-06-03 14:01 ` Jonathan Corbet
2025-06-03 14:11 ` Lorenzo Stoakes
2025-06-03 14:33 ` Jonathan Corbet
2025-06-03 14:08 ` Jonathan Corbet
2025-06-03 14:24 ` Lorenzo Stoakes
2025-06-03 14:37 ` Jonathan Corbet
2025-06-03 14:52 ` Lorenzo Stoakes
2025-06-03 15:05 ` Jonathan Corbet
2025-06-03 15:14 ` Lorenzo Stoakes
2025-06-03 15:28 ` Jonathan Corbet
2025-06-02 22:25 ` Jann Horn
2025-06-03 10:45 ` Lorenzo Stoakes
2025-06-03 18:36 ` Jann Horn
2025-06-03 18:52 ` Lorenzo Stoakes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).