* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
2026-05-01 23:39 [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page() Souvik Banerjee
@ 2026-05-03 17:11 ` Jan Kara
2026-05-04 16:08 ` Dave Jiang
2026-05-08 9:15 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2026-05-03 17:11 UTC (permalink / raw)
To: Souvik Banerjee
Cc: dan.j.williams, willy, jack, apopple, linux-fsdevel, nvdimm,
linux-mm, linux-kernel, stable
On Fri 01-05-26 23:39:33, Souvik Banerjee wrote:
> Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> added zero/empty-entry early returns to dax_associate_entry() and
> dax_disassociate_entry(), but placed them *after* the
> `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> performs READ_ONCE(page->compound_head) -- a real dereference of the
> struct page pointer derived from a bogus PFN extracted from the
> empty/zero XA value.
>
> On systems where vmemmap covers all of RAM, that dereference reads
> garbage but is harmless: the early return then discards the result.
> On virtio-pmem with altmap (vmemmap stored inside the device), only
> the real device PFN range is mapped, so the dereference triggers a
> kernel paging fault from the truncate / invalidate path and from the
> PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> freed:
>
> Unable to handle kernel paging request at
> virtual address ffff_fdff_bf00_0008 (vmemmap region)
> Call trace:
> dax_disassociate_entry.isra.0+0x20/0x50
> dax_iomap_pte_fault
> dax_iomap_fault
> erofs_dax_fault
>
> Close the residual gap by moving the dax_to_folio() call after the
> zero/empty guard in dax_disassociate_entry(). Apply the same
> treatment to dax_busy_page(), which has the identical pattern but
> was not touched by the prior fix.
>
> Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> Cc: stable@vger.kernel.org # v6.15+
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
Yeah, good catch. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/dax.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 6d175cd47a99..6878473265bb 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> static void dax_disassociate_entry(void *entry, struct address_space *mapping,
> bool trunc)
> {
> - struct folio *folio = dax_to_folio(entry);
> + struct folio *folio;
>
> if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> return;
>
> + folio = dax_to_folio(entry);
> dax_folio_put(folio);
> }
>
> static struct page *dax_busy_page(void *entry)
> {
> - struct folio *folio = dax_to_folio(entry);
> + struct folio *folio;
>
> if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> return NULL;
>
> + folio = dax_to_folio(entry);
> if (folio_ref_count(folio) - folio_mapcount(folio))
> return &folio->page;
> else
> --
> 2.51.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 6+ messages in thread

* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
2026-05-01 23:39 [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page() Souvik Banerjee
2026-05-03 17:11 ` Jan Kara
@ 2026-05-04 16:08 ` Dave Jiang
2026-05-08 9:15 ` David Hildenbrand (Arm)
2 siblings, 0 replies; 6+ messages in thread
From: Dave Jiang @ 2026-05-04 16:08 UTC (permalink / raw)
To: Souvik Banerjee, dan.j.williams
Cc: willy, jack, apopple, linux-fsdevel, nvdimm, linux-mm,
linux-kernel, stable
On 5/1/26 4:39 PM, Souvik Banerjee wrote:
> Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> added zero/empty-entry early returns to dax_associate_entry() and
> dax_disassociate_entry(), but placed them *after* the
> `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> performs READ_ONCE(page->compound_head) -- a real dereference of the
> struct page pointer derived from a bogus PFN extracted from the
> empty/zero XA value.
>
> On systems where vmemmap covers all of RAM, that dereference reads
> garbage but is harmless: the early return then discards the result.
> On virtio-pmem with altmap (vmemmap stored inside the device), only
> the real device PFN range is mapped, so the dereference triggers a
> kernel paging fault from the truncate / invalidate path and from the
> PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> freed:
>
> Unable to handle kernel paging request at
> virtual address ffff_fdff_bf00_0008 (vmemmap region)
> Call trace:
> dax_disassociate_entry.isra.0+0x20/0x50
> dax_iomap_pte_fault
> dax_iomap_fault
> erofs_dax_fault
>
> Close the residual gap by moving the dax_to_folio() call after the
> zero/empty guard in dax_disassociate_entry(). Apply the same
> treatment to dax_busy_page(), which has the identical pattern but
> was not touched by the prior fix.
>
> Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> Cc: stable@vger.kernel.org # v6.15+
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> ---
> fs/dax.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 6d175cd47a99..6878473265bb 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> static void dax_disassociate_entry(void *entry, struct address_space *mapping,
> bool trunc)
> {
> - struct folio *folio = dax_to_folio(entry);
> + struct folio *folio;
>
> if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> return;
>
> + folio = dax_to_folio(entry);
> dax_folio_put(folio);
> }
>
> static struct page *dax_busy_page(void *entry)
> {
> - struct folio *folio = dax_to_folio(entry);
> + struct folio *folio;
>
> if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> return NULL;
>
> + folio = dax_to_folio(entry);
> if (folio_ref_count(folio) - folio_mapcount(folio))
> return &folio->page;
> else
^ permalink raw reply [flat|nested] 6+ messages in thread

* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
2026-05-01 23:39 [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page() Souvik Banerjee
2026-05-03 17:11 ` Jan Kara
2026-05-04 16:08 ` Dave Jiang
@ 2026-05-08 9:15 ` David Hildenbrand (Arm)
2026-05-08 16:44 ` Alistair Popple
2 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-08 9:15 UTC (permalink / raw)
To: Souvik Banerjee, dan.j.williams
Cc: willy, jack, apopple, linux-fsdevel, nvdimm, linux-mm,
linux-kernel, stable
On 5/2/26 01:39, Souvik Banerjee wrote:
> Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> added zero/empty-entry early returns to dax_associate_entry() and
> dax_disassociate_entry(), but placed them *after* the
> `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> performs READ_ONCE(page->compound_head) -- a real dereference of the
> struct page pointer derived from a bogus PFN extracted from the
> empty/zero XA value.
>
> On systems where vmemmap covers all of RAM, that dereference reads
> garbage but is harmless: the early return then discards the result.
> On virtio-pmem with altmap (vmemmap stored inside the device), only
> the real device PFN range is mapped, so the dereference triggers a
> kernel paging fault from the truncate / invalidate path and from the
> PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> freed:
>
> Unable to handle kernel paging request at
> virtual address ffff_fdff_bf00_0008 (vmemmap region)
> Call trace:
> dax_disassociate_entry.isra.0+0x20/0x50
> dax_iomap_pte_fault
> dax_iomap_fault
> erofs_dax_fault
>
> Close the residual gap by moving the dax_to_folio() call after the
> zero/empty guard in dax_disassociate_entry(). Apply the same
> treatment to dax_busy_page(), which has the identical pattern but
> was not touched by the prior fix.
>
> Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> Cc: stable@vger.kernel.org # v6.15+
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
> ---
> fs/dax.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 6d175cd47a99..6878473265bb 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> static void dax_disassociate_entry(void *entry, struct address_space *mapping,
> bool trunc)
> {
> - struct folio *folio = dax_to_folio(entry);
> + struct folio *folio;
>
> if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> return;
>
> + folio = dax_to_folio(entry);
> dax_folio_put(folio);
> }
>
> static struct page *dax_busy_page(void *entry)
> {
> - struct folio *folio = dax_to_folio(entry);
> + struct folio *folio;
>
> if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> return NULL;
>
> + folio = dax_to_folio(entry);
> if (folio_ref_count(folio) - folio_mapcount(folio))
> return &folio->page;
> else
Makes perfect sense to me.
What about the usage in dax_associate_entry()?
--
Cheers,
David
^ permalink raw reply [flat|nested] 6+ messages in thread

* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
2026-05-08 9:15 ` David Hildenbrand (Arm)
@ 2026-05-08 16:44 ` Alistair Popple
2026-05-11 21:40 ` Souvik Banerjee
0 siblings, 1 reply; 6+ messages in thread
From: Alistair Popple @ 2026-05-08 16:44 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Souvik Banerjee, dan.j.williams, willy, jack, linux-fsdevel,
nvdimm, linux-mm, linux-kernel, stable
On 2026-05-08 at 19:15 +1000, "David Hildenbrand (Arm)" <david@kernel.org> wrote...
> On 5/2/26 01:39, Souvik Banerjee wrote:
> > Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > added zero/empty-entry early returns to dax_associate_entry() and
> > dax_disassociate_entry(), but placed them *after* the
> > `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> > expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> > performs READ_ONCE(page->compound_head) -- a real dereference of the
> > struct page pointer derived from a bogus PFN extracted from the
> > empty/zero XA value.
> >
> > On systems where vmemmap covers all of RAM, that dereference reads
> > garbage but is harmless: the early return then discards the result.
> > On virtio-pmem with altmap (vmemmap stored inside the device), only
> > the real device PFN range is mapped, so the dereference triggers a
> > kernel paging fault from the truncate / invalidate path and from the
> > PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> > freed:
> >
> > Unable to handle kernel paging request at
> > virtual address ffff_fdff_bf00_0008 (vmemmap region)
> > Call trace:
> > dax_disassociate_entry.isra.0+0x20/0x50
> > dax_iomap_pte_fault
> > dax_iomap_fault
> > erofs_dax_fault
> >
> > Close the residual gap by moving the dax_to_folio() call after the
> > zero/empty guard in dax_disassociate_entry(). Apply the same
> > treatment to dax_busy_page(), which has the identical pattern but
> > was not touched by the prior fix.
> >
> > Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> > Cc: stable@vger.kernel.org # v6.15+
> > Cc: Alistair Popple <apopple@nvidia.com>
Thanks for fixing this.
> > Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
> > ---
> > fs/dax.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 6d175cd47a99..6878473265bb 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> > static void dax_disassociate_entry(void *entry, struct address_space *mapping,
> > bool trunc)
> > {
> > - struct folio *folio = dax_to_folio(entry);
> > + struct folio *folio;
> >
> > if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> > return;
> >
> > + folio = dax_to_folio(entry);
> > dax_folio_put(folio);
> > }
> >
> > static struct page *dax_busy_page(void *entry)
> > {
> > - struct folio *folio = dax_to_folio(entry);
> > + struct folio *folio;
> >
> > if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> > return NULL;
> >
> > + folio = dax_to_folio(entry);
> > if (folio_ref_count(folio) - folio_mapcount(folio))
> > return &folio->page;
> > else
>
> Makes perfect sense to me.
>
>
> What about the usage in dax_associate_entry()?
Pretty sure the issue exists there too, given that this code path shows we can
pass zero/empty entries there as well:
if (shared || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) {
void *old;
dax_disassociate_entry(entry, mapping, false);
dax_associate_entry(new_entry, mapping, vmf->vma,
vmf->address, shared);
- Alistair
> --
> Cheers,
>
> David
^ permalink raw reply [flat|nested] 6+ messages in thread

* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
2026-05-08 16:44 ` Alistair Popple
@ 2026-05-11 21:40 ` Souvik Banerjee
0 siblings, 0 replies; 6+ messages in thread
From: Souvik Banerjee @ 2026-05-11 21:40 UTC (permalink / raw)
To: Alistair Popple
Cc: David Hildenbrand (Arm), dan.j.williams, willy, jack,
linux-fsdevel, nvdimm, linux-mm, linux-kernel, stable
Thanks for the review, will send a v2 with this feedback.
Souvik Banerjee
On Fri, May 8, 2026 at 9:44 AM Alistair Popple <apopple@nvidia.com> wrote:
>
> On 2026-05-08 at 19:15 +1000, "David Hildenbrand (Arm)" <david@kernel.org> wrote...
> > On 5/2/26 01:39, Souvik Banerjee wrote:
> > > Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > > added zero/empty-entry early returns to dax_associate_entry() and
> > > dax_disassociate_entry(), but placed them *after* the
> > > `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> > > expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> > > performs READ_ONCE(page->compound_head) -- a real dereference of the
> > > struct page pointer derived from a bogus PFN extracted from the
> > > empty/zero XA value.
> > >
> > > On systems where vmemmap covers all of RAM, that dereference reads
> > > garbage but is harmless: the early return then discards the result.
> > > On virtio-pmem with altmap (vmemmap stored inside the device), only
> > > the real device PFN range is mapped, so the dereference triggers a
> > > kernel paging fault from the truncate / invalidate path and from the
> > > PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> > > freed:
> > >
> > > Unable to handle kernel paging request at
> > > virtual address ffff_fdff_bf00_0008 (vmemmap region)
> > > Call trace:
> > > dax_disassociate_entry.isra.0+0x20/0x50
> > > dax_iomap_pte_fault
> > > dax_iomap_fault
> > > erofs_dax_fault
> > >
> > > Close the residual gap by moving the dax_to_folio() call after the
> > > zero/empty guard in dax_disassociate_entry(). Apply the same
> > > treatment to dax_busy_page(), which has the identical pattern but
> > > was not touched by the prior fix.
> > >
> > > Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > > Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> > > Cc: stable@vger.kernel.org # v6.15+
> > > Cc: Alistair Popple <apopple@nvidia.com>
>
> Thanks for fixing this.
>
> > > Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
> > > ---
> > > fs/dax.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/dax.c b/fs/dax.c
> > > index 6d175cd47a99..6878473265bb 100644
> > > --- a/fs/dax.c
> > > +++ b/fs/dax.c
> > > @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> > > static void dax_disassociate_entry(void *entry, struct address_space *mapping,
> > > bool trunc)
> > > {
> > > - struct folio *folio = dax_to_folio(entry);
> > > + struct folio *folio;
> > >
> > > if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> > > return;
> > >
> > > + folio = dax_to_folio(entry);
> > > dax_folio_put(folio);
> > > }
> > >
> > > static struct page *dax_busy_page(void *entry)
> > > {
> > > - struct folio *folio = dax_to_folio(entry);
> > > + struct folio *folio;
> > >
> > > if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> > > return NULL;
> > >
> > > + folio = dax_to_folio(entry);
> > > if (folio_ref_count(folio) - folio_mapcount(folio))
> > > return &folio->page;
> > > else
> >
> > Makes perfect sense to me.
> >
> >
> > What about the usage in dax_associate_entry()?
>
> Pretty sure the issue exists there too, given that this code path shows we can
> pass zero/empty entries there as well:
>
> if (shared || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) {
> void *old;
>
> dax_disassociate_entry(entry, mapping, false);
> dax_associate_entry(new_entry, mapping, vmf->vma,
> vmf->address, shared);
>
> - Alistair
>
> > --
> > Cheers,
> >
> > David
^ permalink raw reply [flat|nested] 6+ messages in thread