* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
  [not found] <20260501233933.2614302-1-souvik@amlalabs.com>
@ 2026-05-04 16:08 ` Dave Jiang
  2026-05-08  9:15 ` David Hildenbrand (Arm)
  1 sibling, 0 replies; 4+ messages in thread
From: Dave Jiang @ 2026-05-04 16:08 UTC (permalink / raw)
To: Souvik Banerjee, dan.j.williams
Cc: willy, jack, apopple, linux-fsdevel, nvdimm, linux-mm, linux-kernel,
	stable

On 5/1/26 4:39 PM, Souvik Banerjee wrote:
> Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> added zero/empty-entry early returns to dax_associate_entry() and
> dax_disassociate_entry(), but placed them *after* the
> `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> performs READ_ONCE(page->compound_head) -- a real dereference of the
> struct page pointer derived from a bogus PFN extracted from the
> empty/zero XA value.
>
> On systems where vmemmap covers all of RAM that dereference reads
> garbage and is harmless: the early return then discards the result.
> On virtio-pmem with altmap (vmemmap stored inside the device), only
> the real device PFN range is mapped, so the dereference triggers a
> kernel paging fault from the truncate / invalidate path and from the
> PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> freed:
>
>   Unable to handle kernel paging request at
>   virtual address ffff_fdff_bf00_0008 (vmemmap region)
>   Call trace:
>    dax_disassociate_entry.isra.0+0x20/0x50
>    dax_iomap_pte_fault
>    dax_iomap_fault
>    erofs_dax_fault
>
> Close the residual gap by moving the dax_to_folio() call after the
> zero/empty guard in dax_disassociate_entry(). Apply the same
> treatment to dax_busy_page(), which has the identical pattern but
> was not touched by the prior fix.
>
> Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> Cc: stable@vger.kernel.org # v6.15+
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> ---
>  fs/dax.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 6d175cd47a99..6878473265bb 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
>  static void dax_disassociate_entry(void *entry, struct address_space *mapping,
>  		bool trunc)
>  {
> -	struct folio *folio = dax_to_folio(entry);
> +	struct folio *folio;
>
>  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
>  		return;
>
> +	folio = dax_to_folio(entry);
>  	dax_folio_put(folio);
>  }
>
>  static struct page *dax_busy_page(void *entry)
>  {
> -	struct folio *folio = dax_to_folio(entry);
> +	struct folio *folio;
>
>  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
>  		return NULL;
>
> +	folio = dax_to_folio(entry);
>  	if (folio_ref_count(folio) - folio_mapcount(folio))
>  		return &folio->page;
>  	else

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
  [not found] <20260501233933.2614302-1-souvik@amlalabs.com>
  2026-05-04 16:08 ` [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page() Dave Jiang
@ 2026-05-08  9:15 ` David Hildenbrand (Arm)
  2026-05-08 16:44   ` Alistair Popple
  1 sibling, 1 reply; 4+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-08  9:15 UTC (permalink / raw)
To: Souvik Banerjee, dan.j.williams
Cc: willy, jack, apopple, linux-fsdevel, nvdimm, linux-mm, linux-kernel,
	stable

On 5/2/26 01:39, Souvik Banerjee wrote:
> Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> added zero/empty-entry early returns to dax_associate_entry() and
> dax_disassociate_entry(), but placed them *after* the
> `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> performs READ_ONCE(page->compound_head) -- a real dereference of the
> struct page pointer derived from a bogus PFN extracted from the
> empty/zero XA value.
>
> On systems where vmemmap covers all of RAM that dereference reads
> garbage and is harmless: the early return then discards the result.
> On virtio-pmem with altmap (vmemmap stored inside the device), only
> the real device PFN range is mapped, so the dereference triggers a
> kernel paging fault from the truncate / invalidate path and from the
> PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> freed:
>
>   Unable to handle kernel paging request at
>   virtual address ffff_fdff_bf00_0008 (vmemmap region)
>   Call trace:
>    dax_disassociate_entry.isra.0+0x20/0x50
>    dax_iomap_pte_fault
>    dax_iomap_fault
>    erofs_dax_fault
>
> Close the residual gap by moving the dax_to_folio() call after the
> zero/empty guard in dax_disassociate_entry(). Apply the same
> treatment to dax_busy_page(), which has the identical pattern but
> was not touched by the prior fix.
>
> Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> Cc: stable@vger.kernel.org # v6.15+
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
> ---
>  fs/dax.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 6d175cd47a99..6878473265bb 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
>  static void dax_disassociate_entry(void *entry, struct address_space *mapping,
>  		bool trunc)
>  {
> -	struct folio *folio = dax_to_folio(entry);
> +	struct folio *folio;
>
>  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
>  		return;
>
> +	folio = dax_to_folio(entry);
>  	dax_folio_put(folio);
>  }
>
>  static struct page *dax_busy_page(void *entry)
>  {
> -	struct folio *folio = dax_to_folio(entry);
> +	struct folio *folio;
>
>  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
>  		return NULL;
>
> +	folio = dax_to_folio(entry);
>  	if (folio_ref_count(folio) - folio_mapcount(folio))
>  		return &folio->page;
>  	else

Makes perfect sense to me.

What about the usage in dax_associate_entry()?

-- 
Cheers,

David

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
  2026-05-08  9:15 ` David Hildenbrand (Arm)
@ 2026-05-08 16:44   ` Alistair Popple
  2026-05-11 21:40     ` Souvik Banerjee
  0 siblings, 1 reply; 4+ messages in thread
From: Alistair Popple @ 2026-05-08 16:44 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Souvik Banerjee, dan.j.williams, willy, jack, linux-fsdevel, nvdimm,
	linux-mm, linux-kernel, stable

On 2026-05-08 at 19:15 +1000, "David Hildenbrand (Arm)" <david@kernel.org> wrote...
> On 5/2/26 01:39, Souvik Banerjee wrote:
> > Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > added zero/empty-entry early returns to dax_associate_entry() and
> > dax_disassociate_entry(), but placed them *after* the
> > `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> > expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> > performs READ_ONCE(page->compound_head) -- a real dereference of the
> > struct page pointer derived from a bogus PFN extracted from the
> > empty/zero XA value.
> >
> > On systems where vmemmap covers all of RAM that dereference reads
> > garbage and is harmless: the early return then discards the result.
> > On virtio-pmem with altmap (vmemmap stored inside the device), only
> > the real device PFN range is mapped, so the dereference triggers a
> > kernel paging fault from the truncate / invalidate path and from the
> > PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> > freed:
> >
> >   Unable to handle kernel paging request at
> >   virtual address ffff_fdff_bf00_0008 (vmemmap region)
> >   Call trace:
> >    dax_disassociate_entry.isra.0+0x20/0x50
> >    dax_iomap_pte_fault
> >    dax_iomap_fault
> >    erofs_dax_fault
> >
> > Close the residual gap by moving the dax_to_folio() call after the
> > zero/empty guard in dax_disassociate_entry(). Apply the same
> > treatment to dax_busy_page(), which has the identical pattern but
> > was not touched by the prior fix.
> >
> > Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> > Cc: stable@vger.kernel.org # v6.15+
> > Cc: Alistair Popple <apopple@nvidia.com>

Thanks for fixing this.

> > Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
> > ---
> >  fs/dax.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 6d175cd47a99..6878473265bb 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> >  static void dax_disassociate_entry(void *entry, struct address_space *mapping,
> >  		bool trunc)
> >  {
> > -	struct folio *folio = dax_to_folio(entry);
> > +	struct folio *folio;
> >
> >  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> >  		return;
> >
> > +	folio = dax_to_folio(entry);
> >  	dax_folio_put(folio);
> >  }
> >
> >  static struct page *dax_busy_page(void *entry)
> >  {
> > -	struct folio *folio = dax_to_folio(entry);
> > +	struct folio *folio;
> >
> >  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> >  		return NULL;
> >
> > +	folio = dax_to_folio(entry);
> >  	if (folio_ref_count(folio) - folio_mapcount(folio))
> >  		return &folio->page;
> >  	else
>
> Makes perfect sense to me.
>
> What about the usage in dax_associate_entry()?

Pretty sure the issue exists there as well given this code path implies we could
pass zero/empty entries there as well:

	if (shared || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) {
		void *old;

		dax_disassociate_entry(entry, mapping, false);
		dax_associate_entry(new_entry, mapping, vmf->vma,
				vmf->address, shared);

 - Alistair

> -- 
> Cheers,
>
> David

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page()
  2026-05-08 16:44   ` Alistair Popple
@ 2026-05-11 21:40     ` Souvik Banerjee
  0 siblings, 0 replies; 4+ messages in thread
From: Souvik Banerjee @ 2026-05-11 21:40 UTC (permalink / raw)
To: Alistair Popple
Cc: David Hildenbrand (Arm), dan.j.williams, willy, jack, linux-fsdevel,
	nvdimm, linux-mm, linux-kernel, stable

Thanks for the review, will send a v2 with this feedback.

Souvik Banerjee

On Fri, May 8, 2026 at 9:44 AM Alistair Popple <apopple@nvidia.com> wrote:
>
> On 2026-05-08 at 19:15 +1000, "David Hildenbrand (Arm)" <david@kernel.org> wrote...
> > On 5/2/26 01:39, Souvik Banerjee wrote:
> > > Commit 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > > added zero/empty-entry early returns to dax_associate_entry() and
> > > dax_disassociate_entry(), but placed them *after* the
> > > `struct folio *folio = dax_to_folio(entry);` line. dax_to_folio()
> > > expands to page_folio(pfn_to_page(dax_to_pfn(entry))), and page_folio()
> > > performs READ_ONCE(page->compound_head) -- a real dereference of the
> > > struct page pointer derived from a bogus PFN extracted from the
> > > empty/zero XA value.
> > >
> > > On systems where vmemmap covers all of RAM that dereference reads
> > > garbage and is harmless: the early return then discards the result.
> > > On virtio-pmem with altmap (vmemmap stored inside the device), only
> > > the real device PFN range is mapped, so the dereference triggers a
> > > kernel paging fault from the truncate / invalidate path and from the
> > > PMD-downgrade branch of dax_iomap_pte_fault when an entry is being
> > > freed:
> > >
> > >   Unable to handle kernel paging request at
> > >   virtual address ffff_fdff_bf00_0008 (vmemmap region)
> > >   Call trace:
> > >    dax_disassociate_entry.isra.0+0x20/0x50
> > >    dax_iomap_pte_fault
> > >    dax_iomap_fault
> > >    erofs_dax_fault
> > >
> > > Close the residual gap by moving the dax_to_folio() call after the
> > > zero/empty guard in dax_disassociate_entry(). Apply the same
> > > treatment to dax_busy_page(), which has the identical pattern but
> > > was not touched by the prior fix.
> > >
> > > Fixes: 98c183a4fccf ("fs/dax: don't disassociate zero page entries")
> > > Fixes: 38607c62b34b ("fs/dax: properly refcount fs dax pages")
> > > Cc: stable@vger.kernel.org # v6.15+
> > > Cc: Alistair Popple <apopple@nvidia.com>
>
> Thanks for fixing this.
>
> > > Signed-off-by: Souvik Banerjee <souvik@amlalabs.com>
> > > ---
> > >  fs/dax.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/dax.c b/fs/dax.c
> > > index 6d175cd47a99..6878473265bb 100644
> > > --- a/fs/dax.c
> > > +++ b/fs/dax.c
> > > @@ -505,21 +505,23 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
> > >  static void dax_disassociate_entry(void *entry, struct address_space *mapping,
> > >  		bool trunc)
> > >  {
> > > -	struct folio *folio = dax_to_folio(entry);
> > > +	struct folio *folio;
> > >
> > >  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> > >  		return;
> > >
> > > +	folio = dax_to_folio(entry);
> > >  	dax_folio_put(folio);
> > >  }
> > >
> > >  static struct page *dax_busy_page(void *entry)
> > >  {
> > > -	struct folio *folio = dax_to_folio(entry);
> > > +	struct folio *folio;
> > >
> > >  	if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry))
> > >  		return NULL;
> > >
> > > +	folio = dax_to_folio(entry);
> > >  	if (folio_ref_count(folio) - folio_mapcount(folio))
> > >  		return &folio->page;
> > >  	else
> >
> > Makes perfect sense to me.
> >
> > What about the usage in dax_associate_entry()?
>
> Pretty sure the issue exists there as well given this code path implies we could
> pass zero/empty entries there as well:
>
> 	if (shared || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) {
> 		void *old;
>
> 		dax_disassociate_entry(entry, mapping, false);
> 		dax_associate_entry(new_entry, mapping, vmf->vma,
> 				vmf->address, shared);
>
>  - Alistair
>
> > -- 
> > Cheers,
> >
> > David

^ permalink raw reply	[flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-11 21:41 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20260501233933.2614302-1-souvik@amlalabs.com>
2026-05-04 16:08 ` [PATCH] fs/dax: check for empty/zero entries before calling pfn_to_page() Dave Jiang
2026-05-08 9:15 ` David Hildenbrand (Arm)
2026-05-08 16:44 ` Alistair Popple
2026-05-11 21:40 ` Souvik Banerjee