* [PATCH] drm/panfrost: Fix a page leak in panfrost_mmu_map_fault_addr() when THP is on
@ 2026-01-08 10:10 Boris Brezillon
2026-01-08 10:28 ` Steven Price
0 siblings, 1 reply; 5+ messages in thread
From: Boris Brezillon @ 2026-01-08 10:10 UTC (permalink / raw)
To: Steven Price, Adrián Larumbe, Boris Brezillon, Rob Herring
Cc: dri-devel, kernel, Loïc Molinari
drm_gem_put_pages(), which we rely on for returning BO pages to shmem,
assume per-folio refcounting and not per-page. If we call
shmem_read_mapping_page() per-page, we break this assumption and leak
pages every time we get a huge page allocated.
Cc: Loïc Molinari <loic.molinari@collabora.com>
Fixes: c12e9fcb5a5a ("drm/panfrost: Introduce huge tmpfs mountpoint option")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 37 +++++++++++++++++++++----
1 file changed, 31 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 8f3b7a7b6ad0..9b61ad98a77e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -595,6 +595,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
pgoff_t page_offset;
struct sg_table *sgt;
struct page **pages;
+ u32 nr_pages;
bomapping = addr_to_mapping(pfdev, as, addr);
if (!bomapping)
@@ -613,6 +614,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
addr &= ~((u64)SZ_2M - 1);
page_offset = addr >> PAGE_SHIFT;
page_offset -= bomapping->mmnode.start;
+ nr_pages = bo->base.base.size >> PAGE_SHIFT;
obj = &bo->base.base;
@@ -626,8 +628,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
goto err_unlock;
}
- pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
- sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
+ pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
if (!pages) {
kvfree(bo->sgts);
bo->sgts = NULL;
@@ -650,6 +651,8 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
mapping_set_unevictable(mapping);
for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
+ struct folio *folio;
+
/* Can happen if the last fault only partially filled this
* section of the pages array before failing. In that case
* we skip already filled pages.
@@ -657,12 +660,34 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
if (pages[i])
continue;
- pages[i] = shmem_read_mapping_page(mapping, i);
- if (IS_ERR(pages[i])) {
- ret = PTR_ERR(pages[i]);
- pages[i] = NULL;
+ folio = shmem_read_folio(mapping, i);
+ if (IS_ERR(folio)) {
+ ret = PTR_ERR(folio);
goto err_unlock;
}
+
+ /* We always fill the page array at a folio granularity so
+ * there's no reason for a missing page to not be the first
+ * in the folio. We want to ensure that's the case to avoid
+ * unbalanced folio_{get,put}() leading to leaks, because
+ * drm_gem_put_pages() assumes per-folio refcounting not
+ * per-page.
+ */
+ if (WARN_ON(folio_file_page(folio, i) != folio_page(folio, 0))) {
+ folio_put(folio);
+ ret = -EINVAL;
+ goto err_unlock;
+ }
+
+ /* Fill a folio worth of pages, even if it goes beyond
+ * NUM_FAULT_PAGES, otherwise we'll end up with unbalanced
+ * refcounting next time we hit an unmapped section crossing
+ * this folio.
+ */
+ for (u32 j = 0; j < folio_nr_pages(folio) && i < nr_pages; j++)
+ pages[i + j] = folio_page(folio, j);
+
+ i += folio_nr_pages(folio) - 1;
}
ret = sg_alloc_table_from_pages(sgt, pages + page_offset,
--
2.52.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/panfrost: Fix a page leak in panfrost_mmu_map_fault_addr() when THP is on
2026-01-08 10:10 [PATCH] drm/panfrost: Fix a page leak in panfrost_mmu_map_fault_addr() when THP is on Boris Brezillon
@ 2026-01-08 10:28 ` Steven Price
2026-01-08 10:42 ` Boris Brezillon
2026-01-08 10:43 ` Boris Brezillon
0 siblings, 2 replies; 5+ messages in thread
From: Steven Price @ 2026-01-08 10:28 UTC (permalink / raw)
To: Boris Brezillon, Adrián Larumbe, Rob Herring
Cc: dri-devel, kernel, Loïc Molinari
Hi Boris,
Happy new year!
On 08/01/2026 10:10, Boris Brezillon wrote:
> drm_gem_put_pages(), which we rely on for returning BO pages to shmem,
> assume per-folio refcounting and not per-page. If we call
> shmem_read_mapping_page() per-page, we break this assumption and leak
> pages every time we get a huge page allocated.
>
> Cc: Loïc Molinari <loic.molinari@collabora.com>
> Fixes: c12e9fcb5a5a ("drm/panfrost: Introduce huge tmpfs mountpoint option")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 37 +++++++++++++++++++++----
> 1 file changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 8f3b7a7b6ad0..9b61ad98a77e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -595,6 +595,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> pgoff_t page_offset;
> struct sg_table *sgt;
> struct page **pages;
> + u32 nr_pages;
>
> bomapping = addr_to_mapping(pfdev, as, addr);
> if (!bomapping)
> @@ -613,6 +614,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> addr &= ~((u64)SZ_2M - 1);
> page_offset = addr >> PAGE_SHIFT;
> page_offset -= bomapping->mmnode.start;
> + nr_pages = bo->base.base.size >> PAGE_SHIFT;
>
> obj = &bo->base.base;
>
> @@ -626,8 +628,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> goto err_unlock;
> }
>
> - pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
> - sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> + pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> if (!pages) {
> kvfree(bo->sgts);
> bo->sgts = NULL;
> @@ -650,6 +651,8 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> mapping_set_unevictable(mapping);
>
> for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> + struct folio *folio;
> +
> /* Can happen if the last fault only partially filled this
> * section of the pages array before failing. In that case
> * we skip already filled pages.
> @@ -657,12 +660,34 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> if (pages[i])
> continue;
>
> - pages[i] = shmem_read_mapping_page(mapping, i);
> - if (IS_ERR(pages[i])) {
> - ret = PTR_ERR(pages[i]);
> - pages[i] = NULL;
> + folio = shmem_read_folio(mapping, i);
> + if (IS_ERR(folio)) {
> + ret = PTR_ERR(folio);
> goto err_unlock;
> }
> +
> + /* We always fill the page array at a folio granularity so
> + * there's no reason for a missing page to not be the first
> + * in the folio. We want to ensure that's the case to avoid
> + * unbalanced folio_{get,put}() leading to leaks, because
> + * drm_gem_put_pages() assumes per-folio refcounting not
> + * per-page.
> + */
I'm a little uneasy about this reasoning. Above we do mask the address
to be a multiple of 2MB, but the folio could (in theory at least) be
bigger than 2MB. So I don't see what stops a GPU job faulting in the
middle of a buffer and triggering this warning.
Can we not walk backwards to the beginning of the folio if the address
isn't aligned and check that?
> + if (WARN_ON(folio_file_page(folio, i) != folio_page(folio, 0))) {
> + folio_put(folio);
> + ret = -EINVAL;
> + goto err_unlock;
> + }
> +
> + /* Fill a folio worth of pages, even if it goes beyond
> + * NUM_FAULT_PAGES, otherwise we'll end up with unbalanced
> + * refcounting next time we hit an unmapped section crossing
> + * this folio.
> + */
> + for (u32 j = 0; j < folio_nr_pages(folio) && i < nr_pages; j++)
This "i < nr_pages" check is wrong - presumably it should be "i + j <
nr_pages".
> + pages[i + j] = folio_page(folio, j);
> +
> + i += folio_nr_pages(folio) - 1;
I feel like the outer for() loop needs reworking into a different form.
It makes complete sense to walk in terms of folios, but we've got this
weird mix of pages and folios going on, leading to odd things like this
"- 1" fix up here.
Thanks,
Steve
> }
>
> ret = sg_alloc_table_from_pages(sgt, pages + page_offset,
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/panfrost: Fix a page leak in panfrost_mmu_map_fault_addr() when THP is on
2026-01-08 10:28 ` Steven Price
@ 2026-01-08 10:42 ` Boris Brezillon
2026-01-08 11:48 ` Boris Brezillon
2026-01-08 10:43 ` Boris Brezillon
1 sibling, 1 reply; 5+ messages in thread
From: Boris Brezillon @ 2026-01-08 10:42 UTC (permalink / raw)
To: Steven Price
Cc: Adrián Larumbe, Rob Herring, dri-devel, kernel,
Loïc Molinari
On Thu, 8 Jan 2026 10:28:00 +0000
Steven Price <steven.price@arm.com> wrote:
> Hi Boris,
>
> Happy new year!
>
> On 08/01/2026 10:10, Boris Brezillon wrote:
> > drm_gem_put_pages(), which we rely on for returning BO pages to shmem,
> > assume per-folio refcounting and not per-page. If we call
> > shmem_read_mapping_page() per-page, we break this assumption and leak
> > pages every time we get a huge page allocated.
> >
> > Cc: Loïc Molinari <loic.molinari@collabora.com>
> > Fixes: c12e9fcb5a5a ("drm/panfrost: Introduce huge tmpfs mountpoint option")
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> > drivers/gpu/drm/panfrost/panfrost_mmu.c | 37 +++++++++++++++++++++----
> > 1 file changed, 31 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > index 8f3b7a7b6ad0..9b61ad98a77e 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > @@ -595,6 +595,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > pgoff_t page_offset;
> > struct sg_table *sgt;
> > struct page **pages;
> > + u32 nr_pages;
> >
> > bomapping = addr_to_mapping(pfdev, as, addr);
> > if (!bomapping)
> > @@ -613,6 +614,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > addr &= ~((u64)SZ_2M - 1);
> > page_offset = addr >> PAGE_SHIFT;
> > page_offset -= bomapping->mmnode.start;
> > + nr_pages = bo->base.base.size >> PAGE_SHIFT;
> >
> > obj = &bo->base.base;
> >
> > @@ -626,8 +628,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > goto err_unlock;
> > }
> >
> > - pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
> > - sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > + pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > if (!pages) {
> > kvfree(bo->sgts);
> > bo->sgts = NULL;
> > @@ -650,6 +651,8 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > mapping_set_unevictable(mapping);
> >
> > for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> > + struct folio *folio;
> > +
> > /* Can happen if the last fault only partially filled this
> > * section of the pages array before failing. In that case
> > * we skip already filled pages.
> > @@ -657,12 +660,34 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > if (pages[i])
> > continue;
> >
> > - pages[i] = shmem_read_mapping_page(mapping, i);
> > - if (IS_ERR(pages[i])) {
> > - ret = PTR_ERR(pages[i]);
> > - pages[i] = NULL;
> > + folio = shmem_read_folio(mapping, i);
> > + if (IS_ERR(folio)) {
> > + ret = PTR_ERR(folio);
> > goto err_unlock;
> > }
> > +
> > + /* We always fill the page array at a folio granularity so
> > + * there's no reason for a missing page to not be the first
> > + * in the folio. We want to ensure that's the case to avoid
> > + * unbalanced folio_{get,put}() leading to leaks, because
> > + * drm_gem_put_pages() assumes per-folio refcounting not
> > + * per-page.
> > + */
>
> I'm a little uneasy about this reasoning. Above we do mask the address
> to be a multiple of 2MB, but the folio could (in theory at least) be
> bigger than 2MB. So I don't see what stops a GPU job faulting in the
> middle of a buffer and triggering this warning.
>
> Can we not walk backwards to the beginning of the folio if the address
> isn't aligned and check that?
Yep, I assumed the heap buffer would be faulted from bottom to top
(which I think is a valid assumption for the tiler, but I might be
wrong). I can certainly revisit the logic to map on both side if we're
in the middle of a folio and drop this WARN_ON().
>
> > + if (WARN_ON(folio_file_page(folio, i) != folio_page(folio, 0))) {
> > + folio_put(folio);
> > + ret = -EINVAL;
> > + goto err_unlock;
> > + }
> > +
> > + /* Fill a folio worth of pages, even if it goes beyond
> > + * NUM_FAULT_PAGES, otherwise we'll end up with unbalanced
> > + * refcounting next time we hit an unmapped section crossing
> > + * this folio.
> > + */
> > + for (u32 j = 0; j < folio_nr_pages(folio) && i < nr_pages; j++)
>
> This "i < nr_pages" check is wrong - presumably it should be "i + j <
> nr_pages".
Oops, yep. It comes from a version where I was doing
pages[i++] = folio_pages(folio, j);
>
> > + pages[i + j] = folio_page(folio, j);
> > +
> > + i += folio_nr_pages(folio) - 1;
>
> I feel like the outer for() loop needs reworking into a different form.
> It makes complete sense to walk in terms of folios, but we've got this
> weird mix of pages and folios going on, leading to odd things like this
> "- 1" fix up here.
Yep, I agree. I was initially trying to keep the diff as small as
possible, but it's a bit messy as it is.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/panfrost: Fix a page leak in panfrost_mmu_map_fault_addr() when THP is on
2026-01-08 10:28 ` Steven Price
2026-01-08 10:42 ` Boris Brezillon
@ 2026-01-08 10:43 ` Boris Brezillon
1 sibling, 0 replies; 5+ messages in thread
From: Boris Brezillon @ 2026-01-08 10:43 UTC (permalink / raw)
To: Steven Price
Cc: Adrián Larumbe, Rob Herring, dri-devel, kernel,
Loïc Molinari
On Thu, 8 Jan 2026 10:28:00 +0000
Steven Price <steven.price@arm.com> wrote:
> Hi Boris,
>
> Happy new year!
Happy new year too!
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] drm/panfrost: Fix a page leak in panfrost_mmu_map_fault_addr() when THP is on
2026-01-08 10:42 ` Boris Brezillon
@ 2026-01-08 11:48 ` Boris Brezillon
0 siblings, 0 replies; 5+ messages in thread
From: Boris Brezillon @ 2026-01-08 11:48 UTC (permalink / raw)
To: Steven Price
Cc: Adrián Larumbe, Rob Herring, dri-devel, kernel,
Loïc Molinari
On Thu, 8 Jan 2026 11:42:32 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:
> On Thu, 8 Jan 2026 10:28:00 +0000
> Steven Price <steven.price@arm.com> wrote:
>
> > Hi Boris,
> >
> > Happy new year!
> >
> > On 08/01/2026 10:10, Boris Brezillon wrote:
> > > drm_gem_put_pages(), which we rely on for returning BO pages to shmem,
> > > assume per-folio refcounting and not per-page. If we call
> > > shmem_read_mapping_page() per-page, we break this assumption and leak
> > > pages every time we get a huge page allocated.
> > >
> > > Cc: Loïc Molinari <loic.molinari@collabora.com>
> > > Fixes: c12e9fcb5a5a ("drm/panfrost: Introduce huge tmpfs mountpoint option")
> > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > ---
> > > drivers/gpu/drm/panfrost/panfrost_mmu.c | 37 +++++++++++++++++++++----
> > > 1 file changed, 31 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > index 8f3b7a7b6ad0..9b61ad98a77e 100644
> > > --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> > > @@ -595,6 +595,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > > pgoff_t page_offset;
> > > struct sg_table *sgt;
> > > struct page **pages;
> > > + u32 nr_pages;
> > >
> > > bomapping = addr_to_mapping(pfdev, as, addr);
> > > if (!bomapping)
> > > @@ -613,6 +614,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > > addr &= ~((u64)SZ_2M - 1);
> > > page_offset = addr >> PAGE_SHIFT;
> > > page_offset -= bomapping->mmnode.start;
> > > + nr_pages = bo->base.base.size >> PAGE_SHIFT;
> > >
> > > obj = &bo->base.base;
> > >
> > > @@ -626,8 +628,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > > goto err_unlock;
> > > }
> > >
> > > - pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
> > > - sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > > + pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
> > > if (!pages) {
> > > kvfree(bo->sgts);
> > > bo->sgts = NULL;
> > > @@ -650,6 +651,8 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > > mapping_set_unevictable(mapping);
> > >
> > > for (i = page_offset; i < page_offset + NUM_FAULT_PAGES; i++) {
> > > + struct folio *folio;
> > > +
> > > /* Can happen if the last fault only partially filled this
> > > * section of the pages array before failing. In that case
> > > * we skip already filled pages.
> > > @@ -657,12 +660,34 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
> > > if (pages[i])
> > > continue;
> > >
> > > - pages[i] = shmem_read_mapping_page(mapping, i);
> > > - if (IS_ERR(pages[i])) {
> > > - ret = PTR_ERR(pages[i]);
> > > - pages[i] = NULL;
> > > + folio = shmem_read_folio(mapping, i);
> > > + if (IS_ERR(folio)) {
> > > + ret = PTR_ERR(folio);
> > > goto err_unlock;
> > > }
> > > +
> > > + /* We always fill the page array at a folio granularity so
> > > + * there's no reason for a missing page to not be the first
> > > + * in the folio. We want to ensure that's the case to avoid
> > > + * unbalanced folio_{get,put}() leading to leaks, because
> > > + * drm_gem_put_pages() assumes per-folio refcounting not
> > > + * per-page.
> > > + */
> >
> > I'm a little uneasy about this reasoning. Above we do mask the address
> > to be a multiple of 2MB, but the folio could (in theory at least) be
> > bigger than 2MB. So I don't see what stops a GPU job faulting in the
> > middle of a buffer and triggering this warning.
> >
> > Can we not walk backwards to the beginning of the folio if the address
> > isn't aligned and check that?
>
> Yep, I assumed the heap buffer would be faulted from bottom to top
> (which I think is a valid assumption for the tiler, but I might be
> wrong). I can certainly revisit the logic to map on both side if we're
> in the middle of a folio and drop this WARN_ON().
Actually no, it can and will allocate from both ends, and if we cross a
2M boundary that's backed by the same folio when doing top->bottom
alloc we'll hit this WARN_ON(). I'll send a v2 with a revisited for()
loop, as suggested.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-01-08 11:48 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-08 10:10 [PATCH] drm/panfrost: Fix a page leak in panfrost_mmu_map_fault_addr() when THP is on Boris Brezillon
2026-01-08 10:28 ` Steven Price
2026-01-08 10:42 ` Boris Brezillon
2026-01-08 11:48 ` Boris Brezillon
2026-01-08 10:43 ` Boris Brezillon
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.