* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-16 14:01 [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local() Lorenzo Stoakes (Oracle)
@ 2026-03-16 14:52 ` Yosry Ahmed
2026-03-16 15:49 ` Lorenzo Stoakes (Oracle)
2026-03-16 15:17 ` Johannes Weiner
` (4 subsequent siblings)
5 siblings, 1 reply; 12+ messages in thread
From: Yosry Ahmed @ 2026-03-16 14:52 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, Johannes Weiner, Nhat Pham, Chengming Zhou,
linux-mm, linux-kernel
On Mon, Mar 16, 2026 at 02:01:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> Commit e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from
> zsmalloc") updated zswap_decompress() to use the scatterwalk API to copy
> data for uncompressed pages.
>
> In doing so, it mapped kernel memory locally for 32-bit kernels using
> kmap_local_folio(), however it never unmapped this memory.
>
> This resulted in the linked syzbot report where a BUG_ON() is triggered due
> to leaking the kmap slot.
>
> This patch fixes the issue by explicitly unmapping the established kmap.
>
> Reported-by: syzbot+fe426bef95363177631d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/69b75e2c.050a0220.12d28.015a.GAE@google.com
> Fixes: e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from zsmalloc")
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Oh thanks for catching that, not sure how I ended up doing that tbh..
Anyway, LGTM:
Acked-by: Yosry Ahmed <yosry@kernel.org>
> ---
> mm/zswap.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index e6ec3295bdb0..499520f65ff0 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -942,9 +942,14 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
>
> /* zswap entries of length PAGE_SIZE are not compressed. */
> if (entry->length == PAGE_SIZE) {
> + void *dst;
> +
> WARN_ON_ONCE(input->length != PAGE_SIZE);
> - memcpy_from_sglist(kmap_local_folio(folio, 0), input, 0, PAGE_SIZE);
> +
> + dst = kmap_local_folio(folio, 0);
> + memcpy_from_sglist(dst, input, 0, PAGE_SIZE);
> dlen = PAGE_SIZE;
> + kunmap_local(dst);
> } else {
> sg_init_table(&output, 1);
> sg_set_folio(&output, folio, PAGE_SIZE, 0);
> --
> 2.53.0
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-16 14:52 ` Yosry Ahmed
@ 2026-03-16 15:49 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-16 15:49 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Andrew Morton, Johannes Weiner, Nhat Pham, Chengming Zhou,
linux-mm, linux-kernel
On Mon, Mar 16, 2026 at 02:52:24PM +0000, Yosry Ahmed wrote:
> On Mon, Mar 16, 2026 at 02:01:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> > Commit e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from
> > zsmalloc") updated zswap_decompress() to use the scatterwalk API to copy
> > data for uncompressed pages.
> >
> > In doing so, it mapped kernel memory locally for 32-bit kernels using
> > kmap_local_folio(), however it never unmapped this memory.
> >
> > This resulted in the linked syzbot report where a BUG_ON() is triggered due
> > to leaking the kmap slot.
> >
> > This patch fixes the issue by explicitly unmapping the established kmap.
> >
> > Reported-by: syzbot+fe426bef95363177631d@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/69b75e2c.050a0220.12d28.015a.GAE@google.com
> > Fixes: e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from zsmalloc")
> > Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> Oh thanks for catching that, not sure how I ended up doing that tbh..
Don't worry I've made FAR worse mistakes in some of my patches, believe me :)
easily done.
>
> Anyway, LGTM:
>
> Acked-by: Yosry Ahmed <yosry@kernel.org>
Thanks!
Cheers, Lorenzo
>
> > ---
> > mm/zswap.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index e6ec3295bdb0..499520f65ff0 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -942,9 +942,14 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> >
> > /* zswap entries of length PAGE_SIZE are not compressed. */
> > if (entry->length == PAGE_SIZE) {
> > + void *dst;
> > +
> > WARN_ON_ONCE(input->length != PAGE_SIZE);
> > - memcpy_from_sglist(kmap_local_folio(folio, 0), input, 0, PAGE_SIZE);
> > +
> > + dst = kmap_local_folio(folio, 0);
> > + memcpy_from_sglist(dst, input, 0, PAGE_SIZE);
> > dlen = PAGE_SIZE;
> > + kunmap_local(dst);
> > } else {
> > sg_init_table(&output, 1);
> > sg_set_folio(&output, folio, PAGE_SIZE, 0);
> > --
> > 2.53.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-16 14:01 [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local() Lorenzo Stoakes (Oracle)
2026-03-16 14:52 ` Yosry Ahmed
@ 2026-03-16 15:17 ` Johannes Weiner
2026-03-17 0:38 ` SeongJae Park
` (3 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: Johannes Weiner @ 2026-03-16 15:17 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, Yosry Ahmed, Nhat Pham, Chengming Zhou, linux-mm,
linux-kernel
On Mon, Mar 16, 2026 at 02:01:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> Commit e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from
> zsmalloc") updated zswap_decompress() to use the scatterwalk API to copy
> data for uncompressed pages.
>
> In doing so, it mapped kernel memory locally for 32-bit kernels using
> kmap_local_folio(), however it never unmapped this memory.
>
> This resulted in the linked syzbot report where a BUG_ON() is triggered due
> to leaking the kmap slot.
>
> This patch fixes the issue by explicitly unmapping the established kmap.
>
> Reported-by: syzbot+fe426bef95363177631d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/69b75e2c.050a0220.12d28.015a.GAE@google.com
> Fixes: e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from zsmalloc")
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Oops!
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-16 14:01 [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local() Lorenzo Stoakes (Oracle)
2026-03-16 14:52 ` Yosry Ahmed
2026-03-16 15:17 ` Johannes Weiner
@ 2026-03-17 0:38 ` SeongJae Park
2026-03-17 10:01 ` Lorenzo Stoakes (Oracle)
` (2 subsequent siblings)
5 siblings, 0 replies; 12+ messages in thread
From: SeongJae Park @ 2026-03-17 0:38 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: SeongJae Park, Andrew Morton, Johannes Weiner, Yosry Ahmed,
Nhat Pham, Chengming Zhou, linux-mm, linux-kernel
On Mon, 16 Mar 2026 14:01:22 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
> Commit e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from
> zsmalloc") updated zswap_decompress() to use the scatterwalk API to copy
> data for uncompressed pages.
>
> In doing so, it mapped kernel memory locally for 32-bit kernels using
> kmap_local_folio(), however it never unmapped this memory.
>
> This resulted in the linked syzbot report where a BUG_ON() is triggered due
> to leaking the kmap slot.
>
> This patch fixes the issue by explicitly unmapping the established kmap.
Nice catch, thank you for this fix!
>
> Reported-by: syzbot+fe426bef95363177631d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/69b75e2c.050a0220.12d28.015a.GAE@google.com
> Fixes: e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from zsmalloc")
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-16 14:01 [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local() Lorenzo Stoakes (Oracle)
` (2 preceding siblings ...)
2026-03-17 0:38 ` SeongJae Park
@ 2026-03-17 10:01 ` Lorenzo Stoakes (Oracle)
2026-03-17 12:13 ` Lorenzo Stoakes (Oracle)
2026-03-17 12:59 ` Lorenzo Stoakes (Oracle)
2026-03-18 23:15 ` Nhat Pham
5 siblings, 1 reply; 12+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-17 10:01 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, linux-mm,
linux-kernel, Mateusz Guzik, Zi Yan
On Mon, Mar 16, 2026 at 02:01:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> Commit e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from
> zsmalloc") updated zswap_decompress() to use the scatterwalk API to copy
> data for uncompressed pages.
>
> In doing so, it mapped kernel memory locally for 32-bit kernels using
> kmap_local_folio(), however it never unmapped this memory.
>
> This resulted in the linked syzbot report where a BUG_ON() is triggered due
> to leaking the kmap slot.
>
> This patch fixes the issue by explicitly unmapping the established kmap.
>
> Reported-by: syzbot+fe426bef95363177631d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/69b75e2c.050a0220.12d28.015a.GAE@google.com
> Fixes: e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from zsmalloc")
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
> mm/zswap.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index e6ec3295bdb0..499520f65ff0 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -942,9 +942,14 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
>
> /* zswap entries of length PAGE_SIZE are not compressed. */
> if (entry->length == PAGE_SIZE) {
> + void *dst;
> +
> WARN_ON_ONCE(input->length != PAGE_SIZE);
> - memcpy_from_sglist(kmap_local_folio(folio, 0), input, 0, PAGE_SIZE);
> +
> + dst = kmap_local_folio(folio, 0);
> + memcpy_from_sglist(dst, input, 0, PAGE_SIZE);
> dlen = PAGE_SIZE;
> + kunmap_local(dst);
FYI to address (in advance) the AI review from [0] which a couple people made me
aware of - we don't need a flush_dcache_folio() here, because the folio is not
yet accessible by userspace, so we can't have virtual aliasing of the folio's
physical address on VIVT architectures.
Examining call paths:
zswap_writeback_entry() -> only calls zswap_decompress() if allocated
-> zswap_decompress()
swap_vma_readahead() -> only calls swap_read_folio() if allocated
swap_cluster_readahead() -> only calls swap_read_folio() if allocated
read_swap_cache_async() -> only calls swap_read_folio() if allocated
do_swap_page() -> called in path where folio allocated
shmem_swap_alloc_folio() -> as name implies, allocated folio
-> swap_read_folio()
-> zswap_load()
-> zswap_decompress()
So actually no longer doing this is a de-pessimisation ;)
[0]:https://sashiko.dev/#/patchset/20260316140122.339697-1-ljs%40kernel.org
> } else {
> sg_init_table(&output, 1);
> sg_set_folio(&output, folio, PAGE_SIZE, 0);
> --
> 2.53.0
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-17 10:01 ` Lorenzo Stoakes (Oracle)
@ 2026-03-17 12:13 ` Lorenzo Stoakes (Oracle)
0 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-17 12:13 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, linux-mm,
linux-kernel, Mateusz Guzik, Zi Yan
On Tue, Mar 17, 2026 at 10:01:19AM +0000, Lorenzo Stoakes (Oracle) wrote:
> On Mon, Mar 16, 2026 at 02:01:22PM +0000, Lorenzo Stoakes (Oracle) wrote:
> > Commit e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from
> > zsmalloc") updated zswap_decompress() to use the scatterwalk API to copy
> > data for uncompressed pages.
> >
> > In doing so, it mapped kernel memory locally for 32-bit kernels using
> > kmap_local_folio(), however it never unmapped this memory.
> >
> > This resulted in the linked syzbot report where a BUG_ON() is triggered due
> > to leaking the kmap slot.
> >
> > This patch fixes the issue by explicitly unmapping the established kmap.
> >
> > Reported-by: syzbot+fe426bef95363177631d@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/69b75e2c.050a0220.12d28.015a.GAE@google.com
> > Fixes: e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from zsmalloc")
> > Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> > ---
> > mm/zswap.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index e6ec3295bdb0..499520f65ff0 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -942,9 +942,14 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> >
> > /* zswap entries of length PAGE_SIZE are not compressed. */
> > if (entry->length == PAGE_SIZE) {
> > + void *dst;
> > +
> > WARN_ON_ONCE(input->length != PAGE_SIZE);
> > - memcpy_from_sglist(kmap_local_folio(folio, 0), input, 0, PAGE_SIZE);
> > +
> > + dst = kmap_local_folio(folio, 0);
> > + memcpy_from_sglist(dst, input, 0, PAGE_SIZE);
> > dlen = PAGE_SIZE;
> > + kunmap_local(dst);
>
> FYI to address (in advance) the AI review from [0] which a couple people made me
> aware of - we don't need a flush_dcache_folio() here, because the folio is not
> yet accessible by userspace, so we can't have virtual aliasing of the folio's
> physical address on VIVT architectures.
>
> Examining call paths:
>
> zswap_writeback_entry() -> only calls zswap_decompress() if allocated
> -> zswap_decompress()
>
> swap_vma_readahead() -> only calls swap_read_folio() if allocated
> swap_cluster_readahead() -> only calls swap_read_folio() if allocated
> read_swap_cache_async() -> only calls swap_read_folio() if allocated
> do_swap_page() -> called in path where folio allocated
> shmem_swap_alloc_folio() -> as name implies, allocated folio
> -> swap_read_folio()
> -> zswap_load()
> -> zswap_decompress()
>
> So actually no longer doing this is a de-pessimisation ;)
On second thoughts, I'm wrong, this is required. The documentation is really not
clear on this, which is a problem given how horribly fiddly this is.
https://docs.kernel.org/core-api/cachetlb.html it super vague on
update_mmu_cache_range() -
"At the end of every page fault, this routine is invoked to tell the
architecture specific code that translations now exists in the software page
tables for address space “vma->vm_mm” at virtual address “address” for “nr”
consecutive pages."
It doesn't mention that you really need to have invoked dcache flushes in
advance for anything you changed, and that the architecture may then choose to
not do anything at this point otherwise.
Anyway, I'll send a fix-patch.
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-16 14:01 [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local() Lorenzo Stoakes (Oracle)
` (3 preceding siblings ...)
2026-03-17 10:01 ` Lorenzo Stoakes (Oracle)
@ 2026-03-17 12:59 ` Lorenzo Stoakes (Oracle)
2026-03-17 18:01 ` Andrew Morton
` (2 more replies)
2026-03-18 23:15 ` Nhat Pham
5 siblings, 3 replies; 12+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-17 12:59 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, linux-mm,
linux-kernel
Hi Andrew,
Please apply the fix-patch enclosed to add a dcache flush which was also
missing here.
I had assumed that a new folio here combined with the flush that is done at
the point of setting the PTE would suffice, but it doesn't seem that's
actually the case, as update_mmu_cache() will in many archtectures only
actually flush entries where a dcache flush was done on a range previously.
I had also wondered whether kunmap_local() might suffice, but it doesn't
seem to be the case.
Some arches do seem to actually dcache flush on unmap, parisc does it if
CONFIG_HIGHMEM is not set by setting ARCH_HAS_FLUSH_ON_KUNMAP and calling
kunmap_flush_on_unmap() from __kunmap_local(), otherwise non-CONFIG_HIGHMEM
callers do nothing here.
Otherwise arch_kmap_local_pre_unmap() is called which does:
* sparc - flush_cache_all()
* arm - if VIVT, __cpuc_flush_dcache_area()
* otherwise - nothing
Also arch_kmap_local_post_unmap() is called which does:
* arm - local_flush_tlb_kernel_page()
* csky - kmap_flush_tlb()
* microblaze, ppc - local_flush_tlb_page()
* mips - local_flush_tlb_one()
* sparc - flush_tlb_all() (again)
* x86 - arch_flush_lazy_mmu_mode()
* otherwise - nothing
But this is only if it's high memory, and doesn't cover all architectures,
so is presumably intended to handle other cache consistency concerns.
In any case, VIPT is problematic here whether low or high memory (in spite
of what the documentation claims, see [0] - 'the kernel did write to a page
that is in the page cache page and / or in high memory'), because dirty
cache lines may exist at the set indexed by the kernel direct mapping,
which won't exist in the set indexed by any subsequent userland mapping,
meaning userland might read stale data from L2 cache.
Even if the documentation is correct and low memory is fine not to be
flushed here, we can't be sure as to whether the memory is low or high
(kmap_local_folio() will be a no-op if low), and this call should be
harmless if it is low.
VIVT would require more work if the memory were shared and already mapped,
but this isn't the case here, and would anyway be handled by the dcache
flush call.
In any case, we definitely need this flush as far as I can tell.
And we should probably consider updating the documentation unless it turns
out there's somehow dcache synchronisation that happens for low
memory/64-bit kernels elsewhere?
Thanks, Lorenzo
[0]:https://docs.kernel.org/core-api/cachetlb.html
----8<----
From 94989ab3d97f0cde0dcf59eba6466fee932ec4ba Mon Sep 17 00:00:00 2001
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Date: Tue, 17 Mar 2026 12:29:55 +0000
Subject: [PATCH] fix
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
mm/zswap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/zswap.c b/mm/zswap.c
index 499520f65ff0..16b2ef7223e1 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -950,6 +950,7 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
memcpy_from_sglist(dst, input, 0, PAGE_SIZE);
dlen = PAGE_SIZE;
kunmap_local(dst);
+ flush_dcache_folio(folio);
} else {
sg_init_table(&output, 1);
sg_set_folio(&output, folio, PAGE_SIZE, 0);
--
2.53.0
^ permalink raw reply related [flat|nested] 12+ messages in thread* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-17 12:59 ` Lorenzo Stoakes (Oracle)
@ 2026-03-17 18:01 ` Andrew Morton
2026-03-17 19:01 ` Yosry Ahmed
2026-03-18 0:34 ` SeongJae Park
2 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2026-03-17 18:01 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, linux-mm,
linux-kernel
On Tue, 17 Mar 2026 12:59:35 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
> Hi Andrew,
>
> Please apply the fix-patch enclosed to add a dcache flush which was also
> missing here.
>
> ...
>
That was a lot of sleuthing.
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -950,6 +950,7 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> memcpy_from_sglist(dst, input, 0, PAGE_SIZE);
> dlen = PAGE_SIZE;
> kunmap_local(dst);
> + flush_dcache_folio(folio);
> } else {
> sg_init_table(&output, 1);
> sg_set_folio(&output, folio, PAGE_SIZE, 0);
Which will all be lost when I squash the patch, unless someone follows
the Link: which I add to the footer when squashing.
How about I copy your sleuthing into the base patch changelog to
capture the reasoning for the flush_dcache_folio()'s presence?
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-17 12:59 ` Lorenzo Stoakes (Oracle)
2026-03-17 18:01 ` Andrew Morton
@ 2026-03-17 19:01 ` Yosry Ahmed
2026-03-18 0:34 ` SeongJae Park
2 siblings, 0 replies; 12+ messages in thread
From: Yosry Ahmed @ 2026-03-17 19:01 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, Johannes Weiner, Nhat Pham, Chengming Zhou,
linux-mm, linux-kernel
On Tue, Mar 17, 2026 at 5:59 AM Lorenzo Stoakes (Oracle) <ljs@kernel.org> wrote:
>
> Hi Andrew,
>
> Please apply the fix-patch enclosed to add a dcache flush which was also
> missing here.
>
> I had assumed that a new folio here combined with the flush that is done at
> the point of setting the PTE would suffice, but it doesn't seem that's
> actually the case, as update_mmu_cache() will in many archtectures only
> actually flush entries where a dcache flush was done on a range previously.
>
> I had also wondered whether kunmap_local() might suffice, but it doesn't
> seem to be the case.
>
> Some arches do seem to actually dcache flush on unmap, parisc does it if
> CONFIG_HIGHMEM is not set by setting ARCH_HAS_FLUSH_ON_KUNMAP and calling
> kunmap_flush_on_unmap() from __kunmap_local(), otherwise non-CONFIG_HIGHMEM
> callers do nothing here.
>
> Otherwise arch_kmap_local_pre_unmap() is called which does:
>
> * sparc - flush_cache_all()
> * arm - if VIVT, __cpuc_flush_dcache_area()
> * otherwise - nothing
>
> Also arch_kmap_local_post_unmap() is called which does:
>
> * arm - local_flush_tlb_kernel_page()
> * csky - kmap_flush_tlb()
> * microblaze, ppc - local_flush_tlb_page()
> * mips - local_flush_tlb_one()
> * sparc - flush_tlb_all() (again)
> * x86 - arch_flush_lazy_mmu_mode()
> * otherwise - nothing
>
> But this is only if it's high memory, and doesn't cover all architectures,
> so is presumably intended to handle other cache consistency concerns.
>
> In any case, VIPT is problematic here whether low or high memory (in spite
> of what the documentation claims, see [0] - 'the kernel did write to a page
> that is in the page cache page and / or in high memory'), because dirty
> cache lines may exist at the set indexed by the kernel direct mapping,
> which won't exist in the set indexed by any subsequent userland mapping,
> meaning userland might read stale data from L2 cache.
>
> Even if the documentation is correct and low memory is fine not to be
> flushed here, we can't be sure as to whether the memory is low or high
> (kmap_local_folio() will be a no-op if low), and this call should be
> harmless if it is low.
>
> VIVT would require more work if the memory were shared and already mapped,
> but this isn't the case here, and would anyway be handled by the dcache
> flush call.
>
> In any case, we definitely need this flush as far as I can tell.
>
> And we should probably consider updating the documentation unless it turns
> out there's somehow dcache synchronisation that happens for low
> memory/64-bit kernels elsewhere?
Yeah the documentation is unclear to me too. Given that the flush
existed before, I agree that we should re-add it here even if it's
just cautionary tbh. This feels like something that should be handled
by the kmap API or memcpy_from_sglist() though, sounds like it's easy
to get it wrong.
>
> Thanks, Lorenzo
>
> [0]:https://docs.kernel.org/core-api/cachetlb.html
>
> ----8<----
> From 94989ab3d97f0cde0dcf59eba6466fee932ec4ba Mon Sep 17 00:00:00 2001
> From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
> Date: Tue, 17 Mar 2026 12:29:55 +0000
> Subject: [PATCH] fix
>
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
> mm/zswap.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 499520f65ff0..16b2ef7223e1 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -950,6 +950,7 @@ static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> memcpy_from_sglist(dst, input, 0, PAGE_SIZE);
> dlen = PAGE_SIZE;
> kunmap_local(dst);
> + flush_dcache_folio(folio);
> } else {
> sg_init_table(&output, 1);
> sg_set_folio(&output, folio, PAGE_SIZE, 0);
> --
> 2.53.0
Acked-by: Yosry Ahmed <yosry@kernel.org>
Do you mind sending Andrew an updated changelog capturing the
reasoning here, I'd hate for that to be lost. Or maybe a v2 including
the full patch and updated changelog. Up to you/Andrew.
Thank you!
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-17 12:59 ` Lorenzo Stoakes (Oracle)
2026-03-17 18:01 ` Andrew Morton
2026-03-17 19:01 ` Yosry Ahmed
@ 2026-03-18 0:34 ` SeongJae Park
2 siblings, 0 replies; 12+ messages in thread
From: SeongJae Park @ 2026-03-18 0:34 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: SeongJae Park, Andrew Morton, Johannes Weiner, Yosry Ahmed,
Nhat Pham, Chengming Zhou, linux-mm, linux-kernel
On Tue, 17 Mar 2026 12:59:35 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
> Hi Andrew,
>
> Please apply the fix-patch enclosed to add a dcache flush which was also
> missing here.
>
> I had assumed that a new folio here combined with the flush that is done at
> the point of setting the PTE would suffice, but it doesn't seem that's
> actually the case, as update_mmu_cache() will in many archtectures only
> actually flush entries where a dcache flush was done on a range previously.
>
> I had also wondered whether kunmap_local() might suffice, but it doesn't
> seem to be the case.
>
> Some arches do seem to actually dcache flush on unmap, parisc does it if
> CONFIG_HIGHMEM is not set by setting ARCH_HAS_FLUSH_ON_KUNMAP and calling
> kunmap_flush_on_unmap() from __kunmap_local(), otherwise non-CONFIG_HIGHMEM
> callers do nothing here.
>
> Otherwise arch_kmap_local_pre_unmap() is called which does:
>
> * sparc - flush_cache_all()
> * arm - if VIVT, __cpuc_flush_dcache_area()
> * otherwise - nothing
>
> Also arch_kmap_local_post_unmap() is called which does:
>
> * arm - local_flush_tlb_kernel_page()
> * csky - kmap_flush_tlb()
> * microblaze, ppc - local_flush_tlb_page()
> * mips - local_flush_tlb_one()
> * sparc - flush_tlb_all() (again)
> * x86 - arch_flush_lazy_mmu_mode()
> * otherwise - nothing
>
> But this is only if it's high memory, and doesn't cover all architectures,
> so is presumably intended to handle other cache consistency concerns.
>
> In any case, VIPT is problematic here whether low or high memory (in spite
> of what the documentation claims, see [0] - 'the kernel did write to a page
> that is in the page cache page and / or in high memory'), because dirty
> cache lines may exist at the set indexed by the kernel direct mapping,
> which won't exist in the set indexed by any subsequent userland mapping,
> meaning userland might read stale data from L2 cache.
>
> Even if the documentation is correct and low memory is fine not to be
> flushed here, we can't be sure as to whether the memory is low or high
> (kmap_local_folio() will be a no-op if low), and this call should be
> harmless if it is low.
>
> VIVT would require more work if the memory were shared and already mapped,
> but this isn't the case here, and would anyway be handled by the dcache
> flush call.
>
> In any case, we definitely need this flush as far as I can tell.
Makes sense, thank you for detailed description!
>
> And we should probably consider updating the documentation unless it turns
> out there's somehow dcache synchronisation that happens for low
> memory/64-bit kernels elsewhere?
>
> Thanks, Lorenzo
>
> [0]:https://docs.kernel.org/core-api/cachetlb.html
>
> ----8<----
> From 94989ab3d97f0cde0dcf59eba6466fee932ec4ba Mon Sep 17 00:00:00 2001
> From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
> Date: Tue, 17 Mar 2026 12:29:55 +0000
> Subject: [PATCH] fix
>
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local()
2026-03-16 14:01 [PATCH mm-hotfixes] mm/zswap: add missing kunmap_local() Lorenzo Stoakes (Oracle)
` (4 preceding siblings ...)
2026-03-17 12:59 ` Lorenzo Stoakes (Oracle)
@ 2026-03-18 23:15 ` Nhat Pham
5 siblings, 0 replies; 12+ messages in thread
From: Nhat Pham @ 2026-03-18 23:15 UTC (permalink / raw)
To: Lorenzo Stoakes (Oracle)
Cc: Andrew Morton, Johannes Weiner, Yosry Ahmed, Chengming Zhou,
linux-mm, linux-kernel
On Mon, Mar 16, 2026 at 7:01 AM Lorenzo Stoakes (Oracle) <ljs@kernel.org> wrote:
>
> Commit e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from
> zsmalloc") updated zswap_decompress() to use the scatterwalk API to copy
> data for uncompressed pages.
>
> In doing so, it mapped kernel memory locally for 32-bit kernels using
> kmap_local_folio(), however it never unmapped this memory.
>
> This resulted in the linked syzbot report where a BUG_ON() is triggered due
> to leaking the kmap slot.
>
> This patch fixes the issue by explicitly unmapping the established kmap.
>
> Reported-by: syzbot+fe426bef95363177631d@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/69b75e2c.050a0220.12d28.015a.GAE@google.com
> Fixes: e2c3b6b21c77 ("mm: zswap: use SG list decompression APIs from zsmalloc")
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Thanks for the catch, Lorenzo :)
^ permalink raw reply [flat|nested] 12+ messages in thread