* [PATCH] zsmalloc: Fix races between modifications of fullness and isolated
@ 2023-07-21 6:37 Andrew Yang
2023-07-26 2:31 ` Sergey Senozhatsky
2023-07-26 3:18 ` Sergey Senozhatsky
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Yang @ 2023-07-21 6:37 UTC (permalink / raw)
To: Minchan Kim, Sergey Senozhatsky, Andrew Morton, Matthias Brugger,
AngeloGioacchino Del Regno, Sebastian Andrzej Siewior
Cc: wsd_upstream, casper.li, Andrew Yang, linux-mm, linux-kernel,
linux-arm-kernel, linux-mediatek
Since fullness and isolated share the same unsigned int,
modifications of them should be protected by the same lock.
Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for migration")
---
mm/zsmalloc.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 32f5bc4074df..b96230402a8d 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1777,6 +1777,7 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage,
 static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 {
+	struct zs_pool *pool;
 	struct zspage *zspage;
 	/*
@@ -1786,9 +1787,10 @@ static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
 	VM_BUG_ON_PAGE(PageIsolated(page), page);
 	zspage = get_zspage(page);
-	migrate_write_lock(zspage);
+	pool = zspage->pool;
+	spin_lock(&pool->lock);
 	inc_zspage_isolation(zspage);
-	migrate_write_unlock(zspage);
+	spin_unlock(&pool->lock);
 	return true;
 }
@@ -1858,8 +1860,8 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	 * Since we complete the data copy and set up new zspage structure,
 	 * it's okay to release the pool's lock.
 	 */
-	spin_unlock(&pool->lock);
 	dec_zspage_isolation(zspage);
+	spin_unlock(&pool->lock);
 	migrate_write_unlock(zspage);
 	get_page(newpage);
@@ -1876,14 +1878,16 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 static void zs_page_putback(struct page *page)
 {
+	struct zs_pool *pool;
 	struct zspage *zspage;
 	VM_BUG_ON_PAGE(!PageIsolated(page), page);
 	zspage = get_zspage(page);
-	migrate_write_lock(zspage);
+	pool = zspage->pool;
+	spin_lock(&pool->lock);
 	dec_zspage_isolation(zspage);
-	migrate_write_unlock(zspage);
+	spin_unlock(&pool->lock);
 }
 static const struct movable_operations zsmalloc_mops = {
--
2.18.0
* Re: [PATCH] zsmalloc: Fix races between modifications of fullness and isolated
2023-07-21 6:37 [PATCH] zsmalloc: Fix races between modifications of fullness and isolated Andrew Yang
@ 2023-07-26 2:31 ` Sergey Senozhatsky
2023-07-26 2:57 ` Sergey Senozhatsky
2023-07-26 3:18 ` Sergey Senozhatsky
1 sibling, 1 reply; 7+ messages in thread
From: Sergey Senozhatsky @ 2023-07-26 2:31 UTC (permalink / raw)
To: Andrew Yang
Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, Matthias Brugger,
AngeloGioacchino Del Regno, Sebastian Andrzej Siewior,
wsd_upstream, casper.li, linux-mm, linux-kernel, linux-arm-kernel,
linux-mediatek
On (23/07/21 14:37), Andrew Yang wrote:
>
> Since fullness and isolated share the same unsigned int,
> modifications of them should be protected by the same lock.
Sorry, I don't think I follow. Can you please elaborate?
What is fullness in this context? What is the race condition
exactly? Can I please have something like
CPU0 CPU1
foo bar
* Re: [PATCH] zsmalloc: Fix races between modifications of fullness and isolated
2023-07-26 2:31 ` Sergey Senozhatsky
@ 2023-07-26 2:57 ` Sergey Senozhatsky
0 siblings, 0 replies; 7+ messages in thread
From: Sergey Senozhatsky @ 2023-07-26 2:57 UTC (permalink / raw)
To: Andrew Yang
Cc: Minchan Kim, Andrew Morton, Matthias Brugger,
AngeloGioacchino Del Regno, Sebastian Andrzej Siewior,
wsd_upstream, casper.li, linux-mm, linux-kernel, linux-arm-kernel,
linux-mediatek, Sergey Senozhatsky
On (23/07/26 11:31), Sergey Senozhatsky wrote:
> On (23/07/21 14:37), Andrew Yang wrote:
> >
> > Since fullness and isolated share the same unsigned int,
> > modifications of them should be protected by the same lock.
>
> Sorry, I don't think I follow. Can you please elaborate?
> What is fullness in this context?
Oh, my bad, so that's zspage's fullness:FULLNESS_BITS and
isolated:ISOLATED_BITS. I somehow thought about something
very different (page isolated, not zspage isolated).
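Editorial sketch: the thread never spells out the CPU0/CPU1 interleaving that was asked for, so the following is only one plausible reading of the changelog, not the author's analysis. It models fullness and isolated as two fields packed into one 32-bit word and replays a lost update by hand, with no real threads or kernel locks. The shift and mask values and the names `get_field`, `set_field`, `racy_interleaving` and `serialized` are assumptions made for illustration; the real widths come from FULLNESS_BITS and ISOLATED_BITS in mm/zsmalloc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define FULLNESS_SHIFT	0u
#define FULLNESS_MASK	0xFu
#define ISOLATED_SHIFT	4u
#define ISOLATED_MASK	0x1Fu

uint32_t get_field(uint32_t word, unsigned int shift, uint32_t mask)
{
	return (word >> shift) & mask;
}

uint32_t set_field(uint32_t word, unsigned int shift, uint32_t mask,
		   uint32_t val)
{
	assert(val <= mask);	/* the value must fit in the field */
	return (word & ~(mask << shift)) | (val << shift);
}

/*
 * CPU0 updates fullness under one lock while CPU1 bumps isolated under
 * a different lock, so the two load-modify-store sequences can overlap.
 */
uint32_t racy_interleaving(uint32_t shared)
{
	uint32_t cpu0 = shared;		/* CPU0: load the word */
	uint32_t cpu1 = shared;		/* CPU1: load the word */

	cpu1 = set_field(cpu1, ISOLATED_SHIFT, ISOLATED_MASK,
			 get_field(cpu1, ISOLATED_SHIFT, ISOLATED_MASK) + 1);
	shared = cpu1;			/* CPU1: store, isolated is now 1 */

	cpu0 = set_field(cpu0, FULLNESS_SHIFT, FULLNESS_MASK, 2);
	shared = cpu0;			/* CPU0: store stale word, isolated lost */
	return shared;
}

/*
 * The same two updates, but each load-modify-store finishes before the
 * next one starts, as it would if both paths held the same lock.
 */
uint32_t serialized(uint32_t shared)
{
	shared = set_field(shared, ISOLATED_SHIFT, ISOLATED_MASK,
			   get_field(shared, ISOLATED_SHIFT, ISOLATED_MASK) + 1);
	shared = set_field(shared, FULLNESS_SHIFT, FULLNESS_MASK, 2);
	return shared;
}
```

Starting from a zero word, `racy_interleaving(0)` ends with fullness updated but isolated back at 0, while `serialized(0)` keeps both updates. This matches the changelog's argument that writers of either field need a common lock, because a store to one bitfield rewrites the whole containing word.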
* Re: [PATCH] zsmalloc: Fix races between modifications of fullness and isolated
2023-07-21 6:37 [PATCH] zsmalloc: Fix races between modifications of fullness and isolated Andrew Yang
2023-07-26 2:31 ` Sergey Senozhatsky
@ 2023-07-26 3:18 ` Sergey Senozhatsky
2023-07-26 6:59 ` Andrew Yang (楊智強)
1 sibling, 1 reply; 7+ messages in thread
From: Sergey Senozhatsky @ 2023-07-26 3:18 UTC (permalink / raw)
To: Andrew Yang
Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, Matthias Brugger,
AngeloGioacchino Del Regno, Sebastian Andrzej Siewior,
wsd_upstream, casper.li, linux-mm, linux-kernel, linux-arm-kernel,
linux-mediatek
On (23/07/21 14:37), Andrew Yang wrote:
>
> Since fullness and isolated share the same unsigned int,
> modifications of them should be protected by the same lock.
>
> Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
> Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for migration")
Have you observed issues in real life? That commit is more than a year
and a half old, so I wonder.
> @@ -1858,8 +1860,8 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
> * Since we complete the data copy and set up new zspage structure,
> * it's okay to release the pool's lock.
> */
This comment should be moved too, because this is not where we unlock the
pool anymore.
> - spin_unlock(&pool->lock);
> dec_zspage_isolation(zspage);
> + spin_unlock(&pool->lock);
> migrate_write_unlock(zspage);
* Re: [PATCH] zsmalloc: Fix races between modifications of fullness and isolated
2023-07-26 3:18 ` Sergey Senozhatsky
@ 2023-07-26 6:59 ` Andrew Yang (楊智強)
2023-07-26 11:31 ` Sergey Senozhatsky
2023-07-26 20:18 ` Andrew Morton
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Yang (楊智強) @ 2023-07-26 6:59 UTC (permalink / raw)
To: senozhatsky@chromium.org
Cc: bigeasy@linutronix.de, linux-kernel@vger.kernel.org,
linux-mediatek@lists.infradead.org, linux-mm@kvack.org,
wsd_upstream, Casper Li (李中榮),
akpm@linux-foundation.org, minchan@kernel.org,
linux-arm-kernel@lists.infradead.org, matthias.bgg@gmail.com,
angelogioacchino.delregno@collabora.com
On Wed, 2023-07-26 at 12:18 +0900, Sergey Senozhatsky wrote:
>
> On (23/07/21 14:37), Andrew Yang wrote:
> >
> > Since fullness and isolated share the same unsigned int,
> > modifications of them should be protected by the same lock.
> >
> > Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
> > Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for
> migration")
>
> Have you observed issues in real life? That commit is more than a
> year
> and a half old, so I wonder.
>
Yes, we encountered many kernel exceptions of
VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and
BUG_ON(!pages[1]) in zs_unmap_object() lately.
This issue only occurs when migration and reclamation occur at the
same time. With our memory stress test, we can reproduce this issue
several times a day. We have no idea why no one else encountered
this issue. BTW, we switched to the new kernel version with this
defect a few months ago.
> > @@ -1858,8 +1860,8 @@ static int zs_page_migrate(struct page
> *newpage, struct page *page,
> > * Since we complete the data copy and set up new zspage
> structure,
> > * it's okay to release the pool's lock.
> > */
>
> This comment should be moved too, because this is not where we unlock
> the
> pool anymore.
>
Okay, I will submit a new patch later.
> > -spin_unlock(&pool->lock);
> > dec_zspage_isolation(zspage);
> > +spin_unlock(&pool->lock);
> > migrate_write_unlock(zspage);
* Re: [PATCH] zsmalloc: Fix races between modifications of fullness and isolated
2023-07-26 6:59 ` Andrew Yang (楊智強)
@ 2023-07-26 11:31 ` Sergey Senozhatsky
2023-07-26 20:18 ` Andrew Morton
1 sibling, 0 replies; 7+ messages in thread
From: Sergey Senozhatsky @ 2023-07-26 11:31 UTC (permalink / raw)
To: Andrew Yang (楊智強)
Cc: senozhatsky@chromium.org, bigeasy@linutronix.de,
linux-kernel@vger.kernel.org, linux-mediatek@lists.infradead.org,
linux-mm@kvack.org, wsd_upstream,
Casper Li (李中榮), akpm@linux-foundation.org,
minchan@kernel.org, linux-arm-kernel@lists.infradead.org,
matthias.bgg@gmail.com, angelogioacchino.delregno@collabora.com
On (23/07/26 06:59), Andrew Yang (楊智強) wrote:
> On Wed, 2023-07-26 at 12:18 +0900, Sergey Senozhatsky wrote:
> >
> > On (23/07/21 14:37), Andrew Yang wrote:
> > >
> > > Since fullness and isolated share the same unsigned int,
> > > modifications of them should be protected by the same lock.
> > >
> > > Signed-off-by: Andrew Yang <andrew.yang@mediatek.com>
> > > Fixes: c4549b871102 ("zsmalloc: remove zspage isolation for
> > migration")
> >
> > Have you observed issues in real life? That commit is more than a
> > year
> > and a half old, so I wonder.
> >
> Yes, we encountered many kernel exceptions of
> VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and
> BUG_ON(!pages[1]) in zs_unmap_object() lately.
Got it.
> This issue only occurs when migration and reclamation occur at the
> same time. With our memory stress test, we can reproduce this issue
> several times a day. We have no idea why no one else encountered
> this issue. BTW, we switched to the new kernel version with this
> defect a few months ago.
Yeah, pretty curious myself.
> > > @@ -1858,8 +1860,8 @@ static int zs_page_migrate(struct page
> > *newpage, struct page *page,
> > > * Since we complete the data copy and set up new zspage
> > structure,
> > > * it's okay to release the pool's lock.
> > > */
> >
> > This comment should be moved too, because this is not where we unlock
> > the
> > pool anymore.
> >
> Okay, I will submit a new patch later.
Thank you!
* Re: [PATCH] zsmalloc: Fix races between modifications of fullness and isolated
2023-07-26 6:59 ` Andrew Yang (楊智強)
2023-07-26 11:31 ` Sergey Senozhatsky
@ 2023-07-26 20:18 ` Andrew Morton
1 sibling, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2023-07-26 20:18 UTC (permalink / raw)
To: Andrew Yang
Cc: senozhatsky@chromium.org, bigeasy@linutronix.de,
linux-kernel@vger.kernel.org, linux-mediatek@lists.infradead.org,
linux-mm@kvack.org, wsd_upstream, Casper Li, minchan@kernel.org,
linux-arm-kernel@lists.infradead.org, matthias.bgg@gmail.com,
angelogioacchino.delregno@collabora.com
On Wed, 26 Jul 2023 06:59:20 +0000 Andrew Yang (楊智強) <Andrew.Yang@mediatek.com> wrote:
> > Have you observed issues in real life? That commit is more than a
> > year
> > and a half old, so I wonder.
> >
> Yes, we encountered many kernel exceptions of
> VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() and
> BUG_ON(!pages[1]) in zs_unmap_object() lately.
> This issue only occurs when migration and reclamation occur at the
> same time. With our memory stress test, we can reproduce this issue
> several times a day. We have no idea why no one else encountered
> this issue. BTW, we switched to the new kernel version with this
> defect a few months ago.
Ah. It's important that such information be in the changelog!
I have put this info into my copy of the v1 patch's changelog.
I have moved the v1 patch from the mm-unstable branch into
mm-hotfixes-unstable, so it is staged for merging in this -rc cycle.
I have also added a cc:stable so that the fix gets backported into
kernels which contain c4549b871102.
I have added a note-to-self that a v2 patch is expected.
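Editorial sketch: the report of VM_BUG_ON(zspage->isolated == 0) in dec_zspage_isolation() is consistent with an isolation increment being overwritten before the matching decrement runs. The thread does not confirm that exact sequence, so the model below is an assumption. `struct flags`, its field widths, and the helpers `dec_isolation_checked` and `isolate_then_putback` are hypothetical stand-ins, and the race is replayed deterministically by copying a stale snapshot rather than by running real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bitfield word in struct zspage. */
struct flags {
	unsigned int fullness:4;	/* assumed width */
	unsigned int isolated:5;	/* assumed width */
};

/*
 * Models the check in dec_zspage_isolation(): returns false at the
 * point where the kernel would hit VM_BUG_ON(zspage->isolated == 0).
 */
bool dec_isolation_checked(struct flags *f)
{
	if (f->isolated == 0)
		return false;
	f->isolated--;
	return true;
}

/* Isolate, apply a fullness update (racing or not), then put back. */
bool isolate_then_putback(bool fullness_update_races)
{
	struct flags shared = { .fullness = 1, .isolated = 0 };
	struct flags stale = shared;	/* fullness updater's early load */

	shared.isolated++;		/* isolation path: 0 -> 1 */
	assert(shared.isolated == 1);

	if (fullness_update_races) {
		stale.fullness = 2;
		shared = stale;		/* whole word stored, isolated is 0 again */
	} else {
		shared.fullness = 2;	/* update applied to the current word */
	}

	return dec_isolation_checked(&shared);	/* putback path */
}
```

In this model `isolate_then_putback(true)` returns false, the point where the kernel check would fire, and `isolate_then_putback(false)` returns true.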