* [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201
       [not found] <7829b070df1b405dbc97dd6a028d8c8a@honor.com>
@ 2026-04-07 13:37 ` wangzhen
  2026-04-07 14:25   ` Kairui Song
  0 siblings, 1 reply; 3+ messages in thread
From: wangzhen @ 2026-04-07 13:37 UTC
To: Andrew Morton
Cc: Johannes Weiner, David Hildenbrand, Michal Hocko, Qi Zheng,
	Shakeel Butt, Lorenzo Stoakes, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, kasong@tencent.com, baolin.wang@linux.alibaba.com,
	baohua@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org

From ac731b061f152cba05b9aa351652a04f933986e0 Mon Sep 17 00:00:00 2001
From: w00021541 <wangzhen5@hihonor.com>
Date: Tue, 7 Apr 2026 16:17:53 +0800
Subject: [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201

In some cases, when swappiness is set to 0 or 201, pages in the oldest
generation are incorrectly turned into pages of the newest generation.

Consider the following aging scenario:
MAX_NR_GENS=4, MIN_NR_GENS=2, swappiness=201, 3 anon gens, 4 file gens.
1. When swappiness = 201, should_run_aging only checks the anon type,
   and should_run_aging returns true.
2. In inc_max_seq, if a type (anon or file) has MAX_NR_GENS generations,
   inc_min_seq moves the pages of the oldest generation to the second
   oldest generation to prepare for increasing max_seq.
   Here, the file type enters inc_min_seq.
3. In inc_min_seq, the first goto is taken, so the page migration is
   skipped, resulting in the inversion of cold/hot pages.

In fact, when MAX_NR_GENS=4 and MIN_NR_GENS=2, the for loop after the
goto is unreachable.

Consider the code in inc_max_seq:
	if (get_nr_gens(lruvec, type) != MAX_NR_GENS)
		continue;
This means that only a type with get_nr_gens==4 can enter inc_min_seq.

Consider swappiness in three different scenarios:
1<=swappiness<=200:
If should_run_aging returns true, both anon and file types must satisfy
get_nr_gens<=3, so no type satisfies get_nr_gens==MAX_NR_GENS.
Therefore, neither type can enter inc_min_seq.

swappiness=201:
If should_run_aging returns true, the anon type must satisfy
get_nr_gens<=3. Only the file type can satisfy get_nr_gens==MAX_NR_GENS.
After entering inc_min_seq, type && (swappiness == SWAPPINESS_ANON_ONLY)
is true, so the for loop is skipped.

swappiness=0:
Same as swappiness=201, with the anon and file types swapped.

So the two goto statements should be removed. This ensures that when
swappiness=0 or 201, the pages of the oldest generation are correctly
moved to the second oldest generation.
(When 1<=swappiness<=200, aging only runs when both the anon and file
types have get_nr_gens<=3, which prevents the inversion of hot/cold
pages.)

Signed-off-by: w00021541 <wangzhen5@hihonor.com>
---
 mm/vmscan.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0fc9373e8251..54c835b07d3e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3843,7 +3843,7 @@ static void clear_mm_walk(void)
 		kfree(walk);
 }
 
-static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness)
+static bool inc_min_seq(struct lruvec *lruvec, int type)
 {
 	int zone;
 	int remaining = MAX_LRU_BATCH;
@@ -3851,14 +3851,6 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness)
 	int hist = lru_hist_from_seq(lrugen->min_seq[type]);
 	int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
 
-	/* For file type, skip the check if swappiness is anon only */
-	if (type && (swappiness == SWAPPINESS_ANON_ONLY))
-		goto done;
-
-	/* For anon type, skip the check if swappiness is zero (file only) */
-	if (!type && !swappiness)
-		goto done;
-
 	/* prevent cold/hot inversion if the type is evictable */
 	for (zone = 0; zone < MAX_NR_ZONES; zone++) {
 		struct list_head *head = &lrugen->folios[old_gen][type][zone];
@@ -3889,7 +3881,7 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness)
 				return false;
 		}
 	}
-done:
+
 	reset_ctrl_pos(lruvec, type, true);
 	WRITE_ONCE(lrugen->min_seq[type], lrugen->min_seq[type] + 1);
 
@@ -3975,7 +3967,7 @@ static bool inc_max_seq(struct lruvec *lruvec, unsigned long seq, int swappiness
 		if (get_nr_gens(lruvec, type) != MAX_NR_GENS)
 			continue;
 
-		if (inc_min_seq(lruvec, type, swappiness))
+		if (inc_min_seq(lruvec, type))
 			continue;
 
 		spin_unlock_irq(&lruvec->lru_lock);
-- 
2.17.1
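The case analysis in the commit message can be sketched as a small standalone C model. This is not kernel code: it only copies the generation-count filter from inc_max_seq and the two checks that the patch removes from inc_min_seq, using the values given in the scenario (MAX_NR_GENS=4, swappiness 201 meaning SWAPPINESS_ANON_ONLY). The type indices (anon = 0, file = 1) are inferred from the comments on the removed checks, and min_seq_bumped_without_rotation() is a hypothetical helper name introduced for illustration.

```c
#include <stdbool.h>

/* Values taken from the scenario in the commit message. */
#define MAX_NR_GENS 4
#define SWAPPINESS_ANON_ONLY 201

/* Type indices implied by the removed checks: !type is anon, type is file. */
enum { TYPE_ANON = 0, TYPE_FILE = 1 };

/*
 * Mirrors the filter in inc_max_seq(): only a type that already has
 * MAX_NR_GENS generations is passed to inc_min_seq().
 */
bool enters_inc_min_seq(int nr_gens)
{
	return nr_gens == MAX_NR_GENS;
}

/*
 * Mirrors the two checks the patch removes: true means the pre-patch
 * inc_min_seq() jumps to "done" without moving the oldest generation.
 */
bool skips_rotation(int type, int swappiness)
{
	if (type && swappiness == SWAPPINESS_ANON_ONLY)
		return true;
	if (!type && !swappiness)
		return true;
	return false;
}

/*
 * Hypothetical helper: true when min_seq is advanced for this type while
 * the folios of its oldest generation are left where they are.
 */
bool min_seq_bumped_without_rotation(int type, int nr_gens, int swappiness)
{
	return enters_inc_min_seq(nr_gens) && skips_rotation(type, swappiness);
}
```

With 3 anon gens, 4 file gens and swappiness=201, the helper returns true for the file type and false for the anon type, which is the situation described in steps 2 and 3 of the commit message.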
* Re: [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201
  2026-04-07 13:37 ` [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201 wangzhen
@ 2026-04-07 14:25   ` Kairui Song
  2026-04-07 23:00     ` Barry Song
  0 siblings, 1 reply; 3+ messages in thread
From: Kairui Song @ 2026-04-07 14:25 UTC
To: wangzhen
Cc: Andrew Morton, Johannes Weiner, David Hildenbrand, Michal Hocko,
	Qi Zheng, Shakeel Butt, Lorenzo Stoakes, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, kasong@tencent.com,
	baolin.wang@linux.alibaba.com, baohua@kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Tue, Apr 07, 2026 at 01:37:08PM +0800, wangzhen wrote:
> Subject: [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201
>
> [...]
>
> @@ -3851,14 +3851,6 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness)
>  	int hist = lru_hist_from_seq(lrugen->min_seq[type]);
>  	int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
>  
> -	/* For file type, skip the check if swappiness is anon only */
> -	if (type && (swappiness == SWAPPINESS_ANON_ONLY))
> -		goto done;
> -
> -	/* For anon type, skip the check if swappiness is zero (file only) */
> -	if (!type && !swappiness)
> -		goto done;
> -

Hi, thanks for the patch.

We have a very similar patch internally, and the results were poor.

Currently MGLRU forbids the gen distance between file and anon from
growing larger than 2, which means that with this patch, under heavy
pressure, you may have to keep rotating a long list of folios of the
opposite type in order to reclaim the other type.

For example, suppose you have only 2 gens of file folios, swap is
disabled, and there are 3 gens of anon folios. Anon folios are
unevictable because there is no swap, and file folios are also
unevictable due to the force protection of gens. If the anon folios are
mostly cold (or at least a portion of them are), the oldest gen of anon
folios will be very long (e.g. 12G, 3145728 folios).

Now, to reclaim any file folios, you have to age first. Before this
patch that is usually fast. But after it, aging has to rotate all
3145728 folios to the second oldest anon gen, which could take a
very long time.

During that period any concurrent reclaimer will be rejected due to
force protection, resulting in very ugly long tail latencies or an
unexpected OOM.

So I agree this is a good idea in general and that we should do it,
but it is better to defer it until we have patched up MGLRU to remove
the force protection first.

But I think it might be reasonable to remove the SWAPPINESS_ANON_ONLY
limit now: it can only be triggered by proactive reclaim, which can
tolerate long tail latencies and won't cause OOM.
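The folio count in the example above can be checked with a few lines of C. This is a back-of-the-envelope sketch under two assumptions that the thread does not state: base pages of 4 KiB with no large folios, and a batch limit of 4096 folios per inc_min_seq call (the diff only shows that the loop is bounded by MAX_LRU_BATCH and returns false when the budget runs out, not the value of that constant). folios_in() and passes_needed() are hypothetical helper names.

```c
#include <stdint.h>

/*
 * Assumed values, not stated in the thread: 4 KiB folios only, and a
 * budget of 4096 folios per inc_min_seq() call.
 */
#define ASSUMED_FOLIO_SIZE 4096ULL
#define ASSUMED_BATCH 4096ULL

/* Number of folios needed to cover the given number of bytes. */
uint64_t folios_in(uint64_t bytes)
{
	return bytes / ASSUMED_FOLIO_SIZE;
}

/* Number of batch-sized passes needed to move nr_folios, rounding up. */
uint64_t passes_needed(uint64_t nr_folios)
{
	return (nr_folios + ASSUMED_BATCH - 1) / ASSUMED_BATCH;
}
```

Under these assumptions, folios_in(12ULL << 30) gives the 3145728 folios quoted in the example, and passes_needed() of that count gives 768 passes before min_seq of the anon type can be advanced.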
* Re: [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201
  2026-04-07 14:25   ` Kairui Song
@ 2026-04-07 23:00     ` Barry Song
  0 siblings, 0 replies; 3+ messages in thread
From: Barry Song @ 2026-04-07 23:00 UTC
To: Kairui Song
Cc: wangzhen, Andrew Morton, Johannes Weiner, David Hildenbrand,
	Michal Hocko, Qi Zheng, Shakeel Butt, Lorenzo Stoakes,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, kasong@tencent.com,
	baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org

On Tue, Apr 7, 2026 at 10:26 PM Kairui Song <ryncsn@gmail.com> wrote:
>
> On Tue, Apr 07, 2026 at 01:37:08PM +0800, wangzhen wrote:
> > [...]
>
> Hi, thanks for the patch.
>
> We have a very similar patch internally, and the results were poor.
>
> Currently MGLRU forbids the gen distance between file and anon from
> growing larger than 2, which means that with this patch, under heavy
> pressure, you may have to keep rotating a long list of folios of the
> opposite type in order to reclaim the other type.
>
> For example, suppose you have only 2 gens of file folios, swap is
> disabled, and there are 3 gens of anon folios. Anon folios are
> unevictable because there is no swap, and file folios are also
> unevictable due to the force protection of gens. If the anon folios are
> mostly cold (or at least a portion of them are), the oldest gen of anon
> folios will be very long (e.g. 12G, 3145728 folios).
>
> Now, to reclaim any file folios, you have to age first. Before this
> patch that is usually fast. But after it, aging has to rotate all
> 3145728 folios to the second oldest anon gen, which could take a
> very long time.
>
> During that period any concurrent reclaimer will be rejected due to
> force protection, resulting in very ugly long tail latencies or an
> unexpected OOM.
>
> So I agree this is a good idea in general and that we should do it,
> but it is better to defer it until we have patched up MGLRU to remove
> the force protection first.

I suspect that once we can age file and anonymous pages separately,
this issue will resolve itself. David already has some code for this [1].
Not sure when he will have time to push it upstream, but I may carve out
some time to take care of it this month.

[1] https://lore.kernel.org/linux-mm/aam5nOyXs1sNdjTe@google.com/

>
> But I think it might be reasonable to remove the SWAPPINESS_ANON_ONLY
> limit now: it can only be triggered by proactive reclaim, which can
> tolerate long tail latencies and won't cause OOM.

It may be better to defer both cases until file and anonymous pages
can be aged separately.

Thanks
Barry