* [PATCH] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
@ 2026-04-24 12:37 kerayhuang
2026-04-24 15:39 ` Baoquan He
0 siblings, 1 reply; 10+ messages in thread
From: kerayhuang @ 2026-04-24 12:37 UTC (permalink / raw)
To: kasong, bhe; +Cc: linux-mm, kerayhuang, Hao Peng
Add periodic cond_resched() calls during large full_clusters
reclaim operations to prevent softlockup issues.
Signed-off-by: kerayhuang <kerayhuang@tencent.com>
Reviewed-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
---
mm/swapfile.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9174f1eeffb0..74a1e324449d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1054,6 +1054,7 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
swap_cluster_unlock(ci);
if (to_scan <= 0)
break;
+ cond_resched();
}
}
--
2.43.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-04-24 12:37 [PATCH] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup kerayhuang
@ 2026-04-24 15:39 ` Baoquan He
2026-04-29 12:49 ` kerayhuang
0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2026-04-24 15:39 UTC (permalink / raw)
To: kerayhuang; +Cc: kasong, bhe, linux-mm, kerayhuang, Hao Peng
Hi Keray,
On 04/24/26 at 08:37pm, kerayhuang wrote:
> Add periodic cond_resched() calls during large full_clusters
> reclaim operations to prevent softlockup issues.
>
> Signed-off-by: kerayhuang <kerayhuang@tencent.com>
> Reviewed-by: Kairui Song <kasong@tencent.com>
> Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> ---
> mm/swapfile.c | 1 +
> 1 file changed, 1 insertion(+)
Thanks for the patch. The change looks good to me; however, I still have
a couple of small concerns.
For the patch log, it might be better to provide more details, e.g. did you
observe this issue in a production environment, or just during code
exploration? If it was observed in a production environment, what did the
backtrace look like when the softlockup happened?
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 9174f1eeffb0..74a1e324449d 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1054,6 +1054,7 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
> swap_cluster_unlock(ci);
> if (to_scan <= 0)
> break;
> + cond_resched();
Besides, isn't it a little too aggressive to call cond_resched() for
each cluster reclaimed, compared with the old code? Would you consider
making it gentler, e.g. calling cond_resched() only every several
clusters (8, 16, or another number decided based on your performance
testing statistics)?
Thanks
Baoquan
* Re: [PATCH] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-04-24 15:39 ` Baoquan He
@ 2026-04-29 12:49 ` kerayhuang
2026-05-01 2:43 ` Baoquan He
0 siblings, 1 reply; 10+ messages in thread
From: kerayhuang @ 2026-04-29 12:49 UTC (permalink / raw)
To: baoquan.he
Cc: bhe, flyingpeng, huangzjsmile, kasong, kerayhuang, albinwyang,
linux-mm
Hi Baoquan,
Thanks for the review!
> Hi Keray,
>
> On 04/24/26 at 08:37pm, kerayhuang wrote:
> > Add periodic cond_resched() calls during large full_clusters
> > reclaim operations to prevent softlockup issues.
> >
> > Signed-off-by: kerayhuang <kerayhuang@tencent.com>
> > Reviewed-by: Kairui Song <kasong@tencent.com>
> > Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> > ---
> > mm/swapfile.c | 1 +
> > 1 file changed, 1 insertion(+)
>
> Thanks for the patch. The change looks good to me; however, I still
> have a couple of small concerns.
>
> For the patch log, it might be better to provide more details, e.g. did you
> observe this issue in a production environment, or just during code
> exploration? If it was observed in a production environment, what did the
> backtrace look like when the softlockup happened?
We hit a real softlockup in an internal stress test environment.
The workload was LTP memory/swap stress on a large arm64 machine,
with 320 CPUs, about 1TB memory and an 8.6GB swap device.
The system was under heavy load and the swap device had a large
number of full clusters. The softlockup was triggered during
a stress test after about 3 days.
The backtrace looks like:
PID: 3817773 TASK: ffff0883bb28b780 CPU: 48 COMMAND: "kworker/48:7"
#0 [ffff800080183d10] __crash_kexec at ffffa4c1361e5de4
#1 [ffff800080183d90] panic at ffffa4c1360d5e9c
#2 [ffff800080183e20] watchdog_timer_fn at ffffa4c136231fa8
...
#16 [ffff8000c4ad3cb0] swap_cache_del_folio at ffffa4c1363e1614
#17 [ffff8000c4ad3ce0] __try_to_reclaim_swap at ffffa4c1363e4bfc
#18 [ffff8000c4ad3d40] swap_reclaim_full_clusters at ffffa4c1363e5474
#19 [ffff8000c4ad3da0] swap_reclaim_work at ffffa4c1363e550c
#20 [ffff8000c4ad3dc0] process_one_work at ffffa4c136102edc
#21 [ffff8000c4ad3e10] worker_thread at ffffa4c136103398
#22 [ffff8000c4ad3e70] kthread at ffffa4c13610d95c
From the vmcore analysis, swap_reclaim_work() called
swap_reclaim_full_clusters() with force=true, which sets to_scan to 1551 clusters.
At the time of the softlockup, there were still 1427 full clusters remaining in the
full_clusters list.
I will add these details to the commit log in v2.
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 9174f1eeffb0..74a1e324449d 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -1054,6 +1054,7 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
> > swap_cluster_unlock(ci);
> > if (to_scan <= 0)
> > break;
> > + cond_resched();
>
> > Besides, isn't it a little too aggressive to call cond_resched() for
> > each cluster reclaimed, compared with the old code? Would you consider
> > making it gentler, e.g. calling cond_resched() only every several
> > clusters (8, 16, or another number decided based on your performance
> > testing statistics)?
I think calling cond_resched() per cluster is reasonable
here because:
1) Each cluster iteration already involves scanning up to 512 slots,
and each slot reclaim may call __try_to_reclaim_swap() which does
non-trivial work (lock/unlock, folio lookup, swap cache deletion,
and potentially slab freeing). So the work per cluster is already
substantial.
2) cond_resched() is a lightweight check - it only actually reschedules
when need_resched() is set, so in the common case it's just a flag
check with negligible overhead. Therefore calling it once per cluster
gives a bounded latency without forcing an actual context switch every
time. If we call it only every 8 or 16 clusters, the worst-case
non-preemptible window can still become quite large on machines with
many full clusters.
3) This is a workqueue context (swap_reclaim_work), not a hot fast
path, so the slight overhead is acceptable.
Thanks,
Keray
* Re: [PATCH] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-04-29 12:49 ` kerayhuang
@ 2026-05-01 2:43 ` Baoquan He
2026-05-06 13:09 ` [PATCH v2] " Zijiang Huang
0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2026-05-01 2:43 UTC (permalink / raw)
To: kerayhuang; +Cc: bhe, flyingpeng, kasong, kerayhuang, albinwyang, linux-mm
On 04/29/26 at 08:49pm, kerayhuang wrote:
> Hi Baoquan,
> Thanks for the review!
> > Hi Keray,
> >
> > On 04/24/26 at 08:37pm, kerayhuang wrote:
> > > Add periodic cond_resched() calls during large full_clusters
> > > reclaim operations to prevent softlockup issues.
> > >
> > > Signed-off-by: kerayhuang <kerayhuang@tencent.com>
> > > Reviewed-by: Kairui Song <kasong@tencent.com>
> > > Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> > > ---
> > > mm/swapfile.c | 1 +
> > > 1 file changed, 1 insertion(+)
> >
> > Thanks for the patch. The change looks good to me; however, I still
> > have a couple of small concerns.
> >
> > For the patch log, it might be better to provide more details, e.g. did you
> > observe this issue in a production environment, or just during code
> > exploration? If it was observed in a production environment, what did the
> > backtrace look like when the softlockup happened?
>
> We hit a real softlockup in an internal stress test environment.
> The workload was LTP memory/swap stress on a large arm64 machine,
> with 320 CPUs, about 1TB memory and an 8.6GB swap device.
> The system was under heavy load and the swap device had a large
> number of full clusters. The softlockup was triggered during
> a stress test after about 3 days.
>
> The backtrace looks like:
>
> PID: 3817773 TASK: ffff0883bb28b780 CPU: 48 COMMAND: "kworker/48:7"
> #0 [ffff800080183d10] __crash_kexec at ffffa4c1361e5de4
> #1 [ffff800080183d90] panic at ffffa4c1360d5e9c
> #2 [ffff800080183e20] watchdog_timer_fn at ffffa4c136231fa8
> ...
> #16 [ffff8000c4ad3cb0] swap_cache_del_folio at ffffa4c1363e1614
> #17 [ffff8000c4ad3ce0] __try_to_reclaim_swap at ffffa4c1363e4bfc
> #18 [ffff8000c4ad3d40] swap_reclaim_full_clusters at ffffa4c1363e5474
> #19 [ffff8000c4ad3da0] swap_reclaim_work at ffffa4c1363e550c
> #20 [ffff8000c4ad3dc0] process_one_work at ffffa4c136102edc
> #21 [ffff8000c4ad3e10] worker_thread at ffffa4c136103398
> #22 [ffff8000c4ad3e70] kthread at ffffa4c13610d95c
>
> From the vmcore analysis, swap_reclaim_work() called
> swap_reclaim_full_clusters() with force=true, which sets to_scan to 1551 clusters.
> At the time of the softlockup, there were still 1427 full clusters remaining in the
> full_clusters list.
>
> I will add these details to the commit log in v2.
Sounds like very good root cause digging. Adding these details into the
patch log will be very helpful.
By the way, is it worth a Fixes tag?
>
> > > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > > index 9174f1eeffb0..74a1e324449d 100644
> > > --- a/mm/swapfile.c
> > > +++ b/mm/swapfile.c
> > > @@ -1054,6 +1054,7 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
> > > swap_cluster_unlock(ci);
> > > if (to_scan <= 0)
> > > break;
> > > + cond_resched();
> >
> > Besides, isn't it a little too aggressive to call cond_resched() for
> > each cluster reclaimed, compared with the old code? Would you consider
> > making it gentler, e.g. calling cond_resched() only every several
> > clusters (8, 16, or another number decided based on your performance
> > testing statistics)?
>
> I think calling cond_resched() per cluster is reasonable
> here because:
>
> 1) Each cluster iteration already involves scanning up to 512 slots,
> and each slot reclaim may call __try_to_reclaim_swap() which does
> non-trivial work (lock/unlock, folio lookup, swap cache deletion,
> and potentially slab freeing). So the work per cluster is already
> substantial.
>
> 2) cond_resched() is a lightweight check - it only actually reschedules
> when need_resched() is set, so in the common case it's just a flag
> check with negligible overhead. Therefore calling it once per cluster
> gives a bounded latency without forcing an actual context switch every
> time. If we call it only every 8 or 16 clusters, the worst-case
> non-preemptible window can still become quite large on machines with
> many full clusters.
>
> 3) This is a workqueue context (swap_reclaim_work), not a hot fast
> path, so the slight overhead is acceptable.
OK, that sounds good. My only remaining thought is that, when the system
is under heavy stress, it could yield after each cluster reclaimed; and on
a system with a bigger swap disk it will always need to check whether swap
is 50% full and whether it's running in the workqueue. Anyway, maybe I am
overthinking. Overall this looks very good to me: good catch, good root
cause digging and good fix. Let's see if other people have any concerns.
Thanks
Baoquan
* [PATCH v2] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-05-01 2:43 ` Baoquan He
@ 2026-05-06 13:09 ` Zijiang Huang
2026-05-07 1:35 ` Baoquan He
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Zijiang Huang @ 2026-05-06 13:09 UTC (permalink / raw)
To: baoquan.he
Cc: albinwyang, bhe, flyingpeng, huangzjsmile, kasong, kerayhuang,
linux-mm
We hit a real softlockup in an internal stress test environment.
The workload was LTP memory/swap stress on a large arm64 machine,
with 320 CPUs, about 1TB memory and an 8.6GB swap device.
The system was under heavy load and the swap device had a large
number of full clusters. The softlockup was triggered during
a stress test after about 3 days.
So, add periodic cond_resched() calls during large full_clusters
reclaim operations to prevent softlockup issues.
Detailed call trace as follows:
PID: 3817773 TASK: ffff0883bb28b780 CPU: 48 COMMAND: "kworker/48:7"
#0 [ffff800080183d10] __crash_kexec at ffffa4c1361e5de4
#1 [ffff800080183d90] panic at ffffa4c1360d5e9c
#2 [ffff800080183e20] watchdog_timer_fn at ffffa4c136231fa8
...
#16 [ffff8000c4ad3cb0] swap_cache_del_folio at ffffa4c1363e1614
#17 [ffff8000c4ad3ce0] __try_to_reclaim_swap at ffffa4c1363e4bfc
#18 [ffff8000c4ad3d40] swap_reclaim_full_clusters at ffffa4c1363e5474
#19 [ffff8000c4ad3da0] swap_reclaim_work at ffffa4c1363e550c
#20 [ffff8000c4ad3dc0] process_one_work at ffffa4c136102edc
#21 [ffff8000c4ad3e10] worker_thread at ffffa4c136103398
#22 [ffff8000c4ad3e70] kthread at ffffa4c13610d95c
Fixes: 5168a68eb78f ("mm, swap: avoid over reclaim of full clusters")
Signed-off-by: Zijiang Huang <kerayhuang@tencent.com>
Reviewed-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Reviewed-by: albinwyang <albinwyang@tencent.com>
---
mm/swapfile.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9174f1eeffb0..74a1e324449d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1054,6 +1054,7 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
swap_cluster_unlock(ci);
if (to_scan <= 0)
break;
+ cond_resched();
}
}
--
2.43.5
* Re: [PATCH v2] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-05-06 13:09 ` [PATCH v2] " Zijiang Huang
@ 2026-05-07 1:35 ` Baoquan He
2026-05-07 4:02 ` Chris Li
2026-05-08 21:29 ` Andrew Morton
2 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2026-05-07 1:35 UTC (permalink / raw)
To: Zijiang Huang, akpm, chrisl
Cc: albinwyang, bhe, flyingpeng, kasong, kerayhuang, linux-mm
On 05/06/26 at 09:09pm, Zijiang Huang wrote:
> We hit a real softlockup in an internal stress test environment.
> The workload was LTP memory/swap stress on a large arm64 machine,
> with 320 CPUs, about 1TB memory and an 8.6GB swap device.
> The system was under heavy load and the swap device had a large
> number of full clusters. The softlockup was triggered during
> a stress test after about 3 days.
>
> So, add periodic cond_resched() calls during large full_clusters
> reclaim operations to prevent softlockup issues.
>
> Detailed call trace as follows:
>
> PID: 3817773 TASK: ffff0883bb28b780 CPU: 48 COMMAND: "kworker/48:7"
> #0 [ffff800080183d10] __crash_kexec at ffffa4c1361e5de4
> #1 [ffff800080183d90] panic at ffffa4c1360d5e9c
> #2 [ffff800080183e20] watchdog_timer_fn at ffffa4c136231fa8
> ...
> #16 [ffff8000c4ad3cb0] swap_cache_del_folio at ffffa4c1363e1614
> #17 [ffff8000c4ad3ce0] __try_to_reclaim_swap at ffffa4c1363e4bfc
> #18 [ffff8000c4ad3d40] swap_reclaim_full_clusters at ffffa4c1363e5474
> #19 [ffff8000c4ad3da0] swap_reclaim_work at ffffa4c1363e550c
> #20 [ffff8000c4ad3dc0] process_one_work at ffffa4c136102edc
> #21 [ffff8000c4ad3e10] worker_thread at ffffa4c136103398
> #22 [ffff8000c4ad3e70] kthread at ffffa4c13610d95c
>
> Fixes: 5168a68eb78f ("mm, swap: avoid over reclaim of full clusters")
> Signed-off-by: Zijiang Huang <kerayhuang@tencent.com>
> Reviewed-by: Kairui Song <kasong@tencent.com>
> Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> Reviewed-by: albinwyang <albinwyang@tencent.com>
> ---
> mm/swapfile.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 9174f1eeffb0..74a1e324449d 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1054,6 +1054,7 @@ static void swap_reclaim_full_clusters(struct swap_info_struct *si, bool force)
> swap_cluster_unlock(ci);
> if (to_scan <= 0)
> break;
> + cond_resched();
> }
> }
LGTM,
Reviewed-by: Baoquan He <baoquan.he@linux.dev>
* Re: [PATCH v2] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-05-06 13:09 ` [PATCH v2] " Zijiang Huang
2026-05-07 1:35 ` Baoquan He
@ 2026-05-07 4:02 ` Chris Li
2026-05-08 11:33 ` Zijiang Huang
2026-05-08 21:29 ` Andrew Morton
2 siblings, 1 reply; 10+ messages in thread
From: Chris Li @ 2026-05-07 4:02 UTC (permalink / raw)
To: Zijiang Huang
Cc: baoquan.he, albinwyang, bhe, flyingpeng, kasong, kerayhuang,
linux-mm
On Wed, May 6, 2026 at 4:56 PM Zijiang Huang <huangzjsmile@gmail.com> wrote:
>
> We hit a real softlockup in an internal stress test environment.
> The workload was LTP memory/swap stress on a large arm64 machine,
> with 320 CPUs, about 1TB memory and an 8.6GB swap device.
> The system was under heavy load and the swap device had a large
> number of full clusters. The softlockup was triggered during
> a stress test after about 3 days.
>
> So, add periodic cond_resched() calls during large full_clusters
> reclaim operations to prevent softlockup issues.
Thank you for reporting and fixing this issue.
Can you add that to the patch commit log? This background information
is very important for readers to understand why this change is needed.
I assume you will update the commit log.
Acked-by: Chris Li <chrisl@kernel.org>
Chris
* Re: [PATCH v2] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-05-07 4:02 ` Chris Li
@ 2026-05-08 11:33 ` Zijiang Huang
2026-05-08 13:49 ` Baoquan He
0 siblings, 1 reply; 10+ messages in thread
From: Zijiang Huang @ 2026-05-08 11:33 UTC (permalink / raw)
To: chrisl
Cc: albinwyang, baoquan.he, bhe, flyingpeng, huangzjsmile, kasong,
kerayhuang, linux-mm
>On Wed, May 6, 2026 at 4:56 PM Zijiang Huang <huangzjsmile@gmail.com> wrote:
>>
>> We hit a real softlockup in an internal stress test environment.
>> The workload was LTP memory/swap stress on a large arm64 machine,
>> with 320 CPUs, about 1TB memory and an 8.6GB swap device.
>> The system was under heavy load and the swap device had a large
>> number of full clusters. The softlockup was triggered during
>> a stress test after about 3 days.
>>
>> So, add periodic cond_resched() calls during large full_clusters
>> reclaim operations to prevent softlockup issues.
>Thank you for reporting and fixing this issue.
>
>Can you add that to the patch commit log? This background information
>is very important for readers to understand why this change is needed.
>
>I assume you will update the commit log.
>
>Acked-by: Chris Li <chrisl@kernel.org>
>
>Chris
Hi Chris, I've updated the commit log with the background information in v2. Thanks!
* Re: [PATCH v2] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-05-08 11:33 ` Zijiang Huang
@ 2026-05-08 13:49 ` Baoquan He
0 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2026-05-08 13:49 UTC (permalink / raw)
To: Zijiang Huang
Cc: chrisl, albinwyang, bhe, flyingpeng, kasong, kerayhuang, linux-mm
On 05/08/26 at 07:33pm, Zijiang Huang wrote:
> >On Wed, May 6, 2026 at 4:56 PM Zijiang Huang <huangzjsmile@gmail.com> wrote:
> >>
> >> We hit a real softlockup in an internal stress test environment.
> >> The workload was LTP memory/swap stress on a large arm64 machine,
> >> with 320 CPUs, about 1TB memory and an 8.6GB swap device.
> >> The system was under heavy load and the swap device had a large
> >> number of full clusters. The softlockup was triggered during
> >> a stress test after about 3 days.
> >>
> >> So, add periodic cond_resched() calls during large full_clusters
> >> reclaim operations to prevent softlockup issues.
> >Thank you for reporting and fixing this issue.
> >
> >Can you add that to the patch commit log? This background information
> >is very important for readers to understand why this change is needed.
> >
> >I assume you will update the commit log.
> >
> >Acked-by: Chris Li <chrisl@kernel.org>
> >
> >Chris
>
> Hi Chris, I've updated the commit log with the background information in v2. Thanks!
From my mail client, I see your v2 is in the same thread as v1. People
may mistake it for a discussion reply. It's better to post v2 as a
separate thread. You can resend v2 with the reviewers' ack tags collected.
* Re: [PATCH v2] mm/swap: Add cond_resched() in swap_reclaim_full_clusters to prevent softlockup
2026-05-06 13:09 ` [PATCH v2] " Zijiang Huang
2026-05-07 1:35 ` Baoquan He
2026-05-07 4:02 ` Chris Li
@ 2026-05-08 21:29 ` Andrew Morton
2 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2026-05-08 21:29 UTC (permalink / raw)
To: Zijiang Huang
Cc: baoquan.he, albinwyang, bhe, flyingpeng, kasong, kerayhuang,
linux-mm
On Wed, 6 May 2026 21:09:19 +0800 Zijiang Huang <huangzjsmile@gmail.com> wrote:
> We hit a real softlockup in an internal stress test environment.
> The workload was LTP memory/swap stress on a large arm64 machine,
> with 320 CPUs, about 1TB memory and an 8.6GB swap device.
> The system was under heavy load and the swap device had a large
> number of full clusters. The softlockup was triggered during
> a stress test after about 3 days.
>
> So, add periodic cond_resched() calls during large full_clusters
> reclaim operations to prevent softlockup issues.
>
> Detailed call trace as follows:
>
> PID: 3817773 TASK: ffff0883bb28b780 CPU: 48 COMMAND: "kworker/48:7"
> #0 [ffff800080183d10] __crash_kexec at ffffa4c1361e5de4
> #1 [ffff800080183d90] panic at ffffa4c1360d5e9c
> #2 [ffff800080183e20] watchdog_timer_fn at ffffa4c136231fa8
> ...
> #16 [ffff8000c4ad3cb0] swap_cache_del_folio at ffffa4c1363e1614
> #17 [ffff8000c4ad3ce0] __try_to_reclaim_swap at ffffa4c1363e4bfc
> #18 [ffff8000c4ad3d40] swap_reclaim_full_clusters at ffffa4c1363e5474
> #19 [ffff8000c4ad3da0] swap_reclaim_work at ffffa4c1363e550c
> #20 [ffff8000c4ad3dc0] process_one_work at ffffa4c136102edc
> #21 [ffff8000c4ad3e10] worker_thread at ffffa4c136103398
> #22 [ffff8000c4ad3e70] kthread at ffffa4c13610d95c
Thanks.
> Fixes: 5168a68eb78f ("mm, swap: avoid over reclaim of full clusters")
I'll add a cc:stable to this to help ensure that earlier kernels don't
hit this.
> Signed-off-by: Zijiang Huang <kerayhuang@tencent.com>
> Reviewed-by: Kairui Song <kasong@tencent.com>
> Reviewed-by: Hao Peng <flyingpeng@tencent.com>
> Reviewed-by: albinwyang <albinwyang@tencent.com>