* [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
@ 2026-03-19 7:43 lirongqing
2026-03-19 9:39 ` Uladzislau Rezki
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: lirongqing @ 2026-03-19 7:43 UTC (permalink / raw)
To: Andrew Morton, Uladzislau Rezki, linux-mm, linux-kernel; +Cc: Li RongQing
From: Li RongQing <lirongqing@baidu.com>
The drain_vmap_area_work() function can take >10ms to complete when
there are many accumulated vmap areas in a system with a high CPU
count, causing workqueue watchdog warnings when run via
schedule_work():
[ 2069.796205] workqueue: drain_vmap_area_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
[ 2192.823225] workqueue: drain_vmap_area_work hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
Switch to a dedicated WQ_UNBOUND workqueue to allow the scheduler to
run this background task on any available CPU, improving responsiveness.
Use WQ_MEM_RECLAIM to ensure forward progress under memory pressure.
Create vmap_drain_wq in vmalloc_init_late() which is called after
workqueue_init_early() in start_kernel() to avoid boot-time crashes.
Suggested-by: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
Changes since v1: create a dedicated unbound workqueue
include/linux/vmalloc.h | 2 ++
init/main.c | 1 +
mm/vmalloc.c | 14 +++++++++++++-
3 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index e8e94f9..c028603 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -301,11 +301,13 @@ static inline void set_vm_flush_reset_perms(void *addr)
if (vm)
vm->flags |= VM_FLUSH_RESET_PERMS;
}
+void __init vmalloc_init_late(void);
#else /* !CONFIG_MMU */
#define VMALLOC_TOTAL 0UL
static inline unsigned long vmalloc_nr_pages(void) { return 0; }
static inline void set_vm_flush_reset_perms(void *addr) {}
+static inline void __init vmalloc_init_late(void) {}
#endif /* CONFIG_MMU */
#if defined(CONFIG_MMU) && defined(CONFIG_SMP)
diff --git a/init/main.c b/init/main.c
index 1cb395d..50b497f 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1099,6 +1099,7 @@ void start_kernel(void)
* workqueue_init().
*/
workqueue_init_early();
+ vmalloc_init_late();
rcu_init();
kvfree_rcu_init();
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 61caa55..a52ccd4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1067,6 +1067,7 @@ static void reclaim_and_purge_vmap_areas(void);
static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
static void drain_vmap_area_work(struct work_struct *work);
static DECLARE_WORK(drain_vmap_work, drain_vmap_area_work);
+static struct workqueue_struct *vmap_drain_wq;
static __cacheline_aligned_in_smp atomic_long_t nr_vmalloc_pages;
static __cacheline_aligned_in_smp atomic_long_t vmap_lazy_nr;
@@ -2471,7 +2472,7 @@ static void free_vmap_area_noflush(struct vmap_area *va)
/* After this point, we may free va at any time */
if (unlikely(nr_lazy > nr_lazy_max))
- schedule_work(&drain_vmap_work);
+ queue_work(vmap_drain_wq, &drain_vmap_work);
}
/*
@@ -5422,6 +5423,17 @@ vmap_node_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
return SHRINK_STOP;
}
+void __init vmalloc_init_late(void)
+{
+ vmap_drain_wq = alloc_workqueue("vmap_drain",
+ WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
+ if (!vmap_drain_wq) {
+ pr_warn("vmap_drain_wq creation failed, using system_unbound_wq\n");
+ vmap_drain_wq = system_unbound_wq;
+ }
+
+}
+
void __init vmalloc_init(void)
{
struct shrinker *vmap_node_shrinker;
--
2.9.4
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
2026-03-19 7:43 [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining lirongqing
@ 2026-03-19 9:39 ` Uladzislau Rezki
2026-03-19 10:05 ` Reply: " Li,Rongqing(ACG CCN)
2026-03-20 3:16 ` Andrew Morton
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Uladzislau Rezki @ 2026-03-19 9:39 UTC (permalink / raw)
To: lirongqing; +Cc: Andrew Morton, Uladzislau Rezki, linux-mm, linux-kernel
On Thu, Mar 19, 2026 at 03:43:07AM -0400, lirongqing wrote:
> From: Li RongQing <lirongqing@baidu.com>
>
> The drain_vmap_area_work() function can take >10ms to complete when
> there are many accumulated vmap areas in a system with a high CPU
> count, causing workqueue watchdog warnings when run via
> schedule_work():
>
> [ 2069.796205] workqueue: drain_vmap_area_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
> [ 2192.823225] workqueue: drain_vmap_area_work hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
>
> Switch to a dedicated WQ_UNBOUND workqueue to allow the scheduler to
> run this background task on any available CPU, improving responsiveness.
> Use WQ_MEM_RECLAIM to ensure forward progress under memory pressure.
>
> Create vmap_drain_wq in vmalloc_init_late() which is called after
> workqueue_init_early() in start_kernel() to avoid boot-time crashes.
>
> Suggested-by: Uladzislau Rezki <urezki@gmail.com>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
> Diff with v1: create dedicated unbound workqueue
>
> include/linux/vmalloc.h | 2 ++
> init/main.c | 1 +
> mm/vmalloc.c | 14 +++++++++++++-
> 3 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index e8e94f9..c028603 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -301,11 +301,13 @@ static inline void set_vm_flush_reset_perms(void *addr)
> if (vm)
> vm->flags |= VM_FLUSH_RESET_PERMS;
> }
> +void __init vmalloc_init_late(void);
> #else /* !CONFIG_MMU */
> #define VMALLOC_TOTAL 0UL
>
> static inline unsigned long vmalloc_nr_pages(void) { return 0; }
> static inline void set_vm_flush_reset_perms(void *addr) {}
> +static inline void __init vmalloc_init_late(void) {}
> #endif /* CONFIG_MMU */
>
> #if defined(CONFIG_MMU) && defined(CONFIG_SMP)
> diff --git a/init/main.c b/init/main.c
> index 1cb395d..50b497f 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1099,6 +1099,7 @@ void start_kernel(void)
> * workqueue_init().
> */
> workqueue_init_early();
> + vmalloc_init_late();
>
No, no. We should not patch main.c for such purpose :)
> rcu_init();
> kvfree_rcu_init();
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 61caa55..a52ccd4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1067,6 +1067,7 @@ static void reclaim_and_purge_vmap_areas(void);
> static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
> static void drain_vmap_area_work(struct work_struct *work);
> static DECLARE_WORK(drain_vmap_work, drain_vmap_area_work);
> +static struct workqueue_struct *vmap_drain_wq;
>
> static __cacheline_aligned_in_smp atomic_long_t nr_vmalloc_pages;
> static __cacheline_aligned_in_smp atomic_long_t vmap_lazy_nr;
> @@ -2471,7 +2472,7 @@ static void free_vmap_area_noflush(struct vmap_area *va)
>
> /* After this point, we may free va at any time */
> if (unlikely(nr_lazy > nr_lazy_max))
> - schedule_work(&drain_vmap_work);
> + queue_work(vmap_drain_wq, &drain_vmap_work);
> }
>
> /*
> @@ -5422,6 +5423,17 @@ vmap_node_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> return SHRINK_STOP;
> }
>
> +void __init vmalloc_init_late(void)
> +{
> + vmap_drain_wq = alloc_workqueue("vmap_drain",
> + WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
> + if (!vmap_drain_wq) {
> + pr_warn("vmap_drain_wq creation failed, using system_unbound_wq\n");
> + vmap_drain_wq = system_unbound_wq;
> + }
> +
> +}
> +
> void __init vmalloc_init(void)
> {
> struct shrinker *vmap_node_shrinker;
> --
> 2.9.4
>
Why can't you add this into vmalloc_init()?
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 8+ messages in thread
* Reply: Re: [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
2026-03-19 9:39 ` Uladzislau Rezki
@ 2026-03-19 10:05 ` Li,Rongqing(ACG CCN)
2026-03-19 13:23 ` Uladzislau Rezki
0 siblings, 1 reply; 8+ messages in thread
From: Li,Rongqing(ACG CCN) @ 2026-03-19 10:05 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
> On Thu, Mar 19, 2026 at 03:43:07AM -0400, lirongqing wrote:
> > [... v2 patch quoted in full; trimmed ...]
> >
> No, no. We should not patch main.c for such purpose :)
>
> Why can't you add this into vmalloc_init()?
>
If alloc_workqueue() is added into vmalloc_init(), the system will crash and fail to boot, since allocating a workqueue depends on workqueue_init_early()
Maybe this commit 3347fa092821 ("workqueue: make workqueue available early during boot") shows the reason
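For reference, a simplified sketch of the relevant boot ordering (not a literal excerpt; the exact call sites are in init/main.c):

```c
/*
 * Simplified sketch, not a literal excerpt of init/main.c:
 * vmalloc_init() runs from mm_core_init(), which start_kernel()
 * calls before workqueue_init_early(), so alloc_workqueue() is
 * not yet usable when vmalloc_init() executes.
 */
void __init start_kernel(void)
{
	/* ... */
	mm_core_init();          /* -> vmalloc_init(): too early for alloc_workqueue() */
	/* ... */
	workqueue_init_early();  /* workqueues can be allocated from here on */
	rcu_init();
	/* ... */
}
```

An early_initcall() runs well after workqueue_init_early(), which is why allocating the workqueue from one is safe.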
[Li,Rongqing]
> --
> Uladzislau Rezki
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Reply: Re: [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
2026-03-19 10:05 ` Reply: " Li,Rongqing(ACG CCN)
@ 2026-03-19 13:23 ` Uladzislau Rezki
2026-03-20 5:48 ` Reply: " Li,Rongqing(ACG CCN)
0 siblings, 1 reply; 8+ messages in thread
From: Uladzislau Rezki @ 2026-03-19 13:23 UTC (permalink / raw)
To: Li,Rongqing(ACG CCN)
Cc: Uladzislau Rezki, Andrew Morton, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
On Thu, Mar 19, 2026 at 10:05:42AM +0000, Li,Rongqing(ACG CCN) wrote:
>
> > [... v2 patch and review comments quoted; trimmed ...]
> >
> > Why can't you add this into vmalloc_init()?
> >
>
> If alloc_workqueue() is added into vmalloc_init(), the system will crash and fail to boot, since allocating a workqueue depends on workqueue_init_early()
>
> Maybe this commit 3347fa092821 ("workqueue: make workqueue available early during boot") shows the reason
>
That is true.
<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 61caa55a4402..81e1e74346d5 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1067,6 +1067,7 @@ static void reclaim_and_purge_vmap_areas(void);
static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
static void drain_vmap_area_work(struct work_struct *work);
static DECLARE_WORK(drain_vmap_work, drain_vmap_area_work);
+static struct workqueue_struct *drain_vmap_wq;
static __cacheline_aligned_in_smp atomic_long_t nr_vmalloc_pages;
static __cacheline_aligned_in_smp atomic_long_t vmap_lazy_nr;
@@ -2437,6 +2438,17 @@ static void drain_vmap_area_work(struct work_struct *work)
mutex_unlock(&vmap_purge_lock);
}
+static void
+schedule_drain_vmap_work(unsigned long nr_lazy, unsigned long nr_lazy_max)
+{
+ if (unlikely(nr_lazy > nr_lazy_max)) {
+ struct workqueue_struct *wq = READ_ONCE(drain_vmap_wq);
+
+ if (wq)
+ queue_work(wq, &drain_vmap_work);
+ }
+}
+
/*
* Free a vmap area, caller ensuring that the area has been unmapped,
* unlinked and flush_cache_vunmap had been called for the correct
@@ -2470,8 +2482,7 @@ static void free_vmap_area_noflush(struct vmap_area *va)
trace_free_vmap_area_noflush(va_start, nr_lazy, nr_lazy_max);
/* After this point, we may free va at any time */
- if (unlikely(nr_lazy > nr_lazy_max))
- schedule_work(&drain_vmap_work);
+ schedule_drain_vmap_work(nr_lazy, nr_lazy_max);
}
/*
@@ -5483,3 +5494,15 @@ void __init vmalloc_init(void)
vmap_node_shrinker->scan_objects = vmap_node_shrink_scan;
shrinker_register(vmap_node_shrinker);
}
+
+static int __init vmalloc_init_workqueue(void)
+{
+ struct workqueue_struct *wq;
+
+ wq = alloc_workqueue("vmap_drain", WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
+ WARN_ON(wq == NULL);
+ WRITE_ONCE(drain_vmap_wq, wq);
+
+ return 0;
+}
+early_initcall(vmalloc_init_workqueue);
<snip>
--
Uladzislau Rezki
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
2026-03-19 7:43 [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining lirongqing
2026-03-19 9:39 ` Uladzislau Rezki
@ 2026-03-20 3:16 ` Andrew Morton
2026-03-20 9:51 ` [syzbot ci] " syzbot ci
2026-03-24 13:32 ` [PATCH v2] " kernel test robot
3 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2026-03-20 3:16 UTC (permalink / raw)
To: lirongqing; +Cc: Uladzislau Rezki, linux-mm, linux-kernel
On Thu, 19 Mar 2026 03:43:07 -0400 lirongqing <lirongqing@baidu.com> wrote:
> The drain_vmap_area_work() function can take >10ms to complete when
> there are many accumulated vmap areas in a system with a high CPU
> count, causing workqueue watchdog warnings when run via
> schedule_work():
>
> [ 2069.796205] workqueue: drain_vmap_area_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
> [ 2192.823225] workqueue: drain_vmap_area_work hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
>
> Switch to a dedicated WQ_UNBOUND workqueue to allow the scheduler to
> run this background task on any available CPU, improving responsiveness.
> Use WQ_MEM_RECLAIM to ensure forward progress under memory pressure.
>
> Create vmap_drain_wq in vmalloc_init_late() which is called after
> workqueue_init_early() in start_kernel() to avoid boot-time crashes.
AI review flags some potential issues:
https://sashiko.dev/#/patchset/20260319074307.2325-1-lirongqing%40baidu.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Reply: Re: [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
2026-03-19 13:23 ` Uladzislau Rezki
@ 2026-03-20 5:48 ` Li,Rongqing(ACG CCN)
0 siblings, 0 replies; 8+ messages in thread
From: Li,Rongqing(ACG CCN) @ 2026-03-20 5:48 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
> That is true.
>
> [... proposed patch quoted in full; trimmed ...]
>
I tested the above code and it works for me. Would you like to send a formal patch?
Reported-and-tested-by: Li RongQing <lirongqing@baidu.com>
thanks
[Li,Rongqing]
> --
> Uladzislau Rezki
^ permalink raw reply [flat|nested] 8+ messages in thread
* [syzbot ci] Re: mm/vmalloc: use dedicated unbound workqueue for vmap area draining
2026-03-19 7:43 [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining lirongqing
2026-03-19 9:39 ` Uladzislau Rezki
2026-03-20 3:16 ` Andrew Morton
@ 2026-03-20 9:51 ` syzbot ci
2026-03-24 13:32 ` [PATCH v2] " kernel test robot
3 siblings, 0 replies; 8+ messages in thread
From: syzbot ci @ 2026-03-20 9:51 UTC (permalink / raw)
To: akpm, linux-kernel, linux-mm, lirongqing, urezki; +Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
https://lore.kernel.org/all/20260319074307.2325-1-lirongqing@baidu.com
* [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
and found the following issue:
possible deadlock in console_flush_all
Full report is available here:
https://ci.syzbot.org/series/1703e204-a8b3-43ef-8979-a596c0ada77b
***
possible deadlock in console_flush_all
tree: mm-new
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git
base: 8616acb9dc887e0e271229bf520b5279fbd22f94
arch: amd64
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config: https://ci.syzbot.org/builds/b3a95cb5-d858-4555-a40b-1b611b74214b/config
syz repro: https://ci.syzbot.org/findings/d4780575-25c4-4403-a24b-e1c9a6237f30/syz_repro
------------[ cut here ]------------
======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Not tainted
------------------------------------------------------
kworker/u9:4/94 is trying to acquire lock:
ffffffff8e750900 (console_owner){....}-{0:0}, at: rcu_try_lock_acquire include/linux/rcupdate.h:317 [inline]
ffffffff8e750900 (console_owner){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:428 [inline]
ffffffff8e750900 (console_owner){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:291 [inline]
ffffffff8e750900 (console_owner){....}-{0:0}, at: console_flush_one_record kernel/printk/printk.c:3246 [inline]
ffffffff8e750900 (console_owner){....}-{0:0}, at: console_flush_all+0x123/0xb20 kernel/printk/printk.c:3343
but task is already holding lock:
ffff88812103a498 (&pool->lock){-.-.}-{2:2}, at: start_flush_work kernel/workqueue.c:4241 [inline]
ffff88812103a498 (&pool->lock){-.-.}-{2:2}, at: __flush_work+0x1ef/0xc50 kernel/workqueue.c:4292
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&pool->lock){-.-.}-{2:2}:
__raw_spin_lock include/linux/spinlock_api_smp.h:158 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
__queue_work+0x80b/0x1020 kernel/workqueue.c:-1
queue_work_on+0x106/0x1d0 kernel/workqueue.c:2405
queue_work include/linux/workqueue.h:669 [inline]
rpm_suspend+0xe85/0x1750 drivers/base/power/runtime.c:688
__pm_runtime_idle+0x12f/0x1a0 drivers/base/power/runtime.c:1129
pm_runtime_put include/linux/pm_runtime.h:551 [inline]
__device_attach+0x34f/0x450 drivers/base/dd.c:1051
device_initial_probe+0xa1/0xd0 drivers/base/dd.c:1088
bus_probe_device+0x12a/0x220 drivers/base/bus.c:574
device_add+0x7b6/0xb70 drivers/base/core.c:3689
serial_base_port_add+0x18f/0x260 drivers/tty/serial/serial_base_bus.c:186
serial_core_port_device_add drivers/tty/serial/serial_core.c:3257 [inline]
serial_core_register_port+0x375/0x28a0 drivers/tty/serial/serial_core.c:3296
serial8250_register_8250_port+0x1658/0x1fd0 drivers/tty/serial/8250/8250_core.c:822
serial_pnp_probe+0x568/0x7f0 drivers/tty/serial/8250/8250_pnp.c:480
pnp_device_probe+0x30b/0x4c0 drivers/pnp/driver.c:111
call_driver_probe drivers/base/dd.c:-1 [inline]
really_probe+0x267/0xaf0 drivers/base/dd.c:661
__driver_probe_device+0x18c/0x320 drivers/base/dd.c:803
driver_probe_device+0x4f/0x240 drivers/base/dd.c:833
__driver_attach+0x349/0x640 drivers/base/dd.c:1227
bus_for_each_dev+0x23b/0x2c0 drivers/base/bus.c:383
bus_add_driver+0x345/0x670 drivers/base/bus.c:715
driver_register+0x23a/0x320 drivers/base/driver.c:249
serial8250_init+0x8f/0x160 drivers/tty/serial/8250/8250_platform.c:317
do_one_initcall+0x250/0x8d0 init/main.c:1383
do_initcall_level+0x104/0x190 init/main.c:1445
do_initcalls+0x59/0xa0 init/main.c:1461
kernel_init_freeable+0x2a6/0x3e0 init/main.c:1693
kernel_init+0x1d/0x1d0 init/main.c:1583
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
-> #2 (&dev->power.lock){-...}-{3:3}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:132 [inline]
_raw_spin_lock_irqsave+0x40/0x60 kernel/locking/spinlock.c:162
__pm_runtime_resume+0x10f/0x180 drivers/base/power/runtime.c:1196
pm_runtime_get include/linux/pm_runtime.h:494 [inline]
__uart_start+0x171/0x460 drivers/tty/serial/serial_core.c:149
uart_write+0x265/0xa10 drivers/tty/serial/serial_core.c:633
process_output_block drivers/tty/n_tty.c:557 [inline]
n_tty_write+0xd84/0x12a0 drivers/tty/n_tty.c:2366
iterate_tty_write drivers/tty/tty_io.c:1006 [inline]
file_tty_write+0x559/0xa20 drivers/tty/tty_io.c:1081
new_sync_write fs/read_write.c:595 [inline]
vfs_write+0x61d/0xb90 fs/read_write.c:688
ksys_write+0x150/0x270 fs/read_write.c:740
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #1 (&port_lock_key){-...}-{3:3}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:132 [inline]
_raw_spin_lock_irqsave+0x40/0x60 kernel/locking/spinlock.c:162
uart_port_lock_irqsave include/linux/serial_core.h:717 [inline]
serial8250_console_write+0x150/0x1ba0 drivers/tty/serial/8250/8250_port.c:3301
console_emit_next_record kernel/printk/printk.c:3183 [inline]
console_flush_one_record kernel/printk/printk.c:3269 [inline]
console_flush_all+0x718/0xb20 kernel/printk/printk.c:3343
__console_flush_and_unlock kernel/printk/printk.c:3373 [inline]
console_unlock+0xd1/0x1c0 kernel/printk/printk.c:3413
vprintk_emit+0x485/0x560 kernel/printk/printk.c:2479
_printk+0xdd/0x130 kernel/printk/printk.c:2504
register_console+0xbc2/0xfa0 kernel/printk/printk.c:4208
univ8250_console_init+0x3a/0x70 drivers/tty/serial/8250/8250_core.c:515
console_init+0x10b/0x4d0 kernel/printk/printk.c:4407
start_kernel+0x230/0x3e0 init/main.c:1148
x86_64_start_reservations+0x24/0x30 arch/x86/kernel/head64.c:310
x86_64_start_kernel+0x143/0x1c0 arch/x86/kernel/head64.c:291
common_startup_64+0x13e/0x147
-> #0 (console_owner){....}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3165 [inline]
check_prevs_add kernel/locking/lockdep.c:3284 [inline]
validate_chain kernel/locking/lockdep.c:3908 [inline]
__lock_acquire+0x15a5/0x2cf0 kernel/locking/lockdep.c:5237
lock_acquire+0xf0/0x2e0 kernel/locking/lockdep.c:5868
console_lock_spinning_enable kernel/printk/printk.c:1902 [inline]
console_emit_next_record kernel/printk/printk.c:3177 [inline]
console_flush_one_record kernel/printk/printk.c:3269 [inline]
console_flush_all+0x6c1/0xb20 kernel/printk/printk.c:3343
__console_flush_and_unlock kernel/printk/printk.c:3373 [inline]
console_unlock+0xd1/0x1c0 kernel/printk/printk.c:3413
vprintk_emit+0x485/0x560 kernel/printk/printk.c:2479
_printk+0xdd/0x130 kernel/printk/printk.c:2504
__report_bug+0x317/0x540 lib/bug.c:243
report_bug_entry+0x19a/0x290 lib/bug.c:269
handle_bug+0xce/0x200 arch/x86/kernel/traps.c:430
exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:489
asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:616
check_flush_dependency+0x312/0x3c0 kernel/workqueue.c:3801
start_flush_work kernel/workqueue.c:4255 [inline]
__flush_work+0x411/0xc50 kernel/workqueue.c:4292
__purge_vmap_area_lazy+0x876/0xb70 mm/vmalloc.c:2412
drain_vmap_area_work+0x27/0x40 mm/vmalloc.c:2437
process_one_work kernel/workqueue.c:3276 [inline]
process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
other info that might help us debug this:
Chain exists of:
console_owner --> &dev->power.lock --> &pool->lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&pool->lock);
lock(&dev->power.lock);
lock(&pool->lock);
lock(console_owner);
*** DEADLOCK ***
7 locks held by kworker/u9:4/94:
#0: ffff8881000ab948 ((wq_completion)vmap_drain){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3251 [inline]
#0: ffff8881000ab948 ((wq_completion)vmap_drain){+.+.}-{0:0}, at: process_scheduled_works+0xa52/0x18c0 kernel/workqueue.c:3359
#1: ffffc9000289fc40 (drain_vmap_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3252 [inline]
#1: ffffc9000289fc40 (drain_vmap_work){+.+.}-{0:0}, at: process_scheduled_works+0xa8d/0x18c0 kernel/workqueue.c:3359
#2: ffffffff8e87ec08 (vmap_purge_lock){+.+.}-{4:4}, at: drain_vmap_area_work+0x17/0x40 mm/vmalloc.c:2436
#3: ffffffff8e75e520 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
#3: ffffffff8e75e520 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:850 [inline]
#3: ffffffff8e75e520 (rcu_read_lock){....}-{1:3}, at: start_flush_work kernel/workqueue.c:4234 [inline]
#3: ffffffff8e75e520 (rcu_read_lock){....}-{1:3}, at: __flush_work+0x100/0xc50 kernel/workqueue.c:4292
#4: ffff88812103a498 (&pool->lock){-.-.}-{2:2}, at: start_flush_work kernel/workqueue.c:4241 [inline]
#4: ffff88812103a498 (&pool->lock){-.-.}-{2:2}, at: __flush_work+0x1ef/0xc50 kernel/workqueue.c:4292
#5: ffffffff8e750960 (console_lock){+.+.}-{0:0}, at: _printk+0xdd/0x130 kernel/printk/printk.c:2504
#6: ffffffff8e638218 (console_srcu){....}-{0:0}, at: rcu_try_lock_acquire include/linux/rcupdate.h:317 [inline]
#6: ffffffff8e638218 (console_srcu){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:428 [inline]
#6: ffffffff8e638218 (console_srcu){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:291 [inline]
#6: ffffffff8e638218 (console_srcu){....}-{0:0}, at: console_flush_one_record kernel/printk/printk.c:3246 [inline]
#6: ffffffff8e638218 (console_srcu){....}-{0:0}, at: console_flush_all+0x123/0xb20 kernel/printk/printk.c:3343
stack backtrace:
CPU: 0 UID: 0 PID: 94 Comm: kworker/u9:4 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: vmap_drain drain_vmap_area_work
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_circular_bug+0x2e1/0x300 kernel/locking/lockdep.c:2043
check_noncircular+0x12e/0x150 kernel/locking/lockdep.c:2175
check_prev_add kernel/locking/lockdep.c:3165 [inline]
check_prevs_add kernel/locking/lockdep.c:3284 [inline]
validate_chain kernel/locking/lockdep.c:3908 [inline]
__lock_acquire+0x15a5/0x2cf0 kernel/locking/lockdep.c:5237
lock_acquire+0xf0/0x2e0 kernel/locking/lockdep.c:5868
console_lock_spinning_enable kernel/printk/printk.c:1902 [inline]
console_emit_next_record kernel/printk/printk.c:3177 [inline]
console_flush_one_record kernel/printk/printk.c:3269 [inline]
console_flush_all+0x6c1/0xb20 kernel/printk/printk.c:3343
__console_flush_and_unlock kernel/printk/printk.c:3373 [inline]
console_unlock+0xd1/0x1c0 kernel/printk/printk.c:3413
vprintk_emit+0x485/0x560 kernel/printk/printk.c:2479
_printk+0xdd/0x130 kernel/printk/printk.c:2504
__report_bug+0x317/0x540 lib/bug.c:243
report_bug_entry+0x19a/0x290 lib/bug.c:269
handle_bug+0xce/0x200 arch/x86/kernel/traps.c:430
exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:489
asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:616
RIP: 0010:check_flush_dependency+0x312/0x3c0 kernel/workqueue.c:3801
Code: 00 00 fc ff df 80 3c 08 00 74 08 4c 89 f7 e8 f5 33 a2 00 49 8b 16 48 81 c3 78 01 00 00 4c 89 ef 4c 89 e6 48 89 d9 4c 8b 04 24 <67> 48 0f b9 3a e9 53 ff ff ff 44 89 f1 80 e1 07 80 c1 03 38 c1 0f
RSP: 0018:ffffc9000289f860 EFLAGS: 00010086
RAX: 1ffff110202e9103 RBX: ffff88810006b178 RCX: ffff88810006b178
RDX: ffffffff821ed1f0 RSI: ffff8881000ab978 RDI: ffffffff9014a330
RBP: ffff888100687008 R08: ffffffff821ee110 R09: 1ffff1102000fb21
R10: dffffc0000000000 R11: ffffed102000fb22 R12: ffff8881000ab978
R13: ffffffff9014a330 R14: ffff888101748818 R15: ffff888101748820
start_flush_work kernel/workqueue.c:4255 [inline]
__flush_work+0x411/0xc50 kernel/workqueue.c:4292
__purge_vmap_area_lazy+0x876/0xb70 mm/vmalloc.c:2412
drain_vmap_area_work+0x27/0x40 mm/vmalloc.c:2437
process_one_work kernel/workqueue.c:3276 [inline]
process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
workqueue: WQ_MEM_RECLAIM vmap_drain:drain_vmap_area_work is flushing !WQ_MEM_RECLAIM events:purge_vmap_node
WARNING: kernel/workqueue.c:3805 at check_flush_dependency+0x28f/0x3c0 kernel/workqueue.c:3801, CPU#0: kworker/u9:4/94
Modules linked in:
CPU: 0 UID: 0 PID: 94 Comm: kworker/u9:4 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: vmap_drain drain_vmap_area_work
RIP: 0010:check_flush_dependency+0x312/0x3c0 kernel/workqueue.c:3801
Code: 00 00 fc ff df 80 3c 08 00 74 08 4c 89 f7 e8 f5 33 a2 00 49 8b 16 48 81 c3 78 01 00 00 4c 89 ef 4c 89 e6 48 89 d9 4c 8b 04 24 <67> 48 0f b9 3a e9 53 ff ff ff 44 89 f1 80 e1 07 80 c1 03 38 c1 0f
RSP: 0018:ffffc9000289f860 EFLAGS: 00010086
RAX: 1ffff110202e9103 RBX: ffff88810006b178 RCX: ffff88810006b178
RDX: ffffffff821ed1f0 RSI: ffff8881000ab978 RDI: ffffffff9014a330
RBP: ffff888100687008 R08: ffffffff821ee110 R09: 1ffff1102000fb21
R10: dffffc0000000000 R11: ffffed102000fb22 R12: ffff8881000ab978
R13: ffffffff9014a330 R14: ffff888101748818 R15: ffff888101748820
FS: 0000000000000000(0000) GS:ffff88818de5e000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000386000 CR3: 0000000114a6a000 CR4: 00000000000006f0
Call Trace:
<TASK>
start_flush_work kernel/workqueue.c:4255 [inline]
__flush_work+0x411/0xc50 kernel/workqueue.c:4292
__purge_vmap_area_lazy+0x876/0xb70 mm/vmalloc.c:2412
drain_vmap_area_work+0x27/0x40 mm/vmalloc.c:2437
process_one_work kernel/workqueue.c:3276 [inline]
process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
----------------
Code disassembly (best guess), 4 bytes skipped:
0: df 80 3c 08 00 74 filds 0x7400083c(%rax)
6: 08 4c 89 f7 or %cl,-0x9(%rcx,%rcx,4)
a: e8 f5 33 a2 00 call 0xa23404
f: 49 8b 16 mov (%r14),%rdx
12: 48 81 c3 78 01 00 00 add $0x178,%rbx
19: 4c 89 ef mov %r13,%rdi
1c: 4c 89 e6 mov %r12,%rsi
1f: 48 89 d9 mov %rbx,%rcx
22: 4c 8b 04 24 mov (%rsp),%r8
* 26: 67 48 0f b9 3a ud1 (%edx),%rdi <-- trapping instruction
2b: e9 53 ff ff ff jmp 0xffffff83
30: 44 89 f1 mov %r14d,%ecx
33: 80 e1 07 and $0x7,%cl
36: 80 c1 03 add $0x3,%cl
39: 38 c1 cmp %al,%cl
3b: 0f .byte 0xf
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
* Re: [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
2026-03-19 7:43 [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining lirongqing
` (2 preceding siblings ...)
2026-03-20 9:51 ` [syzbot ci] " syzbot ci
@ 2026-03-24 13:32 ` kernel test robot
3 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2026-03-24 13:32 UTC (permalink / raw)
To: lirongqing
Cc: oe-lkp, lkp, Uladzislau Rezki, linux-mm, linux-kernel,
Andrew Morton, Li RongQing, oliver.sang
Hello,
kernel test robot noticed "RIP:check_flush_dependency" on:
commit: fadd60891595f20eece57c413cc7654e82bf3ea2 ("[PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining")
url: https://github.com/intel-lab-lkp/linux/commits/lirongqing/mm-vmalloc-use-dedicated-unbound-workqueue-for-vmap-area-draining/20260319-194106
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 8a30aeb0d1b4e4aaf7f7bae72f20f2ae75385ccb
patch link: https://lore.kernel.org/all/20260319074307.2325-1-lirongqing@baidu.com/
patch subject: [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining
in testcase: blktests
version: blktests-x86_64-7baa454-1_20260320
with following parameters:
test: zbd-009
config: x86_64-rhel-9.4-func
compiler: gcc-14
test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202603242123.12c0dd46-lkp@intel.com
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260324/202603242123.12c0dd46-lkp@intel.com
[ 68.342260][ T44] ------------[ cut here ]------------
[ 68.347550][ T44] workqueue: WQ_MEM_RECLAIM vmap_drain:drain_vmap_area_work is flushing !WQ_MEM_RECLAIM events:purge_vmap_node
[ 68.347558][ T44] WARNING: kernel/workqueue.c:3801 at check_flush_dependency+0x1bb/0x330, CPU#1: kworker/u16:3/44
[ 68.370043][ T44] Modules linked in: scsi_debug(-) null_blk loop binfmt_misc snd_hda_codec_intelhdmi snd_hda_codec_hdmi snd_ctl_led snd_hda_codec_alc269 snd_hda_codec_realtek_lib snd_hda_scodec_component snd_hda_codec_generic btrfs libblake2b intel_rapl_msr xor intel_rapl_common zstd_compress platform_profile x86_pkg_temp_thermal intel_powerclamp raid6_pq coretemp i915 dell_wmi sd_mod sg snd_hda_intel kvm_intel intel_gtt snd_soc_avs rfkill dell_smbios dcdbas snd_soc_hda_codec drm_buddy snd_hda_ext_core mei_wdt snd_hda_codec wmi_bmof dell_smm_hwmon dell_wmi_descriptor ttm sparse_keymap kvm snd_hda_core snd_intel_dspcfg drm_display_helper snd_intel_sdw_acpi snd_hwdep irqbypass ghash_clmulni_intel rapl intel_cstate ahci snd_soc_core intel_pmc_core cec snd_compress drm_client_lib pmt_telemetry snd_pcm drm_kms_helper libahci snd_timer pmt_discovery intel_uncore pmt_class pcspkr video i2c_i801 intel_pmc_ssram_telemetry libata snd mei_me intel_vsec acpi_pad i2c_smbus intel_pch_thermal mei wmi soundcore drm fuse nfnetlink
[ 68.370231][ T44] [last unloaded: scsi_debug]
[ 68.464263][ T44] CPU: 1 UID: 0 PID: 44 Comm: kworker/u16:3 Tainted: G S 7.0.0-rc4-00075-gfadd60891595 #1 PREEMPT(lazy)
[ 68.476704][ T44] Tainted: [S]=CPU_OUT_OF_SPEC
[ 68.481292][ T44] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[ 68.489337][ T44] Workqueue: vmap_drain drain_vmap_area_work
[ 68.495137][ T44] RIP: 0010:check_flush_dependency+0x1c9/0x330
[ 68.501713][ T44] Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 4f 01 00 00 48 8d 3d 65 c6 0d 05 48 8b 55 18 48 81 c6 c0 00 00 00 4d 89 f8 <67> 48 0f b9 3a 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc
[ 68.521058][ T44] RSP: 0018:ffffc90000357b50 EFLAGS: 00010086
[ 68.526946][ T44] RAX: dffffc0000000000 RBX: ffff88810c9a9500 RCX: ffff88810c8820c0
[ 68.534730][ T44] RDX: ffffffff81e33510 RSI: ffff88810c9accc0 RDI: ffffffff86670070
[ 68.542511][ T44] RBP: ffff88810d566000 R08: ffffffff81e31890 R09: ffffed102191130b
[ 68.550295][ T44] R10: ffff88810c88985f R11: 0000000000000040 R12: ffff88810d8a8000
[ 68.558082][ T44] R13: ffff88810c882000 R14: ffff88810d8a802c R15: ffffffff81e31890
[ 68.565865][ T44] FS: 0000000000000000(0000) GS:ffff8887fa56f000(0000) knlGS:0000000000000000
[ 68.574596][ T44] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 68.580998][ T44] CR2: 000056337e5a7918 CR3: 000000081a872003 CR4: 00000000003726f0
[ 68.588779][ T44] Call Trace:
[ 68.591900][ T44] <TASK>
[ 68.594681][ T44] __flush_work+0x50c/0x8f0
[ 68.599013][ T44] ? purge_vmap_node+0x6f5/0xab0
[ 68.603780][ T44] ? __pfx___flush_work+0x10/0x10
[ 68.608638][ T44] ? __pfx_purge_vmap_node+0x10/0x10
[ 68.613753][ T44] ? _raw_spin_lock+0x80/0xf0
[ 68.618268][ T44] __purge_vmap_area_lazy+0x723/0xaf0
[ 68.623486][ T44] drain_vmap_area_work+0x21/0x30
[ 68.628347][ T44] process_one_work+0x6b4/0xff0
[ 68.633030][ T44] ? assign_work+0x131/0x3f0
[ 68.637461][ T44] worker_thread+0x51d/0xdb0
[ 68.641894][ T44] ? __kthread_parkme+0xb1/0x1f0
[ 68.646660][ T44] ? __pfx_worker_thread+0x10/0x10
[ 68.651597][ T44] ? __pfx_worker_thread+0x10/0x10
[ 68.656535][ T44] kthread+0x353/0x470
[ 68.660436][ T44] ? recalc_sigpending+0x159/0x1f0
[ 68.665377][ T44] ? __pfx_kthread+0x10/0x10
[ 68.669795][ T44] ret_from_fork+0x32f/0x670
[ 68.674215][ T44] ? __pfx_ret_from_fork+0x10/0x10
[ 68.679164][ T44] ? switch_fpu+0x13/0x1f0
[ 68.683412][ T44] ? __switch_to+0x4c9/0xe70
[ 68.687831][ T44] ? __switch_to_asm+0x33/0x70
[ 68.692426][ T44] ? __pfx_kthread+0x10/0x10
[ 68.696845][ T44] ret_from_fork_asm+0x1a/0x30
[ 68.701454][ T44] </TASK>
[ 68.704333][ T44] ---[ end trace 0000000000000000 ]---
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Thread overview:
2026-03-19 7:43 [PATCH v2] mm/vmalloc: use dedicated unbound workqueue for vmap area draining lirongqing
2026-03-19 9:39 ` Uladzislau Rezki
2026-03-19 10:05 ` Reply: [????] " Li,Rongqing(ACG CCN)
2026-03-19 13:23 ` Uladzislau Rezki
2026-03-20 5:48 ` Reply: [????] Re: ??: " Li,Rongqing(ACG CCN)
2026-03-20 3:16 ` Andrew Morton
2026-03-20 9:51 ` [syzbot ci] " syzbot ci
2026-03-24 13:32 ` [PATCH v2] " kernel test robot