* [PATCH V2 1/5] workqueue: Register sysfs after the whole creation of the new wq
2024-07-04 3:49 [PATCH V2 0/5] workqueue: Make the PWQ allocation and WQ enlistment atomic Lai Jiangshan
@ 2024-07-04 3:49 ` Lai Jiangshan
2024-07-04 3:49 ` [PATCH V2 2/5] workqueue: Make rescuer initialization as the last step of the creation of a new wq Lai Jiangshan
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Lai Jiangshan @ 2024-07-04 3:49 UTC
To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Workqueue creation includes adding the new workqueue to the workqueue list, so register its sysfs interface only after the whole creation is complete.
Prepare for moving the whole workqueue initialization procedure under
wq_pool_mutex and the CPU hotplug lock.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
kernel/workqueue.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a38a67ac4e80..904a1a6808b7 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5693,9 +5693,6 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
if (wq_online && init_rescuer(wq) < 0)
goto err_destroy;
- if ((wq->flags & WQ_SYSFS) && workqueue_sysfs_register(wq))
- goto err_destroy;
-
/*
* wq_pool_mutex protects global freeze state and workqueues list.
* Grab it, adjust max_active and add the new @wq to workqueues
@@ -5711,6 +5708,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
mutex_unlock(&wq_pool_mutex);
+ if ((wq->flags & WQ_SYSFS) && workqueue_sysfs_register(wq))
+ goto err_destroy;
+
return wq;
err_free_node_nr_active:
--
2.19.1.6.gb485710b
* [PATCH V2 2/5] workqueue: Make rescuer initialization as the last step of the creation of a new wq
From: Lai Jiangshan @ 2024-07-04 3:49 UTC
To: linux-kernel
Cc: Lai Jiangshan, Juri Lelli, Waiman Long, Tejun Heo, Lai Jiangshan
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
For early wq allocation, rescuer initialization is the last step of the
creation of a new wq. Make the behavior the same for all allocations.
Prepare for initializing rescuer's affinities with the default pwq's
affinities.
Prepare for moving the whole workqueue initialization procedure under
wq_pool_mutex and the CPU hotplug lock.
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
kernel/workqueue.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 904a1a6808b7..0c5dc7c06b81 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5690,9 +5690,6 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
if (alloc_and_link_pwqs(wq) < 0)
goto err_free_node_nr_active;
- if (wq_online && init_rescuer(wq) < 0)
- goto err_destroy;
-
/*
* wq_pool_mutex protects global freeze state and workqueues list.
* Grab it, adjust max_active and add the new @wq to workqueues
@@ -5708,6 +5705,9 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
mutex_unlock(&wq_pool_mutex);
+ if (wq_online && init_rescuer(wq) < 0)
+ goto err_destroy;
+
if ((wq->flags & WQ_SYSFS) && workqueue_sysfs_register(wq))
goto err_destroy;
--
2.19.1.6.gb485710b
* [PATCH V2 3/5] workqueue: Move kthread_flush_worker() out of alloc_and_link_pwqs()
From: Lai Jiangshan @ 2024-07-04 3:49 UTC
To: linux-kernel; +Cc: Lai Jiangshan, Zqiang, Tejun Heo, Lai Jiangshan
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
kthread_flush_worker() can't be called with wq_pool_mutex held.
Prepare for moving wq_pool_mutex and the CPU hotplug lock out of
alloc_and_link_pwqs().
Cc: Zqiang <qiang.zhang1211@gmail.com>
Link: https://lore.kernel.org/lkml/20230920060704.24981-1-qiang.zhang1211@gmail.com/
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
kernel/workqueue.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0c5dc7c06b81..cb496facf654 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5476,12 +5476,6 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq)
}
cpus_read_unlock();
- /* for unbound pwq, flush the pwq_release_worker ensures that the
- * pwq_release_workfn() completes before calling kfree(wq).
- */
- if (ret)
- kthread_flush_worker(pwq_release_worker);
-
return ret;
enomem:
@@ -5714,8 +5708,15 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
return wq;
err_free_node_nr_active:
- if (wq->flags & WQ_UNBOUND)
+ /*
+ * Failed alloc_and_link_pwqs() may leave pending pwq->release_work,
+ * flushing the pwq_release_worker ensures that the pwq_release_workfn()
+ * completes before calling kfree(wq).
+ */
+ if (wq->flags & WQ_UNBOUND) {
+ kthread_flush_worker(pwq_release_worker);
free_node_nr_active(wq->node_nr_active);
+ }
err_unreg_lockdep:
wq_unregister_lockdep(wq);
wq_free_lockdep(wq);
--
2.19.1.6.gb485710b
* [PATCH V2 4/5] workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.
From: Lai Jiangshan @ 2024-07-04 3:49 UTC
To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
The PWQ allocation and WQ enlistment are not within the same lock-held
critical section; therefore, their states can become out of sync when
the user modifies the unbound mask or if CPU hotplug events occur in
the interim since those operations only update the WQs that are already
in the list.
Make the PWQ allocation and WQ enlistment atomic.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
kernel/workqueue.c | 54 ++++++++++++++++++++++++----------------------
1 file changed, 28 insertions(+), 26 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index cb496facf654..5129934f274f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5118,6 +5118,19 @@ static struct pool_workqueue *alloc_unbound_pwq(struct workqueue_struct *wq,
return pwq;
}
+static void apply_wqattrs_lock(void)
+{
+ /* CPUs should stay stable across pwq creations and installations */
+ cpus_read_lock();
+ mutex_lock(&wq_pool_mutex);
+}
+
+static void apply_wqattrs_unlock(void)
+{
+ mutex_unlock(&wq_pool_mutex);
+ cpus_read_unlock();
+}
+
/**
* wq_calc_pod_cpumask - calculate a wq_attrs' cpumask for a pod
* @attrs: the wq_attrs of the default pwq of the target workqueue
@@ -5429,6 +5442,9 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq)
bool highpri = wq->flags & WQ_HIGHPRI;
int cpu, ret;
+ lockdep_assert_cpus_held();
+ lockdep_assert_held(&wq_pool_mutex);
+
wq->cpu_pwq = alloc_percpu(struct pool_workqueue *);
if (!wq->cpu_pwq)
goto enomem;
@@ -5461,20 +5477,18 @@ static int alloc_and_link_pwqs(struct workqueue_struct *wq)
return 0;
}
- cpus_read_lock();
if (wq->flags & __WQ_ORDERED) {
struct pool_workqueue *dfl_pwq;
- ret = apply_workqueue_attrs(wq, ordered_wq_attrs[highpri]);
+ ret = apply_workqueue_attrs_locked(wq, ordered_wq_attrs[highpri]);
/* there should only be single pwq for ordering guarantee */
dfl_pwq = rcu_access_pointer(wq->dfl_pwq);
WARN(!ret && (wq->pwqs.next != &dfl_pwq->pwqs_node ||
wq->pwqs.prev != &dfl_pwq->pwqs_node),
"ordering guarantee broken for workqueue %s\n", wq->name);
} else {
- ret = apply_workqueue_attrs(wq, unbound_std_wq_attrs[highpri]);
+ ret = apply_workqueue_attrs_locked(wq, unbound_std_wq_attrs[highpri]);
}
- cpus_read_unlock();
return ret;
@@ -5681,15 +5695,15 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
goto err_unreg_lockdep;
}
- if (alloc_and_link_pwqs(wq) < 0)
- goto err_free_node_nr_active;
-
/*
- * wq_pool_mutex protects global freeze state and workqueues list.
- * Grab it, adjust max_active and add the new @wq to workqueues
- * list.
+ * wq_pool_mutex protects the workqueues list, allocations of PWQs,
+ * and the global freeze state. alloc_and_link_pwqs() also requires
+ * cpus_read_lock() for PWQs' affinities.
*/
- mutex_lock(&wq_pool_mutex);
+ apply_wqattrs_lock();
+
+ if (alloc_and_link_pwqs(wq) < 0)
+ goto err_unlock_free_node_nr_active;
mutex_lock(&wq->mutex);
wq_adjust_max_active(wq);
@@ -5697,7 +5711,7 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
list_add_tail_rcu(&wq->list, &workqueues);
- mutex_unlock(&wq_pool_mutex);
+ apply_wqattrs_unlock();
if (wq_online && init_rescuer(wq) < 0)
goto err_destroy;
@@ -5707,7 +5721,8 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
return wq;
-err_free_node_nr_active:
+err_unlock_free_node_nr_active:
+ apply_wqattrs_unlock();
/*
* Failed alloc_and_link_pwqs() may leave pending pwq->release_work,
* flushing the pwq_release_worker ensures that the pwq_release_workfn()
@@ -6996,19 +7011,6 @@ static struct attribute *wq_sysfs_attrs[] = {
};
ATTRIBUTE_GROUPS(wq_sysfs);
-static void apply_wqattrs_lock(void)
-{
- /* CPUs should stay stable across pwq creations and installations */
- cpus_read_lock();
- mutex_lock(&wq_pool_mutex);
-}
-
-static void apply_wqattrs_unlock(void)
-{
- mutex_unlock(&wq_pool_mutex);
- cpus_read_unlock();
-}
-
static ssize_t wq_nice_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
--
2.19.1.6.gb485710b
* Re: [PATCH V2 4/5] workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.
From: kernel test robot @ 2024-07-08 7:54 UTC
To: Lai Jiangshan
Cc: oe-lkp, lkp, linux-kernel, Lai Jiangshan, Tejun Heo,
Lai Jiangshan, oliver.sang
Hello,
kernel test robot noticed "WARNING:possible_recursive_locking_detected" on:
commit: 1d4c6111406c8306a9b87ba6c496996cd4539cfd ("[PATCH V2 4/5] workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.")
url: https://github.com/intel-lab-lkp/linux/commits/Lai-Jiangshan/workqueue-Register-sysfs-after-the-whole-creation-of-the-new-wq/20240705-043238
base: v6.10-rc5
patch subject: [PATCH V2 4/5] workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.
in testcase: rcutorture
version:
with following parameters:
runtime: 300s
test: default
torture_type: srcu
compiler: gcc-13
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202407081521.83b627c1-lkp@intel.com
[ 2.233457][ T1] WARNING: possible recursive locking detected
[ 2.233457][ T1] 6.10.0-rc5-00004-g1d4c6111406c #1 Not tainted
[ 2.233457][ T1] --------------------------------------------
[ 2.233457][ T1] swapper/0/1 is trying to acquire lock:
[ 2.233457][ T1] c27760f4 (cpu_hotplug_lock){++++}-{0:0}, at: alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730)
[ 2.233457][ T1]
[ 2.233457][ T1] but task is already holding lock:
[ 2.233457][ T1] c27760f4 (cpu_hotplug_lock){++++}-{0:0}, at: padata_alloc (kernel/padata.c:1007)
[ 2.233457][ T1]
[ 2.233457][ T1] other info that might help us debug this:
[ 2.245549][ T1] Possible unsafe locking scenario:
[ 2.245549][ T1]
[ 2.245549][ T1] CPU0
[ 2.245549][ T1] ----
[ 2.245549][ T1] lock(cpu_hotplug_lock);
[ 2.245549][ T1] lock(cpu_hotplug_lock);
[ 2.245549][ T1] *** DEADLOCK ***
[ 2.245549][ T1]
[ 2.251678][ T1] May be due to missing lock nesting notation
[ 2.251678][ T1]
[ 2.253463][ T1] 1 lock held by swapper/0/1:
[ 2.253463][ T1] #0: c27760f4 (cpu_hotplug_lock){++++}-{0:0}, at: padata_alloc (kernel/padata.c:1007)
[ 2.253463][ T1]
[ 2.253463][ T1] stack backtrace:
[ 2.257461][ T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc5-00004-g1d4c6111406c #1 c89023213b7b89ade58aa28d4c172b811b00908c
[ 2.257461][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 2.261462][ T1] Call Trace:
[ 2.261462][ T1] dump_stack_lvl (lib/dump_stack.c:117)
[ 2.261462][ T1] dump_stack (lib/dump_stack.c:124)
[ 2.261462][ T1] print_deadlock_bug (kernel/locking/lockdep.c:3013)
[ 2.265464][ T1] validate_chain (kernel/locking/lockdep.c:3859)
[ 2.265464][ T1] __lock_acquire (kernel/locking/lockdep.c:5137 (discriminator 1))
[ 2.265464][ T1] ? __this_cpu_preempt_check (lib/smp_processor_id.c:67)
[ 2.265464][ T1] lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5756)
[ 2.265464][ T1] ? alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730)
[ 2.269463][ T1] ? __might_sleep (kernel/sched/core.c:10126)
[ 2.269463][ T1] ? is_dynamic_key (include/linux/rcupdate.h:779 kernel/locking/lockdep.c:1257)
[ 2.269463][ T1] cpus_read_lock (include/linux/percpu-rwsem.h:53 kernel/cpu.c:488)
[ 2.269463][ T1] ? alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730)
[ 2.273463][ T1] alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730)
[ 2.273463][ T1] padata_alloc (kernel/padata.c:1007 (discriminator 1))
[ 2.273463][ T1] ? nhpoly1305_mod_init (crypto/pcrypt.c:345)
[ 2.273463][ T1] pcrypt_init_padata (crypto/pcrypt.c:327 (discriminator 1))
[ 2.277463][ T1] ? nhpoly1305_mod_init (crypto/pcrypt.c:345)
[ 2.277463][ T1] pcrypt_init (crypto/pcrypt.c:353)
[ 2.277463][ T1] ? nhpoly1305_mod_init (crypto/pcrypt.c:345)
[ 2.277463][ T1] do_one_initcall (init/main.c:1267)
[ 2.281464][ T1] ? next_arg (lib/cmdline.c:274)
[ 2.281464][ T1] do_initcalls (init/main.c:1328 (discriminator 1) init/main.c:1345 (discriminator 1))
[ 2.281464][ T1] ? rdinit_setup (init/main.c:1313)
[ 2.281464][ T1] kernel_init_freeable (init/main.c:1364)
[ 2.285464][ T1] ? kernel_init (init/main.c:1469)
[ 2.285464][ T1] ? rest_init (init/main.c:1459)
[ 2.285464][ T1] kernel_init (init/main.c:1469)
[ 2.285464][ T1] ? schedule_tail (kernel/sched/core.c:5346)
[ 2.285464][ T1] ret_from_fork (arch/x86/kernel/process.c:153)
[ 2.289466][ T1] ? rest_init (init/main.c:1459)
[ 2.289466][ T1] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
[ 2.289466][ T1] entry_INT80_32 (arch/x86/entry/entry_32.S:944)
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240708/202407081521.83b627c1-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH V2 4/5] workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.
From: Tejun Heo @ 2024-07-08 18:16 UTC
To: kernel test robot; +Cc: Lai Jiangshan, oe-lkp, lkp, linux-kernel, Lai Jiangshan
Hello,
On Mon, Jul 08, 2024 at 03:54:19PM +0800, kernel test robot wrote:
...
> [ 2.269463][ T1] cpus_read_lock (include/linux/percpu-rwsem.h:53 kernel/cpu.c:488)
> [ 2.273463][ T1] alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730)
> [ 2.273463][ T1] padata_alloc (kernel/padata.c:1007 (discriminator 1))
> [ 2.273463][ T1] pcrypt_init_padata (crypto/pcrypt.c:327 (discriminator 1))
> [ 2.277463][ T1] pcrypt_init (crypto/pcrypt.c:353)
> [ 2.277463][ T1] do_one_initcall (init/main.c:1267)
> [ 2.281464][ T1] do_initcalls (init/main.c:1328 (discriminator 1) init/main.c:1345 (discriminator 1))
> [ 2.281464][ T1] kernel_init_freeable (init/main.c:1364)
> [ 2.285464][ T1] kernel_init (init/main.c:1469)
> [ 2.285464][ T1] ret_from_fork (arch/x86/kernel/process.c:153)
> [ 2.289466][ T1] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
> [ 2.289466][ T1] entry_INT80_32 (arch/x86/entry/entry_32.S:944)
Ah, this is unfortunate: pcrypt allocates a workqueue while holding
cpus_read_lock(), so the workqueue code can't take it again, as that can
lead to a deadlock if a down_write starts after the first down_read. Lai,
it looks like we'd need to switch to workqueue-specific locking so that we
don't depend on cpus_read_lock in the alloc path.
Thanks.
--
tejun
* Re: [PATCH V2 4/5] workqueue: Put PWQ allocation and WQ enlistment in the same lock C.S.
From: Lai Jiangshan @ 2024-07-09 14:55 UTC
To: Tejun Heo; +Cc: kernel test robot, oe-lkp, lkp, linux-kernel, Lai Jiangshan
Hello,
On Tue, Jul 9, 2024 at 2:16 AM Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Mon, Jul 08, 2024 at 03:54:19PM +0800, kernel test robot wrote:
> ...
> > [ 2.269463][ T1] cpus_read_lock (include/linux/percpu-rwsem.h:53 kernel/cpu.c:488)
> > [ 2.273463][ T1] alloc_workqueue (kernel/workqueue.c:5152 kernel/workqueue.c:5730)
> > [ 2.273463][ T1] padata_alloc (kernel/padata.c:1007 (discriminator 1))
> > [ 2.273463][ T1] pcrypt_init_padata (crypto/pcrypt.c:327 (discriminator 1))
> > [ 2.277463][ T1] pcrypt_init (crypto/pcrypt.c:353)
> > [ 2.277463][ T1] do_one_initcall (init/main.c:1267)
> > [ 2.281464][ T1] do_initcalls (init/main.c:1328 (discriminator 1) init/main.c:1345 (discriminator 1))
> > [ 2.281464][ T1] kernel_init_freeable (init/main.c:1364)
> > [ 2.285464][ T1] kernel_init (init/main.c:1469)
> > [ 2.285464][ T1] ret_from_fork (arch/x86/kernel/process.c:153)
> > [ 2.289466][ T1] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
> > [ 2.289466][ T1] entry_INT80_32 (arch/x86/entry/entry_32.S:944)
Thanks for the report!
>
> Ah, this is unfortunate, so pcrypt is allocating a workqueue while holding
> cpus_read_lock(), so workqueue code can't do it again as that can lead to
> deadlocks if down_write starts after the first down_read. Lai, it looks like
> we'd need to switch to workqueue specific locking so that we don't depend on
> cpus_read_lock from alloc path.
>
I'm working on it. A new wq_online_mask will be added. It mirrors
cpu_online_mask except during hotplug; specifically, it differs between
the hotplug stages of workqueue_offline_cpu() and workqueue_online_cpu(),
during which the transitioning CPU is not represented in the mask.
Thanks
Lai
* [PATCH V2 5/5] workqueue: Init rescuer's affinities as the wq's effective cpumask
From: Lai Jiangshan @ 2024-07-04 3:49 UTC
To: linux-kernel
Cc: Lai Jiangshan, Juri Lelli, Waiman Long, Tejun Heo, Lai Jiangshan
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Make it consistent with apply_wqattrs_commit().
Link: https://lore.kernel.org/lkml/20240203154334.791910-5-longman@redhat.com/
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
kernel/workqueue.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 5129934f274f..8b2a0fe4a85e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5525,6 +5525,8 @@ static int init_rescuer(struct workqueue_struct *wq)
struct worker *rescuer;
int ret;
+ lockdep_assert_held(&wq_pool_mutex);
+
if (!(wq->flags & WQ_MEM_RECLAIM))
return 0;
@@ -5547,7 +5549,7 @@ static int init_rescuer(struct workqueue_struct *wq)
wq->rescuer = rescuer;
if (wq->flags & WQ_UNBOUND)
- kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
+ kthread_bind_mask(rescuer->task, unbound_effective_cpumask(wq));
else
kthread_bind_mask(rescuer->task, cpu_possible_mask);
wake_up_process(rescuer->task);
@@ -5711,10 +5713,10 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
list_add_tail_rcu(&wq->list, &workqueues);
- apply_wqattrs_unlock();
-
if (wq_online && init_rescuer(wq) < 0)
- goto err_destroy;
+ goto err_unlock_destroy;
+
+ apply_wqattrs_unlock();
if ((wq->flags & WQ_SYSFS) && workqueue_sysfs_register(wq))
goto err_destroy;
@@ -5739,6 +5741,8 @@ struct workqueue_struct *alloc_workqueue(const char *fmt,
free_workqueue_attrs(wq->unbound_attrs);
kfree(wq);
return NULL;
+err_unlock_destroy:
+ apply_wqattrs_unlock();
err_destroy:
destroy_workqueue(wq);
return NULL;
--
2.19.1.6.gb485710b
* Re: [PATCH V2 0/5] workqueue: Make the PWQ allocation and WQ enlistment atomic
From: Tejun Heo @ 2024-07-05 19:15 UTC
To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan, Juri Lelli, Waiman Long
On Thu, Jul 04, 2024 at 11:49:09AM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
>
> The PWQ allocation and WQ enlistment are not within the same lock-held
> critical section; therefore, their states can become out of sync when
> the user modifies the unbound mask or if CPU hotplug events occur in
> the interim since those operations only update the WQs that are already
> in the list.
Applied to wq/for-6.11.
Thanks.
--
tejun