Re: [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Xing Zhengjun <zhengjun.xing@linux.intel.com>
To: lkp@lists.01.org
Subject: Re: [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work
Date: Tue, 26 Jan 2021 10:45:11 +0800	[thread overview]
Message-ID: <7b97a3ff-a1a3-55c4-0643-1346dfb276b2@linux.intel.com> (raw)
In-Reply-To: <20210125092900.1839-1-hdanton@sina.com>

[-- Attachment #1: Type: text/plain, Size: 16451 bytes --]



On 1/25/2021 5:29 PM, Hillf Danton wrote:
> On 25 Jan 2021 16:31:32 +0800 Xing Zhengjun wrote:
>> On 1/22/2021 3:59 PM, Hillf Danton wrote:
>>> On Fri, 22 Jan 2021 09:48:32 +0800 Xing Zhengjun wrote:
>>>> On 1/21/2021 12:00 PM, Hillf Danton wrote:
>>>>> On Wed, 20 Jan 2021 21:46:33 +0800 Oliver Sang wrote:
>>>>>> On Fri, Jan 15, 2021 at 03:24:32PM +0800, Hillf Danton wrote:
>>>>>>> Thu, 14 Jan 2021 15:45:11 +0800
>>>>>>>>
>>>>>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>>>>>
>>>>>>>> commit: d5bff968ea9cc005e632d9369c26cbd8148c93d5 ("workqueue: break affinity initiatively")
>>>>>>>> https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2021.01.11b
>>>>>>>>
>>>>>>> [...]
>>>>>>>>
>>>>>>>> [   73.794288] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2192 process_one_work
>>>>>>>
>>>>>>> Thanks for your report.
>>>>>>>
>>>>>>> We can also break CPU affinity by checking POOL_DISASSOCIATED at attach
>>>>>>> time without extra cost paid; that way we have the same behavior as at
>>>>>>> the unbind time.
>>>>>>>
>>>>>>> What is more the change that makes kworker pcpu is cut because they are
>>>>>>> going to not help either hotplug or the mechanism of stop machine.
>>>>>>
>>>>>> hi, by applying below patch, the issue still happened.
>>>>>
>>>>> Thanks for your report.
>>>>>>
>>>>>> [ 4.574467] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
>>>>>> [ 4.575651] pci 0000:00:01.0: Activating ISA DMA hang workarounds
>>>>>> [ 4.576900] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
>>>>>> [ 4.578648] PCI: CLS 0 bytes, default 64
>>>>>> [ 4.579685] Unpacking initramfs...
>>>>>> [ 8.878031] -----------[ cut here ]-----------
>>>>>> [ 8.879083] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2187 process_one_work+0x92/0x9e0
>>>>>> [ 8.880688] Modules linked in:
>>>>>> [ 8.881274] CPU: 0 PID: 22 Comm: kworker/1:0 Not tainted 5.11.0-rc3-gc213503139bb #2
>>>>>
>>>>> The kworker bond to CPU1 runs on CPU0 and triggers the warning, which
>>>>> shows that scheduler breaks CPU affinity, after 06249738a41a
>>>>> ("workqueue: Manually break affinity on hotplug"), though quite likely
>>>>> by kworker/1:0 for the initial workers.
>>>>>
>>>>>> [ 8.882518] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>>>> [ 8.887539] Workqueue: 0x0 (events)
>>>>>> [ 8.887838] EIP: process_one_work+0x92/0x9e0
>>>>>> [ 8.887838] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 74 85 13 00 ff 05 b8 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>>>> [ 8.887838] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>>>> [ 8.887838] ESI: 43c04720 EDI: 42e45620 EBP: de7f23c0 ESP: 43d7bf08
>>>>>> [ 8.887838] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>>>> [ 8.887838] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
>>>>>> [ 8.887838] Call Trace:
>>>>>> [ 8.887838] ? worker_thread+0x98/0x6a0
>>>>>> [ 8.887838] ? worker_thread+0x2dd/0x6a0
>>>>>> [ 8.887838] ? kthread+0x1ba/0x1e0
>>>>>> [ 8.887838] ? create_worker+0x1e0/0x1e0
>>>>>> [ 8.887838] ? kzalloc+0x20/0x20
>>>>>> [ 8.887838] ? ret_from_fork+0x1c/0x28
>>>>>> [ 8.887838] _warn_unseeded_randomness: 63 callbacks suppressed
>>>>>> [ 8.887838] random: get_random_bytes called from init_oops_id+0x2b/0x60 with crng_init=0
>>>>>> [ 8.887838] --[ end trace ac461b4d54c37cfa ]--
>>>>>
>>>>>
>>>>> Instead of creating the initial workers only on the active CPUS, rebind
>>>>> them (labeled pcpu) and jump to the right CPU at bootup time.
>>>>>
>>>>> --- a/kernel/workqueue.c
>>>>> +++ b/kernel/workqueue.c
>>>>> @@ -2385,6 +2385,16 @@ woke_up:
>>>>>     		return 0;
>>>>>     	}
>>>>>     
>>>>> +	if (!(pool->flags & POOL_DISASSOCIATED) && smp_processor_id() !=
>>>>> +								pool->cpu) {
>>>>> +		/* scheduler breaks CPU affinity for us, rebind it */
>>>>> +		raw_spin_unlock_irq(&pool->lock);
>>>>> +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>>>> +		/* and jump to the right seat */
>>>>> +		schedule_timeout_interruptible(1);
>>>>> +		goto woke_up;
>>>>> +	}
>>>>> +
>>>>>     	worker_leave_idle(worker);
>>>>>     recheck:
>>>>>     	/* no more worker necessary? */
>>>>> --
>>>>>
>>>> I test the patch, the warning still appears in the kernel log.
>>>
>>> Thanks for your report.
>>>>
>>>> [  230.356503] smpboot: CPU 1 is now offline
>>>> [  230.544652] x86: Booting SMP configuration:
>>>> [  230.545077] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>> [  230.545640] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>> [  230.545675] masked ExtINT on CPU#1
>>>> [  230.593829] ------------[ cut here ]------------
>>>> [  230.594257] WARNING: CPU: 0 PID: 257 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>>>> [  230.594990] Modules linked in: rcutorture torture mousedev input_leds
>>>> led_class pcspkr psmouse evbug tiny_power_button button
>>>> [  230.595961] CPU: 0 PID: 257 Comm: kworker/1:3 Not tainted 5.11.0-rc3-gdcba55d9080f #2
>>>
>>> Like what was reported, kworker bond to CPU1 runs on CPU0 and triggers
>>> warning, due to scheduler breaking CPU affinity for us. What is new, the
>>> affinity was broken at offline time instead of bootup.
>>>
>>>> [  230.596621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>> [  230.597322] Workqueue:  0x0 (rcu_gp)
>>>> [  230.597636] EIP: process_one_work+0x92/0x9e0
>>>> [  230.598005] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>>>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 f4 85 13 00 ff 05 cc 30 04
>>>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>> [  230.599569] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>> [  230.600100] ESI: 43d94240 EDI: df4040f4 EBP: de7f23c0 ESP: bf5f1f08
>>>> [  230.600629] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>> [  230.601203] CR0: 80050033 CR2: 01bdecbc CR3: 04e2c000 CR4: 000406d0
>>>> [  230.601735] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>>>> [  230.602265] DR6: fffe0ff0 DR7: 00000400
>>>> [  230.602594] Call Trace:
>>>> [  230.602813]  ? process_one_work+0x20e/0x9e0
>>>> [  230.603181]  ? worker_thread+0x32d/0x700
>>>> [  230.603522]  ? kthread+0x1ba/0x1e0
>>>> [  230.603818]  ? create_worker+0x1e0/0x1e0
>>>> [  230.604157]  ? kzalloc+0x20/0x20
>>>> [  230.604524]  ? ret_from_fork+0x1c/0x28
>>>> [  230.604850] ---[ end trace 06b1e66b5e17fa85 ]---
>>>> [  230.605504] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>>>> [  230.766960] smpboot: CPU 1 is now offline
>>>> [  230.814803] x86: Booting SMP configuration:
>>>> [  230.815306] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>> [  230.815964] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>
>>>
>>> Unlike the above diff that is at most papering over the problem
>>> sitting somewhere in the scheduler, add change to creating worker
>>> by skipping set_cpus_allowed_ptr() because we will wake it up after
>>> attaching it to worker pool.
>>>
>>> If we can ignore rescuer for now, then the allowed ptr is only
>>> updated at on/offline time; lets see the difference at boot time.
>>>
>>>
>>> --- a/kernel/workqueue.c
>>> +++ b/kernel/workqueue.c
>>> @@ -1844,16 +1844,10 @@ static struct worker *alloc_worker(int n
>>>     * cpu-[un]hotplugs.
>>>     */
>>>    static void worker_attach_to_pool(struct worker *worker,
>>> -				   struct worker_pool *pool)
>>> +				   struct worker_pool *pool,
>>> +				   int update_cpus_allowed)
>>>    {
>>>    	mutex_lock(&wq_pool_attach_mutex);
>>> -
>>> -	/*
>>> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>>> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
>>> -	 */
>>> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>> -
>>>    	/*
>>>    	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>>>    	 * stable across this function.  See the comments above the flag
>>> @@ -1867,6 +1861,9 @@ static void worker_attach_to_pool(struct
>>>    	list_add_tail(&worker->node, &pool->workers);
>>>    	worker->pool = pool;
>>>    
>>> +	if (update_cpus_allowed)
>>> +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>> +
>>>    	mutex_unlock(&wq_pool_attach_mutex);
>>>    }
>>>    
>>> @@ -1942,8 +1939,11 @@ static struct worker *create_worker(stru
>>>    	set_user_nice(worker->task, pool->attrs->nice);
>>>    	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>>>    
>>> -	/* successful, attach the worker to the pool */
>>> -	worker_attach_to_pool(worker, pool);
>>> +	/*
>>> +	 * attach the worker to the pool without asking scheduler to
>>> +	 * update CPUs allowed
>>> +	 */
>>> +	worker_attach_to_pool(worker, pool, 0);
>>>    
>>>    	/* start the newly created worker */
>>>    	raw_spin_lock_irq(&pool->lock);
>>> @@ -2508,7 +2508,7 @@ repeat:
>>>    
>>>    		raw_spin_unlock_irq(&wq_mayday_lock);
>>>    
>>> -		worker_attach_to_pool(rescuer, pool);
>>> +		worker_attach_to_pool(rescuer, pool, 1);
>>>    
>>>    		raw_spin_lock_irq(&pool->lock);
>>>    
>>> --
>>>
>> I test the patch, the warning still appears in the kernel log.
> 
> Thanks.
>>
>> [   55.754187] smpboot: Booting Node 0 Processor 1 APIC 0x1
>> [   55.785594] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>> [   55.785646] masked ExtINT on CPU#1
>> [   55.920602] ------------[ cut here ]------------
>> [   55.921355] WARNING: CPU: 0 PID: 160 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>> [   55.922583] Modules linked in: rcutorture torture mousedev evbug
>> input_leds led_class tiny_power_button psmouse pcspkr button
>> [   55.924294] CPU: 0 PID: 160 Comm: kworker/1:2 Not tainted 5.11.0-rc3-00186-g77bf4e461cfa #2
> 
> Same issue as before.
> 
>> [   55.925552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>> [   55.926763] Workqueue:  0x0 (rcu_gp)
>> [   55.927298] EIP: process_one_work+0x92/0x9e0
>> [   55.927950] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 94 85 13 00 ff 05 b8 30 04
>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>> [   55.930726] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>> [   55.931642] ESI: 43d90540 EDI: df48c0f4 EBP: de7f23c0 ESP: bfb47f08
>> [   55.932590] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>> [   55.933609] CR0: 80050033 CR2: 024e994c CR3: 7fd80000 CR4: 000406d0
>> [   55.934555] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>> [   55.935457] DR6: fffe0ff0 DR7: 00000400
>> [   55.936041] Call Trace:
>> [   55.936534]  ? process_one_work+0x20e/0x9e0
>> [   55.937305]  ? worker_thread+0x2dd/0x6a0
>> [   55.938018]  ? kthread+0x1ba/0x1e0
>> [   55.938598]  ? create_worker+0x1e0/0x1e0
>> [   55.939315]  ? kzalloc+0x20/0x20
>> [   55.940000]  ? ret_from_fork+0x1c/0x28
>> [   55.940627] ---[ end trace d155e9e6402de179 ]---
>> [   55.941641] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>> [   56.155271] smpboot: CPU 1 is now offline
>> [   56.193613] x86: Booting SMP configuration:
>> [   56.194400] smpboot: Booting Node 0 Processor 1 APIC 0x1
> 
> The changes in the diff below are
> 
> 1/ at rescue time, change CPU affinity only if POOL_DISASSOCIATED
> is not set, and print warning the same way as offline time.
> 
> 2/ at offine time, dont update allowed CPUs after setting
> POOL_DISASSOCIATED because we no longer have interest in affinity.
> 
> 3/ at online time, mark pcpu before binding affinity.
> 
> Though one change a diff is appreciated, by the WARNs, we can tell
> which is what if any warning goes into dmesg.
> 
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n
>    * cpu-[un]hotplugs.
>    */
>   static void worker_attach_to_pool(struct worker *worker,
> -				   struct worker_pool *pool)
> +				   struct worker_pool *pool, int set)
>   {
>   	mutex_lock(&wq_pool_attach_mutex);
>   
>   	/*
> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
> -	 */
> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> -
> -	/*
>   	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>   	 * stable across this function.  See the comments above the flag
>   	 * definition for details.
>   	 */
>   	if (pool->flags & POOL_DISASSOCIATED)
>   		worker->flags |= WORKER_UNBOUND;
> -	else
> +	else {
>   		kthread_set_per_cpu(worker->task, true);
> +		if (set)
> +			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
> +						pool->attrs->cpumask) < 0);
> +	}
>   
>   	list_add_tail(&worker->node, &pool->workers);
>   	worker->pool = pool;
> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru
>   	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>   
>   	/* successful, attach the worker to the pool */
> -	worker_attach_to_pool(worker, pool);
> +	worker_attach_to_pool(worker, pool, 0);
>   
>   	/* start the newly created worker */
>   	raw_spin_lock_irq(&pool->lock);
> @@ -2508,7 +2506,7 @@ repeat:
>   
>   		raw_spin_unlock_irq(&wq_mayday_lock);
>   
> -		worker_attach_to_pool(rescuer, pool);
> +		worker_attach_to_pool(rescuer, pool, 1);
>   
>   		raw_spin_lock_irq(&pool->lock);
>   
> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu)
>   
>   		for_each_pool_worker(worker, pool) {
>   			kthread_set_per_cpu(worker->task, false);
> -			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
>   		}
>   
>   		mutex_unlock(&wq_pool_attach_mutex);
> @@ -4977,9 +4974,9 @@ static void rebind_workers(struct worker
>   	 * from CPU_ONLINE, the following shouldn't fail.
>   	 */
>   	for_each_pool_worker(worker, pool) {
> +		kthread_set_per_cpu(worker->task, true);
>   		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
>   						  pool->attrs->cpumask) < 0);
> -		kthread_set_per_cpu(worker->task, true);
>   	}
>   
>   	raw_spin_lock_irq(&pool->lock);
> --
> 
I test the patch, the warning still appears in the kernel log, but the 
warning is different from before.

[    0.054803] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
[    0.054803] masked ExtINT on CPU#1
[    0.054803] smpboot: CPU 1 Converting physical 0 to logical die 1
[    1.890338] ------------[ cut here ]------------
[    1.890338] WARNING: CPU: 1 PID: 18 at kernel/kthread.c:508 
kthread_set_per_cpu+0x156/0x180
[    1.890338] Modules linked in:
[    1.890338] CPU: 1 PID: 18 Comm: cpuhp/1 Not tainted 
5.11.0-rc3-00186-ged03082352b2 #2
[    1.890338] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.12.0-1 04/01/2014
[    1.890338] EIP: kthread_set_per_cpu+0x156/0x180
[    1.890338] Code: 00 00 00 00 ff 05 68 4e 04 43 83 c4 08 5b 5e 5f c3 
8d 76 00 ff 05 34 50 04 43 0f 0b e9 f9 fe ff ff 8d 76 00 ff 05 2c 4e 04 
43 <0f> 0b eb 9d 8d b6 00 00 00 00 ff 05 40 4e 04 43 0f 0b e9 45 ff ff
[    1.890338] EAX: 42f52ce0 EBX: 00000001 ECX: 00000000 EDX: 00000001
[    1.890338] ESI: 43d76300 EDI: 43c0de00 EBP: de7f2564 ESP: 43d6beb8
[    1.900350] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202
[    1.901303] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
[    1.902280] Call Trace:
[    1.902682]  ? workqueue_online_cpu+0x12b/0x640
[    1.903415]  ? workqueue_prepare_cpu+0xa0/0xa0
[    1.904155]  ? cpuhp_invoke_callback+0x1ed/0x1340
[    1.904941]  ? cpuhp_thread_fun+0x28f/0x460
[    1.905630]  ? cpuhp_thread_fun+0x49/0x460
[    1.906298]  ? smpboot_thread_fn+0x446/0x620
[    1.910275]  ? kthread+0x1ba/0x1e0
[    1.910857]  ? __smpboot_create_thread+0x260/0x260
[    1.911659]  ? kzalloc+0x20/0x20
[    1.912368]  ? ret_from_fork+0x1c/0x28
[    1.913016] ---[ end trace 6f6c005278241eba ]---
[    1.913971] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
[    1.920012] smp: Brought up 1 node, 2 CPUs
[    1.920299] smpboot: Max logical packages: 2
[    1.921019] smpboot: Total of 2 processors activated (10774.03 BogoMIPS)



-- 
Zhengjun Xing

[-- Attachment #2: dmesg.xz --]
[-- Type: application/octet-stream, Size: 38592 bytes --]

WARNING: multiple messages have this Message-ID (diff)

From: Xing Zhengjun <zhengjun.xing@linux.intel.com>
To: Hillf Danton <hdanton@sina.com>
Cc: Oliver Sang <oliver.sang@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Lai Jiangshan <laijs@linux.alibaba.com>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@intel.com, lkp <lkp@lists.01.org>
Subject: Re: [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work
Date: Tue, 26 Jan 2021 10:45:11 +0800	[thread overview]
Message-ID: <7b97a3ff-a1a3-55c4-0643-1346dfb276b2@linux.intel.com> (raw)
In-Reply-To: <20210125092900.1839-1-hdanton@sina.com>

[-- Attachment #1: Type: text/plain, Size: 16079 bytes --]



On 1/25/2021 5:29 PM, Hillf Danton wrote:
> On 25 Jan 2021 16:31:32 +0800 Xing Zhengjun wrote:
>> On 1/22/2021 3:59 PM, Hillf Danton wrote:
>>> On Fri, 22 Jan 2021 09:48:32 +0800 Xing Zhengjun wrote:
>>>> On 1/21/2021 12:00 PM, Hillf Danton wrote:
>>>>> On Wed, 20 Jan 2021 21:46:33 +0800 Oliver Sang wrote:
>>>>>> On Fri, Jan 15, 2021 at 03:24:32PM +0800, Hillf Danton wrote:
>>>>>>> Thu, 14 Jan 2021 15:45:11 +0800
>>>>>>>>
>>>>>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>>>>>
>>>>>>>> commit: d5bff968ea9cc005e632d9369c26cbd8148c93d5 ("workqueue: break affinity initiatively")
>>>>>>>> https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2021.01.11b
>>>>>>>>
>>>>>>> [...]
>>>>>>>>
>>>>>>>> [   73.794288] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2192 process_one_work
>>>>>>>
>>>>>>> Thanks for your report.
>>>>>>>
>>>>>>> We can also break CPU affinity by checking POOL_DISASSOCIATED at attach
>>>>>>> time without extra cost paid; that way we have the same behavior as at
>>>>>>> the unbind time.
>>>>>>>
>>>>>>> What is more the change that makes kworker pcpu is cut because they are
>>>>>>> going to not help either hotplug or the mechanism of stop machine.
>>>>>>
>>>>>> hi, by applying below patch, the issue still happened.
>>>>>
>>>>> Thanks for your report.
>>>>>>
>>>>>> [ 4.574467] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
>>>>>> [ 4.575651] pci 0000:00:01.0: Activating ISA DMA hang workarounds
>>>>>> [ 4.576900] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
>>>>>> [ 4.578648] PCI: CLS 0 bytes, default 64
>>>>>> [ 4.579685] Unpacking initramfs...
>>>>>> [ 8.878031] -----------[ cut here ]-----------
>>>>>> [ 8.879083] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2187 process_one_work+0x92/0x9e0
>>>>>> [ 8.880688] Modules linked in:
>>>>>> [ 8.881274] CPU: 0 PID: 22 Comm: kworker/1:0 Not tainted 5.11.0-rc3-gc213503139bb #2
>>>>>
>>>>> The kworker bond to CPU1 runs on CPU0 and triggers the warning, which
>>>>> shows that scheduler breaks CPU affinity, after 06249738a41a
>>>>> ("workqueue: Manually break affinity on hotplug"), though quite likely
>>>>> by kworker/1:0 for the initial workers.
>>>>>
>>>>>> [ 8.882518] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>>>> [ 8.887539] Workqueue: 0x0 (events)
>>>>>> [ 8.887838] EIP: process_one_work+0x92/0x9e0
>>>>>> [ 8.887838] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 74 85 13 00 ff 05 b8 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>>>> [ 8.887838] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>>>> [ 8.887838] ESI: 43c04720 EDI: 42e45620 EBP: de7f23c0 ESP: 43d7bf08
>>>>>> [ 8.887838] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>>>> [ 8.887838] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
>>>>>> [ 8.887838] Call Trace:
>>>>>> [ 8.887838] ? worker_thread+0x98/0x6a0
>>>>>> [ 8.887838] ? worker_thread+0x2dd/0x6a0
>>>>>> [ 8.887838] ? kthread+0x1ba/0x1e0
>>>>>> [ 8.887838] ? create_worker+0x1e0/0x1e0
>>>>>> [ 8.887838] ? kzalloc+0x20/0x20
>>>>>> [ 8.887838] ? ret_from_fork+0x1c/0x28
>>>>>> [ 8.887838] _warn_unseeded_randomness: 63 callbacks suppressed
>>>>>> [ 8.887838] random: get_random_bytes called from init_oops_id+0x2b/0x60 with crng_init=0
>>>>>> [ 8.887838] --[ end trace ac461b4d54c37cfa ]--
>>>>>
>>>>>
>>>>> Instead of creating the initial workers only on the active CPUS, rebind
>>>>> them (labeled pcpu) and jump to the right CPU at bootup time.
>>>>>
>>>>> --- a/kernel/workqueue.c
>>>>> +++ b/kernel/workqueue.c
>>>>> @@ -2385,6 +2385,16 @@ woke_up:
>>>>>     		return 0;
>>>>>     	}
>>>>>     
>>>>> +	if (!(pool->flags & POOL_DISASSOCIATED) && smp_processor_id() !=
>>>>> +								pool->cpu) {
>>>>> +		/* scheduler breaks CPU affinity for us, rebind it */
>>>>> +		raw_spin_unlock_irq(&pool->lock);
>>>>> +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>>>> +		/* and jump to the right seat */
>>>>> +		schedule_timeout_interruptible(1);
>>>>> +		goto woke_up;
>>>>> +	}
>>>>> +
>>>>>     	worker_leave_idle(worker);
>>>>>     recheck:
>>>>>     	/* no more worker necessary? */
>>>>> --
>>>>>
>>>> I test the patch, the warning still appears in the kernel log.
>>>
>>> Thanks for your report.
>>>>
>>>> [  230.356503] smpboot: CPU 1 is now offline
>>>> [  230.544652] x86: Booting SMP configuration:
>>>> [  230.545077] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>> [  230.545640] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>> [  230.545675] masked ExtINT on CPU#1
>>>> [  230.593829] ------------[ cut here ]------------
>>>> [  230.594257] WARNING: CPU: 0 PID: 257 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>>>> [  230.594990] Modules linked in: rcutorture torture mousedev input_leds
>>>> led_class pcspkr psmouse evbug tiny_power_button button
>>>> [  230.595961] CPU: 0 PID: 257 Comm: kworker/1:3 Not tainted 5.11.0-rc3-gdcba55d9080f #2
>>>
>>> Like what was reported, kworker bond to CPU1 runs on CPU0 and triggers
>>> warning, due to scheduler breaking CPU affinity for us. What is new, the
>>> affinity was broken at offline time instead of bootup.
>>>
>>>> [  230.596621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>> [  230.597322] Workqueue:  0x0 (rcu_gp)
>>>> [  230.597636] EIP: process_one_work+0x92/0x9e0
>>>> [  230.598005] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>>>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 f4 85 13 00 ff 05 cc 30 04
>>>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>> [  230.599569] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>> [  230.600100] ESI: 43d94240 EDI: df4040f4 EBP: de7f23c0 ESP: bf5f1f08
>>>> [  230.600629] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>> [  230.601203] CR0: 80050033 CR2: 01bdecbc CR3: 04e2c000 CR4: 000406d0
>>>> [  230.601735] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>>>> [  230.602265] DR6: fffe0ff0 DR7: 00000400
>>>> [  230.602594] Call Trace:
>>>> [  230.602813]  ? process_one_work+0x20e/0x9e0
>>>> [  230.603181]  ? worker_thread+0x32d/0x700
>>>> [  230.603522]  ? kthread+0x1ba/0x1e0
>>>> [  230.603818]  ? create_worker+0x1e0/0x1e0
>>>> [  230.604157]  ? kzalloc+0x20/0x20
>>>> [  230.604524]  ? ret_from_fork+0x1c/0x28
>>>> [  230.604850] ---[ end trace 06b1e66b5e17fa85 ]---
>>>> [  230.605504] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>>>> [  230.766960] smpboot: CPU 1 is now offline
>>>> [  230.814803] x86: Booting SMP configuration:
>>>> [  230.815306] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>> [  230.815964] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>
>>>
>>> Unlike the above diff that is at most papering over the problem
>>> sitting somewhere in the scheduler, add change to creating worker
>>> by skipping set_cpus_allowed_ptr() because we will wake it up after
>>> attaching it to worker pool.
>>>
>>> If we can ignore rescuer for now, then the allowed ptr is only
>>> updated at on/offline time; lets see the difference at boot time.
>>>
>>>
>>> --- a/kernel/workqueue.c
>>> +++ b/kernel/workqueue.c
>>> @@ -1844,16 +1844,10 @@ static struct worker *alloc_worker(int n
>>>     * cpu-[un]hotplugs.
>>>     */
>>>    static void worker_attach_to_pool(struct worker *worker,
>>> -				   struct worker_pool *pool)
>>> +				   struct worker_pool *pool,
>>> +				   int update_cpus_allowed)
>>>    {
>>>    	mutex_lock(&wq_pool_attach_mutex);
>>> -
>>> -	/*
>>> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>>> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
>>> -	 */
>>> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>> -
>>>    	/*
>>>    	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>>>    	 * stable across this function.  See the comments above the flag
>>> @@ -1867,6 +1861,9 @@ static void worker_attach_to_pool(struct
>>>    	list_add_tail(&worker->node, &pool->workers);
>>>    	worker->pool = pool;
>>>    
>>> +	if (update_cpus_allowed)
>>> +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>> +
>>>    	mutex_unlock(&wq_pool_attach_mutex);
>>>    }
>>>    
>>> @@ -1942,8 +1939,11 @@ static struct worker *create_worker(stru
>>>    	set_user_nice(worker->task, pool->attrs->nice);
>>>    	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>>>    
>>> -	/* successful, attach the worker to the pool */
>>> -	worker_attach_to_pool(worker, pool);
>>> +	/*
>>> +	 * attach the worker to the pool without asking scheduler to
>>> +	 * update CPUs allowed
>>> +	 */
>>> +	worker_attach_to_pool(worker, pool, 0);
>>>    
>>>    	/* start the newly created worker */
>>>    	raw_spin_lock_irq(&pool->lock);
>>> @@ -2508,7 +2508,7 @@ repeat:
>>>    
>>>    		raw_spin_unlock_irq(&wq_mayday_lock);
>>>    
>>> -		worker_attach_to_pool(rescuer, pool);
>>> +		worker_attach_to_pool(rescuer, pool, 1);
>>>    
>>>    		raw_spin_lock_irq(&pool->lock);
>>>    
>>> --
>>>
>> I test the patch, the warning still appears in the kernel log.
> 
> Thanks.
>>
>> [   55.754187] smpboot: Booting Node 0 Processor 1 APIC 0x1
>> [   55.785594] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>> [   55.785646] masked ExtINT on CPU#1
>> [   55.920602] ------------[ cut here ]------------
>> [   55.921355] WARNING: CPU: 0 PID: 160 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>> [   55.922583] Modules linked in: rcutorture torture mousedev evbug
>> input_leds led_class tiny_power_button psmouse pcspkr button
>> [   55.924294] CPU: 0 PID: 160 Comm: kworker/1:2 Not tainted 5.11.0-rc3-00186-g77bf4e461cfa #2
> 
> Same issue as before.
> 
>> [   55.925552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>> [   55.926763] Workqueue:  0x0 (rcu_gp)
>> [   55.927298] EIP: process_one_work+0x92/0x9e0
>> [   55.927950] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 94 85 13 00 ff 05 b8 30 04
>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>> [   55.930726] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>> [   55.931642] ESI: 43d90540 EDI: df48c0f4 EBP: de7f23c0 ESP: bfb47f08
>> [   55.932590] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>> [   55.933609] CR0: 80050033 CR2: 024e994c CR3: 7fd80000 CR4: 000406d0
>> [   55.934555] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>> [   55.935457] DR6: fffe0ff0 DR7: 00000400
>> [   55.936041] Call Trace:
>> [   55.936534]  ? process_one_work+0x20e/0x9e0
>> [   55.937305]  ? worker_thread+0x2dd/0x6a0
>> [   55.938018]  ? kthread+0x1ba/0x1e0
>> [   55.938598]  ? create_worker+0x1e0/0x1e0
>> [   55.939315]  ? kzalloc+0x20/0x20
>> [   55.940000]  ? ret_from_fork+0x1c/0x28
>> [   55.940627] ---[ end trace d155e9e6402de179 ]---
>> [   55.941641] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>> [   56.155271] smpboot: CPU 1 is now offline
>> [   56.193613] x86: Booting SMP configuration:
>> [   56.194400] smpboot: Booting Node 0 Processor 1 APIC 0x1
> 
> The changes in the diff below are
> 
> 1/ at rescue time, change CPU affinity only if POOL_DISASSOCIATED
> is not set, and print warning the same way as offline time.
> 
> 2/ at offine time, dont update allowed CPUs after setting
> POOL_DISASSOCIATED because we no longer have interest in affinity.
> 
> 3/ at online time, mark pcpu before binding affinity.
> 
> Though one change a diff is appreciated, by the WARNs, we can tell
> which is what if any warning goes into dmesg.
> 
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n
>    * cpu-[un]hotplugs.
>    */
>   static void worker_attach_to_pool(struct worker *worker,
> -				   struct worker_pool *pool)
> +				   struct worker_pool *pool, int set)
>   {
>   	mutex_lock(&wq_pool_attach_mutex);
>   
>   	/*
> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> -	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
> -	 */
> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> -
> -	/*
>   	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>   	 * stable across this function.  See the comments above the flag
>   	 * definition for details.
>   	 */
>   	if (pool->flags & POOL_DISASSOCIATED)
>   		worker->flags |= WORKER_UNBOUND;
> -	else
> +	else {
>   		kthread_set_per_cpu(worker->task, true);
> +		if (set)
> +			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
> +						pool->attrs->cpumask) < 0);
> +	}
>   
>   	list_add_tail(&worker->node, &pool->workers);
>   	worker->pool = pool;
> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru
>   	kthread_bind_mask(worker->task, pool->attrs->cpumask);
>   
>   	/* successful, attach the worker to the pool */
> -	worker_attach_to_pool(worker, pool);
> +	worker_attach_to_pool(worker, pool, 0);
>   
>   	/* start the newly created worker */
>   	raw_spin_lock_irq(&pool->lock);
> @@ -2508,7 +2506,7 @@ repeat:
>   
>   		raw_spin_unlock_irq(&wq_mayday_lock);
>   
> -		worker_attach_to_pool(rescuer, pool);
> +		worker_attach_to_pool(rescuer, pool, 1);
>   
>   		raw_spin_lock_irq(&pool->lock);
>   
> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu)
>   
>   		for_each_pool_worker(worker, pool) {
>   			kthread_set_per_cpu(worker->task, false);
> -			WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
>   		}
>   
>   		mutex_unlock(&wq_pool_attach_mutex);
> @@ -4977,9 +4974,9 @@ static void rebind_workers(struct worker
>   	 * from CPU_ONLINE, the following shouldn't fail.
>   	 */
>   	for_each_pool_worker(worker, pool) {
> +		kthread_set_per_cpu(worker->task, true);
>   		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
>   						  pool->attrs->cpumask) < 0);
> -		kthread_set_per_cpu(worker->task, true);
>   	}
>   
>   	raw_spin_lock_irq(&pool->lock);
> --
> 
I test the patch, the warning still appears in the kernel log, but the 
warning is different from before.

[    0.054803] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
[    0.054803] masked ExtINT on CPU#1
[    0.054803] smpboot: CPU 1 Converting physical 0 to logical die 1
[    1.890338] ------------[ cut here ]------------
[    1.890338] WARNING: CPU: 1 PID: 18 at kernel/kthread.c:508 
kthread_set_per_cpu+0x156/0x180
[    1.890338] Modules linked in:
[    1.890338] CPU: 1 PID: 18 Comm: cpuhp/1 Not tainted 
5.11.0-rc3-00186-ged03082352b2 #2
[    1.890338] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS 1.12.0-1 04/01/2014
[    1.890338] EIP: kthread_set_per_cpu+0x156/0x180
[    1.890338] Code: 00 00 00 00 ff 05 68 4e 04 43 83 c4 08 5b 5e 5f c3 
8d 76 00 ff 05 34 50 04 43 0f 0b e9 f9 fe ff ff 8d 76 00 ff 05 2c 4e 04 
43 <0f> 0b eb 9d 8d b6 00 00 00 00 ff 05 40 4e 04 43 0f 0b e9 45 ff ff
[    1.890338] EAX: 42f52ce0 EBX: 00000001 ECX: 00000000 EDX: 00000001
[    1.890338] ESI: 43d76300 EDI: 43c0de00 EBP: de7f2564 ESP: 43d6beb8
[    1.900350] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202
[    1.901303] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
[    1.902280] Call Trace:
[    1.902682]  ? workqueue_online_cpu+0x12b/0x640
[    1.903415]  ? workqueue_prepare_cpu+0xa0/0xa0
[    1.904155]  ? cpuhp_invoke_callback+0x1ed/0x1340
[    1.904941]  ? cpuhp_thread_fun+0x28f/0x460
[    1.905630]  ? cpuhp_thread_fun+0x49/0x460
[    1.906298]  ? smpboot_thread_fn+0x446/0x620
[    1.910275]  ? kthread+0x1ba/0x1e0
[    1.910857]  ? __smpboot_create_thread+0x260/0x260
[    1.911659]  ? kzalloc+0x20/0x20
[    1.912368]  ? ret_from_fork+0x1c/0x28
[    1.913016] ---[ end trace 6f6c005278241eba ]---
[    1.913971] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
[    1.920012] smp: Brought up 1 node, 2 CPUs
[    1.920299] smpboot: Max logical packages: 2
[    1.921019] smpboot: Total of 2 processors activated (10774.03 BogoMIPS)



-- 
Zhengjun Xing

[-- Attachment #2: dmesg.xz --]
[-- Type: application/octet-stream, Size: 38592 bytes --]

next      parent reply	other threads:[~2021-01-26  2:45 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20210125092900.1839-1-hdanton@sina.com>
2021-01-26  2:45 ` Xing Zhengjun [this message]
2021-01-26  2:45   ` [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work Xing Zhengjun
     [not found] <20210126073925.1962-1-hdanton@sina.com>
2021-01-27  8:04 ` Xing Zhengjun
2021-01-27  8:04   ` Xing Zhengjun
     [not found] ` <20210127092128.2299-1-hdanton@sina.com>
2021-01-28  7:52   ` Xing Zhengjun
2021-01-28  7:52     ` Xing Zhengjun
     [not found]   ` <20210128090905.1596-1-hdanton@sina.com>
2021-01-28 18:08     ` Paul E. McKenney
2021-01-29  6:20       ` Xing Zhengjun
2021-01-29  6:20         ` Xing Zhengjun
2021-01-29 15:11         ` Paul E. McKenney
     [not found] <20210122075903.1722-1-hdanton@sina.com>
2021-01-25  8:37 ` Xing Zhengjun
2021-01-25  8:37   ` Xing Zhengjun
     [not found] <20210114084248.1819-1-hdanton@sina.com>
2021-01-20 13:41 ` Oliver Sang
2021-01-20 13:41   ` Oliver Sang
2021-01-14  7:45 kernel test robot
2021-01-14  7:45 ` kernel test robot
     [not found] ` <20210115072432.150-1-hdanton@sina.com>
2021-01-20 13:46   ` Oliver Sang
2021-01-20 13:46     ` Oliver Sang
     [not found]   ` <20210121040037.1555-1-hdanton@sina.com>
2021-01-22  1:48     ` Xing Zhengjun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7b97a3ff-a1a3-55c4-0643-1346dfb276b2@linux.intel.com \
    --to=zhengjun.xing@linux.intel.com \
    --cc=lkp@lists.01.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.