From: Thomas Gleixner <tglx@linutronix.de>
To: Xiongfeng Wang <wangxiongfeng2@huawei.com>,
vschneid@redhat.com, Phil Auld <pauld@redhat.com>,
vdonnefort@google.com
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
wangxiongfeng2@huawei.com, Wei Li <liwei391@huawei.com>,
"liaoyu (E)" <liaoyu15@huawei.com>,
zhangqiao22@huawei.com, Peter Zijlstra <peterz@infradead.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [Question] report a race condition between CPU hotplug state machine and hrtimer 'sched_cfs_period_timer' for cfs bandwidth throttling
Date: Fri, 09 Jun 2023 16:55:37 +0200
Message-ID: <87mt18it1y.ffs@tglx>
In-Reply-To: <8e785777-03aa-99e1-d20e-e956f5685be6@huawei.com>
On Fri, Jun 09 2023 at 19:24, Xiongfeng Wang wrote:
Cc+ scheduler people, leave context intact
> Hello,
> While running some low-power tests, I got the following hung task report.
>
> Call trace:
> __switch_to+0xd4/0x160
> __schedule+0x38c/0x8c4
> __cond_resched+0x24/0x50
> unmap_kernel_range_noflush+0x210/0x240
> kretprobe_trampoline+0x0/0xc8
> __vunmap+0x70/0x31c
> __vfree+0x34/0x8c
> vfree+0x40/0x58
> free_vm_stack_cache+0x44/0x74
> cpuhp_invoke_callback+0xc4/0x71c
> _cpu_down+0x108/0x284
> kretprobe_trampoline+0x0/0xc8
> suspend_enter+0xd8/0x8ec
> suspend_devices_and_enter+0x1f0/0x360
> pm_suspend.part.1+0x428/0x53c
> pm_suspend+0x3c/0xa0
> devdrv_suspend_proc+0x148/0x248 [drv_devmng]
> devdrv_manager_set_power_state+0x140/0x680 [drv_devmng]
> devdrv_manager_ioctl+0xcc/0x210 [drv_devmng]
> drv_ascend_intf_ioctl+0x84/0x248 [drv_davinci_intf]
> __arm64_sys_ioctl+0xb4/0xf0
> el0_svc_common.constprop.0+0x140/0x374
> do_el0_svc+0x80/0xa0
> el0_svc+0x1c/0x28
> el0_sync_handler+0x90/0xf0
> el0_sync+0x168/0x180
>
> After some analysis, I found it is caused by the following race condition.
>
> 1. A task running on CPU1 is throttled for cfs bandwidth. CPU1 starts the
> hrtimer cfs_bandwidth 'period_timer' and enqueues the hrtimer on CPU1's rbtree.
> 2. Then the task is migrated to CPU2 and starts to offline CPU1. CPU1 starts
> the CPUHP AP steps, and then the hrtimer 'period_timer' expires and is
> re-enqueued on CPU1.
> 3. CPU1 runs to take_cpu_down() and disables interrupts. After CPU1 has
> finished the CPUHP AP steps, CPU2 starts the remaining CPUHP steps.
> 4. When CPU2 runs free_vm_stack_cache(), the task is scheduled out in
> __vunmap() because it has run out of CPU quota. start_cfs_bandwidth() does
> not restart the hrtimer because 'cfs_b->period_active' is set.
> 5. The task waits for the hrtimer 'period_timer' to expire and wake it up, but
> CPU1 has disabled interrupts, so the hrtimer won't expire until it is migrated
> to CPU2 in hrtimers_dead_cpu(). But the task is blocked and cannot proceed to
> the hrtimers_dead_cpu() step. So the task hangs.
>
>     CPU1                                          CPU2
> Task set cfs_quota
> start hrtimer cfs_bandwidth 'period_timer'
>                                                   start to offline CPU1
> CPU1 start CPUHP AP step
> ...
> 'period_timer' expired and re-enqueued on CPU1
> ...
> disable irq in take_cpu_down()
> ...
>                                                   CPU2 start the rest CPUHP steps
>                                                   ...
>                                                   sched out in free_vm_stack_cache()
>                                                   wait for 'period_timer' expires
>
>
> I would appreciate any suggestions on how to fix this problem!
>
> Thanks,
> Xiongfeng
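
The five steps quoted above can be replayed in a small user-space model.
This is only a sketch under stated assumptions, not kernel code: 'struct
model' and all 'model_*' names are hypothetical and exist purely for
illustration. The only behaviour taken from the report is that
start_cfs_bandwidth() does nothing when 'period_active' is already set,
and that a timer queued on a CPU with interrupts disabled cannot fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
	bool period_active;   /* stands in for cfs_b->period_active */
	int  timer_cpu;       /* CPU whose queue holds 'period_timer', -1 if none */
	bool irq_enabled[3];  /* indexed by CPU number; only 1 and 2 are used */
};

/* Models the guard described in step 4: arm the timer only if not active. */
static bool model_start_bandwidth(struct model *m, int cpu)
{
	if (m->period_active)
		return false;         /* timer is assumed to be armed already */
	m->period_active = true;
	m->timer_cpu = cpu;
	return true;
}

/* The timer can fire only if the CPU holding it still takes interrupts. */
static bool model_timer_can_fire(const struct model *m)
{
	return m->timer_cpu >= 0 && m->irq_enabled[m->timer_cpu];
}

/* Replays steps 1-5; returns true if the hotplug task can never be woken. */
static bool model_replay_race(void)
{
	struct model m = {
		.period_active = false,
		.timer_cpu = -1,
		.irq_enabled = { false, true, true },
	};
	bool rearmed;

	/* Step 1: task throttled on CPU1, timer armed on CPU1. */
	rearmed = model_start_bandwidth(&m, 1);
	assert(rearmed && m.timer_cpu == 1);

	/* Step 2: timer expires and is re-enqueued on CPU1; state unchanged. */

	/* Step 3: CPU1 reaches take_cpu_down() and disables interrupts. */
	m.irq_enabled[1] = false;

	/* Step 4: task is throttled on CPU2; the guard refuses to re-arm. */
	rearmed = model_start_bandwidth(&m, 2);

	/*
	 * Step 5: waking the task needs the timer to fire, but moving the
	 * timer off CPU1 needs the task to reach hrtimers_dead_cpu().
	 */
	return !rearmed && !model_timer_can_fire(&m);
}
```

In this model, model_replay_race() returns true: the timer stays queued on
CPU1, CPU1 no longer handles interrupts, and the guard prevents CPU2 from
arming it again.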
Thread overview: 19+ messages
2023-06-09 11:24 [Question] report a race condition between CPU hotplug state machine and hrtimer 'sched_cfs_period_timer' for cfs bandwidth throttling Xiongfeng Wang
2023-06-09 14:55 ` Thomas Gleixner [this message]
2023-06-12 12:49 ` Xiongfeng Wang
2023-06-26 8:23 ` Xiongfeng Wang
2023-06-27 16:46 ` Vincent Guittot
2023-06-28 12:03 ` Thomas Gleixner
2023-06-28 12:35 ` Vincent Guittot
2023-06-28 22:01 ` Thomas Gleixner
2023-06-29 1:41 ` Xiongfeng Wang
2023-06-29 8:30 ` Vincent Guittot
2023-08-22 8:58 ` Xiongfeng Wang
2023-08-23 10:14 ` Thomas Gleixner
2023-08-24 7:25 ` Yu Liao
2023-08-29 7:18 ` Vincent Guittot
2023-06-28 13:30 ` Vincent Guittot
2023-06-28 21:09 ` Thomas Gleixner
2023-06-29 1:26 ` Xiongfeng Wang
2023-06-29 8:33 ` Vincent Guittot
2023-08-30 10:29 ` [tip: smp/urgent] cpu/hotplug: Prevent self deadlock on CPU hot-unplug tip-bot2 for Thomas Gleixner