public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: Vincent Guittot <vincent.guittot@linaro.org>,
	Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: vschneid@redhat.com, Phil Auld <pauld@redhat.com>,
	vdonnefort@google.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Wei Li <liwei391@huawei.com>, "liaoyu (E)" <liaoyu15@huawei.com>,
	zhangqiao22@huawei.com, Peter Zijlstra <peterz@infradead.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [Question] report a race condition between CPU hotplug state machine and hrtimer 'sched_cfs_period_timer' for cfs bandwidth throttling
Date: Wed, 28 Jun 2023 14:03:22 +0200
Message-ID: <875y774wvp.ffs@tglx>
In-Reply-To: <CAKfTPtBoe_jRn-EMsQxssQ4BcveT+Qcd+GmsRbQEXQDGfzFOMg@mail.gmail.com>

On Tue, Jun 27 2023 at 18:46, Vincent Guittot wrote:
> On Mon, 26 Jun 2023 at 10:23, Xiongfeng Wang <wangxiongfeng2@huawei.com> wrote:
>> > diff --cc kernel/sched/fair.c
>> > index d9d6519fae01,bd6624353608..000000000000
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@@ -5411,10 -5411,16 +5411,15 @@@ void start_cfs_bandwidth(struct cfs_ban
>> >   {
>> >         lockdep_assert_held(&cfs_b->lock);
>> >
>> > -       if (cfs_b->period_active)
>> > +       if (cfs_b->period_active) {
>> > +               struct hrtimer_clock_base *clock_base = cfs_b->period_timer.base;
>> > +               int cpu = clock_base->cpu_base->cpu;
>> > +               if (!cpu_active(cpu) && cpu != smp_processor_id())
>> > +                       hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED);
>> >                 return;
>> > +       }
>
> I have been able to reproduce your problem and run your fix on top. I
> still wonder if there is a
> Could we have a helper from hrtimer to get the cpu of the clock_base ?

No, because this is fundamentally wrong.

If the CPU is on the way out, then the scheduler hotplug machinery
has to handle the period timer so that the problem Xiongfeng analyzed
does not happen in the first place.

sched_cpu_wait_empty() would be the obvious place to clean up armed CFS
timers, but let me look into whether we can migrate hrtimers early in
general.

Aside from that, the above is wrong by itself.

	if (cfs_b->period_active)
		hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED);

This only ends up on the outgoing CPU if either:

   1) The code runs on the outgoing CPU

or

   2) The hrtimer is concurrently executing the hrtimer callback on the
      outgoing CPU.
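
The two cases above can be sketched as a toy user-space model. This is
not kernel code: pinned_target_cpu(), lands_on_outgoing() and NO_CPU are
hypothetical names used only for illustration, and the model assumes
that a pinned timer is queued on the local CPU unless its callback is
currently running, in which case it stays on the CPU running the
callback.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_CPU	(-1)	/* hypothetical marker: callback is not running anywhere */

/*
 * Toy model, not kernel code: the CPU on which a timer armed with
 * HRTIMER_MODE_ABS_PINNED is assumed to end up.
 *
 * this_cpu:     CPU executing hrtimer_start_expires()
 * callback_cpu: CPU currently running the timer callback, or NO_CPU
 */
static int pinned_target_cpu(int this_cpu, int callback_cpu)
{
	assert(this_cpu >= 0);

	/* Assumption: a timer whose callback is running stays on that CPU. */
	if (callback_cpu != NO_CPU)
		return callback_cpu;

	/* Otherwise the pinned timer is queued on the local CPU. */
	return this_cpu;
}

/* True if the timer ends up on the CPU which is going offline. */
static bool lands_on_outgoing(int this_cpu, int callback_cpu, int outgoing_cpu)
{
	return pinned_target_cpu(this_cpu, callback_cpu) == outgoing_cpu;
}
```

With CPU 3 as the outgoing CPU, lands_on_outgoing(3, NO_CPU, 3) covers
case 1 and lands_on_outgoing(0, 3, 3) covers case 2. A call from CPU 0
with no callback running queues the timer on CPU 0 instead.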

So this:

	if (cfs_b->period_active) {
		struct hrtimer_clock_base *clock_base = cfs_b->period_timer.base;
		int cpu = clock_base->cpu_base->cpu;

		if (!cpu_active(cpu) && cpu != smp_processor_id())
			hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED);
		return;
	}

only works if

  1) The code runs _not_ on the outgoing CPU

and

  2) The hrtimer is _not_ concurrently executing the hrtimer callback on
     the outgoing CPU.

     If the callback is executing (it spins on cfs_b->lock), then the
     timer is requeued on the outgoing CPU. Not what you want, right?
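
As a rough illustration, the guarded variant can be modelled in user
space as below. This is a sketch under stated assumptions, not kernel
code: guarded_rearm_result() and guard_works() are hypothetical names,
timer_cpu is assumed to be the outgoing (already inactive) CPU, and a
running callback is assumed to keep the timer on that CPU.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model, not kernel code: the CPU on which the period timer sits
 * after the guarded path has run.
 *
 * timer_cpu:        CPU the timer is queued on, assumed to be the
 *                   outgoing CPU, so !cpu_active(timer_cpu) holds
 * this_cpu:         CPU executing start_cfs_bandwidth()
 * callback_running: the timer callback is executing on timer_cpu
 */
static int guarded_rearm_result(int timer_cpu, int this_cpu,
				bool callback_running)
{
	assert(timer_cpu >= 0 && this_cpu >= 0);

	/* Guard fails (cpu == smp_processor_id()): no rearm, timer stays. */
	if (timer_cpu == this_cpu)
		return timer_cpu;

	/* Assumption: a running callback keeps the timer on its CPU. */
	if (callback_running)
		return timer_cpu;

	/* The pinned rearm queues the timer on the local CPU. */
	return this_cpu;
}

/* The guard helps only if the timer leaves the outgoing CPU. */
static bool guard_works(int timer_cpu, int this_cpu, bool callback_running)
{
	return guarded_rearm_result(timer_cpu, this_cpu,
				    callback_running) != timer_cpu;
}
```

In this model only guard_works(3, 0, false) is true. Running on the
outgoing CPU (3, 3, false) or racing with the callback (3, 0, true)
leaves the timer on CPU 3.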

Plus accessing hrtimer->base->cpu_base->cpu locklessly is fragile at
best.

Thanks,

        tglx


Thread overview: 19+ messages
2023-06-09 11:24 [Question] report a race condition between CPU hotplug state machine and hrtimer 'sched_cfs_period_timer' for cfs bandwidth throttling Xiongfeng Wang
2023-06-09 14:55 ` Thomas Gleixner
2023-06-12 12:49   ` Xiongfeng Wang
2023-06-26  8:23     ` Xiongfeng Wang
2023-06-27 16:46       ` Vincent Guittot
2023-06-28 12:03         ` Thomas Gleixner [this message]
2023-06-28 12:35           ` Vincent Guittot
2023-06-28 22:01             ` Thomas Gleixner
2023-06-29  1:41               ` Xiongfeng Wang
2023-06-29  8:30               ` Vincent Guittot
2023-08-22  8:58                 ` Xiongfeng Wang
2023-08-23 10:14                 ` Thomas Gleixner
2023-08-24  7:25                   ` Yu Liao
2023-08-29  7:18                   ` Vincent Guittot
2023-06-28 13:30         ` Vincent Guittot
2023-06-28 21:09           ` Thomas Gleixner
2023-06-29  1:26         ` Xiongfeng Wang
2023-06-29  8:33           ` Vincent Guittot
2023-08-30 10:29 ` [tip: smp/urgent] cpu/hotplug: Prevent self deadlock on CPU hot-unplug tip-bot2 for Thomas Gleixner
