All of lore.kernel.org
 help / color / mirror / Atom feed
From: Phil Auld <pauld@redhat.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] cpuhp: make target_store() a nop when target == state
Date: Tue, 24 May 2022 12:39:33 -0400	[thread overview]
Message-ID: <Yo0KRVpfhUb8Ta4N@lorien.usersys.redhat.com> (raw)
In-Reply-To: <xhsmh35gzj77s.mognet@vschneid.remote.csb>

On Tue, May 24, 2022 at 04:11:51PM +0100 Valentin Schneider wrote:
> On 23/05/22 10:47, Phil Auld wrote:
> > writing the current state back into hotplug/target calls cpu_down()
> > which will set cpu dying even when it isn't and then nothing will
> > ever clear it. A stress test that reads values and writes them back
> > for all cpu device files in sysfs will trigger the BUG() in
> > select_fallback_rq once all cpus are marked as dying.
> >
> > kernel/cpu.c::target_store()
> > 	...
> >         if (st->state < target)
> >                 ret = cpu_up(dev->id, target);
> >         else
> >                 ret = cpu_down(dev->id, target);
> >
> > cpu_down() -> cpu_set_state()
> > 	 bool bringup = st->state < target;
> > 	 ...
> > 	 if (cpu_dying(cpu) != !bringup)
> > 		set_cpu_dying(cpu, !bringup);
> >
> > Make this safe by catching the case where target == state
> > and bailing early.
> >
> > Signed-off-by: Phil Auld <pauld@redhat.com>
> > ---
> >
> > Yeah, I know... don't do that. But it's still messy.
> >
> > !< != > 
> >
> >  kernel/cpu.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index d0a9aa0b42e8..8a71b1149c60 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -2302,6 +2302,9 @@ static ssize_t target_store(struct device *dev, struct device_attribute *attr,
> >  		return -EINVAL;
> >  #endif
> >  
> > +	if (target == st->state)
> > +		return count;
> > +
> 
> The current checks are against static boundaries, this has to compare
> against st->state - AFAICT this could race with another hotplug operation
> to the same CPU, e.g.
> 
>   CPU42.cpuhp_state
>     ->state  == CPUHP_AP_SCHED_STARTING
>     ->target == CPUHP_ONLINE
> 
>   <write CPUHP_ONLINE via sysfs, OK because current state != CPUHP_ONLINE>
> 
>   CPU42.cpuhp_state == CPUHP_ONLINE
>
>   <issues ensue>
>

What I'm trying to fix is not a race.  It's just bogus logic. 
There is an assumption here that !< means > which is just not
true. 

This potential race seems orthogonal and not even effected
one way or the other by this code change, right?

I could not convince myself that the check I added needed to
be under the locks because returning success when the state
is already reporting what you asked for seems harmless.


> 
> _cpu_up() has:
> 
> 	/*
> 	 * The caller of cpu_up() might have raced with another
> 	 * caller. Nothing to do.
> 	 */
> 	if (st->state >= target)
> 		goto out;
>
> Looks like we want an equivalent in _cpu_down(), what do you think?

Maybe. I still think that

> >         if (st->state < target)
> >                 ret = cpu_up(dev->id, target);
> >         else
> >                 ret = cpu_down(dev->id, target);

is not correct. If we catch the == case earlier then this makes
sense as is.

I suppose "if (st->state <= target)" would work too since __cpu_up()
already checks. Catching this sooner seems better to me though.

> 
> >  	ret = lock_device_hotplug_sysfs();
> >  	if (ret)
> >  		return ret;
> > -- 
> > 2.18.0
> 


Cheers,
Phil

-- 


  reply	other threads:[~2022-05-24 16:39 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-23 14:47 [PATCH] cpuhp: make target_store() a nop when target == state Phil Auld
2022-05-24 15:11 ` Valentin Schneider
2022-05-24 16:39   ` Phil Auld [this message]
2022-05-25  9:48     ` Valentin Schneider
2022-05-25 13:31       ` Phil Auld
2022-05-25 15:09         ` Valentin Schneider
2022-05-25 15:11           ` Phil Auld
2022-05-24 19:37   ` Phil Auld
2022-05-25  9:48     ` Valentin Schneider

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yo0KRVpfhUb8Ta4N@lorien.usersys.redhat.com \
    --to=pauld@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.