From: Nathan Lynch <nathanl@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: brking@linux.ibm.com, linuxppc-dev@lists.ozlabs.org,
linux-kernel@vger.kernel.org, npiggin@gmail.com,
srikar@linux.vnet.ibm.com
Subject: Re: [PATCH] powerpc/smp: poll cpu_callin_map more aggressively in __cpu_up()
Date: Wed, 29 Jun 2022 12:51:13 -0500
Message-ID: <87letfmk8e.fsf@linux.ibm.com>
In-Reply-To: <87wncz3jzu.fsf@mpe.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> Replace the outdated iteration and timeout calculations here with
>> indefinite spin_until_cond()-wrapped poll of cpu_callin_map. __cpu_up()
>> already does this when waiting for the cpu to set its online bit before
>> returning, so this change is not really making the function more brittle.
>
> I'm not sure I agree that this doesn't make the code more brittle.
>
> The existing indefinite wait you mention is later in the function, and
> happens after the CPU has successfully come into the kernel.
>
> I think it's more common that a stuck/borked CPU doesn't come into the
> kernel at all, rather than comes in and then fails to online.
>
> So I think the bail out when the CPU fails to call in is useful, I would
> guess I see that "Processor x is stuck" message multiple times a year
> while debugging various things.

Yeah I can see how my claim is too strong here.

>> Removing the msleep(1) in the hotplug path here reduces the time it takes
>> to online a CPU on a P9 PowerVM LPAR from about 30ms to 1ms when exercised
>> via thaw_secondary_cpus().
>
> That is a nice improvement.
>
> Can we do something that returns quickly in the happy case and still has
> a timeout when things go wrong? Seems like a busy loop with a
> time_after() check would do the trick.

Yes, I'll rework it like that. Thanks.
Thread overview:
2022-01-25 7:21 [PATCH] powerpc/smp: poll cpu_callin_map more aggressively in __cpu_up() Nathan Lynch
2022-06-29 9:19 ` Michael Ellerman
2022-06-29 17:51 ` Nathan Lynch [this message]