From: Nathan Lynch <nathanl@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, Greg Kurz <groug@kaod.org>,
Thiago Jung Bauermann <bauerman@linux.ibm.com>,
Cedric Le Goater <clg@kaod.org>
Subject: Re: [PATCH] powerpc/pseries/hotplug-cpu: increase wait time for vCPU death
Date: Tue, 11 Aug 2020 09:46:08 -0500
Message-ID: <87ft8tz1e7.fsf@linux.ibm.com>
In-Reply-To: <87r1sd9z0d.fsf@mpe.ellerman.id.au>
Michael Ellerman <mpe@ellerman.id.au> writes:
> Michael Roth <mdroth@linux.vnet.ibm.com> writes:
>> Quoting Nathan Lynch (2020-08-07 02:05:09)
> ...
>>> wait_for_cpu_stopped() should be able to accommodate a time-based
>>> warning if necessary, but speaking as a likely recipient of any bug
>>> reports that would arise here, I'm not convinced of the need and I
>>> don't know what a good value would be. It's relatively easy to sample
>>> the stack of a task that's apparently failing to make progress, plus I
>>> probably would use 'perf probe' or similar to report the inputs and
>>> outputs for the RTAS call.
>>
>> I think if we make the timeout sufficiently high, like 2 minutes or so,
>> it wouldn't hurt, and if we did see them it would probably point to an
>> actual bug. But I don't have a strong feeling either way.
>
> I think we should print a warning after 2 minutes.
>
> It's true that there are fairly easy mechanisms to work out where the
> thread is stuck, but customers are unlikely to use them. They're just
> going to report that it's stuck with no further info, and probably
> reboot the machine before we get a chance to get any further info.
>
> Whereas if the kernel prints a warning with a stack trace we at least
> have that to go on in an initial bug report.
>
>>> I'm happy to make this a proper submission after I can clean it up and
>>> retest it, or Michael R. is welcome to appropriate it, assuming it's
>>> acceptable.
>>>
>>
>> I've given it a shot with this patch and it seems to be holding up in
>> testing. If we don't think the ~2 minutes warning message is needed I
>> can clean it up to post:
>>
>> https://github.com/mdroth/linux/commit/354b8c97bf0dc1146e36aa72273f5b33fe90d09e
>>
>> I'd likely break the refactoring patches out to a separate patch under
>> Nathan's name since it fixes a separate bug potentially.
>
> While I like Nathan's refactoring, we probably want to do the minimal
> fix first to ease backporting.
>
> Then do the refactoring on top of that.
Fair enough, thanks.
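
For illustration, the "warn after a timeout but keep waiting" behaviour
discussed above could be sketched as follows. This is a minimal user-space
sketch, not the kernel patch under discussion: wait_until_stopped(),
struct wait_result, countdown_done() and the poll-count threshold are all
hypothetical names, and a poll count stands in for the roughly 2 minute
wall-clock timeout. Kernel code would presumably compare against a
time-based deadline and emit a warning with a stack trace rather than
print to stderr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical predicate: returns true once the target has stopped. */
typedef bool (*stopped_fn)(void *ctx);

/* What the wait loop observed; returned so callers can inspect it. */
struct wait_result {
	unsigned int polls;	/* unsuccessful polls before the stop was seen */
	bool warned;		/* whether the one-time warning was emitted */
};

/*
 * Poll until is_stopped() reports true. The loop never gives up; it only
 * emits a single warning once warn_after_polls unsuccessful polls have
 * accumulated, so a stuck waiter leaves some trace behind.
 */
static struct wait_result wait_until_stopped(stopped_fn is_stopped, void *ctx,
					      unsigned int warn_after_polls)
{
	struct wait_result res = { 0, false };

	assert(is_stopped != NULL);

	while (!is_stopped(ctx)) {
		res.polls++;
		if (!res.warned && res.polls >= warn_after_polls) {
			/* Stand-in for a kernel warning with a stack trace. */
			fprintf(stderr, "still waiting after %u polls\n",
				res.polls);
			res.warned = true;
		}
	}
	return res;
}

/* Demo predicate (assumption): reports "stopped" once *ctx reaches 0. */
static bool countdown_done(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

The warning is deliberately one-shot and does not abort the wait, which
matches the suggestion in the thread that a sufficiently high threshold
should only fire when something is actually wrong.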