linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Michael Ellerman <mpe@ellerman.id.au>
To: "Nathan Lynch" <nathanl@linux.ibm.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Leonardo Brás" <leobras.c@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] Revert "powerpc/rtas: Implement reentrant rtas call"
Date: Tue, 20 Sep 2022 09:45:20 +1000	[thread overview]
Message-ID: <87bkrbkjxr.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <87illjqxpn.fsf@linux.ibm.com>

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> "Nicholas Piggin" <npiggin@gmail.com> writes:
>>> On Wed Sep 14, 2022 at 3:39 AM AEST, Leonardo Brás wrote:
>>>> On Mon, 2022-09-12 at 14:58 -0500, Nathan Lynch wrote:
>>>> > Leonardo Brás <leobras.c@gmail.com> writes:
>>>> > > On Fri, 2022-09-09 at 09:04 -0500, Nathan Lynch wrote:
>>>> > > > Leonardo Brás <leobras.c@gmail.com> writes:
>>>> > > > > On Wed, 2022-09-07 at 17:01 -0500, Nathan Lynch wrote:
>>>> > > > > > At the time this was submitted by Leonardo, I confirmed -- or thought
>>>> > > > > > I had confirmed -- with PowerVM partition firmware development that
>>>> > > > > > the following RTAS functions:
>>>> > > > > > 
>>>> > > > > > - ibm,get-xive
>>>> > > > > > - ibm,int-off
>>>> > > > > > - ibm,int-on
>>>> > > > > > - ibm,set-xive
>>>> > > > > > 
>>>> > > > > > were safe to call on multiple CPUs simultaneously, not only with
>>>> > > > > > respect to themselves as indicated by PAPR, but with arbitrary other
>>>> > > > > > RTAS calls:
>>>> > > > > > 
>>>> > > > > > https://lore.kernel.org/linuxppc-dev/875zcy2v8o.fsf@linux.ibm.com/
>>>> > > > > > 
>>>> > > > > > Recent discussion with firmware development makes it clear that this
>>>> > > > > > is not true, and that the code in commit b664db8e3f97 ("powerpc/rtas:
>>>> > > > > > Implement reentrant rtas call") is unsafe, likely explaining several
>>>> > > > > > strange bugs we've seen in internal testing involving DLPAR and
>>>> > > > > > LPM. These scenarios use ibm,configure-connector, whose internal state
>>>> > > > > > can be corrupted by the concurrent use of the "reentrant" functions,
>>>> > > > > > leading to symptoms like endless busy statuses from RTAS.
>>>> > > > > 
>>>> > > > > Oh, does not it means PowerVM is not compliant to the PAPR specs?
>>>> > > > 
>>>> > > > No, it means the premise of commit b664db8e3f97 ("powerpc/rtas:
>>>> > > > Implement reentrant rtas call") change is incorrect. The "reentrant"
>>>> > > > property described in the spec applies only to the individual RTAS
>>>> > > > functions. The OS can invoke (for example) ibm,set-xive on multiple CPUs
>>>> > > > simultaneously, but it must adhere to the more general requirement to
>>>> > > > serialize with other RTAS functions.
>>>> > > > 
>>>> > > 
>>>> > > I see. Thanks for explaining that part!
>>>> > > I agree: reentrant calls that way don't look as useful on Linux than I
>>>> > > previously thought.
>>>> > > 
>>>> > > OTOH, I think that instead of reverting the change, we could make use of the
>>>> > > correct information and fix the current implementation. (This could help when we
>>>> > > do the same rtas call in multiple cpus)
>>>> > 
>>>> > Hmm I'm happy to be mistaken here, but I doubt we ever really need to do
>>>> > that. I'm not seeing the need.
>>>> > 
>>>> > > I have an idea of a patch to fix this. 
>>>> > > Do you think it would be ok if I sent that, to prospect being an alternative to
>>>> > > this reversion?
>>>> > 
>>>> > It is my preference, and I believe it is more common, to revert to the
>>>> > well-understood prior state, imperfect as it may be. The revert can be
>>>> > backported to -stable and distros while development and review of
>>>> > another approach proceeds.
>>>>
>>>> Ok then, as long as you are aware of the kdump bug, I'm good.
>>>>
>>>> FWIW:
>>>> Reviewed-by: Leonardo Bras <leobras.c@gmail.com>
>>>
>>> A shame. I guess a reader/writer lock would not be much help because
>>> the crash is probably more likely to hit longer running rtas calls?
>>>
>>> Alternative is just cheat and do this...?
>
> [...]
>
>>
>> I wonder - would it be worth making the panic path use a separate
>> "emergency" rtas_args buffer as well? If a CPU is actually "stuck" in
>> RTAS at panic time, then leaving rtas.args untouched might make the
>> ibm,int-off, ibm,set-xive, ibm,os-term, and any other RTAS calls we
>> incur on the panic path more likely to succeed.
>
> Regardless, I request that we proceed with the revert while the crash
> path hardening gets sorted out.

It's in next, I just haven't sent out the mails yet because I'm
disorganised :)

> If I understand the motivation behind
> commit b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call"),
> then it looks like it was incomplete anyway? rtas_os_term() still takes
> the lock when calling ibm,os-term.

The original report was for kdump, which doesn't use rtas_os_term() AFAIK.

cheers

  reply	other threads:[~2022-09-19 23:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-07 22:01 [PATCH] Revert "powerpc/rtas: Implement reentrant rtas call" Nathan Lynch
2022-09-08  7:56 ` Laurent Dufour
     [not found] ` <1d76891ee052112ee1547a4027e358d5cbcac23d.camel@gmail.com>
2022-09-09 14:04   ` Nathan Lynch
2022-09-12 15:22     ` Leonardo Brás
2022-09-12 19:58       ` Nathan Lynch
2022-09-13 17:39         ` Leonardo Brás
2022-09-16  1:31           ` Nicholas Piggin
2022-09-16 21:56             ` Nathan Lynch
2022-09-19 13:51               ` Nathan Lynch
2022-09-19 23:45                 ` Michael Ellerman [this message]
2022-09-20  3:54                 ` Nicholas Piggin
2022-09-21 15:54                   ` Nathan Lynch
2023-04-14 14:20               ` Michal Suchánek
2023-04-17 13:55                 ` Nathan Lynch
2022-09-23 10:57 ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bkrbkjxr.fsf@mpe.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=leobras.c@gmail.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=nathanl@linux.ibm.com \
    --cc=npiggin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).