linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Sam Bobroff <sbobroff@linux.ibm.com>
To: Nathan Lynch <nathanl@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org, Oliver O'Halloran <oohall@gmail.com>
Subject: Re: [PATCH v2 1/2] powerpc/eeh: fix pseries_eeh_configure_bridge()
Date: Wed, 22 Apr 2020 13:30:24 +1000	[thread overview]
Message-ID: <20200422033023.GA19544@osmium> (raw)
In-Reply-To: <874ktc2z3j.fsf@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 3225 bytes --]

On Tue, Apr 21, 2020 at 06:33:36PM -0500, Nathan Lynch wrote:
> Sam Bobroff <sbobroff@linux.ibm.com> writes:
> > If a device is hot unplgged during EEH recovery, it's possible for the
> > RTAS call to ibm,configure-pe in pseries_eeh_configure() to return
> > parameter error (-3), however negative return values are not checked
> > for and this leads to an infinite loop.
> >
> > Fix this by correctly bailing out on negative values.
> >
> > Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
> > ---
> >  arch/powerpc/platforms/pseries/eeh_pseries.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
> > index 893ba3f562c4..c4ef03bec0de 100644
> > --- a/arch/powerpc/platforms/pseries/eeh_pseries.c
> > +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
> > @@ -605,7 +605,7 @@ static int pseries_eeh_configure_bridge(struct eeh_pe *pe)
> >  				config_addr, BUID_HI(pe->phb->buid),
> >  				BUID_LO(pe->phb->buid));
> >  
> > -		if (!ret)
> > +		if (ret <= 0)
> >  			return ret;
> 
> Note that this returns the firmware error value (e.g. -3 parameter
> error) without converting it to a Linux errno. Nothing checks the error
> value of this function as best I can tell, but -EINVAL would be better
> than an implicit -ESRCH here.

Right, it's never used but I agree. I'll change it for v3.

> And while this will behave correctly, the pr_warn() at the end of
> pseries_eeh_configure_bridge() hints that someone had the intention
> that this code should log a message on such an error:
> 
> static int pseries_eeh_configure_bridge(struct eeh_pe *pe)
> {
> 	int config_addr;
> 	int ret;
> 	/* Waiting 0.2s maximum before skipping configuration */
> 	int max_wait = 200;
> 
> 	/* Figure out the PE address */
> 	config_addr = pe->config_addr;
> 	if (pe->addr)
> 		config_addr = pe->addr;
> 
> 	while (max_wait > 0) {
> 		ret = rtas_call(ibm_configure_pe, 3, 1, NULL,
> 				config_addr, BUID_HI(pe->phb->buid),
> 				BUID_LO(pe->phb->buid));
> 
> 		if (!ret)
> 			return ret;
> 
> 		/*
> 		 * If RTAS returns a delay value that's above 100ms, cut it
> 		 * down to 100ms in case firmware made a mistake.  For more
> 		 * on how these delay values work see rtas_busy_delay_time
> 		 */
> 		if (ret > RTAS_EXTENDED_DELAY_MIN+2 &&
> 		    ret <= RTAS_EXTENDED_DELAY_MAX)
> 			ret = RTAS_EXTENDED_DELAY_MIN+2;
> 
> 		max_wait -= rtas_busy_delay_time(ret);
> 
> 		if (max_wait < 0)
> 			break;
> 
> 		rtas_busy_delay(ret);
> 	}
> 
> 	pr_warn("%s: Unable to configure bridge PHB#%x-PE#%x (%d)\n",
> 		__func__, pe->phb->global_number, pe->addr, ret);
> 	return ret;
> }
> 
> So perhaps the error path should be made to break out of the loop
> instead of returning. Or is the parameter error result simply
> uninteresting in this scenario?

Sounds reasonable to me, and given that the only way I know to trigger
the error path (see the commit message) is not going to be common, I
think a message is a good idea. (And, as one of the people likely to
debug a future issue here, I'll probably appreciate it.)

Cheers,
Sam.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2020-04-22  3:32 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-20  5:47 [PATCH v2 0/2] powerpc/eeh: Release EEH device state synchronously Sam Bobroff
2020-04-20  5:47 ` [PATCH v2 1/2] powerpc/eeh: fix pseries_eeh_configure_bridge() Sam Bobroff
2020-04-21 23:33   ` Nathan Lynch
2020-04-22  3:30     ` Sam Bobroff [this message]
2020-04-20  5:47 ` [PATCH v2 2/2] powerpc/eeh: Release EEH device state synchronously Sam Bobroff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200422033023.GA19544@osmium \
    --to=sbobroff@linux.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=nathanl@linux.ibm.com \
    --cc=oohall@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).