public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Dennis Dalessandro
	<dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	jgg-uk2M96/98Pc@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Mike Marciniszyn
	<mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Alex Estrin <alex.estrin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH for-next 16/16] IB/ipoib: Fix for potential no-carrier state
Date: Fri, 26 Jan 2018 12:00:11 -0500	[thread overview]
Message-ID: <1516986011.27592.207.camel@redhat.com> (raw)
In-Reply-To: <20180126143323.6868.89162.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 2950 bytes --]

On Fri, 2018-01-26 at 06:33 -0800, Dennis Dalessandro wrote:
> From: Alex Estrin <alex.estrin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> 
> On reboot SM can program port pkey table before ipoib registered its
> event handler, which could result in missing pkey event and leave root
> interface with initial pkey value from index 0.
> 
> Since OPA port starts with invalid pkey in index 0, root interface will
> fail to initialize and stay down with no-carrier flag.
> 
> For IB ipoib interface may end up with pkey different from value
> opensm put in pkey table idx 0, resulting in connectivity issues
> (different mcast groups, for example).
> 
> Close the window by calling event handler after registration
> to make sure ipoib pkey is in sync with port pkey table.
> 
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Alex Estrin <alex.estrin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_main.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index 5930c7d..161ba8c 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -2306,6 +2306,9 @@ void ipoib_set_dev_features(struct ipoib_dev_priv *priv, struct ib_device *hca)
>  			      priv->ca, ipoib_event);
>  	ib_register_event_handler(&priv->event_handler);
>  
> +	/* call event handler to ensure pkey in sync */
> +	queue_work(ipoib_workqueue, &priv->flush_heavy);
> +

This seems like a bit of a sledgehammer to the issue.  Looking through
ipoib_add_port(), the real race is that we have to call ib_query_pkey()
early in the init sequence as some of the later steps need it to be set
(ipoib_dev_init() must have it already set for one), but since we don't
setup our event handler until after we've finished setting up the
device, there is that window from our first ib_query_pkey call until we
complete the ib_register_event_handler() call for the pkey to change. 
Instead of throwing the flush regardless, it might be nicer to do:

	{
		u16 new_pkey;

		ib_query_pkey(hca, port, 0, &new_pkey);
		if (priv->pkey != (new_pkey | 0x8000))
			/* The pkey changed between when we
			 * read it and now, flush the device
			 */
			queue_work(ipoib_workqueue, &priv->flush_heavy);
	}

>  	result = register_netdev(priv->dev);
>  	if (result) {
>  		pr_warn("%s: couldn't register ipoib port %d; error %d\n",
> 

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2018-01-26 17:00 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-26 14:31 [PATCH for-next 00/16] IB/hfi1,core: Driver updates for 1/26/2018 Dennis Dalessandro
     [not found] ` <20180126142640.6868.12402.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2018-01-26 14:31   ` [PATCH for-next 01/16] IB/hfi1: Do not override given pcie_pset value Dennis Dalessandro
2018-01-26 14:31   ` [PATCH for-next 02/16] IB/hfi1: Fix for early release of sdma context Dennis Dalessandro
2018-01-26 14:31   ` [PATCH for-next 03/16] IB/hfi1: Remove dependence on qp->s_hdrwords Dennis Dalessandro
2018-01-26 14:31   ` [PATCH for-next 04/16] IB/hfi1: Remove blind constants from 16B update Dennis Dalessandro
2018-01-26 14:31   ` [PATCH for-next 05/16] IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times Dennis Dalessandro
2018-01-26 14:32   ` [PATCH for-next 06/16] IB/hfi1: hfi1_open_file() missing kobject_put in err path Dennis Dalessandro
     [not found]     ` <20180126143203.6868.85360.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2018-01-26 15:51       ` Jason Gunthorpe
2018-01-26 14:32   ` [PATCH for-next 07/16] IB/hfi1: Show fault stats in both TX and RX directions Dennis Dalessandro
2018-01-26 14:32   ` [PATCH for-next 08/16] IB/hfi1: Prevent LNI hang when LCB can't obtain lanes Dennis Dalessandro
2018-01-26 14:32   ` [PATCH for-next 09/16] IB/hfi1: Compute BTH only for RDMA_WRITE_LAST/SEND_LAST packet Dennis Dalessandro
2018-01-26 14:32   ` [PATCH for-next 10/16] IB/hfi1: Optimize packet type comparison using 9B and bypass code paths Dennis Dalessandro
2018-01-26 14:32   ` [PATCH for-next 11/16] IB/hfi1: Look up ibport using a pointer in receive path Dennis Dalessandro
2018-01-26 14:32   ` [PATCH for-next 12/16] IB/hfi1: Remove unnecessary fecn and becn fields Dennis Dalessandro
2018-01-26 14:33   ` [PATCH for-next 13/16] IB/hfi1: Optimize process_receive_ib() Dennis Dalessandro
2018-01-26 14:33   ` [PATCH for-next 14/16] IB/hfi1: Re-order IRQ cleanup to address driver cleanup race Dennis Dalessandro
2018-01-26 14:33   ` [PATCH for-next 15/16] IB/core: Map iWarp AH type to undefined in rdma_ah_find_type Dennis Dalessandro
     [not found]     ` <20180126143315.6868.67354.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2018-01-26 15:45       ` Jason Gunthorpe
     [not found]         ` <20180126154540.GA23869-uk2M96/98Pc@public.gmane.org>
2018-01-26 15:58           ` Dennis Dalessandro
     [not found]             ` <ce815670-cf1b-5199-9a52-2779cb8f8c75-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2018-01-26 17:47               ` Parav Pandit
2018-01-26 18:55       ` Shiraz Saleem
2018-01-26 14:33   ` [PATCH for-next 16/16] IB/ipoib: Fix for potential no-carrier state Dennis Dalessandro
     [not found]     ` <20180126143323.6868.89162.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2018-01-26 17:00       ` Doug Ledford [this message]
     [not found]         ` <1516986011.27592.207.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2018-01-26 17:06           ` Jason Gunthorpe
     [not found]             ` <20180126170610.GD23869-uk2M96/98Pc@public.gmane.org>
2018-01-26 17:09               ` Doug Ledford
     [not found]                 ` <1516986596.27592.209.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2018-01-26 17:27                   ` Estrin, Alex
2018-01-26 17:18           ` Estrin, Alex

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1516986011.27592.207.camel@redhat.com \
    --to=dledford-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=alex.estrin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=jgg-uk2M96/98Pc@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox