linux-pci.vger.kernel.org archive mirror
From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
To: Doug Ledford <dledford@redhat.com>
Cc: Tadeusz Struk <tadeusz.struk@intel.com>,
	linux-rdma@vger.kernel.org, linux-pci@vger.kernel.org,
	dennis.dalessandro@intel.com, ira.weiny@intel.com
Subject: Re: [PATCH] [RFC] IB/hfi1: Fix port ordering issue in a multiport device
Date: Thu, 19 Jan 2017 10:58:02 -0700
Message-ID: <20170119175802.GB8109@obsidianresearch.com>
In-Reply-To: <1484784989.2406.67.camel@redhat.com>

On Wed, Jan 18, 2017 at 07:16:29PM -0500, Doug Ledford wrote:

> > This is a 'stable device naming' problem, which we have never
> > tried to solve in RDMA.
> 
> No, that would imply they must be hfi1_0 and hfi1_1, when this is not
> the case.  If you had two cards in this system, both dual port, then
> you may be wanting to rename hfi_3 to hfi1_2 and vice versa.  It is a
> relative name problem, not a stable name problem.  We need only reverse
> the order that the ports are probed in, their names are whatever they
> end up being once reversed.

Eh? If *users* expect RDMA names to be meaningful/stable then it *IS*
the stable naming problem. We have never guaranteed stable device
names in RDMA, but it does happen to work out by luck in many cases.

Linux does not have a guarantee of PCI driver bind order. For instance
the parallel probe patch series randomizes driver bind order, so any
driver relying on this for 'stable names' is broken.

> > udev is the expected kernel way to solve this. Trying to hack stable
> > names by forcing device bind order is horrible.
> 
> This is a manufacturing defect.  Something I'm sure Intel wants to
> resolve without requiring users to go in and manually name their ports.
>  I have no doubt that they would prefer that the user remain blissfully
> unaware of the issue, all except for the ones that probably reported it
> and already have their system cabled up wrong as a result.

Modern udev models do not require manual naming by users; look at what
netdev is doing to solve this problem these days.

hfi_slot#_port# can be generated automatically by udev based on
information from the driver and the BIOS. This is what is being done
for netdev.
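
To illustrate the difference, here is a rough userspace sketch (Python,
not kernel or udev code). The Port type, stable_name() and
probe_order_names() are made-up names for this example, and the slot
and port numbers are assumed to be handed in by whatever reads them
from the driver and the BIOS:

```python
from typing import NamedTuple


class Port(NamedTuple):
    slot: int  # physical slot number (assumed to come from firmware/BIOS)
    port: int  # port index on the card (assumed to come from the driver)


def stable_name(p: Port) -> str:
    """Build a name from physical location only, never from probe order."""
    return f"hfi_slot{p.slot}_port{p.port}"


def probe_order_names(ports):
    """Counter-style names: the index is simply the position in probe order."""
    return {p: f"hfi1_{i}" for i, p in enumerate(ports)}


ports = [Port(slot=3, port=0), Port(slot=3, port=1)]

# Same hardware, probed in two different orders.
forward = probe_order_names(ports)
reverse = probe_order_names(list(reversed(ports)))
```

With counter-style names the same physical port ends up with a
different name depending on probe order, while stable_name() returns
the same string either way.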

That is where we really need to go as well.

As you say, this is an oops on Intel's part, so that may be too long
term - so they should solve this temporarily and imperfectly *in their
driver* by assigning RDMA device names manually, e.g. make it so that
hfiX has X be even for port 0 and X be odd for port 1.
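
A minimal sketch of that even/odd rule, again as plain Python rather
than driver code. alloc_unit() and the `used` set are hypothetical, and
the sketch assumes the driver already knows which port (0 or 1) it is
probing:

```python
def alloc_unit(used: set, port: int) -> int:
    """Return the lowest free unit X with X % 2 == port and mark it used."""
    if port not in (0, 1):
        raise ValueError("sketch only handles two-port devices")
    x = port  # port 0 starts at 0 (even), port 1 starts at 1 (odd)
    while x in used:
        x += 2  # stepping by two keeps the parity fixed
    used.add(x)
    return x


used = set()
first = alloc_unit(used, 1)   # port 1 happens to probe first
second = alloc_unit(used, 0)  # port 0 probes afterwards
```

Here port 1 gets hfi1 and port 0 gets hfi0 even though port 1 probed
first. It is imperfect: with several cards probing interleaved, nothing
in this sketch ties X and X+1 to the same card.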

Never any need for any kind of deferred binding approach.

> No, mlx5 could have easily hit this too as their ports are separate
> PCI functions.

Sounds like Intel and Mellanox should collaborate on getting udev
stable naming working right for RDMA... They are eventually going to
get burned.

Jason
