From: Potnuri Bharat Teja <bharat@chelsio.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"BMT@zurich.ibm.com" <BMT@zurich.ibm.com>,
	"monis@mellanox.com" <monis@mellanox.com>,
	Nirranjan Kirubaharan <nirranjan@chelsio.com>
Subject: Re: User SIW fails matching device
Date: Fri, 12 Jul 2019 20:54:20 +0530
Message-ID: <20190712152418.GA16331@chelsio.com>
In-Reply-To: <20190712143546.GD27512@ziepe.ca>

On Friday, July 12, 2019 at 20:05:46 +0530, Jason Gunthorpe wrote:
> On Fri, Jul 12, 2019 at 07:57:19PM +0530, Potnuri Bharat Teja wrote:
> > Hi all,
> > I observe the following behavior on one of my machines configured for siw.
> > 
> > Issue:
> > SIW device gets wrong device ops (HW/real rdma driver device ops) instead of
> > siw device ops due to improper device matching.
> > 
> > Root-cause:
> > In libibverbs, during user cma initialisation, for each entry in the driver
> > list, the sysfs device is checked for a matching name or device.
> > If the siw/rxe driver is at the head of the list, the sysfs device matches
> > the corresponding siw/rxe driver and gets the corresponding siw/rxe
> > device ops. However, if the siw/rxe driver comes after the real HW driver
> > (cxgb4/mlx5 respectively) in the driver list, the siw sysfs device matches
> > the pci device and wrongly gets the device ops of the HW driver (cxgb4/mlx5).
> > 
> > Below are debug prints from verbs_register_driver() and the driver_list
> > entries, where siw comes after cxgb4. I see verbs alloc context landing in
> > cxgb4_alloc_context instead of siw_alloc_context, thus breaking user siw.
> > 
> > <debug> verbs_register_driver_22: 184: driver 0x176e370
> > <debug> verbs_register_driver_22: 185: name ipathverbs
> > <debug> verbs_register_driver_22: 184: driver 0x176f6a0
> > <debug> verbs_register_driver_22: 185: name cxgb4
> > <debug> verbs_register_driver_22: 184: driver 0x176fd50
> > <debug> verbs_register_driver_22: 185: name cxgb3
> > <debug> verbs_register_driver_22: 184: driver 0x1777020
> > <debug> verbs_register_driver_22: 185: name rxe
> > <debug> verbs_register_driver_22: 184: driver 0x1770a30
> > <debug> verbs_register_driver_22: 185: name siw
> > <debug> verbs_register_driver_22: 184: driver 0x1771120
> > <debug> verbs_register_driver_22: 185: name mlx4
> > <debug> verbs_register_driver_22: 184: driver 0x1771990
> > <debug> verbs_register_driver_22: 185: name mlx5
> > <debug> verbs_register_driver_22: 184: driver 0x1771ff0
> > <debug> verbs_register_driver_22: 185: name efa
> > 
> > <debug> try_drivers: 372: driver 0x176e370, sysfs_dev 0x1776b20, name: ipathverbs
> > <debug> try_drivers: 372: driver 0x176f6a0, sysfs_dev 0x1776b20, name: cxgb4
> > <debug> try_drivers: 372: driver 0x176fd50, sysfs_dev 0x1776b20, name: cxgb3
> > <debug> try_drivers: 372: driver 0x1777020, sysfs_dev 0x1776b20, name: rxe
> > <debug> try_drivers: 372: driver 0x1770a30, sysfs_dev 0x1776b20, name: siw
> > <debug> try_drivers: 372: driver 0x1771120, sysfs_dev 0x1776b20, name: mlx4
> > <debug> try_drivers: 372: driver 0x1771990, sysfs_dev 0x1776b20, name: mlx5
> > <debug> try_drivers: 372: driver 0x1771ff0, sysfs_dev 0x1776b20, name: efa
> > 
> > Proposed fix:
> > I have the below fix that works. It adds siw/rxe driver to the HEAD of the 
> > driver list and the rest to the tail. I am not sure if this fix is the ideal 
> > one, so I am attaching it to this mail.
> 
> Update your rdma-core to the latest and this will be fixed fully by using
> netlink to match the siw device.
> 
I pulled the latest rdma-core and still see the issue.

commit 7ef6077ec3201f661458297fea776746ba752843 (HEAD, upstream/master)
Merge: 837954ff677c 95934b61a74e
Author: Jason Gunthorpe <jgg@mellanox.com>
Date:   Thu Jul 11 16:18:06 2019 -0300

    Merge pull request #539 from jgunthorpe/netlink

        Use netlink to learn about ibdevs and their related chardevs

-----------

Is there any corresponding kernel change or package dependency? I am currently 
on Doug's wip/dl-for-next branch.
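
To make the ordering problem from the root cause concrete, here is a minimal
sketch of first-match driver selection. It is not the libibverbs code: the
sketch_* types and functions, the device name "siw_eth0" and the "nic-pci-id"
string are all hypothetical, and the matching rules are simplified to a
name-prefix check and a PCI-identity check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a sysfs device; not the real libibverbs type. */
struct sketch_dev {
	const char *name;   /* ibdev name, e.g. "siw_eth0" (made up) */
	const char *pci_id; /* PCI identity seen via the underlying NIC, or NULL */
};

/* Hypothetical stand-in for a provider driver and its match rules. */
struct sketch_driver {
	const char *name;        /* provider name, e.g. "siw" or "cxgb4" */
	const char *name_prefix; /* match on device-name prefix, or NULL */
	const char *pci_id;      /* match on PCI identity, or NULL */
};

/* Return 1 if the driver claims the device by name or by PCI identity. */
static int sketch_match(const struct sketch_driver *drv,
			const struct sketch_dev *dev)
{
	if (drv->name_prefix &&
	    strncmp(dev->name, drv->name_prefix, strlen(drv->name_prefix)) == 0)
		return 1;
	if (drv->pci_id && dev->pci_id && strcmp(drv->pci_id, dev->pci_id) == 0)
		return 1;
	return 0;
}

/*
 * Walk the driver list in order and return the name of the first driver
 * that matches, or NULL if none does. Because the first match wins, the
 * result depends on list order whenever two drivers can claim one device.
 */
static const char *sketch_try_drivers(const struct sketch_driver *list,
				      size_t n, const struct sketch_dev *dev)
{
	assert(list != NULL && dev != NULL);
	for (size_t i = 0; i < n; i++)
		if (sketch_match(&list[i], dev))
			return list[i].name;
	return NULL;
}
```

With a device { "siw_eth0", "nic-pci-id" }, a list ordered { cxgb4, siw } makes
sketch_try_drivers() return "cxgb4", because the PCI rule of the HW driver
matches first. The same list ordered { siw, cxgb4 } returns "siw". That is the
ordering dependence the head-of-list fix works around.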
