From: Leon Romanovsky <leon@kernel.org>
To: Bernard Metzler <BMT@zurich.ibm.com>
Cc: Yi Zhang <yi.zhang@redhat.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [bug report]concurrent blktests nvme-rdma execution lead kernel null pointer
Date: Mon, 6 Dec 2021 15:13:51 +0200
Message-ID: <Ya4MjzhZJi//VRo6@unreal>
In-Reply-To: <BYAPR15MB26317A0F809FDAB6BC739937996D9@BYAPR15MB2631.namprd15.prod.outlook.com>

On Mon, Dec 06, 2021 at 11:10:52AM +0000, Bernard Metzler wrote:
> > -----Original Message-----
> > From: Leon Romanovsky <leon@kernel.org>
> > Sent: Sunday, 5 December 2021 12:47
> > To: Bernard Metzler <BMT@zurich.ibm.com>
> > Cc: Yi Zhang <yi.zhang@redhat.com>; RDMA mailing list <linux-rdma@vger.kernel.org>
> > Subject: [EXTERNAL] Re: [bug report]concurrent blktests nvme-rdma
> > execution lead kernel null pointer
> > 
> > On Fri, Dec 03, 2021 at 11:27:22AM +0000, Bernard Metzler wrote:
> > > -----"Yi Zhang" <yi.zhang@redhat.com> wrote: -----
> > >
> > > >To: "RDMA mailing list" <linux-rdma@vger.kernel.org>
> > > >From: "Yi Zhang" <yi.zhang@redhat.com>
> > > >Date: 12/03/2021 03:20AM
> > > >Subject: [EXTERNAL] [bug report]concurrent blktests nvme-rdma
> > > >execution lead kernel null pointer
> > > >
> > > >Hello
> > > > >Concurrent blktests nvme-rdma execution with both rdma_rxe and siw
> > > > >leads to a kernel BUG on 5.16.0-rc3. Please help check it, thanks.
> > > >
> > >
> > > > The RDMA core currently does not prevent us from assigning both siw
> > > > and rxe to the same netdev. I think this is what is happening here.
> > > > This setup makes no sense, but it is obviously not prohibited by the
> > > > RDMA infrastructure. Behavior is undefined, and a kernel panic is not
> > > > unexpected. Shall we prevent the privileged user from doing this type
> > > > of experiment?
> > >
> > > A related question: should we also explicitly refuse to add software
> > > RDMA drivers to netdevs with RDMA hardware active?
> > > > While stupid and resulting in undefined behavior, this is currently
> > > > possible as well.
> > 
> > > In old soft-RoCE manuals, I saw a request to unload the mlx4_ib/mlx5_ib
> > > modules before configuring RXE. This effectively "prevented" running
> > > with "RDMA hardware active".
> > 
> Right. Same for 'siw over Chelsio T5/6' etc: first unload the iw_cxgb4
> driver, which implements the iWARP protocol, before attaching siw to
> the network interface. But shouldn't the kernel just refuse to let two
> instances of the _same_ ULP (e.g., one hardware iWARP, one software
> iWARP) be attached to the same netdev, potentially sharing IP
> address and port space?

I think that users will get different rdma-cm IDs for real HW and SW devices.
rdma_getaddrinfo() should help here.

> 
> > So I'm not surprised that it doesn't work, but why do you think that this
> > behavior is stupid? RXE/SIW can be seen as ULPs, and as such it is ok to run
> > many ULPs on the same netdev.
> 
> Hmm, from an rdma_cm perspective, I am not sure that two RDMA
> providers sharing the same device and IP address is supported.
> Without recreating it or looking into the code, I expect Yi's
> null pointer issue is caused by this unsupported setup. If it is
> unsupported, it should be impossible to set up.

I agree with you that it is the best solution here, just because it is
good enough for RXE/SIW.
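
Until the kernel refuses such a setup, a user-space check can at least
flag it. The following is only a rough sketch, not kernel code and not a
proposed fix. It assumes the line format printed by `rdma link show`
(iproute2), i.e. "link <dev>/<port> ... netdev <ifname>"; the function
name and the sample text are hypothetical and made up for illustration.

```python
from collections import defaultdict


def netdevs_with_multiple_rdma_links(rdma_link_output):
    """Return {netdev: [rdma devices]} for netdevs with more than one RDMA device.

    Assumes input lines shaped like the output of `rdma link show`, e.g.:
        link rxe0/1 state ACTIVE physical_state LINK_UP netdev eth0
    Lines that do not match this shape are skipped.
    """
    by_netdev = defaultdict(list)
    for line in rdma_link_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "link" or "netdev" not in fields:
            continue
        idx = fields.index("netdev")
        if idx + 1 >= len(fields):
            continue  # "netdev" keyword without an interface name
        rdma_dev = fields[1].split("/")[0]  # strip the "/<port>" suffix
        netdev = fields[idx + 1]
        if rdma_dev not in by_netdev[netdev]:
            by_netdev[netdev].append(rdma_dev)
    # Keep only netdevs that carry two or more RDMA devices.
    return {n: devs for n, devs in by_netdev.items() if len(devs) > 1}


# Hypothetical sample: rxe0 and siw0 both attached to eth0.
SAMPLE = """\
link rxe0/1 state ACTIVE physical_state LINK_UP netdev eth0
link siw0/1 state ACTIVE physical_state LINK_UP netdev eth0
link siw1/1 state ACTIVE physical_state LINK_UP netdev eth1
"""

print(netdevs_with_multiple_rdma_links(SAMPLE))
```

With the sample above, only eth0 is reported, since it carries both
rxe0 and siw0. This detects the configuration after the fact; it does
not prevent it.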

Thanks

> 
> Thanks,
> Bernard.

Thread overview: 5+ messages
2021-12-03  2:20 [bug report]concurrent blktests nvme-rdma execution lead kernel null pointer Yi Zhang
2021-12-03 11:27 ` Bernard Metzler
2021-12-05 11:47   ` Leon Romanovsky
2021-12-06 11:10     ` Bernard Metzler
2021-12-06 13:13       ` Leon Romanovsky [this message]