From: Leon Romanovsky <leon@kernel.org>
To: yanjun.zhu@linux.dev
Cc: Leon Romanovsky <leo@kernel.org>,
linux-rdma@vger.kernel.org, jgg@nvidia.com
Subject: Re: [PATCH] rdma: not display the rdma link in other net namespace
Date: Tue, 11 Oct 2022 12:49:37 +0300
Message-ID: <Y0U8McWLRJRTKqQ/@unreal>
In-Reply-To: <1c6c286460ac6450d1ae7a93efd4c062@linux.dev>
On Sun, Oct 09, 2022 at 10:20:53AM +0000, yanjun.zhu@linux.dev wrote:
> September 28, 2022 2:04 PM, "Leon Romanovsky" <leon@kernel.org> wrote:
>
> > On Tue, Sep 27, 2022 at 06:58:50PM +0800, Yanjun Zhu wrote:
> >
> >> On 2022/9/27 18:34, Leon Romanovsky wrote:
> >> On Sun, Sep 25, 2022 at 10:40:33PM -0400, yanjun.zhu@linux.dev wrote:
> >>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
> >>>
> >>> When the net devices are moved to another net namespace, the command
> >>> "rdma link" should not display the rdma link about this net device.
> >>>
> >>> For example, when the net device eno12399 is moved to net namespace net0
> >>> from init_net, the rdma link of eno12399 should not display in init_net.
> >>>
> >>> Before this change:
> >>>
> >>> Init_net:
> >>>
> >>> link roceo12399/1 state DOWN physical_state DISABLED <---should not display
> >>> link roceo12409/1 state DOWN physical_state DISABLED netdev eno12409
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED netdev ens7f0
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP netdev ens7f1
> >>>
> >>> net0:
> >>>
> >>> link roceo12399/1 state DOWN physical_state DISABLED netdev eno12399
> >>> link roceo12409/1 state DOWN physical_state DISABLED <---should not display
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED <---should not display
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP <---should not display
> >>>
> >>> After this change
> >>>
> >>> Init_net:
> >>>
> >>> link roceo12409/1 state DOWN physical_state DISABLED netdev eno12409
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED netdev ens7f0
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP netdev ens7f1
> >>>
> >>> net0:
> >>>
> >>> link roceo12399/1 state DOWN physical_state DISABLED netdev eno12399
> >>>
> >>> Fixes: da990ab40a92 ("rdma: Add link object")
> >>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> >>> ---
> >>> rdma/link.c | 3 +++
> >>> 1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/rdma/link.c b/rdma/link.c
> >>> index bf24b849..449a7636 100644
> >>> --- a/rdma/link.c
> >>> +++ b/rdma/link.c
> >>> @@ -238,6 +238,9 @@ static int link_parse_cb(const struct nlmsghdr *nlh, void *data)
> >>>  		return MNL_CB_ERROR;
> >>>  	}
> >>> +	if (!tb[RDMA_NLDEV_ATTR_NDEV_NAME] || !tb[RDMA_NLDEV_ATTR_NDEV_INDEX])
> >>> +		return MNL_CB_OK;
> >>> +
> >> Regarding your question where it should go in addition to RDMA, the answer
> >> is netdev ML. The rdmatool is part of iproute2 and the relevant maintainers
> >> should be CCed.
> >> Thanks. I will also send it to netdev ML and CC the maintainers.
> >>
> >> Regarding the change, I don't think that it is right. User space tool is
> >> a simple viewer of data returned from the kernel. It is not a mistake to
> >> return device without netdev.
> >>
> >> Normally an rdma link based on RoCEv2 should be attached to a NIC. This NIC
> >> device sends and receives UDP packets. With Mellanox/Intel NICs, the net
> >> device also does more work than sending and receiving packets.
> >>
> >> From this perspective, an rdma link depends on a net device.
> >>
> >> In this problem, the net device is moved to another net namespace, so it
> >> cannot be obtained, and the rdma link cannot work in this net namespace
> >> either.
> >>
> >> So this rdma link should not appear in this net namespace; otherwise it
> >> would confuse the user.
> >>
> >> In fact, net namespace is a concept of the TCP/IP stack, and it does not
> >> exist in the rdma stack.
> >
> > RDMA has two different net namespace modes: shared and exclusive.
> >
> > In shared mode, the IB devices are shared across all net namespaces, and
> > "moving" a net device into a different namespace just "hides" it but
> > doesn't disconnect it.
>
> Hi, Leon
>
> About the RDMA shared and exclusive modes, I am confused about this scenario:
>
> In shared mode, IB device A is in net namespace A1 while net device B is in net namespace B1.
> IB device A depends on net device B. How can tests be run in this scenario?
> Both rping and perftest need an IP address to work, but the IP address is in net namespace B1 while
> IB device A is in net namespace A1.
>
> Does this scenario exist in production environments?
Yes and no at the same time.

Yes:
The whole net namespace support is needed for containers. In old
versions of rdma-core, libibverbs relied on the /sys/class/infiniband/
structure. This is why we need "shared" mode, where IB exists without
relation to netdev.

No:
Like you said, it won't work for RoCE and iWARP.

Thanks
>
> Thanks and Regards,
> Zhu Yanjun
>
> >
> > See comments around various usages of ib_devices_shared_netns variable.
> >
> > Thanks
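
For illustration only, the disagreement in this thread can be modelled in a few
lines of C. This is a minimal sketch, not rdmatool code: struct link_entry,
show_links() and init_net_links are hypothetical names, and the sample data
mirrors the init_net listing from the patch description. With
hide_without_netdev set, the loop behaves like the proposed filter; without it,
it acts as a plain viewer of whatever is reported, which is the situation in
shared mode once a netdev has been moved to another namespace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of one link entry as seen by a viewer. */
struct link_entry {
	const char *name;   /* e.g. "roceo12399/1" */
	const char *netdev; /* NULL when no netdev is reported for the link */
};

/* Sample data mirroring the init_net listing in the patch description. */
static const struct link_entry init_net_links[] = {
	{ "roceo12399/1",   NULL       }, /* netdev moved to another netns */
	{ "roceo12409/1",   "eno12409" },
	{ "rocep202s0f0/1", "ens7f0"   },
	{ "rocep202s0f1/1", "ens7f1"   },
};

/*
 * Print the links and return how many were printed.
 * hide_without_netdev == true models the proposed filter;
 * false models a viewer that shows everything it is given.
 */
static size_t show_links(const struct link_entry *links, size_t n,
			 bool hide_without_netdev)
{
	size_t shown = 0;

	for (size_t i = 0; i < n; i++) {
		if (hide_without_netdev && !links[i].netdev)
			continue;
		printf("link %s", links[i].name);
		if (links[i].netdev)
			printf(" netdev %s", links[i].netdev);
		printf("\n");
		shown++;
	}
	return shown;
}
```

Calling show_links(init_net_links, 4, false) prints all four links, while
show_links(init_net_links, 4, true) skips roceo12399/1, the entry without a
netdev.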