From: Dust Li <dust.li@linux.alibaba.com>
To: Yanjun Zhu <yanjun.zhu@linux.dev>,
Zhu Yanjun <yanjun.zhu@intel.com>,
jgg@ziepe.ca, leon@kernel.org, linux-rdma@vger.kernel.org
Subject: Re: [PATCH 0/3] RDMA net namespace
Date: Mon, 24 Oct 2022 22:35:21 +0800
Message-ID: <20221024143521.GG63658@linux.alibaba.com>
In-Reply-To: <662d6804-0e16-117e-d4a4-9abd4a2e8c75@linux.dev>
On Mon, Oct 24, 2022 at 09:12:56PM +0800, Yanjun Zhu wrote:
>
>On 2022/10/24 19:52, Dust Li wrote:
>> On Mon, Oct 24, 2022 at 06:15:01AM +0000, yanjun.zhu@linux.dev wrote:
>> > October 24, 2022 9:10 AM, "Dust Li" <dust.li@linux.alibaba.com> wrote:
>> >
>> > > On Sun, Oct 23, 2022 at 06:04:47PM -0400, Zhu Yanjun wrote:
>> > >
>> > > > From: Zhu Yanjun <yanjun.zhu@linux.dev>
>> > > >
>> > > > There are shared and exclusive modes in RDMA net namespace. After
>> > > > discussion with Leon, the above modes are compatible with legacy IB
>> > > > devices.
>> > > >
>> > > > For RoCE and iWARP devices, the ib devices should be in the same net
>> > > > namespace as the related net devices, regardless of shared or
>> > > > exclusive mode.
>> > > Does this mean that shared mode is no longer supported for RoCE and iWARP
>> > > devices?
>> > From the discussion, RoCE and iWARP devices should keep their ib devices
>> > and net devices in the same net namespace, so they have no
>> > shared/exclusive modes. Those modes are for legacy ib devices, such as ipoib.
>> >
>> > In this patch series, shared/exclusive modes are left for legacy ib devices.
>> > For RoCE and iWARP devices, we just keep net devices and ib devices in the
>> > same net namespace.
>> I think this may limit the use cases of RoCE and iWARP.
>>
>> Consider the following use case:
>> In a container environment, we may have lots of containers on a host,
>> for example, more than 100. We don't have that many VFs, so we use
>> ipvlan or other virtual network devices, one per container, and move
>> each virtual network device into its container's net namespace.
>> Since all those containers share one physical network device,
>> there is only one RoCE device. If shared mode is not supported, we
>> cannot enable RDMA with RoCE for those containers at all.
>
>You use ipvlan or other virtual network devices for each container.
>
>In these containers, you also use RDMA, correct?
>
>Since all the packets for these virtual network devices finally reach
>the physical network device, it should work without shared/exclusive modes.
>
>So we do not consider the shared/exclusive modes.
For the netdevice, that's true. But for RDMA, the ib device would no
longer be visible inside the containers, so I think we cannot create
QPs/CQs there, and RDMA is not available to these containers in this case.
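
To make the concern concrete, here is a rough sketch of the setup I have
in mind. The parent netdev name (eth0), the namespace names (c1, c2) and
the run/setup_container/check_rdma helpers are placeholders made up for
illustration; they are not part of the patch series. By default the script
only prints the commands (DRY_RUN=1); running them for real needs root and
a RoCE-capable port.

```shell
#!/bin/sh
# Sketch: one RoCE device shared by several ipvlan-backed containers.
# PARENT and the namespace names are hypothetical placeholders.
# With DRY_RUN=1 (the default here) commands are printed, not executed.

PARENT=${PARENT:-eth0}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_container() {
    ns=$1
    run ip netns add "$ns"
    # ipvlan slave on top of the single physical port
    run ip link add link "$PARENT" name "ipvl_$ns" type ipvlan mode l2
    run ip link set "ipvl_$ns" netns "$ns"
}

check_rdma() {
    ns=$1
    # List the rdma links that are visible from inside the namespace.
    run ip netns exec "$ns" rdma link show
}

# Select the mode before creating any non-init net namespace.
run rdma system show
run rdma system set netns shared

for ns in c1 c2; do
    setup_container "$ns"
    check_rdma "$ns"
done
```

If the ib device only follows the physical netdev, I would expect the
"rdma link show" inside c1 and c2 to list nothing, which is the case I am
worried about.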
Thanks
>
>Zhu Yanjun
>
>>
>> I don't know of any other way to solve this; maybe I missed something?
>>
>> Thanks
>>
>> >
>> >
>> > > > In the first commit, when the net devices are moved to a new net
>> > > > namespace, the related ib devices are also moved to the same net
>> > > > namespace.
>> > > >
>> > > > In the second commit, the shared/exclusive modes still work with legacy
>> > > > ib devices. For RoCE and iWARP devices, these modes are not
>> > > > considered.
>> > > >
>> > > > Because MLX4/5 do not call ib_device_set_netdev() to map ib devices to
>> > > > the related net devices, ib_device_get_by_netdev() cannot get ib
>> > > > devices from net devices. In the third commit, all the registered ib
>> > > > devices are walked to get their net devices, which are then compared
>> > > > with the given net devices.
>> > > >
>> > > > Test steps:
>> > > > 1) Create a new net namespace net0
>> > > >
>> > > > ip netns add net0
>> > > >
>> > > > 2) Show the rdma links in init_net
>> > > >
>> > > > rdma link
>> > > >
>> > > > "
>> > > > link mlx5_0/1 state DOWN physical_state DISABLED netdev enp7s0np1
>> > > > "
>> > > >
>> > > > 3) Move the net device to net namespace net0
>> > > >
>> > > > ip link set enp7s0np1 netns net0
>> > > >
>> > > > 4) Show the rdma links in init_net again
>> > > >
>> > > > rdma link
>> > > >
>> > > > There are no rdma links
>> > > >
>> > > > 5) Show the rdma links in net0
>> > > >
>> > > > ip netns exec net0 rdma link
>> > > >
>> > > > "
>> > > > link mlx5_0/1 state DOWN physical_state DISABLED netdev enp7s0np1
>> > > > "
>> > > >
>> > > > We can confirm that the rdma links are moved to the same net namespace
>> > > > as the net devices.
>> > > >
>> > > > Zhu Yanjun (3):
>> > > > RDMA/core: Move ib device to the same net namespace with net device
>> > > > RDMA/core: The legacy IB devices still work with shared/exclusive mode
>> > > > RDMA/core: Get all the ib devices from net devices
>> > > >
>> > > > drivers/infiniband/core/device.c | 107 ++++++++++++++++++++++++++++++-
>> > > > 1 file changed, 105 insertions(+), 2 deletions(-)
>> > > >
>> > > > --
>> > > > 2.27.0
Thread overview: 33+ messages
2022-10-23 22:04 [PATCH 0/3] RDMA net namespace Zhu Yanjun
2022-10-23 13:04 ` Leon Romanovsky
2022-10-23 13:42 ` Yanjun Zhu
2022-10-23 16:45 ` Leon Romanovsky
2022-10-24 7:20 ` yanjun.zhu
2022-10-23 22:04 ` [PATCH 1/3] RDMA/core: Move ib device to the same net namespace with net device Zhu Yanjun
2022-10-23 22:04 ` [PATCH 2/3] RDMA/core: The legacy IB devices still work with shared/exclusive mode Zhu Yanjun
2022-10-23 22:04 ` [PATCH 3/3] RDMA/core: Get all the ib devices from net devices Zhu Yanjun
2022-10-24 1:10 ` [PATCH 0/3] RDMA net namespace Dust Li
2022-10-24 6:15 ` yanjun.zhu
2022-10-24 11:52 ` Dust Li
2022-10-24 13:12 ` Yanjun Zhu
2022-10-24 14:35 ` Dust Li [this message]
2022-10-24 16:41 ` Jason Gunthorpe
2022-10-25 2:51 ` Yanjun Zhu
2022-10-26 4:08 ` Dust Li
2022-10-26 15:01 ` Dust Li
2022-10-27 2:30 ` Dust Li
2022-10-27 2:54 ` yanjun.zhu
2022-10-27 3:01 ` Parav Pandit
2022-10-27 3:07 ` yanjun.zhu
2022-10-27 3:10 ` Parav Pandit
2022-10-27 3:17 ` yanjun.zhu
2022-10-27 3:21 ` Parav Pandit
2022-10-27 3:39 ` yanjun.zhu
2022-10-27 3:48 ` Parav Pandit
2022-10-27 6:01 ` yanjun.zhu
2022-10-27 14:06 ` Parav Pandit
2022-10-28 3:21 ` Yanjun Zhu
2022-10-28 3:31 ` Parav Pandit
2022-10-28 3:49 ` Yanjun Zhu
2022-10-28 3:58 ` Parav Pandit
2022-11-11 2:38 ` Yanjun Zhu