public inbox for linux-rdma@vger.kernel.org
From: Yanjun Zhu <yanjun.zhu@linux.dev>
To: dust.li@linux.alibaba.com, Zhu Yanjun <yanjun.zhu@intel.com>,
	jgg@ziepe.ca, leon@kernel.org, linux-rdma@vger.kernel.org,
	"yanjun.zhu@linux.dev" <yanjun.zhu@linux.dev>
Subject: Re: [PATCH 0/3] RDMA net namespace
Date: Tue, 25 Oct 2022 10:51:38 +0800	[thread overview]
Message-ID: <1bf1e54f-f8d2-6b8c-5ded-ef701719816d@linux.dev> (raw)
In-Reply-To: <20221024143521.GG63658@linux.alibaba.com>


On 2022/10/24 22:35, Dust Li wrote:
> On Mon, Oct 24, 2022 at 09:12:56PM +0800, Yanjun Zhu wrote:
>> On 2022/10/24 19:52, Dust Li wrote:
>>> On Mon, Oct 24, 2022 at 06:15:01AM +0000, yanjun.zhu@linux.dev wrote:
>>>> October 24, 2022 9:10 AM, "Dust Li" <dust.li@linux.alibaba.com> wrote:
>>>>
>>>>> On Sun, Oct 23, 2022 at 06:04:47PM -0400, Zhu Yanjun wrote:
>>>>>
>>>>>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>>>>>
>>>>>> There are shared and exclusive modes for RDMA net namespaces. After
>>>>>> discussion with Leon, these modes remain compatible with legacy IB
>>>>>> devices.
>>>>>>
>>>>>> For RoCE and iWARP devices, the ib devices should be in the same net
>>>>>> namespace as the related net devices, regardless of shared or
>>>>>> exclusive mode.
>>>>> Does this mean that shared mode is no longer supported for RoCE and iWarp
>>>>> devices?
>>>> From the discussion, a RoCE or iWarp device should keep its ib devices and net devices in the same net namespace. So a RoCE or iWarp device has no shared/exclusive modes.
>>>> Shared/exclusive modes are for legacy ib devices, such as ipoib.
>>>>
>>>> In this patch series, shared/exclusive modes are kept for legacy ib devices.
>>>> For RoCE and iWarp devices, we just keep the net devices and ib devices in the same net namespace.
>>> I think this may limit the use case of RoCE and iWarp.
>>>
>>> See the following use case:
>>> In the container environment, we may have lots of containers on a host,
>>> for example, more than 100. And we don't have that many VFs, so we use
>>> ipvlan or other virtual network devices for each container, and put
>>> those virtual network devices into each container (net namespace).
>>> Since we only use one physical network device for all those containers,
>>> there is only one RoCE device. If we don't support shared mode, we
>>> cannot even enable RDMA for those containers with RoCE.
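The setup Dust Li describes can be sketched with iproute2. The interface and namespace names below are placeholders, and the commands need root plus an RoCE-capable uplink, so the sketch is guarded to be a no-op elsewhere:

```shell
# Hypothetical sketch of the described setup: one physical uplink (eth0),
# an ipvlan child per container net namespace. Guarded: only runs with
# root, the ip tool, and an eth0 uplink present.
if [ "$(id -u)" -eq 0 ] && command -v ip >/dev/null 2>&1 \
        && ip link show eth0 >/dev/null 2>&1; then
    ip netns add ctr0
    ip link add link eth0 name ipvl0 type ipvlan mode l2
    ip link set ipvl0 netns ctr0
    # Without shared mode, the single RoCE device cannot follow every
    # ipvlan child into its namespace, so this would list nothing:
    ip netns exec ctr0 rdma link
    ip netns del ctr0
    demo=ran
else
    echo "skipping: needs root, iproute2 and an eth0 uplink"
    demo=skipped
fi
```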
>> You use ipvlan or other virtual network devices for each container.
>>
>> In these containers, you also use RDMA, correct?
>>
>> Since all the packets for these virtual network devices finally go out
>> through the physical network device, it should work without
>> shared/exclusive modes.
>>
>> So we do not need to consider the shared/exclusive mode.
> For the netdevice, that's true. But for RDMA, we would not even see
> the ib device in the containers anymore, so I think we cannot create
> a qp/cq, and RDMA is not available for these containers in this case.
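Dust Li's concern can be checked from inside a namespace. A rough probe, assuming the iproute2 `rdma` tool is available, with a sysfs fallback (note that sysfs visibility of IB devices is not strictly per-namespace on all kernels, so `rdma link` is the more reliable view):

```shell
# Probe which RDMA devices the current net namespace can see.
# `rdma link` (iproute2) is the authoritative view; the sysfs fallback
# may still show devices the namespace cannot actually use.
if command -v rdma >/dev/null 2>&1; then
    rdma link
elif [ -d /sys/class/infiniband ]; then
    ls /sys/class/infiniband
else
    echo "no RDMA devices visible"
fi
probe_done=1
```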

I do not follow you.

Do you mean that RDMA cannot be accessed in the containers after these 
patches are applied?

Can you share a test case with me?


Zhu Yanjun

>
> Thanks
>
>
>
>> Zhu Yanjun
>>
>>> I don't know any other way to solve this, maybe I missed something?
>>>
>>> Thanks
>>>
>>>>
>>>>>> In the first commit, when the net devices are moved to a new net
>>>>>> namespace, the related ib devices are also moved to the same net
>>>>>> namespace.
>>>>>>
>>>>>> In the second commit, the shared/exclusive modes still work for legacy
>>>>>> ib devices. For RoCE and iWARP devices, these modes are not
>>>>>> considered.
>>>>>>
>>>>>> Because MLX4/5 do not call the function ib_device_set_netdev to map ib
>>>>>> devices to their related net devices, the function ib_device_get_by_netdev
>>>>>> cannot find the ib device for a given net device. In the third commit, all the
>>>>>> registered ib devices are walked to get their net devices, which are then
>>>>>> compared with the given net device.
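The lookup in the third commit could look roughly like the following kernel-side sketch. It is illustrative only, not the actual patch: it assumes the in-tree helpers ib_device_get_netdev() and the ib_core `devices` xarray, and it cannot be built or run outside the kernel tree.

```c
/* Hypothetical fallback lookup: when a driver (e.g. mlx4/mlx5) never
 * called ib_device_set_netdev(), walk every registered IB device, ask
 * each port for its netdev, and compare it with the target netdev.
 */
static struct ib_device *find_ib_dev_by_netdev(struct net_device *ndev)
{
	struct ib_device *ib_dev;
	unsigned long index;
	u32 port;

	xa_for_each(&devices, index, ib_dev) {	/* all registered IB devices */
		rdma_for_each_port(ib_dev, port) {
			/* returns the port's netdev with a reference held */
			struct net_device *cur =
				ib_device_get_netdev(ib_dev, port);

			if (cur == ndev) {
				dev_put(cur);
				return ib_dev;
			}
			if (cur)
				dev_put(cur);
		}
	}
	return NULL;
}
```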
>>>>>>
>>>>>> The steps to test:
>>>>>> 1) Create a new net namespace net0
>>>>>>
>>>>>> ip netns add net0
>>>>>>
>>>>>> 2) Show the rdma links in init_net
>>>>>>
>>>>>> rdma link
>>>>>>
>>>>>> "
>>>>>> link mlx5_0/1 state DOWN physical_state DISABLED netdev enp7s0np1
>>>>>> "
>>>>>>
>>>>>> 3) Move the net device to net namespace net0
>>>>>>
>>>>>> ip link set enp7s0np1 netns net0
>>>>>>
>>>>>> 4) Show the rdma links in init_net again
>>>>>>
>>>>>> rdma link
>>>>>>
>>>>>> There are no rdma links.
>>>>>>
>>>>>> 5) Show the rdma links in net0
>>>>>>
>>>>>> ip netns exec net0 rdma link
>>>>>>
>>>>>> "
>>>>>> link mlx5_0/1 state DOWN physical_state DISABLED netdev enp7s0np1
>>>>>> "
>>>>>>
>>>>>> We can confirm that the rdma links are moved to the same net
>>>>>> namespace as the net devices.
>>>>>>
>>>>>> Zhu Yanjun (3):
>>>>>> RDMA/core: Move ib device to the same net namespace with net device
>>>>>> RDMA/core: The legacy IB devices still work with shared/exclusive mode
>>>>>> RDMA/core: Get all the ib devices from net devices
>>>>>>
>>>>>> drivers/infiniband/core/device.c | 107 ++++++++++++++++++++++++++++++-
>>>>>> 1 file changed, 105 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.27.0

Thread overview: 33+ messages
2022-10-23 22:04 [PATCH 0/3] RDMA net namespace Zhu Yanjun
2022-10-23 13:04 ` Leon Romanovsky
2022-10-23 13:42   ` Yanjun Zhu
2022-10-23 16:45     ` Leon Romanovsky
2022-10-24  7:20       ` yanjun.zhu
2022-10-23 22:04 ` [PATCH 1/3] RDMA/core: Move ib device to the same net namespace with net device Zhu Yanjun
2022-10-23 22:04 ` [PATCH 2/3] RDMA/core: The legacy IB devices still work with shared/exclusive mode Zhu Yanjun
2022-10-23 22:04 ` [PATCH 3/3] RDMA/core: Get all the ib devices from net devices Zhu Yanjun
2022-10-24  1:10 ` [PATCH 0/3] RDMA net namespace Dust Li
2022-10-24  6:15   ` yanjun.zhu
2022-10-24 11:52     ` Dust Li
2022-10-24 13:12       ` Yanjun Zhu
2022-10-24 14:35         ` Dust Li
2022-10-24 16:41           ` Jason Gunthorpe
2022-10-25  2:51           ` Yanjun Zhu [this message]
2022-10-26  4:08             ` Dust Li
2022-10-26 15:01 ` Dust Li
2022-10-27  2:30   ` Dust Li
2022-10-27  2:54     ` yanjun.zhu
2022-10-27  3:01     ` Parav Pandit
2022-10-27  3:07       ` yanjun.zhu
2022-10-27  3:10         ` Parav Pandit
2022-10-27  3:17           ` yanjun.zhu
2022-10-27  3:21             ` Parav Pandit
2022-10-27  3:39               ` yanjun.zhu
2022-10-27  3:48                 ` Parav Pandit
2022-10-27  6:01                   ` yanjun.zhu
2022-10-27 14:06                     ` Parav Pandit
2022-10-28  3:21                       ` Yanjun Zhu
2022-10-28  3:31                         ` Parav Pandit
2022-10-28  3:49                           ` Yanjun Zhu
2022-10-28  3:58                             ` Parav Pandit
2022-11-11  2:38                               ` Yanjun Zhu
