From: Edward Srouji <edwards@nvidia.com>
To: Zhu Yanjun <yanjun.zhu@linux.dev>,
Leon Romanovsky <leon@kernel.org>,
Jason Gunthorpe <jgg@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>,
linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
netdev@vger.kernel.org, Saeed Mahameed <saeedm@nvidia.com>,
Tariq Toukan <tariqt@nvidia.com>,
Yishai Hadas <yishaih@nvidia.com>
Subject: Re: [PATCH rdma-next 0/2] Introduce mlx5 data direct placement (DDP)
Date: Thu, 5 Sep 2024 15:23:29 +0300
Message-ID: <09db1552-db97-4e82-9517-3b67c4b33feb@nvidia.com>
In-Reply-To: <aaf9263b-931e-4b1d-8aea-1218faec2802@linux.dev>
On 9/4/2024 2:53 PM, Zhu Yanjun wrote:
>
> On 2024/9/4 16:27, Edward Srouji wrote:
>>
>> On 9/4/2024 9:02 AM, Zhu Yanjun wrote:
>>>
>>> On 2024/9/3 19:37, Leon Romanovsky wrote:
>>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>>
>>>> Hi,
>>>>
>>>> This series from Edward introduces mlx5 data direct placement (DDP)
>>>> feature.
>>>>
>>>> This feature allows WRs on the receiver side of the QP to be consumed
>>>> out of order, permitting the sender side to transmit messages without
>>>> guaranteeing arrival order on the receiver side.
>>>>
>>>> When enabled, the completion ordering of WRs remains in-order,
>>>> regardless of the Receive WRs consumption order.
>>>>
>>>> RDMA Read and RDMA Atomic operations on the responder side continue to
>>>> be executed in-order, while the ordering of data placement for RDMA
>>>> Write and Send operations is not guaranteed.
>>>
>>> It is an interesting feature. If I understand it correctly, this
>>> feature lets the user consume the data out of order for RDMA Write
>>> and Send operations, while the completion ordering remains in order.
>>>
>> Correct.
>>> In what scenarios can this feature be applied, and what benefits does
>>> it bring?
>>>
>>> I am just curious about this. Normally users consume the data in
>>> order. In what scenario would a user consume the data out of order?
>>>
>> One of the main benefits of this feature is achieving higher bandwidth
>> (BW) by allowing responders to receive packets out of order (OOO).
>>
>> For example, this can be utilized in devices that support multi-plane
>> functionality, as introduced in the "Multi-plane support for mlx5"
>> series [1]. When mlx5 multi-plane is supported, a single logical mlx5
>> port aggregates multiple physical plane ports. In this scenario, the
>> requester can "spray" packets across the multiple physical plane ports
>> without guaranteeing packet order, either on the wire or on the
>> receiver (responder) side.
>>
>> With this approach, no barriers or fences are required to ensure
>> in-order packet reception, which optimizes the data path for
>> performance. This can result in better BW, theoretically achieving
>> line-rate performance equivalent to the sum of the maximum BW of all
>> physical plane ports, with only one QP.
>
> Thanks a lot for your quick reply. Not having to ensure in-order packet
> reception does optimize the data path for performance.
>
> I agree with you.
>
> But how does the receiver efficiently get the correct packets out of
> the out-of-order packets?
>
> Is the method implemented in software or hardware?
The packets carry a new field that the HW uses to determine the correct
message order (similar to the PSN).
When packets arrive OOO at the receiver side, the HW scatters the data
directly to its destination (hence the name DDP, "Direct Data Placement").
So the efficiency comes from the HW: it also keeps the context and
metadata it needs to deliver the correct completions to the user in
order, once some WQEs can be treated as an "in-order window" and
delivered to the user.
SW/applications may still receive OOO WR_IDs, though (because the first
CQE may have consumed a Recv WQE of any index on the receiver side), and
it is their responsibility to handle that from this point on, if required.
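
To make the WR_ID point concrete, here is a minimal toy model in plain C.
It is only a sketch of the behavior described above, not code from the
series and not a description of the actual device logic. It assumes that
each arriving message consumes the next posted Recv WQE and that
completions are reported in message order. All names (toy_rq, toy_post,
toy_arrive, toy_poll, NUM_WQES) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_WQES 4 /* hypothetical receive queue depth */

/* Toy receive queue: all fields are illustrative, not a real HW layout. */
struct toy_rq {
	uint64_t posted_wr_id[NUM_WQES]; /* WR_IDs in posting order */
	uint32_t num_posted;             /* number of posted Recv WQEs */
	uint32_t next_wqe;               /* next Recv WQE to be consumed */
	uint64_t msg_wr_id[NUM_WQES];    /* WR_ID consumed by message i */
	bool     arrived[NUM_WQES];      /* message i fully placed? */
	uint32_t next_cqe;               /* next message to complete */
};

/* Post a Recv WQE with the given WR_ID. */
static bool toy_post(struct toy_rq *rq, uint64_t wr_id)
{
	if (rq->num_posted >= NUM_WQES)
		return false;
	rq->posted_wr_id[rq->num_posted++] = wr_id;
	return true;
}

/* Message msg_seq arrives, in any order, and takes the next free WQE. */
static bool toy_arrive(struct toy_rq *rq, uint32_t msg_seq)
{
	if (msg_seq >= NUM_WQES || rq->arrived[msg_seq] ||
	    rq->next_wqe >= rq->num_posted)
		return false;
	rq->msg_wr_id[msg_seq] = rq->posted_wr_id[rq->next_wqe++];
	rq->arrived[msg_seq] = true;
	return true;
}

/*
 * Report completions strictly in message order and stop at the first
 * message that has not arrived yet. Returns the number of WR_IDs
 * written to out.
 */
static size_t toy_poll(struct toy_rq *rq, uint64_t *out, size_t max)
{
	size_t n = 0;

	assert(rq && out);
	while (n < max && rq->next_cqe < NUM_WQES &&
	       rq->arrived[rq->next_cqe])
		out[n++] = rq->msg_wr_id[rq->next_cqe++];
	return n;
}
```

In this model, if WR_IDs 100, 101 and 102 are posted and messages arrive
in the order 2, 0, 1, nothing completes until message 0 has arrived, and
the completions then carry WR_IDs 101, 102, 100. An application in this
situation would look up its buffer by WR_ID instead of assuming that
the n-th completion belongs to the n-th posted WQE.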
>
> I am just interested in this feature and want to know more about this.
>
> Thanks,
>
> Zhu Yanjun
>
>>
>> [1] https://lore.kernel.org/lkml/cover.1718553901.git.leon@kernel.org/
>>> Thanks,
>>> Zhu Yanjun
>>>
>>>>
>>>> Thanks
>>>>
>>>> Edward Srouji (2):
>>>>   net/mlx5: Introduce data placement ordering bits
>>>>   RDMA/mlx5: Support OOO RX WQE consumption
>>>>
>>>>  drivers/infiniband/hw/mlx5/main.c    |  8 +++++
>>>>  drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
>>>>  drivers/infiniband/hw/mlx5/qp.c      | 51 +++++++++++++++++++++++++---
>>>>  include/linux/mlx5/mlx5_ifc.h        | 24 +++++++++----
>>>>  include/uapi/rdma/mlx5-abi.h         |  5 +++
>>>>  5 files changed, 78 insertions(+), 11 deletions(-)
>>>>
>>>
> --
> Best Regards,
> Yanjun.Zhu
>