From: Sagi Grimberg <sagi@grimberg.me>
To: Chaitanya Kulkarni <chaitanyak@nvidia.com>,
Aurelien Aptel <aaptel@nvidia.com>,
Shai Malin <smalin@nvidia.com>
Cc: "davem@davemloft.net" <davem@davemloft.net>,
Boris Pismenny <borisp@nvidia.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"kuba@kernel.org" <kuba@kernel.org>,
"aurelien.aptel@gmail.com" <aurelien.aptel@gmail.com>,
"hch@lst.de" <hch@lst.de>, "axboe@fb.com" <axboe@fb.com>,
"malin1024@gmail.com" <malin1024@gmail.com>,
Or Gerlitz <ogerlitz@nvidia.com>, Yoray Zack <yorayz@nvidia.com>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
Gal Shalom <galshalom@nvidia.com>,
Max Gurtovoy <mgurtovoy@nvidia.com>,
"kbusch@kernel.org" <kbusch@kernel.org>
Subject: Re: [PATCH v12 07/26] nvme-tcp: Add DDP offload control path
Date: Wed, 9 Aug 2023 10:39:42 +0300
Message-ID: <2ae6c96b-2b05-583e-55bd-2d20133b9b37@grimberg.me>
In-Reply-To: <8a4ccb05-b9c5-fd45-69cb-c531fd017941@nvidia.com>
On 8/1/23 05:25, Chaitanya Kulkarni wrote:
> On 7/12/23 09:14, Aurelien Aptel wrote:
>> From: Boris Pismenny <borisp@nvidia.com>
>>
>> This commit introduces direct data placement offload to NVME
>> TCP. There is a context per queue, which is established after the
>> handshake using the sk_add/del NDOs.
>>
>> Additionally, a resynchronization routine is used to assist
>> hardware recovery from TCP OOO, and continue the offload.
>> Resynchronization operates as follows:
>>
>> 1. TCP OOO causes the NIC HW to stop the offload
>>
>> 2. NIC HW identifies a PDU header at some TCP sequence number,
>> and asks NVMe-TCP to confirm it.
>> This request is delivered from the NIC driver to NVMe-TCP by first
>> finding the socket for the packet that triggered the request, and
>> then finding the nvme_tcp_queue that is used by this routine.
>> Finally, the request is recorded in the nvme_tcp_queue.
>>
>> 3. When NVMe-TCP observes the requested TCP sequence, it will compare
>> it with the PDU header TCP sequence, and report the result to the
>> NIC driver (resync), which will update the HW, and resume offload
>> when all is successful.
>>
>> Some HW implementations, such as ConnectX-7, assume a linear CCID (0...N-1
>> for a queue of size N), whereas the Linux nvme driver uses part of the
>> 16-bit CCID as a generation counter. To address that, we use the existing
>> quirk in the nvme layer when the HW driver advertises that the device
>> does not support the full 16-bit CCID range.
>>
>> Furthermore, we let the offloading driver advertise the max HW
>> sectors/segments via ulp_ddp_limits.
>>
>> A follow-up patch introduces the data-path changes required for this
>> offload.
>>
>> Socket operations need a netdev reference. A follow-up patch drops
>> this reference on NETDEV_GOING_DOWN events to allow the device to go
>> down.
>>
>> Signed-off-by: Boris Pismenny <borisp@nvidia.com>
>> Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
>> Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com>
>> Signed-off-by: Yoray Zack <yorayz@nvidia.com>
>> Signed-off-by: Shai Malin <smalin@nvidia.com>
>> Signed-off-by: Aurelien Aptel <aaptel@nvidia.com>
>> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
>> ---
>
> For NVMe-related code:
>
> The offload feature is configurable and may not be turned on in the absence
> of the H/W. In order to keep the nvme/host/tcp.c file small and limited to
> core functionality, I wonder if we should move the tcp-offload code
> into its own file, say nvme/host/tcp-offload.c?
Maybe. It wouldn't be tcp_offload.c but rather tcp_ddp.c, because it's not
offloading the TCP stack but rather doing direct data placement.
If we are going to do that, it will either pollute nvme.h or add a common
header file, which is something I'd like to avoid if possible.
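
For reference, the resynchronization steps quoted above from the commit
message can be sketched roughly as follows. This is a minimal userspace
illustration, not the code from the patch: struct ddp_queue,
ddp_resync_request(), ddp_resync_response() and the last_report field are
hypothetical names, and the real driver would pass the result on to the NIC
driver rather than store it in the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-queue state; not the real nvme_tcp_queue. */
struct ddp_queue {
	bool     resync_pending; /* step 2: a request from the NIC is recorded */
	uint32_t resync_seq;     /* TCP sequence the NIC believes starts a PDU */
	int      last_report;    /* -1: nothing reported, 0: rejected, 1: confirmed */
};

/* Step 2: the NIC driver located the queue for the packet and records its guess. */
static void ddp_resync_request(struct ddp_queue *q, uint32_t seq)
{
	q->resync_seq = seq;
	q->resync_pending = true;
}

/* Wrap-safe "a is before b" for 32-bit TCP sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Step 3: called for every PDU header the ULP parses, with the TCP sequence
 * number at which that header starts.
 */
static void ddp_resync_response(struct ddp_queue *q, uint32_t pdu_seq)
{
	if (!q->resync_pending)
		return;

	/* The requested sequence has not been reached yet: keep waiting. */
	if (seq_before(pdu_seq, q->resync_seq))
		return;

	/* Reached or passed it: answer once and clear the request. */
	q->resync_pending = false;
	q->last_report = (pdu_seq == q->resync_seq) ? 1 : 0;
}
```

A request for sequence 1000 stays pending while headers at lower sequence
numbers are parsed, is confirmed by a header at exactly 1000, and is
rejected if the first header at or after the request starts elsewhere.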