netdev.vger.kernel.org archive mirror
From: Sagi Grimberg <sagi@grimberg.me>
To: Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Aurelien Aptel <aaptel@nvidia.com>,
	Shai Malin <smalin@nvidia.com>
Cc: "davem@davemloft.net" <davem@davemloft.net>,
	Boris Pismenny <borisp@nvidia.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"aurelien.aptel@gmail.com" <aurelien.aptel@gmail.com>,
	"hch@lst.de" <hch@lst.de>, "axboe@fb.com" <axboe@fb.com>,
	"malin1024@gmail.com" <malin1024@gmail.com>,
	Or Gerlitz <ogerlitz@nvidia.com>, Yoray Zack <yorayz@nvidia.com>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Gal Shalom <galshalom@nvidia.com>,
	Max Gurtovoy <mgurtovoy@nvidia.com>,
	"kbusch@kernel.org" <kbusch@kernel.org>
Subject: Re: [PATCH v12 07/26] nvme-tcp: Add DDP offload control path
Date: Wed, 9 Aug 2023 10:39:42 +0300
Message-ID: <2ae6c96b-2b05-583e-55bd-2d20133b9b37@grimberg.me>
In-Reply-To: <8a4ccb05-b9c5-fd45-69cb-c531fd017941@nvidia.com>



On 8/1/23 05:25, Chaitanya Kulkarni wrote:
> On 7/12/23 09:14, Aurelien Aptel wrote:
>> From: Boris Pismenny <borisp@nvidia.com>
>>
>> This commit introduces direct data placement (DDP) offload to
>> NVMe-TCP. There is a context per queue, which is established after
>> the handshake using the sk_add/del NDOs.
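
As a rough illustration of that per-queue control path (the helper,
struct, and field names below are assumptions made for this sketch, not
the exact API of the series), the setup right after the handshake could
look along these lines:

static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue,
				   struct net_device *netdev)
{
	struct ulp_ddp_config config = {
		.type = ULP_DDP_NVME,		/* assumed enum value */
	};
	int ret;

	/* bind this queue's socket to the device's DDP engine */
	ret = ulp_ddp_sk_add(netdev, queue->sock->sk, &config);
	if (ret)
		return ret;

	/* remembered so sk_del can be called on queue teardown */
	queue->offload_netdev = netdev;
	return 0;
}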
>>
>> Additionally, a resynchronization routine is used to assist
>> hardware recovery from TCP OOO, and continue the offload.
>> Resynchronization operates as follows:
>>
>> 1. TCP OOO causes the NIC HW to stop the offload
>>
>> 2. The NIC HW identifies a PDU header at some TCP sequence number
>> and asks NVMe-TCP to confirm it.
>> This request is delivered from the NIC driver to NVMe-TCP by first
>> finding the socket for the packet that triggered the request, and
>> then finding the nvme_tcp_queue associated with that socket.
>> Finally, the request is recorded in the nvme_tcp_queue.
>>
>> 3. When NVMe-TCP observes the requested TCP sequence, it compares it
>> with the TCP sequence of the PDU header and reports the result to
>> the NIC driver (resync), which updates the HW and resumes the
>> offload once everything matches.
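
Steps 2 and 3 roughly translate into a request/response pair along
these lines (a hedged sketch; symbols such as resync_req and
ulp_ddp_resync are assumed names, not necessarily those in the patch):

/* step 2: the NIC driver found a candidate PDU header at 'seq' */
static void nvme_tcp_resync_request(struct sock *sk, u32 seq)
{
	struct nvme_tcp_queue *queue = sk->sk_user_data;

	WRITE_ONCE(queue->resync_req, seq);
}

/* step 3: nvme-tcp parsed a real PDU header at 'pdu_seq' */
static void nvme_tcp_resync_response(struct nvme_tcp_queue *queue,
				     struct net_device *netdev, u32 pdu_seq)
{
	u32 req_seq = READ_ONCE(queue->resync_req);

	if (req_seq && req_seq == pdu_seq) {
		/* confirm to the driver so it can re-arm the offload */
		ulp_ddp_resync(netdev, queue->sock->sk, pdu_seq);
		WRITE_ONCE(queue->resync_req, 0);
	}
}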
>>
>> Some HW implementations, such as ConnectX-7, assume linear CCIDs
>> (0...N-1 for a queue of size N), whereas the Linux nvme driver uses
>> part of the 16-bit CCID as a generation counter. To address that, we
>> use the existing quirk in the nvme layer when the HW driver
>> advertises that the device does not support the full 16-bit CCID
>> range.
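
Concretely, this likely boils down to a check like the one below when
the advertised limits are read; full_ccid_range is an assumed field
name, and NVME_QUIRK_SKIP_CID_GEN is presumably the existing quirk
being referred to:

	if (!limits->full_ccid_range)
		ctrl->quirks |= NVME_QUIRK_SKIP_CID_GEN;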
>>
>> Furthermore, we let the offloading driver advertise the maximum HW
>> sectors/segments via ulp_ddp_limits.
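
For example (field names assumed for illustration), the controller
limits could be clamped against what the device reports:

	ctrl->max_hw_sectors =
		min_not_zero(ctrl->max_hw_sectors, limits->max_ddp_sectors);
	ctrl->max_segments =
		min_not_zero(ctrl->max_segments, limits->max_ddp_sgl_len);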
>>
>> A follow-up patch introduces the data-path changes required for this
>> offload.
>>
>> Socket operations need a netdev reference. A follow-up patch drops
>> this reference on NETDEV_GOING_DOWN events to allow the device to go
>> down.
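
When that follow-up handling lands, it would presumably take the shape
of a netdevice notifier roughly like this (a sketch with assumed names):

static int nvme_tcp_netdev_event(struct notifier_block *nb,
				 unsigned long event, void *ptr)
{
	struct net_device *ndev = netdev_notifier_info_to_dev(ptr);

	switch (event) {
	case NETDEV_GOING_DOWN:
		/* tear down the offload on affected queues and put the
		 * netdev reference taken at queue setup time
		 */
		nvme_tcp_stop_offload(ndev);	/* assumed helper */
		break;
	}
	return NOTIFY_DONE;
}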
>>
>> Signed-off-by: Boris Pismenny <borisp@nvidia.com>
>> Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
>> Signed-off-by: Or Gerlitz <ogerlitz@nvidia.com>
>> Signed-off-by: Yoray Zack <yorayz@nvidia.com>
>> Signed-off-by: Shai Malin <smalin@nvidia.com>
>> Signed-off-by: Aurelien Aptel <aaptel@nvidia.com>
>> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
>> ---
> 
> For NVMe related code :-
> 
> The offload feature is configurable and may not be turned on in the
> absence of the H/W. In order to keep the nvme/host/tcp.c file small
> and focused on core functionality, I wonder if we should move the
> tcp-offload code into its own file, say nvme/host/tcp-offload.c?

Maybe. It wouldn't be tcp_offload.c but rather tcp_ddp.c, because it's
not offloading the TCP stack but rather doing direct data placement.

If we are going to do that, it will either pollute nvme.h or require a
common header file, which is something I'd like to avoid if possible.
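
To make the concern concrete: a split like nvme/host/tcp-ddp.c would
need the queue/ctrl structures that are currently private to tcp.c,
which is where the header pollution comes from. A minimal sketch of the
shared header such a split would imply (hypothetical file, not part of
the series):

/* drivers/nvme/host/tcp.h -- hypothetical */
#ifndef _NVME_TCP_H
#define _NVME_TCP_H

struct nvme_tcp_queue;	/* definition would have to move out of tcp.c */

#ifdef CONFIG_ULP_DDP
int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue);
void nvme_tcp_unoffload_socket(struct nvme_tcp_queue *queue);
#else
static inline int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue)
{
	return 0;
}
static inline void nvme_tcp_unoffload_socket(struct nvme_tcp_queue *queue)
{
}
#endif

#endif /* _NVME_TCP_H */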


Thread overview: 62+ messages
2023-07-12 16:14 [PATCH v12 00/26] nvme-tcp receive offloads Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 01/26] net: Introduce direct data placement tcp offload Aurelien Aptel
2023-08-09  7:15   ` Sagi Grimberg
2023-08-10 14:46     ` Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 02/26] net/ethtool: add new stringset ETH_SS_ULP_DDP_{CAPS,STATS} Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 03/26] net/ethtool: add ULP_DDP_{GET,SET} operations for caps and stats Aurelien Aptel
2023-07-15 10:14   ` Simon Horman
2023-07-17  9:45     ` Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 04/26] Documentation: document netlink ULP_DDP_GET/SET messages Aurelien Aptel
2023-07-15 10:17   ` Simon Horman
2023-07-17  9:47     ` Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 05/26] iov_iter: skip copy if src == dst for direct data placement Aurelien Aptel
2023-08-16  0:24   ` Max Gurtovoy
2023-07-12 16:14 ` [PATCH v12 06/26] net/tls,core: export get_netdev_for_sock Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 07/26] nvme-tcp: Add DDP offload control path Aurelien Aptel
2023-08-01  2:25   ` Chaitanya Kulkarni
2023-08-09  7:39     ` Sagi Grimberg [this message]
2023-08-11  5:28       ` Chaitanya Kulkarni
2023-08-16  0:50         ` Max Gurtovoy
2023-08-09  7:13   ` Sagi Grimberg
2023-08-14 16:11     ` Aurelien Aptel
2023-08-14 18:54       ` Sagi Grimberg
2023-08-16 12:30         ` Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 08/26] nvme-tcp: Add DDP data-path Aurelien Aptel
2023-08-09  7:35   ` Sagi Grimberg
2023-08-14 16:12     ` Aurelien Aptel
2023-08-14 19:01       ` Sagi Grimberg
2023-08-17 13:28         ` Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 09/26] nvme-tcp: RX DDGST offload Aurelien Aptel
2023-08-09  7:59   ` Sagi Grimberg
2023-08-10 14:48     ` Aurelien Aptel
2023-08-13 13:49       ` Sagi Grimberg
2023-07-12 16:14 ` [PATCH v12 10/26] nvme-tcp: Deal with netdevice DOWN events Aurelien Aptel
2023-08-09  8:00   ` Sagi Grimberg
2023-08-16 13:03     ` Aurelien Aptel
2023-08-16 14:10       ` Sagi Grimberg
2023-08-17 14:09         ` Aurelien Aptel
2023-08-20 10:50           ` Sagi Grimberg
2023-08-21 12:33             ` Aurelien Aptel
2023-07-12 16:14 ` [PATCH v12 11/26] nvme-tcp: Add modparam to control the ULP offload enablement Aurelien Aptel
2023-08-09  8:03   ` Sagi Grimberg
2023-08-10 14:50     ` Aurelien Aptel
2023-08-16  1:05     ` Max Gurtovoy
2023-07-12 16:14 ` [PATCH v12 12/26] nvme-tcp: Only enable offload with TLS if the driver supports it Aurelien Aptel
2023-08-09  8:05   ` Sagi Grimberg
2023-08-10 14:52     ` Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 13/26] Documentation: add ULP DDP offload documentation Aurelien Aptel
2023-07-15 10:32   ` Simon Horman
2023-07-17  9:48     ` Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 14/26] net/mlx5e: Rename from tls to transport static params Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 15/26] net/mlx5e: Refactor ico sq polling to get budget Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 16/26] net/mlx5e: Have mdev pointer directly on the icosq structure Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 17/26] net/mlx5e: Refactor doorbell function to allow avoiding a completion Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 18/26] net/mlx5: Add NVMEoTCP caps, HW bits, 128B CQE and enumerations Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 19/26] net/mlx5e: NVMEoTCP, offload initialization Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 20/26] net/mlx5e: TCP flow steering for nvme-tcp acceleration Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 21/26] net/mlx5e: NVMEoTCP, use KLM UMRs for buffer registration Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 22/26] net/mlx5e: NVMEoTCP, queue init/teardown Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 23/26] net/mlx5e: NVMEoTCP, ddp setup and resync Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 24/26] net/mlx5e: NVMEoTCP, async ddp invalidation Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 25/26] net/mlx5e: NVMEoTCP, data-path for DDP+DDGST offload Aurelien Aptel
2023-07-12 16:15 ` [PATCH v12 26/26] net/mlx5e: NVMEoTCP, statistics Aurelien Aptel
