From: Boris Pismenny
Date: Sun, 13 Dec 2020 20:21:34 +0200
Subject: Re: [PATCH v1 net-next 02/15] net: Introduce direct data placement tcp offload
To: David Ahern, Boris Pismenny, kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com
Cc: Yoray Zack, yorayz@nvidia.com, Boris Pismenny, boris.pismenny@gmail.com, Ben Ben-Ishay, benishay@nvidia.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, Or Gerlitz, ogerlitz@nvidia.com
In-Reply-To: <921a110f-60fa-a711-d386-39eeca52199f@gmail.com>
References: <20201207210649.19194-1-borisp@mellanox.com> <20201207210649.19194-3-borisp@mellanox.com> <6f48fa5d-465c-5c38-ea45-704e86ba808b@gmail.com> <65dc5bba-13e6-110a-ddae-3d0c260aa875@gmail.com> <921a110f-60fa-a711-d386-39eeca52199f@gmail.com>
On 10/12/2020 6:26, David Ahern wrote:
> On 12/9/20 1:15 AM, Boris Pismenny wrote:
>> On 09/12/2020 2:38, David Ahern wrote:
[...]
>>
>> There is more to this than TCP zerocopy that exists in userspace or
>> inside the kernel. First, please note that the patches include support for
>> CRC offload as well as data placement. Second, data-placement is not the same
>
> Yes, the CRC offload is different, but I think it is orthogonal to the
> 'where does h/w put the data' problem.
>

I agree

>> as zerocopy for the following reasons:
>> (1) The former places buffers *exactly* where the user requests
>> regardless of the order of response arrivals, while the latter places packets
>> in anonymous buffers according to packet arrival order. Therefore, zerocopy
>> can be implemented using data placement, but not vice versa.
>
> Fundamentally, it is an SGL and a TCP sequence number. There is a
> starting point where seq N == sgl element 0, position 0. Presumably
> there is a hardware cursor to track where you are in filling the SGL as
> packets are processed. You abort on OOO, so it seems like a fairly
> straightforward problem.
>

We do not abort on OOO. Moreover, we can keep going as long as
PDU headers are not reordered.

>> (2) Data-placement supports sub-page zerocopy, unlike page-flipping
>> techniques (i.e., TCP_ZEROCOPY).
>
> I am not pushing for or suggesting any page-flipping. I understand the
> limitations of that approach.
>
>> (3) Page-flipping can't work for any storage initiator because the
>> destination buffer is owned by some user pagecache or process using O_DIRECT.
>> (4) Storage over TCP PDUs are not necessarily aligned to TCP packets,
>> i.e., the PDU header can be in the middle of a packet, so header-data split
>> alone isn't enough.
>
> yes, TCP is a byte stream and you have to have a cursor marking last
> written spot in the SGL. More below.
>
>>
>> I wish we could do the same using some simpler zerocopy mechanism,
>> it would indeed simplify things.
>> But, unfortunately this would severely
>> restrict generality, no sub-page support and alignment between PDUs
>> and packets, and performance (ordering of PDUs).
>>
>
> My biggest concern is that you are adding checks in the fast path for a
> very specific use case. If / when Rx zerocopy happens (and I suspect it
> has to happen soon to handle the ever increasing speeds), nothing about
> this patch set is reusable and, worse, more checks are needed in the fast
> path. I think it is best if you make this more generic - at least
> anything touching core code.
>
> For example, you have an iov static key hook managed by a driver for
> generic code. There are a few ways around that. One is by adding skb
> details to the nvme code - i.e., walking the skb fragments, seeing that a
> given frag is in your allocated memory and skipping the copy. This would
> offer best performance since it skips all unnecessary checks. Another
> option is to export __skb_datagram_iter, use it and define your own copy
> handler that does the address compare and skips the copy. Key point -
> only your code path is affected.

I'll submit V2 that is less invasive to core code:
IMO exporting __skb_datagram_iter and all child functions
is more generic, so we'll use that.

>
> Similarly for the NVMe SGLs and DDP offload - a more generic solution
> allows other use cases to build on this as opposed to the checks you
> want for a special case. For example, a split at the protocol headers /
> payload boundaries would be a generic solution where kernel managed
> protocols get data in one buffer and socket data is put into a given
> SGL. I am guessing that you have to be already doing this to put PDU
> payloads into an SGL and other headers into other memory to make a
> complete packet, so this is not too far off from what you are already doing.
>

Splitting at protocol header boundaries and placing data at socket defined
SGLs is not enough for nvme-tcp because the nvme-tcp protocol can reorder
responses.
Here is an example:

the host submits the following requests:
+--------+--------+--------+
| Read 1 | Read 2 | Read 3 |
+--------+--------+--------+

the target responds with the following responses:
+--------+--------+--------+
| Resp 2 | Resp 3 | Resp 1 |
+--------+--------+--------+

Therefore, hardware must hold a mapping between PDU identifiers (command_id)
and the corresponding buffers. This interface is missing in the proposal
above, which is why it won't work for nvme-tcp.

> Let me walk through an example with assumptions about your hardware's
> capabilities, and you correct me where I am wrong. Assume you have a
> 'full' command response of this form:
>
>  +------------- ... ----------------+---------+---------+--------+-----+
>  |          big data segment        | PDU hdr | TCP hdr | IP hdr | eth |
>  +------------- ... ----------------+---------+---------+--------+-----+
>
> but it shows up to the host in 3 packets like this (ideal case):
>
>  +-------------------------+---------+---------+--------+-----+
>  |       data - seg 1      | PDU hdr | TCP hdr | IP hdr | eth |
>  +-------------------------+---------+---------+--------+-----+
>  +-----------------------------------+---------+--------+-----+
>  |       data - seg 2                | TCP hdr | IP hdr | eth |
>  +-----------------------------------+---------+--------+-----+
>                    +-----------------+---------+--------+-----+
>                    | payload - seg 3 | TCP hdr | IP hdr | eth |
>                    +-----------------+---------+--------+-----+
>
>
> The hardware splits the eth/IP/tcp headers from payload like this
> (again, your hardware has to know these boundaries to accomplish what
> you want):
>
>  +-------------------------+---------+     +---------+--------+-----+
>  |       data - seg 1      | PDU hdr |     | TCP hdr | IP hdr | eth |
>  +-------------------------+---------+     +---------+--------+-----+
>
>  +-----------------------------------+     +---------+--------+-----+
>  |       data - seg 2                |     | TCP hdr | IP hdr | eth |
>  +-----------------------------------+     +---------+--------+-----+
>
>                    +-----------------+     +---------+--------+-----+
>                    | payload - seg 3 |     | TCP hdr | IP hdr | eth |
>                    +-----------------+     +---------+--------+-----+
>
> Left side goes into the SGLs posted for this socket / flow; the right
> side goes into some other memory resource made available for headers.
> This is very close to what you are doing now - with the exception of the
> PDU header being put to the right side. NVMe code then just needs to set
> the iov offset (or adjust the base_addr) to skip over the PDU header -
> standard options for an iov.
>
> Yes, TCP is a byte stream, so the packets could very well show up like this:
>
>  +--------------+---------+-----------+---------+--------+-----+
>  | data - seg 1 | PDU hdr | prev data | TCP hdr | IP hdr | eth |
>  +--------------+---------+-----------+---------+--------+-----+
>  +-----------------------------------+---------+--------+-----+
>  |     payload - seg 2               | TCP hdr | IP hdr | eth |
>  +-----------------------------------+---------+--------+-----+
>  +---------+-------------------------+---------+--------+-----+
>  | PDU hdr |    payload - seg 3      | TCP hdr | IP hdr | eth |
>  +---------+-------------------------+---------+--------+-----+
>
> If your hardware can extract the NVMe payload into a targeted SGL like
> you want in this set, then it has some logic for parsing headers and
> "snapping" an SGL to a new element. i.e., it already knows 'prev data'
> goes with the in-progress PDU, sees more data, recognizes a new PDU
> header and a new payload. That means it already has to handle a
> 'snap-to-PDU' style argument where the end of the payload closes out an
> SGL element and the next PDU hdr starts in a new SGL element (i.e., 'prev
> data' closes out sgl[i], and the next PDU hdr starts sgl[i+1]). So in
> this case, you want 'snap-to-PDU' but that could just as easily be 'no
> snap at all', just a byte stream and filling an SGL after the protocol
> headers.
>
> Key point here is that this is the start of a generic header / data
> split that could work for other applications - not just NVMe. eth/IP/TCP
> headers are consumed by the Linux networking stack; data is in
> application owned, socket based SGLs to avoid copies.
>

I think that the interface we created (tcp_ddp) is sufficiently generic
for the task at hand, which is offloading protocols that can re-order
their responses, a non-trivial task that we claim is important.

We designed it to support other protocols and not just nvme-tcp,
which is merely an example. For instance, I think that supporting iSCSI
would be natural, and that other protocols will fit nicely.

> ###
>
> A dump of other comments about this patch set:

Thanks for reviewing! We will fix and resubmit.

> - there are a LOT of unnecessary typecasts around tcp_ddp_ctx that can
> be avoided by using container_of.
>
> - you have an accessor tcp_ddp_get_ctx but no setter; all uses of
> tcp_ddp_get_ctx are within mlx5. why open code the set but use the
> accessor for the get? Worse, mlx5e_nvmeotcp_queue_teardown actually has
> both - uses the accessor and open codes setting icsk_ulp_ddp_data.
>
> - the driver is storing private data on the socket. Nothing about the
> socket layer cares and the mlx5 driver is already tracking that data in
> priv->nvmeotcp->queue_hash. As I mentioned in a previous response, I
> understand the socket ops are needed for the driver level to call into
> the socket layer, but the data part does not seem to be needed.

The socket layer does care: the socket wouldn't disappear under the
driver as it will clean things up and call the driver before the socket
disappears. This part is similar to what we have in tls where
drivers store some private data per-socket to assist offload.

>
> - nvme_tcp_offload_socket and nvme_tcp_offload_limits both return int
> yet the value is ignored
>

This is on purpose. Users can know whether it is successful or not using
ethtool counters of the NIC.
Offload is opportunistic and its failure is non-fatal, and as nvme-tcp
has no stats we intentionally ignore the returned values for now.

> - the build robot found a number of problems (it pulls my github tree
> and I pushed this set to it to move across computers).
>
> I think the patch set would be easier to follow if you restructured the
> patches to 1 thing only per patch -- e.g., split patch 2 into netdev
> bits and socket bits. Add the netdev feature bit and operations in 1
> patch and add the socket ops in a second patch with better commit logs
> about why each is needed and what is done.
>
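
To make the reordering argument above concrete, here is a rough user-space
sketch (not part of the patch set) of why placement has to be keyed by
command_id rather than by stream position alone. Everything in it is an
assumption made for illustration: the 4-byte toy PDU header (command_id plus
payload length), the ToyDdpReceiver class and its register()/receive()
methods. It is neither the NVMe/TCP wire format nor the tcp_ddp interface.

```python
import struct

# Toy PDU header: 16-bit command_id, 16-bit payload length (illustrative only).
HDR = struct.Struct("!HH")


class ToyDdpReceiver:
    """Software model of data placement keyed by command_id (hypothetical)."""

    def __init__(self):
        self.buffers = {}       # command_id -> destination buffer posted by the ULP
        self.hdr = bytearray()  # header bytes collected so far
        self.cid = None         # command_id of the PDU being received
        self.remaining = 0      # payload bytes still expected for that PDU
        self.offset = 0         # write cursor inside the destination buffer

    def register(self, command_id, length):
        self.buffers[command_id] = bytearray(length)

    def receive(self, segment):
        """Consume one TCP segment; PDU boundaries may fall anywhere in it."""
        pos = 0
        while pos < len(segment):
            if self.remaining == 0:
                # Collect the PDU header, which may straddle two segments.
                take = min(HDR.size - len(self.hdr), len(segment) - pos)
                self.hdr += segment[pos:pos + take]
                pos += take
                if len(self.hdr) == HDR.size:
                    self.cid, self.remaining = HDR.unpack(bytes(self.hdr))
                    self.offset = 0
                    self.hdr.clear()
            else:
                # Place payload bytes straight into the buffer for this command_id.
                take = min(self.remaining, len(segment) - pos)
                buf = self.buffers[self.cid]
                buf[self.offset:self.offset + take] = segment[pos:pos + take]
                self.offset += take
                self.remaining -= take
                pos += take


def pdu(command_id, payload):
    return HDR.pack(command_id, len(payload)) + payload


rx = ToyDdpReceiver()
for cid in (1, 2, 3):
    rx.register(cid, 8)

# The target answers in the order 2, 3, 1, and the stream is cut into 5-byte
# segments, so PDU headers land in the middle of (and across) segments.
stream = pdu(2, b"resp-two") + pdu(3, b"resp-333") + pdu(1, b"resp-one")
for i in range(0, len(stream), 5):
    rx.receive(stream[i:i + 5])
```

In this model a single cursor into one socket-wide SGL would have put Resp 2
into the buffer meant for Read 1; the per-command lookup is what keeps each
response in the buffer its request registered. The header state machine also
shows why a packet-level header/data split is not sufficient once PDU headers
are not aligned to TCP segments.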