From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sowmini Varadhan
Subject: [PATCH v3 net-next 0/7] RDS: zerocopy support
Date: Thu, 15 Feb 2018 10:49:31 -0800
Cc: davem@davemloft.net, rds-devel@oss.oracle.com, santosh.shilimkar@oracle.com
To: sowmini.varadhan@oracle.com, netdev@vger.kernel.org, willemdebruijn.kernel@gmail.com
Sender: netdev-owner@vger.kernel.org

This is version 3 of the series, following up on review comments for
http://patchwork.ozlabs.org/project/netdev/list/?series=28530

Review comments addressed:
Patch 4:
  - fix fragile use of skb->cb[], do not set ee_code incorrectly.
Patch 5:
  - remove needless bzero of skb->cb[], consolidate err cleanup

A brief overview of this feature follows.

This patch series provides support for MSG_ZEROCOPY on a PF_RDS socket,
based on the APIs and infrastructure added by commit f214f915e7db
("tcp: enable MSG_ZEROCOPY").

For single-threaded rds-stress testing using rds-tcp with the ixgbe
driver and 1M message sizes (-a 1M -q 1M), preliminary results show
a significant reduction in latency: about 90 usec with zerocopy,
compared with 200 usec without zerocopy.

This patchset modifies the RDS send path for zerocopy in the following
manner.
- if the MSG_ZEROCOPY flag is specified with rds_sendmsg(), and,
- if the SO_ZEROCOPY socket option has been set on the PF_RDS socket,
application pages sent down with rds_sendmsg() are pinned. The pinning
uses the accounting infrastructure added by commit a91dbff551a6
("sock: ulimit on MSG_ZEROCOPY pages"). The message is unpinned when
all references to the message drop to 0, and the message is freed
by rds_message_purge().

A multithreaded application using this infrastructure must send down
a unique 32-bit cookie as ancillary data with each sendmsg invocation.
The format of this ancillary data is described in Patch 5 of the series.
The cookie is passed up to the application on the sk_error_queue when
the message is unpinned, indicating to the application that it is now
safe to free/reuse the message buffer. The details of the completion
notification are provided in Patch 4 of this series.

Sowmini Varadhan (7):
  skbuff: export mm_[un]account_pinned_pages for other modules
  rds: hold a sock ref from rds_message to the rds_sock
  sock: permit SO_ZEROCOPY on PF_RDS socket
  rds: support for zcopy completion notification
  rds: zerocopy Tx support.
  selftests/net: add support for PF_RDS sockets
  selftests/net: add zerocopy support for PF_RDS test case

 include/linux/skbuff.h                     |   3 +
 include/uapi/linux/errqueue.h              |   2 +
 include/uapi/linux/rds.h                   |   1 +
 net/core/skbuff.c                          |   6 +-
 net/core/sock.c                            |  25 +++---
 net/rds/af_rds.c                           |   2 +
 net/rds/message.c                          | 132 ++++++++++++++++++++++++++-
 net/rds/rds.h                              |  17 ++++-
 net/rds/recv.c                             |   2 +
 net/rds/send.c                             |  51 ++++++++---
 tools/testing/selftests/net/msg_zerocopy.c | 133 ++++++++++++++++++++++++++-
 11 files changed, 339 insertions(+), 35 deletions(-)
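
For readers who want to see the application side of the overview above,
here is a rough userspace sketch of the send path: set SO_ZEROCOPY on
the socket, then pass MSG_ZEROCOPY to sendmsg() together with a 32-bit
cookie as ancillary data. It is not taken from the patches. The helper
names (zc_enable, zc_attach_cookie, zc_read_cookie, zc_send) are
hypothetical, and the fallback values for SO_ZEROCOPY, MSG_ZEROCOPY and
RDS_CMSG_ZCOPY_COOKIE are assumptions used only when the installed
headers do not define them; Patch 5 is the authority on the cmsg
format. Reading the completion from the error queue is left out on
purpose, since its layout is defined in Patch 4.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Fallbacks for older headers. All values below are assumptions. */
#ifndef SOL_RDS
#define SOL_RDS 276
#endif
#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif
#ifndef RDS_CMSG_ZCOPY_COOKIE
#define RDS_CMSG_ZCOPY_COOKIE 12	/* assumed; see Patch 5 */
#endif

/* Hypothetical helper: opt the socket in to zerocopy sends. */
int zc_enable(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one));
}

/*
 * Hypothetical helper: place one cookie cmsg in ctrl and attach it to msg.
 * ctrl must be suitably aligned and hold CMSG_SPACE(sizeof(uint32_t)) bytes.
 */
int zc_attach_cookie(struct msghdr *msg, void *ctrl, size_t ctrl_len,
		     uint32_t cookie)
{
	struct cmsghdr *cm;

	assert(msg != NULL && ctrl != NULL);
	if (ctrl_len < CMSG_SPACE(sizeof(cookie)))
		return -1;
	memset(ctrl, 0, CMSG_SPACE(sizeof(cookie)));
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(cookie));
	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = SOL_RDS;
	cm->cmsg_type = RDS_CMSG_ZCOPY_COOKIE;
	cm->cmsg_len = CMSG_LEN(sizeof(cookie));
	memcpy(CMSG_DATA(cm), &cookie, sizeof(cookie));
	return 0;
}

/* Hypothetical helper: find the cookie cmsg in msg, mainly for checking. */
int zc_read_cookie(struct msghdr *msg, uint32_t *cookie)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == SOL_RDS &&
		    cm->cmsg_type == RDS_CMSG_ZCOPY_COOKIE) {
			memcpy(cookie, CMSG_DATA(cm), sizeof(*cookie));
			return 0;
		}
	}
	return -1;
}

/* Hypothetical helper: one zerocopy send of buf, tagged with cookie. */
ssize_t zc_send(int fd, const struct sockaddr *dst, socklen_t dst_len,
		void *buf, size_t len, uint32_t cookie)
{
	union {
		char buf[CMSG_SPACE(sizeof(uint32_t))];
		struct cmsghdr align;	/* forces cmsghdr alignment */
	} ctrl;
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	msg.msg_name = (void *)dst;
	msg.msg_namelen = dst_len;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	if (zc_attach_cookie(&msg, ctrl.buf, sizeof(ctrl.buf), cookie) < 0)
		return -1;
	return sendmsg(fd, &msg, MSG_ZEROCOPY);
}
```

An application would call zc_enable() once after creating the PF_RDS
socket, then zc_send() with a fresh cookie per message, and must leave
buf untouched until that cookie comes back on the error queue.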