From: Arjun Roy <arjunroy.kdev@gmail.com>
To: davem@davemloft.net, netdev@vger.kernel.org
Cc: arjunroy@google.com, edumazet@google.com, soheil@google.com
Subject: [net-next v2 6/8] net-zerocopy: Introduce short-circuit small reads.
Date: Wed,  2 Dec 2020 14:09:43 -0800
Message-ID: <20201202220945.911116-7-arjunroy.kdev@gmail.com>
In-Reply-To: <20201202220945.911116-1-arjunroy.kdev@gmail.com>

From: Arjun Roy <arjunroy@google.com>

Sometimes, we may call TCP receive zerocopy when inq is 0,
or inq < PAGE_SIZE, or inq is otherwise small enough that
it is cheaper to copy than to remap pages.

In these cases, we may want to either return early (inq=0) or
attempt to use the provided copy buffer to simply copy
the received data.

This allows us to save both system call overhead and
the latency of acquiring mmap_sem in read mode for cases where
it would be useless to do so.

This patchset enables this behaviour by:
1. Returning quickly if inq is 0.
2. Attempting to perform a regular copy if a hybrid copybuffer is
   provided and it is large enough to absorb all available bytes.
3. Returning quickly if no such buffer was provided and there are
   fewer than PAGE_SIZE bytes available.
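
The three cases above can be sketched as a small user-space model. This
is only an illustration, not the kernel code: the enum, its labels and
zc_pick_action() are hypothetical names, and the page size is passed in
as a parameter rather than taken from PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the possible outcomes; not kernel identifiers. */
enum zc_action {
	ZC_RETURN_EARLY,  /* nothing queued: return immediately */
	ZC_COPY,          /* copy all queued bytes into the copy buffer */
	ZC_SKIP_HINT,     /* under a page, no usable buffer: report a hint */
	ZC_REMAP          /* enough data to attempt page remapping */
};

/* Simplified model of the choice made before any remapping is tried. */
static enum zc_action zc_pick_action(size_t inq, size_t copybuf_len,
				     size_t page_size)
{
	assert(page_size > 0);

	if (inq == 0)
		return ZC_RETURN_EARLY;
	/* The buffer must absorb every available byte, or it is not used. */
	if (inq <= copybuf_len)
		return ZC_COPY;
	if (inq < page_size)
		return ZC_SKIP_HINT;
	return ZC_REMAP;
}
```

Note that the copy case is checked before the PAGE_SIZE case, matching
the order of the checks added to tcp_zerocopy_receive() in the diff.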

For small RPC ping-pong workloads, normally we would have
1 getsockopt(), 1 recvmsg() and 1 sendmsg() call per RPC. With this
change, we remove the recvmsg() call entirely, going from 3 syscalls
per RPC to 2. In testing with small (hundreds of bytes) RPC traffic,
this yields a syscall reduction of about 33% and an efficiency gain
of about 3-5% when defined as QPS/CPU Util.
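
As a rough caller-side illustration of the syscall count, the sketch
below models how an application might decide whether a follow-up
recvmsg() is still needed after the getsockopt() returns. struct
zc_result is a hypothetical mirror of the three result fields this
patch writes (the field types are assumptions), and both helpers are
illustrative names, not an existing API.

```c
#include <stdint.h>

/* Hypothetical mirror of the result fields set by this patch. */
struct zc_result {
	uint32_t length;          /* bytes remapped; 0 on the short paths */
	uint32_t recv_skip_hint;  /* bytes the caller must still read */
	int32_t copybuf_len;      /* bytes placed in the copy buffer */
};

/* A non-zero skip hint means data remains that only recvmsg() can fetch. */
static int zc_needs_recvmsg(const struct zc_result *zc)
{
	return zc->recv_skip_hint != 0;
}

/* One getsockopt(), an optional recvmsg(), and one sendmsg() per RPC. */
static int syscalls_per_rpc(const struct zc_result *zc)
{
	return 1 + zc_needs_recvmsg(zc) + 1;
}
```

When the copy fallback absorbs the whole request, recv_skip_hint is 0
and the count drops from 3 to 2, which is where the roughly 33%
reduction comes from.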
---
 net/ipv4/tcp.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b2f24a5ec230..f67dd732a47b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1785,6 +1785,39 @@ static int find_next_mappable_frag(const skb_frag_t *frag,
 	return offset;
 }
 
+static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len,
+			      int nonblock, int flags,
+			      struct scm_timestamping_internal *tss,
+			      int *cmsg_flags);
+static int receive_fallback_to_copy(struct sock *sk,
+				    struct tcp_zerocopy_receive *zc, int inq)
+{
+	unsigned long copy_address = (unsigned long)zc->copybuf_address;
+	struct scm_timestamping_internal tss_unused;
+	int err, cmsg_flags_unused;
+	struct msghdr msg = {};
+	struct iovec iov;
+
+	zc->length = 0;
+	zc->recv_skip_hint = 0;
+
+	if (copy_address != zc->copybuf_address)
+		return -EINVAL;
+
+	err = import_single_range(READ, (void __user *)copy_address,
+				  inq, &iov, &msg.msg_iter);
+	if (err)
+		return err;
+
+	err = tcp_recvmsg_locked(sk, &msg, inq, /*nonblock=*/1, /*flags=*/0,
+				 &tss_unused, &cmsg_flags_unused);
+	if (err < 0)
+		return err;
+
+	zc->copybuf_len = err;
+	return 0;
+}
+
 static int tcp_copy_straggler_data(struct tcp_zerocopy_receive *zc,
 				   struct sk_buff *skb, u32 copylen,
 				   u32 *offset, u32 *seq)
@@ -1889,6 +1922,9 @@ static int tcp_zerocopy_receive(struct sock *sk,
 
 	sock_rps_record_flow(sk);
 
+	if (inq && inq <= copybuf_len)
+		return receive_fallback_to_copy(sk, zc, inq);
+
 	if (inq < PAGE_SIZE) {
 		zc->length = 0;
 		zc->recv_skip_hint = inq;
-- 
2.29.2.576.ga3fc446d84-goog

