From: Arjun Roy <arjunroy.kdev@gmail.com>
To: davem@davemloft.net, netdev@vger.kernel.org
Cc: arjunroy@google.com, edumazet@google.com, soheil@google.com
Subject: [net-next v3 6/8] net-zerocopy: Introduce short-circuit small reads.
Date: Wed, 2 Dec 2020 14:53:47 -0800
Message-ID: <20201202225349.935284-7-arjunroy.kdev@gmail.com>
In-Reply-To: <20201202225349.935284-1-arjunroy.kdev@gmail.com>
From: Arjun Roy <arjunroy@google.com>
Sometimes, we may call TCP receive zerocopy when inq is 0,
or inq < PAGE_SIZE, or inq is generally small enough that
it is cheaper to copy than to remap pages.
In these cases, we may want to either return early (inq=0) or
attempt to use the provided copy buffer to simply copy
the received data.
This allows us to save both system call overhead and
the latency of acquiring mmap_sem in read mode for cases where
it would be useless to do so.
This patch enables this behaviour by:
1. Returning quickly if inq is 0.
2. Attempting to perform a regular copy if a hybrid copybuffer is
provided and it is large enough to absorb all available bytes.
3. Returning quickly if no such buffer was provided and there are fewer
than PAGE_SIZE bytes available.
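The three cases above can be sketched as a small standalone decision
function. This is only an illustration of the order of the checks, not
the kernel code: the enum, the function name and the page_size parameter
are hypothetical, and the real checks live inline in
tcp_zerocopy_receive().

```c
/* Hypothetical names for illustration; the patch defines no such enum. */
enum zc_small_read_action {
	ZC_RETURN_EARLY,	/* less than a page pending and no usable buffer */
	ZC_COPY_TO_BUFFER,	/* copy buffer can absorb every pending byte */
	ZC_REMAP_PAGES,		/* fall through to the normal zerocopy path */
};

static enum zc_small_read_action
zc_classify_small_read(int inq, int copybuf_len, int page_size)
{
	/* Case 2: data is pending and the whole of it fits the copy buffer. */
	if (inq > 0 && inq <= copybuf_len)
		return ZC_COPY_TO_BUFFER;

	/* Cases 1 and 3: inq is 0, or below a page with no usable buffer. */
	if (inq < page_size)
		return ZC_RETURN_EARLY;

	return ZC_REMAP_PAGES;
}
```

Note that, with this ordering, a copy buffer larger than a page also
absorbs reads of a page or more, as long as it can hold all pending
bytes.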
For small RPC ping-pong workloads, normally we would have
1 getsockopt(), 1 recvmsg() and 1 sendmsg() call per RPC. With this
change, we remove the recvmsg() call entirely, going from 3 syscalls
per RPC to 2. In testing with small (hundreds of bytes) RPC traffic,
this yields a syscall reduction of about 33% and an efficiency gain
of about 3-5%, where efficiency is defined as QPS/CPU Util.
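The syscall figure follows directly from the per-RPC call counts given
above (3 calls before, 2 after). A minimal sketch of the arithmetic,
using a hypothetical helper name:

```c
/* Percentage of per-RPC syscalls saved, rounded down (integer division). */
static int syscall_reduction_percent(int before, int after)
{
	return 100 * (before - after) / before;
}
```

With before = 3 (getsockopt, recvmsg, sendmsg) and after = 2
(getsockopt, sendmsg), this gives 33.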
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
---
net/ipv4/tcp.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b2f24a5ec230..f67dd732a47b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1785,6 +1785,39 @@ static int find_next_mappable_frag(const skb_frag_t *frag,
 	return offset;
 }
 
+static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len,
+			      int nonblock, int flags,
+			      struct scm_timestamping_internal *tss,
+			      int *cmsg_flags);
+static int receive_fallback_to_copy(struct sock *sk,
+				    struct tcp_zerocopy_receive *zc, int inq)
+{
+	unsigned long copy_address = (unsigned long)zc->copybuf_address;
+	struct scm_timestamping_internal tss_unused;
+	int err, cmsg_flags_unused;
+	struct msghdr msg = {};
+	struct iovec iov;
+
+	zc->length = 0;
+	zc->recv_skip_hint = 0;
+
+	if (copy_address != zc->copybuf_address)
+		return -EINVAL;
+
+	err = import_single_range(READ, (void __user *)copy_address,
+				  inq, &iov, &msg.msg_iter);
+	if (err)
+		return err;
+
+	err = tcp_recvmsg_locked(sk, &msg, inq, /*nonblock=*/1, /*flags=*/0,
+				 &tss_unused, &cmsg_flags_unused);
+	if (err < 0)
+		return err;
+
+	zc->copybuf_len = err;
+	return 0;
+}
+
 static int tcp_copy_straggler_data(struct tcp_zerocopy_receive *zc,
 				   struct sk_buff *skb, u32 copylen,
 				   u32 *offset, u32 *seq)
@@ -1889,6 +1922,9 @@ static int tcp_zerocopy_receive(struct sock *sk,
 
 	sock_rps_record_flow(sk);
 
+	if (inq && inq <= copybuf_len)
+		return receive_fallback_to_copy(sk, zc, inq);
+
 	if (inq < PAGE_SIZE) {
 		zc->length = 0;
 		zc->recv_skip_hint = inq;
--
2.29.2.576.ga3fc446d84-goog
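
In receive_fallback_to_copy() above, the copybuf_address field is cast
to unsigned long and then compared with the original value, returning
-EINVAL on mismatch. This looks like a round-trip check for a field
that can be wider than unsigned long (for example on a 32-bit kernel).
The sketch below illustrates that idea under stated assumptions: it
models the narrower type with uint32_t so that it behaves the same on
any build host, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption for illustration: uint32_t stands in for a 32-bit
 * unsigned long, and the address arrives as a 64-bit value.
 */
static bool addr_fits_ulong32(uint64_t user_addr)
{
	/* Mirrors the (unsigned long) cast in the patch. */
	uint32_t narrowed = (uint32_t)user_addr;

	/* If any high bits were dropped, the round trip does not match. */
	return (uint64_t)narrowed == user_addr;
}
```

An address such as 0x100000000 fails the check, since it would
otherwise be silently truncated to 0 before being used as a user
pointer.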