netdev.vger.kernel.org archive mirror
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: netdev@vger.kernel.org
Cc: Willem de Bruijn <willemb@google.com>
Subject: [PATCH RFC v2 11/12] packet: enable sendmsg zerocopy
Date: Wed, 22 Feb 2017 11:39:00 -0500
Message-ID: <20170222163901.90834-12-willemdebruijn.kernel@gmail.com>
In-Reply-To: <20170222163901.90834-1-willemdebruijn.kernel@gmail.com>

From: Willem de Bruijn <willemb@google.com>

Support MSG_ZEROCOPY on PF_PACKET transmission.

Tested:
  pf_packet loopback test snd_zerocopy_lo, run without and with -z, produces:

  without zerocopy (-p):
    rx=0 (0 MB) tx=221696 txc=0
    rx=0 (0 MB) tx=443880 txc=0
    rx=0 (0 MB) tx=661056 txc=0
    rx=0 (0 MB) tx=877152 txc=0

  with zerocopy (-p -z):
    rx=0 (0 MB) tx=528548 txc=528544
    rx=0 (0 MB) tx=1052364 txc=1052360
    rx=0 (0 MB) tx=1571956 txc=1571952
    rx=0 (0 MB) tx=2094144 txc=2094140

  Packets do not arrive at the Rx socket (rx=0 above) because they fail
  the martian destination check:

    IPv4: martian destination 127.0.0.1 from 127.0.0.1, dev lo

  I'll need to revise snd_zerocopy_lo to bypass that.

Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 net/packet/af_packet.c | 52 ++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 42 insertions(+), 10 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 2bd0d1949312..af9ecc1edf72 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2754,28 +2754,55 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 
 static struct sk_buff *packet_alloc_skb(struct sock *sk, size_t prepad,
 				        size_t reserve, size_t len,
-				        size_t linear, int noblock,
+					size_t linear, int flags,
 				        int *err)
 {
 	struct sk_buff *skb;
+	size_t data_len;
 
-	/* Under a page?  Don't bother with paged skb. */
-	if (prepad + len < PAGE_SIZE || !linear)
-		linear = len;
+	if (flags & MSG_ZEROCOPY) {
+		/* Minimize linear, but respect header lower bound */
+		linear = reserve + min(len, max_t(size_t, linear, MAX_HEADER));
+		data_len = 0;
+	} else {
+		/* Under a page? Don't bother with paged skb. */
+		if (prepad + len < PAGE_SIZE || !linear)
+			linear = len;
+		data_len = len - linear;
+	}
 
-	skb = sock_alloc_send_pskb(sk, prepad + linear, len - linear, noblock,
-				   err, 0);
+	skb = sock_alloc_send_pskb(sk, prepad + linear, data_len,
+				   flags & MSG_DONTWAIT, err, 0);
 	if (!skb)
 		return NULL;
 
 	skb_reserve(skb, reserve);
 	skb_put(skb, linear);
-	skb->data_len = len - linear;
-	skb->len += len - linear;
+	skb->data_len = data_len;
+	skb->len += data_len;
 
 	return skb;
 }
 
+static int packet_zerocopy_sg_from_iovec(struct sk_buff *skb,
+					 struct msghdr *msg,
+					 int offset, size_t size)
+{
+	int ret;
+
+	/* if SOCK_DGRAM, head room was alloc'ed and holds ll-headers */
+	__skb_pull(skb, offset);
+	ret = zerocopy_sg_from_iter(skb, &msg->msg_iter);
+	__skb_push(skb, offset);
+	if (unlikely(ret))
+		return ret == -EMSGSIZE ? ret : -EIO;
+
+	if (!skb_zerocopy_alloc(skb, size))
+		return -ENOMEM;
+
+	return 0;
+}
+
 static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 {
 	struct sock *sk = sock->sk;
@@ -2853,7 +2880,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	linear = __virtio16_to_cpu(vio_le(), vnet_hdr.hdr_len);
 	linear = max(linear, min_t(int, len, dev->hard_header_len));
 	skb = packet_alloc_skb(sk, hlen + tlen, hlen, len, linear,
-			       msg->msg_flags & MSG_DONTWAIT, &err);
+			       msg->msg_flags, &err);
 	if (skb == NULL)
 		goto out_unlock;
 
@@ -2867,7 +2894,11 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	/* Returns -EFAULT on error */
-	err = skb_copy_datagram_from_iter(skb, offset, &msg->msg_iter, len);
+	if (msg->msg_flags & MSG_ZEROCOPY)
+		err = packet_zerocopy_sg_from_iovec(skb, msg, offset, len);
+	else
+		err = skb_copy_datagram_from_iter(skb, offset, &msg->msg_iter,
+						  len);
 	if (err)
 		goto out_free;
 
@@ -2913,6 +2944,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	return len;
 
 out_free:
+	skb_zcopy_abort(skb);
 	kfree_skb(skb);
 out_unlock:
 	if (dev)
-- 
2.11.0.483.g087da7b7c-goog
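
For readers following the sizing logic in packet_alloc_skb() above, here is a
minimal userspace sketch of how the linear and paged lengths are chosen in the
two paths, mirroring the patch as posted. It is an illustration only, not
kernel code. The names pick_sizes, struct skb_sizes, SKETCH_PAGE_SIZE and
SKETCH_MAX_HEADER are hypothetical, and the values 4096 and 128 are
assumptions: in the kernel, PAGE_SIZE and MAX_HEADER depend on architecture
and configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the kernel's PAGE_SIZE and MAX_HEADER
 * vary with architecture and config.
 */
#define SKETCH_PAGE_SIZE  ((size_t)4096)
#define SKETCH_MAX_HEADER ((size_t)128)

/* Hypothetical result type: bytes in the linear area and in page frags. */
struct skb_sizes {
	size_t linear;
	size_t data_len;
};

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }
static size_t max_sz(size_t a, size_t b) { return a > b ? a : b; }

/* Mirrors the branch added to packet_alloc_skb() in this patch. */
static struct skb_sizes pick_sizes(size_t prepad, size_t reserve, size_t len,
				   size_t linear, int zerocopy)
{
	struct skb_sizes s;

	if (zerocopy) {
		/* Minimize linear, but respect header lower bound.
		 * No paged space is allocated here: user pages are
		 * attached later instead of being copied.
		 */
		s.linear = reserve + min_sz(len, max_sz(linear, SKETCH_MAX_HEADER));
		s.data_len = 0;
	} else {
		/* Under a page? Don't bother with paged skb. */
		if (prepad + len < SKETCH_PAGE_SIZE || !linear)
			linear = len;
		/* Sketch-only guard against size_t underflow below. */
		assert(linear <= len);
		s.linear = linear;
		s.data_len = len - linear;
	}
	return s;
}
```

With these assumed constants, a 65000-byte send with a 14-byte header hint
keeps 14 bytes linear and 64986 paged on the copy path, while the zerocopy
path asks for 32 + 128 = 160 linear bytes and no paged allocation.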
