netdev.vger.kernel.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, kubakici@wp.pl
Subject: Re: [PATCH net-next V2 1/3] tap: use build_skb() for small packet
Date: Wed, 16 Aug 2017 06:55:15 +0300
Message-ID: <20170816064951-mutt-send-email-mst@kernel.org>
In-Reply-To: <1502855120.4936.89.camel@edumazet-glaptop3.roam.corp.google.com>

On Tue, Aug 15, 2017 at 08:45:20PM -0700, Eric Dumazet wrote:
> On Fri, 2017-08-11 at 19:41 +0800, Jason Wang wrote:
> > In the past we used tun_alloc_skb(), which calls sock_alloc_send_pskb(),
> > to allocate the skb. This socket-based method is not suitable for
> > high-speed userspace such as virtualization, which usually:
> > 
> > - ignores sk_sndbuf (INT_MAX) and expects to receive the packet as fast
> >   as possible
> > - doesn't want to be blocked at sendmsg()
> > 
> > To eliminate the above overheads, this patch tries to use build_skb()
> > for small packets. We will do this only when the following conditions
> > are all met:
> > 
> > - TAP instead of TUN
> > - sk_sndbuf is INT_MAX
> > - caller doesn't want to be blocked
> > - zerocopy is not used
> > - packet size is small enough to use build_skb()
> > 
> > Pktgen from guest to host shows ~11% improvement for rx pps of tap:
> > 
> > Before: ~1.70Mpps
> > After : ~1.88Mpps
> > 
> > More importantly, this makes it possible to implement XDP for tap
> > before creating skbs.
> 
> 
> Well well well.
> 
> You do realize that tun_build_skb() is not thread safe ?

The issue is the alloc frag, isn't it?
I guess for now we can limit this to XDP mode only, and
just allocate full pages in that mode.
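
For illustration only: a minimal userspace C sketch of why an unlocked
fragment allocator breaks once two writers share it, and why giving each
thread its own allocator state avoids that without a lock. It assumes the
race really is in shared allocator state. frag_pool, frag_alloc() and
frag_alloc_per_thread() are hypothetical names, not code from
drivers/net/tun.c and not the kernel's page_frag API.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAG_PAGE_SIZE 4096u

/* Hypothetical stand-in for a page-fragment allocator: one page plus a
 * bump offset. This is not the kernel's struct page_frag. */
struct frag_pool {
	uint8_t page[FRAG_PAGE_SIZE];
	size_t offset;
};

/* Carve len bytes out of the pool. The read of pool->offset and the
 * later write are separate steps, so two threads sharing one pool can
 * both observe the same offset and be handed overlapping buffers. */
static void *frag_alloc(struct frag_pool *pool, size_t len)
{
	size_t start = pool->offset;

	if (len > FRAG_PAGE_SIZE - start)
		return NULL;	/* a real allocator would refill here */
	pool->offset = start + len;
	return pool->page + start;
}

/* One pool per thread: no other thread can touch this offset, so the
 * unlocked read-modify-write in frag_alloc() is safe here. */
static _Thread_local struct frag_pool thread_pool;

static void *frag_alloc_per_thread(size_t len)
{
	return frag_alloc(&thread_pool, len);
}
```

With a single caller, consecutive allocations come back adjacent and
non-overlapping. The overlap can only appear when two threads run
frag_alloc() on the same pool at the same time, which the per-thread
variant rules out by construction.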


> general protection fault: 0000 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 3982 Comm: syz-executor0 Not tainted 4.13.0-rc5-next-20170815+ #3
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: ffff880069f265c0 task.stack: ffff880067688000
> RIP: 0010:__read_once_size include/linux/compiler.h:276 [inline]
> RIP: 0010:compound_head include/linux/page-flags.h:146 [inline]
> RIP: 0010:put_page include/linux/mm.h:811 [inline]
> RIP: 0010:__skb_frag_unref include/linux/skbuff.h:2743 [inline]
> RIP: 0010:skb_release_data+0x26c/0x790 net/core/skbuff.c:568
> RSP: 0018:ffff88006768ef20 EFLAGS: 00010206
> RAX: 00d70cb5b39acdeb RBX: dffffc0000000000 RCX: 1ffff1000ced1e13
> RDX: 0000000000000000 RSI: ffff88003ec28c38 RDI: 06b865ad9cd66f59
> RBP: ffff88006768f040 R08: ffffea0000ee74a0 R09: ffffed0007ab4200
> R10: 0000000000028c28 R11: 0000000000000010 R12: ffff88003c5581b0
> R13: ffffed000ced1dfb R14: 1ffff1000ced1df3 R15: 06b865ad9cd66f39
> FS:  00007ffbc9ef7700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000002001aff0 CR3: 000000003d623000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  skb_release_all+0x4a/0x60 net/core/skbuff.c:631
>  __kfree_skb net/core/skbuff.c:645 [inline]
>  kfree_skb+0x15d/0x4c0 net/core/skbuff.c:663
>  __netif_receive_skb_core+0x10f8/0x33d0 net/core/dev.c:4425
>  __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4456
>  netif_receive_skb_internal+0x10b/0x5e0 net/core/dev.c:4527
>  netif_receive_skb+0xae/0x390 net/core/dev.c:4551
>  tun_rx_batched.isra.43+0x5e7/0x860 drivers/net/tun.c:1221
>  tun_get_user+0x11dd/0x2150 drivers/net/tun.c:1542
>  tun_chr_write_iter+0xd8/0x190 drivers/net/tun.c:1568
>  call_write_iter include/linux/fs.h:1742 [inline]
>  new_sync_write fs/read_write.c:457 [inline]
>  __vfs_write+0x684/0x970 fs/read_write.c:470
>  vfs_write+0x189/0x510 fs/read_write.c:518
>  SYSC_write fs/read_write.c:565 [inline]
>  SyS_write+0xef/0x220 fs/read_write.c:557
>  entry_SYSCALL_64_fastpath+0x1f/0xbe
> RIP: 0033:0x40bab1
> RSP: 002b:00007ffbc9ef6c00 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
> RAX: ffffffffffffffda RBX: 0000000000000036 RCX: 000000000040bab1
> RDX: 0000000000000036 RSI: 0000000020002000 RDI: 0000000000000003
> RBP: 0000000000a5f870 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
> R13: 0000000000000000 R14: 00007ffbc9ef79c0 R15: 00007ffbc9ef7700
> Code: c6 e8 c9 78 8d fd 4c 89 e0 48 c1 e8 03 80 3c 18 00 0f 85 93 04 00 00 4d 8b 3c 24 41 c6 45 00 00 49 8d 7f 20 48 89 f8 48 c1 e8 03 <80> 3c 18 00 0f 85 6b 04 00 00 41 80 7d 00 00 49 8b 47 20 0f 85 
> RIP: __read_once_size include/linux/compiler.h:276 [inline] RSP: ffff88006768ef20
> RIP: compound_head include/linux/page-flags.h:146 [inline] RSP: ffff88006768ef20
> RIP: put_page include/linux/mm.h:811 [inline] RSP: ffff88006768ef20
> RIP: __skb_frag_unref include/linux/skbuff.h:2743 [inline] RSP: ffff88006768ef20
> RIP: skb_release_data+0x26c/0x790 net/core/skbuff.c:568 RSP: ffff88006768ef20
> ---[ end trace 54050eb1ec52ff83 ]---
