netdev.vger.kernel.org archive mirror
From: Jason Wang <jasowang@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] tuntap: rx batching
Date: Tue, 15 Nov 2016 16:08:52 +0800	[thread overview]
Message-ID: <4eb99b34-29a5-fb67-a484-c345e9a2513d@redhat.com> (raw)
In-Reply-To: <20161115053435-mutt-send-email-mst@kernel.org>



On 2016-11-15 11:41, Michael S. Tsirkin wrote:
> On Tue, Nov 15, 2016 at 11:14:48AM +0800, Jason Wang wrote:
>>
>>
>> On 2016-11-12 00:20, Michael S. Tsirkin wrote:
>>> On Fri, Nov 11, 2016 at 12:28:38PM +0800, Jason Wang wrote:
>>>>
>>>> On 2016-11-11 12:17, John Fastabend wrote:
>>>>> On 16-11-10 07:31 PM, Michael S. Tsirkin wrote:
>>>>>> On Fri, Nov 11, 2016 at 10:07:44AM +0800, Jason Wang wrote:
>>>>>>>
>>>>>>> On 2016-11-10 00:38, Michael S. Tsirkin wrote:
>>>>>>>> On Wed, Nov 09, 2016 at 03:38:31PM +0800, Jason Wang wrote:
>>>>>>>>> The backlog was used for tuntap rx, but it can only process one
>>>>>>>>> packet at a time since it is scheduled synchronously from sendmsg()
>>>>>>>>> in process context. This leads to poor cache utilization, so this
>>>>>>>>> patch does some batching before calling into rx NAPI. This is done
>>>>>>>>> through:
>>>>>>>>>
>>>>>>>>> - accepting MSG_MORE as a hint from the sendmsg() caller: if it is
>>>>>>>>>   set, batch the packet temporarily on a linked list and submit the
>>>>>>>>>   whole batch once MSG_MORE is cleared.
>>>>>>>>> - implementing a tuntap-specific NAPI handler for processing such
>>>>>>>>>   batches. (This could be done by extending the backlog to accept an
>>>>>>>>>   skb list, but a tun-specific handler looks cleaner and is easier
>>>>>>>>>   to extend in the future.)
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>>> So why do we need an extra queue?
>>>>>>> The idea was borrowed from the backlog: it allows some bulking and
>>>>>>> avoids taking the spinlock on each dequeue.
>>>>>>>
>>>>>>>> This is not what hardware devices do.
>>>>>>>> How about adding the packet to the queue unconditionally and
>>>>>>>> deferring signalling until we get a sendmsg without MSG_MORE?
>>>>>>> Then you need to touch the spinlock when dequeuing each packet.
>>>>> Random thought: I have a cmpxchg ring I am using for the qdisc work
>>>>> that could possibly replace the spinlock implementation. I haven't
>>>>> figured out the resizing API yet because I did not need it, but I
>>>>> assume it could help here and let you dequeue multiple skbs in one
>>>>> operation.
>>>>>
>>>>> I can post the latest version if useful; an older version is also
>>>>> somewhere on patchwork.
>>>>>
>>>>> .John
>>>>>
>>>>>
>>>> Looks useful here, and I can compare the performance once you post it.
>>>>
>>>> One question: can we extend skb_array to support that?
>>>>
>>>> Thanks
>>> I'd like to start with a simple patch adding NAPI with one queue, then
>>> add optimization patches on top.
>>
>> The point is that tun is using the backlog, which uses two queues
>> (process_queue and input_pkt_queue).
>>
>> How about something like:
>>
>> 1) NAPI support with skb_array
> I would start with just a write-queue linked list. It all runs on a single
> CPU normally,

True for virt, but I'm not sure about the others. If we have multiple
senders at the same time, the current code scales very well.
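
To make the comparison concrete, here is a rough sketch (illustrative
only, not code from this patch; the rx_batch_* names are invented) of
the backlog-style two-queue pattern I'm referring to. Senders append to
an input queue under its lock, while the consumer splices everything
pending into a private process queue with a single lock acquisition and
then dequeues without touching the lock again:

#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include <linux/spinlock.h>

/* Hypothetical per-queue state, mirroring input_pkt_queue/process_queue. */
struct rx_batch_queue {
	struct sk_buff_head input_queue;	/* filled by senders */
	struct sk_buff_head process_queue;	/* drained by the rx handler */
};

/* Sender side: one locked enqueue per packet, as today. */
static void rx_batch_enqueue(struct rx_batch_queue *q, struct sk_buff *skb)
{
	skb_queue_tail(&q->input_queue, skb);
}

/* Consumer side: move the whole pending batch with a single lock
 * acquisition, then process packets without taking the lock per skb.
 */
static int rx_batch_process(struct rx_batch_queue *q, int budget)
{
	struct sk_buff *skb;
	unsigned long flags;
	int work = 0;

	spin_lock_irqsave(&q->input_queue.lock, flags);
	skb_queue_splice_tail_init(&q->input_queue, &q->process_queue);
	spin_unlock_irqrestore(&q->input_queue.lock, flags);

	while (work < budget &&
	       (skb = __skb_dequeue(&q->process_queue)) != NULL) {
		netif_receive_skb(skb);
		work++;
	}

	return work;
}

With multiple senders only the enqueue lock is contended; the dequeue
path stays lock-free on the private queue, which is why the current
backlog-based code scales reasonably well.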

>   so the nice reductions of cache line bounces due to skb
> array should never materialize.
>
> While we are at it, limiting the size of the queue might be a good idea.
> Kind of like TUNSETSNDBUF, but 1. actually working: instead of tracking
> packets within the net stack, we make sndbuf track the internal buffer

Got your point, will start with a simple skb list.
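
For reference, roughly what I have in mind as the first step (a sketch
only, not the final patch; tun_rx_batch, TUN_RX_BATCH_MAX and the flush
helper are placeholder names): keep the skb on a plain list while the
sender keeps hinting MSG_MORE, and flush the whole list to the rx path
once a sendmsg arrives without MSG_MORE or the list grows too long:

#include <linux/skbuff.h>
#include <linux/netdevice.h>

/* Placeholder cap so a sender that always sets MSG_MORE cannot
 * delay packets forever; the real limit is to be decided.
 */
#define TUN_RX_BATCH_MAX	64

struct tun_rx_batch {
	struct sk_buff_head queue;	/* held back while MSG_MORE is set */
};

static void tun_rx_batch_flush(struct tun_rx_batch *b)
{
	struct sk_buff *skb;

	/* Hand the whole batch to the stack in one go. */
	while ((skb = __skb_dequeue(&b->queue)) != NULL)
		netif_receive_skb(skb);
}

/* Called from sendmsg context, e.g. with more = msg->msg_flags & MSG_MORE.
 * Locking against other senders is omitted here for brevity.
 */
static void tun_rx_batch_add(struct tun_rx_batch *b, struct sk_buff *skb,
			     bool more)
{
	__skb_queue_tail(&b->queue, skb);

	if (!more || skb_queue_len(&b->queue) >= TUN_RX_BATCH_MAX)
		tun_rx_batch_flush(b);
}

A later NAPI variant would keep the same list but schedule the tun NAPI
instance instead of calling the flush helper directly.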

Thanks


Thread overview: 25+ messages
2016-11-09  7:38 [PATCH 1/3] tuntap: rx batching Jason Wang
2016-11-09  7:38 ` [PATCH 2/3] vhost: better detection of available buffers Jason Wang
2016-11-09 19:57   ` Michael S. Tsirkin
2016-11-11  2:18     ` Jason Wang
2016-11-11  3:41       ` Michael S. Tsirkin
2016-11-11  4:18         ` Jason Wang
2016-11-11 16:20           ` Michael S. Tsirkin
2016-11-15  3:16             ` Jason Wang
2016-11-15  3:28               ` Michael S. Tsirkin
2016-11-15  8:00                 ` Jason Wang
2016-11-15 14:46                   ` Michael S. Tsirkin
2016-11-09  7:38 ` [PATCH 3/3] vhost_net: tx support batching Jason Wang
2016-11-09 20:05   ` Michael S. Tsirkin
2016-11-11  2:27     ` Jason Wang
2016-11-09 16:38 ` [PATCH 1/3] tuntap: rx batching Michael S. Tsirkin
2016-11-11  2:07   ` Jason Wang
2016-11-11  3:31     ` Michael S. Tsirkin
2016-11-11  4:10       ` Jason Wang
2016-11-11  4:17       ` John Fastabend
2016-11-11  4:28         ` Jason Wang
2016-11-11  4:45           ` John Fastabend
2016-11-11 16:20           ` Michael S. Tsirkin
2016-11-15  3:14             ` Jason Wang
2016-11-15  3:41               ` Michael S. Tsirkin
2016-11-15  8:08                 ` Jason Wang [this message]
