From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754599AbcE3J7k (ORCPT ); Mon, 30 May 2016 05:59:40 -0400
Received: from mx1.redhat.com ([209.132.183.28]:47683 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751137AbcE3J7j
	(ORCPT ); Mon, 30 May 2016 05:59:39 -0400
Subject: Re: [PATCH v5 0/2] skb_array: array based FIFO for skbs
To: "Michael S. Tsirkin" , linux-kernel@vger.kernel.org
References: <1464000201-15560-1-git-send-email-mst@redhat.com>
Cc: Eric Dumazet , davem@davemloft.net, netdev@vger.kernel.org,
	Steven Rostedt , brouer@redhat.com
From: Jason Wang
Message-ID: <574C0F05.1040500@redhat.com>
Date: Mon, 30 May 2016 17:59:33 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
	Thunderbird/38.8.0
MIME-Version: 1.0
In-Reply-To: <1464000201-15560-1-git-send-email-mst@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.29]); Mon, 30 May 2016 09:59:38 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016年05月23日 18:43, Michael S. Tsirkin wrote:
> This is in response to the proposal by Jason to make the tun
> rx packet queue lockless using a circular buffer.
> My testing seems to show that, at least for the common use case
> in networking, which isn't lockless, a circular buffer
> with indices does not perform that well, because
> each index access causes a cache line to bounce between
> CPUs, and index access causes stalls due to the dependency.
>
> By comparison, an array of pointers where NULL means invalid
> and !NULL means valid can be updated without messing up barriers
> at all and does not have this issue.
>
> On the flip side, cache pressure may be caused by using large queues.
> tun has a queue of 1000 entries by default and that's 8K.
> At this point I'm not sure this can be solved efficiently.
> The correct solution might be sizing the queues appropriately.
>
> Here's an implementation of this idea: it can be used more
> or less whenever sk_buff_head can be used, except you need
> to know the queue size in advance.
>
> It's using skb pointers, but switching to void * would be easy at the
> cost of type safety, though it appears that people want lockless push
> etc., so I'm not sure of the value.
>
> I didn't implement resizing, but it's possible by holding
> both consumer and producer locks.
>
> I think this code works fine without any extra memory barriers since we
> always read and write the same location, so the accesses can not be
> reordered.
> Multiple writes of the same value into memory would mess things up
> for us; I don't think compilers would do it, though.
> But if people feel it's better to be safe wrt compiler optimizations,
> specifying the queue as volatile would probably do it in a cleaner way
> than converting all accesses to READ_ONCE/WRITE_ONCE. Thoughts?
>
> The only issue is with calls within a loop using the __skb_array_XXX
> accessors - in theory the compiler could hoist accesses out of the loop.
>
> Following volatile-considered-harmful.txt I merely
> documented that callers that busy-poll should invoke cpu_relax().
> Most people will use the external skb_array_XXX APIs with a spinlock,
> so this should not be an issue for them.
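As a rough illustration of the pointer-array scheme described above, here is a
minimal userspace sketch. The names (sketch_ring, sketch_produce,
sketch_consume) are made up for illustration and are not the actual skb_array
API, and the compiler-barrier caveats discussed in the cover letter still
apply:

/*
 * Single-producer/single-consumer ring of pointers.  NULL means "slot
 * empty", non-NULL means "slot holds a valid entry", so neither side
 * ever has to read the other side's index: each side keeps a private
 * index and the only shared cache lines are the slots themselves.
 */
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 256			/* must be a power of two */

struct sketch_ring {
	void *queue[SKETCH_RING_SIZE];
	unsigned int producer;			/* touched only by the producer */
	unsigned int consumer;			/* touched only by the consumer */
};

/* Producer side: returns false if ptr is NULL or the ring is full. */
static bool sketch_produce(struct sketch_ring *r, void *ptr)
{
	if (!ptr)				/* NULL is reserved for "empty" */
		return false;
	if (r->queue[r->producer])		/* consumer has not freed this slot yet */
		return false;
	r->queue[r->producer] = ptr;
	r->producer = (r->producer + 1) & (SKETCH_RING_SIZE - 1);
	return true;
}

/* Consumer side: returns NULL if the ring is empty. */
static void *sketch_consume(struct sketch_ring *r)
{
	void *ptr = r->queue[r->consumer];

	if (ptr) {
		r->queue[r->consumer] = NULL;
		r->consumer = (r->consumer + 1) & (SKETCH_RING_SIZE - 1);
	}
	return ptr;
}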
>
> changes since v4 (v3 was never posted):
> 	documentation
> 	dropped SKB_ARRAY_MIN_SIZE heuristic
> 	unit test (in userspace, included as patch 2)
>
> changes since v2:
> 	fixed integer overflow pointed out by Eric.
> 	added some comments.
>
> changes since v1:
> 	fixed bug pointed out by Eric.
>
> Michael S. Tsirkin (2):
>   skb_array: array based FIFO for skbs
>   skb_array: ring test
>
>  include/linux/skb_array.h         | 127 +++++++++++++++++++++++++++++
>  tools/virtio/ringtest/skb_array.c | 167 ++++++++++++++++++++++++++++++++++++++
>  tools/virtio/ringtest/Makefile    |   4 +-
>  3 files changed, 297 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/skb_array.h
>  create mode 100644 tools/virtio/ringtest/skb_array.c

I changed tun to use skb_array; it looks to give about 5% better performance
than the skb ring. And we usually don't need to touch BHs during consume and
produce (e.g. in the case of tun).

Thanks
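For reference, here is a rough sketch of how a tun-style user might sit on top
of the posted API, along the lines described above. The function names follow
the skb_array_XXX naming from the cover letter, but the exact signatures,
return conventions and queue length used here are assumptions for
illustration, not taken from the series:

/*
 * The producer runs from the transmit path and the consumer from tun's
 * read() path in process context, so the plain (non-_bh) variants are
 * assumed to be sufficient, as noted above.
 */
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/skb_array.h>
#include <linux/skbuff.h>

#define SKETCH_QUEUE_LEN 256		/* illustrative; tun defaults to ~1000 entries */

static struct skb_array sketch_array;

static int sketch_array_setup(void)
{
	/* Assumed to allocate SKETCH_QUEUE_LEN pointer slots. */
	return skb_array_init(&sketch_array, SKETCH_QUEUE_LEN, GFP_KERNEL);
}

/* Transmit side: queue the skb, drop it if the array is full. */
static int sketch_enqueue(struct sk_buff *skb)
{
	if (skb_array_produce(&sketch_array, skb)) {
		kfree_skb(skb);
		return -ENOBUFS;
	}
	return 0;
}

/* Read side (process context): pull one skb, NULL when empty. */
static struct sk_buff *sketch_dequeue(void)
{
	return skb_array_consume(&sketch_array);
}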