From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-bn1bbn0105.outbound.protection.outlook.com ([157.56.111.105]:64336 "EHLO na01-bn1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S965298AbcBQRmI (ORCPT ); Wed, 17 Feb 2016 12:42:08 -0500
Message-ID: <56C4B0E1.4090902@hpe.com>
Date: Wed, 17 Feb 2016 12:41:53 -0500
From: Waiman Long
MIME-Version: 1.0
To: Peter Zijlstra
CC: Christoph Lameter , Dave Chinner , Alexander Viro , Jan Kara , Jeff Layton , "J. Bruce Fields" , Tejun Heo , , , Ingo Molnar , Andi Kleen , Dave Chinner , Scott J Norton , Douglas Hatch
Subject: Re: [RFC PATCH 1/2] lib/percpu-list: Per-cpu list with associated per-cpu locks
References: <1455672680-7153-1-git-send-email-Waiman.Long@hpe.com> <1455672680-7153-2-git-send-email-Waiman.Long@hpe.com> <20160217095318.GO14668@dastard> <20160217110040.GB6357@twins.programming.kicks-ass.net> <20160217110520.GN6375@twins.programming.kicks-ass.net> <56C49CCA.7090805@hpe.com> <56C4AA19.1080907@hpe.com> <20160217171845.GK6357@twins.programming.kicks-ass.net>
In-Reply-To: <20160217171845.GK6357@twins.programming.kicks-ass.net>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On 02/17/2016 12:18 PM, Peter Zijlstra wrote:
> On Wed, Feb 17, 2016 at 12:12:57PM -0500, Waiman Long wrote:
>> On 02/17/2016 11:27 AM, Christoph Lameter wrote:
>>> On Wed, 17 Feb 2016, Waiman Long wrote:
>>>
>>>> I know we can use RCU for singly linked list, but I don't think we can use
>>>> that for doubly linked list as there is no easy way to make atomic changes to
>>>> both prev and next pointers simultaneously unless you are talking about 16b
>>>> cmpxchg which is only supported in some architectures.
>>> But it's supported in the most important architectures. You can fall back to
>>> spinlocks on the ones that do not support it.
>>>
>> I guess with some limitations on how the lists can be traversed, we may be
>> able to do that with RCU without lock. However, that will make the code more
>> complex and harder to verify. Given that in both my and Dave's testing the
>> contention on list insertion and deletion is almost gone from the perf
>> profile, when it used to be a bottleneck, is it really worth the effort to
>> do such a conversion?
> My initial concern was the preempt disable delay introduced by holding
> the spinlock over the entire iteration.
>
> There is no saying how many elements are on that list and there is no
> lock break.

But preempt_disable() is called at the beginning of the spin_lock()
call. So the additional preempt_disable() in percpu_list_add() is just
to cover the this_cpu_ptr() call to make sure that the cpu number
doesn't change. So we are talking about a few ns at most here.

Actually, I think I can remove the preempt_disable() and
preempt_enable() calls as we just need to put the list entry in one of
the per-cpu lists. It doesn't need to be the list of the CPU that the
current task is running on.

Cheers,
Longman
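
To illustrate the last point, here is a rough userspace sketch. It is
not the patch code: it assumes pthread mutexes in place of the per-cpu
spinlocks and sched_getcpu() in place of this_cpu_ptr(), and the pcl_*
names and NR_LISTS are made up for the example. The CPU number is used
only as a hint for picking a sublist. Each node remembers which sublist
it was put on, so a migration right after the lookup should be harmless
and nothing has to pin the task to a CPU around it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define NR_LISTS 4	/* hypothetical stand-in for the number of CPUs */

struct pcl_head;

struct pcl_node {
	struct pcl_node *prev, *next;
	struct pcl_head *owner;	/* sublist this node is on, NULL if none */
};

struct pcl_head {
	pthread_mutex_t lock;	/* protects this sublist only */
	struct pcl_node list;	/* sentinel of a circular doubly linked list */
};

static struct pcl_head pcl_heads[NR_LISTS];

static void pcl_init(void)
{
	for (int i = 0; i < NR_LISTS; i++) {
		struct pcl_head *head = &pcl_heads[i];

		pthread_mutex_init(&head->lock, NULL);
		head->list.prev = head->list.next = &head->list;
		head->list.owner = head;
	}
}

/* Add to the sublist hinted by the current CPU; no pinning is done. */
static void pcl_add(struct pcl_node *node)
{
	int cpu = sched_getcpu();	/* may be stale by the time we lock */
	struct pcl_head *head = &pcl_heads[(cpu < 0 ? 0 : cpu) % NR_LISTS];

	pthread_mutex_lock(&head->lock);
	node->owner = head;		/* remember where the node lives */
	node->next = head->list.next;
	node->prev = &head->list;
	head->list.next->prev = node;
	head->list.next = node;
	pthread_mutex_unlock(&head->lock);
}

/* Delete via the recorded owner, regardless of the CPU we run on now. */
static void pcl_del(struct pcl_node *node)
{
	struct pcl_head *head = node->owner;

	assert(head != NULL);
	pthread_mutex_lock(&head->lock);
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
	node->owner = NULL;
	pthread_mutex_unlock(&head->lock);
}

/* Walk every sublist, taking one lock at a time, and count the nodes. */
static size_t pcl_count(void)
{
	size_t n = 0;

	for (int i = 0; i < NR_LISTS; i++) {
		struct pcl_head *head = &pcl_heads[i];

		pthread_mutex_lock(&head->lock);
		for (struct pcl_node *p = head->list.next; p != &head->list; p = p->next)
			n++;
		pthread_mutex_unlock(&head->lock);
	}
	return n;
}
```

Because pcl_del() locks node->owner rather than the sublist of whatever
CPU it happens to run on, it does not matter which sublist pcl_add()
picked, only that it picked exactly one.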