From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <56C3CEDD.9010709@hpe.com>
Date: Tue, 16 Feb 2016 20:37:33 -0500
From: Waiman Long
MIME-Version: 1.0
To: Dave Chinner
CC: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Alexander Viro,
 Peter Zijlstra, Andi Kleen, Scott J Norton, Douglas Hatch
Subject: Re: [PATCH v2 1/3] lib/list_batch: A simple list insertion/deletion batching facility
References: <1454095846-19628-1-git-send-email-Waiman.Long@hpe.com>
 <1454095846-19628-2-git-send-email-Waiman.Long@hpe.com>
 <20160201004708.GQ20456@dastard>
 <56B2893C.4030609@hpe.com>
 <20160206235744.GI31407@dastard>
In-Reply-To: <20160206235744.GI31407@dastard>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org

On 02/06/2016 06:57 PM, Dave Chinner wrote:
> On Wed, Feb 03, 2016 at 06:11:56PM -0500, Waiman Long wrote:
>> On 01/31/2016 07:47 PM, Dave Chinner wrote:
>>> So at what point does simply replacing the list_head with a list_lru
>>> become more efficient than this batch processing (i.e.
>>> https://lkml.org/lkml/2015/3/10/660)? The list_lru isn't a great
>>> fit for the inode list (doesn't need any of the special LRU/memcg
>>> stuff https://lkml.org/lkml/2015/3/16/261) but it will tell us if,
>>> like Ingo suggested, moving more towards a generic per-cpu list
>>> would provide better overall performance...
>> I will take a look at the list_lru patch to see if that help. As for
>> the per-cpu list, I tried that and it didn't quite work out.
> OK, see my last email as to why Andi's patch didn't change anything.
> The list_lru implementation has a list per node, a lock per node,
> and each item is placed on the list for the node it is physically
> allocated from. Hence for local workloads, the list/lock that is
> accessed for add/remove should be local to the node and hence should
> reduce cache line contention mostly to within a single node.
>
> Cheers,
>
> Dave.

I have just sent out a new patchset using a per-cpu list with per-cpu
locks. I used the per-cpu list as the changes will be simpler and easier
to review. Please let me know your thoughts on that.

Thanks,
Longman
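
For readers following the thread, the idea common to the per-node list_lru
and the per-cpu list can be sketched in userspace C. This is only an
illustrative sketch, not the code from the patchset and not the kernel's
list_lru: all names (NR_BUCKETS, struct bucket, bucket_add, bucket_del,
buckets_total) are hypothetical, pthread mutexes stand in for kernel
spinlocks, and a caller-supplied id stands in for the CPU or NUMA node
number. Each bucket has its own list and its own lock, and each item
remembers its home bucket so that deletion takes only that bucket's lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_BUCKETS 4    /* hypothetical: stands in for the number of CPUs or nodes */

struct node {
    struct node *prev, *next;
};

/* One list and one lock per bucket, so unrelated adds/removes do not contend. */
struct bucket {
    pthread_mutex_t lock;
    struct node head;   /* sentinel of a circular doubly linked list */
    long count;
};

struct item {
    struct node link;
    struct bucket *home; /* bucket the item lives on; NULL when unlinked */
    int value;
};

static struct bucket buckets[NR_BUCKETS];

static void buckets_init(void)
{
    for (int i = 0; i < NR_BUCKETS; i++) {
        pthread_mutex_init(&buckets[i].lock, NULL);
        buckets[i].head.prev = buckets[i].head.next = &buckets[i].head;
        buckets[i].count = 0;
    }
}

/* Add an item to the bucket selected by id (assumed to be the local CPU/node). */
static void bucket_add(struct item *it, unsigned int id)
{
    struct bucket *b = &buckets[id % NR_BUCKETS];

    pthread_mutex_lock(&b->lock);
    it->home = b;
    it->link.next = b->head.next;
    it->link.prev = &b->head;
    b->head.next->prev = &it->link;
    b->head.next = &it->link;
    b->count++;
    pthread_mutex_unlock(&b->lock);
}

/* Remove an item; only the lock of its home bucket is taken. */
static void bucket_del(struct item *it)
{
    struct bucket *b = it->home;

    assert(b != NULL);  /* the item must currently be on a list */
    pthread_mutex_lock(&b->lock);
    it->link.prev->next = it->link.next;
    it->link.next->prev = it->link.prev;
    it->link.prev = it->link.next = &it->link;
    it->home = NULL;
    b->count--;
    pthread_mutex_unlock(&b->lock);
}

/* Whole-list operations have to visit every bucket, taking each lock in turn. */
static long buckets_total(void)
{
    long total = 0;

    for (int i = 0; i < NR_BUCKETS; i++) {
        pthread_mutex_lock(&buckets[i].lock);
        total += buckets[i].count;
        pthread_mutex_unlock(&buckets[i].lock);
    }
    return total;
}
```

The trade-off in this sketch is that add and remove touch only one bucket's
lock and cache lines, while anything that needs the whole list (counting,
iteration) must walk all buckets.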