From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47303D07.4050404@redhat.com>
Date: Tue, 06 Nov 2007 05:08:07 -0500
From: Chris Snook
User-Agent: Thunderbird 2.0.0.5 (X11/20070719)
To: Don Porter
CC: linux-kernel@vger.kernel.org
Subject: Re: [RFC/PATCH] Optimize zone allocator synchronization
References: <20071104195212.GF16354@olive-green.cs.utexas.edu>
In-Reply-To: <20071104195212.GF16354@olive-green.cs.utexas.edu>

Don Porter wrote:
> From: Donald E. Porter
>
> In the bulk page allocation/free routines in mm/page_alloc.c, the zone
> lock is held across all iterations.  For certain parallel workloads, I
> have found that releasing and reacquiring the lock for each iteration
> yields better performance, especially at higher CPU counts.  For
> instance, kernel compilation is sped up by 5% on an 8 CPU test
> machine.  In most cases, there is no significant effect on performance
> (although the effect tends to be slightly positive).  This seems quite
> reasonable for the very small scope of the change.
>
> My intuition is that this patch prevents smaller requests from waiting
> on larger ones.  While grabbing and releasing the lock within the loop
> adds a few instructions, it can lower the latency for a particular
> thread's allocation, which is often on the thread's critical path.
> Lowering the average latency for allocation can increase system
> throughput.
>
> More detailed information, including data from the tests I ran to
> validate this change, is available at
> http://www.cs.utexas.edu/~porterde/kernel-patch.html .
>
> Thanks in advance for your consideration and feedback.

That's an interesting insight.  My intuition is that Nick Piggin's
recently-posted ticket spinlock patches[1] will reduce the need for
this patch, though it may be useful to have both.  Can you benchmark
again with only ticket spinlocks, and with ticket spinlocks plus this
patch?  You'll probably want to use 2.6.24-rc1 as your baseline, due
to the x86 architecture merge.

	-- Chris

[1] http://lkml.org/lkml/2007/11/1/123