From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756654AbZBVX70 (ORCPT ); Sun, 22 Feb 2009 18:59:26 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1754747AbZBVX7S (ORCPT ); Sun, 22 Feb 2009 18:59:18 -0500
Received: from one.firstfloor.org ([213.235.205.2]:49351 "EHLO
	one.firstfloor.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754004AbZBVX7R (ORCPT ); Sun, 22 Feb 2009 18:59:17 -0500
To: Mel Gorman
Cc: Linux Memory Management List, Pekka Enberg, Rik van Riel,
	KOSAKI Motohiro, Christoph Lameter, Johannes Weiner, Nick Piggin,
	Linux Kernel Mailing List, Lin Ming, Zhang Yanmin
Subject: Re: [RFC PATCH 00/20] Cleanup and optimise the page allocator
From: Andi Kleen
References: <1235344649-18265-1-git-send-email-mel@csn.ul.ie>
Date: Mon, 23 Feb 2009 00:57:37 +0100
In-Reply-To: <1235344649-18265-1-git-send-email-mel@csn.ul.ie> (Mel Gorman's message of "Sun, 22 Feb 2009 23:17:09 +0000")
Message-ID: <87prhauiry.fsf@basil.nowhere.org>
User-Agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Mel Gorman writes:

> The complexity of the page allocator has been increasing for some time
> and it has now reached the point where the SLUB allocator is doing strange
> tricks to avoid the page allocator. This is obviously bad as it may encourage
> other subsystems to try avoiding the page allocator as well.

Congratulations! That was long overdue. Haven't read the patches yet though.

> Patch 15 reduces the number of times interrupts are disabled by reworking
> what free_page_mlock() does. However, I notice that the cost of calling
> TestClearPageMlocked() is still quite high and I'm guessing it's because
> it's a locked bit operation.
> It'd be nice if it could be established if
> it's safe to use an unlocked version here. Rik, can you comment?

What machine was that again?

> Patch 16 avoids using the zonelist cache on non-NUMA machines

My suspicion is that it can even be dropped on most (all?) small NUMA
systems.

> Patch 20 gets rid of hot/cold freeing of pages because it incurs cost for
> what I believe to be very dubious gain. I'm not sure we currently gain
> anything by it but it's further discussed in the patch itself.

Yes, the hot/cold thing was always quite dubious.

> Counters are surprisingly expensive, we spent a good chunk of our time in
> functions like __dec_zone_page_state and __dec_zone_state. In a profiled
> run of kernbench, the time spent in __dec_zone_state was roughly equal to
> the combined cost of the rest of the page free path. A quick check showed
> that almost half of the time in that function is spent on line 233 alone
> which for me is;
>
>	(*p)--;
>
> That's worth a separate investigation but it might be a case that
> manipulating int8_t on the machine I was using for profiling is unusually
> expensive.

What machine was that?

In general I wouldn't expect that to be so expensive, even on a system
with slow char operations. It sounds more like a cache miss or a cache
line bounce. You could possibly confirm by using appropriate performance
counters.

> Converting this to an int might be faster but the increased
> memory consumption and cache footprint might be a problem. Opinions?

One possibility would be to move the zone statistics to allocated
per cpu data. Or perhaps just stop counting per zone at all and only
count per cpu.

> The downside is that the patches do increase text size because of the
> splitting of the fast path into one inlined blob and the slow path into a
> number of other functions. On my test machine, text increased by 1.2K so
> I might revisit that again and see how much of a difference it really made.
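
A rough userspace sketch of the "only count per cpu" idea from the
counter discussion above. This is not the kernel's vmstat code: the
names (percpu_counter_sketch, nr_free, counter_add, counter_read), the
fixed CPU count and the 64-byte cache line size are all assumptions made
up for illustration. Each CPU gets its own cache-line-aligned slot, so
an update never touches a line that another CPU is writing.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4   /* hypothetical: fixed CPU count for the demo */
#define CACHE_LINE     64  /* assumed cache line size in bytes */

/*
 * One counter per CPU. _Alignas (C11) pads every slot out to its own
 * cache line, so updates from different CPUs do not bounce a shared line.
 */
struct percpu_counter_sketch {
	_Alignas(CACHE_LINE) long count;
};

static struct percpu_counter_sketch nr_free[NR_CPUS_SKETCH];

/* Fast path: only the calling CPU's slot is written. */
static void counter_add(int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	nr_free[cpu].count += delta;
}

/*
 * Slow path: a reader sums all slots. With concurrent writers the
 * result is only approximate, which is the price of the cheap update.
 */
static long counter_read(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += nr_free[cpu].count;
	return sum;
}
```

The trade-off is the usual one: updates become a plain add on local
data, while reading the total costs a walk over all CPUs. In the sketch
the caller passes its CPU number in by hand; real per-cpu code would get
that from the environment.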
>
> That all said, I'm seeing good results on actual benchmarks with these
> patches.
>
> o On many machines, I'm seeing a 0-2% improvement on kernbench. The dominant

Neat.

> So, by and large it's an improvement of some sort.

That seems like an understatement.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.