From mboxrd@z Thu Jan 1 00:00:00 1970
From: Denys Vlasenko
Subject: Re: [rfc][patch] store-free path walking
Date: Thu, 8 Oct 2009 15:12:08 +0200
Message-ID: <1158166a0910080612h29d93d50y875d5305cd4d985f@mail.gmail.com>
References: <20091006101414.GM5216@kernel.dk> <20091007164622.GX30316@wotan.suse.de> <87eipfymcv.fsf@basil.nowhere.org> <20091007210651.GB1656@one.firstfloor.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Cc: Andi Kleen, Nick Piggin, Jens Axboe, Linux Kernel Mailing List, linux-fsdevel@vger.kernel.org, Ravikiran G Thirumalai, Peter Zijlstra
To: Linus Torvalds
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Wed, Oct 7, 2009 at 11:57 PM, Linus Torvalds wrote:
> This, btw, is exactly the kind of thing we saw with some of the
> non-temporal work, when we used nontemporal stores to copy pages on COW
> faults, or when doing pre-zeroing of pages. You get rid of some of the
> hot-spots in the kernel, and you then replace them with user space taking
> the cache misses in random spots instead. The kernel profile looks better,
> and system time may go down, but actual performance never went down - you
> just moved your cache miss cost from one place to another.

A few years ago, when K7s were not yet ancient, after hearing arguments
for and against non-temporal stores, I decided to finally figure it out
for myself.

I tested a kernel build workload on two kernels with only one
difference: clear_page with and without non-temporal stores.

The "non-temporal stores" kernel was faster, not slower.
Just a little bit, but reproducibly.
--
vda