From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Piggin Date: Wed, 12 Jan 2005 23:16:54 +0000 Subject: Re: page table lock patch V15 [0/7]: overview Message-Id: <41E5AFE6.6000509@yahoo.com.au> List-Id: References: <41E4BCBE.2010001@yahoo.com.au> <20050112014235.7095dcf4.akpm@osdl.org> <20050112104326.69b99298.akpm@osdl.org> In-Reply-To: <20050112104326.69b99298.akpm@osdl.org> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Andrew Morton Cc: Christoph Lameter , torvalds@osdl.org, ak@muc.de, hugh@veritas.com, linux-mm@kvack.org, linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org, benh@kernel.crashing.org Andrew Morton wrote: > Christoph Lameter wrote: > >>>Do we have measurements of the negative and/or positive impact on smaller >> >> > machines? >> >> Here is a measurement of 256M allocation on a 2 way SMP machine 2x >> PIII-500Mhz: >> >> Gb Rep Threads User System Wall flt/cpu/s fault/wsec >> 0 10 1 0.005s 0.016s 0.002s 54357.280 52261.895 >> 0 10 2 0.008s 0.019s 0.002s 43112.368 42463.566 >> >> With patch: >> >> Gb Rep Threads User System Wall flt/cpu/s fault/wsec >> 0 10 1 0.005s 0.016s 0.002s 54357.280 53439.357 >> 0 10 2 0.008s 0.018s 0.002s 44650.831 44202.412 >> >> So only a very minor improvements for old machines (this one from ~ 98). > > > OK. But have you written a test to demonstrate any performance > regressions? From, say, the use of atomic ops on ptes? > Performance wise, Christoph's never had as much of a problem as my patches because it isn't doing extra atomic operations in copy_page_range. However, it looks like it should be. For the same reason there needs to be an atomic read in handle_mm_fault. And it probably needs atomic ops in other places too, I think. So my patches cost about 7% in lmbench fork benchmark... however, I've been thinking we could take the mmap_sem for writing before doing the copy_page_range which could reduce the need for atomic ops.