From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754589AbZALRnt (ORCPT ); Mon, 12 Jan 2009 12:43:49 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752539AbZALRnl (ORCPT ); Mon, 12 Jan 2009 12:43:41 -0500 Received: from e3.ny.us.ibm.com ([32.97.182.143]:41295 "EHLO e3.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751936AbZALRnk (ORCPT ); Mon, 12 Jan 2009 12:43:40 -0500 Date: Mon, 12 Jan 2009 09:43:32 -0800 From: "Paul E. McKenney" To: Manfred Spraul Cc: Lai Jiangshan , linux-kernel@vger.kernel.org, akpm@linux-foundation.org Subject: Re: [RFC, PATCH] kernel/rcu: add kfree_rcu Message-ID: <20090112174332.GA14675@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <200901021159.n02BxDLg024728@mail.q-ag.de> <49604BAD.5010405@cn.fujitsu.com> <4960603F.2030002@colorfullife.com> <496073AB.2030400@cn.fujitsu.com> <4960A9E8.3090309@colorfullife.com> <20090104200658.GN6958@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090104200658.GN6958@linux.vnet.ibm.com> User-Agent: Mutt/1.5.15+20070412 (2007-04-11) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Jan 04, 2009 at 12:06:58PM -0800, Paul E. McKenney wrote: > On Sun, Jan 04, 2009 at 01:22:00PM +0100, Manfred Spraul wrote: > > Lai Jiangshan wrote: > >> I have not posted it. -:) > >> > > Could you post it? > > > > Paul: What would break if we stop processing rcu entries in (cpu) order? > > If I understand, you are suggesting that a given CPU process its RCU > callbacks out of order. This would break rcu_barrier(), so please do > not do this. > > If I misunderstood what you are suggesting, please enlighten me! One other thing that might be really cool is for memory freed via RCU to be treated as if it was cache-cold, which it is unless the RCU callback needs to write to the memory block. In the case of kfree_rcu(), the callback should not need to do writes, so it might make sense to handle the block differently than the typical hot-in-cache free. Thanx, Paul > > The head->func(head) in rcu_do_batch() is probably a nightmare for the > > branch target predictor. > > > > What about: > > - shrinking struct rcu_head to just a pointer (let's start with the goodie) > > - Adding a register_rcu_callback() function. > > It allocates the per-cpu storage for the rcu grace period lists. > > Seperate lists for each registered callback - thus no need to copy the > > callback target into each rcu_head structure. > > It returns a pointer/handle to these lists. > > - call_rcu gets that handle instead of the plain function pointer. > > - rcu_do_batch enumerates all registered callbacks. Thus first all > > callback_struct->func(head) calls for the first registered callback, then > > the calls for the 2nd callback, etc. > > Better for the icache, better for the branch predictor. > > Hmmm... I guess that rcu_barrier() could put a callback on each of the > resulting per-CPU lists for each CPU. Making rcu_barrier() more > expensive is probably not a problem. But there would need to be a way > of marking rcu_barrier()'s rcu_head structures, perhaps the bottom bit > of the pointer (shudder!). > > The rcu_offline code will of course need to traverse these lists in > order to move the callbacks from an outgoing CPU. > > It would also be necessary to inspect the current call_rcu() invocations > in the kernel (not too big a job, as there are only about 100 of them). > If there are any that rely on callbacks being invoked in order, these > would need to be addressed if we are to do something like what you > are suggesting. I do not recall ever suggesting that people rely on > such ordering, but given that people can read the code and see that > rcu_barrier() already relies on it... > > So if we do go this way, we will need to update the documentation. > > The deep embedded guys would like a single-pointer rcu_head, and your > approach seems better than the one I came up with a couple of years ago > on page 11 of: > > http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf > > At least assuming that the problems can be resolved. > > I don't see how this helps the icache at all, but could see how it might > help branch prediction. > > > Paul: Do you have a test case that is suitable for benchmarking rcu? > > Any workloads were rcu appears significantly in oprofile? > > And: Do you know how many rcu entries are typically alive? How much memory > > is used for the function pointers? > > The test cases I know of are those used to validate the performance of > various RCU patches, most of which have been quite insensitive to the > update-side overhead. The only workloads that I am aware of where RCU > update-side processing shows up are those running on hundreds of CPUs > (hence hierarchical RCU). Some workloads have many thousands of RCU > callbacks in flight -- I believe that Dipankar Sarma measured something > like 1600 per grace period on a file-system benchmark some years back. > > The amount of memory used for the function pointers can be large, though > many cases now union this space with other storage (e.g., struct dentry). > The deep embedded guys have worried about it in the past, though I have > not heard much from them in the past few years -- something about even > cellphones having hundreds of megabytes of DRAM, I guess. ;-) > > So, in short, I am not sure that this will be worth the increase in code > complexity, but it does sound like an interesting possibility. > > Thanx, Paul > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/