From mboxrd@z Thu Jan 1 00:00:00 1970
From: john stultz
Subject: Re: [patch 11/33] fs: dcache scale subdirs
Date: Tue, 22 Jun 2010 19:03:16 -0700
Message-ID: <1277258596.1685.16.camel@localhost>
References: <20090904065142.114706411@nick.local0.net>
 <20090904065535.609317663@nick.local0.net>
 <1276787615.27822.426.camel@twins>
 <20100617165329.GA6138@laptop>
 <1277127322.1875.516.camel@laptop>
 <20100621144806.GC31679@laptop>
 <1277132103.1875.519.camel@laptop>
 <1277186557.1791.7.camel@work-vm>
 <1277191631.1875.525.camel@laptop>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Cc: Nick Piggin, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, John Kacur, Thomas Gleixner
To: Peter Zijlstra
Return-path:
In-Reply-To: <1277191631.1875.525.camel@laptop>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Tue, 2010-06-22 at 09:27 +0200, Peter Zijlstra wrote:
> On Mon, 2010-06-21 at 23:02 -0700, john stultz wrote:
> > On Mon, 2010-06-21 at 16:55 +0200, Peter Zijlstra wrote:
> > > On Tue, 2010-06-22 at 00:48 +1000, Nick Piggin wrote:
> > > > > Right, so I was staring at the -rt splat, so its John who created that
> > > > > wreckage?
> > > >
> > > > It was, but apparently they saw an RCU bug there somewhere and hit it
> > > > with the big hammer. I haven't been able to reproduce it on a non-rt
> > > > kernel yet, and I see yet why RCU is not good enough here.
> > >
> > > John, could you describe the failure you spotted?
> >
> > The problem was that the rcu_read_lock() on the dentry ascending wasn't
> > preventing d_put/d_kill from removing entries from the parent node. So
> > the next entry we tried to follow was invalid. So we were getting odd
> > oopses from select_parent().
> >
> > I'm not as familiar with the rcu rules there, so the patch I made just
> > held the locks as it went down the chain.
> > Not ideal of course, but still an improvement over the dcache_lock
> > that was there prior.
> >
> > Peter: I'm sorry, I've been out for a few days. Can you give me some
> > background on what brought this up and what -rt splat you mean?
>
> Well, you make lockdep very unhappy by locking multiple dentries
> (unbounded number) all in the same lock class.

So.. Is there a way to tell lockdep that the nesting is ok (I thought
that was what the spin_lock_nested call was doing...)?

Or is locking a (possibly quite long) chain of objects really just a
do-not-do type of operation?

thanks
-john