From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Piggin Subject: Re: [PATCH 17/46] fs: Use rename lock and RCU for multi-step operations Date: Wed, 19 Jan 2011 09:42:34 +1100 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Nick Piggin , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Sage Weil To: Yehuda Sadeh Weinraub Return-path: In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Wed, Jan 19, 2011 at 9:32 AM, Yehuda Sadeh Weinraub wrote: > On Sat, Nov 27, 2010 at 1:44 AM, Nick Piggin wrot= e: >> The remaining usages for dcache_lock is to allow atomic, multi-step = read-side >> operations over the directory tree by excluding modifications to the= tree. >> Also, to walk in the leaf->root direction in the tree where we don't= have >> a natural d_lock ordering. >> >> This could be accomplished by taking every d_lock, but this would me= an a >> huge number of locks and actually gets very tricky. >> >> Solve this instead by using the rename seqlock for multi-step read-s= ide >> operations, retry in case of a rename so we don't walk up the wrong = parent. >> Concurrent dentry insertions are not serialised against. =A0Concurre= nt deletes >> are tricky when walking up the directory: our parent might have been= deleted >> when dropping locks so also need to check and retry for that. >> >> We can also use the rename lock in cases where livelock is a worry (= and it >> is introduced in subsequent patch). >> >> Signed-off-by: Nick Piggin > .. >> @@ -237,6 +238,7 @@ static struct dentry *d_kill(struct dentry *dent= ry, struct dentry *parent) >> =A0 =A0 =A0 =A0__releases(dcache_inode_lock) >> =A0 =A0 =A0 =A0__releases(dcache_lock) >> =A0{ >> + =A0 =A0 =A0 dentry->d_parent =3D NULL; >> =A0 =A0 =A0 =A0list_del(&dentry->d_u.d_child); >> =A0 =A0 =A0 =A0if (parent) >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0spin_unlock(&parent->d_lock); > > There's an issue with ceph as it references the > dentry->d_parent(->d_inode) at dentry_release(), so setting > dentry->d_parent to NULL here doesn't work with ceph. Though there is > some workaround for it, we would like to be sure that this one is > really required so that we don't exacerbate the ugliness. The > workaround is to keep a pointer to the parent inode in the private > dentry structure, which will be referenced only at the .release() > callback. This is clearly not ideal. Hmm, I'll have to think about it. Probably we can check for d_count =3D=3D 0 rather than parent !=3D NULL I think?