From: Louis Rilling <Louis.Rilling@kerlabs.com>
To: Joel.Becker@oracle.com
Cc: ocfs2-devel@oss.oracle.com, linux-kernel@vger.kernel.org
Subject: [Ocfs2-devel] [BUG] deadlock between configfs_rmdir() and sys_rename() (WAS Re: [RFC][PATCH 4/4] configfs: Make multiple default_group) destructions lockdep friendly
Date: Wed, 11 Jun 2008 16:10:10 +0200 [thread overview]
Message-ID: <20080611141010.GP18153@localhost> (raw)
In-Reply-To: <20080610173654.GA23829@ca-server1.us.oracle.com>
On Tue, Jun 10, 2008 at 10:36:54AM -0700, Joel Becker wrote:
> On Tue, Jun 10, 2008 at 12:14:27PM +0200, Louis Rilling wrote:
> > On Mon, Jun 09, 2008 at 06:58:00PM -0700, Joel Becker wrote:
> > > I'm not against the latter AT ALL. I just haven't come up with
> > > it yet - we can't remove parts of the tree, it must be all or none.
> > > Hence, we lock them all speculatively.
> >
> > I'm slowly thinking about a solution, but I don't know the VFS enough
> > yet, especially regarding dentry invalidation and locking. Would it be possible
> > to start an rmdir() by moving the group to some unreachable place? We could
> > probably use the mutex of the configfs subsystem and enlarge its scope to
> > protect against concurrent mkdir(), check for user-created items under the
> > group (and descendent default groups) to remove, invalidate all dentries and
> > actually remove the group. Do you think that something is feasible in this way?
>
> Nope, because you may have live objects below you - the rmdir
> should fail, and nothing should change. Sure, you could put it back,
> but in the middle there is a period where another process tries to look
> at the tree and gets ENOENT. That's not right.
Sure. I was more thinking about moving the to-be-removed group only once we ware
sure that nobody would use it anymore (thanks to the subsys mutex enlarging its
scope for instance). Anyway, see below.
> But blocking lookup another way might work. If we keep that
> rename process out of looking up its targets (blocked on a lock we hold)
> it might work.
> Note, btw, that the create side (populate_groups) is safe,
> because we hold the creating parent's i_mutex throughout the entire
> process.
Yes, mkdir() is completely ok. In fact, I am able to formally prove its locking
correctness (from lockdep's point of view) provided that I_MUTEX_PARENT ->
I_MUTEX_CHILD is correct. I'll come back to lockdep annotations once the
rmdir() bug is fixed.
> Hey, can we use d_revalidate? Here's the issue. rename, when
> going to lookup the objects it wants to lock, is getting them out of
> cached_lookup - there dcache locking is all that protects it. I was
> first thinking we could take the dentry locks to block this out. But
> rather, why not fail d_revalidate and force a locked lookup? So, when
> we go to lock one of these groups for detaching, we also set a flag on
> the configfs_dirent. We add a configfs_d_revalidate function that
> returns based on that flag - if set, revalidation is needed. Thus, when
> another process comes in to look at the object we've already locked, it
> blocks waiting to find it.
> See, in do_rename, it does do_path_lookup() before actually
> calling lock_rename(). It would block there waiting for our speculative
> removal. We'd either fail rmdir, and those lookups would succeed, or
> we'd succeed rmdir, and the lookup fails.
> The only concern is, can the reverse happen? We get past the
> lookups in do_rename(), and then another process comes into rmdir() -
> will they deadlock there?
No we cannot make d_revalidate() temporarily fail, because it will either make
do_lookup() fail (return value < 0), or will tell do_revalidate() to call
d_invalidate() (return value == 0), which will either definitely invalidate the
dentry stored in item->ci_dentry (unlikely because its use count should be > 1,
right?), or return success to do_lookup().
The good news is that we can make d_revalidate() block on whatever we use to
protect rmdir() against racing operations. I'll try to send a patch levaraging
the subsystem's mutexes later in the day.
Louis
--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes
next prev parent reply other threads:[~2008-06-11 14:10 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-05-22 11:40 [Ocfs2-devel] [RFC][PATCH 0/4] configfs: Make nested default groups lockdep-friendly v2 Louis Rilling
2008-05-22 11:40 ` [Ocfs2-devel] [RFC][PATCH 1/4] Prepare i_mutex lockdep subclasses for locking of variable path lengths Louis Rilling
2008-05-22 11:40 ` [Ocfs2-devel] [RFC][PATCH 2/4] Prepare vfs_rmdir() for further nested i_mutex locking Louis Rilling
2008-05-22 11:40 ` [Ocfs2-devel] [RFC][PATCH 3/4] configfs: Make nested default groups creations lockdep-friendly Louis Rilling
2008-05-22 11:40 ` [Ocfs2-devel] [RFC][PATCH 4/4] configfs: Make multiple default_group destructions lockdep friendly Louis Rilling
2008-05-23 16:44 ` Louis Rilling
2008-06-02 23:07 ` Joel Becker
2008-06-03 16:00 ` Louis Rilling
2008-06-06 23:01 ` Joel Becker
2008-06-09 11:03 ` Louis Rilling
2008-06-09 12:54 ` [Ocfs2-devel] [BUG] deadlock between configfs_rmdir() and sys_rename() (WAS Re: [RFC][PATCH 4/4] configfs: Make multiple default_group) " Louis Rilling
2008-06-10 1:58 ` Joel Becker
2008-06-10 10:14 ` Louis Rilling
2008-06-10 17:36 ` Joel Becker
2008-06-11 14:10 ` Louis Rilling [this message]
2008-06-11 17:30 ` Louis Rilling
2008-06-11 21:15 ` Joel Becker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080611141010.GP18153@localhost \
--to=louis.rilling@kerlabs.com \
--cc=Joel.Becker@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=ocfs2-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).