From: "J. Bruce Fields" <bfields@redhat.com>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: Dave Chinner <david@fromorbit.com>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
"Theodore Ts'o" <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
swhiteho@redhat.com
Subject: Re: [PATCH 01/12] vfs: pull ext4's double-i_mutex-locking into common code
Date: Fri, 12 Jul 2013 18:07:32 -0400
Message-ID: <20130712220731.GD20370@pad.fieldses.org>
In-Reply-To: <20130711100406.21b08420@tlielax.poochiereds.net>
On Thu, Jul 11, 2013 at 10:04:06AM -0400, Jeff Layton wrote:
> On Wed, 10 Jul 2013 17:26:21 -0400
> "J. Bruce Fields" <bfields@redhat.com> wrote:
>
> > On Wed, Jul 10, 2013 at 01:38:53PM +1000, Dave Chinner wrote:
> > > On Tue, Jul 09, 2013 at 10:40:59PM -0400, J. Bruce Fields wrote:
> > > > On Wed, Jul 10, 2013 at 12:09:21PM +1000, Dave Chinner wrote:
> > > > > Sure. I'd prefer ordering by inode number, because then ordering is
> > > > > deterministic rather than being dependent on memory allocation
> > > > > results. It makes forensic analysis of deadlocks and corruptions
> > > > > easier because you can look at on-disk structures and accurately
> > > > > predict locking behaviour and therefore determine the order of
> > > > > operations that should occur. With lock ordering determined by
> > > > > memory addresses, you can't easily predict the lock ordering two
> > > > > particular inodes might take from one operation to another.
> > > >
> > > > Hm, OK, not having done this I don't have a good feeling for how
> > > > important that is, but I can take your word for it.
> > > >
> > > > But the ext4 code actually originally used i_ino order and was changed
> > > > by 03bd8b9b896c8e "ext4: move_extent code cleanup", possibly on Linus's
> > > > suggestion?:
> > > >
> > > > http://mid.gmane.org/<CA+55aFwdh_QWG-R2FQ71kDXiNYZ04qPANBsY_PssVUwEBH4uSw@mail.gmail.com>
> > > >
> > > > "And the only sane order is comparing inode pointers, not inode
> > > > numbers like ext4 apparently does."
> > >
> > > Interesting. What has worked for the last 20 years must be wrong if
> > > Linus says so ;)
> > >
> > > >
> > > > (Uh, I thought I also remembered some rationale but can't dig up the
> > > > email now.)
> > >
> > > Probably duplicate inode numbers on inodes in different filesystems.
> > > But rename doesn't allow that, and I don't think we ever want to allow
> > > arbitrary nested inode locking across superblocks. Hence I can't
> > > think of a reason why it's a problem...
> >
> > I have some vague memory the argument was rather that inode numbers
> > could fail to be unique within a fs due to bugs, but I may be making
> > that up. I've got no strong opinion here.
> >
>
> There are also legitimate cases where inode numbers can collide,
> particularly on network filesystems. That's one of the main reasons we
> have iget5_locked().
>
> One possibility might be to order by i_ino first, and then fall back to
> using the inode pointer value if they are equal.

As long as no one ever modifies i_ino. Which I'd think would be a
shooting offense. But it sure looks like fuse allows this--see
fuse_do_getattr->fuse_change_attributes->fuse_change_attributes_common.
Maybe I'm misunderstanding....

As long as there's a chance filesystems (even if only due to bugs) could
mess with this sort of guarantee I'm really inclined to stick with the
obviously-well-defined pointer ordering even if it means giving up the
determinism Dave wants. Argh.

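To make the two candidate orderings concrete, here is a minimal
userspace sketch using pthread mutexes. It is not kernel code and not
what the patch does: struct toy_inode, before_by_ptr, before_by_ino,
lock_two and unlock_two are all made-up names for illustration.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct inode. */
struct toy_inode {
	unsigned long i_ino;
	pthread_mutex_t i_mutex;
};

typedef int (*order_fn)(const struct toy_inode *, const struct toy_inode *);

/* Pointer order: two distinct live objects never compare equal. */
static int before_by_ptr(const struct toy_inode *a, const struct toy_inode *b)
{
	return (uintptr_t)a < (uintptr_t)b;
}

/* i_ino order, with the pointer value as tie-breaker on a collision. */
static int before_by_ino(const struct toy_inode *a, const struct toy_inode *b)
{
	if (a->i_ino != b->i_ino)
		return a->i_ino < b->i_ino;
	return before_by_ptr(a, b);
}

/* Take both mutexes in the order chosen by 'before'; a == b locks once. */
static void lock_two(struct toy_inode *a, struct toy_inode *b, order_fn before)
{
	if (a != b && !before(a, b)) {
		struct toy_inode *tmp = a;

		a = b;
		b = tmp;
	}
	pthread_mutex_lock(&a->i_mutex);
	if (a != b)
		pthread_mutex_lock(&b->i_mutex);
}

static void unlock_two(struct toy_inode *a, struct toy_inode *b)
{
	pthread_mutex_unlock(&a->i_mutex);
	if (a != b)
		pthread_mutex_unlock(&b->i_mutex);
}
```

With before_by_ptr the order for a given pair of objects cannot change
while both exist. With before_by_ino the result depends on a field of
the object, so if anything rewrote i_ino between two lock_two() calls
the same pair could be locked in opposite orders by two tasks.
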
> > > FWIW - gfs2 does multiple glock locking similar to XFS inode locking
> > > - it sorts the locks in lock number order and then locks them all one
> > > at a time...

Taking a look--I don't think I'm going to begin to understand how that's
used in any reasonable amount of time. Cc'ing Steve in case he can.

> > > A quick grep shows lock_2_inodes() in fs/ubifs/dir.c. I don't see
> > > any other obvious ones.

Which isn't bothering with consistent lock ordering because (says a
comment) it's only called after taking the vfs locks. Which looks
correct--the only callers are in link, unlink, and rmdir methods. And a
similar lock_3_inodes is called from the rename method.

--b.