From: "J. Bruce Fields" <bfields@fieldses.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nfs@vger.kernel.org, Miklos Szeredi <mszeredi@suse.cz>
Subject: Re: [PATCH] dcache: fix d_splice_alias handling of aliases
Date: Fri, 17 Jan 2014 10:39:17 -0500
Message-ID: <20140117153917.GA26636@fieldses.org>
In-Reply-To: <20140117121723.GA18375@infradead.org>

On Fri, Jan 17, 2014 at 04:17:23AM -0800, Christoph Hellwig wrote:
> On Wed, Jan 15, 2014 at 10:17:49AM -0500, J. Bruce Fields wrote:
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > d_splice_alias can create duplicate directory aliases (in the !new
> > case), or (in the new case) d_move without holding appropriate locks.
> > 
> > d_materialise_unique deals with both of these problems.  (The latter
> > seems to be dealt with by trylocks (see __d_unalias), which look like
> > they could cause spurious lookup failures--but that's at least better
> > than corrupting the dcache.)
> 
> I'm a bit worried about those spurious failures, maybe we should
> retry in that case?

Maybe so.  I'm not sure how.  d_materialise_unique is called from lookup
and we'd need to at least drop the parent i_mutex to give a concurrent
rename a chance to progress.
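
To make the shape of such a retry concrete, here is a rough userspace
sketch using pthreads.  It is not kernel code and not a proposed patch:
parent_lock and rename_lock are stand-ins for the parent i_mutex and the
rename mutex, and lookup_once/lookup_with_retry are hypothetical names.
It only shows the idea of releasing the parent lock between attempts so
that a concurrent rename can make progress.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Stand-ins (assumptions) for the parent i_mutex and the rename mutex. */
static pthread_mutex_t parent_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rename_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical single attempt; the caller holds parent_lock. */
static int lookup_once(void)
{
	if (pthread_mutex_trylock(&rename_lock) != 0)
		return -EBUSY;	/* a rename is in flight */
	/* ... the stale alias would be moved here ... */
	pthread_mutex_unlock(&rename_lock);
	return 0;
}

/* Try up to max_tries times, dropping parent_lock between attempts. */
static int lookup_with_retry(int max_tries)
{
	int ret = -EBUSY;

	assert(max_tries >= 0);
	for (int i = 0; i < max_tries && ret == -EBUSY; i++) {
		pthread_mutex_lock(&parent_lock);
		ret = lookup_once();
		/* Drop the parent lock so a concurrent rename can progress. */
		pthread_mutex_unlock(&parent_lock);
		if (ret == -EBUSY)
			sched_yield();
	}
	return ret;
}
```

A bounded retry count keeps the lookup from spinning forever if renames
keep arriving; whether that is acceptable in the real lookup path is a
separate question.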

I think NFS or cluster filesystem clients could hit this case with:

	host A		host B
	---------	-------------------------
	process 1	process 1	process 2
	---------	---------	---------

			mkdir foo/X
	mv foo/X bar/
			stat bar/X	mv baz qux


When (B,1) looks up X in bar it finds that X still has an alias in foo,
tries to rename that alias to bar/X, but can't because the current
baz->qux rename is holding the rename mutex.  So __d_unalias and the
lookup return -EBUSY.
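
The failure mode can be imitated in userspace with a pthread trylock.
This is only an analogue under stated assumptions: rename_mutex stands
in for the rename mutex, try_unalias and lookup_during_rename are
made-up names, and an int is used in place of a dentry's parent.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Stand-in (assumption) for the global rename mutex. */
static pthread_mutex_t rename_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical analogue of the trylock step: move a stale alias to its
 * new parent only if the rename lock is free.  Returns 0 on success or
 * -EBUSY if a rename is in flight.
 */
static int try_unalias(int *alias_parent, int new_parent)
{
	int err = pthread_mutex_trylock(&rename_mutex);

	if (err == EBUSY)
		return -EBUSY;	/* give up rather than block */
	assert(err == 0);

	*alias_parent = new_parent;	/* the move of the alias */
	pthread_mutex_unlock(&rename_mutex);
	return 0;
}

/* Imitate (B,2) holding the lock for "mv baz qux" during (B,1)'s lookup. */
static int lookup_during_rename(int *alias_parent, int new_parent)
{
	int ret;

	pthread_mutex_lock(&rename_mutex);	/* unrelated rename in progress */
	ret = try_unalias(alias_parent, new_parent);
	pthread_mutex_unlock(&rename_mutex);
	return ret;
}
```

With the lock held by the unrelated rename, the alias is left where it
was and the caller sees -EBUSY; once the lock is free the same call
succeeds.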

None of those operations are particularly fast, so I'm a bit surprised
we haven't already heard complaints.  I must be missing some reason this
doesn't happen.  I guess I should set up a test.

> Also looking over the changes I wonder if the explicit checking for
> aliases for every non-directory might have a major performance impact,
> all the dcache growling already was a major issue in NFS workloads
> years ago and I doubt it's become any better.

This only happens on the first (uncached) lookup.  So we've already
acquired a bunch of locks and probably done a round trip to a disk or a
server--is walking a (typically short) list really something to worry
about?
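
For a sense of the cost, the check amounts to a linear walk over the
inode's aliases.  A simplified sketch, with hypothetical struct alias
and struct node types standing in for the real dentry and inode
structures (which look quite different):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct alias {
	int parent_id;		/* directory this name lives in */
	struct alias *next;	/* next alias of the same inode */
};

struct node {
	struct alias *aliases;	/* head of the (typically short) list */
};

/*
 * Return the first alias that lives outside parent_id, or NULL if
 * there is none.  Cost is linear in the number of aliases.
 */
static struct alias *find_alias_outside(const struct node *n, int parent_id)
{
	assert(n != NULL);
	for (struct alias *a = n->aliases; a != NULL; a = a->next)
		if (a->parent_id != parent_id)
			return a;
	return NULL;
}
```

For an inode with one or two names this is a handful of pointer
dereferences, which is small next to a disk or network round trip.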

> Also looking at this area I'd like to suggest that if you end up
> merging the two I'd continue using the d_splice_alias name and
> calling conventions.

OK, I guess I don't care which one we keep.

> Also the inode == NULL case really should be split out from
> d_materialise_unique into a separate helper.  It shares almost no
> code, is entirely undocumented to the point that I don't really
> understand what the purpose is, and the only caller that can get
> there (fuse) already branches around that case in the caller anyway.

I think I see what you mean, I can fix that.

--b.
