From: "J. Bruce Fields" <bfields@fieldses.org>
To: Mi Jinlong <mijinlong@cn.fujitsu.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>,
Jeff Layton <jlayton@redhat.com>,
NFSv3 list <linux-nfs@vger.kernel.org>,
linux-fsdevel@vger.kernel.org, ebiederm@xmission.com,
adobriyan@gmail.com, viro@ZenIV.linux.org.uk,
jamie@shareable.org
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file
Date: Fri, 21 May 2010 17:07:38 -0400 [thread overview]
Message-ID: <20100521210738.GK11675@fieldses.org> (raw)
In-Reply-To: <4BF504DE.7010804@cn.fujitsu.com>
On Thu, May 20, 2010 at 05:46:06PM +0800, Mi Jinlong wrote:
> J. Bruce Fields :
> > I don't know of any existing lock that does exactly what we want.
> >
> > Somebody at citi worked on a better lease implementation for a while,
> > but I don't think we ever really got it right; the last version I can
> > find is here:
> >
> > git://linux-nfs.org/~bfields/linux-topics.git leases
>
> When reading the code of the git, i found a patch which try to fix
> the lease's problem, but only for unlink.
It's 8 patches together:
> commit id: d5a08e556116c66fb60a448f805a40bf54314634
> msg: "leases: break file leases on unlink."
>
> In this patch, it seems only add break_lease() and some other functions
> but seems don't avoid the problem of race.
Look again: break_lease() is there, but there's also a break_lease_end()
after the unlink.
> Or there is some different
> at break_lease() with the community's kernel.
>
> Can you give me some message about the new lease? Thanks.
So the 8 patches at that branch are:
leases: introduce per-inode lease enabling/disabling
VFS: clean up extra dereferences in do_unlinkat()
leases: break file leases on unlink.
leases: break file leases on rename.
leases: break leases on chown.
VFS: refactor sys_fchmod and sys_fchmodat
leases: break leases on chmod.
leases: break leases on link.
Like I say, I don't think they're correct or I'd just copy them all to
the list. But maybe the comment on the first patch (appended) is
useful.
--b.
leases: introduce per-inode lease enabling/disabling
The current file lease implementation is inadequate (for the purposes of
nfs, and, we believe, for the purposes of Samba), in at least two ways:
- Leases are broken only conflicting opens; but both nfsv4
delegations and (we're told) Windows op locks actually require
that leases be broken on any operation that changes file
metadata--including unlink, link, rename, chmod, and chown.
- The internal kernel api used for lease-breaking is inherently
racy, consisting as it does of a single break_lease() call.
(Consider this scenario: a file is not currently open and is
about to be unlinked. During unlink processing, a lookup is
done, and break_lease() is called. After the break_lease(),
but before the unlink completes, another user opens the file
and gets a read lease. The unlink then completes, but the
other user thinks their read lease is still valid. This
situation would be avoided if lease-granting for the inode
were disabled for the duration of the unlink.)
We're primarily interested in the case of read leases for now. (Write
leases, which also must be broken on *access* to a file, are more
difficult to get completely right, and aren't used by the current nfs
server.)
Fixing the second problem requires replacing break_lease() by a pair of
calls, here called break_lease() and break_lease_end(), between which
new leases are temporarily prohibited.
We want to implement that temporary prohibition in a simple way that has
low impact in common (uncontended) cases.
This patch adds a field, i_leasecount, which provides mutual exclusion
between inode-modifying operations and read leases in the same way the
i_writecount provides mutual exclusion between write opens and execs:
when i_leasecount is positive, it counts the number of leases on the
given inode, and when it's negative it counts the number of operations
which want leases temporarily disabled. This allows selective
enabling/disabling of leases on a per-inode basis.
To that end, the functions leases_get_access() and leases_put_access()
are used when a lease is granted and returned, respectively. The
functions leases_deny_access() and leases_allow_access() are used to
prevent races between breaking-with-FMODE_WRITE and write-lease-granting
for the entire duration of a file operation. Currently, leases are
broken only when a file is opened or truncated; these functions will
allow leases to be broken on things like unlink and rename as well.
NFSv4 implements delegations using leases, and needs its delegations to
be revoked on unlinks, renames, chowns, etc.
Note that this patch changes break_lease() and __break_lease(), such
that when they are called with FMODE_WRITE and return successfully, they
will leave leases disabled on the inode in question, and the caller must
eventually call break_lease_end() to re-enable leasing. As alluded to
in the scenario above, this behavior isn't necessary when breaking
without FMODE_WRITE: existing and new read-leases wouldn't need to be
revoked or blocked; and a write-lease-granting setlease won't race the
break_lease() because the latter is presumed to have been preceded by
something like a dget() on the dentry in question (where d_count or
i_count > 1 blocks write-lease-granting).
This patch also closes a small existing open/lease race: a lease-related
race exists between the time that outstanding leases are broken (by
may_open()) and the time that, e.g., O_RDWR or O_WRONLY are reflected in
the inode's i_writecount variable (which will prevent subsequent
lease-granting setlease calls). Conceivably, a read lease could be
granted in the interim.
To deal with this, may_open() is modified so that, on success and when
called with FMODE_WRITE, it will return with lease-granting disabled for
the inode in question. do_filp_open() is modified so that leasing is
re-enabled once everything is finished. Analogous changes are made on
truncation.
next prev parent reply other threads:[~2010-05-21 21:07 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-05-14 9:35 [PATCH] VFS: Unlink should revoke all outstanding leases on file Mi Jinlong
2010-05-14 9:58 ` Jeff Layton
2010-05-14 17:17 ` Trond Myklebust
2010-05-14 17:38 ` Jeff Layton
2010-05-14 17:46 ` Jamie Lokier
2010-05-14 18:16 ` Jeremy Allison
2010-05-19 14:06 ` J. Bruce Fields
2010-05-19 16:21 ` Jamie Lokier
2010-05-14 17:59 ` Trond Myklebust
2010-05-14 18:31 ` Trond Myklebust
2010-05-14 19:23 ` J. Bruce Fields
2010-05-19 9:46 ` Mi Jinlong
2010-05-19 15:57 ` J. Bruce Fields
2010-05-20 9:46 ` Mi Jinlong
2010-05-21 21:07 ` J. Bruce Fields [this message]
2010-05-25 10:14 ` Mi Jinlong
2010-05-19 16:14 ` Jamie Lokier
2010-05-20 2:21 ` J. Bruce Fields
[not found] ` <20100514055844.109d2fdc-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-05-19 9:49 ` Mi Jinlong
2010-05-19 16:03 ` J. Bruce Fields
2010-05-20 9:23 ` Mi Jinlong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100521210738.GK11675@fieldses.org \
--to=bfields@fieldses.org \
--cc=adobriyan@gmail.com \
--cc=ebiederm@xmission.com \
--cc=jamie@shareable.org \
--cc=jlayton@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=mijinlong@cn.fujitsu.com \
--cc=trond.myklebust@fys.uio.no \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).