From: Mi Jinlong <mijinlong-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
To: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
Cc: Trond Myklebust
<trond.myklebust-41N18TsMXrtuMpJDpNschA@public.gmane.org>,
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
NFSv3 list <linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org,
adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
jamie-yetKDKU6eevNLxjTenLetw@public.gmane.org
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file
Date: Tue, 25 May 2010 18:14:56 +0800 [thread overview]
Message-ID: <4BFBA320.8090102@cn.fujitsu.com> (raw)
In-Reply-To: <20100521210738.GK11675-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
J. Bruce Fields :
> On Thu, May 20, 2010 at 05:46:06PM +0800, Mi Jinlong wrote:
>> J. Bruce Fields :
>>> I don't know of any existing lock that does exactly what we want.
>>>
>>> Somebody at citi worked on a better lease implementation for a while,
>>> but I don't think we ever really got it right; the last version I can
>>> find is here:
>>>
>>> git://linux-nfs.org/~bfields/linux-topics.git leases
>> When reading the code of the git, i found a patch which try to fix
>> the lease's problem, but only for unlink.
>
> It's 8 patches together:
>
>> commit id: d5a08e556116c66fb60a448f805a40bf54314634
>> msg: "leases: break file leases on unlink."
>>
>> In this patch, it seems only add break_lease() and some other functions
>> but seems don't avoid the problem of race.
>
> Look again: break_lease() is there, but there's also a break_lease_end()
> after the unlink.
Thanks. That's important.
>
>> Or there is some different
>> at break_lease() with the community's kernel.
>>
>> Can you give me some message about the new lease? Thanks.
>
> So the 8 patches at that branch are:
> leases: introduce per-inode lease enabling/disabling
> VFS: clean up extra dereferences in do_unlinkat()
> leases: break file leases on unlink.
> leases: break file leases on rename.
> leases: break leases on chown.
> VFS: refactor sys_fchmod and sys_fchmodat
> leases: break leases on chmod.
> leases: break leases on link.
>
> Like I say, I don't think they're correct or I'd just copy them all to
> the list. But maybe the comment on the first patch (appended) is
> useful.
As reading the first patch ("leases: introduce ... "), it's really useful.
But, as Jamie Lokier said "new lease semantics should use new userspace API.",
is there some new lease under development ?
And, IMO, this problem we discussing is serious that should be fix ASAP.
Can we fix this problem refer to this solutions?
thanks,
Mi Jinlong
>
> --b.
>
> leases: introduce per-inode lease enabling/disabling
>
> The current file lease implementation is inadequate (for the purposes of
> nfs, and, we believe, for the purposes of Samba), in at least two ways:
>
> - Leases are broken only conflicting opens; but both nfsv4
> delegations and (we're told) Windows op locks actually require
> that leases be broken on any operation that changes file
> metadata--including unlink, link, rename, chmod, and chown.
>
> - The internal kernel api used for lease-breaking is inherently
> racy, consisting as it does of a single break_lease() call.
> (Consider this scenario: a file is not currently open and is
> about to be unlinked. During unlink processing, a lookup is
> done, and break_lease() is called. After the break_lease(),
> but before the unlink completes, another user opens the file
> and gets a read lease. The unlink then completes, but the
> other user thinks their read lease is still valid. This
> situation would be avoided if lease-granting for the inode
> were disabled for the duration of the unlink.)
>
> We're primarily interested in the case of read leases for now. (Write
> leases, which also must be broken on *access* to a file, are more
> difficult to get completely right, and aren't used by the current nfs
> server.)
>
> Fixing the second problem requires replacing break_lease() by a pair of
> calls, here called break_lease() and break_lease_end(), between which
> new leases are temporarily prohibited.
>
> We want to implement that temporary prohibition in a simple way that has
> low impact in common (uncontended) cases.
>
> This patch adds a field, i_leasecount, which provides mutual exclusion
> between inode-modifying operations and read leases in the same way the
> i_writecount provides mutual exclusion between write opens and execs:
> when i_leasecount is positive, it counts the number of leases on the
> given inode, and when it's negative it counts the number of operations
> which want leases temporarily disabled. This allows selective
> enabling/disabling of leases on a per-inode basis.
>
> To that end, the functions leases_get_access() and leases_put_access()
> are used when a lease is granted and returned, respectively. The
> functions leases_deny_access() and leases_allow_access() are used to
> prevent races between breaking-with-FMODE_WRITE and write-lease-granting
> for the entire duration of a file operation. Currently, leases are
> broken only when a file is opened or truncated; these functions will
> allow leases to be broken on things like unlink and rename as well.
> NFSv4 implements delegations using leases, and needs its delegations to
> be revoked on unlinks, renames, chowns, etc.
>
> Note that this patch changes break_lease() and __break_lease(), such
> that when they are called with FMODE_WRITE and return successfully, they
> will leave leases disabled on the inode in question, and the caller must
> eventually call break_lease_end() to re-enable leasing. As alluded to
> in the scenario above, this behavior isn't necessary when breaking
> without FMODE_WRITE: existing and new read-leases wouldn't need to be
> revoked or blocked; and a write-lease-granting setlease won't race the
> break_lease() because the latter is presumed to have been preceded by
> something like a dget() on the dentry in question (where d_count or
> i_count > 1 blocks write-lease-granting).
>
> This patch also closes a small existing open/lease race: a lease-related
> race exists between the time that outstanding leases are broken (by
> may_open()) and the time that, e.g., O_RDWR or O_WRONLY are reflected in
> the inode's i_writecount variable (which will prevent subsequent
> lease-granting setlease calls). Conceivably, a read lease could be
> granted in the interim.
>
> To deal with this, may_open() is modified so that, on success and when
> called with FMODE_WRITE, it will return with lease-granting disabled for
> the inode in question. do_filp_open() is modified so that leasing is
> re-enabled once everything is finished. Analogous changes are made on
> truncation.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2010-05-25 10:14 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-05-14 9:35 [PATCH] VFS: Unlink should revoke all outstanding leases on file Mi Jinlong
[not found] ` <4BED195F.3070504-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2010-05-14 9:58 ` Jeff Layton
2010-05-14 17:17 ` Trond Myklebust
[not found] ` <1273857471.4732.7.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-05-14 17:38 ` Jeff Layton
[not found] ` <20100514133819.5e383485-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-05-14 17:46 ` Jamie Lokier
2010-05-14 18:16 ` Jeremy Allison
2010-05-19 14:06 ` J. Bruce Fields
[not found] ` <20100519140639.GB4581-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-05-19 16:21 ` Jamie Lokier
2010-05-14 17:59 ` Trond Myklebust
2010-05-14 18:31 ` Trond Myklebust
[not found] ` <1273861872.4732.34.camel-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-05-14 19:23 ` J. Bruce Fields
[not found] ` <20100514192327.GA20192-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-05-19 9:46 ` Mi Jinlong
2010-05-19 15:57 ` J. Bruce Fields
[not found] ` <20100519155700.GE4581-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-05-20 9:46 ` Mi Jinlong
[not found] ` <4BF504DE.7010804-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2010-05-21 21:07 ` J. Bruce Fields
[not found] ` <20100521210738.GK11675-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-05-25 10:14 ` Mi Jinlong [this message]
2010-05-19 16:14 ` Jamie Lokier
[not found] ` <20100519161419.GB1693-yetKDKU6eevNLxjTenLetw@public.gmane.org>
2010-05-20 2:21 ` J. Bruce Fields
[not found] ` <20100514055844.109d2fdc-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-05-19 9:49 ` Mi Jinlong
2010-05-19 16:03 ` J. Bruce Fields
2010-05-20 9:23 ` Mi Jinlong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4BFBA320.8090102@cn.fujitsu.com \
--to=mijinlong-bthxqxjhjhxqfuhtdcdx3a@public.gmane.org \
--cc=adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org \
--cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
--cc=jamie-yetKDKU6eevNLxjTenLetw@public.gmane.org \
--cc=jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=trond.myklebust-41N18TsMXrtuMpJDpNschA@public.gmane.org \
--cc=viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).