linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mi Jinlong <mijinlong@cn.fujitsu.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>,
	Jeff Layton <jlayton@redhat.com>,
	NFSv3 list <linux-nfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, ebiederm@xmission.com,
	adobriyan@gmail.com, viro@ZenIV.linux.org.uk,
	jamie@shareable.org
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on	file
Date: Tue, 25 May 2010 18:14:56 +0800	[thread overview]
Message-ID: <4BFBA320.8090102@cn.fujitsu.com> (raw)
In-Reply-To: <20100521210738.GK11675@fieldses.org>



J. Bruce Fields :
> On Thu, May 20, 2010 at 05:46:06PM +0800, Mi Jinlong wrote:
>> J. Bruce Fields :
>>> I don't know of any existing lock that does exactly what we want.
>>>
>>> Somebody at citi worked on a better lease implementation for a while,
>>> but I don't think we ever really got it right; the last version I can
>>> find is here:
>>>
>>> 	git://linux-nfs.org/~bfields/linux-topics.git leases
>>   When reading the code of the git, i found a patch which try to fix
>>   the lease's problem, but only for unlink.
> 
> It's 8 patches together:
> 
>>   commit id: d5a08e556116c66fb60a448f805a40bf54314634
>>         msg: "leases: break file leases on unlink."
>>
>>   In this patch, it seems only add break_lease() and some other functions
>>   but seems don't avoid the problem of race.
> 
> Look again: break_lease() is there, but there's also a break_lease_end()
> after the unlink.

  Thanks. That's important.

> 
>> Or there is some different 
>>   at break_lease() with the community's kernel.
>>
>>   Can you give me some message about the new lease? Thanks.
> 
> So the 8 patches at that branch are:
> 	leases: introduce per-inode lease enabling/disabling
> 	VFS: clean up extra dereferences in do_unlinkat()
> 	leases: break file leases on unlink.
> 	leases: break file leases on rename.
> 	leases: break leases on chown.
> 	VFS: refactor sys_fchmod and sys_fchmodat
> 	leases: break leases on chmod.
> 	leases: break leases on link.
> 
> Like I say, I don't think they're correct or I'd just copy them all to
> the list.  But maybe the comment on the first patch (appended) is
> useful.

  As reading the first patch ("leases: introduce ... "), it's really useful.
  But, as Jamie Lokier said "new lease semantics should use new userspace API.",
  is there some new lease under development ?

  And, IMO, this problem we discussing is serious that should be fix ASAP.
  Can we fix this problem refer to this solutions? 

thanks, 
Mi Jinlong

> 
> --b.
> 
> leases: introduce per-inode lease enabling/disabling
> 
> The current file lease implementation is inadequate (for the purposes of
> nfs, and, we believe, for the purposes of Samba), in at least two ways:
> 
> 	- Leases are broken only conflicting opens; but both nfsv4
> 	  delegations and (we're told) Windows op locks actually require
> 	  that leases be broken on any operation that changes file
> 	  metadata--including unlink, link, rename, chmod, and chown.
> 
> 	- The internal kernel api used for lease-breaking is inherently
> 	  racy, consisting as it does of a single break_lease() call.
> 	  (Consider this scenario: a file is not currently open and is
> 	  about to be unlinked.  During unlink processing, a lookup is
> 	  done, and break_lease() is called.  After the break_lease(),
> 	  but before the unlink completes, another user opens the file
> 	  and gets a read lease.  The unlink then completes, but the
> 	  other user thinks their read lease is still valid.  This
> 	  situation would be avoided if lease-granting for the inode
> 	  were disabled for the duration of the unlink.)
> 
> We're primarily interested in the case of read leases for now.  (Write
> leases, which also must be broken on *access* to a file, are more
> difficult to get completely right, and aren't used by the current nfs
> server.)
> 
> Fixing the second problem requires replacing break_lease() by a pair of
> calls, here called break_lease() and break_lease_end(), between which
> new leases are temporarily prohibited.
> 
> We want to implement that temporary prohibition in a simple way that has
> low impact in common (uncontended) cases.
> 
> This patch adds a field, i_leasecount, which provides mutual exclusion
> between inode-modifying operations and read leases in the same way the
> i_writecount provides mutual exclusion between write opens and execs:
> when i_leasecount is positive, it counts the number of leases on the
> given inode, and when it's negative it counts the number of operations
> which want leases temporarily disabled.  This allows selective
> enabling/disabling of leases on a per-inode basis.
> 
> To that end, the functions leases_get_access() and leases_put_access()
> are used when a lease is granted and returned, respectively.  The
> functions leases_deny_access() and leases_allow_access() are used to
> prevent races between breaking-with-FMODE_WRITE and write-lease-granting
> for the entire duration of a file operation.  Currently, leases are
> broken only when a file is opened or truncated; these functions will
> allow leases to be broken on things like unlink and rename as well.
> NFSv4 implements delegations using leases, and needs its delegations to
> be revoked on unlinks, renames, chowns, etc.
> 
> Note that this patch changes break_lease() and __break_lease(), such
> that when they are called with FMODE_WRITE and return successfully, they
> will leave leases disabled on the inode in question, and the caller must
> eventually call break_lease_end() to re-enable leasing.  As alluded to
> in the scenario above, this behavior isn't necessary when breaking
> without FMODE_WRITE: existing and new read-leases wouldn't need to be
> revoked or blocked; and a write-lease-granting setlease won't race the
> break_lease() because the latter is presumed to have been preceded by
> something like a dget() on the dentry in question (where d_count or
> i_count > 1 blocks write-lease-granting).
> 
> This patch also closes a small existing open/lease race: a lease-related
> race exists between the time that outstanding leases are broken (by
> may_open()) and the time that, e.g., O_RDWR or O_WRONLY are reflected in
> the inode's i_writecount variable (which will prevent subsequent
> lease-granting setlease calls).  Conceivably, a read lease could be
> granted in the interim.
> 
> To deal with this, may_open() is modified so that, on success and when
> called with FMODE_WRITE, it will return with lease-granting disabled for
> the inode in question.  do_filp_open() is modified so that leasing is
> re-enabled once everything is finished.  Analogous changes are made on
> truncation.


  reply	other threads:[~2010-05-25 10:14 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-14  9:35 [PATCH] VFS: Unlink should revoke all outstanding leases on file Mi Jinlong
2010-05-14  9:58 ` Jeff Layton
2010-05-14 17:17   ` Trond Myklebust
2010-05-14 17:38     ` Jeff Layton
2010-05-14 17:46       ` Jamie Lokier
2010-05-14 18:16         ` Jeremy Allison
2010-05-19 14:06           ` J. Bruce Fields
2010-05-19 16:21             ` Jamie Lokier
2010-05-14 17:59       ` Trond Myklebust
2010-05-14 18:31         ` Trond Myklebust
2010-05-14 19:23           ` J. Bruce Fields
2010-05-19  9:46             ` Mi Jinlong
2010-05-19 15:57               ` J. Bruce Fields
2010-05-20  9:46                 ` Mi Jinlong
2010-05-21 21:07                   ` J. Bruce Fields
2010-05-25 10:14                     ` Mi Jinlong [this message]
2010-05-19 16:14             ` Jamie Lokier
2010-05-20  2:21               ` J. Bruce Fields
     [not found]   ` <20100514055844.109d2fdc-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-05-19  9:49     ` Mi Jinlong
2010-05-19 16:03       ` J. Bruce Fields
2010-05-20  9:23         ` Mi Jinlong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BFBA320.8090102@cn.fujitsu.com \
    --to=mijinlong@cn.fujitsu.com \
    --cc=adobriyan@gmail.com \
    --cc=bfields@fieldses.org \
    --cc=ebiederm@xmission.com \
    --cc=jamie@shareable.org \
    --cc=jlayton@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trond.myklebust@fys.uio.no \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).