From: Mi Jinlong <mijinlong-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
To: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
Cc: Trond Myklebust
	<trond.myklebust-41N18TsMXrtuMpJDpNschA@public.gmane.org>,
	Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	NFSv3 list <linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org,
	adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org,
	jamie-yetKDKU6eevNLxjTenLetw@public.gmane.org
Subject: Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file
Date: Tue, 25 May 2010 18:14:56 +0800
Message-ID: <4BFBA320.8090102@cn.fujitsu.com>
In-Reply-To: <20100521210738.GK11675-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>



J. Bruce Fields :
> On Thu, May 20, 2010 at 05:46:06PM +0800, Mi Jinlong wrote:
>> J. Bruce Fields :
>>> I don't know of any existing lock that does exactly what we want.
>>>
>>> Somebody at citi worked on a better lease implementation for a while,
>>> but I don't think we ever really got it right; the last version I can
>>> find is here:
>>>
>>> 	git://linux-nfs.org/~bfields/linux-topics.git leases
>>   When reading the code in that git tree, I found a patch which tries to fix
>>   the lease problem, but only for unlink.
> 
> It's 8 patches together:
> 
>>   commit id: d5a08e556116c66fb60a448f805a40bf54314634
>>         msg: "leases: break file leases on unlink."
>>
>>   This patch seems only to add break_lease() and some other functions,
>>   but it doesn't seem to avoid the race.
> 
> Look again: break_lease() is there, but there's also a break_lease_end()
> after the unlink.

  Thanks. That's important.

> 
>>   Or is there some difference in break_lease() compared with the
>>   community's kernel?
>>
>>   Can you give me some information about the new lease implementation? Thanks.
> 
> So the 8 patches at that branch are:
> 	leases: introduce per-inode lease enabling/disabling
> 	VFS: clean up extra dereferences in do_unlinkat()
> 	leases: break file leases on unlink.
> 	leases: break file leases on rename.
> 	leases: break leases on chown.
> 	VFS: refactor sys_fchmod and sys_fchmodat
> 	leases: break leases on chmod.
> 	leases: break leases on link.
> 
> Like I say, I don't think they're correct or I'd just copy them all to
> the list.  But maybe the comment on the first patch (appended) is
> useful.

  Having read the comment on the first patch ("leases: introduce ... "), I
  agree it's really useful. But, as Jamie Lokier said, "new lease semantics
  should use new userspace API." Is a new lease implementation under
  development?

  Also, IMO, the problem we are discussing is serious and should be fixed ASAP.
  Can we fix it based on this solution?

thanks, 
Mi Jinlong

> 
> --b.
> 
> leases: introduce per-inode lease enabling/disabling
> 
> The current file lease implementation is inadequate (for the purposes of
> nfs, and, we believe, for the purposes of Samba), in at least two ways:
> 
> 	- Leases are broken only on conflicting opens; but both nfsv4
> 	  delegations and (we're told) Windows op locks actually require
> 	  that leases be broken on any operation that changes file
> 	  metadata--including unlink, link, rename, chmod, and chown.
> 
> 	- The internal kernel api used for lease-breaking is inherently
> 	  racy, consisting as it does of a single break_lease() call.
> 	  (Consider this scenario: a file is not currently open and is
> 	  about to be unlinked.  During unlink processing, a lookup is
> 	  done, and break_lease() is called.  After the break_lease(),
> 	  but before the unlink completes, another user opens the file
> 	  and gets a read lease.  The unlink then completes, but the
> 	  other user thinks their read lease is still valid.  This
> 	  situation would be avoided if lease-granting for the inode
> 	  were disabled for the duration of the unlink.)
> 
> We're primarily interested in the case of read leases for now.  (Write
> leases, which also must be broken on *access* to a file, are more
> difficult to get completely right, and aren't used by the current nfs
> server.)
> 
> Fixing the second problem requires replacing break_lease() by a pair of
> calls, here called break_lease() and break_lease_end(), between which
> new leases are temporarily prohibited.
> 
> We want to implement that temporary prohibition in a simple way that has
> low impact in common (uncontended) cases.
> 
> This patch adds a field, i_leasecount, which provides mutual exclusion
> between inode-modifying operations and read leases in the same way the
> i_writecount provides mutual exclusion between write opens and execs:
> when i_leasecount is positive, it counts the number of leases on the
> given inode, and when it's negative it counts the number of operations
> which want leases temporarily disabled.  This allows selective
> enabling/disabling of leases on a per-inode basis.
> 
> To that end, the functions leases_get_access() and leases_put_access()
> are used when a lease is granted and returned, respectively.  The
> functions leases_deny_access() and leases_allow_access() are used to
> prevent races between breaking-with-FMODE_WRITE and write-lease-granting
> for the entire duration of a file operation.  Currently, leases are
> broken only when a file is opened or truncated; these functions will
> allow leases to be broken on things like unlink and rename as well.
> NFSv4 implements delegations using leases, and needs its delegations to
> be revoked on unlinks, renames, chowns, etc.
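
  The counter described above can be sketched in userspace C. This is only a
  model of the sign convention, not the code from the patch: the function
  names are taken from the quoted text, but the struct, the signatures, the
  bool return values and the lack of any locking are assumptions made for
  illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode; only the counter is modeled. */
struct model_inode {
	/* > 0: number of leases held; < 0: number of operations that
	 * currently want leases disabled; 0: neither. */
	int i_leasecount;
};

/* Called when a lease is granted. Fails while any operation has
 * leases disabled (assumed signature). */
static bool leases_get_access(struct model_inode *inode)
{
	if (inode->i_leasecount < 0)
		return false;
	inode->i_leasecount++;
	return true;
}

/* Called when a lease is returned. */
static void leases_put_access(struct model_inode *inode)
{
	assert(inode->i_leasecount > 0);
	inode->i_leasecount--;
}

/* Called by an inode-modifying operation. Fails while leases are
 * still outstanding; the caller would have to break them first. */
static bool leases_deny_access(struct model_inode *inode)
{
	if (inode->i_leasecount > 0)
		return false;
	inode->i_leasecount--;
	return true;
}

/* Called when the inode-modifying operation has finished. */
static void leases_allow_access(struct model_inode *inode)
{
	assert(inode->i_leasecount < 0);
	inode->i_leasecount++;
}
```

  With a zero-initialized model_inode, a successful leases_get_access()
  makes a following leases_deny_access() fail until leases_put_access() is
  called, and a successful leases_deny_access() makes leases_get_access()
  fail until the matching leases_allow_access(). A real implementation
  would need to make the check and the update atomic.
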
> 
> Note that this patch changes break_lease() and __break_lease(), such
> that when they are called with FMODE_WRITE and return successfully, they
> will leave leases disabled on the inode in question, and the caller must
> eventually call break_lease_end() to re-enable leasing.  As alluded to
> in the scenario above, this behavior isn't necessary when breaking
> without FMODE_WRITE: existing and new read-leases wouldn't need to be
> revoked or blocked; and a write-lease-granting setlease won't race the
> break_lease() because the latter is presumed to have been preceded by
> something like a dget() on the dentry in question (where d_count or
> i_count > 1 blocks write-lease-granting).
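
  The break_lease()/break_lease_end() bracket and the unlink race from the
  scenario above can be modeled the same way. Again a userspace sketch under
  assumptions, not the patch itself: every sketch_* name is hypothetical, the
  "racing opener" is simulated by a plain function call in the middle of the
  unlink, and breaking a lease is reduced to dropping the count (the kernel
  would notify the holders and wait for them).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode. */
struct sketch_inode {
	int leasecount;	/* > 0: leases held; < 0: leases disabled */
	bool unlinked;
};

/* Stand-in for a lease-granting setlease call: refused while an
 * operation has leases disabled. */
static bool sketch_setlease(struct sketch_inode *inode)
{
	if (inode->leasecount < 0)
		return false;
	inode->leasecount++;
	return true;
}

/* Stand-in for break_lease() with FMODE_WRITE: revoke outstanding
 * leases and return with lease-granting disabled. */
static void sketch_break_lease(struct sketch_inode *inode)
{
	if (inode->leasecount > 0)
		inode->leasecount = 0;	/* assumed: holders gave leases back */
	inode->leasecount--;
}

/* Stand-in for break_lease_end(): re-enable lease-granting. */
static void sketch_break_lease_end(struct sketch_inode *inode)
{
	assert(inode->leasecount < 0);
	inode->leasecount++;
}

/* Unlink bracketed by the pair of calls. Returns true if a lease
 * request arriving in the middle of the unlink was granted, i.e. if
 * the race described in the quoted scenario was hit. */
static bool sketch_unlink(struct sketch_inode *inode)
{
	bool raced;

	sketch_break_lease(inode);
	raced = sketch_setlease(inode);	/* simulated racing opener */
	inode->unlinked = true;
	sketch_break_lease_end(inode);
	return raced;
}
```

  In this model sketch_unlink() returns false, because the simulated opener
  is refused while the count is negative. If sketch_break_lease() did not
  leave the count negative, the same request would succeed and the opener
  would hold a lease on a file that is then unlinked.
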
> 
> This patch also closes a small existing open/lease race: a lease-related
> race exists between the time that outstanding leases are broken (by
> may_open()) and the time that, e.g., O_RDWR or O_WRONLY are reflected in
> the inode's i_writecount variable (which will prevent subsequent
> lease-granting setlease calls).  Conceivably, a read lease could be
> granted in the interim.
> 
> To deal with this, may_open() is modified so that, on success and when
> called with FMODE_WRITE, it will return with lease-granting disabled for
> the inode in question.  do_filp_open() is modified so that leasing is
> re-enabled once everything is finished.  Analogous changes are made on
> truncation.


