public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <matthew@wil.cx>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "david m. richter" <richterd@citi.umich.edu>,
	linux-fsdevel@vger.kernel.org
Subject: Re: On setting a lease across a cluster
Date: Fri, 4 Jan 2008 13:35:50 -0700	[thread overview]
Message-ID: <20080104203550.GG20473@parisc-linux.org> (raw)
In-Reply-To: <20080104194718.GK17112@fieldses.org>

On Fri, Jan 04, 2008 at 02:47:18PM -0500, J. Bruce Fields wrote:
> > > Then I started to wonder about the current split of functionality between
> > > fcntl_setlease, vfs_setlease and generic_setlease.  The check for no
> > > other process having this file open should be common to all filesystems.
> > > And it should be honoured by NFS (it shouldn't hand out a delegation if
> > > a local user has the file open), so it needs to be in vfs_setlease().
> 
> Of course, in the case of a (still unfortunately theoretical) cluster
> filesystem implementation, those checks will be insufficient on their
> own.

Oh yes, definitely insufficient, but also necessary.  Right now,
filesystems bypass them entirely.  And the last thing we want is
individual filesystems duplicating these tests ...

> Though the write-lease check especially:
> 
> 	if ((arg == F_WRLCK)
>             && ((atomic_read(&dentry->d_count) > 1)
> 	                    || (atomic_read(&inode->i_count) > 1)))
> 		goto out;
> 
> seems like a terrible hack, and gives some annoying false positives:
> 
> 	http://www.ussg.iu.edu/hypermail/linux/kernel/0711.1/2255.html

It is a terrible hack.  It was sufficient for the time, but it wasn't
supposed to be merged in that state ... I haven't had the courage to
figure out how to solve it properly yet.

> > > Now I'm worrying about the mechanics of calling out to a filesystem to
> > > perform a lock.  Obviously, a network filesystem is going to have to
> > > sleep waiting for a response from the fileserver, so that precludes the
> > > use of a spinlock held over the call to f_op->setlease.  I'm not keen on
> > > the idea of EXPORT_SYMBOL_GPL() a spinlock -- verifying locking is quite
> > > hard enough without worrying what a filesystem might be doing with it.
> 
> I'm confused as to what the locks are actually protecting.
> 
> If they're mainly just protecting the inode's lock list, for example,
> then we should let the filesystem call back to helpers in locks.c for
> any manipulations of that list and do the locking in those helpers.

But we need to ensure the lock list is always consistent for, eg,
/proc/locks.  We don't want locks to temporarily appear on the list when
they never get granted (due to an error elsewhere in the chain).

> > > vfs_setlease()
> > >    if (f_op->setlease())
> > >       res = f_op->setlease()
> > >       if (res)
> > >          return res;
> > >    lock_kernel()
> > >    generic_setlease()
> > >    unlock_kernel()
> 
> Why can't the filesystem call into generic_setlease() on its own?

Because (assuming we're rid of the BKL), fcntl_setlease() needs to
acquire the spinlock and hold it while generic_setlease() runs, so
generic_setlease() can't acquire the lock.

> Christoph Hellwig has suggested removing the second case and just doing
> 
> 	.setlease = generic_setlease
> 
> for all filesystems that should support leases; see the final patch in
> 
> 	git://linux-nfs.org/~bfields/linux.git server-cluster-lease-api

Yeah, but that doesn't work if we repurpose the setlease f_op.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

  reply	other threads:[~2008-01-04 20:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-04 18:14 On setting a lease across a cluster Matthew Wilcox
2008-01-04 18:55 ` david m. richter
2008-01-04 19:47   ` J. Bruce Fields
2008-01-04 20:35     ` Matthew Wilcox [this message]
2008-01-04 20:53       ` J. Bruce Fields
2008-01-04 21:08         ` Matthew Wilcox
2008-01-04 21:16           ` J. Bruce Fields
2008-01-04 20:18   ` Matthew Wilcox
2008-01-05 17:44     ` david m. richter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080104203550.GG20473@parisc-linux.org \
    --to=matthew@wil.cx \
    --cc=bfields@fieldses.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=richterd@citi.umich.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox