linux-fsdevel.vger.kernel.org archive mirror
* File system awareness (or lack thereof) of vfs granting of leases
@ 2007-02-16 23:51 Robert Rappaport
  2007-02-17  5:32 ` Wendy Cheng
  0 siblings, 1 reply; 14+ messages in thread
From: Robert Rappaport @ 2007-02-16 23:51 UTC (permalink / raw)
  To: linux-fsdevel

I am investigating the problem of supporting a samba server's granting
of OpLocks to its clients when the files that the samba server is
accessing are in a clustered file system.

A samba server running in linux determines whether and what kind of
OpLock (i.e. either shared or exclusive) to grant to its clients for a
particular file, by establishing a corresponding lease on the file in
question.  This works well for single host based file systems (e.g.
ext3, etc.) but does not work at all for distributed or clustered file
systems.  This is because the vfs running on the same node where the
samba server is running is not necessarily aware of all accesses to
the file on which it is granting a lease.  Since vfs does not
currently inform file systems about the granting and rescinding of
leases, a clustered file system cannot allow a samba server to support
OpLocks on its files and this has a negative impact on performance.

What I think is needed is to add a file-system-defined
file_operations function that would be invoked when vfs is
considering the granting of a lease on a file associated with an
inode.  Such an enhancement would allow a file system to become aware
of vfs lease activity and allow it to support this activity.

I would like to know if there has been any work done in this area and
if so, are there people with whom I could correspond to help me work
this out.

I am fairly new to this area and would appreciate any help that I can get.

Thanks.

- Robert Rappaport
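
For context, the lease that a server such as Samba establishes is requested from user space with fcntl(F_SETLEASE). The following is a minimal sketch in Python, assuming a Linux host; the helper name try_read_lease is hypothetical, and whether the request succeeds depends on the file system and on the caller's privileges.

```python
import fcntl
import os


def try_read_lease(path):
    """Try to take a read lease on path; return True if the kernel granted it."""
    # A read lease can only be placed on a descriptor opened read-only.
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_RDLCK)
        # F_GETLEASE reports the lease type currently held on the descriptor.
        granted = fcntl.fcntl(fd, fcntl.F_GETLEASE) == fcntl.F_RDLCK
        # Drop the lease explicitly before closing the descriptor.
        fcntl.fcntl(fd, fcntl.F_SETLEASE, fcntl.F_UNLCK)
        return granted
    except OSError:
        # The kernel refused, e.g. conflicting opens or an unsupported file system.
        return False
    finally:
        os.close(fd)
```

On a clustered file system the local kernel answers this request using only what it knows about opens on its own node, which is the gap described above.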


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-16 23:51 File system awareness (or lack thereof) of vfs granting of leases Robert Rappaport
@ 2007-02-17  5:32 ` Wendy Cheng
  2007-02-18  6:39   ` J. Bruce Fields
  0 siblings, 1 reply; 14+ messages in thread
From: Wendy Cheng @ 2007-02-17  5:32 UTC (permalink / raw)
  To: Robert Rappaport; +Cc: linux-fsdevel

Robert Rappaport wrote:

> [snip]
> ....   This is because the vfs running on the same node where the
> samba server is running is not necessarily aware of all accesses to
> the file on which it is granting a lease.  Since vfs does not
> currently inform file systems about the granting and rescinding of
> leases, a clustered file system cannot allow a samba server to support
> OpLocks on its files and this has a negative impact on performance.
>
> What I think is needed is to add a file systems defined
> file_operations function, that would be invoked when vfs is
> considering the granting of a lease on a file associated with an
> inode.  Such an enhancement would allow a file system to become aware
> of vfs lease activity and allow it to support this activity.
>
NFS has similar issues because Linux NLM-VFS does not invoke the server-side,
filesystem-specific lock method. This implies that NFS client applications are
not able to use posix locks to coordinate file access across different
nodes with a cluster filesystem, even if the cluster filesystem itself
supports posix locking. IBM Research and the University of Michigan CITI
group have worked out a set of patches to remedy the issue:

http://www.opensubscriber.com/message/linux-fsdevel@vger.kernel.org/5527833.html

-- Wendy


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-17  5:32 ` Wendy Cheng
@ 2007-02-18  6:39   ` J. Bruce Fields
  2007-02-20 15:46     ` Robert Rappaport
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2007-02-18  6:39 UTC (permalink / raw)
  To: Wendy Cheng; +Cc: Robert Rappaport, linux-fsdevel

On Sat, Feb 17, 2007 at 12:32:42AM -0500, Wendy Cheng wrote:
> Robert Rappaport wrote:
> 
> >[snip]
> >....   This is because the vfs running on the same node where the
> >samba server is running is not necessarily aware of all accesses to
> >the file on which it is granting a lease.  Since vfs does not
> >currently inform file systems about the granting and rescinding of
> >leases, a clustered file system cannot allow a samba server to support
> >OpLocks on its files and this has a negative impact on performance.
> >
> >What I think is needed is to add a file systems defined
> >file_operations function, that would be invoked when vfs is
> >considering the granting of a lease on a file associated with an
> >inode.  Such an enhancement would allow a file system to become aware
> >of vfs lease activity and allow it to support this activity.
> >
> NFS has similar issues because Linux NLM-VFS does not invoke server side 
> filesystem specific lock method. This implies NFS client applications are
> not able to use posix locks to coordinate file access across different 
> nodes  with a cluster filesystem, even the cluster filesystem itself 
> does support posix locking.

We also have the same problem with leases, since we're using leases to
implement NFSv4 delegations.  There's a simple-minded patch here:

	http://linux-nfs.org/cgi-bin/gitweb.cgi?p=bfields-2.6.git;a=commitdiff;h=4e8aff5dabe07b2e4e95ef0c741a34f65409087f

I'm not really sure if it's right.

--b.


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-18  6:39   ` J. Bruce Fields
@ 2007-02-20 15:46     ` Robert Rappaport
  2007-02-20 16:33       ` J. Bruce Fields
  2007-02-20 19:08       ` David Teigland
  0 siblings, 2 replies; 14+ messages in thread
From: Robert Rappaport @ 2007-02-20 15:46 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Wendy Cheng, linux-fsdevel

On 2/18/07, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Sat, Feb 17, 2007 at 12:32:42AM -0500, Wendy Cheng wrote:
> > Robert Rappaport wrote:
> >
> > >[snip]
> > >....   This is because the vfs running on the same node where the
> > >samba server is running is not necessarily aware of all accesses to
> > >the file on which it is granting a lease.  Since vfs does not
> > >currently inform file systems about the granting and rescinding of
> > >leases, a clustered file system cannot allow a samba server to support
> > >OpLocks on its files and this has a negative impact on performance.
> > >
> > >What I think is needed is to add a file systems defined
> > >file_operations function, that would be invoked when vfs is
> > >considering the granting of a lease on a file associated with an
> > >inode.  Such an enhancement would allow a file system to become aware
> > >of vfs lease activity and allow it to support this activity.
> > >
> > NFS has similar issues because Linux NLM-VFS does not invoke server side
> > filesystem specific lock method. This implies NFS client applications are
> > not able to use posix locks to coordinate file access across different
> > nodes  with a cluster filesystem, even the cluster filesystem itself
> > does support posix locking.
>
> We also have the same problem with leases, since we're using leases to
> implement NFSv4 delegations.  There's a simple-minded patch here:
>
>         http://linux-nfs.org/cgi-bin/gitweb.cgi?p=bfields-2.6.git;a=commitdiff;h=4e8aff5dabe07b2e4e95ef0c741a34f65409087f
>
> I'm not really sure if it's right.
>
> --b.
>

Thank you both for your helpful replies.  In particular, the addition
of the calls to file system specific functions in routines,
fcntl_setlease() and break_lease(), as well as the modifications to
the file_operations and inode_operations structures, pointed to by
Bruce's reply, look exactly like the hooks that I would need to
proceed to resolve my problems.  Is there any timetable established
for these modifications to make it into a future release?  These hooks
would clearly benefit any cluster file system that has to deal with
leases.

- Robert Rappaport


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 15:46     ` Robert Rappaport
@ 2007-02-20 16:33       ` J. Bruce Fields
  2007-02-20 19:08         ` Robert Rappaport
  2007-02-20 19:08       ` David Teigland
  1 sibling, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2007-02-20 16:33 UTC (permalink / raw)
  To: Robert Rappaport; +Cc: Wendy Cheng, linux-fsdevel

On Tue, Feb 20, 2007 at 10:46:51AM -0500, Robert Rappaport wrote:
> >We also have the same problem with leases, since we're using leases to
> >implement NFSv4 delegations.  There's a simple-minded patch here:
> >
> >        http://linux-nfs.org/cgi-bin/gitweb.cgi?p=bfields-2.6.git;a=commitdiff;h=4e8aff5dabe07b2e4e95ef0c741a34f65409087f
> >
> >I'm not really sure if it's right.
> 
> Thank you both for your helpful replies.  In particular, the addition
> of the calls to file system specific functions in routines,
> fcntl_setlease() and break_lease(), as well as the modifications to
> the file_operations and inode_operations structures, pointed to by
> Bruce's reply, look exactly like the hooks that I would need to
> proceed to resolve my problems.  Is there any timetable established
> for these modifications to make it into a future release?  These hooks
> would clearly benefit any cluster file system that has to deal with
> leases.

We've been concentrating on the posix locks problem first, but that may
be done in time for 2.6.22.

If someone wants to help--we'll need to figure out how to implement this
for gfs2 and/or ocfs2.  And any review or testing (e.g. with Samba)
would be helpful.

--b.


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 16:33       ` J. Bruce Fields
@ 2007-02-20 19:08         ` Robert Rappaport
  2007-02-20 21:14           ` bfields
  0 siblings, 1 reply; 14+ messages in thread
From: Robert Rappaport @ 2007-02-20 19:08 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Wendy Cheng, linux-fsdevel

Bruce,

After looking more carefully at your changes, I have a question.  Why
didn't you modify the linux kernel routine, setlease(), so that it
would either call f_op->set_lease() or __setlease()?  Instead you
created a new routine, nfs4_setlease(), and you modified the previous
calls to setlease() in nfs4 to now call nfs4_setlease.

- Robert Rappaport

On 2/20/07, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Tue, Feb 20, 2007 at 10:46:51AM -0500, Robert Rappaport wrote:
> > >We also have the same problem with leases, since we're using leases to
> > >implement NFSv4 delegations.  There's a simple-minded patch here:
> > >
> > >        http://linux-nfs.org/cgi-bin/gitweb.cgi?p=bfields-2.6.git;a=commitdiff;h=4e8aff5dabe07b2e4e95ef0c741a34f65409087f
> > >
> > >I'm not really sure if it's right.
> >
> > Thank you both for your helpful replies.  In particular, the addition
> > of the calls to file system specific functions in routines,
> > fcntl_setlease() and break_lease(), as well as the modifications to
> > the file_operations and inode_operations structures, pointed to by
> > Bruce's reply, look exactly like the hooks that I would need to
> > proceed to resolve my problems.  Is there any timetable established
> > for these modifications to make it into a future release?  These hooks
> > would clearly benefit any cluster file system that has to deal with
> > leases.
>
> We've been concentrating on the posix locks problem first, but that may
> be done in time for 2.6.22.
>
> If someone wants to help--we'll need to figure out how to implement this
> for gfs2 and/or ocfs2.  And any review or testing (e.g. with Samba)
> would be helpful.
>
> --b.
>


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 15:46     ` Robert Rappaport
  2007-02-20 16:33       ` J. Bruce Fields
@ 2007-02-20 19:08       ` David Teigland
  2007-02-20 20:51         ` bfields
  1 sibling, 1 reply; 14+ messages in thread
From: David Teigland @ 2007-02-20 19:08 UTC (permalink / raw)
  To: Robert Rappaport; +Cc: J. Bruce Fields, Wendy Cheng, linux-fsdevel

On Tue, Feb 20, 2007 at 10:46:51AM -0500, Robert Rappaport wrote:
> Thank you both for your helpful replies.  In particular, the addition
> of the calls to file system specific functions in routines,
> fcntl_setlease() and break_lease(), as well as the modifications to
> the file_operations and inode_operations structures, pointed to by
> Bruce's reply, look exactly like the hooks that I would need to
> proceed to resolve my problems.  Is there any timetable established
> for these modifications to make it into a future release?  These hooks
> would clearly benefit any cluster file system that has to deal with
> leases.

We did an experimental distributed lease implementation in gfs(1) a while
ago.  It worked, but was so extremely expensive that there was no point in
considering it seriously.  The problem is that _every_ open and close of
every file requires a new dlm lock operation.  Leases require knowledge
about the cluster-wide opened/closed state of files, and also the
mode they're open in.

Dave
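
To illustrate why the grant decision needs the open state and mode from every node, here is a small Python model. It is only a sketch: the function name can_grant_lease and the cluster_opens structure are hypothetical, the rules are a simplified reading of the local lease semantics (a read lease conflicts with any open for writing, a write lease with any other open), and the requester's own descriptor is assumed to be excluded from cluster_opens.

```python
def can_grant_lease(lease_type, cluster_opens):
    """Decide whether a lease could be granted given opens on all nodes.

    cluster_opens maps a node name to the list of modes ("r" or "w") in
    which other openers on that node currently hold the file open.
    """
    modes = [m for node_modes in cluster_opens.values() for m in node_modes]
    if lease_type == "read":
        # A read lease is incompatible with any writer, on any node.
        return all(m == "r" for m in modes)
    if lease_type == "write":
        # A write lease is incompatible with any other open at all.
        return len(modes) == 0
    raise ValueError("lease_type must be 'read' or 'write'")
```

Keeping cluster_opens accurate is the expensive part: each open and close on any node has to update it, which is the per-open lock operation described above.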



* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 19:08       ` David Teigland
@ 2007-02-20 20:51         ` bfields
  2007-02-20 20:55           ` bfields
  2007-02-20 21:25           ` David Teigland
  0 siblings, 2 replies; 14+ messages in thread
From: bfields @ 2007-02-20 20:51 UTC (permalink / raw)
  To: David Teigland; +Cc: J. Bruce Fields, Wendy Cheng, linux-fsdevel

> On Tue, Feb 20, 2007 at 10:46:51AM -0500, Robert Rappaport wrote:
> We did an experimental distributed lease implementation in gfs(1) a while
> ago.  It worked, but was so extremely expensive that there was no point in
> considering it seriously.  The problem is that _every_ open and close of
> every file requires a new dlm lock operation.  Leases require knowledge
> about the cluster-wide opened/closed state of files, not only that but the
> mode they're open in.

We're using leases to implement NFSv4 delegations.  Delegations are
similar to leases--they come in read and write variants, and they give
clients a guarantee that they'll be warned before another client is
allowed to do a conflicting open--but delegations are completely optional.
A server can deny a delegation for any reason, even when there isn't
necessarily a conflicting open.

So perhaps we need some way for nfsd to ask the filesystem to give it a
lease, but only if it's easy to do so.  Would it be possible to make it
cheap for GFS to give out leases in some particular (hopefully common)
cases?

And would such an operation be useful for Samba, or does it really need
leases to be mandatory?

--b.
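
The "optional" nature of a delegation can be sketched as follows. This is a hypothetical Python model, not nfsd code: nfsd_open and the request_lease callback are illustrative names, and the point is only that a refusal from the file system does not fail the open.

```python
def nfsd_open(file_name, request_lease):
    """Model an NFSv4 open where the delegation is strictly optional.

    request_lease is a callable standing in for the file system; it
    returns True if a lease was granted and False if it declined.
    """
    granted = request_lease(file_name)
    return {
        "file": file_name,
        "opened": True,  # the open succeeds either way
        "delegation": "read" if granted else None,
    }
```

A file system that can only answer cheaply in some cases could return False in all the others, and the client would simply proceed without a delegation.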



* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 20:51         ` bfields
@ 2007-02-20 20:55           ` bfields
  2007-02-20 21:25           ` David Teigland
  1 sibling, 0 replies; 14+ messages in thread
From: bfields @ 2007-02-20 20:55 UTC (permalink / raw)
  To: bfields; +Cc: David Teigland, J. Bruce Fields, Wendy Cheng, linux-fsdevel

I wrote:
>> On Tue, Feb 20, 2007 at 10:46:51AM -0500, Robert Rappaport wrote:
>>We did an experimental distributed lease implementation in gfs(1) a while

(Sorry for the misattribution; it was David Teigland who said that.  I'm
having some mail troubles....) --b.



* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 19:08         ` Robert Rappaport
@ 2007-02-20 21:14           ` bfields
  2007-02-20 21:57             ` Robert Rappaport
  0 siblings, 1 reply; 14+ messages in thread
From: bfields @ 2007-02-20 21:14 UTC (permalink / raw)
  To: Robert Rappaport; +Cc: J. Bruce Fields, Wendy Cheng, linux-fsdevel

"Robert Rappaport" <robert.rappaport@gmail.com> said:
> After looking more carefully at your changes, I have a question.  Why
> didn't you modify the linux kernel routine, setlease(), so that it
> would either call f_op->set_lease() or __setlease()?  Instead you
> created a new routine, nfs4_setlease(), and you modified the previous
> calls to setlease() in nfs4 to now call nfs4_setlease.

No good reason that I can see.  I think you're correct that the
->setlease() call should go into the common code, and be invoked by fcntl
too.

--b.
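
The dispatch being agreed on here (one common entry point that calls the file system's method when it exists and the generic code otherwise) can be modeled in a few lines of Python. This is a user-space sketch, not kernel code: the File class, the f_op dictionary, and refuse_all are hypothetical stand-ins, and generic_setlease plays the role of the thread's __setlease().

```python
import errno


class File:
    """Toy stand-in for an open file with an optional operations table."""

    def __init__(self, f_op=None):
        self.f_op = f_op or {}
        self.lease = None


def generic_setlease(file, lease_type):
    """Stand-in for the generic VFS lease code."""
    file.lease = lease_type
    return 0


def refuse_all(file, lease_type):
    """Example hook for a file system that cannot support leases."""
    return -errno.EAGAIN


def setlease(file, lease_type):
    """Common entry point: prefer the file system's hook when one is defined."""
    hook = file.f_op.get("setlease")
    if hook is not None:
        return hook(file, lease_type)
    return generic_setlease(file, lease_type)
```

With the check in the common path, both the fcntl caller and nfsd would reach the same hook, so neither needs a private wrapper.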



* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 20:51         ` bfields
  2007-02-20 20:55           ` bfields
@ 2007-02-20 21:25           ` David Teigland
  2007-02-22 21:58             ` J. Bruce Fields
  1 sibling, 1 reply; 14+ messages in thread
From: David Teigland @ 2007-02-20 21:25 UTC (permalink / raw)
  To: bfields; +Cc: Wendy Cheng, linux-fsdevel

On Tue, Feb 20, 2007 at 03:51:54PM -0500, bfields@fieldses.org wrote:
> > On Tue, Feb 20, 2007 at 10:46:51AM -0500, Robert Rappaport wrote:
> > We did an experimental distributed lease implementation in gfs(1) a while
> > ago.  It worked, but was so extremely expensive that there was no point in
> > considering it seriously.  The problem is that _every_ open and close of
> > every file requires a new dlm lock operation.  Leases require knowledge
> > about the cluster-wide opened/closed state of files, not only that but the
> > mode they're open in.
> 
> We're using leases to implement NFSv4 delegations.  Delegations are
> similar to leases--they come in read and write variants, and they give
> clients a guarantee that they'll be warned before another client is
> allowed to do a conflicting open--but delegations are completely optional.
>  A server can deny a delegation for any reason, even when there isn't
> necessarily a conflicting open.
> 
> So perhaps we need some way for nfsd to ask the filesystem to give it a
> lease, but only if it's easy to do so.  Would it be possible to make it
> cheap for GFS to give out leases in some particular (hopefully common)
> cases?

I don't know of any shortcuts off hand, but there could certainly be some.
Doing something completely different and not using cluster locks may also
be worth investigating.

Dave



* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 21:14           ` bfields
@ 2007-02-20 21:57             ` Robert Rappaport
  0 siblings, 0 replies; 14+ messages in thread
From: Robert Rappaport @ 2007-02-20 21:57 UTC (permalink / raw)
  To: bfields@fieldses.org; +Cc: Wendy Cheng, linux-fsdevel

On 2/20/07, bfields@fieldses.org <bfields@fieldses.org> wrote:
> "Robert Rappaport" <robert.rappaport@gmail.com> said:
> > After looking more carefully at your changes, I have a question.  Why
> > didn't you modify the linux kernel routine, setlease(), so that it
> > would either call f_op->set_lease() or __setlease()?  Instead you
> > created a new routine, nfs4_setlease(), and you modified the previous
> > calls to setlease() in nfs4 to now call nfs4_setlease.
>
> No good reason that I can see.  I think you're correct that the
> ->setlease() call should go into the common code, and be invoked by fcntl
> too.

That brings up another thing for me.  As I envision what I need to do
for my problem, what I want in my f_op->set_lease() routine is to do
some specific file system work and then I would like to invoke the
__setlease() routine also.  Currently that routine is defined as
static and therefore uncallable from outside.  Is it possible to
export that routine or at least the functionality currently in that
routine?

- Robert Rappaport
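
The layering asked for here (a file system hook that does its own work first and then falls through to the generic routine) might look like the following Python model. It is a sketch under stated assumptions: File, generic_setlease, and make_cluster_setlease are hypothetical names, and remote_opens stands for whatever cluster-wide knowledge the file system has; how it would obtain that knowledge is not modeled.

```python
import errno


class File:
    """Toy stand-in for an open file."""

    def __init__(self, name):
        self.name = name
        self.lease = None


def generic_setlease(file, lease_type):
    """Stand-in for the generic lease routine that would need to be callable."""
    file.lease = lease_type
    return 0


def make_cluster_setlease(remote_opens):
    """Build a hook for a cluster file system.

    remote_opens is the set of file names known to be open on other nodes.
    """

    def cluster_setlease(file, lease_type):
        # File-system-specific work first: refuse if another node has it open.
        if file.name in remote_opens:
            return -errno.EAGAIN
        # Then reuse the generic bookkeeping instead of duplicating it.
        return generic_setlease(file, lease_type)

    return cluster_setlease
```

The second step is only possible if the generic routine is visible to the file system, which is the reason for asking that it be exported.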


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-20 21:25           ` David Teigland
@ 2007-02-22 21:58             ` J. Bruce Fields
  2007-02-22 22:57               ` David Teigland
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2007-02-22 21:58 UTC (permalink / raw)
  To: David Teigland; +Cc: Wendy Cheng, linux-fsdevel

On Tue, Feb 20, 2007 at 03:25:08PM -0600, David Teigland wrote:
> On Tue, Feb 20, 2007 at 03:51:54PM -0500, bfields@fieldses.org wrote:
> > > On Tue, Feb 20, 2007 at 10:46:51AM -0500, Robert Rappaport wrote:
> > > We did an experimental distributed lease implementation in gfs(1) a while
> > > ago.  It worked, but was so extremely expensive that there was no point in
> > > considering it seriously.  The problem is that _every_ open and close of
> > > every file requires a new dlm lock operation.  Leases require knowledge
> > > about the cluster-wide opened/closed state of files, not only that but the
> > > mode they're open in.
> > 
> > We're using leases to implement NFSv4 delegations.  Delegations are
> > similar to leases--they come in read and write variants, and they give
> > clients a guarantee that they'll be warned before another client is
> > allowed to do a conflicting open--but delegations are completely optional.
> >  A server can deny a delegation for any reason, even when there isn't
> > necessarily a conflicting open.
> > 
> > So perhaps we need some way for nfsd to ask the filesystem to give it a
> > lease, but only if it's easy to do so.  Would it be possible to make it
> > cheap for GFS to give out leases in some particular (hopefully common)
> > cases?
> 
> I don't know of any shortcuts off hand, but there could certainly be some.
> Doing something completely different and not using cluster locks may also
> be worth investigating.

I'm also curious--exposing my total ignorance of the dlm--why taking
such a lock would always be so expensive, or would always be required on
open.  Surely the typical case should be one where there's no conflict
and never has been, in which case asking for the lock you need should be
a trivial local operation?

--b.


* Re: File system awareness (or lack thereof) of vfs granting of leases
  2007-02-22 21:58             ` J. Bruce Fields
@ 2007-02-22 22:57               ` David Teigland
  0 siblings, 0 replies; 14+ messages in thread
From: David Teigland @ 2007-02-22 22:57 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Wendy Cheng, linux-fsdevel

On Thu, Feb 22, 2007 at 04:58:28PM -0500, J. Bruce Fields wrote:
> I'm also curious--exposing my total ignorance of the dlm--why taking
> such a lock would always be so expensive, or would always be required on
> open.  Surely the typical case should be one where there's no conflict
> and never has been, in which case asking for the lock you need should be
> a trivial local operation?

Acquiring a dlm lock can require 0, 1 or 2 remote requests, depending on
the luck of a hash and whether another node has a lock on the same file.
It's usually nonzero and you have to do a remote operation.
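
One way to picture the 0, 1 or 2 count is the Python model below. It is an assumption about how the count arises, not a description of the real dlm protocol: it supposes that a hash of the resource name selects a directory node that must be asked who masters the lock, and that the request itself goes to the master if another node already holds that role. All names are hypothetical.

```python
def remote_requests(local, directory, master):
    """Count remote messages for one lock acquisition in a simplified model.

    local:     the node asking for the lock
    directory: the node selected by hashing the resource name
    master:    the node already mastering the resource, or None
    """
    count = 0
    if directory != local:
        # Ask the remote directory node who masters the resource.
        count += 1
    if master is not None and master != local:
        # Send the actual lock request to the remote master.
        count += 1
    return count
```

In a cluster of N nodes the hash lands on the local node only about one time in N under this model, which is consistent with the count usually being nonzero.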

You have to do a dlm operation on every open and every close of every file
because you need to have a record of the opened state of the file (and
mode) _in case_ someone tries to do a lease.  If you removed these two
rules (from fcntl man page)

  "A read lease can only be placed on a file descriptor that is opened
   read-only."

  "A write lease may be placed on a file only if no other process
   currently has the file open."

it would become much easier because you could distribute only the current
lease state across all the nodes and every open would just require a local
check, assuming there are far fewer leases than opens.

Dave
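
The alternative described above, where only lease state is replicated and each open is a local lookup, can be sketched in Python. This is a hypothetical model: LeaseTable, cluster_updates, and open_conflicts are illustrative names, and the conflict rule (a write lease conflicts with any open, a read lease with an open for writing) is a simplification.

```python
class LeaseTable:
    """Per-node replica of the cluster-wide lease state."""

    def __init__(self):
        self.leases = {}          # file name -> "read" or "write"
        self.cluster_updates = 0  # how often state had to be distributed

    def set_lease(self, name, lease_type):
        # Granting a lease is the only step that touches the other nodes.
        self.leases[name] = lease_type
        self.cluster_updates += 1

    def open_conflicts(self, name, for_write):
        # An open consults only the local replica; no cluster traffic.
        lease = self.leases.get(name)
        if lease is None:
            return False
        return lease == "write" or for_write
```

Under this scheme cluster traffic scales with the number of leases rather than the number of opens, which is the trade-off named above.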


