public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
@ 2008-04-29  3:42 Bryan Wu
  2008-04-29 20:54 ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Bryan Wu @ 2008-04-29  3:42 UTC (permalink / raw)
  To: LKML, Andrew Morton, Linus Torvalds, willy, Al Viro,
	uclinux-dist-devel

Hi folk,

This days I am digging into this LTP bug reported on our Blackfin test
machine, but I think it is general for other system.
https://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3743

And I also found Kumar Gala reported this similar bug before.
http://lkml.org/lkml/2007/11/14/388

1, when opening and creating a new on ramfs/tmpfs, the dentry->d_count
will be added one as below:
--
ramfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
{
|_______struct inode * inode = ramfs_get_inode(dir->i_sb, mode, dev);
|_______int error = -ENOSPC;

|_______if (inode) {
|_______|_______if (dir->i_mode & S_ISGID) {
|_______|_______|_______inode->i_gid = dir->i_gid;
|_______|_______|_______if (S_ISDIR(mode))
|_______|_______|_______|_______inode->i_mode |= S_ISGID;
|_______|_______}
|_______|_______d_instantiate(dentry, inode);
|_______|_______dget(dentry);|__/* Extra count - pin the dentry in core */
|_______|_______error = 0;
|_______|_______dir->i_mtime = dir->i_ctime = CURRENT_TIME;
|_______}
|_______return error;
}
--
The dget(dentry) call introduces an extra count, why?
it is the same in tmpfs.

2, when calling  fcntl(fd, F_SETLEASE,F_WRLCK), it will return -EAGAIN
--
|_______if ((arg == F_WRLCK)
|_______    && ((atomic_read(&dentry->d_count) > 1)
|_______|_______|| (atomic_read(&inode->i_count) > 1)))
|_______|_______goto out;
--

because the dentry->d_count will be 2 not 1. I tested ext2 on Blackfin, it is 1.

3, so I guess maybe the dget(dentry) of ramfs_mknod is useless. But
after remove this dget(),
the ramfs can not be mounted as rootfs at all.

Is the bug in generic_setlease() or in the ramfs/tmpfs inode create function?

Of course, simply remove the test '((atomic_read(&dentry->d_count) >
1)' can workaround this issue.

Thanks
Best Regards,
-Bryan Wu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29  3:42 [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs Bryan Wu
@ 2008-04-29 20:54 ` Andrew Morton
  2008-04-29 21:42   ` J. Bruce Fields
  2008-04-29 22:21   ` Matthew Wilcox
  0 siblings, 2 replies; 14+ messages in thread
From: Andrew Morton @ 2008-04-29 20:54 UTC (permalink / raw)
  To: Bryan Wu
  Cc: linux-kernel, torvalds, willy, viro, uclinux-dist-devel,
	J. Bruce Fields

On Tue, 29 Apr 2008 11:42:48 +0800
"Bryan Wu" <cooloney@kernel.org> wrote:

> Hi folk,
> 
> This days I am digging into this LTP bug reported on our Blackfin test
> machine, but I think it is general for other system.
> https://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3743
> 
> And I also found Kumar Gala reported this similar bug before.
> http://lkml.org/lkml/2007/11/14/388
> 
> 1, when opening and creating a new on ramfs/tmpfs, the dentry->d_count
> will be added one as below:
> --
> ramfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> {
> |_______struct inode * inode = ramfs_get_inode(dir->i_sb, mode, dev);
> |_______int error = -ENOSPC;
> 
> |_______if (inode) {
> |_______|_______if (dir->i_mode & S_ISGID) {
> |_______|_______|_______inode->i_gid = dir->i_gid;
> |_______|_______|_______if (S_ISDIR(mode))
> |_______|_______|_______|_______inode->i_mode |= S_ISGID;
> |_______|_______}
> |_______|_______d_instantiate(dentry, inode);
> |_______|_______dget(dentry);|__/* Extra count - pin the dentry in core */
> |_______|_______error = 0;
> |_______|_______dir->i_mtime = dir->i_ctime = CURRENT_TIME;
> |_______}
> |_______return error;
> }
> --
> The dget(dentry) call introduces an extra count, why?
> it is the same in tmpfs.

Because those dentries have no backing store.  Their sole existance is in
the dentry cache which is normally reclaimable.  But we can't reclaim these
dentries because there is nowhere from where they can be reestablished.

> 2, when calling  fcntl(fd, F_SETLEASE,F_WRLCK), it will return -EAGAIN
> --
> |_______if ((arg == F_WRLCK)
> |_______    && ((atomic_read(&dentry->d_count) > 1)
> |_______|_______|| (atomic_read(&inode->i_count) > 1)))
> |_______|_______goto out;
> --

Sucky heuristic.

> because the dentry->d_count will be 2 not 1. I tested ext2 on Blackfin, it is 1.
> 
> 3, so I guess maybe the dget(dentry) of ramfs_mknod is useless. But
> after remove this dget(),
> the ramfs can not be mounted as rootfs at all.

Interesting.  Presumably it got reclaimed synchronously somehow.

> Is the bug in generic_setlease() or in the ramfs/tmpfs inode create function?
> 
> Of course, simply remove the test '((atomic_read(&dentry->d_count) >
> 1)' can workaround this issue.

I guess we should make the generic_setlease() heuristic smarter.

Of course the _reason_ for that heuristic is uncommented and lost in time. 
And one wonders what locking prevents it from being totally racy, and if
"none", what happens when the race hits.  Sigh.

I suppose a stupid fix would be to set (and later clear) a new flag in
dentry.d_flags which means

  this dentry is pinned by a ram-backed device, so d_count==2 means
  "unused""

But it would be better to work out exactly what generic_setlease() is
trying to do there, and do it in a better way.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 20:54 ` Andrew Morton
@ 2008-04-29 21:42   ` J. Bruce Fields
  2008-04-29 22:01     ` Mike Frysinger
                       ` (2 more replies)
  2008-04-29 22:21   ` Matthew Wilcox
  1 sibling, 3 replies; 14+ messages in thread
From: J. Bruce Fields @ 2008-04-29 21:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Bryan Wu, linux-kernel, torvalds, willy, viro, uclinux-dist-devel,
	richterd

On Tue, Apr 29, 2008 at 01:54:54PM -0700, Andrew Morton wrote:
> On Tue, 29 Apr 2008 11:42:48 +0800
> "Bryan Wu" <cooloney@kernel.org> wrote:
> 
> > Hi folk,
> > 
> > This days I am digging into this LTP bug reported on our Blackfin test
> > machine, but I think it is general for other system.
> > https://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3743
> > 
> > And I also found Kumar Gala reported this similar bug before.
> > http://lkml.org/lkml/2007/11/14/388
> > 
> > 1, when opening and creating a new on ramfs/tmpfs, the dentry->d_count
> > will be added one as below:
> > --
> > ramfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> > {
> > |_______struct inode * inode = ramfs_get_inode(dir->i_sb, mode, dev);
> > |_______int error = -ENOSPC;
> > 
> > |_______if (inode) {
> > |_______|_______if (dir->i_mode & S_ISGID) {
> > |_______|_______|_______inode->i_gid = dir->i_gid;
> > |_______|_______|_______if (S_ISDIR(mode))
> > |_______|_______|_______|_______inode->i_mode |= S_ISGID;
> > |_______|_______}
> > |_______|_______d_instantiate(dentry, inode);
> > |_______|_______dget(dentry);|__/* Extra count - pin the dentry in core */
> > |_______|_______error = 0;
> > |_______|_______dir->i_mtime = dir->i_ctime = CURRENT_TIME;
> > |_______}
> > |_______return error;
> > }
> > --
> > The dget(dentry) call introduces an extra count, why?
> > it is the same in tmpfs.
> 
> Because those dentries have no backing store.  Their sole existance is in
> the dentry cache which is normally reclaimable.  But we can't reclaim these
> dentries because there is nowhere from where they can be reestablished.
> 
> > 2, when calling  fcntl(fd, F_SETLEASE,F_WRLCK), it will return -EAGAIN
> > --
> > |_______if ((arg == F_WRLCK)
> > |_______    && ((atomic_read(&dentry->d_count) > 1)
> > |_______|_______|| (atomic_read(&inode->i_count) > 1)))
> > |_______|_______goto out;
> > --
> 
> Sucky heuristic.

Yes.

> 
> > because the dentry->d_count will be 2 not 1. I tested ext2 on Blackfin, it is 1.
> > 
> > 3, so I guess maybe the dget(dentry) of ramfs_mknod is useless. But
> > after remove this dget(),
> > the ramfs can not be mounted as rootfs at all.
> 
> Interesting.  Presumably it got reclaimed synchronously somehow.
> 
> > Is the bug in generic_setlease() or in the ramfs/tmpfs inode create function?
> > 
> > Of course, simply remove the test '((atomic_read(&dentry->d_count) >
> > 1)' can workaround this issue.
> 
> I guess we should make the generic_setlease() heuristic smarter.
> 
> Of course the _reason_ for that heuristic is uncommented and lost in time. 
> And one wonders what locking prevents it from being totally racy, and if
> "none", what happens when the race hits.  Sigh.

Yes, I think the race is:

	1. generic_setlease(., F_WRLCK, .) checks d_count and i_count,
	   both are 1.

	2. a read open comes in, calls break_lease which finds no lease
	   and continues happily on.

	3. generic_setlease() sets the write lease.

The most likely consequences are that a local reader gets out-of-date
data for a file that a Samba client has modified.

I suppose that re-checking the d_count and i_count after step 3 might
close the race.

> I suppose a stupid fix would be to set (and later clear) a new flag in
> dentry.d_flags which means
> 
>   this dentry is pinned by a ram-backed device, so d_count==2 means
>   "unused""
> 
> But it would be better to work out exactly what generic_setlease() is
> trying to do there, and do it in a better way.

Yes.  What it's supposed to do is provide exclusion between opens and
write leases.

We already have a mechanism that provides exclusion between write opens
and exec, using the i_writecount, so we're using that for read leases.
I suppose it'd be possible to do something similar for write leases;
would there be smp scalability problems associated with counting all the
read opens of a given inode?  Other problems?

Even with this problem solved, I'm not convinced write leases are very
useful as implemented.  Their only current user is Samba, which uses
them to grant exclusive access to given files to allow clients to cache
writes.

Samba knows when to revoke that exclusive access because the lease
subsystem signals it on a read open of the file.  It doesn't revoke on
stat, however.  This causes problem.  E.g., say Samba takes out a lease
and tells some client it can now cache its writes indefinitely.
Meanwhile a local application (say, make) is polling that file for
changes using stat.  They never see those changes.

The NFSv2/v3 server for some reason has its own one-off hack that
reports the ctime as now for on any write-leased file, which leads
people to complain about spurious rebuilds:

	http://bugzilla.kernel.org/show_bug.cgi?id=9454

The one thing I suspect is *not* a really serious problem here is the
reported LTP failure, since probably the only user of this is Samba,
which probably doesn't do a lot of tmpfs exports, and in any case it can
probably soldier on (if with degraded performance--how badly I don't
know) without getting the write lease it wants.

--b.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 21:42   ` J. Bruce Fields
@ 2008-04-29 22:01     ` Mike Frysinger
  2008-04-29 22:11       ` J. Bruce Fields
  2008-04-29 23:21     ` david m. richter
  2008-05-01  6:24     ` Al Viro
  2 siblings, 1 reply; 14+ messages in thread
From: Mike Frysinger @ 2008-04-29 22:01 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy, viro,
	uclinux-dist-devel, richterd

On Tue, Apr 29, 2008 at 5:42 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>  The one thing I suspect is *not* a really serious problem here is the
>  reported LTP failure, since probably the only user of this is Samba,
>  which probably doesn't do a lot of tmpfs exports, and in any case it can
>  probably soldier on (if with degraded performance--how badly I don't
>  know) without getting the write lease it wants.

i'm not sure i follow.  the reported problem is that file locking does
not work on tmpfs/ramfs storage.  a not terribly uncommon scenario is
to use tmpfs on /tmp (or similar location) and have file locking not
work at all.  so programs that use file locking or scripts that
leverage the flock utility from the util-linux package break.
-mike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 22:01     ` Mike Frysinger
@ 2008-04-29 22:11       ` J. Bruce Fields
  2008-04-29 22:15         ` Mike Frysinger
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2008-04-29 22:11 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy, viro,
	uclinux-dist-devel, richterd

On Tue, Apr 29, 2008 at 06:01:47PM -0400, Mike Frysinger wrote:
> On Tue, Apr 29, 2008 at 5:42 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >  The one thing I suspect is *not* a really serious problem here is the
> >  reported LTP failure, since probably the only user of this is Samba,
> >  which probably doesn't do a lot of tmpfs exports, and in any case it can
> >  probably soldier on (if with degraded performance--how badly I don't
> >  know) without getting the write lease it wants.
> 
> i'm not sure i follow.  the reported problem is that file locking does
> not work on tmpfs/ramfs storage.  a not terribly uncommon scenario is
> to use tmpfs on /tmp (or similar location) and have file locking not
> work at all.  so programs that use file locking or scripts that
> leverage the flock utility from the util-linux package break.

There are three different mechanisms that might be called "file locks":

	- fcntl() locks, aka "posix locks", "byte-range locks":
	  documented in the "Advisory locking" section of fcntl(2).
	- flock() locks: documented in flock(2) This is what the shell
	  utility from util-linux uses.
	- leases, documented in the "Leases" section of fcntl(2).

For the former two, I agree with you, applications actually depend on
them.

The bug report, however, is for leases, which are much less widely used.
(The only users I know of are Samba and, to a lesser extent, NFSv4
(which doesn't currently use write leases due to all these problems).)

--b.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 22:11       ` J. Bruce Fields
@ 2008-04-29 22:15         ` Mike Frysinger
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Frysinger @ 2008-04-29 22:15 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy, viro,
	uclinux-dist-devel, richterd

On Tue, Apr 29, 2008 at 6:11 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Tue, Apr 29, 2008 at 06:01:47PM -0400, Mike Frysinger wrote:
>  > On Tue, Apr 29, 2008 at 5:42 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>  > >  The one thing I suspect is *not* a really serious problem here is the
>  > >  reported LTP failure, since probably the only user of this is Samba,
>  > >  which probably doesn't do a lot of tmpfs exports, and in any case it can
>  > >  probably soldier on (if with degraded performance--how badly I don't
>  > >  know) without getting the write lease it wants.
>  >
>  > i'm not sure i follow.  the reported problem is that file locking does
>  > not work on tmpfs/ramfs storage.  a not terribly uncommon scenario is
>  > to use tmpfs on /tmp (or similar location) and have file locking not
>  > work at all.  so programs that use file locking or scripts that
>  > leverage the flock utility from the util-linux package break.
>
>  There are three different mechanisms that might be called "file locks":
>
>         - fcntl() locks, aka "posix locks", "byte-range locks":
>           documented in the "Advisory locking" section of fcntl(2).
>         - flock() locks: documented in flock(2) This is what the shell
>           utility from util-linux uses.
>         - leases, documented in the "Leases" section of fcntl(2).
>
>  For the former two, I agree with you, applications actually depend on
>  them.
>
>  The bug report, however, is for leases, which are much less widely used.
>  (The only users I know of are Samba and, to a lesser extent, NFSv4
>  (which doesn't currently use write leases due to all these problems).)

ah sorry, you're right of course ... while i investigated the issue
originally, i clearly forget the details ;)
-mike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 20:54 ` Andrew Morton
  2008-04-29 21:42   ` J. Bruce Fields
@ 2008-04-29 22:21   ` Matthew Wilcox
  2008-05-01  6:33     ` Al Viro
  1 sibling, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2008-04-29 22:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Bryan Wu, linux-kernel, torvalds, willy, viro, uclinux-dist-devel,
	J. Bruce Fields

On Tue, Apr 29, 2008 at 01:54:54PM -0700, Andrew Morton wrote:
> I guess we should make the generic_setlease() heuristic smarter.
> 
> Of course the _reason_ for that heuristic is uncommented and lost in time. 
> And one wonders what locking prevents it from being totally racy, and if
> "none", what happens when the race hits.  Sigh.

It's hardly "lost in time" when you can ask the original author.

If there are multiple processes with this file open, you can't place a
lease on it.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 21:42   ` J. Bruce Fields
  2008-04-29 22:01     ` Mike Frysinger
@ 2008-04-29 23:21     ` david m. richter
  2008-04-30 17:50       ` J. Bruce Fields
  2008-05-01  6:24     ` Al Viro
  2 siblings, 1 reply; 14+ messages in thread
From: david m. richter @ 2008-04-29 23:21 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy, viro,
	uclinux-dist-devel

On Tue, 29 Apr 2008, J. Bruce Fields wrote:

> On Tue, Apr 29, 2008 at 01:54:54PM -0700, Andrew Morton wrote:
> > On Tue, 29 Apr 2008 11:42:48 +0800
> > "Bryan Wu" <cooloney@kernel.org> wrote:
> > 
> > > Hi folk,
> > > 
> > > This days I am digging into this LTP bug reported on our Blackfin test
> > > machine, but I think it is general for other system.
> > > https://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3743
> > > 
> > > And I also found Kumar Gala reported this similar bug before.
> > > http://lkml.org/lkml/2007/11/14/388
> > > 
> > > 1, when opening and creating a new on ramfs/tmpfs, the dentry->d_count
> > > will be added one as below:
> > > --
> > > ramfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> > > {
> > > |_______struct inode * inode = ramfs_get_inode(dir->i_sb, mode, dev);
> > > |_______int error = -ENOSPC;
> > > 
> > > |_______if (inode) {
> > > |_______|_______if (dir->i_mode & S_ISGID) {
> > > |_______|_______|_______inode->i_gid = dir->i_gid;
> > > |_______|_______|_______if (S_ISDIR(mode))
> > > |_______|_______|_______|_______inode->i_mode |= S_ISGID;
> > > |_______|_______}
> > > |_______|_______d_instantiate(dentry, inode);
> > > |_______|_______dget(dentry);|__/* Extra count - pin the dentry in core */
> > > |_______|_______error = 0;
> > > |_______|_______dir->i_mtime = dir->i_ctime = CURRENT_TIME;
> > > |_______}
> > > |_______return error;
> > > }
> > > --
> > > The dget(dentry) call introduces an extra count, why?
> > > it is the same in tmpfs.
> > 
> > Because those dentries have no backing store.  Their sole existance is in
> > the dentry cache which is normally reclaimable.  But we can't reclaim these
> > dentries because there is nowhere from where they can be reestablished.
> > 
> > > 2, when calling  fcntl(fd, F_SETLEASE,F_WRLCK), it will return -EAGAIN
> > > --
> > > |_______if ((arg == F_WRLCK)
> > > |_______    && ((atomic_read(&dentry->d_count) > 1)
> > > |_______|_______|| (atomic_read(&inode->i_count) > 1)))
> > > |_______|_______goto out;
> > > --
> > 
> > Sucky heuristic.
> 
> Yes.
> 
> > 
> > > because the dentry->d_count will be 2 not 1. I tested ext2 on Blackfin, it is 1.
> > > 
> > > 3, so I guess maybe the dget(dentry) of ramfs_mknod is useless. But
> > > after remove this dget(),
> > > the ramfs can not be mounted as rootfs at all.
> > 
> > Interesting.  Presumably it got reclaimed synchronously somehow.
> > 
> > > Is the bug in generic_setlease() or in the ramfs/tmpfs inode create function?
> > > 
> > > Of course, simply remove the test '((atomic_read(&dentry->d_count) >
> > > 1)' can workaround this issue.
> > 
> > I guess we should make the generic_setlease() heuristic smarter.
> > 
> > Of course the _reason_ for that heuristic is uncommented and lost in time. 
> > And one wonders what locking prevents it from being totally racy, and if
> > "none", what happens when the race hits.  Sigh.

	i'm not sure which particular kernel version we're talking about, 
but i think the intent was to rely on the BKL.  i noticed a couple cases 
where this didn't actually hold -- e.g., bruce has a patch queued to move 
the kmalloc in generic_setlease() so that it precedes the 
d_count/i_writecount checks and covers the race he mentions below.  an 
earlier patch closed a similar thing with get_write_access() when handling 
a truncate, etc.

	fyi (well, not you, bruce): relatedly, i have a set of patches to 
introduce per-inode lease enabling/disabling (so we can fully implement 
NFSv4.0 file delegations and v4.1 directory delegations) which expand the 
cases where leases are broken (e.g. unlink) and which make it somewhat 
more explicit when it's safe to lease/break.  perhaps the next merge 
window for the first set of them.


> Yes, I think the race is:
> 
> 	1. generic_setlease(., F_WRLCK, .) checks d_count and i_count,
> 	   both are 1.
> 
> 	2. a read open comes in, calls break_lease which finds no lease
> 	   and continues happily on.
> 
> 	3. generic_setlease() sets the write lease.
> 
> The most likely consequences are that a local reader gets out-of-date
> data for a file that a Samba client has modified.
>
> I suppose that re-checking the d_count and i_count after step 3 might
> close the race.

	as things currently stand, i believe that race can only happen if 
the leaser is blocking on the kmalloc.  but yeah, the d_count check is 
pretty frail anyway ...


> > I suppose a stupid fix would be to set (and later clear) a new flag in
> > dentry.d_flags which means
> > 
> >   this dentry is pinned by a ram-backed device, so d_count==2 means
> >   "unused""
> > 
> > But it would be better to work out exactly what generic_setlease() is
> > trying to do there, and do it in a better way.
> 
> Yes.  What it's supposed to do is provide exclusion between opens and
> write leases.
> 
> We already have a mechanism that provides exclusion between write opens
> and exec, using the i_writecount, so we're using that for read leases.
> I suppose it'd be possible to do something similar for write leases;
> would there be smp scalability problems associated with counting all the
> read opens of a given inode?  Other problems?
> 
> Even with this problem solved, I'm not convinced write leases are very
> useful as implemented.  Their only current user is Samba, which uses
> them to grant exclusive access to given files to allow clients to cache
> writes.
> 
> Samba knows when to revoke that exclusive access because the lease
> subsystem signals it on a read open of the file.  It doesn't revoke on
> stat, however.  This causes problem.  E.g., say Samba takes out a lease
> and tells some client it can now cache its writes indefinitely.
> Meanwhile a local application (say, make) is polling that file for
> changes using stat.  They never see those changes.
> 
> The NFSv2/v3 server for some reason has its own one-off hack that
> reports the ctime as now for on any write-leased file, which leads
> people to complain about spurious rebuilds:
> 
> 	http://bugzilla.kernel.org/show_bug.cgi?id=9454
> 
> The one thing I suspect is *not* a really serious problem here is the
> reported LTP failure, since probably the only user of this is Samba,
> which probably doesn't do a lot of tmpfs exports, and in any case it can
> probably soldier on (if with degraded performance--how badly I don't
> know) without getting the write lease it wants.
> 
> --b.
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 23:21     ` david m. richter
@ 2008-04-30 17:50       ` J. Bruce Fields
  2008-04-30 18:14         ` david m. richter
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2008-04-30 17:50 UTC (permalink / raw)
  To: david m. richter
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy, viro,
	uclinux-dist-devel

On Tue, Apr 29, 2008 at 07:21:02PM -0400, david m. richter wrote:
> On Tue, 29 Apr 2008, J. Bruce Fields wrote:
> 
> > On Tue, Apr 29, 2008 at 01:54:54PM -0700, Andrew Morton wrote:
> > > On Tue, 29 Apr 2008 11:42:48 +0800
> > > "Bryan Wu" <cooloney@kernel.org> wrote:
> > > 
> > > > Hi folk,
> > > > 
> > > > This days I am digging into this LTP bug reported on our Blackfin test
> > > > machine, but I think it is general for other system.
> > > > https://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3743
> > > > 
> > > > And I also found Kumar Gala reported this similar bug before.
> > > > http://lkml.org/lkml/2007/11/14/388
> > > > 
> > > > 1, when opening and creating a new on ramfs/tmpfs, the dentry->d_count
> > > > will be added one as below:
> > > > --
> > > > ramfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> > > > {
> > > > |_______struct inode * inode = ramfs_get_inode(dir->i_sb, mode, dev);
> > > > |_______int error = -ENOSPC;
> > > > 
> > > > |_______if (inode) {
> > > > |_______|_______if (dir->i_mode & S_ISGID) {
> > > > |_______|_______|_______inode->i_gid = dir->i_gid;
> > > > |_______|_______|_______if (S_ISDIR(mode))
> > > > |_______|_______|_______|_______inode->i_mode |= S_ISGID;
> > > > |_______|_______}
> > > > |_______|_______d_instantiate(dentry, inode);
> > > > |_______|_______dget(dentry);|__/* Extra count - pin the dentry in core */
> > > > |_______|_______error = 0;
> > > > |_______|_______dir->i_mtime = dir->i_ctime = CURRENT_TIME;
> > > > |_______}
> > > > |_______return error;
> > > > }
> > > > --
> > > > The dget(dentry) call introduces an extra count, why?
> > > > it is the same in tmpfs.
> > > 
> > > Because those dentries have no backing store.  Their sole existance is in
> > > the dentry cache which is normally reclaimable.  But we can't reclaim these
> > > dentries because there is nowhere from where they can be reestablished.
> > > 
> > > > 2, when calling  fcntl(fd, F_SETLEASE,F_WRLCK), it will return -EAGAIN
> > > > --
> > > > |_______if ((arg == F_WRLCK)
> > > > |_______    && ((atomic_read(&dentry->d_count) > 1)
> > > > |_______|_______|| (atomic_read(&inode->i_count) > 1)))
> > > > |_______|_______goto out;
> > > > --
> > > 
> > > Sucky heuristic.
> > 
> > Yes.
> > 
> > > 
> > > > because the dentry->d_count will be 2 not 1. I tested ext2 on Blackfin, it is 1.
> > > > 
> > > > 3, so I guess maybe the dget(dentry) of ramfs_mknod is useless. But
> > > > after remove this dget(),
> > > > the ramfs can not be mounted as rootfs at all.
> > > 
> > > Interesting.  Presumably it got reclaimed synchronously somehow.
> > > 
> > > > Is the bug in generic_setlease() or in the ramfs/tmpfs inode create function?
> > > > 
> > > > Of course, simply remove the test '((atomic_read(&dentry->d_count) >
> > > > 1)' can workaround this issue.
> > > 
> > > I guess we should make the generic_setlease() heuristic smarter.
> > > 
> > > Of course the _reason_ for that heuristic is uncommented and lost in time. 
> > > And one wonders what locking prevents it from being totally racy, and if
> > > "none", what happens when the race hits.  Sigh.
> 
> 	i'm not sure which particular kernel version we're talking about, 
> but i think the intent was to rely on the BKL.  i noticed a couple cases 
> where this didn't actually hold -- e.g., bruce has a patch queued to move 
> the kmalloc in generic_setlease() so that it precedes the 
> d_count/i_writecount checks and covers the race he mentions below.  an 
> earlier patch closed a similar thing with get_write_access() when handling 
> a truncate, etc.
> 
> 	fyi (well, not you, bruce): relatedly, i have a set of patches to 
> introduce per-inode lease enabling/disabling (so we can fully implement 
> NFSv4.0 file delegations and v4.1 directory delegations) which expand the 
> cases where leases are broken (e.g. unlink) and which make it somewhat 
> more explicit when it's safe to lease/break.  perhaps the next merge 
> window for the first set of them.
> 
> 
> > Yes, I think the race is:
> > 
> > 	1. generic_setlease(., F_WRLCK, .) checks d_count and i_count,
> > 	   both are 1.
> > 
> > 	2. a read open comes in, calls break_lease which finds no lease
> > 	   and continues happily on.
> > 
> > 	3. generic_setlease() sets the write lease.
> > 
> > The most likely consequences are that a local reader gets out-of-date
> > data for a file that a Samba client has modified.
> >
> > I suppose that re-checking the d_count and i_count after step 3 might
> > close the race.
> 
> 	as things currently stand, i believe that race can only happen if 
> the leaser is blocking on the kmalloc.

Neither break_lease() (the shortcut inlined function, not the full
__break_lease()) nor any of the open code that I can see is under the
BKL, so unless I'm missing something, that code is racy.

--b.

> but yeah, the d_count check is 
> pretty frail anyway ...
> 
> 
> > > I suppose a stupid fix would be to set (and later clear) a new flag in
> > > dentry.d_flags which means
> > > 
> > >   this dentry is pinned by a ram-backed device, so d_count==2 means
> > >   "unused""
> > > 
> > > But it would be better to work out exactly what generic_setlease() is
> > > trying to do there, and do it in a better way.
> > 
> > Yes.  What it's supposed to do is provide exclusion between opens and
> > write leases.
> > 
> > We already have a mechanism that provides exclusion between write opens
> > and exec, using the i_writecount, so we're using that for read leases.
> > I suppose it'd be possible to do something similar for write leases;
> > would there be smp scalability problems associated with counting all the
> > read opens of a given inode?  Other problems?
> > 
> > Even with this problem solved, I'm not convinced write leases are very
> > useful as implemented.  Their only current user is Samba, which uses
> > them to grant exclusive access to given files to allow clients to cache
> > writes.
> > 
> > Samba knows when to revoke that exclusive access because the lease
> > subsystem signals it on a read open of the file.  It doesn't revoke on
> > stat, however.  This causes problem.  E.g., say Samba takes out a lease
> > and tells some client it can now cache its writes indefinitely.
> > Meanwhile a local application (say, make) is polling that file for
> > changes using stat.  They never see those changes.
> > 
> > The NFSv2/v3 server for some reason has its own one-off hack that
> > reports the ctime as now for on any write-leased file, which leads
> > people to complain about spurious rebuilds:
> > 
> > 	http://bugzilla.kernel.org/show_bug.cgi?id=9454
> > 
> > The one thing I suspect is *not* a really serious problem here is the
> > reported LTP failure, since probably the only user of this is Samba,
> > which probably doesn't do a lot of tmpfs exports, and in any case it can
> > probably soldier on (if with degraded performance--how badly I don't
> > know) without getting the write lease it wants.
> > 
> > --b.
> > 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-30 17:50       ` J. Bruce Fields
@ 2008-04-30 18:14         ` david m. richter
  0 siblings, 0 replies; 14+ messages in thread
From: david m. richter @ 2008-04-30 18:14 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy, viro,
	uclinux-dist-devel

On Wed, 30 Apr 2008, J. Bruce Fields wrote:

> On Tue, Apr 29, 2008 at 07:21:02PM -0400, david m. richter wrote:
> > On Tue, 29 Apr 2008, J. Bruce Fields wrote:
> > 
> > > On Tue, Apr 29, 2008 at 01:54:54PM -0700, Andrew Morton wrote:
> > > > On Tue, 29 Apr 2008 11:42:48 +0800
> > > > "Bryan Wu" <cooloney@kernel.org> wrote:
> > > > 
> > > > > Hi folk,
> > > > > 
> > > > > This days I am digging into this LTP bug reported on our Blackfin test
> > > > > machine, but I think it is general for other system.
> > > > > https://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=3743
> > > > > 
> > > > > And I also found Kumar Gala reported this similar bug before.
> > > > > http://lkml.org/lkml/2007/11/14/388
> > > > > 
> > > > > 1, when opening and creating a new on ramfs/tmpfs, the dentry->d_count
> > > > > will be added one as below:
> > > > > --
> > > > > ramfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
> > > > > {
> > > > > |_______struct inode * inode = ramfs_get_inode(dir->i_sb, mode, dev);
> > > > > |_______int error = -ENOSPC;
> > > > > 
> > > > > |_______if (inode) {
> > > > > |_______|_______if (dir->i_mode & S_ISGID) {
> > > > > |_______|_______|_______inode->i_gid = dir->i_gid;
> > > > > |_______|_______|_______if (S_ISDIR(mode))
> > > > > |_______|_______|_______|_______inode->i_mode |= S_ISGID;
> > > > > |_______|_______}
> > > > > |_______|_______d_instantiate(dentry, inode);
> > > > > |_______|_______dget(dentry);|__/* Extra count - pin the dentry in core */
> > > > > |_______|_______error = 0;
> > > > > |_______|_______dir->i_mtime = dir->i_ctime = CURRENT_TIME;
> > > > > |_______}
> > > > > |_______return error;
> > > > > }
> > > > > --
> > > > > The dget(dentry) call introduces an extra count, why?
> > > > > it is the same in tmpfs.
> > > > 
> > > > Because those dentries have no backing store.  Their sole existance is in
> > > > the dentry cache which is normally reclaimable.  But we can't reclaim these
> > > > dentries because there is nowhere from where they can be reestablished.
> > > > 
> > > > > 2, when calling  fcntl(fd, F_SETLEASE,F_WRLCK), it will return -EAGAIN
> > > > > --
> > > > > |_______if ((arg == F_WRLCK)
> > > > > |_______    && ((atomic_read(&dentry->d_count) > 1)
> > > > > |_______|_______|| (atomic_read(&inode->i_count) > 1)))
> > > > > |_______|_______goto out;
> > > > > --
> > > > 
> > > > Sucky heuristic.
> > > 
> > > Yes.
> > > 
> > > > 
> > > > > because the dentry->d_count will be 2 not 1. I tested ext2 on Blackfin, it is 1.
> > > > > 
> > > > > 3, so I guess maybe the dget(dentry) of ramfs_mknod is useless. But
> > > > > after remove this dget(),
> > > > > the ramfs can not be mounted as rootfs at all.
> > > > 
> > > > Interesting.  Presumably it got reclaimed synchronously somehow.
> > > > 
> > > > > Is the bug in generic_setlease() or in the ramfs/tmpfs inode create function?
> > > > > 
> > > > > Of course, simply remove the test '((atomic_read(&dentry->d_count) >
> > > > > 1)' can workaround this issue.
> > > > 
> > > > I guess we should make the generic_setlease() heuristic smarter.
> > > > 
> > > > Of course the _reason_ for that heuristic is uncommented and lost in time. 
> > > > And one wonders what locking prevents it from being totally racy, and if
> > > > "none", what happens when the race hits.  Sigh.
> > 
> > 	i'm not sure which particular kernel version we're talking about, 
> > but i think the intent was to rely on the BKL.  i noticed a couple cases 
> > where this didn't actually hold -- e.g., bruce has a patch queued to move 
> > the kmalloc in generic_setlease() so that it precedes the 
> > d_count/i_writecount checks and covers the race he mentions below.  an 
> > earlier patch closed a similar thing with get_write_access() when handling 
> > a truncate, etc.
> > 
> > 	fyi (well, not you, bruce): relatedly, i have a set of patches to 
> > introduce per-inode lease enabling/disabling (so we can fully implement 
> > NFSv4.0 file delegations and v4.1 directory delegations) which expand the 
> > cases where leases are broken (e.g. unlink) and which make it somewhat 
> > more explicit when it's safe to lease/break.  perhaps the next merge 
> > window for the first set of them.
> > 
> > 
> > > Yes, I think the race is:
> > > 
> > > 	1. generic_setlease(., F_WRLCK, .) checks d_count and i_count,
> > > 	   both are 1.
> > > 
> > > 	2. a read open comes in, calls break_lease which finds no lease
> > > 	   and continues happily on.
> > > 
> > > 	3. generic_setlease() sets the write lease.
> > > 
> > > The most likely consequences are that a local reader gets out-of-date
> > > data for a file that a Samba client has modified.
> > >
> > > I suppose that re-checking the d_count and i_count after step 3 might
> > > close the race.
> > 
> > 	as things currently stand, i believe that race can only happen if 
> > the leaser is blocking on the kmalloc.
> 
> Neither break_lease() (the shortcut inlined function, not the full
> __break_lease()) nor any of the open code that I can see is under the
> BKL, so unless I'm missing something, that code is racy.
> 
> --b.

	sorry, obviously you're right; if ->i_flock is null, break_lease() 
is doing the bad thing.  i've been through so many reworkings of that code 
that i'd forgotten which race was where; i came into this thread midway 
and should've reacquainted myself with the actual code in question.  my 
bad :-/

> > but yeah, the d_count check is 
> > pretty frail anyway ...
> > 
> > 
> > > > I suppose a stupid fix would be to set (and later clear) a new flag in
> > > > dentry.d_flags which means
> > > > 
> > > >   this dentry is pinned by a ram-backed device, so d_count==2 means
> > > >   "unused""
> > > > 
> > > > But it would be better to work out exactly what generic_setlease() is
> > > > trying to do there, and do it in a better way.
> > > 
> > > Yes.  What it's supposed to do is provide exclusion between opens and
> > > write leases.
> > > 
> > > We already have a mechanism that provides exclusion between write opens
> > > and exec, using the i_writecount, so we're using that for read leases.
> > > I suppose it'd be possible to do something similar for write leases;
> > > would there be smp scalability problems associated with counting all the
> > > read opens of a given inode?  Other problems?

	interesting; is this feasible?  i'd like to hear what folks think.

> > > 
> > > Even with this problem solved, I'm not convinced write leases are very
> > > useful as implemented.  Their only current user is Samba, which uses
> > > them to grant exclusive access to given files to allow clients to cache
> > > writes.
> > > 
> > > Samba knows when to revoke that exclusive access because the lease
> > > subsystem signals it on a read open of the file.  It doesn't revoke on
> > > stat, however.  This causes problem.  E.g., say Samba takes out a lease
> > > and tells some client it can now cache its writes indefinitely.
> > > Meanwhile a local application (say, make) is polling that file for
> > > changes using stat.  They never see those changes.
> > > 
> > > The NFSv2/v3 server for some reason has its own one-off hack that
> > > reports the ctime as now for on any write-leased file, which leads
> > > people to complain about spurious rebuilds:
> > > 
> > > 	http://bugzilla.kernel.org/show_bug.cgi?id=9454
> > > 
> > > The one thing I suspect is *not* a really serious problem here is the
> > > reported LTP failure, since probably the only user of this is Samba,
> > > which probably doesn't do a lot of tmpfs exports, and in any case it can
> > > probably soldier on (if with degraded performance--how badly I don't
> > > know) without getting the write lease it wants.
> > > 
> > > --b.
> > > 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 21:42   ` J. Bruce Fields
  2008-04-29 22:01     ` Mike Frysinger
  2008-04-29 23:21     ` david m. richter
@ 2008-05-01  6:24     ` Al Viro
  2008-05-02 22:22       ` J. Bruce Fields
  2 siblings, 1 reply; 14+ messages in thread
From: Al Viro @ 2008-05-01  6:24 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy,
	uclinux-dist-devel, richterd

On Tue, Apr 29, 2008 at 05:42:31PM -0400, J. Bruce Fields wrote:
> The most likely consequences are that a local reader gets out-of-date
> data for a file that a Samba client has modified.
> 
> I suppose that re-checking the d_count and i_count after step 3 might
> close the race.
 
The hell it might.  Leases are broken, plain and simple.  Not to mention
anything else, a couple of threads with shared descriptor table will
bypass these checks happily.

FWIW, that's far from the worst problem in fs/locks.c, and not even the
worst one with leases.

That, BTW, is a fine demonstration of the reasons why application-specific
kernel warts(tm) are bad.  Lease support is samba-only turd; so's dnotify,
with its lovely problems.  And interfaces like that *suck*; they are
developed with one application in mind and that leads to "we know how it
will be used" mentality.  With obvious implications for quality of review
they get from their developers...

Al, currently crawling through struct file_lock review and extremely annoyed
by the amount of turds being found...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-04-29 22:21   ` Matthew Wilcox
@ 2008-05-01  6:33     ` Al Viro
  2008-05-02 22:26       ` J. Bruce Fields
  0 siblings, 1 reply; 14+ messages in thread
From: Al Viro @ 2008-05-01  6:33 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy,
	uclinux-dist-devel, J. Bruce Fields

On Tue, Apr 29, 2008 at 04:21:42PM -0600, Matthew Wilcox wrote:
> On Tue, Apr 29, 2008 at 01:54:54PM -0700, Andrew Morton wrote:
> > I guess we should make the generic_setlease() heuristic smarter.
> > 
> > Of course the _reason_ for that heuristic is uncommented and lost in time. 
> > And one wonders what locking prevents it from being totally racy, and if
> > "none", what happens when the race hits.  Sigh.
> 
> It's hardly "lost in time" when you can ask the original author.
> 
> If there are multiple processes with this file open, you can't place a
> lease on it.

... except that it has nofsckingthing in common with the checks in
question.  Number of processes having a file open has has nothing to
do dentry or inode refcounts; indeed, if you have opened file once
it'd have only one struct file.  Moreover, e.g. stat(2) on its name
will bump dentry refcount just fine.  Moreover, if you have two threads
with common descriptor table, not even *file* refcount will help you.

BTW, ->fl_owner in those suckers is fairly useless - open files, take
leases, fork, have parent exit.  Voila - you've got a bunch of file_lock
with ->fl_owner pointing to freed files_struct.  Fortunately it's never
going to be dereferenced, but results of comparisons are unreliable as
hell.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-05-01  6:24     ` Al Viro
@ 2008-05-02 22:22       ` J. Bruce Fields
  0 siblings, 0 replies; 14+ messages in thread
From: J. Bruce Fields @ 2008-05-02 22:22 UTC (permalink / raw)
  To: Al Viro
  Cc: Andrew Morton, Bryan Wu, linux-kernel, torvalds, willy,
	uclinux-dist-devel, richterd

On Thu, May 01, 2008 at 07:24:32AM +0100, Al Viro wrote:
> On Tue, Apr 29, 2008 at 05:42:31PM -0400, J. Bruce Fields wrote:
> > The most likely consequences are that a local reader gets out-of-date
> > data for a file that a Samba client has modified.
> > 
> > I suppose that re-checking the d_count and i_count after step 3 might
> > close the race.
>  
> The hell it might.

Yeah, looking back at the code, I suppose by the time we've added the
new lease to the inode's lock list, we've already broken conflicting
leases.  OK.

> Leases are broken, plain and simple.  Not to mention
> anything else, a couple of threads with shared descriptor table will
> bypass these checks happily.

I lost you there.

> 
> FWIW, that's far from the worst problem in fs/locks.c, and not even the
> worst one with leases.
> 
> That, BTW, is a fine demonstration of the reasons why application-specific
> kernel warts(tm) are bad.  Lease support is samba-only turd; so's dnotify,
> with its lovely problems.  And interfaces like that *suck*; they are
> developed with one application in mind and that leads to "we know how it
> will be used" mentality.  With obvious implications for quality of review
> they get from their developers...

I honestly don't understand how exactly Samba uses leases; it'd be
extremely useful to have a concise list of requirements from them.  I
know that they don't really meet the nfsv4 server's requirements (any
bugs aside).

> Al, currently crawling through struct file_lock review and extremely annoyed
> by the amount of turds being found...

Thanks, it's been long in need of more attention--details welcomed.

--b.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs
  2008-05-01  6:33     ` Al Viro
@ 2008-05-02 22:26       ` J. Bruce Fields
  0 siblings, 0 replies; 14+ messages in thread
From: J. Bruce Fields @ 2008-05-02 22:26 UTC (permalink / raw)
  To: Al Viro
  Cc: Matthew Wilcox, Andrew Morton, Bryan Wu, linux-kernel, torvalds,
	willy, uclinux-dist-devel, richterd

On Thu, May 01, 2008 at 07:33:39AM +0100, Al Viro wrote:
> On Tue, Apr 29, 2008 at 04:21:42PM -0600, Matthew Wilcox wrote:
> > On Tue, Apr 29, 2008 at 01:54:54PM -0700, Andrew Morton wrote:
> > > I guess we should make the generic_setlease() heuristic smarter.
> > > 
> > > Of course the _reason_ for that heuristic is uncommented and lost in time. 
> > > And one wonders what locking prevents it from being totally racy, and if
> > > "none", what happens when the race hits.  Sigh.
> > 
> > It's hardly "lost in time" when you can ask the original author.
> > 
> > If there are multiple processes with this file open, you can't place a
> > lease on it.
> 
> ... except that it has nofsckingthing in common with the checks in
> question.  Number of processes having a file open has has nothing to
> do dentry or inode refcounts; indeed, if you have opened file once
> it'd have only one struct file.  Moreover, e.g. stat(2) on its name
> will bump dentry refcount just fine.  Moreover, if you have two threads
> with common descriptor table, not even *file* refcount will help you.

Your point about unclear requirements is taken, but I doubt anyone needs
exclusion between leases and threads that share the file descriptor on
which the lease was taken.

--b.

> BTW, ->fl_owner in those suckers is fairly useless - open files, take
> leases, fork, have parent exit.  Voila - you've got a bunch of file_lock
> with ->fl_owner pointing to freed files_struct.  Fortunately it's never
> going to be dereferenced, but results of comparisons are unreliable as
> hell.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2008-05-02 22:26 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-29  3:42 [LTP/VFS] fcntl SETLEASE fails on ramfs/tmpfs Bryan Wu
2008-04-29 20:54 ` Andrew Morton
2008-04-29 21:42   ` J. Bruce Fields
2008-04-29 22:01     ` Mike Frysinger
2008-04-29 22:11       ` J. Bruce Fields
2008-04-29 22:15         ` Mike Frysinger
2008-04-29 23:21     ` david m. richter
2008-04-30 17:50       ` J. Bruce Fields
2008-04-30 18:14         ` david m. richter
2008-05-01  6:24     ` Al Viro
2008-05-02 22:22       ` J. Bruce Fields
2008-04-29 22:21   ` Matthew Wilcox
2008-05-01  6:33     ` Al Viro
2008-05-02 22:26       ` J. Bruce Fields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox