* [git pull] vfs and fs fixes
@ 2012-04-17 5:25 Al Viro
2012-04-17 15:01 ` Linus Torvalds
2012-04-19 3:23 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 31+ messages in thread
From: Al Viro @ 2012-04-17 5:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, linux-fsdevel
A bunch of endianness fixes plus a patch from bfields untangling
dependencies between vfs and nfsd trees; in principle, we could keep it
in nfsd tree (along with a bunch of followups that definitely belong there),
but Miklos' stuff in fs/namei.c steps fairly close to it and overlayfs
and unionfs series - even closer, so that would create serious PITA for
both, whichever tree it would sit in.
Speaking of endianness stuff, I'm rather tempted to slap
ccflags-y += -D__CHECK_ENDIAN__ in fs/Makefile, if not making it
default for the entire tree; nfsd regressions I've caught make one
hell of a pile and we'd obviously benefit from having that kind of
stuff caught earlier...
Please, pull from the usual place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus
Shortlog:
Al Viro (12):
nfsd: fix b0rken error value for setattr on read-only mount
nfsd: fix error values returned by nfsd4_lockt() when nfsd_open() fails
nfsd: fix endianness breakage in TEST_STATEID handling
nfsd: fix error value on allocation failure in nfsd4_decode_test_stateid()
nfsd: fix compose_entry_fh() failure exits
ext4: fix endianness breakage in ext4_split_extent_at()
btrfs: btrfs_root_readonly() broken on big-endian
ocfs2: ->l_next_free_req breakage on big-endian
ocfs: ->rl_used breakage on big-endian
ocfs2: ->rl_count endianness breakage
ocfs2: ->e_leaf_clusters endianness breakage
lockd: fix the endianness bug
J. Bruce Fields (1):
vfs: take i_mutex on renamed file
Diffstat:
Documentation/filesystems/directory-locking | 11 ++++++-----
fs/btrfs/ctree.h | 2 +-
fs/ext4/extents.c | 2 +-
fs/lockd/clnt4xdr.c | 2 +-
fs/lockd/clntxdr.c | 2 +-
fs/namei.c | 3 +++
fs/nfsd/nfs3xdr.c | 22 ++++++++--------------
fs/nfsd/nfs4proc.c | 7 ++++---
fs/nfsd/nfs4state.c | 23 +++++++++--------------
fs/nfsd/nfs4xdr.c | 4 ++--
fs/ocfs2/alloc.c | 2 +-
fs/ocfs2/refcounttree.c | 12 ++++++------
fs/ocfs2/suballoc.c | 4 ++--
include/linux/fs.h | 9 ++++++---
14 files changed, 51 insertions(+), 54 deletions(-)
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 5:25 Al Viro
@ 2012-04-17 15:01 ` Linus Torvalds
2012-04-17 16:22 ` J. Bruce Fields
2012-04-17 18:01 ` Al Viro
2012-04-19 3:23 ` Benjamin Herrenschmidt
1 sibling, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2012-04-17 15:01 UTC (permalink / raw)
To: Al Viro; +Cc: linux-kernel, linux-fsdevel
On Mon, Apr 16, 2012 at 10:25 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> A bunch of endianness fixes plus a patch from bfields untangling
> dependencies between vfs and nfsd trees; in principle, we could keep it
> in nfsd tree (along with a bunch of followups that definitely belong there),
> but Miklos' stuff in fs/namei.c steps fairly close to it and overlayfs
> and unionfs series - even closer, so that would create serious PITA for
> both, whichever tree it would sit in.
Why is that double mutex taking in vfs_rename_other() safe from ABBA?
We aren't guaranteed to hold the s_vfs_rename_mutex, since the parent
directories may be the same.
And yes, we hold the i_mutex on that shared parent, but the inodes may
exist (hardlinked) in another directory, so another rename could be
doing the i_mutex in the reverse order.
Maybe there is some reason why that double lock is safe, but I don't
see it, and I want it clearly documented. So I'm not pulling this.
Linus
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 15:01 ` Linus Torvalds
@ 2012-04-17 16:22 ` J. Bruce Fields
2012-04-17 16:33 ` Linus Torvalds
2012-04-17 18:01 ` Al Viro
1 sibling, 1 reply; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-17 16:22 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Al Viro, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 08:01:48AM -0700, Linus Torvalds wrote:
> On Mon, Apr 16, 2012 at 10:25 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > A bunch of endianness fixes plus a patch from bfields untangling
> > dependencies between vfs and nfsd trees; in principle, we could keep it
> > in nfsd tree (along with a bunch of followups that definitely belong there),
> > but Miklos' stuff in fs/namei.c steps fairly close to it and overlayfs
> > and unionfs series - even closer, so that would create serious PITA for
> > both, whichever tree it would sit in.
>
> Why is that double mutex taking in vfs_rename_other() safe from ABBA?
>
> We aren't guaranteed to hold the s_vfs_rename_mutex, since the parent
> directories may be the same.
>
> And yes, we hold the i_mutex on that shared parent, but the inodes may
> exist (hardlinked) in another directory, so another rename could be
> doing the i_mutex in the reverse order.
>
> Maybe there is some reason why that double lock is safe, but I don't
> see it, and I want it clearly documented. So I'm not pulling this.
Ugh, no, I think you're right:
rename A/a->A/b
rename B/b->B/b
where A/a and B/a are the same file, and A/b and B/b are the same file,
can result in the first rename holding the lock on A and a and waiting
on b, and the second holding the lock on B and b and waiting on a.
--b.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 16:22 ` J. Bruce Fields
@ 2012-04-17 16:33 ` Linus Torvalds
2012-04-17 17:06 ` J. Bruce Fields
2012-04-17 17:59 ` Al Viro
0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2012-04-17 16:33 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Al Viro, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 9:22 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
>
> Ugh, no, I think you're right:
>
> rename A/a->A/b
> rename B/b->B/b
>
> where A/a and B/a are the same file, and A/b and B/b are the same file,
> can result in the first rename holding the lock on A and a and waiting
> on b, and the second holding the lock on B and b and waiting on a.
In fact I don't think you need even that much. Just a simple
touch a
ln a b
mv a b
looks like it should deadlock on itself, no? source and dest inodes
will be the same, so the mutex_lock() will just deadlock without even
any ABBA race.
(I didn't really check - maybe there is some reason that doesn't happen).
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 16:33 ` Linus Torvalds
@ 2012-04-17 17:06 ` J. Bruce Fields
2012-04-17 17:59 ` Al Viro
1 sibling, 0 replies; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-17 17:06 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Al Viro, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 09:33:06AM -0700, Linus Torvalds wrote:
> On Tue, Apr 17, 2012 at 9:22 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > Ugh, no, I think you're right:
> >
> > rename A/a->A/b
> > rename B/b->B/b
> >
> > where A/a and B/a are the same file, and A/b and B/b are the same file,
> > can result in the first rename holding the lock on A and a and waiting
> > on b, and the second holding the lock on B and b and waiting on a.
>
> In fact I don't think you need even that much. Just a simple
>
> touch a
> ln a b
> mv a b
>
> looks like it should deadlock on itself, no? source and dest inodes
> will be the same, so the mutex_lock() will just deadlock without even
> any ABBA race.
>
> (I didn't really check - maybe there is some reason that doesn't happen).
Yeah, rename has that funny exception that makes the above a no-op, so I
think that's safe.
But the patch is still wrong; back to the drawing board.
Maybe a paper bag over my head will help my concentration....
--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 16:33 ` Linus Torvalds
2012-04-17 17:06 ` J. Bruce Fields
@ 2012-04-17 17:59 ` Al Viro
1 sibling, 0 replies; 31+ messages in thread
From: Al Viro @ 2012-04-17 17:59 UTC (permalink / raw)
To: Linus Torvalds; +Cc: J. Bruce Fields, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 09:33:06AM -0700, Linus Torvalds wrote:
> On Tue, Apr 17, 2012 at 9:22 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> In fact I don't think you need even that much. Just a simple
>
> touch a
> ln a b
> mv a b
>
> looks like it should deadlock on itself, no? source and dest inodes
> will be the same, so the mutex_lock() will just deadlock without even
> any ABBA race.
No, this one will bail out much earlier.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 15:01 ` Linus Torvalds
2012-04-17 16:22 ` J. Bruce Fields
@ 2012-04-17 18:01 ` Al Viro
2012-04-17 18:28 ` Al Viro
1 sibling, 1 reply; 31+ messages in thread
From: Al Viro @ 2012-04-17 18:01 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 08:01:48AM -0700, Linus Torvalds wrote:
> On Mon, Apr 16, 2012 at 10:25 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > ? ? ? ?A bunch of endianness fixes plus a patch from bfields untangling
> > dependencies between vfs and nfsd trees; in principle, we could keep it
> > in nfsd tree (along with a bunch of followups that definitely belong there),
> > but Miklos' stuff in fs/namei.c steps fairly close to it and overlayfs
> > and unionfs series - even closer, so that would create serious PITA for
> > both, whichever tree it would sit in.
>
> Why is that double mutex taking in vfs_rename_other() safe from ABBA?
>
> We aren't guaranteed to hold the s_vfs_rename_mutex, since the parent
> directories may be the same.
>
> And yes, we hold the i_mutex on that shared parent, but the inodes may
> exist (hardlinked) in another directory, so another rename could be
> doing the i_mutex in the reverse order.
>
> Maybe there is some reason why that double lock is safe, but I don't
> see it, and I want it clearly documented. So I'm not pulling this.
It isn't. Hell knows - I wonder if taking s_vfs_rename_mutex in all cases
in lock_rename() would be the right thing to do; it would remove the
problem, but might cost us too much contention...
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 18:01 ` Al Viro
@ 2012-04-17 18:28 ` Al Viro
2012-04-17 21:14 ` J. Bruce Fields
0 siblings, 1 reply; 31+ messages in thread
From: Al Viro @ 2012-04-17 18:28 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 07:01:29PM +0100, Al Viro wrote:
> It isn't. Hell knows - I wonder if taking s_vfs_rename_mutex in all cases
> in lock_rename() would be the right thing to do; it would remove the
> problem, but might cost us too much contention...
Actually, it's even worse. ext4_move_extents() locks a _pair_ of
->i_mutex (having checked that both are non-directories first). In
i_ino order. So the only plausible ordering would be
* directories by tree order (with s_vfs_rename_mutex held to
stabilize the tree topology)
* non-directories after all directories, ordered in some consistent
way. Which would have to be by inumber if we want to leave ext4 code
as-is.
Bruce: for now I'm dropping that patch. We _might_ take ext4
mutex_inode_double_lock() into fs/namei.c and have it used by
vfs_rename_other(), but I'm not convinced that this is the right
thing to do. Is there any other sane way to deal with nfsd problem?
i_mutex is already used for more things than I'd like...
Linus: I've removed that sucker from the queue; it should propagate to
git.kernel.org in a few. Could you pull the rest? Same place, the stats
are below:
Shortlog:
Al Viro (12):
nfsd: fix b0rken error value for setattr on read-only mount
nfsd: fix error values returned by nfsd4_lockt() when nfsd_open() fails
nfsd: fix endianness breakage in TEST_STATEID handling
nfsd: fix error value on allocation failure in nfsd4_decode_test_stateid()
nfsd: fix compose_entry_fh() failure exits
ext4: fix endianness breakage in ext4_split_extent_at()
btrfs: btrfs_root_readonly() broken on big-endian
ocfs2: ->l_next_free_req breakage on big-endian
ocfs: ->rl_used breakage on big-endian
ocfs2: ->rl_count endianness breakage
ocfs2: ->e_leaf_clusters endianness breakage
lockd: fix the endianness bug
Diffstat:
fs/btrfs/ctree.h | 2 +-
fs/ext4/extents.c | 2 +-
fs/lockd/clnt4xdr.c | 2 +-
fs/lockd/clntxdr.c | 2 +-
fs/nfsd/nfs3xdr.c | 22 ++++++++--------------
fs/nfsd/nfs4proc.c | 7 ++++---
fs/nfsd/nfs4state.c | 23 +++++++++--------------
fs/nfsd/nfs4xdr.c | 4 ++--
fs/ocfs2/alloc.c | 2 +-
fs/ocfs2/refcounttree.c | 12 ++++++------
fs/ocfs2/suballoc.c | 4 ++--
11 files changed, 36 insertions(+), 46 deletions(-)
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 18:28 ` Al Viro
@ 2012-04-17 21:14 ` J. Bruce Fields
2012-04-17 22:08 ` Linus Torvalds
0 siblings, 1 reply; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-17 21:14 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 07:28:26PM +0100, Al Viro wrote:
> On Tue, Apr 17, 2012 at 07:01:29PM +0100, Al Viro wrote:
>
> > It isn't. Hell knows - I wonder if taking s_vfs_rename_mutex in all cases
> > in lock_rename() would be the right thing to do; it would remove the
> > problem, but might cost us too much contention...
>
> Actually, it's even worse. ext4_move_extents() locks a _pair_ of
> ->i_mutex (having checked that both are non-directories first). In
> i_ino order. So the only plausible ordering would be
> * directories by tree order (with s_vfs_rename_mutex held to
> stabilize the tree topology)
> * non-directories after all directories, ordered in some consistent
> way. Which would have to be by inumber if we want to leave ext4 code
> as-is.
>
> Bruce: for now I'm dropping that patch. We _might_ take ext4
> mutex_inode_double_lock() into fs/namei.c and have it used by
> vfs_rename_other(), but I'm not convinced that this is the right
> thing to do. Is there any other sane way to deal with nfsd problem?
> i_mutex is already used for more things than I'd like...
I don't want to give out a delegation while a rename, link, unlink, or
setattr of an inode is in progress. All but rename are covered by the
i_mutex.
I'm happy just failing the delegation in case of conflict.
Maybe instead I could continue using the i_mutex but handle rename some
other way; e.g. in delegation code:
if (!mutex_trylock(inode->i_mutex))
return -EAGAIN;
if (atomic_read(inode->i_renames_in_progress))
return -EAGAIN;
and add an
atomic_inc(inode->i_renames_in_progress);
atomic_dec(inode->i_renames_in_progress);
pair around rename.
Or I could increment that counter for all the conflicting operations and
rely on it instead of the i_mutex. I was trying to avoid adding
something like that (an inc, a dec, another error path) to every
operation. And hoping to avoid adding another field to struct inode.
Oh well.
--b.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 21:14 ` J. Bruce Fields
@ 2012-04-17 22:08 ` Linus Torvalds
2012-04-17 23:44 ` Al Viro
2012-04-18 0:47 ` J. Bruce Fields
0 siblings, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2012-04-17 22:08 UTC (permalink / raw)
To: J. Bruce Fields; +Cc: Al Viro, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 2:14 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Tue, Apr 17, 2012 at 07:28:26PM +0100, Al Viro wrote:>
> Maybe instead I could continue using the i_mutex but handle rename some
> other way; e.g. in delegation code:
>
> if (!mutex_trylock(inode->i_mutex))
> return -EAGAIN;
> if (atomic_read(inode->i_renames_in_progress))
> return -EAGAIN;
>
> and add an
>
> atomic_inc(inode->i_renames_in_progress);
> atomic_dec(inode->i_renames_in_progress);
>
> pair around rename.
Please don't make up your own locking. Plus it's broken anyway, since
a rename could come in directly after your atomic_read (and this is
*why* people shouldn't make up their own locks - they are invariably
broken).
> Or I could increment that counter for all the conflicting operations and
> rely on it instead of the i_mutex. I was trying to avoid adding
> something like that (an inc, a dec, another error path) to every
> operation. And hoping to avoid adding another field to struct inode.
> Oh well.
We could just say that we can do a double inode lock, but then
standardize on the order. And the only sane order is comparing inode
pointers, not inode numbers like ext4 apparently does.
With a standard order, I don't think it would be at all wrong to just
take the inode lock on rename.
Linus
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 22:08 ` Linus Torvalds
@ 2012-04-17 23:44 ` Al Viro
2012-04-18 0:49 ` J. Bruce Fields
` (3 more replies)
2012-04-18 0:47 ` J. Bruce Fields
1 sibling, 4 replies; 31+ messages in thread
From: Al Viro @ 2012-04-17 23:44 UTC (permalink / raw)
To: Linus Torvalds; +Cc: J. Bruce Fields, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > Or I could increment that counter for all the conflicting operations and
> > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > something like that (an inc, a dec, another error path) to every
> > operation. ?And hoping to avoid adding another field to struct inode.
> > Oh well.
>
> We could just say that we can do a double inode lock, but then
> standardize on the order. And the only sane order is comparing inode
> pointers, not inode numbers like ext4 apparently does.
>
> With a standard order, I don't think it would be at all wrong to just
> take the inode lock on rename.
In principle, yes, but have you tried to grep for i_mutex? Note that
we have *another* place where multiple ->i_mutex might be held on
non-directories (and unless I'm missing something, ext4 move_extent.c
stuff doesn't play well with it): quota writes. Which can, AFAICS,
happen while write(2) is holding ->i_mutex on a regular file. So
it's not _that_ easy - we want something like "and quota file is goes
last", since there we don't get to change the locking order - the first
->i_mutex is taken too far outside.
I really don't like how messy i_mutex had become these days. Right now
I'm staring at 700-odd lines all over the place where it's taken/released
and it's a wastebucket lock - used to protect random bits and scraps, with a
lot of filesystems, etc. using it for purposes of their own ;-/
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 22:08 ` Linus Torvalds
2012-04-17 23:44 ` Al Viro
@ 2012-04-18 0:47 ` J. Bruce Fields
1 sibling, 0 replies; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-18 0:47 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Al Viro, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> On Tue, Apr 17, 2012 at 2:14 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > On Tue, Apr 17, 2012 at 07:28:26PM +0100, Al Viro wrote:>
> > Maybe instead I could continue using the i_mutex but handle rename some
> > other way; e.g. in delegation code:
> >
> > if (!mutex_trylock(inode->i_mutex))
> > return -EAGAIN;
> > if (atomic_read(inode->i_renames_in_progress))
> > return -EAGAIN;
> >
> > and add an
> >
> > atomic_inc(inode->i_renames_in_progress);
> > atomic_dec(inode->i_renames_in_progress);
> >
> > pair around rename.
>
> Please don't make up your own locking. Plus it's broken anyway, since
> a rename could come in directly after your atomic_read (and this is
> *why* people shouldn't make up their own locks - they are invariably
> broken).
Doh, yes, sounds like a good rule. (I was misremembering some previous
attempt at this--which admittedly may just have failed in some more
complicated way.)
--b.
> > Or I could increment that counter for all the conflicting operations and
> > rely on it instead of the i_mutex. I was trying to avoid adding
> > something like that (an inc, a dec, another error path) to every
> > operation. And hoping to avoid adding another field to struct inode.
> > Oh well.
>
> We could just say that we can do a double inode lock, but then
> standardize on the order. And the only sane order is comparing inode
> pointers, not inode numbers like ext4 apparently does.
>
> With a standard order, I don't think it would be at all wrong to just
> take the inode lock on rename.
>
> Linus
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 23:44 ` Al Viro
@ 2012-04-18 0:49 ` J. Bruce Fields
2012-04-18 0:56 ` Linus Torvalds
` (2 subsequent siblings)
3 siblings, 0 replies; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-18 0:49 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-kernel, linux-fsdevel
On Wed, Apr 18, 2012 at 12:44:24AM +0100, Al Viro wrote:
> On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > Or I could increment that counter for all the conflicting operations and
> > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > something like that (an inc, a dec, another error path) to every
> > > operation. ?And hoping to avoid adding another field to struct inode.
> > > Oh well.
> >
> > We could just say that we can do a double inode lock, but then
> > standardize on the order. And the only sane order is comparing inode
> > pointers, not inode numbers like ext4 apparently does.
> >
> > With a standard order, I don't think it would be at all wrong to just
> > take the inode lock on rename.
I'll take a stab at that. But:
> In principle, yes, but have you tried to grep for i_mutex? Note that
> we have *another* place where multiple ->i_mutex might be held on
> non-directories (and unless I'm missing something, ext4 move_extent.c
> stuff doesn't play well with it): quota writes. Which can, AFAICS,
> happen while write(2) is holding ->i_mutex on a regular file. So
> it's not _that_ easy - we want something like "and quota file is goes
> last", since there we don't get to change the locking order - the first
> ->i_mutex is taken too far outside.
>
> I really don't like how messy i_mutex had become these days. Right now
> I'm staring at 700-odd lines all over the place where it's taken/released
> and it's a wastebucket lock - used to protect random bits and scraps, with a
> lot of filesystems, etc. using it for purposes of their own ;-/
I can understand not wanting the i_mutex to have another use.
--b.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 23:44 ` Al Viro
2012-04-18 0:49 ` J. Bruce Fields
@ 2012-04-18 0:56 ` Linus Torvalds
2012-04-18 21:52 ` J. Bruce Fields
2012-04-20 11:15 ` Jan Kara
3 siblings, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2012-04-18 0:56 UTC (permalink / raw)
To: Al Viro; +Cc: J. Bruce Fields, linux-kernel, linux-fsdevel
On Tue, Apr 17, 2012 at 4:44 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> In principle, yes, but have you tried to grep for i_mutex?
No, I hadn't, and right you are. That's a mess.
It's not the i_mutex hits themselves that look scary, but there's 75
hits on just the *nested* locking. That's way more than I would have
guessed without the grep.
The ext4 ones look pretty simple. But yeah, there's a lot of other
noise there...
Linus
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 23:44 ` Al Viro
2012-04-18 0:49 ` J. Bruce Fields
2012-04-18 0:56 ` Linus Torvalds
@ 2012-04-18 21:52 ` J. Bruce Fields
2012-04-25 15:20 ` J. Bruce Fields
2012-04-20 11:15 ` Jan Kara
3 siblings, 1 reply; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-18 21:52 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-kernel, linux-fsdevel
On Wed, Apr 18, 2012 at 12:44:24AM +0100, Al Viro wrote:
> On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > Or I could increment that counter for all the conflicting operations and
> > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > something like that (an inc, a dec, another error path) to every
> > > operation. ?And hoping to avoid adding another field to struct inode.
> > > Oh well.
> >
> > We could just say that we can do a double inode lock, but then
> > standardize on the order. And the only sane order is comparing inode
> > pointers, not inode numbers like ext4 apparently does.
> >
> > With a standard order, I don't think it would be at all wrong to just
> > take the inode lock on rename.
>
> In principle, yes, but have you tried to grep for i_mutex? Note that
> we have *another* place where multiple ->i_mutex might be held on
> non-directories (and unless I'm missing something, ext4 move_extent.c
> stuff doesn't play well with it): quota writes. Which can, AFAICS,
> happen while write(2) is holding ->i_mutex on a regular file. So
> it's not _that_ easy - we want something like "and quota file is goes
> last"
So the idea would be to always take the i_mutex on non-quota files
before taking it on quota files?
I tried pulling the ext4 thing into fs/inode.c, modifying the order to
do that, and then doing the rename change on top of that.
One thing I don't understand is how that interacts with
quota_on/quota_off. How do we decide the right lock ordering if files
can go back and forth between being quota files?
--b.
> , since there we don't get to change the locking order - the first
> ->i_mutex is taken too far outside.
>
> I really don't like how messy i_mutex had become these days. Right now
> I'm staring at 700-odd lines all over the place where it's taken/released
> and it's a wastebucket lock - used to protect random bits and scraps, with a
> lot of filesystems, etc. using it for purposes of their own ;-/
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 5:25 Al Viro
2012-04-17 15:01 ` Linus Torvalds
@ 2012-04-19 3:23 ` Benjamin Herrenschmidt
2012-04-19 14:50 ` Ted Ts'o
1 sibling, 1 reply; 31+ messages in thread
From: Benjamin Herrenschmidt @ 2012-04-19 3:23 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-kernel, linux-fsdevel, Anton Blanchard
> ext4: fix endianness breakage in ext4_split_extent_at()
That smells like something we'd want in -stable no ?
Cheers,
Ben.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-19 3:23 ` Benjamin Herrenschmidt
@ 2012-04-19 14:50 ` Ted Ts'o
2012-04-24 17:40 ` Greg KH
0 siblings, 1 reply; 31+ messages in thread
From: Ted Ts'o @ 2012-04-19 14:50 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Al Viro, Linus Torvalds, linux-kernel, linux-fsdevel,
Anton Blanchard
On Thu, Apr 19, 2012 at 01:23:18PM +1000, Benjamin Herrenschmidt wrote:
> > ext4: fix endianness breakage in ext4_split_extent_at()
>
> That smells like something we'd want in -stable no ?
Yes, commit af1584f5 is something that should definitely go into
-stable.
- Ted
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-17 23:44 ` Al Viro
` (2 preceding siblings ...)
2012-04-18 21:52 ` J. Bruce Fields
@ 2012-04-20 11:15 ` Jan Kara
2012-04-24 19:52 ` J. Bruce Fields
3 siblings, 1 reply; 31+ messages in thread
From: Jan Kara @ 2012-04-20 11:15 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, J. Bruce Fields, linux-kernel, linux-fsdevel
On Wed 18-04-12 00:44:24, Al Viro wrote:
> On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > Or I could increment that counter for all the conflicting operations and
> > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > something like that (an inc, a dec, another error path) to every
> > > operation. ?And hoping to avoid adding another field to struct inode.
> > > Oh well.
> >
> > We could just say that we can do a double inode lock, but then
> > standardize on the order. And the only sane order is comparing inode
> > pointers, not inode numbers like ext4 apparently does.
> >
> > With a standard order, I don't think it would be at all wrong to just
> > take the inode lock on rename.
>
> In principle, yes, but have you tried to grep for i_mutex? Note that
> we have *another* place where multiple ->i_mutex might be held on
> non-directories (and unless I'm missing something, ext4 move_extent.c
> stuff doesn't play well with it): quota writes. Which can, AFAICS,
> happen while write(2) is holding ->i_mutex on a regular file. So
> it's not _that_ easy - we want something like "and quota file is goes
> last", since there we don't get to change the locking order - the first
> ->i_mutex is taken too far outside.
Hum, I think I could just do away with quota file i_mutex being special.
It's used for two purposes:
1) When quota is being turned on/off, we want to set/clear inode immutable
flag, truncate page cache, etc. But we should be able push this locking
outside of quota locks.
2) Inside filesystems when quota file is written to. Quota writes are
serialized by quota code anyway and noone else has any bussiness with quota
files (they are marked as immutable to avoid mistakes) so there i_mutex is
not really needed.
> I really don't like how messy i_mutex had become these days. Right now
> I'm staring at 700-odd lines all over the place where it's taken/released
> and it's a wastebucket lock - used to protect random bits and scraps, with a
> lot of filesystems, etc. using it for purposes of their own ;-/
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-19 14:50 ` Ted Ts'o
@ 2012-04-24 17:40 ` Greg KH
2012-04-24 17:45 ` Al Viro
0 siblings, 1 reply; 31+ messages in thread
From: Greg KH @ 2012-04-24 17:40 UTC (permalink / raw)
To: Ted Ts'o, Benjamin Herrenschmidt, Al Viro, Linus Torvalds,
linux-kernel, linux-fsdevel, Anton Blanchard
On Thu, Apr 19, 2012 at 10:50:07AM -0400, Ted Ts'o wrote:
> On Thu, Apr 19, 2012 at 01:23:18PM +1000, Benjamin Herrenschmidt wrote:
> > > ext4: fix endianness breakage in ext4_split_extent_at()
> >
> > That smells like something we'd want in -stable no ?
>
> Yes, commit af1584f5 is something that should definitely go into
> -stable.
Now queued up, thanks.
greg k-h
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-24 17:40 ` Greg KH
@ 2012-04-24 17:45 ` Al Viro
2012-04-24 17:59 ` Greg KH
0 siblings, 1 reply; 31+ messages in thread
From: Al Viro @ 2012-04-24 17:45 UTC (permalink / raw)
To: Greg KH
Cc: Ted Ts'o, Benjamin Herrenschmidt, Linus Torvalds,
linux-kernel, linux-fsdevel, Anton Blanchard
On Tue, Apr 24, 2012 at 10:40:06AM -0700, Greg KH wrote:
> On Thu, Apr 19, 2012 at 10:50:07AM -0400, Ted Ts'o wrote:
> > On Thu, Apr 19, 2012 at 01:23:18PM +1000, Benjamin Herrenschmidt wrote:
> > > > ext4: fix endianness breakage in ext4_split_extent_at()
> > >
> > > That smells like something we'd want in -stable no ?
> >
> > Yes, commit af1584f5 is something that should definitely go into
> > -stable.
>
> Now queued up, thanks.
Actually, the rest of commits from that pull also should go there; forgot
to add Cc: stable, my apologies...
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-24 17:45 ` Al Viro
@ 2012-04-24 17:59 ` Greg KH
2012-04-24 18:04 ` Al Viro
0 siblings, 1 reply; 31+ messages in thread
From: Greg KH @ 2012-04-24 17:59 UTC (permalink / raw)
To: Al Viro
Cc: Ted Ts'o, Benjamin Herrenschmidt, Linus Torvalds,
linux-kernel, linux-fsdevel, Anton Blanchard
On Tue, Apr 24, 2012 at 06:45:02PM +0100, Al Viro wrote:
> On Tue, Apr 24, 2012 at 10:40:06AM -0700, Greg KH wrote:
> > On Thu, Apr 19, 2012 at 10:50:07AM -0400, Ted Ts'o wrote:
> > > On Thu, Apr 19, 2012 at 01:23:18PM +1000, Benjamin Herrenschmidt wrote:
> > > > > ext4: fix endianness breakage in ext4_split_extent_at()
> > > >
> > > > That smells like something we'd want in -stable no ?
> > >
> > > Yes, commit af1584f5 is something that should definitely go into
> > > -stable.
> >
> > Now queued up, thanks.
>
> Actually, the rest of commits from that pull also should go there; forgot
> to add Cc: stable, my apologies...
So that would be commits
96f6f98501196d46ce52c2697dd758d9300c63f5..e847469bf77a1d339274074ed068d461f0c872bc
inclusive?
Or a different range?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-24 17:59 ` Greg KH
@ 2012-04-24 18:04 ` Al Viro
2012-04-24 20:37 ` Greg KH
0 siblings, 1 reply; 31+ messages in thread
From: Al Viro @ 2012-04-24 18:04 UTC (permalink / raw)
To: Greg KH
Cc: Ted Ts'o, Benjamin Herrenschmidt, Linus Torvalds,
linux-kernel, linux-fsdevel, Anton Blanchard
On Tue, Apr 24, 2012 at 10:59:11AM -0700, Greg KH wrote:
> So that would be commits
> 96f6f98501196d46ce52c2697dd758d9300c63f5..e847469bf77a1d339274074ed068d461f0c872bc
> inclusive?
Yes.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-20 11:15 ` Jan Kara
@ 2012-04-24 19:52 ` J. Bruce Fields
2012-04-24 22:23 ` Jan Kara
0 siblings, 1 reply; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-24 19:52 UTC (permalink / raw)
To: Jan Kara; +Cc: Al Viro, Linus Torvalds, linux-kernel, linux-fsdevel
On Fri, Apr 20, 2012 at 01:15:17PM +0200, Jan Kara wrote:
> On Wed 18-04-12 00:44:24, Al Viro wrote:
> > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > Or I could increment that counter for all the conflicting operations and
> > > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > > something like that (an inc, a dec, another error path) to every
> > > > operation. ?And hoping to avoid adding another field to struct inode.
> > > > Oh well.
> > >
> > > We could just say that we can do a double inode lock, but then
> > > standardize on the order. And the only sane order is comparing inode
> > > pointers, not inode numbers like ext4 apparently does.
> > >
> > > With a standard order, I don't think it would be at all wrong to just
> > > take the inode lock on rename.
> >
> > In principle, yes, but have you tried to grep for i_mutex? Note that
> > we have *another* place where multiple ->i_mutex might be held on
> > non-directories (and unless I'm missing something, ext4 move_extent.c
> > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > happen while write(2) is holding ->i_mutex on a regular file. So
> > it's not _that_ easy - we want something like "and quota file is goes
> > last", since there we don't get to change the locking order - the first
> > ->i_mutex is taken too far outside.
> Hum, I think I could just do away with quota file i_mutex being special.
> It's used for two purposes:
> 1) When quota is being turned on/off, we want to set/clear inode immutable
> flag, truncate page cache, etc. But we should be able push this locking
> outside of quota locks.
> 2) Inside filesystems when quota file is written to. Quota writes are
> serialized by quota code anyway and noone else has any bussiness with quota
> files (they are marked as immutable to avoid mistakes) so there i_mutex is
> not really needed.
Grepping for I_MUTEX_QUOTA shows hits in ext4, reiserfs, and gfs2. The
former two are in code called from the quota code (through the
->quota_write method). But the gfs2 code appears to be called directly
from gfs2's write code.
--b.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-24 18:04 ` Al Viro
@ 2012-04-24 20:37 ` Greg KH
0 siblings, 0 replies; 31+ messages in thread
From: Greg KH @ 2012-04-24 20:37 UTC (permalink / raw)
To: Al Viro
Cc: Ted Ts'o, Benjamin Herrenschmidt, Linus Torvalds,
linux-kernel, linux-fsdevel, Anton Blanchard
On Tue, Apr 24, 2012 at 07:04:07PM +0100, Al Viro wrote:
> On Tue, Apr 24, 2012 at 10:59:11AM -0700, Greg KH wrote:
>
> > So that would be commits
> > 96f6f98501196d46ce52c2697dd758d9300c63f5..e847469bf77a1d339274074ed068d461f0c872bc
> > inclusive?
>
> Yes.
Wonderful, now queued up, thanks for letting me know.
greg k-h
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-24 19:52 ` J. Bruce Fields
@ 2012-04-24 22:23 ` Jan Kara
2012-04-25 11:29 ` J. Bruce Fields
0 siblings, 1 reply; 31+ messages in thread
From: Jan Kara @ 2012-04-24 22:23 UTC (permalink / raw)
To: J. Bruce Fields
Cc: Jan Kara, Al Viro, Linus Torvalds, linux-kernel, linux-fsdevel
On Tue 24-04-12 15:52:36, J. Bruce Fields wrote:
> On Fri, Apr 20, 2012 at 01:15:17PM +0200, Jan Kara wrote:
> > On Wed 18-04-12 00:44:24, Al Viro wrote:
> > > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > > Or I could increment that counter for all the conflicting operations and
> > > > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > > > something like that (an inc, a dec, another error path) to every
> > > > > operation. ?And hoping to avoid adding another field to struct inode.
> > > > > Oh well.
> > > >
> > > > We could just say that we can do a double inode lock, but then
> > > > standardize on the order. And the only sane order is comparing inode
> > > > pointers, not inode numbers like ext4 apparently does.
> > > >
> > > > With a standard order, I don't think it would be at all wrong to just
> > > > take the inode lock on rename.
> > >
> > > In principle, yes, but have you tried to grep for i_mutex? Note that
> > > we have *another* place where multiple ->i_mutex might be held on
> > > non-directories (and unless I'm missing something, ext4 move_extent.c
> > > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > > happen while write(2) is holding ->i_mutex on a regular file. So
> > > it's not _that_ easy - we want something like "and quota file is goes
> > > last", since there we don't get to change the locking order - the first
> > > ->i_mutex is taken too far outside.
> > Hum, I think I could just do away with quota file i_mutex being special.
> > It's used for two purposes:
> > 1) When quota is being turned on/off, we want to set/clear inode immutable
> > flag, truncate page cache, etc. But we should be able push this locking
> > outside of quota locks.
> > 2) Inside filesystems when quota file is written to. Quota writes are
> > serialized by quota code anyway and noone else has any bussiness with quota
> > files (they are marked as immutable to avoid mistakes) so there i_mutex is
> > not really needed.
>
> Grepping for I_MUTEX_QUOTA shows hits in ext4, reiserfs, and gfs2. The
> former two are in code called from the quota code (through the
> ->quota_write method). But the gfs2 code appears to be called directly
> from gfs2's write code.
Ah, gfs2 doesn't use generic quota code so whatever it does is it's own
invention. For ext4 and reiserfs I could get rid of I_MUTEX_QUOTA as I
wrote.
Honza
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-24 22:23 ` Jan Kara
@ 2012-04-25 11:29 ` J. Bruce Fields
2012-04-25 16:26 ` Jan Kara
0 siblings, 1 reply; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-25 11:29 UTC (permalink / raw)
To: Jan Kara; +Cc: Al Viro, Linus Torvalds, linux-kernel, linux-fsdevel
On Wed, Apr 25, 2012 at 12:23:12AM +0200, Jan Kara wrote:
> On Tue 24-04-12 15:52:36, J. Bruce Fields wrote:
> > On Fri, Apr 20, 2012 at 01:15:17PM +0200, Jan Kara wrote:
> > > On Wed 18-04-12 00:44:24, Al Viro wrote:
> > > > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > > > Or I could increment that counter for all the conflicting operations and
> > > > > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > > > > something like that (an inc, a dec, another error path) to every
> > > > > > operation. ?And hoping to avoid adding another field to struct inode.
> > > > > > Oh well.
> > > > >
> > > > > We could just say that we can do a double inode lock, but then
> > > > > standardize on the order. And the only sane order is comparing inode
> > > > > pointers, not inode numbers like ext4 apparently does.
> > > > >
> > > > > With a standard order, I don't think it would be at all wrong to just
> > > > > take the inode lock on rename.
> > > >
> > > > In principle, yes, but have you tried to grep for i_mutex? Note that
> > > > we have *another* place where multiple ->i_mutex might be held on
> > > > non-directories (and unless I'm missing something, ext4 move_extent.c
> > > > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > > > happen while write(2) is holding ->i_mutex on a regular file. So
> > > > it's not _that_ easy - we want something like "and quota file is goes
> > > > last", since there we don't get to change the locking order - the first
> > > > ->i_mutex is taken too far outside.
> > > Hum, I think I could just do away with quota file i_mutex being special.
> > > It's used for two purposes:
> > > 1) When quota is being turned on/off, we want to set/clear inode immutable
> > > flag, truncate page cache, etc. But we should be able push this locking
> > > outside of quota locks.
> > > 2) Inside filesystems when quota file is written to. Quota writes are
> > > serialized by quota code anyway and noone else has any bussiness with quota
> > > files (they are marked as immutable to avoid mistakes) so there i_mutex is
> > > not really needed.
> >
> > Grepping for I_MUTEX_QUOTA shows hits in ext4, reiserfs, and gfs2. The
> > former two are in code called from the quota code (through the
> > ->quota_write method). But the gfs2 code appears to be called directly
> > from gfs2's write code.
> Ah, gfs2 doesn't use generic quota code so whatever it does is it's own
> invention. For ext4 and reiserfs I could get rid of I_MUTEX_QUOTA as I
> wrote.
So, just the appended?
But unfortunately as long as that's left in gfs2 we're still stuck
trying to order quota files after other files when we take two
non-directory i_mutexes elsewhere.
--b.
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index e1025c7..1a6fb52 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -1441,7 +1441,6 @@ static ssize_t ext2_quota_write(struct super_block *sb, int type,
struct buffer_head tmp_bh;
struct buffer_head *bh;
- mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
while (towrite > 0) {
tocopy = sb->s_blocksize - offset < towrite ?
sb->s_blocksize - offset : towrite;
@@ -1471,16 +1470,13 @@ static ssize_t ext2_quota_write(struct super_block *sb, int type,
blk++;
}
out:
- if (len == towrite) {
- mutex_unlock(&inode->i_mutex);
+ if (len == towrite)
return err;
- }
if (inode->i_size < off+len-towrite)
i_size_write(inode, off+len-towrite);
inode->i_version++;
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
mark_inode_dirty(inode);
- mutex_unlock(&inode->i_mutex);
return len - towrite;
}
diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index cf0b592..7c08c93 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -3000,7 +3000,6 @@ static ssize_t ext3_quota_write(struct super_block *sb, int type,
(unsigned long long)off, (unsigned long long)len);
return -EIO;
}
- mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
bh = ext3_bread(handle, inode, blk, 1, &err);
if (!bh)
goto out;
@@ -3024,10 +3023,8 @@ static ssize_t ext3_quota_write(struct super_block *sb, int type,
}
brelse(bh);
out:
- if (err) {
- mutex_unlock(&inode->i_mutex);
+ if (err)
return err;
- }
if (inode->i_size < off + len) {
i_size_write(inode, off + len);
EXT3_I(inode)->i_disksize = inode->i_size;
@@ -3035,7 +3032,6 @@ out:
inode->i_version++;
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
ext3_mark_inode_dirty(handle, inode);
- mutex_unlock(&inode->i_mutex);
return len;
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ceebaf8..97938db 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4760,7 +4760,6 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
return -EIO;
}
- mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
bh = ext4_bread(handle, inode, blk, 1, &err);
if (!bh)
goto out;
@@ -4776,16 +4775,13 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
err = ext4_handle_dirty_metadata(handle, NULL, bh);
brelse(bh);
out:
- if (err) {
- mutex_unlock(&inode->i_mutex);
+ if (err)
return err;
- }
if (inode->i_size < off + len) {
i_size_write(inode, off + len);
EXT4_I(inode)->i_disksize = inode->i_size;
ext4_mark_inode_dirty(handle, inode);
}
- mutex_unlock(&inode->i_mutex);
return len;
}
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 8b7616e..c07b7d7 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -2270,7 +2270,6 @@ static ssize_t reiserfs_quota_write(struct super_block *sb, int type,
(unsigned long long)off, (unsigned long long)len);
return -EIO;
}
- mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
while (towrite > 0) {
tocopy = sb->s_blocksize - offset < towrite ?
sb->s_blocksize - offset : towrite;
@@ -2302,16 +2301,13 @@ static ssize_t reiserfs_quota_write(struct super_block *sb, int type,
blk++;
}
out:
- if (len == towrite) {
- mutex_unlock(&inode->i_mutex);
+ if (len == towrite)
return err;
- }
if (inode->i_size < off + len - towrite)
i_size_write(inode, off + len - towrite);
inode->i_version++;
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
mark_inode_dirty(inode);
- mutex_unlock(&inode->i_mutex);
return len - towrite;
}
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-18 21:52 ` J. Bruce Fields
@ 2012-04-25 15:20 ` J. Bruce Fields
0 siblings, 0 replies; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-25 15:20 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-kernel, linux-fsdevel
On Wed, Apr 18, 2012 at 05:52:38PM -0400, J. Bruce Fields wrote:
> On Wed, Apr 18, 2012 at 12:44:24AM +0100, Al Viro wrote:
> > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > Or I could increment that counter for all the conflicting operations and
> > > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > > something like that (an inc, a dec, another error path) to every
> > > > operation. ?And hoping to avoid adding another field to struct inode.
> > > > Oh well.
> > >
> > > We could just say that we can do a double inode lock, but then
> > > standardize on the order. And the only sane order is comparing inode
> > > pointers, not inode numbers like ext4 apparently does.
> > >
> > > With a standard order, I don't think it would be at all wrong to just
> > > take the inode lock on rename.
> >
> > In principle, yes, but have you tried to grep for i_mutex? Note that
> > we have *another* place where multiple ->i_mutex might be held on
> > non-directories (and unless I'm missing something, ext4 move_extent.c
> > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > happen while write(2) is holding ->i_mutex on a regular file. So
> > it's not _that_ easy - we want something like "and quota file is goes
> > last"
>
> So the idea would be to always take the i_mutex on non-quota files
> before taking it on quota files?
>
> I tried pulling the ext4 thing into fs/inode.c, modifying the order to
> do that, and then doing the rename change on top of that.
Patches follow, with the ordering change at the end.
And a documentation fix that I suppose could go in whenever.
--b.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-25 11:29 ` J. Bruce Fields
@ 2012-04-25 16:26 ` Jan Kara
2012-04-25 16:47 ` Steven Whitehouse
2012-04-25 17:11 ` J. Bruce Fields
0 siblings, 2 replies; 31+ messages in thread
From: Jan Kara @ 2012-04-25 16:26 UTC (permalink / raw)
To: J. Bruce Fields
Cc: Jan Kara, Al Viro, Linus Torvalds, linux-kernel, linux-fsdevel,
Steven Whitehouse
On Wed 25-04-12 07:29:30, J. Bruce Fields wrote:
> On Wed, Apr 25, 2012 at 12:23:12AM +0200, Jan Kara wrote:
> > On Tue 24-04-12 15:52:36, J. Bruce Fields wrote:
> > > On Fri, Apr 20, 2012 at 01:15:17PM +0200, Jan Kara wrote:
> > > > On Wed 18-04-12 00:44:24, Al Viro wrote:
> > > > > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > > > > Or I could increment that counter for all the conflicting operations and
> > > > > > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > > > > > something like that (an inc, a dec, another error path) to every
> > > > > > > operation. ?And hoping to avoid adding another field to struct inode.
> > > > > > > Oh well.
> > > > > >
> > > > > > We could just say that we can do a double inode lock, but then
> > > > > > standardize on the order. And the only sane order is comparing inode
> > > > > > pointers, not inode numbers like ext4 apparently does.
> > > > > >
> > > > > > With a standard order, I don't think it would be at all wrong to just
> > > > > > take the inode lock on rename.
> > > > >
> > > > > In principle, yes, but have you tried to grep for i_mutex? Note that
> > > > > we have *another* place where multiple ->i_mutex might be held on
> > > > > non-directories (and unless I'm missing something, ext4 move_extent.c
> > > > > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > > > > happen while write(2) is holding ->i_mutex on a regular file. So
> > > > > it's not _that_ easy - we want something like "and quota file is goes
> > > > > last", since there we don't get to change the locking order - the first
> > > > > ->i_mutex is taken too far outside.
> > > > Hum, I think I could just do away with quota file i_mutex being special.
> > > > It's used for two purposes:
> > > > 1) When quota is being turned on/off, we want to set/clear inode immutable
> > > > flag, truncate page cache, etc. But we should be able push this locking
> > > > outside of quota locks.
> > > > 2) Inside filesystems when quota file is written to. Quota writes are
> > > > serialized by quota code anyway and noone else has any bussiness with quota
> > > > files (they are marked as immutable to avoid mistakes) so there i_mutex is
> > > > not really needed.
> > >
> > > Grepping for I_MUTEX_QUOTA shows hits in ext4, reiserfs, and gfs2. The
> > > former two are in code called from the quota code (through the
> > > ->quota_write method). But the gfs2 code appears to be called directly
> > > from gfs2's write code.
> > Ah, gfs2 doesn't use generic quota code so whatever it does is it's own
> > invention. For ext4 and reiserfs I could get rid of I_MUTEX_QUOTA as I
> > wrote.
>
> So, just the appended?
Yup, that's the easier part. We also use the mutex in quota code itself
(fs/quota/dquot.c). That's somewhat harder to solve but still relatively
simple.
> But unfortunately as long as that's left in gfs2 we're still stuck
> trying to order quota files after other files when we take two
> non-directory i_mutexes elsewhere.
As far as GFS2 is concerned, I'm not sure what it uses i_mutex in quota
code for. In any case it should be possible to replace that usage by some
GFS2 internal lock to get rid of the last usage of I_MUTEX_QUOTA... Stephen?
Honza
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index e1025c7..1a6fb52 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -1441,7 +1441,6 @@ static ssize_t ext2_quota_write(struct super_block *sb, int type,
> struct buffer_head tmp_bh;
> struct buffer_head *bh;
>
> - mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
> while (towrite > 0) {
> tocopy = sb->s_blocksize - offset < towrite ?
> sb->s_blocksize - offset : towrite;
> @@ -1471,16 +1470,13 @@ static ssize_t ext2_quota_write(struct super_block *sb, int type,
> blk++;
> }
> out:
> - if (len == towrite) {
> - mutex_unlock(&inode->i_mutex);
> + if (len == towrite)
> return err;
> - }
> if (inode->i_size < off+len-towrite)
> i_size_write(inode, off+len-towrite);
> inode->i_version++;
> inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> mark_inode_dirty(inode);
> - mutex_unlock(&inode->i_mutex);
> return len - towrite;
> }
>
> diff --git a/fs/ext3/super.c b/fs/ext3/super.c
> index cf0b592..7c08c93 100644
> --- a/fs/ext3/super.c
> +++ b/fs/ext3/super.c
> @@ -3000,7 +3000,6 @@ static ssize_t ext3_quota_write(struct super_block *sb, int type,
> (unsigned long long)off, (unsigned long long)len);
> return -EIO;
> }
> - mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
> bh = ext3_bread(handle, inode, blk, 1, &err);
> if (!bh)
> goto out;
> @@ -3024,10 +3023,8 @@ static ssize_t ext3_quota_write(struct super_block *sb, int type,
> }
> brelse(bh);
> out:
> - if (err) {
> - mutex_unlock(&inode->i_mutex);
> + if (err)
> return err;
> - }
> if (inode->i_size < off + len) {
> i_size_write(inode, off + len);
> EXT3_I(inode)->i_disksize = inode->i_size;
> @@ -3035,7 +3032,6 @@ out:
> inode->i_version++;
> inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> ext3_mark_inode_dirty(handle, inode);
> - mutex_unlock(&inode->i_mutex);
> return len;
> }
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index ceebaf8..97938db 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4760,7 +4760,6 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> return -EIO;
> }
>
> - mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
> bh = ext4_bread(handle, inode, blk, 1, &err);
> if (!bh)
> goto out;
> @@ -4776,16 +4775,13 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type,
> err = ext4_handle_dirty_metadata(handle, NULL, bh);
> brelse(bh);
> out:
> - if (err) {
> - mutex_unlock(&inode->i_mutex);
> + if (err)
> return err;
> - }
> if (inode->i_size < off + len) {
> i_size_write(inode, off + len);
> EXT4_I(inode)->i_disksize = inode->i_size;
> ext4_mark_inode_dirty(handle, inode);
> }
> - mutex_unlock(&inode->i_mutex);
> return len;
> }
>
> diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
> index 8b7616e..c07b7d7 100644
> --- a/fs/reiserfs/super.c
> +++ b/fs/reiserfs/super.c
> @@ -2270,7 +2270,6 @@ static ssize_t reiserfs_quota_write(struct super_block *sb, int type,
> (unsigned long long)off, (unsigned long long)len);
> return -EIO;
> }
> - mutex_lock_nested(&inode->i_mutex, I_MUTEX_QUOTA);
> while (towrite > 0) {
> tocopy = sb->s_blocksize - offset < towrite ?
> sb->s_blocksize - offset : towrite;
> @@ -2302,16 +2301,13 @@ static ssize_t reiserfs_quota_write(struct super_block *sb, int type,
> blk++;
> }
> out:
> - if (len == towrite) {
> - mutex_unlock(&inode->i_mutex);
> + if (len == towrite)
> return err;
> - }
> if (inode->i_size < off + len - towrite)
> i_size_write(inode, off + len - towrite);
> inode->i_version++;
> inode->i_mtime = inode->i_ctime = CURRENT_TIME;
> mark_inode_dirty(inode);
> - mutex_unlock(&inode->i_mutex);
> return len - towrite;
> }
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-25 16:26 ` Jan Kara
@ 2012-04-25 16:47 ` Steven Whitehouse
2012-04-25 17:11 ` J. Bruce Fields
1 sibling, 0 replies; 31+ messages in thread
From: Steven Whitehouse @ 2012-04-25 16:47 UTC (permalink / raw)
To: Jan Kara
Cc: J. Bruce Fields, Al Viro, Linus Torvalds, linux-kernel,
linux-fsdevel
Hi,
On Wed, 2012-04-25 at 18:26 +0200, Jan Kara wrote:
> On Wed 25-04-12 07:29:30, J. Bruce Fields wrote:
> > On Wed, Apr 25, 2012 at 12:23:12AM +0200, Jan Kara wrote:
> > > On Tue 24-04-12 15:52:36, J. Bruce Fields wrote:
> > > > On Fri, Apr 20, 2012 at 01:15:17PM +0200, Jan Kara wrote:
> > > > > On Wed 18-04-12 00:44:24, Al Viro wrote:
> > > > > > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > > > > > Or I could increment that counter for all the conflicting operations and
> > > > > > > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > > > > > > something like that (an inc, a dec, another error path) to every
> > > > > > > > operation. ?And hoping to avoid adding another field to struct inode.
> > > > > > > > Oh well.
> > > > > > >
> > > > > > > We could just say that we can do a double inode lock, but then
> > > > > > > standardize on the order. And the only sane order is comparing inode
> > > > > > > pointers, not inode numbers like ext4 apparently does.
> > > > > > >
> > > > > > > With a standard order, I don't think it would be at all wrong to just
> > > > > > > take the inode lock on rename.
> > > > > >
> > > > > > In principle, yes, but have you tried to grep for i_mutex? Note that
> > > > > > we have *another* place where multiple ->i_mutex might be held on
> > > > > > non-directories (and unless I'm missing something, ext4 move_extent.c
> > > > > > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > > > > > happen while write(2) is holding ->i_mutex on a regular file. So
> > > > > > it's not _that_ easy - we want something like "and quota file is goes
> > > > > > last", since there we don't get to change the locking order - the first
> > > > > > ->i_mutex is taken too far outside.
> > > > > Hum, I think I could just do away with quota file i_mutex being special.
> > > > > It's used for two purposes:
> > > > > 1) When quota is being turned on/off, we want to set/clear inode immutable
> > > > > flag, truncate page cache, etc. But we should be able push this locking
> > > > > outside of quota locks.
> > > > > 2) Inside filesystems when quota file is written to. Quota writes are
> > > > > serialized by quota code anyway and noone else has any bussiness with quota
> > > > > files (they are marked as immutable to avoid mistakes) so there i_mutex is
> > > > > not really needed.
> > > >
> > > > Grepping for I_MUTEX_QUOTA shows hits in ext4, reiserfs, and gfs2. The
> > > > former two are in code called from the quota code (through the
> > > > ->quota_write method). But the gfs2 code appears to be called directly
> > > > from gfs2's write code.
> > > Ah, gfs2 doesn't use generic quota code so whatever it does is it's own
> > > invention. For ext4 and reiserfs I could get rid of I_MUTEX_QUOTA as I
> > > wrote.
> >
> > So, just the appended?
> Yup, that's the easier part. We also use the mutex in quota code itself
> (fs/quota/dquot.c). That's somewhat harder to solve but still relatively
> simple.
>
> > But unfortunately as long as that's left in gfs2 we're still stuck
> > trying to order quota files after other files when we take two
> > non-directory i_mutexes elsewhere.
> As far as GFS2 is concerned, I'm not sure what it uses i_mutex in quota
> code for. In any case it should be possible to replace that usage by some
> GFS2 internal lock to get rid of the last usage of I_MUTEX_QUOTA... Stephen?
>
> Honza
>
I'll have a look and see what we can do. I'm not sure off the top of my
head how easy it will be to eliminate this lock,
Steve.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [git pull] vfs and fs fixes
2012-04-25 16:26 ` Jan Kara
2012-04-25 16:47 ` Steven Whitehouse
@ 2012-04-25 17:11 ` J. Bruce Fields
1 sibling, 0 replies; 31+ messages in thread
From: J. Bruce Fields @ 2012-04-25 17:11 UTC (permalink / raw)
To: Jan Kara
Cc: Al Viro, Linus Torvalds, linux-kernel, linux-fsdevel,
Steven Whitehouse
On Wed, Apr 25, 2012 at 06:26:40PM +0200, Jan Kara wrote:
> On Wed 25-04-12 07:29:30, J. Bruce Fields wrote:
> > On Wed, Apr 25, 2012 at 12:23:12AM +0200, Jan Kara wrote:
> > > On Tue 24-04-12 15:52:36, J. Bruce Fields wrote:
> > > > On Fri, Apr 20, 2012 at 01:15:17PM +0200, Jan Kara wrote:
> > > > > On Wed 18-04-12 00:44:24, Al Viro wrote:
> > > > > > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > > > > > Or I could increment that counter for all the conflicting operations and
> > > > > > > > rely on it instead of the i_mutex. ?I was trying to avoid adding
> > > > > > > > something like that (an inc, a dec, another error path) to every
> > > > > > > > operation. ?And hoping to avoid adding another field to struct inode.
> > > > > > > > Oh well.
> > > > > > >
> > > > > > > We could just say that we can do a double inode lock, but then
> > > > > > > standardize on the order. And the only sane order is comparing inode
> > > > > > > pointers, not inode numbers like ext4 apparently does.
> > > > > > >
> > > > > > > With a standard order, I don't think it would be at all wrong to just
> > > > > > > take the inode lock on rename.
> > > > > >
> > > > > > In principle, yes, but have you tried to grep for i_mutex? Note that
> > > > > > we have *another* place where multiple ->i_mutex might be held on
> > > > > > non-directories (and unless I'm missing something, ext4 move_extent.c
> > > > > > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > > > > > happen while write(2) is holding ->i_mutex on a regular file. So
> > > > > > it's not _that_ easy - we want something like "and quota file is goes
> > > > > > last", since there we don't get to change the locking order - the first
> > > > > > ->i_mutex is taken too far outside.
> > > > > Hum, I think I could just do away with quota file i_mutex being special.
> > > > > It's used for two purposes:
> > > > > 1) When quota is being turned on/off, we want to set/clear inode immutable
> > > > > flag, truncate page cache, etc. But we should be able push this locking
> > > > > outside of quota locks.
> > > > > 2) Inside filesystems when quota file is written to. Quota writes are
> > > > > serialized by quota code anyway and noone else has any bussiness with quota
> > > > > files (they are marked as immutable to avoid mistakes) so there i_mutex is
> > > > > not really needed.
> > > >
> > > > Grepping for I_MUTEX_QUOTA shows hits in ext4, reiserfs, and gfs2. The
> > > > former two are in code called from the quota code (through the
> > > > ->quota_write method). But the gfs2 code appears to be called directly
> > > > from gfs2's write code.
> > > Ah, gfs2 doesn't use generic quota code so whatever it does is it's own
> > > invention. For ext4 and reiserfs I could get rid of I_MUTEX_QUOTA as I
> > > wrote.
> >
> > So, just the appended?
> Yup, that's the easier part. We also use the mutex in quota code itself
> (fs/quota/dquot.c). That's somewhat harder to solve but still relatively
> simple.
Yeah, OK, I stared at that part but wasn't completely sure what you
meant to do there.
--b.
> > But unfortunately as long as that's left in gfs2 we're still stuck
> > trying to order quota files after other files when we take two
> > non-directory i_mutexes elsewhere.
> As far as GFS2 is concerned, I'm not sure what it uses i_mutex in quota
> code for. In any case it should be possible to replace that usage by some
> GFS2 internal lock to get rid of the last usage of I_MUTEX_QUOTA... Stephen?
^ permalink raw reply [flat|nested] 31+ messages in thread
* [git pull] vfs and fs fixes
@ 2013-09-18 22:52 Al Viro
0 siblings, 0 replies; 31+ messages in thread
From: Al Viro @ 2013-09-18 22:52 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, linux-fsdevel
atomic_open-related fixes (Miklos' series, with EEXIST-related parts replaced
with fix in fs/namei.c:atomic_open() instead of messing with the instances)
+ race fix in autofs + leak on failure exit in 9p. Please, pull from the
usual place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus
Shortlog:
Al Viro (3):
autofs4: close the races around autofs4_notify_daemon()
atomic_open: take care of EEXIST in no-open case with O_CREAT|O_EXCL in fs/namei.c
9p: don't forget to destroy inode cache if fscache registration fails
Miklos Szeredi (5):
vfs: improve i_op->atomic_open() documentation
cifs: fix filp leak in cifs_atomic_open()
gfs2: set FILE_CREATED
nfs: set FILE_CREATED
vfs: don't set FILE_CREATED before calling ->atomic_open()
Diffstat:
Documentation/filesystems/vfs.txt | 14 +++++++-------
fs/9p/v9fs.c | 7 ++++---
fs/9p/vfs_inode_dotl.c | 8 +-------
fs/autofs4/waitq.c | 13 +++----------
fs/cifs/dir.c | 1 +
fs/gfs2/inode.c | 4 +++-
fs/namei.c | 34 ++++++++++++++++++++++------------
fs/nfs/dir.c | 3 +++
fs/open.c | 21 ++++++++++++++++++---
9 files changed, 62 insertions(+), 43 deletions(-)
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2013-09-18 22:52 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-18 22:52 [git pull] vfs and fs fixes Al Viro
-- strict thread matches above, loose matches on Subject: below --
2012-04-17 5:25 Al Viro
2012-04-17 15:01 ` Linus Torvalds
2012-04-17 16:22 ` J. Bruce Fields
2012-04-17 16:33 ` Linus Torvalds
2012-04-17 17:06 ` J. Bruce Fields
2012-04-17 17:59 ` Al Viro
2012-04-17 18:01 ` Al Viro
2012-04-17 18:28 ` Al Viro
2012-04-17 21:14 ` J. Bruce Fields
2012-04-17 22:08 ` Linus Torvalds
2012-04-17 23:44 ` Al Viro
2012-04-18 0:49 ` J. Bruce Fields
2012-04-18 0:56 ` Linus Torvalds
2012-04-18 21:52 ` J. Bruce Fields
2012-04-25 15:20 ` J. Bruce Fields
2012-04-20 11:15 ` Jan Kara
2012-04-24 19:52 ` J. Bruce Fields
2012-04-24 22:23 ` Jan Kara
2012-04-25 11:29 ` J. Bruce Fields
2012-04-25 16:26 ` Jan Kara
2012-04-25 16:47 ` Steven Whitehouse
2012-04-25 17:11 ` J. Bruce Fields
2012-04-18 0:47 ` J. Bruce Fields
2012-04-19 3:23 ` Benjamin Herrenschmidt
2012-04-19 14:50 ` Ted Ts'o
2012-04-24 17:40 ` Greg KH
2012-04-24 17:45 ` Al Viro
2012-04-24 17:59 ` Greg KH
2012-04-24 18:04 ` Al Viro
2012-04-24 20:37 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).