* Re: [RFC] breakage in sysfs_readdir() and s_instances abuse in sysfs
@ 2011-06-04 22:25 Al Viro
2011-06-05 10:37 ` [PATCH] get_net_ns_by_fd() oopses if proc_ns_fget() returns an error Al Viro
0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2011-06-04 22:25 UTC (permalink / raw)
To: netdev; +Cc: Eric W. Biederman, linux-fsdevel, Linus Torvalds
[Forwarded to netdev]
On Sat, Jun 04, 2011 at 01:15:19AM +0100, Al Viro wrote:
> Granted, sysfs_dir_pos() doesn't dereference ns; however, it does compare
> with it. And ns might have been freed and reused by that point.
>
> I don't like what's going on there; that code looks inherently racy.
> We never set ->ns[...] non-NULL outside of mount(). So it looks like
> the intended behaviour is to have all ns-specific entries of that type
> disappear forever from that fs instance. Having entries for _another_
> namespace to show up for a (short) while, and that only in readdir()
> (lookup runs completely under sysfs_mutex, so we don't have that race
> there)...
Eeep... We do not have a *race* in lookup. However, any lookup done
after that ->ns[...] = NULL is going to do the following:
mutex_lock(&sysfs_mutex);
type = sysfs_ns_type(parent_sd);
ns = sysfs_info(dir->i_sb)->ns[type];
/* i.e. ns = NULL */
sd = sysfs_find_dirent(parent_sd, ns, dentry->d_name.name);
which will do rather unpleasant things:
for (sd = parent_sd->s_dir.children; sd; sd = sd->s_sibling) {
if (ns && sd->s_ns && (sd->s_ns != ns))
continue;
if (!strcmp(sd->s_name, name))
return sd;
}
i.e. with ns == NULL it will outright ignore sd->s_ns and look for name
match and nothing else. Any sysfs node with that name will do, whatever
it might have in ->s_ns. E.g. for lookups in /sys/class/net it will find
the first sysfs node of network interface with that name in _some_ namespace.
Back to sysfs_lookup():
/* no such entry */
if (!sd) {
ret = ERR_PTR(-ENOENT);
goto out_unlock;
}
/* attach dentry and inode */
inode = sysfs_get_inode(dir->i_sb, sd);
if (!inode) {
ret = ERR_PTR(-ENOMEM);
goto out_unlock;
}
and if there was an entry from some surviving net_ns with that name,
sysfs_get_inode() will cheerfully allocate an inode for us and associate
it with that sd. Then we complete the lookup and return a shiny positive
dentry to caller...
Just what is that code supposed to do? Somehow I doubt that "after net_ns
is gone, lookups for class/net/name succeed when there is an interface called
name in at least one net_ns; resulting object refers to one of such
interfaces, with no promises regarding which net_ns it is about" is the
intended behaviour here, especially since readdir() called after that
will skip all sysfs nodes with non-NULL ->s_ns, i.e. show empty class/net.
Frankly, my preference would be to kill the void * nonsense, introduce a
structure we'll embed into struct net (and all future tags) containing
refcount and "it is doomed, skip it" flag. Purely passive refs - i.e. all
they guarantee is that memory won't be freed under you. Having *active* refs
(i.e. your current net->count being non-zero) contributes 1 to that counter,
kfree() is done when that counter reaches zero. With info->ns[] contributing
to passive refcount and exit_ns logics tagging the sucker as doomed and leaving
it at that. kill_sb() would drop references, lookup and readdir check if
ns is doomed and skip the entry in that case.
However, it's your code and I don't know in which direction do you plan to take
it. IMO it's badly overdesigned...
Are you OK with the sketch above? I can probably cook a patch along those
lines by tomorrow...
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] get_net_ns_by_fd() oopses if proc_ns_fget() returns an error
2011-06-04 22:25 [RFC] breakage in sysfs_readdir() and s_instances abuse in sysfs Al Viro
@ 2011-06-05 10:37 ` Al Viro
2011-06-05 10:54 ` [PATCH] fix return values of l2tp_dfs_seq_open() Al Viro
0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2011-06-05 10:37 UTC (permalink / raw)
To: netdev; +Cc: Eric W. Biederman, linux-fsdevel, Linus Torvalds
BTW, looking through the code related to struct net lifetime rules has
caught something else:
struct net *get_net_ns_by_fd(int fd)
{
...
file = proc_ns_fget(fd);
if (!file)
goto out;
ei = PROC_I(file->f_dentry->d_inode);
while in proc_ns_fget() we have two return ERR_PTR(...) and not a single
path that would return NULL. The other caller of proc_ns_fget() treats
ERR_PTR() correctly...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
[I don't know which tree should that go through; I'm throwing that into vfs-2.6
#for-linus, but if networking folks prefer that to go through their tree...]
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 6c6b86d..e41e511 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -310,19 +310,17 @@ struct net *get_net_ns_by_fd(int fd)
struct file *file;
struct net *net;
- net = ERR_PTR(-EINVAL);
file = proc_ns_fget(fd);
- if (!file)
- goto out;
+ if (IS_ERR(file))
+ return ERR_CAST(file);
ei = PROC_I(file->f_dentry->d_inode);
- if (ei->ns_ops != &netns_operations)
- goto out;
+ if (ei->ns_ops == &netns_operations)
+ net = get_net(ei->ns);
+ else
+ net = ERR_PTR(-EINVAL);
- net = get_net(ei->ns);
-out:
- if (file)
- fput(file);
+ fput(file);
return net;
}
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH] fix return values of l2tp_dfs_seq_open()
2011-06-05 10:37 ` [PATCH] get_net_ns_by_fd() oopses if proc_ns_fget() returns an error Al Viro
@ 2011-06-05 10:54 ` Al Viro
2011-06-05 11:05 ` James Chapman
2011-06-05 21:11 ` David Miller
0 siblings, 2 replies; 5+ messages in thread
From: Al Viro @ 2011-06-05 10:54 UTC (permalink / raw)
To: netdev; +Cc: James Chapman
More fallout from struct net lifetime rules review: PTR_ERR() is *already*
negative and failing ->open() should return negatives on failure.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c
index b8dbae8..7613013 100644
--- a/net/l2tp/l2tp_debugfs.c
+++ b/net/l2tp/l2tp_debugfs.c
@@ -258,7 +258,7 @@ static int l2tp_dfs_seq_open(struct inode *inode, struct file *file)
*/
pd->net = get_net_ns_by_pid(current->pid);
if (IS_ERR(pd->net)) {
- rc = -PTR_ERR(pd->net);
+ rc = PTR_ERR(pd->net);
goto err_free_pd;
}
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] fix return values of l2tp_dfs_seq_open()
2011-06-05 10:54 ` [PATCH] fix return values of l2tp_dfs_seq_open() Al Viro
@ 2011-06-05 11:05 ` James Chapman
2011-06-05 21:11 ` David Miller
1 sibling, 0 replies; 5+ messages in thread
From: James Chapman @ 2011-06-05 11:05 UTC (permalink / raw)
To: Al Viro; +Cc: netdev
On 05/06/2011 11:54, Al Viro wrote:
> More fallout from struct net lifetime rules review: PTR_ERR() is *already*
> negative and failing ->open() should return negatives on failure.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c
> index b8dbae8..7613013 100644
> --- a/net/l2tp/l2tp_debugfs.c
> +++ b/net/l2tp/l2tp_debugfs.c
> @@ -258,7 +258,7 @@ static int l2tp_dfs_seq_open(struct inode *inode, struct file *file)
> */
> pd->net = get_net_ns_by_pid(current->pid);
> if (IS_ERR(pd->net)) {
> - rc = -PTR_ERR(pd->net);
> + rc = PTR_ERR(pd->net);
> goto err_free_pd;
> }
>
Whoops. Thanks Al.
Signed-off-by: James Chapman <jchapman@katalix.com>
--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix return values of l2tp_dfs_seq_open()
2011-06-05 10:54 ` [PATCH] fix return values of l2tp_dfs_seq_open() Al Viro
2011-06-05 11:05 ` James Chapman
@ 2011-06-05 21:11 ` David Miller
1 sibling, 0 replies; 5+ messages in thread
From: David Miller @ 2011-06-05 21:11 UTC (permalink / raw)
To: viro; +Cc: netdev, jchapman
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Sun, 5 Jun 2011 11:54:03 +0100
> More fallout from struct net lifetime rules review: PTR_ERR() is *already*
> negative and failing ->open() should return negatives on failure.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al, I applied this and your get_net_ns_by_fd() fix to my net-2.6
tree.
Thanks!
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2011-06-05 21:11 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-04 22:25 [RFC] breakage in sysfs_readdir() and s_instances abuse in sysfs Al Viro
2011-06-05 10:37 ` [PATCH] get_net_ns_by_fd() oopses if proc_ns_fget() returns an error Al Viro
2011-06-05 10:54 ` [PATCH] fix return values of l2tp_dfs_seq_open() Al Viro
2011-06-05 11:05 ` James Chapman
2011-06-05 21:11 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).