* kjournald keeps reference to namespace
@ 2006-02-18 1:35 Herbert Poetzl
2006-02-18 1:54 ` Andrew Morton
2006-02-18 13:36 ` Björn Steinbrink
0 siblings, 2 replies; 10+ messages in thread
From: Herbert Poetzl @ 2006-02-18 1:35 UTC (permalink / raw)
To: Andrew Morton, Al Viro; +Cc: Linux Kernel ML
Hi Folks!
when creating a private namespace (CLONE_NS) and
then mounting an ext3 filesystem, a new kernel
thread (kjournald) is created, which keeps a
reference to the namespace, which after the the
process exits, remains and blocks access to the
block device, as it is still bd_claim-ed.
this leaves a private namespace behind and a
block device which cannot be opened exclusively.
unmount is not an option, as the namespace is
not longer reachable.
this behaviour seems to be there since ever,
well since namespaces and kjournald exists :)
the following 'cruel' hack 'solves' this issue
best,
Herbert
--- fs/jbd/journal.c.orig 2006-01-03 17:29:56 +0100
+++ fs/jbd/journal.c 2006-02-18 02:23:21 +0100
@@ -33,6 +33,7 @@
#include <linux/mm.h>
#include <linux/suspend.h>
#include <linux/pagemap.h>
+#include <linux/namespace.h>
#include <asm/uaccess.h>
#include <asm/page.h>
#include <linux/proc_fs.h>
@@ -116,6 +117,13 @@ static int kjournald(void *arg)
struct timer_list timer;
daemonize("kjournald");
+ {
+ struct namespace *ns = current->namespace;
+
+ current->namespace = NULL;
+ put_namespace(ns);
+ }
+
/* Set up an interval timer which can be used to trigger a
commit wakeup after the commit interval expires */
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: kjournald keeps reference to namespace 2006-02-18 1:35 kjournald keeps reference to namespace Herbert Poetzl @ 2006-02-18 1:54 ` Andrew Morton 2006-02-18 3:30 ` Herbert Poetzl 2006-02-18 13:36 ` Björn Steinbrink 1 sibling, 1 reply; 10+ messages in thread From: Andrew Morton @ 2006-02-18 1:54 UTC (permalink / raw) To: Herbert Poetzl; +Cc: viro, linux-kernel Herbert Poetzl <herbert@13thfloor.at> wrote: > > > Hi Folks! > > when creating a private namespace (CLONE_NS) and > then mounting an ext3 filesystem, a new kernel > thread (kjournald) is created, which keeps a > reference to the namespace, which after the the > process exits, remains and blocks access to the > block device, as it is still bd_claim-ed. There are numerous ways in which user processes can parent kernel threads. bix:/usr/src/linux-2.6.16-rc4> grep -rl kernel_thread drivers net fs | wc 64 64 1657 > this leaves a private namespace behind and a > block device which cannot be opened exclusively. > unmount is not an option, as the namespace is > not longer reachable. > > this behaviour seems to be there since ever, > well since namespaces and kjournald exists :) > > the following 'cruel' hack 'solves' this issue > > best, > Herbert > > > --- fs/jbd/journal.c.orig 2006-01-03 17:29:56 +0100 > +++ fs/jbd/journal.c 2006-02-18 02:23:21 +0100 > @@ -33,6 +33,7 @@ > #include <linux/mm.h> > #include <linux/suspend.h> > #include <linux/pagemap.h> > +#include <linux/namespace.h> > #include <asm/uaccess.h> > #include <asm/page.h> > #include <linux/proc_fs.h> > @@ -116,6 +117,13 @@ static int kjournald(void *arg) > struct timer_list timer; > > daemonize("kjournald"); > + { > + struct namespace *ns = current->namespace; > + > + current->namespace = NULL; > + put_namespace(ns); > + } > + > I think it'd be better to convert ext3 to use the kthread API which appears to accidentally not have this problem, because such threads are parented by keventd, which were parented by init. That being said, perhaps we should do a put_namespace() in kernel_thread(), too. I'm kinda surprised that your patch didn't oops over a NULL ->namespace when the kernel internally mounted the root filesystem. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-18 1:54 ` Andrew Morton @ 2006-02-18 3:30 ` Herbert Poetzl 2006-02-24 21:28 ` Paul Collins 0 siblings, 1 reply; 10+ messages in thread From: Herbert Poetzl @ 2006-02-18 3:30 UTC (permalink / raw) To: Andrew Morton; +Cc: viro, linux-kernel On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote: > Herbert Poetzl <herbert@13thfloor.at> wrote: > > > > > > Hi Folks! > > > > when creating a private namespace (CLONE_NS) and > > then mounting an ext3 filesystem, a new kernel > > thread (kjournald) is created, which keeps a > > reference to the namespace, which after the the > > process exits, remains and blocks access to the > > block device, as it is still bd_claim-ed. > > There are numerous ways in which user processes can parent kernel threads. > > bix:/usr/src/linux-2.6.16-rc4> grep -rl kernel_thread drivers net fs | wc > 64 64 1657 > > > this leaves a private namespace behind and a > > block device which cannot be opened exclusively. > > unmount is not an option, as the namespace is > > not longer reachable. > > > > this behaviour seems to be there since ever, > > well since namespaces and kjournald exists :) > > > > the following 'cruel' hack 'solves' this issue > > > > best, > > Herbert > > > > > > --- fs/jbd/journal.c.orig 2006-01-03 17:29:56 +0100 > > +++ fs/jbd/journal.c 2006-02-18 02:23:21 +0100 > > @@ -33,6 +33,7 @@ > > #include <linux/mm.h> > > #include <linux/suspend.h> > > #include <linux/pagemap.h> > > +#include <linux/namespace.h> > > #include <asm/uaccess.h> > > #include <asm/page.h> > > #include <linux/proc_fs.h> > > @@ -116,6 +117,13 @@ static int kjournald(void *arg) > > struct timer_list timer; > > > > daemonize("kjournald"); > > + { > > + struct namespace *ns = current->namespace; > > + > > + current->namespace = NULL; > > + put_namespace(ns); > > + } > > + > > > > I think it'd be better to convert ext3 to use the kthread API which > appears to accidentally not have this problem, because such threads > are parented by keventd, which were parented by init. sounds like a plan! > That being said, perhaps we should do a put_namespace() in > kernel_thread(), too. hmm, keep the reference but put_namespace()? > I'm kinda surprised that your patch didn't oops over a NULL > ->namespace when the kernel internally mounted the root filesystem. nope, booted just fine, but I was worried too, the interesting detail is that the kjournald thread vanishes with this 'hack' when the namespace is disposed, which doesn't happen without it ... anyway, will look into it, of course, input is welcome ... best, Herbert ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-18 3:30 ` Herbert Poetzl @ 2006-02-24 21:28 ` Paul Collins 2006-02-24 21:36 ` Andrew Morton 0 siblings, 1 reply; 10+ messages in thread From: Paul Collins @ 2006-02-24 21:28 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel Herbert Poetzl <herbert@13thfloor.at> writes: > On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote: >> I think it'd be better to convert ext3 to use the kthread API which >> appears to accidentally not have this problem, because such threads >> are parented by keventd, which were parented by init. > > sounds like a plan! Here's my attempt at such a conversion. Since jbd doesn't seem to want to collect an exit status, I didn't bother with kthread_stop(). I got overexcited and also embedded the journal device in the process name, but that's probably useless churn. Looks nice in pstree though: |-kthread-+-kblockd/0 | |-khubd | |-2*[pdflush] | |-aio/0 | |-v9fs/0 | |-cqueue/0 | |-kfand | |-kcryptd/0 | |-kjournald/3:3 | |-kjournald/3:8 | |-kjournald/3:4 | |-kjournald/3:5 | `-kjournald/254:1 Signed-off-by: Paul Collins <paul@ondioline.org> diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c index e4b516a..e33d993 100644 --- a/fs/jbd/journal.c +++ b/fs/jbd/journal.c @@ -33,6 +33,8 @@ #include <linux/mm.h> #include <linux/suspend.h> #include <linux/pagemap.h> +#include <linux/kthread.h> +#include <linux/kdev_t.h> #include <asm/uaccess.h> #include <asm/page.h> #include <linux/proc_fs.h> @@ -115,8 +117,6 @@ static int kjournald(void *arg) transaction_t *transaction; struct timer_list timer; - daemonize("kjournald"); - /* Set up an interval timer which can be used to trigger a commit wakeup after the commit interval expires */ init_timer(&timer); @@ -207,12 +207,14 @@ end_loop: journal->j_task = NULL; wake_up(&journal->j_wait_done_commit); jbd_debug(1, "Journal thread exiting.\n"); - return 0; + do_exit(0); } static void journal_start_thread(journal_t *journal) { - kernel_thread(kjournald, journal, CLONE_VM|CLONE_FS|CLONE_FILES); + dev_t jdev = journal->j_dev->bd_dev; + kthread_run(kjournald, journal, "kjournald/%d:%d", + MAJOR(jdev), MINOR(jdev), NULL); wait_event(journal->j_wait_done_commit, journal->j_task != 0); } -- Dag vijandelijk luchtschip de huismeester is dood ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-24 21:28 ` Paul Collins @ 2006-02-24 21:36 ` Andrew Morton 2006-02-24 22:01 ` Paul Collins 0 siblings, 1 reply; 10+ messages in thread From: Andrew Morton @ 2006-02-24 21:36 UTC (permalink / raw) To: Paul Collins; +Cc: linux-kernel Paul Collins <paul@briny.ondioline.org> wrote: > > Herbert Poetzl <herbert@13thfloor.at> writes: > > > On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote: > >> I think it'd be better to convert ext3 to use the kthread API which > >> appears to accidentally not have this problem, because such threads > >> are parented by keventd, which were parented by init. > > > > sounds like a plan! > > Here's my attempt at such a conversion. Since jbd doesn't seem to > want to collect an exit status, I didn't bother with kthread_stop(). Ah. I already did something similar. ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc4/2.6.16-rc4-mm2/broken-out/jbd-convert-kjournald-to-kthread-api.patch > I got overexcited and also embedded the journal device in the process > name, but that's probably useless churn. Looks nice in pstree though: > > |-kthread-+-kblockd/0 > | |-khubd > | |-2*[pdflush] > | |-aio/0 > | |-v9fs/0 > | |-cqueue/0 > | |-kfand > | |-kcryptd/0 > | |-kjournald/3:3 > | |-kjournald/3:8 > | |-kjournald/3:4 > | |-kjournald/3:5 > | `-kjournald/254:1 We only have 15 chars for that string - the final one you have there is on the raggedy edge. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-24 21:36 ` Andrew Morton @ 2006-02-24 22:01 ` Paul Collins 0 siblings, 0 replies; 10+ messages in thread From: Paul Collins @ 2006-02-24 22:01 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel Andrew Morton <akpm@osdl.org> writes: > Paul Collins <paul@briny.ondioline.org> wrote: >> >> Herbert Poetzl <herbert@13thfloor.at> writes: >> >> > On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote: >> >> I think it'd be better to convert ext3 to use the kthread API which >> >> appears to accidentally not have this problem, because such threads >> >> are parented by keventd, which were parented by init. >> > >> > sounds like a plan! >> >> Here's my attempt at such a conversion. Since jbd doesn't seem to >> want to collect an exit status, I didn't bother with kthread_stop(). > > Ah. I already did something similar. > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc4/2.6.16-rc4-mm2/broken-out/jbd-convert-kjournald-to-kthread-api.patch Sweet, I managed to get it right at least. >> I got overexcited and also embedded the journal device in the process >> name, but that's probably useless churn. Looks nice in pstree though: >> >> | `-kjournald/254:1 > > We only have 15 chars for that string - the final one you have there is on > the raggedy edge. Oh well, it's not that useful anyway. I can count the number of times I've needed to kill a particular kjournald on the fingers of one head. -- Dag vijandelijk luchtschip de huismeester is dood ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-18 1:35 kjournald keeps reference to namespace Herbert Poetzl 2006-02-18 1:54 ` Andrew Morton @ 2006-02-18 13:36 ` Björn Steinbrink 2006-02-18 16:32 ` Björn Steinbrink 1 sibling, 1 reply; 10+ messages in thread From: Björn Steinbrink @ 2006-02-18 13:36 UTC (permalink / raw) To: herbert; +Cc: akpm, viro, linux-kernel On 2006.02.18 02:35:47 +0100, Herbert Poetzl wrote: > > Hi Folks! > > when creating a private namespace (CLONE_NS) and > then mounting an ext3 filesystem, a new kernel > thread (kjournald) is created, which keeps a > reference to the namespace, which after the the > process exits, remains and blocks access to the > block device, as it is still bd_claim-ed. > > this leaves a private namespace behind and a > block device which cannot be opened exclusively. > unmount is not an option, as the namespace is > not longer reachable. > > this behaviour seems to be there since ever, > well since namespaces and kjournald exists :) > > the following 'cruel' hack 'solves' this issue In daemonize() a new thread gets cleaned up and 'merged' with init_task. The current fs_struct is handled there, but not the current namespace. The following patch adds the namespace part. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> --- diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c linux-2.6.16-rc4-ns/kernel/exit.c --- linux-2.6.16-rc4/kernel/exit.c 2006-02-18 13:59:59.000000000 +0100 +++ linux-2.6.16-rc4-ns/kernel/exit.c 2006-02-18 14:04:26.000000000 +0100 @@ -360,6 +360,8 @@ void daemonize(const char *name, ...) fs = init_task.fs; current->fs = fs; atomic_inc(&fs->count); + exit_namespace(current); + current->namespace = init_task.namespace; exit_files(current); current->files = init_task.files; atomic_inc(¤t->files->count); ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-18 13:36 ` Björn Steinbrink @ 2006-02-18 16:32 ` Björn Steinbrink 2006-02-18 17:12 ` Björn Steinbrink 0 siblings, 1 reply; 10+ messages in thread From: Björn Steinbrink @ 2006-02-18 16:32 UTC (permalink / raw) To: herbert, akpm, viro, linux-kernel > In daemonize() a new thread gets cleaned up and 'merged' with init_task. > The current fs_struct is handled there, but not the current namespace. > The following patch adds the namespace part. > > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> > --- Oops, forgot the increment the namespace usage count... --- diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c linux-2.6.16-rc4-ns/kernel/exit.c --- linux-2.6.16-rc4/kernel/exit.c 2006-02-18 13:59:59.000000000 +0100 +++ linux-2.6.16-rc4-ns/kernel/exit.c 2006-02-18 17:27:48.000000000 +0100 @@ -360,6 +360,9 @@ void daemonize(const char *name, ...) fs = init_task.fs; current->fs = fs; atomic_inc(&fs->count); + exit_namespace(current); + current->namespace = init_task.namespace; + atomic_inc(¤t->namespace->count); exit_files(current); current->files = init_task.files; atomic_inc(¤t->files->count); ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-18 16:32 ` Björn Steinbrink @ 2006-02-18 17:12 ` Björn Steinbrink 2006-02-19 2:32 ` Eric W. Biederman 0 siblings, 1 reply; 10+ messages in thread From: Björn Steinbrink @ 2006-02-18 17:12 UTC (permalink / raw) To: herbert, akpm, viro, linux-kernel; +Cc: ebiederm, torvalds On 2006.02.18 17:32:27 +0100, Björn Steinbrink wrote: > > In daemonize() a new thread gets cleaned up and 'merged' with init_task. > > The current fs_struct is handled there, but not the current namespace. > > The following patch adds the namespace part. > > > > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> > > --- > > Oops, forgot the increment the namespace usage count... Ok, this time with the get_namespace wrapper, thanks to Eric Biederman for pointing that out to me. --- diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c linux-2.6.16-rc4-ns/kernel/exit.c --- linux-2.6.16-rc4/kernel/exit.c 2006-02-18 13:59:59.000000000 +0100 +++ linux-2.6.16-rc4-ns/kernel/exit.c 2006-02-18 17:57:21.000000000 +0100 @@ -360,6 +360,9 @@ void daemonize(const char *name, ...) fs = init_task.fs; current->fs = fs; atomic_inc(&fs->count); + exit_namespace(current); + current->namespace = init_task.namespace; + get_namespace(current->namespace); exit_files(current); current->files = init_task.files; atomic_inc(¤t->files->count); ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: kjournald keeps reference to namespace 2006-02-18 17:12 ` Björn Steinbrink @ 2006-02-19 2:32 ` Eric W. Biederman 0 siblings, 0 replies; 10+ messages in thread From: Eric W. Biederman @ 2006-02-19 2:32 UTC (permalink / raw) To: Björn Steinbrink; +Cc: herbert, akpm, viro, linux-kernel, torvalds Björn Steinbrink <B.Steinbrink@gmx.de> writes: > On 2006.02.18 17:32:27 +0100, Björn Steinbrink wrote: >> > In daemonize() a new thread gets cleaned up and 'merged' with init_task. >> > The current fs_struct is handled there, but not the current namespace. >> > The following patch adds the namespace part. >> > >> > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> >> > --- >> >> Oops, forgot the increment the namespace usage count... > > Ok, this time with the get_namespace wrapper, thanks to Eric Biederman > for pointing that out to me. Acked-by: Eric Biederman <ebiederm@xmission.com> Note we can't ever count on using our parents namespace because we already have called exit_fs(), which is the only way to the namespace from a process. > --- > > > diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c > linux-2.6.16-rc4-ns/kernel/exit.c > --- linux-2.6.16-rc4/kernel/exit.c 2006-02-18 13:59:59.000000000 +0100 > +++ linux-2.6.16-rc4-ns/kernel/exit.c 2006-02-18 17:57:21.000000000 +0100 > @@ -360,6 +360,9 @@ void daemonize(const char *name, ...) > fs = init_task.fs; > current->fs = fs; > atomic_inc(&fs->count); > + exit_namespace(current); > + current->namespace = init_task.namespace; > + get_namespace(current->namespace); > exit_files(current); > current->files = init_task.files; > atomic_inc(¤t->files->count); ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2006-02-24 22:02 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-02-18 1:35 kjournald keeps reference to namespace Herbert Poetzl 2006-02-18 1:54 ` Andrew Morton 2006-02-18 3:30 ` Herbert Poetzl 2006-02-24 21:28 ` Paul Collins 2006-02-24 21:36 ` Andrew Morton 2006-02-24 22:01 ` Paul Collins 2006-02-18 13:36 ` Björn Steinbrink 2006-02-18 16:32 ` Björn Steinbrink 2006-02-18 17:12 ` Björn Steinbrink 2006-02-19 2:32 ` Eric W. Biederman
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.