* [no patch] broken use of mm_release / deactivate_mm
@ 2004-09-13 19:06 Andries Brouwer
2004-09-13 19:30 ` Linus Torvalds
0 siblings, 1 reply; 6+ messages in thread
From: Andries Brouwer @ 2004-09-13 19:06 UTC (permalink / raw)
To: torvalds, akpm; +Cc: linux-kernel
Recent kernels have a bug in fork(). Things can be improved a bit
by commenting out the lines
/* Get rid of any cached register state */
deactivate_mm(tsk, mm);
in fork.c:mm_release().
What happens at a fork, is that a long sequence of things is done,
and if a failure occurs all previous things are undone. Thus
(in copy_process()):
if ((retval = copy_mm(clone_flags, p)))
goto bad_fork_cleanup_signal;
if ((retval = copy_namespace(clone_flags, p)))
goto bad_fork_cleanup_mm;
retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
if (retval)
goto bad_fork_cleanup_namespace;
...
bad_fork_cleanup_namespace:
exit_namespace(p);
bad_fork_cleanup_mm:
exit_mm(p);
if (p->active_mm)
mmdrop(p->active_mm);
bad_fork_cleanup_signal:
...
Thus, we may do exit_mm() when an attempted fork fails.
The argument of exit_mm() is this new, not completeley initialized
task_struct.
Now exit.c:exit_mm(p) does mm_release(p,p->mm), and this
mm_release() does deactivate_mm(), a macro that clears %fs and %gs.
Ach. A segfault is the result.
More is wrong with mm_release(). It examines p->clear_child_tid,
and possibly does put_user(0, tidptr);.
Oops.
In our case p->clear_child_tid had not yet been initialized for
the child, that happens in copy_thread() that we never reached.
So this is the value the parent had.
Also the call
enter_lazy_tlb(mm, current);
seems strange in this context.
It seems to me that the proper action is some reshuffling
of this code. Maybe it is cleanest to separate the cleanup
code for a failed fork entirely from the code for an exiting
process.
Andries
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [no patch] broken use of mm_release / deactivate_mm
2004-09-13 19:06 [no patch] broken use of mm_release / deactivate_mm Andries Brouwer
@ 2004-09-13 19:30 ` Linus Torvalds
2004-09-14 12:41 ` Nick Piggin
0 siblings, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2004-09-13 19:30 UTC (permalink / raw)
To: Andries Brouwer
Cc: Andrew Morton, Kernel Mailing List, Ingo Molnar, Nick Piggin
On Mon, 13 Sep 2004, Andries Brouwer wrote:
>
> What happens at a fork, is that a long sequence of things is done,
> and if a failure occurs all previous things are undone. Thus
> (in copy_process()):
>
> if ((retval = copy_mm(clone_flags, p)))
> goto bad_fork_cleanup_signal;
> if ((retval = copy_namespace(clone_flags, p)))
> goto bad_fork_cleanup_mm;
> retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
> if (retval)
> goto bad_fork_cleanup_namespace;
>
> ...
>
> bad_fork_cleanup_namespace:
> exit_namespace(p);
> bad_fork_cleanup_mm:
> exit_mm(p);
> if (p->active_mm)
> mmdrop(p->active_mm);
I agree. Looks like the "exit_mm()" should really be a "mmput()".
Can we have a few more eyes on this thing? Ingo, Nick?
Linus
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [no patch] broken use of mm_release / deactivate_mm
2004-09-13 19:30 ` Linus Torvalds
@ 2004-09-14 12:41 ` Nick Piggin
2004-09-14 15:06 ` Linus Torvalds
0 siblings, 1 reply; 6+ messages in thread
From: Nick Piggin @ 2004-09-14 12:41 UTC (permalink / raw)
To: Linus Torvalds
Cc: Andries Brouwer, Andrew Morton, Kernel Mailing List, Ingo Molnar
Linus Torvalds wrote:
>
> On Mon, 13 Sep 2004, Andries Brouwer wrote:
>
>>What happens at a fork, is that a long sequence of things is done,
>>and if a failure occurs all previous things are undone. Thus
>>(in copy_process()):
>>
>> if ((retval = copy_mm(clone_flags, p)))
>> goto bad_fork_cleanup_signal;
>> if ((retval = copy_namespace(clone_flags, p)))
>> goto bad_fork_cleanup_mm;
>> retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
>> if (retval)
>> goto bad_fork_cleanup_namespace;
>>
>>...
>>
>>bad_fork_cleanup_namespace:
>> exit_namespace(p);
>>bad_fork_cleanup_mm:
>> exit_mm(p);
>> if (p->active_mm)
>> mmdrop(p->active_mm);
>
>
> I agree. Looks like the "exit_mm()" should really be a "mmput()".
>
> Can we have a few more eyes on this thing? Ingo, Nick?
>
AFAIKS yes. exit_mm doesn't look legal unless its dropping the current
mm context. And mmput looks like it should clean up everything - it is
used almost exactly the same way to cleanup a failure case in copy_mm.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [no patch] broken use of mm_release / deactivate_mm
2004-09-14 12:41 ` Nick Piggin
@ 2004-09-14 15:06 ` Linus Torvalds
2004-09-14 17:00 ` Herbert Poetzl
2004-09-14 23:21 ` Andries Brouwer
0 siblings, 2 replies; 6+ messages in thread
From: Linus Torvalds @ 2004-09-14 15:06 UTC (permalink / raw)
To: Nick Piggin
Cc: Andries Brouwer, Andrew Morton, Kernel Mailing List, Ingo Molnar
On Tue, 14 Sep 2004, Nick Piggin wrote:
> >
> > I agree. Looks like the "exit_mm()" should really be a "mmput()".
> >
> > Can we have a few more eyes on this thing? Ingo, Nick?
>
> AFAIKS yes. exit_mm doesn't look legal unless its dropping the current
> mm context. And mmput looks like it should clean up everything - it is
> used almost exactly the same way to cleanup a failure case in copy_mm.
Does everybody also agree that the
if (p->active_mm)
mmdrop(p->active_mm);
should also be dropped, and that mmput() does all of that correctly too?
(Again, looking at all the counts etc, I think the answer is a resounding
yes, but dammit, this code has obviously never gotten any testing at all,
since it effectively never happens).
Linus
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [no patch] broken use of mm_release / deactivate_mm
2004-09-14 15:06 ` Linus Torvalds
@ 2004-09-14 17:00 ` Herbert Poetzl
2004-09-14 23:21 ` Andries Brouwer
1 sibling, 0 replies; 6+ messages in thread
From: Herbert Poetzl @ 2004-09-14 17:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nick Piggin, Andries Brouwer, Andrew Morton, Kernel Mailing List,
Ingo Molnar
On Tue, Sep 14, 2004 at 08:06:14AM -0700, Linus Torvalds wrote:
>
>
> On Tue, 14 Sep 2004, Nick Piggin wrote:
> > >
> > > I agree. Looks like the "exit_mm()" should really be a "mmput()".
> > >
> > > Can we have a few more eyes on this thing? Ingo, Nick?
> >
> > AFAIKS yes. exit_mm doesn't look legal unless its dropping the current
> > mm context. And mmput looks like it should clean up everything - it is
> > used almost exactly the same way to cleanup a failure case in copy_mm.
>
> Does everybody also agree that the
>
> if (p->active_mm)
> mmdrop(p->active_mm);
>
> should also be dropped, and that mmput() does all of that correctly too?
>
> (Again, looking at all the counts etc, I think the answer is a resounding
> yes, but dammit, this code has obviously never gotten any testing at all,
> since it effectively never happens).
IIRC, linux-vserver did hit it once or twice
but I wasn't sure at that time that it isn't my
fault ...
this 'error path' was used by the memory limiting
code, so it's probably easy to test with minor
adjustments ...
best,
Herbert
> Linus
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [no patch] broken use of mm_release / deactivate_mm
2004-09-14 15:06 ` Linus Torvalds
2004-09-14 17:00 ` Herbert Poetzl
@ 2004-09-14 23:21 ` Andries Brouwer
1 sibling, 0 replies; 6+ messages in thread
From: Andries Brouwer @ 2004-09-14 23:21 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nick Piggin, Andries Brouwer, Andrew Morton, Kernel Mailing List,
Ingo Molnar
On Tue, Sep 14, 2004 at 08:06:14AM -0700, Linus Torvalds wrote:
> Does everybody also agree that ... mmput() does all of that correctly too?
I think so, but do not have time to check all details.
Now the first parameter of mm_release() always is current,
so that this parameter is superfluous. Similarly, the only
parameter of exit_mm() always is current, and hence is superfluous.
Maybe it is a good idea to remove the pretense that these routines
might do something useful for general parameters, now that
deactivate_mm() does sneaky global things.
Andries
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2004-09-14 23:24 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-09-13 19:06 [no patch] broken use of mm_release / deactivate_mm Andries Brouwer
2004-09-13 19:30 ` Linus Torvalds
2004-09-14 12:41 ` Nick Piggin
2004-09-14 15:06 ` Linus Torvalds
2004-09-14 17:00 ` Herbert Poetzl
2004-09-14 23:21 ` Andries Brouwer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox