From: Oleg Nesterov <oleg@redhat.com>
To: Roland McGrath <roland@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm <linux-mm@kvack.org>,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Nick Piggin <npiggin@suse.de>
Subject: Re: uninterruptible CLONE_VFORK (Was: oom: Make coredump interruptible)
Date: Mon, 14 Jun 2010 18:33:05 +0200
Message-ID: <20100614163304.GA21313@redhat.com>
In-Reply-To: <20100614005608.0D006408C1@magilla.sf.frob.com>
On 06/13, Roland McGrath wrote:
>
> > Oh. And another problem, vfork() is not interruptible too. This means
> > that the user can hide the memory hog from oom-killer.
>
> I'm not sure there is really any danger like that, because of the
> oom_kill_process "Try to kill a child first" logic.
But note that oom_kill_process() doesn't kill the children with the
same ->mm. I never understood this code.
Anyway I agree. Even if I am right, this is not a very serious problem
from oom-kill pov. To me, the uninterruptible CLONE_VFORK is bad by
itself.
> > But let's forget about oom.
>
> Sure, but it reminds me to mention that vfork mm sharing is another reason
> that having oom_kill set some persistent state in the mm seems wrong.
Yes, yes, this was already discussed a bit. Only if the core dump is in
progress we can touch ->mm or (probably better but needs a bit more locking)
mm->core_state to signal the coredumping thread and (perhaps) for something
else.
> > Roland, any reason it should be uninterruptible? This doesn't look good
> > in any case. Perhaps the pseudo-patch below makes sense?
>
> I've long thought that we should make a vfork parent SIGKILL-able.
Good ;)
> (Of
> course the vfork wait can't be made interruptible by other signals, since
> it must never do anything userish
Yes sure. That is why wait_for_completion_killable(), not _interruptible.
But I assume you didn't mean that only SIGKILL should interrupt the
parent, any sig_fatal() signal should.
> I don't know off hand of any problem with your
> straightforward change. But I don't have much confidence that there isn't
> any strange gotcha waiting there due to some other kind of implicit
> assumption about vfork parent blocks that we are overlooking at the moment.
> So I wouldn't change this without more thorough auditing and thinking about
> everything related to vfork.
Agreed. This needs auditing. And CLONE_VFORK can be used with/without all
other CLONE_ flags... Probably we should mostly worry about vfork ==
CLONE_VM | CLONE_VFORK case.
Anyway. ->vfork_done is per-thread. This means that without any changes
do_fork(CLONE_VFORK) can return (to user-mode) before the child's thread
group exits/execs. Perhaps this means we shouldn't worry too much.
> Personally, what I've really been interested in is changing the vfork wait
> to use some different kind of blocking entirely. My real motivation for
> that is to let a vfork wait be morphed into and out of TASK_TRACED,
I see. I never thought about this, but I think you are right.
Hmm. Even without a debugger, the parent doesn't react to SIGSTOP. Say,

	#include <unistd.h>

	int main(void)
	{
		if (!vfork())
			pause();
		return 0;
	}

and ^Z won't work obviously. Not good.
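
The blocking itself is easy to observe from user space. What follows is only
a minimal sketch, assuming Linux, where vfork() suspends the calling thread
until the child calls _exit() or exec. The helper name vfork_block_ms() is
hypothetical and is not part of the thread or of any patch. Sleeping in a
vfork child is outside what POSIX allows (only _exit() and exec are safe
there); it is used purely to make the parent's wait measurable, in the same
spirit as the pause() example above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not code from the thread):
 * returns how many milliseconds the caller stayed blocked inside
 * vfork() while the child slept for child_sleep_ms, or -1 if vfork()
 * failed.
 */
long vfork_block_ms(long child_sleep_ms)
{
	struct timespec t0, t1, nap;
	pid_t pid;

	assert(child_sleep_ms >= 0);
	nap.tv_sec = child_sleep_ms / 1000;
	nap.tv_nsec = (child_sleep_ms % 1000) * 1000000L;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	pid = vfork();
	if (pid == 0) {
		/* Child: runs on the parent's mm while the parent waits. */
		nanosleep(&nap, NULL);
		_exit(0);
	}
	/* Parent: gets here only after the child has exited. */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (pid < 0)
		return -1;
	waitpid(pid, NULL, 0);	/* reap the child */

	return (t1.tv_sec - t0.tv_sec) * 1000L +
	       (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

Calling vfork_block_ms(200) should report roughly 200 ms or more: for that
whole interval the parent sits in the kernel's vfork wait and cannot act on
a stop signal.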
This is not trivial I guess. Needs thinking...
Oleg.