public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Pavel Machek <pavel@suse.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, nigel@suspend2.net,
	Ulrich Drepper <drepper@redhat.com>,
	Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH -mm] swsusp: freeze user space processes first
Date: Sun, 5 Feb 2006 15:22:26 +0100	[thread overview]
Message-ID: <20060205142226.GA20141@elte.hu> (raw)
In-Reply-To: <20060205111145.GE1790@elf.ucw.cz>


* Pavel Machek <pavel@suse.cz> wrote:

> > then i'd suggest to change the vfork implementation to make this code 
> > freezable. Nothing that userspace does should cause freezing to fail.  
> > If it does, we've designed things incorrectly on the kernel side.
> 
> Does that also mean we have bugs with signal delivery? If vfork(); 
> sleep(100000); causes process to be uninterruptible for few days, it 
> will not be killable and increase load average...

"half-done" vforks are indeed in uninterruptible sleep. They are not 
directly killable, but they are killable indirectly through their 
parent. But yes, in theory it would be cleaner if the vfork code used 
wait_for_completion_interruptible(). It has to be done carefully though, 
for two reasons:

- implementational: use task_lock()/task_unlock() to protect
  p->vfork_done in mm_release() and in do_fork().

- semantical: signals to a vfork-ing parent are defined to be delayed
  to after the child has released the parent/MM.

the (untested) patch below handles issue #1, but doesnt handle issue #2: 
this patch opens up a vfork parent to get interrupted early, with any 
signal.

a full approach would probably be to set up a restart handler perhaps, 
and restart the wait later on? This would also require the &vfork 
completion structure [which is on the kernel stack right now] to be 
embedded in task_struct, to make sure the parent can wait on the child 
without extra state. (one more detail is nesting: a signal handler could 
do another vfork(), which thus needs to initialize the &vfork safely and 
in a race-free manner.)

i'm wondering whether signals handled in the parent during a vfork wait 
would be the correct semantics though. Ulrich?

	Ingo

--- linux/kernel/fork.c.orig
+++ linux/kernel/fork.c
@@ -423,16 +423,24 @@ EXPORT_SYMBOL_GPL(get_task_mm);
  */
 void mm_release(struct task_struct *tsk, struct mm_struct *mm)
 {
-	struct completion *vfork_done = tsk->vfork_done;
-
 	/* Get rid of any cached register state */
 	deactivate_mm(tsk, mm);
 
-	/* notify parent sleeping on vfork() */
-	if (vfork_done) {
-		tsk->vfork_done = NULL;
-		complete(vfork_done);
+	if (unlikely(tsk->vfork_done)) {
+		task_lock(tsk);
+		/*
+		 * notify parent sleeping on vfork().
+		 *
+		 * (re-check vfork_done under the task lock, to make sure
+		 *  we did not race with a signal sent to the parent)
+		 */
+		if (tsk->vfork_done) {
+			complete(tsk->vfork_done);
+			tsk->vfork_done = NULL;
+		}
+		task_unlock(tsk);
 	}
+
 	if (tsk->clear_child_tid && atomic_read(&mm->mm_users) > 1) {
 		u32 __user * tidptr = tsk->clear_child_tid;
 		tsk->clear_child_tid = NULL;
@@ -1287,7 +1295,21 @@ long do_fork(unsigned long clone_flags,
 		}
 
 		if (clone_flags & CLONE_VFORK) {
-			wait_for_completion(&vfork);
+			int ret = wait_for_completion_interruptible(&vfork);
+			if (unlikely(ret)) {
+				/*
+				 * make sure we did not race with
+				 * mm_release():
+				 */
+				task_lock(p);
+				if (p->vfork_done)
+					p->vfork_done = NULL;
+				else
+					ret = 0;
+				task_unlock(p);
+				if (ret)
+					return -EINTR;
+			}
 			if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE))
 				ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
 		}

  reply	other threads:[~2006-02-05 14:24 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-02-05  9:14 [PATCH -mm] swsusp: freeze user space processes first Rafael J. Wysocki
2006-02-05  9:38 ` Andrew Morton
2006-02-05 10:34   ` Rafael J. Wysocki
2006-02-05 10:50     ` Ingo Molnar
2006-02-05 11:11       ` Rafael J. Wysocki
2006-02-05 11:18         ` Pavel Machek
2006-02-05 11:39           ` Rafael J. Wysocki
2006-02-05 13:34             ` Rafael J. Wysocki
2006-02-10 20:20             ` vfork makes processes uninterruptible [was Re: [PATCH -mm] swsusp: freeze user space processes first] Pavel Machek
2006-02-05 11:11       ` [PATCH -mm] swsusp: freeze user space processes first Pavel Machek
2006-02-05 14:22         ` Ingo Molnar [this message]
2006-02-10 20:36           ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060205142226.GA20141@elte.hu \
    --to=mingo@elte.hu \
    --cc=akpm@osdl.org \
    --cc=drepper@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nigel@suspend2.net \
    --cc=pavel@suse.cz \
    --cc=rjw@sisk.pl \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox