From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@suspend2.net>, Linux PM <linux-pm@osdl.org>
Subject: Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first
Date: Wed, 1 Feb 2006 23:19:20 +0100 [thread overview]
Message-ID: <200602012319.21701.rjw@sisk.pl> (raw)
In-Reply-To: <20060201114717.GA1768@elf.ucw.cz>
Hi,
On Wednesday 01 February 2006 12:47, Pavel Machek wrote:
> > This is an experimantal patch aimed at the "unable to freeze processes under
> > load" problem.
> >
> > On my box the 2.6.16-rc1-mm4 kernel with this patch applied survives the
> > "dd if=/dev/hda of=/dev/null" test.
> >
> > Please have a look.
>
> It makes it better (well, I used my own, simpler variant, but that
> should not matter; patch is attached). I now can't reproduce hangs
> with simple stress testing,
This means the direction is right ...
> but running kernel make alongside that
> makes it hang sometimes.
... but there still is a race here.
> Example of non-frozen gcc:
>
> gcc D EEE06A70 0 1750 1749 1751
> (NOTLB)
> df85df38 00000046 bf878130 eee06a70 00004111 eee06a70 eee06a70
> 003d0900
> 00000000 c0137cf5 df85c000 00000000 c058ada2 c012503e ef2c915c
> ef2c9030
> c1c0b480 7c3b8500 003d0927 df85c000 00000a98 7c3b8500 003d0927
> c0770800
> Call Trace:
> [<c0137cf5>] attach_pid+0x25/0xb0
> [<c058ada2>] _write_unlock_irq+0x12/0x30
> [<c012503e>] copy_process+0xe5e/0x11b0
> [<c0588f74>] wait_for_completion+0x94/0xd0
> [<c0121690>] default_wake_function+0x0/0x10
> [<c01254d9>] do_fork+0x149/0x210
> [<c0101218>] sys_vfork+0x28/0x30
> [<c0103231>] syscall_call+0x7/0xb
[I'm trying to understand this trace, but with not much success, so far.
Apparently sys_vfork() calls do_fork() which calls copy_process(),
which calls attach_pid(). But what default_wake_function() and
wait_for_completion() are doing here? And where it got stuck, actually?]
I think the problem is related to the kernel make using fork() heavily.
Namely, if the parent process uses CLONE_VFORK, it will wait for the
vfork completion. Now if the child process is frozen _before_
it completes the vfork completion, the parent will be unfreezeable.
> ...maybe solving this would solve journalling problems, too? It is
> similar AFAICT.
There's a chance, I think. ;-)
> diff --git a/kernel/power/disk.c b/kernel/power/disk.c
> index e24446f..90d6c1a 100644
> --- a/kernel/power/disk.c
> +++ b/kernel/power/disk.c
> @@ -87,7 +87,6 @@ static int prepare_processes(void)
> int error;
>
> pm_prepare_console();
> - sys_sync();
> disable_nonboot_cpus();
>
> if (freeze_processes()) {
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 02a1b3a..bd16e44 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -16,7 +16,7 @@
> /*
> * Timeout for stopping processes
> */
> -#define TIMEOUT (6 * HZ)
> +#define TIMEOUT (60 * HZ)
>
>
> static inline int freezeable(struct task_struct * p)
> @@ -54,32 +54,53 @@ void refrigerator(void)
> current->state = save;
> }
>
> +static void freeze_process(struct task_struct *p)
> +{
> + unsigned long flags;
> +
> + freeze(p);
> + spin_lock_irqsave(&p->sighand->siglock, flags);
> + signal_wake_up(p, 0);
> + spin_unlock_irqrestore(&p->sighand->siglock, flags);
> +}
> +
> /* 0 = success, else # of processes that we failed to stop */
> int freeze_processes(void)
> {
> - int todo;
> + int todo, user_frozen, nr_user;
> unsigned long start_time;
> struct task_struct *g, *p;
> unsigned long flags;
>
> printk( "Stopping tasks: " );
> start_time = jiffies;
> + user_frozen = 0;
> do {
> - todo = 0;
> + nr_user = todo = 0;
> read_lock(&tasklist_lock);
> do_each_thread(g, p) {
> if (!freezeable(p))
> continue;
> if (frozen(p))
> continue;
> -
> - freeze(p);
> - spin_lock_irqsave(&p->sighand->siglock, flags);
> - signal_wake_up(p, 0);
> - spin_unlock_irqrestore(&p->sighand->siglock, flags);
> - todo++;
> + if (p->mm) {
This one is too simplistic, it appears. We should lock the task to check its mm
and there may be kernel threads with non-null mm (they have the
PF_BORROWED_MM flag set though, so we have to check this too).
> + /* The task is a user-space one. Freeze it */
> + freeze_process(p);
> + todo++;
I think you can drop this ...
> + nr_user++;
> + } else {
> + /* Freeze only if the user space is frozen */
> + if (user_frozen)
> + freeze_process(p);
> + todo++;
> + }
> } while_each_thread(g, p);
> read_unlock(&tasklist_lock);
... if you do
+ todo += nr_user;
> + if (!user_frozen && !nr_user) {
> + printk("sync");
> + sys_sync();
> + }
> + user_frozen = !nr_user;
> yield(); /* Yield is okay here */
> if (todo && time_after(jiffies, start_time + TIMEOUT)) {
> printk( "\n" );
}-- debugging stuff snipped --{
Greetings,
Rafael
prev parent reply other threads:[~2006-02-01 22:19 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-02-01 0:41 [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first Rafael J. Wysocki
2006-02-01 10:55 ` Pavel Machek
2006-02-01 11:18 ` Nigel Cunningham
2006-02-01 20:13 ` Rafael J. Wysocki
2006-02-01 11:47 ` Pavel Machek
2006-02-01 12:24 ` Nigel Cunningham
2006-02-01 12:49 ` Pavel Machek
2006-02-01 21:41 ` Nigel Cunningham
2006-02-01 23:57 ` Rafael J. Wysocki
2006-02-02 13:55 ` Rafael J. Wysocki
2006-02-02 15:08 ` Pavel Machek
2006-02-02 18:32 ` Rafael J. Wysocki
2006-02-04 21:26 ` Pavel Machek
2006-02-04 21:47 ` Rafael J. Wysocki
2006-02-01 22:19 ` Rafael J. Wysocki [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200602012319.21701.rjw@sisk.pl \
--to=rjw@sisk.pl \
--cc=linux-pm@osdl.org \
--cc=nigel@suspend2.net \
--cc=pavel@ucw.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.