From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@suspend2.net>, Linux PM <linux-pm@osdl.org>
Subject: Re: [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first
Date: Wed, 1 Feb 2006 23:19:20 +0100 [thread overview]
Message-ID: <200602012319.21701.rjw@sisk.pl> (raw)
In-Reply-To: <20060201114717.GA1768@elf.ucw.cz>
Hi,
On Wednesday 01 February 2006 12:47, Pavel Machek wrote:
> > This is an experimantal patch aimed at the "unable to freeze processes under
> > load" problem.
> >
> > On my box the 2.6.16-rc1-mm4 kernel with this patch applied survives the
> > "dd if=/dev/hda of=/dev/null" test.
> >
> > Please have a look.
>
> It makes it better (well, I used my own, simpler variant, but that
> should not matter; patch is attached). I now can't reproduce hangs
> with simple stress testing,
This means the direction is right ...
> but running kernel make alongside that
> makes it hang sometimes.
... but there still is a race here.
> Example of non-frozen gcc:
>
> gcc D EEE06A70 0 1750 1749 1751
> (NOTLB)
> df85df38 00000046 bf878130 eee06a70 00004111 eee06a70 eee06a70
> 003d0900
> 00000000 c0137cf5 df85c000 00000000 c058ada2 c012503e ef2c915c
> ef2c9030
> c1c0b480 7c3b8500 003d0927 df85c000 00000a98 7c3b8500 003d0927
> c0770800
> Call Trace:
> [<c0137cf5>] attach_pid+0x25/0xb0
> [<c058ada2>] _write_unlock_irq+0x12/0x30
> [<c012503e>] copy_process+0xe5e/0x11b0
> [<c0588f74>] wait_for_completion+0x94/0xd0
> [<c0121690>] default_wake_function+0x0/0x10
> [<c01254d9>] do_fork+0x149/0x210
> [<c0101218>] sys_vfork+0x28/0x30
> [<c0103231>] syscall_call+0x7/0xb
[I'm trying to understand this trace, but with not much success, so far.
Apparently sys_vfork() calls do_fork() which calls copy_process(),
which calls attach_pid(). But what default_wake_function() and
wait_for_completion() are doing here? And where it got stuck, actually?]
I think the problem is related to the kernel make using fork() heavily.
Namely, if the parent process uses CLONE_VFORK, it will wait for the
vfork completion. Now if the child process is frozen _before_
it completes the vfork completion, the parent will be unfreezeable.
> ...maybe solving this would solve journalling problems, too? It is
> similar AFAICT.
There's a chance, I think. ;-)
> diff --git a/kernel/power/disk.c b/kernel/power/disk.c
> index e24446f..90d6c1a 100644
> --- a/kernel/power/disk.c
> +++ b/kernel/power/disk.c
> @@ -87,7 +87,6 @@ static int prepare_processes(void)
> int error;
>
> pm_prepare_console();
> - sys_sync();
> disable_nonboot_cpus();
>
> if (freeze_processes()) {
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 02a1b3a..bd16e44 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -16,7 +16,7 @@
> /*
> * Timeout for stopping processes
> */
> -#define TIMEOUT (6 * HZ)
> +#define TIMEOUT (60 * HZ)
>
>
> static inline int freezeable(struct task_struct * p)
> @@ -54,32 +54,53 @@ void refrigerator(void)
> current->state = save;
> }
>
> +static void freeze_process(struct task_struct *p)
> +{
> + unsigned long flags;
> +
> + freeze(p);
> + spin_lock_irqsave(&p->sighand->siglock, flags);
> + signal_wake_up(p, 0);
> + spin_unlock_irqrestore(&p->sighand->siglock, flags);
> +}
> +
> /* 0 = success, else # of processes that we failed to stop */
> int freeze_processes(void)
> {
> - int todo;
> + int todo, user_frozen, nr_user;
> unsigned long start_time;
> struct task_struct *g, *p;
> unsigned long flags;
>
> printk( "Stopping tasks: " );
> start_time = jiffies;
> + user_frozen = 0;
> do {
> - todo = 0;
> + nr_user = todo = 0;
> read_lock(&tasklist_lock);
> do_each_thread(g, p) {
> if (!freezeable(p))
> continue;
> if (frozen(p))
> continue;
> -
> - freeze(p);
> - spin_lock_irqsave(&p->sighand->siglock, flags);
> - signal_wake_up(p, 0);
> - spin_unlock_irqrestore(&p->sighand->siglock, flags);
> - todo++;
> + if (p->mm) {
This one is too simplistic, it appears. We should lock the task to check its mm
and there may be kernel threads with non-null mm (they have the
PF_BORROWED_MM flag set though, so we have to check this too).
> + /* The task is a user-space one. Freeze it */
> + freeze_process(p);
> + todo++;
I think you can drop this ...
> + nr_user++;
> + } else {
> + /* Freeze only if the user space is frozen */
> + if (user_frozen)
> + freeze_process(p);
> + todo++;
> + }
> } while_each_thread(g, p);
> read_unlock(&tasklist_lock);
... if you do
+ todo += nr_user;
> + if (!user_frozen && !nr_user) {
> + printk("sync");
> + sys_sync();
> + }
> + user_frozen = !nr_user;
> yield(); /* Yield is okay here */
> if (todo && time_after(jiffies, start_time + TIMEOUT)) {
> printk( "\n" );
}-- debugging stuff snipped --{
Greetings,
Rafael
prev parent reply other threads:[~2006-02-01 22:19 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-02-01 0:41 [RFC][PATCH -mm][Experimental] swsusp: freeze userspace processes first Rafael J. Wysocki
2006-02-01 10:55 ` Pavel Machek
2006-02-01 11:18 ` Nigel Cunningham
2006-02-01 20:13 ` Rafael J. Wysocki
2006-02-01 11:47 ` Pavel Machek
2006-02-01 12:24 ` Nigel Cunningham
2006-02-01 12:49 ` Pavel Machek
2006-02-01 21:41 ` Nigel Cunningham
2006-02-01 23:57 ` Rafael J. Wysocki
2006-02-02 13:55 ` Rafael J. Wysocki
2006-02-02 15:08 ` Pavel Machek
2006-02-02 18:32 ` Rafael J. Wysocki
2006-02-04 21:26 ` Pavel Machek
2006-02-04 21:47 ` Rafael J. Wysocki
2006-02-01 22:19 ` Rafael J. Wysocki [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200602012319.21701.rjw@sisk.pl \
--to=rjw@sisk.pl \
--cc=linux-pm@osdl.org \
--cc=nigel@suspend2.net \
--cc=pavel@ucw.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox