From: Kirill Korotaev <dev@sw.ru>
To: Christoph Lameter <christoph@lameter.com>
Cc: Pavel Machek <pavel@ucw.cz>, Linus Torvalds <torvalds@osdl.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
raybry@engr.sgi.com, Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Subject: Re: [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes
Date: Tue, 28 Jun 2005 11:30:59 +0400 [thread overview]
Message-ID: <42C0FCB3.4030205@sw.ru> (raw)
In-Reply-To: <Pine.LNX.4.62.0506272323490.30956@graphe.net>
>>>-static inline int freezing(struct task_struct *p)
>>>-{
>>>- return p->flags & PF_FREEZE;
>>>-}
>>>+#if defined(CONFIG_PM) || defined(CONFIG_MIGRATE)
>>
>><<<< why not to make a single option CONFIG_REFRIGERATOR? It looks to be a
>>more robust way, since there are multiple users of it.
>
>
> Yes. That may be better. We can do that once the migration code is
> finished and when we know what kind of CONFIG_XXX the migration code
> really needs.
>
>
>>>+#ifdef CONFIG_PM
>>
>><<<< is it intentionaly? or you just lost CONFIG_MIGRATE?
> It is intentional. freeze_processes and thaw_processes are only needed
> for suspend. One only needs to freeze a couple of processes for process
> migration.
But PM and your migrate code can be not the only users of it.
>><<<< I still think this refrigerator is racy with freeze_processes():
>><<<< scenarios:
>><<<< scenario 1
>><<<<
>>task1 -> freeze_processes(): taskXXX ->refrigerator()
>> checked (task->flags & PF_FROZEN) == 0 cur->flags |= PF_FROZEN
>> clear TIF_FREEZE
>> <sleep on thaw>
>> set TIF_FREEZING
>> clear PF_FROZEN
>>
>><<<< so the task awakes with TIF_FREEZE flag set!!!
>
>
> Hmm... If we wait to clear both flags until after the completion
> notification then we do not have the race right? But then we need to move
> the signal recalc since it tests for TIF_FREEZE too.
It is almost ok, but it is still not fine :)
look what happens if you call freeze/unfreeze in a loop:
refrigerator:
awakes
freezer:
check PF_FROZEN, it is still set, skips task and thinks it is finished
freezing.
refrigerator:
clears PF_FROZEN and TIF_FREEZE and returns.
I think you can fix this by moving PF_FROZEN check and set in both
places under siglock.
Kirill
>><<<< scenario 2
>><<<< look at error path in freeze_processes (on timeout), it is broken as
>>well. You need to wakeup tasks there...
>
>
> Ok. How about this additional patch? This still requires that process
> freezing does not immediately occurr again after the completion
> handler. All of this is iffy due to not having a real lock protecting all
> these values and we may still need to add some barriers.
>
> Index: linux-2.6.12/kernel/power/process.c
> ===================================================================
> --- linux-2.6.12.orig/kernel/power/process.c 2005-06-28 06:34:52.000000000 +0000
> +++ linux-2.6.12/kernel/power/process.c 2005-06-28 06:40:28.000000000 +0000
> @@ -47,12 +47,13 @@ int freeze_processes(void)
> unsigned long flags;
> if (!freezeable(p))
> continue;
> - if ((p->flags & PF_FROZEN) ||
> - (p->state == TASK_TRACED) ||
> + if ((p->state == TASK_TRACED) ||
> (p->state == TASK_STOPPED))
> continue;
>
> set_thread_flag(TIF_FREEZE);
> + if (p->flags & PF_FROZEN)
> + continue;
> spin_lock_irqsave(&p->sighand->siglock, flags);
> signal_wake_up(p, 0);
> spin_unlock_irqrestore(&p->sighand->siglock, flags);
> @@ -63,6 +64,8 @@ int freeze_processes(void)
> if (time_after(jiffies, start_time + TIMEOUT)) {
> printk( "\n" );
> printk(KERN_ERR " stopping tasks failed (%d tasks remaining)\n", todo );
> + complete_all(&thaw);
> + up(&freezer_sem);
> return todo;
> }
> } while(todo);
> Index: linux-2.6.12/kernel/sched.c
> ===================================================================
> --- linux-2.6.12.orig/kernel/sched.c 2005-06-28 06:34:52.000000000 +0000
> +++ linux-2.6.12/kernel/sched.c 2005-06-28 06:37:36.000000000 +0000
> @@ -5210,13 +5210,13 @@ DECLARE_COMPLETION(thaw);
> void refrigerator(void)
> {
> current->flags |= PF_FROZEN;
> + wait_for_completion(&thaw);
> clear_thread_flag(TIF_FREEZE);
> + current->flags &= ~PF_FROZEN;
> /* A fake signal 0 may have been sent. Recalculate sigpending */
> spin_lock_irq(¤t->sighand->siglock);
> recalc_sigpending();
> spin_unlock_irq(¤t->sighand->siglock);
> - wait_for_completion(&thaw);
> - current->flags &= ~PF_FROZEN;
> }
> EXPORT_SYMBOL(refrigerator);
> #endif
>
WARNING: multiple messages have this Message-ID (diff)
From: Kirill Korotaev <dev@sw.ru>
To: Christoph Lameter <christoph@lameter.com>
Cc: Pavel Machek <pavel@ucw.cz>, Linus Torvalds <torvalds@osdl.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
raybry@engr.sgi.com, Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Subject: Re: [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes
Date: Tue, 28 Jun 2005 11:30:59 +0400 [thread overview]
Message-ID: <42C0FCB3.4030205@sw.ru> (raw)
In-Reply-To: <Pine.LNX.4.62.0506272323490.30956@graphe.net>
>>>-static inline int freezing(struct task_struct *p)
>>>-{
>>>- return p->flags & PF_FREEZE;
>>>-}
>>>+#if defined(CONFIG_PM) || defined(CONFIG_MIGRATE)
>>
>><<<< why not to make a single option CONFIG_REFRIGERATOR? It looks to be a
>>more robust way, since there are multiple users of it.
>
>
> Yes. That may be better. We can do that once the migration code is
> finished and when we know what kind of CONFIG_XXX the migration code
> really needs.
>
>
>>>+#ifdef CONFIG_PM
>>
>><<<< is it intentionaly? or you just lost CONFIG_MIGRATE?
> It is intentional. freeze_processes and thaw_processes are only needed
> for suspend. One only needs to freeze a couple of processes for process
> migration.
But PM and your migrate code can be not the only users of it.
>><<<< I still think this refrigerator is racy with freeze_processes():
>><<<< scenarios:
>><<<< scenario 1
>><<<<
>>task1 -> freeze_processes(): taskXXX ->refrigerator()
>> checked (task->flags & PF_FROZEN) == 0 cur->flags |= PF_FROZEN
>> clear TIF_FREEZE
>> <sleep on thaw>
>> set TIF_FREEZING
>> clear PF_FROZEN
>>
>><<<< so the task awakes with TIF_FREEZE flag set!!!
>
>
> Hmm... If we wait to clear both flags until after the completion
> notification then we do not have the race right? But then we need to move
> the signal recalc since it tests for TIF_FREEZE too.
It is almost ok, but it is still not fine :)
look what happens if you call freeze/unfreeze in a loop:
refrigerator:
awakes
freezer:
check PF_FROZEN, it is still set, skips task and thinks it is finished
freezing.
refrigerator:
clears PF_FROZEN and TIF_FREEZE and returns.
I think you can fix this by moving PF_FROZEN check and set in both
places under siglock.
Kirill
>><<<< scenario 2
>><<<< look at error path in freeze_processes (on timeout), it is broken as
>>well. You need to wakeup tasks there...
>
>
> Ok. How about this additional patch? This still requires that process
> freezing does not immediately occurr again after the completion
> handler. All of this is iffy due to not having a real lock protecting all
> these values and we may still need to add some barriers.
>
> Index: linux-2.6.12/kernel/power/process.c
> ===================================================================
> --- linux-2.6.12.orig/kernel/power/process.c 2005-06-28 06:34:52.000000000 +0000
> +++ linux-2.6.12/kernel/power/process.c 2005-06-28 06:40:28.000000000 +0000
> @@ -47,12 +47,13 @@ int freeze_processes(void)
> unsigned long flags;
> if (!freezeable(p))
> continue;
> - if ((p->flags & PF_FROZEN) ||
> - (p->state == TASK_TRACED) ||
> + if ((p->state == TASK_TRACED) ||
> (p->state == TASK_STOPPED))
> continue;
>
> set_thread_flag(TIF_FREEZE);
> + if (p->flags & PF_FROZEN)
> + continue;
> spin_lock_irqsave(&p->sighand->siglock, flags);
> signal_wake_up(p, 0);
> spin_unlock_irqrestore(&p->sighand->siglock, flags);
> @@ -63,6 +64,8 @@ int freeze_processes(void)
> if (time_after(jiffies, start_time + TIMEOUT)) {
> printk( "\n" );
> printk(KERN_ERR " stopping tasks failed (%d tasks remaining)\n", todo );
> + complete_all(&thaw);
> + up(&freezer_sem);
> return todo;
> }
> } while(todo);
> Index: linux-2.6.12/kernel/sched.c
> ===================================================================
> --- linux-2.6.12.orig/kernel/sched.c 2005-06-28 06:34:52.000000000 +0000
> +++ linux-2.6.12/kernel/sched.c 2005-06-28 06:37:36.000000000 +0000
> @@ -5210,13 +5210,13 @@ DECLARE_COMPLETION(thaw);
> void refrigerator(void)
> {
> current->flags |= PF_FROZEN;
> + wait_for_completion(&thaw);
> clear_thread_flag(TIF_FREEZE);
> + current->flags &= ~PF_FROZEN;
> /* A fake signal 0 may have been sent. Recalculate sigpending */
> spin_lock_irq(¤t->sighand->siglock);
> recalc_sigpending();
> spin_unlock_irq(¤t->sighand->siglock);
> - wait_for_completion(&thaw);
> - current->flags &= ~PF_FROZEN;
> }
> EXPORT_SYMBOL(refrigerator);
> #endif
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
next prev parent reply other threads:[~2005-06-28 7:33 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-06-24 20:20 [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes Christoph Lameter
2005-06-24 20:20 ` Christoph Lameter
2005-06-24 22:25 ` Nigel Cunningham
2005-06-24 22:25 ` Nigel Cunningham
2005-06-25 2:51 ` Pavel Machek
2005-06-25 2:51 ` Pavel Machek
2005-06-25 4:31 ` Christoph Lameter
2005-06-25 4:31 ` Christoph Lameter
2005-06-25 4:46 ` Nigel Cunningham
2005-06-25 4:46 ` Nigel Cunningham
2005-06-25 22:37 ` Pavel Machek
2005-06-25 22:37 ` Pavel Machek
2005-06-25 6:13 ` Christoph Lameter
2005-06-25 6:13 ` Christoph Lameter
2005-06-26 2:30 ` Pavel Machek
2005-06-26 2:30 ` Pavel Machek
2005-06-26 2:55 ` Christoph Lameter
2005-06-26 2:55 ` Christoph Lameter
2005-06-26 3:09 ` Pavel Machek
2005-06-26 3:09 ` Pavel Machek
2005-06-27 3:51 ` Christoph Lameter
2005-06-27 3:51 ` Christoph Lameter
2005-06-27 4:21 ` Pavel Machek
2005-06-27 4:21 ` Pavel Machek
2005-06-27 4:24 ` Linus Torvalds
2005-06-27 4:24 ` Linus Torvalds
2005-06-27 5:53 ` Christoph Lameter
2005-06-27 5:53 ` Christoph Lameter
2005-06-27 14:13 ` Pavel Machek
2005-06-27 14:13 ` Pavel Machek
2005-06-27 15:13 ` Christoph Lameter
2005-06-27 15:13 ` Christoph Lameter
2005-06-28 6:18 ` Kirill Korotaev
2005-06-28 6:18 ` Kirill Korotaev
2005-06-28 7:01 ` Christoph Lameter
2005-06-28 7:01 ` Christoph Lameter
2005-06-28 7:30 ` Kirill Korotaev [this message]
2005-06-28 7:30 ` Kirill Korotaev
2005-06-28 8:13 ` Kirill Korotaev
2005-06-28 8:13 ` Kirill Korotaev
2005-06-28 14:02 ` Christoph Lameter
2005-06-28 14:02 ` Christoph Lameter
2005-06-28 14:01 ` Christoph Lameter
2005-06-28 14:01 ` Christoph Lameter
2005-06-28 12:47 ` Pavel Machek
2005-06-28 12:47 ` Pavel Machek
2005-07-01 17:06 ` [SUSPEND 1/2]Replace PF_FREEZE with TIF_FREEZE Christoph Lameter
2005-07-01 17:06 ` Christoph Lameter
2005-07-01 17:14 ` [SUSPEND 2/2] Replace PF_FROZEN with TASK_FROZEN Christoph Lameter
2005-07-01 17:14 ` Christoph Lameter
2005-06-28 21:50 ` [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes Ray Bryant
2005-06-28 21:50 ` Ray Bryant
2005-06-28 21:54 ` Christoph Lameter
2005-06-28 21:54 ` Christoph Lameter
2005-07-03 11:06 ` Pavel Machek
2005-07-03 11:06 ` Pavel Machek
2005-06-26 10:54 ` Nigel Cunningham
2005-06-26 10:54 ` Nigel Cunningham
2005-06-25 6:18 ` Christoph Lameter
2005-06-25 6:18 ` Christoph Lameter
2005-06-25 7:35 ` Kirill Korotaev
2005-06-25 7:35 ` Kirill Korotaev
2005-06-27 7:06 ` Ray Bryant
2005-06-27 7:06 ` Ray Bryant
2005-06-27 13:17 ` Pavel Machek
2005-06-27 13:17 ` Pavel Machek
2005-06-27 14:59 ` Ray Bryant
2005-06-27 14:59 ` Ray Bryant
2005-06-27 18:05 ` Pavel Machek
2005-06-27 18:05 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=42C0FCB3.4030205@sw.ru \
--to=dev@sw.ru \
--cc=christoph@lameter.com \
--cc=kuznet@ms2.inr.ac.ru \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=pavel@ucw.cz \
--cc=raybry@engr.sgi.com \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.