[RFC][PATCH -mm] Freezer: Handle uninterruptible tasks


* [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
@ 2007-07-06  8:12 Rafael J. Wysocki
  0 siblings, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2007-07-06  8:12 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, LKML, Pavel Machek, Ingo Molnar

Hi,

The main limitation of the freezer is that it cannot handle uninterruptible
tasks.  Namely, if there are uninterruptible tasks in the system, the freezer
returns an error, which makes it impossible to suspend the system.

This mechanism is used to prevent the situations in which the suspend process
can deadlock with a task holding a lock needed by it from happening.  However,
AFAICS, the probability of that happening is very small and if the freezer is
removed from the suspend code path, then the suspend process will be exposed to
deadlocking in this manner anyway.

Unfortunately, this mechanism also leads to severe limitations, such as that it
makes the freezer unable to handle systems using FUSE in a reliable way.

This patch makes the freezer skip uninterruptible user space tasks (ie. such
that have an mm of their own) when counting the tasks to be frozen.  As a
result, these tasks have the TIF_FREEZE and TIF_SIGPENDING flags set, but the
freezer doesn't wait for them to enter the refrigerator.  Nevertheless, they
will enter the refrigerator as soon as they change their state.
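
A minimal user-space sketch of the counting rule described above, not the
kernel code.  It assumes a simplified task model: the struct, the enum and
the model_* names are hypothetical, a boolean stands in for TIF_FREEZE, and
the freeze_user_space distinction is left out.  It only shows that a skipped
task still gets the freeze request but is not added to the count the freezer
waits on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel task state; not the real definitions. */
enum model_state { MODEL_RUNNING, MODEL_INTERRUPTIBLE, MODEL_UNINTERRUPTIBLE };

struct model_task {
	bool has_own_mm;        /* user space task (has an mm of its own) */
	bool frozen;            /* already in the refrigerator */
	bool freeze_requested;  /* models TIF_FREEZE being set */
	enum model_state state;
};

/* A task is skipped only if it is a user space task in uninterruptible sleep. */
static bool model_should_skip(const struct model_task *t)
{
	return t->has_own_mm && t->state == MODEL_UNINTERRUPTIBLE;
}

/*
 * One pass of the freezing loop: request freezing of every task that is not
 * frozen yet and return how many of them the freezer still has to wait for.
 */
static size_t model_freeze_pass(struct model_task *tasks, size_t n)
{
	size_t todo = 0;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].frozen)
			continue;
		tasks[i].freeze_requested = true;
		if (!model_should_skip(&tasks[i]))
			todo++;
	}
	return todo;
}
```

In this model an uninterruptible kernel thread (has_own_mm == false) is still
counted, so it can still make the freezer time out; only user space tasks are
exempt from the wait.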

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 include/linux/freezer.h |   44 --------------------------------------------
 include/linux/sched.h   |    1 -
 kernel/fork.c           |    2 --
 kernel/power/process.c  |   18 +++++++++++++++++-
 4 files changed, 17 insertions(+), 48 deletions(-)

Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
+++ linux-2.6.22-rc6-mm1/kernel/power/process.c
@@ -105,6 +105,16 @@ static void cancel_freezing(struct task_
 	}
 }
 
+static int has_mm(struct task_struct *p)
+{
+	return (p->mm && !(p->flags & PF_BORROWED_MM));
+}
+
+static int freezer_should_skip(struct task_struct *p)
+{
+	return (has_mm(p) && (p->state & TASK_UNINTERRUPTIBLE));
+}
+
 static int try_to_freeze_tasks(int freeze_user_space)
 {
 	struct task_struct *g, *p;
@@ -135,7 +145,7 @@ static int try_to_freeze_tasks(int freez
 				 * occuring.
 				 */
 				task_lock(p);
-				if (!p->mm || (p->flags & PF_BORROWED_MM)) {
+				if (!has_mm(p)) {
 					task_unlock(p);
 					continue;
 				}
@@ -144,8 +154,14 @@ static int try_to_freeze_tasks(int freez
 			} else {
 				freeze_task(p);
 			}
+			/*
+			 * task_lock() is necessary to prevent races with
+			 * use_mm()/unuse_mm() from occurring.
+			 */
+			task_lock(p);
 			if (!freezer_should_skip(p))
 				todo++;
+			task_unlock(p);
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
 		yield();			/* Yield is okay here */
Index: linux-2.6.22-rc6-mm1/include/linux/freezer.h
===================================================================
--- linux-2.6.22-rc6-mm1.orig/include/linux/freezer.h
+++ linux-2.6.22-rc6-mm1/include/linux/freezer.h
@@ -75,50 +75,6 @@ static inline int try_to_freeze(void)
 }
 
 /*
- * The PF_FREEZER_SKIP flag should be set by a vfork parent right before it
- * calls wait_for_completion(&vfork) and reset right after it returns from this
- * function.  Next, the parent should call try_to_freeze() to freeze itself
- * appropriately in case the child has exited before the freezing of tasks is
- * complete.  However, we don't want kernel threads to be frozen in unexpected
- * places, so we allow them to block freeze_processes() instead or to set
- * PF_NOFREEZE if needed and PF_FREEZER_SKIP is only set for userland vfork
- * parents.  Fortunately, in the ____call_usermodehelper() case the parent won't
- * really block freeze_processes(), since ____call_usermodehelper() (the child)
- * does a little before exec/exit and it can't be frozen before waking up the
- * parent.
- */
-
-/*
- * If the current task is a user space one, tell the freezer not to count it as
- * freezable.
- */
-static inline void freezer_do_not_count(void)
-{
-	if (current->mm)
-		current->flags |= PF_FREEZER_SKIP;
-}
-
-/*
- * If the current task is a user space one, tell the freezer to count it as
- * freezable again and try to freeze it.
- */
-static inline void freezer_count(void)
-{
-	if (current->mm) {
-		current->flags &= ~PF_FREEZER_SKIP;
-		try_to_freeze();
-	}
-}
-
-/*
- * Check if the task should be counted as freezeable by the freezer
- */
-static inline int freezer_should_skip(struct task_struct *p)
-{
-	return !!(p->flags & PF_FREEZER_SKIP);
-}
-
-/*
  * Tell the freezer that the current task should be frozen by it
  */
 static inline void set_freezable(void)
Index: linux-2.6.22-rc6-mm1/include/linux/sched.h
===================================================================
--- linux-2.6.22-rc6-mm1.orig/include/linux/sched.h
+++ linux-2.6.22-rc6-mm1/include/linux/sched.h
@@ -1275,7 +1275,6 @@ static inline void put_task_struct(struc
 #define PF_SPREAD_SLAB	0x02000000	/* Spread some slab caches over cpuset */
 #define PF_MEMPOLICY	0x10000000	/* Non-default NUMA mempolicy */
 #define PF_MUTEX_TESTER	0x20000000	/* Thread belongs to the rt mutex tester */
-#define PF_FREEZER_SKIP	0x40000000	/* Freezer should not count it as freezeable */
 
 /*
  * Only the _current_ task can read/write to tsk->flags, but other
Index: linux-2.6.22-rc6-mm1/kernel/fork.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/fork.c
+++ linux-2.6.22-rc6-mm1/kernel/fork.c
@@ -1424,9 +1424,7 @@ long do_fork(unsigned long clone_flags,
 		}
 
 		if (clone_flags & CLONE_VFORK) {
-			freezer_do_not_count();
 			wait_for_completion(&vfork);
-			freezer_count();
 			if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE)) {
 				current->ptrace_message = nr;
 				ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found] <200707061012.48998.rjw@sisk.pl>
@ 2007-07-06 15:01 ` Alan Stern
  2007-07-07  7:50 ` Pavel Machek
  1 sibling, 0 replies; 13+ messages in thread
From: Alan Stern @ 2007-07-06 15:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, Miklos Szeredi, LKML, Pavel Machek, pm list,
	Ingo Molnar

On Fri, 6 Jul 2007, Rafael J. Wysocki wrote:

> Hi,
> 
> The main limitation of the freezer is that it cannot handle uninterruptible
> tasks.  Namely, if there are uninterruptible tasks in the system, the freezer
> returns an error, which makes it impossible to suspend the system.
> 
> This mechanism is used to prevent the situations in which the suspend process
> can deadlock with a task holding a lock needed by it from happening.  However,
> AFAICS, the probability of that happening is very small and if the freezer is
> removed from the suspend code path, then the suspend process will be exposed to
> deadlocking in this manner anyway.
> 
> Unfortunately, this mechanism also leads to severe limitations, such as that it
> makes the freezer unable to handle systems using FUSE in a reliable way.
> 
> This patch makes the freezer skip uninterruptible user space tasks (ie. such
> that have an mm of their own) when counting the tasks to be frozen.  As a
> result, these tasks have the TIF_FREEZE and TIF_SIGPENDING flags set, but the
> freezer doesn't wait for them to enter the refrigerator.  Nevertheless, they
> will enter the refrigerator as soon as they change their state.

This is a very interesting idea.  Are you certain it is safe?  That is,
if a task is waiting for a mutex and sometime later the mutex becomes
available, will the task then enter the freezer immediately -- before
it can do any I/O or call any drivers?

If you can't make this guarantee, then you might as well simply not try 
to freeze any user task until it returns from kernel mode to user mode.  
And then you will face the problem of a user task doing I/O during 
hibernate after the atomic snapshot has been made.

I had in mind something more complicated.  My idea was to define 
certain mutexes as "freezable", with the thought that it would be okay 
for a task in the freezer to own freezable mutexes.  None of the 
mutexes needed by drivers or the PM core during a suspend transition 
should be freezable.

Presumably the VFS locks _would_ fall into the freezable category.  In 
order for this to work, we would have to guarantee the following:

	A thread blocked in the kernel waiting for some response
	from userspace is allowed to hold freezable mutexes but no
	other locks.

	When a thread tries to acquire a freezable mutex, it is allowed
	to hold other freezable mutexes but no other locks.

	When the freezer is running, any task that is blocked on a
	freezable mutex (maybe also any task that tries to acquire 
	an unlocked freezable mutex) would immediately go into the 
	freezer.  So would any task waiting for a response from
	userspace.
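
The three rules above can be sketched as predicates over per-thread lock
counts.  This is only an illustration under stated assumptions: no "freezable
mutex" type exists in the kernel, and struct lock_counts and every function
name below are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-thread lock accounting; these names are not kernel API. */
struct lock_counts {
	unsigned int freezable;      /* freezable mutexes currently held */
	unsigned int non_freezable;  /* every other kind of lock currently held */
};

/* Rule 1: a thread waiting for userspace may hold only freezable mutexes. */
static bool may_wait_for_userspace(const struct lock_counts *held)
{
	return held->non_freezable == 0;
}

/* Rule 2: a thread taking a freezable mutex may hold only freezable mutexes. */
static bool may_acquire_freezable(const struct lock_counts *held)
{
	return held->non_freezable == 0;
}

/* Rule 3: while the freezer runs, these waiters go straight to the freezer. */
static bool should_enter_freezer(bool freezer_running,
				 bool blocked_on_freezable_mutex,
				 bool waiting_for_userspace)
{
	return freezer_running &&
	       (blocked_on_freezable_mutex || waiting_for_userspace);
}
```

Rules 1 and 2 reduce to the same check, which is what would let a frozen task
own freezable mutexes without blocking anything the suspend path needs.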

I think this would solve the VFS-related problems (FUSE's and others).  
But obviously Rafael's approach is preferable, if it works.

Alan Stern


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found] <200707061012.48998.rjw@sisk.pl>
  2007-07-06 15:01 ` Alan Stern
@ 2007-07-07  7:50 ` Pavel Machek
  2007-07-07  9:13   ` Nigel Cunningham
       [not found]   ` <200707071913.43482.nigel@nigel.suspend2.net>
  1 sibling, 2 replies; 13+ messages in thread
From: Pavel Machek @ 2007-07-07  7:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, Miklos Szeredi, LKML, pm list, Ingo Molnar

Hi!

> The main limitation of the freezer is that it cannot handle uninterruptible
> tasks.  Namely, if there are uninterruptible tasks in the system, the freezer
> returns an error, which makes it impossible to suspend the system.
...
> Unfortunately, this mechanism also leads to severe limitations, such as that it
> makes the freezer unable to handle systems using FUSE in a reliable way.
> 
> This patch makes the freezer skip uninterruptible user space tasks (ie. such
> that have an mm of their own) when counting the tasks to be frozen.  As a
> result, these tasks have the TIF_FREEZE and TIF_SIGPENDING flags set, but the
> freezer doesn't wait for them to enter the refrigerator.  Nevertheless, they
> will enter the refrigerator as soon as they change their state.

I don't think we can do that. I suspect rename looks like:

	write directory entry in source
A)	(uninterruptible wait for write)
	write directory entry in destination
	(uninterruptible wait for write)
	write something else

If we freeze some task in place "A)", we'll write to the disk when the
directory write is finished :-(.
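
The concern can be made concrete with a small model of the suspected rename
sequence.  This is a sketch only: the step list mirrors the guess above rather
than any real filesystem code, and all names are hypothetical.

```c
#include <stddef.h>

enum step_kind { STEP_WRITE, STEP_UNINTERRUPTIBLE_WAIT };

/* The suspected rename sequence from the message above, as a list of steps. */
static const enum step_kind rename_steps[] = {
	STEP_WRITE,                 /* directory entry in source */
	STEP_UNINTERRUPTIBLE_WAIT,  /* point A) */
	STEP_WRITE,                 /* directory entry in destination */
	STEP_UNINTERRUPTIBLE_WAIT,
	STEP_WRITE,                 /* something else */
};

/*
 * Count the writes the task would still issue if the snapshot were taken
 * while it sat at step snapshot_at and it were allowed to run on afterwards.
 */
static size_t writes_after_snapshot(const enum step_kind *steps, size_t n,
				    size_t snapshot_at)
{
	size_t writes = 0;

	for (size_t i = snapshot_at + 1; i < n; i++)
		if (steps[i] == STEP_WRITE)
			writes++;
	return writes;
}
```

With the snapshot taken at index 1 (point A), two writes are still pending in
this model; that is the post-snapshot disk activity being objected to.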
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
  2007-07-07  7:50 ` Pavel Machek
@ 2007-07-07  9:13   ` Nigel Cunningham
       [not found]   ` <200707071913.43482.nigel@nigel.suspend2.net>
  1 sibling, 0 replies; 13+ messages in thread
From: Nigel Cunningham @ 2007-07-07  9:13 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, Miklos Szeredi, LKML, pm list, Ingo Molnar



Hi.

On Saturday 07 July 2007 17:50:18 Pavel Machek wrote:
> Hi!
> 
> > The main limitation of the freezer is that it cannot handle uninterruptible
> > tasks.  Namely, if there are uninterruptible tasks in the system, the freezer
> > returns an error, which makes it impossible to suspend the system.
> ...
> > Unfortunately, this mechanism also leads to severe limitations, such as that it
> > makes the freezer unable to handle systems using FUSE in a reliable way.
> > 
> > This patch makes the freezer skip uninterruptible user space tasks (ie. such
> > that have an mm of their own) when counting the tasks to be frozen.  As a
> > result, these tasks have the TIF_FREEZE and TIF_SIGPENDING flags set, but the
> > freezer doesn't wait for them to enter the refrigerator.  Nevertheless, they
> > will enter the refrigerator as soon as they change their state.
> 
> I don't think we can do that. I suspect rename looks like:
> 
> 	write directory entry in source
> A)	(uninterruptible wait for write)
> 	write directory entry in destination
> 	(uninterruptible wait for write)
> 	write something else
> 
> If we freeze some task in place "A)", we'll write to the disk when the
> directory write is finished :-(.

Renaming is a single syscall, so won't the process get frozen when the syscall 
finishes? The sys_sync will also help here - it will ensure the rename gets 
flushed before we start on freezing kernel threads.

Perhaps you've just found more logic for keeping the sys_sync there?

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.



* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found]   ` <200707071913.43482.nigel@nigel.suspend2.net>
@ 2007-07-07 11:31     ` Pavel Machek
  2007-07-07 20:44       ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Pavel Machek @ 2007-07-07 11:31 UTC (permalink / raw)
  To: nigel; +Cc: Matthew Garrett, Miklos Szeredi, LKML, pm list, Ingo Molnar

Hi!

> > I don't think we can do that. I suspect rename looks like:
> > 
> > 	write directory entry in source
> > A)	(uninterruptible wait for write)
> > 	write directory entry in destination
> > 	(uninterruptible wait for write)
> > 	write something else
> > 
> > If we freeze some task in place "A)", we'll write to the disk when the
> > directory write is finished :-(.
> 
> Renaming is a single syscall, so won't the process get frozen when the syscall 
> finishes? 

It would be frozen when syscall finishes. But if we freeze it at A)
point, we have a problem.

> Perhaps you've just found more logic for keeping the sys_sync there?

No, I don't think so.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
  2007-07-07 11:31     ` Pavel Machek
@ 2007-07-07 20:44       ` Rafael J. Wysocki
  0 siblings, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2007-07-07 20:44 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Matthew Garrett, Miklos Szeredi, nigel, LKML, pm list,
	Ingo Molnar

On Saturday, 7 July 2007 13:31, Pavel Machek wrote:
> Hi!
> 
> > > I don't think we can do that. I suspect rename looks like:
> > > 
> > > 	write directory entry in source
> > > A)	(uninterruptible wait for write)
> > > 	write directory entry in destination
> > > 	(uninterruptible wait for write)
> > > 	write something else
> > > 
> > > If we freeze some task in place "A)", we'll write to the disk when the
> > > directory write is finished :-(.
> > 
> > Renaming is a single syscall, so won't the process get frozen when the syscall 
> > finishes? 
> 
> It would be frozen when syscall finishes. But if we freeze it at A)
> point, we have a problem.

With this patch, we don't freeze it. :-)

We do something like "send a freeze request to this task, but don't care if it
doesn't freeze".

Now, IMHO, correctness here depends on how long a task can stay in
TASK_UNINTERRUPTIBLE if it's not stuck (e.g. because the resource it needs is
gone indefinitely, as with NFS), and I think that's not very long.

For example, if a task sleeps on a mutex, then it is waiting for another task
to release the mutex (interrupt handlers don't use mutexes), and I don't know
of any realistic scenario in which it could remain in that state throughout
freeze_processes().

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found] <Pine.LNX.4.44L0.0707061039380.3737-100000@iolanthe.rowland.org>
@ 2007-07-07 23:08 ` Rafael J. Wysocki
       [not found] ` <200707080108.17371.rjw@sisk.pl>
  1 sibling, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2007-07-07 23:08 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Garrett, Miklos Szeredi, LKML, Pavel Machek, pm list,
	Ingo Molnar

On Friday, 6 July 2007 17:01, Alan Stern wrote:
> On Fri, 6 Jul 2007, Rafael J. Wysocki wrote:
> 
> > Hi,
> > 
> > The main limitation of the freezer is that it cannot handle uninterruptible
> > tasks.  Namely, if there are uninterruptible tasks in the system, the freezer
> > returns an error, which makes it impossible to suspend the system.
> > 
> > This mechanism is used to prevent the situations in which the suspend process
> > can deadlock with a task holding a lock needed by it from happening.  However,
> > AFAICS, the probability of that happening is very small and if the freezer is
> > removed from the suspend code path, then the suspend process will be exposed to
> > deadlocking in this manner anyway.
> > 
> > Unfortunately, this mechanism also leads to severe limitations, such as that it
> > makes the freezer unable to handle systems using FUSE in a reliable way.
> > 
> > This patch makes the freezer skip uninterruptible user space tasks (ie. such
> > that have an mm of their own) when counting the tasks to be frozen.  As a
> > result, these tasks have the TIF_FREEZE and TIF_SIGPENDING flags set, but the
> > freezer doesn't wait for them to enter the refrigerator.  Nevertheless, they
> > will enter the refrigerator as soon as they change their state.
> 
> This is a very interesting idea.  Are you certain it is safe?  That is,
> if a task is waiting for a mutex and sometime later the mutex becomes
> available, will the task then enter the freezer immediately -- before
> it can do any I/O or call any drivers?

Well, I _think_ it is safe, although I can't prove it.

> If you can't make this guarantee, then you might as well simply not try 
> to freeze any user task until it returns from kernel mode to user mode.  

No, it's not that bad.  The majority of user land tasks are not uninterruptible
and if they are, then usually for a very short time only.

> And then you will face the problem of a user task doing I/O during 
> hibernate after the atomic snapshot has been made.

I don't think that this is possible in normal conditions.  It would be possible
if, for example, the task were waiting for an unavailable resource and that
resource became available after the hibernation image had been created.
In that case, however, to do any damage, the task would have to cause some
filesystem-related data to be flushed in the same syscall (ie. before returning
to user space).

Such situations may be prevented by a mechanism detecting if any uninterruptible
and freezing task has been woken up after creating the image and aborting the
hibernation in that case.  For this purpose, we only need to add an
appropriate condition to try_to_wake_up() and make it start to trigger after,
for example, enabling the nonboot CPUs.
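
A rough user-space sketch of such a check, assuming hypothetical structures
and a hypothetical model_wake_up() that stands in for the condition added to
try_to_wake_up().  None of these names exist in the kernel, and the real hook
would need proper locking.

```c
#include <stdbool.h>

struct wake_task {
	bool freeze_requested;   /* the freezer asked this task to freeze */
	bool uninterruptible;    /* the task is in uninterruptible sleep */
};

struct hib_state {
	bool image_created;      /* set once the check should start to trigger */
	bool abort_requested;    /* set when an unsafe wake-up has been seen */
};

/* Hypothetical wake-up hook: flag an abort, then wake the task as usual. */
static void model_wake_up(struct hib_state *hib, struct wake_task *t)
{
	if (hib->image_created && t->freeze_requested && t->uninterruptible)
		hib->abort_requested = true;
	t->uninterruptible = false;
}
```

Before image_created is set, the same wake-up is harmless in this model, which
matches the idea of arming the check only after the nonboot CPUs are enabled.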

> I had in mind something more complicated.  My idea was to define 
> certain mutexes as "freezable", with the thought that it would be okay 
> for a task in the freezer to own freezable mutexes.  None of the 
> mutexes needed by drivers or the PM core during a suspend transition 
> should be freezable.
> 
> Presumably the VFS locks _would_ fall into the freezable category.  In 
> order for this to work, we would have to guarantee the following:
> 
> 	A thread blocked in the kernel waiting for some response
> 	from userspace is allowed to hold freezable mutexes but no
> 	other locks.
> 
> 	When a thread tries to acquire a freezable mutex, it is allowed
> 	to hold other freezable mutexes but no other locks.
> 
> 	When the freezer is running, any task that is blocked on a
> 	freezable mutex (maybe also any task that tries to acquire 
> 	an unlocked freezable mutex) would immediately go into the 
> 	freezer.  So would any task waiting for a response from
> 	userspace.
> 
> I think this would solve the VFS-related problems (FUSE's and others).  

Well, the main difficulty with that would be convincing people to use it. ;-)

> But obviously Rafael's approach is preferable, if it works.

Right now, it sometimes fails to freeze all tasks that in theory should be
freezable.  Investigating.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found] ` <200707080108.17371.rjw@sisk.pl>
@ 2007-07-08 12:09   ` Pavel Machek
  2007-07-08 18:37   ` Pavel Machek
       [not found]   ` <20070708120933.GA3866@ucw.cz>
  2 siblings, 0 replies; 13+ messages in thread
From: Pavel Machek @ 2007-07-08 12:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, Miklos Szeredi, LKML, pm list, Ingo Molnar

Hi!

> > And then you will face the problem of a user task doing I/O during 
> > hibernate after the atomic snapshot has been made.
> 
> I don't think that this is possible in normal conditions.  It would be possible
> if, for example, the task were waiting for an unavailable resource and that
> resource became available after the hibernation image had been created.
> In that case, however, to do any damage, the task would have to cause some
> filesystem-related data to be flushed in the same syscall (ie. before returning
> to user space).

I agree that it is relatively unlikely to trigger (if you avoid
freezing the tasks that were uninterruptible for long), but it will
trigger in error cases etc.

> Such situations may be prevented by a mechanism detecting if any uninterruptible
> and freezing task has been woken up after creating the image and aborting the
> hibernation in that case.  For this purpose, we only need to add an
> appropriate condition to try_to_wake_up() and make it start to trigger after,
> for example, enabling the nonboot CPUs.

I don't know how to do that mechanism... but if we knew where to trap
filesystem writes, we could simply freeze at that point, and at that
point only, no?
						Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found]   ` <20070708120933.GA3866@ucw.cz>
@ 2007-07-08 13:55     ` Rafael J. Wysocki
  2007-07-09  4:21     ` Jeremy Maitin-Shepard
  1 sibling, 0 replies; 13+ messages in thread
From: Rafael J. Wysocki @ 2007-07-08 13:55 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, Miklos Szeredi, LKML, pm list, Ingo Molnar

On Sunday, 8 July 2007 14:09, Pavel Machek wrote:
> Hi!
> 
> > > And then you will face the problem of a user task doing I/O during 
> > > hibernate after the atomic snapshot has been made.
> > 
> > I don't think that this is possible in normal conditions.  It would be possible
> > if, for example, the task were waiting for an unavailable resource and that
> > resource became available after the hibernation image had been created.
> > In that case, however, to do any damage, the task would have to cause some
> > filesystem-related data to be flushed in the same syscall (ie. before returning
> > to user space).
> 
> I agree that it is relatively unlikely to trigger (if you avoid
> freezing the tasks that were uninterruptible for long), but it will
> trigger in error cases etc.

Yes, it will.

> > Such situations may be prevented by a mechanism detecting if any uninterruptible
> > and freezing task has been woken up after creating the image and aborting the
> > hibernation in that case.  For this purpose, we only need to add an
> > appropriate condition to try_to_wake_up() and make it start to trigger after,
> > for example, enabling the nonboot CPUs.
> 
> I don't know how to do that mechanism... but if we knew where to trap
> filesystem writes, we could simply freeze at that point, and at that
> point only, no?

From the image/filesystems integrity standpoint, yes, that should be
sufficient.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found] ` <200707080108.17371.rjw@sisk.pl>
  2007-07-08 12:09   ` Pavel Machek
@ 2007-07-08 18:37   ` Pavel Machek
       [not found]   ` <20070708120933.GA3866@ucw.cz>
  2 siblings, 0 replies; 13+ messages in thread
From: Pavel Machek @ 2007-07-08 18:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, Miklos Szeredi, LKML, pm list, Ingo Molnar

Hi!

> > And then you will face the problem of a user task doing I/O during 
> > hibernate after the atomic snapshot has been made.
> 
> I don't think that this is possible in normal conditions.  It would be possible
> if, for example, the task were waiting for an unavailable resource and that
> resource became available after the hibernation image had been created.
> In that case, however, to do any damage, the task would have to cause some
> filesystem-related data to be flushed in the same syscall (ie. before returning
> to user space).
> 
> Such situations may be prevented by a mechanism detecting if any uninterruptible
> and freezing task has been woken up after creating the image and aborting the
> hibernation in that case.  For this purpose, we only need to add an
> appropriate condition to try_to_wake_up() and make it start to trigger after,
> for example, enabling the nonboot CPUs.

Hmm, okay, I see how you meant it. Yes, it probably could work... but
I'd say it is seriously ugly.

Imagine a task waking up after the complete image is written... we'd have to
invalidate the image before aborting the suspend.

Actually, we could do better: we could just refuse to run those tasks
after the atomic snapshot... and hope we don't deadlock because the
uninterruptible task holds some important lock... but I still think it
is too ugly.
							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found]   ` <20070708120933.GA3866@ucw.cz>
  2007-07-08 13:55     ` Rafael J. Wysocki
@ 2007-07-09  4:21     ` Jeremy Maitin-Shepard
  2007-07-09 14:45       ` Alan Stern
  1 sibling, 1 reply; 13+ messages in thread
From: Jeremy Maitin-Shepard @ 2007-07-09  4:21 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, Miklos Szeredi, LKML, pm list, Ingo Molnar

Pavel Machek <pavel@ucw.cz> writes:

[snip]

> I don't know how to do that mechanism... but if we knew where to trap
> filesystem writes, we could simply freeze at that point, and at that
> point only, no?

Any operation at all that has an external effect must not occur after
the snapshot is made; otherwise, there will be random hard-to-find
corruptions and other problems occurring as a result.  Thus, for
example, any writes (either directly or indirectly through e.g. a
filesystem) to non-volatile storage, any network traffic, any
communication with hardware like a printer must be prevented after the
snapshot.  It seems, though, that in general the kernel will have no way
to know which operations are safe, and which are not safe.

(This is why the whole "proper filesystem snapshot support is the
solution" argument is bogus.)

-- 
Jeremy Maitin-Shepard


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
  2007-07-09  4:21     ` Jeremy Maitin-Shepard
@ 2007-07-09 14:45       ` Alan Stern
  0 siblings, 0 replies; 13+ messages in thread
From: Alan Stern @ 2007-07-09 14:45 UTC (permalink / raw)
  To: Jeremy Maitin-Shepard
  Cc: Matthew Garrett, Miklos Szeredi, LKML, Pavel Machek, pm list,
	Ingo Molnar

On Mon, 9 Jul 2007, Jeremy Maitin-Shepard wrote:

> Pavel Machek <pavel@ucw.cz> writes:
> 
> [snip]
> 
> > I don't know how to do that mechanism... but if we knew where to trap
> > filesystem writes, we could simply freeze at that point, and at that
> > point only, no?
> 
> Any operation at all that has an external effect must not occur after
> the snapshot is made; otherwise, there will be random hard-to-find
> corruptions and other problems occurring as a result.  Thus, for
> example, any writes (either directly or indirectly through e.g. a
> filesystem) to non-volatile storage, any network traffic, any
> communication with hardware like a printer must be prevented after the
> snapshot.

You have forgotten one critical point: The writes to save the snapshot 
image must be allowed.  That's what makes it really hard.
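
Stated as a predicate, the requirement looks simple.  The sketch below assumes
every request carries a trustworthy origin tag; that tag and all names here
are hypothetical, and providing such a tag on shared I/O infrastructure is
exactly the hard part.

```c
#include <stdbool.h>

/* Hypothetical origin tag carried by each I/O request. */
enum io_origin { IO_NORMAL, IO_HIBERNATION_IMAGE };

/* After the snapshot, only writes that save the image may pass. */
static bool io_allowed(bool snapshot_taken, enum io_origin origin)
{
	if (!snapshot_taken)
		return true;
	return origin == IO_HIBERNATION_IMAGE;
}
```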

Alan Stern


* Re: [RFC][PATCH -mm] Freezer: Handle uninterruptible tasks
       [not found] <Pine.LNX.4.44L0.0707091043480.3851-100000@iolanthe.rowland.org>
@ 2007-07-09 15:36 ` Jeremy Maitin-Shepard
  0 siblings, 0 replies; 13+ messages in thread
From: Jeremy Maitin-Shepard @ 2007-07-09 15:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Matthew Garrett, Miklos Szeredi, LKML, Pavel Machek, pm list,
	Ingo Molnar

Alan Stern <stern@rowland.harvard.edu> writes:

> On Mon, 9 Jul 2007, Jeremy Maitin-Shepard wrote:
>> Pavel Machek <pavel@ucw.cz> writes:
>> 
>> [snip]
>> 
>> > I don't know how to do that mechanism... but if we knew where to trap
>> > filesystem writes, we could simply freeze at that point, and at that
>> > point only, no?
>> 
>> Any operation at all that has an external effect must not occur after
>> the snapshot is made; otherwise, there will be random hard-to-find
>> corruptions and other problems occurring as a result.  Thus, for
>> example, any writes (either directly or indirectly through e.g. a
>> filesystem) to non-volatile storage, any network traffic, any
>> communication with hardware like a printer must be prevented after the
>> snapshot.

> You have forgotten one critical point: The writes to save the snapshot 
> image must be allowed.  That's what makes it really hard.

Well, I didn't forget about that, although my language may have been a
bit ambiguous.  I was referring only to the operations that are done by
normal (i.e. non-hibernate) portions of the system and which are not
explicitly for the purpose of hibernating the system.  It is very
difficult to maintain this guarantee while also attempting to reuse the
same infrastructure that is supposed to not be processing any "normal"
requests in order to write the snapshot.  The kdump approach handily
avoids this problem by *not* reusing the same infrastructure while still
allowing complete flexibility (i.e. not depending on a
drivers/suspend/ide-simple).

-- 
Jeremy Maitin-Shepard
