[RFC][PATCH -mm 0/2] Freezer: Use wait queue instead of busy looping

public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC][PATCH -mm 0/2] Freezer: Use wait queue instead of busy looping
@ 2007-07-25 12:01 Rafael J. Wysocki
  2007-07-25 12:03 ` [RFC][PATCH -mm 1/2] Freezer: Be more verbose Rafael J. Wysocki
  2007-07-25 12:09 ` [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
  0 siblings, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-25 12:01 UTC (permalink / raw)
  To: pm list; +Cc: Nigel Cunningham, Oleg Nesterov, Pavel Machek

Hi,

I thought it would be a good idea to make try_to_freeze_tasks() use a wait
queue instead of looping while tasks are supposed to be entering the
refrigerator.  The details are in the second patch changelog.

The first patch is the known "make it more verbose" thing.

Comments welcome.

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH -mm 1/2] Freezer: Be more verbose
  2007-07-25 12:01 [RFC][PATCH -mm 0/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
@ 2007-07-25 12:03 ` Rafael J. Wysocki
  2007-07-25 12:27   ` Pavel Machek
  2007-07-25 12:09 ` [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-25 12:03 UTC (permalink / raw)
  To: pm list; +Cc: Nigel Cunningham, Oleg Nesterov, Pavel Machek

From: Rafael J. Wysocki <rjw@sisk.pl>

Increase the freezer's verbosity a bit, so that it's easier to read problem
reports related to it.

This is necessary for the next patch.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
---
 kernel/power/process.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Index: linux-2.6.23-rc1/kernel/power/process.c
===================================================================
--- linux-2.6.23-rc1.orig/kernel/power/process.c	2007-07-24 00:13:46.000000000 +0200
+++ linux-2.6.23-rc1/kernel/power/process.c	2007-07-24 00:13:58.000000000 +0200
@@ -227,18 +227,21 @@ int freeze_processes(void)
 {
 	int error;
 
-	printk("Stopping tasks ... ");
+	printk("Freezing user space processes ... ");
 	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
 	if (error)
-		return error;
+		goto Exit;
+	printk("done.\n");
 
+	printk("Freezing remaining freezable tasks ... ");
 	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
 	if (error)
-		return error;
-
-	printk("done.\n");
+		goto Exit;
+	printk("done.");
+ Exit:
 	BUG_ON(in_atomic());
-	return 0;
+	printk("\n");
+	return error;
 }
 
 static void thaw_tasks(int thaw_user_space)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-25 12:01 [RFC][PATCH -mm 0/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
  2007-07-25 12:03 ` [RFC][PATCH -mm 1/2] Freezer: Be more verbose Rafael J. Wysocki
@ 2007-07-25 12:09 ` Rafael J. Wysocki
  2007-07-25 12:28   ` Pavel Machek
  2007-07-25 13:29   ` Oleg Nesterov
  1 sibling, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-25 12:09 UTC (permalink / raw)
  To: pm list; +Cc: Nigel Cunningham, Oleg Nesterov, Pavel Machek

From: Rafael J. Wysocki <rjw@sisk.pl>

Use the observation that try_to_freeze() need not loop while waiting for the
freezing tasks to enter the refrigerator and make it use a wait queue.

The idea is that after sending freeze requests to the tasks regarded as
freezable try_to_freeze() can go to sleep and wait until at least one task
enters the refrigerator.  The first task that does it wakes up try_to_freeze()
and the procedure is repeated.  If the refrigerator is not entered by any tasks
before TIMEOUT expires, try_to_freeze() increases the counter of expired
timeouts and sends freeze requests to the remaining tasks.  If the number of
expired timeouts becomes greater than MAX_WAITS, the freezing of tasks fails
(the counter of expired timeouts is reset whenever a task enters the
refrigerator).

This way, try_to_freeze() doesn't occupy the CPU unnecessarily when some
freezing tasks are waiting for I/O to complete and we have more fine grained
control over the freezing procedure.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
---
 kernel/power/process.c |   70 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 59 insertions(+), 11 deletions(-)

Index: linux-2.6.23-rc1/kernel/power/process.c
===================================================================
--- linux-2.6.23-rc1.orig/kernel/power/process.c
+++ linux-2.6.23-rc1/kernel/power/process.c
@@ -13,11 +13,22 @@
 #include <linux/module.h>
 #include <linux/syscalls.h>
 #include <linux/freezer.h>
+#include <linux/time.h>
 
 /* 
- * Timeout for stopping processes
+ * Time to wait until one or more tasks enter the refrigerator after sending
+ * freeze requests to them.
  */
-#define TIMEOUT	(20 * HZ)
+#define TIMEOUT (HZ / 5)
+
+/*
+ * Each time after sending freeze requests to tasks the freezer will wait until
+ * some of them enter the refrigerater, but no longer than TIMEOUT.  If TIMEOUT
+ * has been exceeded, the freezer increases the number of waits by one and
+ * repeats.  If the number of waits becomes greater than MAX_WAITS, the
+ * freezing fails.
+ */
+#define MAX_WAITS 5
 
 #define FREEZER_KERNEL_THREADS 0
 #define FREEZER_USER_SPACE 1
@@ -43,6 +54,18 @@ static inline void frozen_process(void)
 	clear_freeze_flag(current);
 }
 
+/*
+ * Wait queue head used by try_to_freeze_tasks() to wait for tasks to enter the
+ * refrigerator.
+ */
+static DECLARE_WAIT_QUEUE_HEAD(refrigerator_waitq);
+
+/*
+ * Used to signal try_to_freeze_tasks() that the refrigerator has been entered
+ * by a task.
+ */
+static int refrigerator_called;
+
 /* Refrigerator is place where frozen processes are stored :-). */
 void refrigerator(void)
 {
@@ -50,6 +73,9 @@ void refrigerator(void)
 	   processes around? */
 	long save;
 
+	refrigerator_called = 1;
+	wake_up(&refrigerator_waitq);
+
 	task_lock(current);
 	if (freezing(current)) {
 		frozen_process();
@@ -166,10 +192,16 @@ static void cancel_freezing(struct task_
 static int try_to_freeze_tasks(int freeze_user_space)
 {
 	struct task_struct *g, *p;
-	unsigned long end_time;
-	unsigned int todo;
+	unsigned int todo, waits;
+	unsigned long ret;
+	struct timeval start, end;
+	s64 elapsed_csecs64;
+	unsigned int elapsed_csecs;
+
+	do_gettimeofday(&start);
 
-	end_time = jiffies + TIMEOUT;
+	refrigerator_called = 0;
+	waits = 0;
 	do {
 		todo = 0;
 		read_lock(&tasklist_lock);
@@ -189,11 +221,25 @@ static int try_to_freeze_tasks(int freez
 				todo++;
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
-		yield();			/* Yield is okay here */
-		if (time_after(jiffies, end_time))
-			break;
+
+		if (todo) {
+			ret = wait_event_timeout(refrigerator_waitq,
+						refrigerator_called, TIMEOUT);
+			if (!ret) {
+				if (++waits > MAX_WAITS)
+					break;
+			} else {
+				refrigerator_called = 0;
+				waits = 0;
+			}
+		}
 	} while (todo);
 
+	do_gettimeofday(&end);
+	elapsed_csecs64 = timeval_to_ns(&end) - timeval_to_ns(&start);
+	do_div(elapsed_csecs64, NSEC_PER_SEC / 100);
+	elapsed_csecs = elapsed_csecs64;
+
 	if (todo) {
 		/* This does not unfreeze processes that are already frozen
 		 * (we have slightly ugly calling convention in that respect,
@@ -201,10 +247,9 @@ static int try_to_freeze_tasks(int freez
 		 * but it cleans up leftover PF_FREEZE requests.
 		 */
 		printk("\n");
-		printk(KERN_ERR "Freezing of %s timed out after %d seconds "
+		printk(KERN_ERR "Freezing of tasks failed after %d.%d seconds "
 				"(%d tasks refusing to freeze):\n",
-				freeze_user_space ? "user space " : "tasks ",
-				TIMEOUT / HZ, todo);
+				elapsed_csecs / 100, elapsed_csecs % 100, todo);
 		show_state();
 		read_lock(&tasklist_lock);
 		do_each_thread(g, p) {
@@ -215,6 +260,9 @@ static int try_to_freeze_tasks(int freez
 			task_unlock(p);
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
+	} else {
+		printk("(elapsed %d.%d seconds) ", elapsed_csecs / 100,
+			elapsed_csecs % 100);
 	}
 
 	return todo ? -EBUSY : 0;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 1/2] Freezer: Be more verbose
  2007-07-25 12:03 ` [RFC][PATCH -mm 1/2] Freezer: Be more verbose Rafael J. Wysocki
@ 2007-07-25 12:27   ` Pavel Machek
  0 siblings, 0 replies; 28+ messages in thread
From: Pavel Machek @ 2007-07-25 12:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: pm list, Nigel Cunningham, Oleg Nesterov

On Wed 2007-07-25 14:03:14, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Increase the freezer's verbosity a bit, so that it's easier to read problem
> reports related to it.
> 
> This is necessary for the next patch.

No problem with this one. ACK.


> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-25 12:09 ` [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
@ 2007-07-25 12:28   ` Pavel Machek
  2007-07-25 12:55     ` Rafael J. Wysocki
  2007-07-25 13:29   ` Oleg Nesterov
  1 sibling, 1 reply; 28+ messages in thread
From: Pavel Machek @ 2007-07-25 12:28 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: pm list, Nigel Cunningham, Oleg Nesterov

Hi!

> Use the observation that try_to_freeze() need not loop while waiting for the
> freezing tasks to enter the refrigerator and make it use a wait queue.
> 
> The idea is that after sending freeze requests to the tasks regarded as
> freezable try_to_freeze() can go to sleep and wait until at least one task
> enters the refrigerator.  The first task that does it wakes up try_to_freeze()
> and the procedure is repeated.  If the refrigerator is not entered by any tasks
> before TIMEOUT expires, try_to_freeze() increases the counter of expired
> timeouts and sends freeze requests to the remaining tasks.  If the number of
> expired timeouts becomes greater than MAX_WAITS, the freezing of tasks fails
> (the counter of expired timeouts is reset whenever a task enters the
> refrigerator).
> 
> This way, try_to_freeze() doesn't occupy the CPU unnecessarily when some
> freezing tasks are waiting for I/O to complete and we have more fine grained
> control over the freezing procedure.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
> ---
>  kernel/power/process.c |   70 +++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 59 insertions(+), 11 deletions(-)

But is this really neccessary? It is not like the freezing phase of
suspend is particulary time critical, and this only makes it more
complex. We do not poll the task _that_ often that this matters,
right?

Or have you seen some speedups on some particulary perverse workload?

								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-25 12:28   ` Pavel Machek
@ 2007-07-25 12:55     ` Rafael J. Wysocki
  0 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-25 12:55 UTC (permalink / raw)
  To: Pavel Machek; +Cc: pm list, Nigel Cunningham, Oleg Nesterov

Hi,

On Wednesday, 25 July 2007 14:28, Pavel Machek wrote:
> Hi!
> 
> > Use the observation that try_to_freeze() need not loop while waiting for the
> > freezing tasks to enter the refrigerator and make it use a wait queue.
> > 
> > The idea is that after sending freeze requests to the tasks regarded as
> > freezable try_to_freeze() can go to sleep and wait until at least one task
> > enters the refrigerator.  The first task that does it wakes up try_to_freeze()
> > and the procedure is repeated.  If the refrigerator is not entered by any tasks
> > before TIMEOUT expires, try_to_freeze() increases the counter of expired
> > timeouts and sends freeze requests to the remaining tasks.  If the number of
> > expired timeouts becomes greater than MAX_WAITS, the freezing of tasks fails
> > (the counter of expired timeouts is reset whenever a task enters the
> > refrigerator).
> > 
> > This way, try_to_freeze() doesn't occupy the CPU unnecessarily when some
> > freezing tasks are waiting for I/O to complete and we have more fine grained
> > control over the freezing procedure.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
> > ---
> >  kernel/power/process.c |   70 +++++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 59 insertions(+), 11 deletions(-)
> 
> But is this really neccessary? It is not like the freezing phase of
> suspend is particulary time critical, and this only makes it more
> complex. We do not poll the task _that_ often that this matters,
> right?

No.  If the majority of tasks is frozen and there is only one or two waiting
on I/O, to freezer is the only thing that's running and burning the CPU.  What
for?

> Or have you seen some speedups on some particulary perverse workload?

OLPC?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-25 12:09 ` [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
  2007-07-25 12:28   ` Pavel Machek
@ 2007-07-25 13:29   ` Oleg Nesterov
  2007-07-25 14:03     ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Oleg Nesterov @ 2007-07-25 13:29 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: pm list, Nigel Cunningham, Pavel Machek

On 07/25, Rafael J. Wysocki wrote:
>
>  void refrigerator(void)
>  {
> @@ -50,6 +73,9 @@ void refrigerator(void)
>  	   processes around? */
>  	long save;
>
> +	refrigerator_called = 1;
> +	wake_up(&refrigerator_waitq);
> +

This is a bit racy. Unless I missed something, the task should not set
refrigerator_called == 1 until it has PF_FROZEN.

Otherwise, try_to_freeze_tasks() can set refrigerator_called == 0 after
refrigerator() sets it == 1, the the main loop notices this unfrozen task,
and goes to sleep.

(I must admit, I agree with Pavel on that patch).

Oleg.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-25 13:29   ` Oleg Nesterov
@ 2007-07-25 14:03     ` Rafael J. Wysocki
  2007-07-25 14:24       ` Oleg Nesterov
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-25 14:03 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Nigel Cunningham, Andres Salomon, Pavel Machek, pm list,
	Chris Ball, David Woodhouse

On Wednesday, 25 July 2007 15:29, Oleg Nesterov wrote:
> On 07/25, Rafael J. Wysocki wrote:
> >
> >  void refrigerator(void)
> >  {
> > @@ -50,6 +73,9 @@ void refrigerator(void)
> >  	   processes around? */
> >  	long save;
> >
> > +	refrigerator_called = 1;
> > +	wake_up(&refrigerator_waitq);
> > +
> 
> This is a bit racy. Unless I missed something, the task should not set
> refrigerator_called == 1 until it has PF_FROZEN.

No, it's just to signal that the task has entered the refrigerator, not that
it has actually frozen.

> Otherwise, try_to_freeze_tasks() can set refrigerator_called == 0 after
> refrigerator() sets it == 1, the the main loop notices this unfrozen task,
> and goes to sleep.

refrigerator_called is only reset after try_to_freeze_tasks() has found it
equal to one.  There is only a small window between checking it in
wait_event_timeout() and resetting it, but then we go to send freeze requests
to the remaining tasks and we count 'todo' from the start, so that shouldn't
be a problem.

> (I must admit, I agree with Pavel on that patch).

Well, I think it's a good idea (otherwise I wouldn't have posted it ;-)) and
the OLPC people really had a problem with try_to_freeze_tasks() looping too
fast (they reniced it to avoid this problem, but IMO the $subject patch is
nicer than that).

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-25 14:03     ` Rafael J. Wysocki
@ 2007-07-25 14:24       ` Oleg Nesterov
  2007-07-26 12:24         ` Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: Oleg Nesterov @ 2007-07-25 14:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, Pavel Machek, pm list,
	Chris Ball, David Woodhouse

On 07/25, Rafael J. Wysocki wrote:
>
> On Wednesday, 25 July 2007 15:29, Oleg Nesterov wrote:
> > On 07/25, Rafael J. Wysocki wrote:
> > >
> > >  void refrigerator(void)
> > >  {
> > > @@ -50,6 +73,9 @@ void refrigerator(void)
> > >  	   processes around? */
> > >  	long save;
> > >
> > > +	refrigerator_called = 1;
> > > +	wake_up(&refrigerator_waitq);
> > > +
> > 
> > This is a bit racy. Unless I missed something, the task should not set
> > refrigerator_called == 1 until it has PF_FROZEN.
> 
> No, it's just to signal that the task has entered the refrigerator, not that
> it has actually frozen.

Yes, I see.

> > Otherwise, try_to_freeze_tasks() can set refrigerator_called == 0 after
> > refrigerator() sets it == 1, the the main loop notices this unfrozen task,
> > and goes to sleep.
> 
> refrigerator_called is only reset after try_to_freeze_tasks() has found it
> equal to one.  There is only a small window between checking it in
> wait_event_timeout() and resetting it,

Yes.

> but then we go to send freeze requests
> to the remaining tasks and we count 'todo' from the start, so that shouldn't
> be a problem.

... and we find the task which is not frozen() yet, but which has already passed
the "set condition and wakeup", increment todo, and wait for the event. If it was
the last task, we will sleep until timeout.

I agree, this is not fatal and unlikely, but still it is a race. I think it is
better to move this code down, after frozen_process().

(offtopic: strictly speaking, we don't even need the "refrigerator_called", we
 only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
 to wq at the very start of the main loop).

Oleg.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-25 14:24       ` Oleg Nesterov
@ 2007-07-26 12:24         ` Rafael J. Wysocki
  2007-07-26 12:43           ` Rafael J. Wysocki
  2007-07-31  8:01           ` Pavel Machek
  0 siblings, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-26 12:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Nigel Cunningham, Andres Salomon, Pavel Machek, pm list,
	Chris Ball, David Woodhouse

On Wednesday, 25 July 2007 16:24, Oleg Nesterov wrote:
> On 07/25, Rafael J. Wysocki wrote:
> >
> > On Wednesday, 25 July 2007 15:29, Oleg Nesterov wrote:
> > > On 07/25, Rafael J. Wysocki wrote:
> > > >
> > > >  void refrigerator(void)
> > > >  {
> > > > @@ -50,6 +73,9 @@ void refrigerator(void)
> > > >  	   processes around? */
> > > >  	long save;
> > > >
> > > > +	refrigerator_called = 1;
> > > > +	wake_up(&refrigerator_waitq);
> > > > +
> > > 
> > > This is a bit racy. Unless I missed something, the task should not set
> > > refrigerator_called == 1 until it has PF_FROZEN.
> > 
> > No, it's just to signal that the task has entered the refrigerator, not that
> > it has actually frozen.
> 
> Yes, I see.
> 
> > > Otherwise, try_to_freeze_tasks() can set refrigerator_called == 0 after
> > > refrigerator() sets it == 1, the the main loop notices this unfrozen task,
> > > and goes to sleep.
> > 
> > refrigerator_called is only reset after try_to_freeze_tasks() has found it
> > equal to one.  There is only a small window between checking it in
> > wait_event_timeout() and resetting it,
> 
> Yes.
> 
> > but then we go to send freeze requests
> > to the remaining tasks and we count 'todo' from the start, so that shouldn't
> > be a problem.
> 
> ... and we find the task which is not frozen() yet, but which has already passed
> the "set condition and wakeup", increment todo, and wait for the event. If it was
> the last task, we will sleep until timeout.
> 
> I agree, this is not fatal and unlikely, but still it is a race. I think it is
> better to move this code down, after frozen_process().

OK, I see your point.  The updated patch is appended.

> (offtopic: strictly speaking, we don't even need the "refrigerator_called", we
>  only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
>  to wq at the very start of the main loop).

Hmm, yes, I think so.

Greetings,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>

Use the observation that try_to_freeze() need not loop while waiting for the
freezing tasks to enter the refrigerator and make it use a wait queue.

The idea is that after sending freeze requests to the tasks regarded as
freezable try_to_freeze() can go to sleep and wait until at least one task
enters the refrigerator.  The first task that does it wakes up try_to_freeze()
and the procedure is repeated.  If the refrigerator is not entered by any tasks
before TIMEOUT expires, try_to_freeze() increases the counter of expired
timeouts and sends freeze requests to the remaining tasks.  If the number of
expired timeouts becomes greater than MAX_WAITS, the freezing of tasks fails
(the counter of expired timeouts is reset whenever a task enters the
refrigerator).

This way, try_to_freeze() doesn't occupy the CPU unnecessarily when some
freezing tasks are waiting for I/O to complete and we have more fine grained
control over the freezing procedure.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   71 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 60 insertions(+), 11 deletions(-)

Index: linux-2.6.23-rc1/kernel/power/process.c
===================================================================
--- linux-2.6.23-rc1.orig/kernel/power/process.c
+++ linux-2.6.23-rc1/kernel/power/process.c
@@ -13,11 +13,22 @@
 #include <linux/module.h>
 #include <linux/syscalls.h>
 #include <linux/freezer.h>
+#include <linux/time.h>
 
 /* 
- * Timeout for stopping processes
+ * Time to wait until one or more tasks enter the refrigerator after sending
+ * freeze requests to them.
  */
-#define TIMEOUT	(20 * HZ)
+#define TIMEOUT (HZ / 5)
+
+/*
+ * Each time after sending freeze requests to tasks the freezer will wait until
+ * some of them enter the refrigerater, but no longer than TIMEOUT.  If TIMEOUT
+ * has been exceeded, the freezer increases the number of waits by one and
+ * repeats.  If the number of waits becomes greater than MAX_WAITS, the
+ * freezing fails.
+ */
+#define MAX_WAITS 5
 
 #define FREEZER_KERNEL_THREADS 0
 #define FREEZER_USER_SPACE 1
@@ -43,6 +54,18 @@ static inline void frozen_process(void)
 	clear_freeze_flag(current);
 }
 
+/*
+ * Wait queue head used by try_to_freeze_tasks() to wait for tasks to enter the
+ * refrigerator.
+ */
+static DECLARE_WAIT_QUEUE_HEAD(refrigerator_waitq);
+
+/*
+ * Used to signal try_to_freeze_tasks() that the refrigerator has been entered
+ * by a task.
+ */
+static int refrigerator_called;
+
 /* Refrigerator is place where frozen processes are stored :-). */
 void refrigerator(void)
 {
@@ -58,6 +81,10 @@ void refrigerator(void)
 		task_unlock(current);
 		return;
 	}
+
+	refrigerator_called = 1;
+	wake_up(&refrigerator_waitq);
+
 	save = current->state;
 	pr_debug("%s entered refrigerator\n", current->comm);
 
@@ -166,10 +193,16 @@ static void cancel_freezing(struct task_
 static int try_to_freeze_tasks(int freeze_user_space)
 {
 	struct task_struct *g, *p;
-	unsigned long end_time;
-	unsigned int todo;
+	unsigned int todo, waits;
+	unsigned long ret;
+	struct timeval start, end;
+	s64 elapsed_csecs64;
+	unsigned int elapsed_csecs;
+
+	do_gettimeofday(&start);
 
-	end_time = jiffies + TIMEOUT;
+	refrigerator_called = 0;
+	waits = 0;
 	do {
 		todo = 0;
 		read_lock(&tasklist_lock);
@@ -189,11 +222,25 @@ static int try_to_freeze_tasks(int freez
 				todo++;
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
-		yield();			/* Yield is okay here */
-		if (time_after(jiffies, end_time))
-			break;
+
+		if (todo) {
+			ret = wait_event_timeout(refrigerator_waitq,
+						refrigerator_called, TIMEOUT);
+			if (!ret) {
+				if (++waits > MAX_WAITS)
+					break;
+			} else {
+				refrigerator_called = 0;
+				waits = 0;
+			}
+		}
 	} while (todo);
 
+	do_gettimeofday(&end);
+	elapsed_csecs64 = timeval_to_ns(&end) - timeval_to_ns(&start);
+	do_div(elapsed_csecs64, NSEC_PER_SEC / 100);
+	elapsed_csecs = elapsed_csecs64;
+
 	if (todo) {
 		/* This does not unfreeze processes that are already frozen
 		 * (we have slightly ugly calling convention in that respect,
@@ -201,10 +248,9 @@ static int try_to_freeze_tasks(int freez
 		 * but it cleans up leftover PF_FREEZE requests.
 		 */
 		printk("\n");
-		printk(KERN_ERR "Freezing of %s timed out after %d seconds "
+		printk(KERN_ERR "Freezing of tasks failed after %d.%d seconds "
 				"(%d tasks refusing to freeze):\n",
-				freeze_user_space ? "user space " : "tasks ",
-				TIMEOUT / HZ, todo);
+				elapsed_csecs / 100, elapsed_csecs % 100, todo);
 		show_state();
 		read_lock(&tasklist_lock);
 		do_each_thread(g, p) {
@@ -215,6 +261,9 @@ static int try_to_freeze_tasks(int freez
 			task_unlock(p);
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
+	} else {
+		printk("(elapsed %d.%d seconds) ", elapsed_csecs / 100,
+			elapsed_csecs % 100);
 	}
 
 	return todo ? -EBUSY : 0;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-26 12:24         ` Rafael J. Wysocki
@ 2007-07-26 12:43           ` Rafael J. Wysocki
  2007-07-31  8:01           ` Pavel Machek
  1 sibling, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-26 12:43 UTC (permalink / raw)
  To: linux-pm
  Cc: Nigel Cunningham, Andres Salomon, Pavel Machek, Chris Ball,
	David Woodhouse, Oleg Nesterov

On Thursday, 26 July 2007 14:24, Rafael J. Wysocki wrote:
> On Wednesday, 25 July 2007 16:24, Oleg Nesterov wrote:
> > On 07/25, Rafael J. Wysocki wrote:
> > >
> > > On Wednesday, 25 July 2007 15:29, Oleg Nesterov wrote:
> > > > On 07/25, Rafael J. Wysocki wrote:
> > > > >
> > > > >  void refrigerator(void)
> > > > >  {
> > > > > @@ -50,6 +73,9 @@ void refrigerator(void)
> > > > >  	   processes around? */
> > > > >  	long save;
> > > > >
> > > > > +	refrigerator_called = 1;
> > > > > +	wake_up(&refrigerator_waitq);
> > > > > +
> > > > 
> > > > This is a bit racy. Unless I missed something, the task should not set
> > > > refrigerator_called == 1 until it has PF_FROZEN.
> > > 
> > > No, it's just to signal that the task has entered the refrigerator, not that
> > > it has actually frozen.
> > 
> > Yes, I see.
> > 
> > > > Otherwise, try_to_freeze_tasks() can set refrigerator_called == 0 after
> > > > refrigerator() sets it == 1, the the main loop notices this unfrozen task,
> > > > and goes to sleep.
> > > 
> > > refrigerator_called is only reset after try_to_freeze_tasks() has found it
> > > equal to one.  There is only a small window between checking it in
> > > wait_event_timeout() and resetting it,
> > 
> > Yes.
> > 
> > > but then we go to send freeze requests
> > > to the remaining tasks and we count 'todo' from the start, so that shouldn't
> > > be a problem.
> > 
> > ... and we find the task which is not frozen() yet, but which has already passed
> > the "set condition and wakeup", increment todo, and wait for the event. If it was
> > the last task, we will sleep until timeout.
> > 
> > I agree, this is not fatal and unlikely, but still it is a race. I think it is
> > better to move this code down, after frozen_process().
> 
> OK, I see your point.  The updated patch is appended.
> 
> > (offtopic: strictly speaking, we don't even need the "refrigerator_called", we
> >  only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
> >  to wq at the very start of the main loop).
> 
> Hmm, yes, I think so.
> 
> Greetings,
> Rafael
> 
> 
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Use the observation that try_to_freeze() need not loop while waiting for the
> freezing tasks to enter the refrigerator and make it use a wait queue.
> 
> The idea is that after sending freeze requests to the tasks regarded as
> freezable try_to_freeze() can go to sleep and wait until at least one task
> enters the refrigerator.  The first task that does it wakes up try_to_freeze()
> and the procedure is repeated.  If the refrigerator is not entered by any tasks
> before TIMEOUT expires, try_to_freeze() increases the counter of expired
> timeouts and sends freeze requests to the remaining tasks.  If the number of
> expired timeouts becomes greater than MAX_WAITS, the freezing of tasks fails
> (the counter of expired timeouts is reset whenever a task enters the
> refrigerator).
> 
> This way, try_to_freeze() doesn't occupy the CPU unnecessarily when some
> freezing tasks are waiting for I/O to complete and we have more fine grained
> control over the freezing procedure.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---

Well, s/try_to_freeze()/try_to_freeze_tasks()/ all in the changelog above.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-26 12:24         ` Rafael J. Wysocki
  2007-07-26 12:43           ` Rafael J. Wysocki
@ 2007-07-31  8:01           ` Pavel Machek
  2007-07-31  9:39             ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Pavel Machek @ 2007-07-31  8:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, pm list, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi!

> > > refrigerator_called is only reset after try_to_freeze_tasks() has found it
> > > equal to one.  There is only a small window between checking it in
> > > wait_event_timeout() and resetting it,
> > 
> > Yes.
> > 
> > > but then we go to send freeze requests
> > > to the remaining tasks and we count 'todo' from the start, so that shouldn't
> > > be a problem.
> > 
> > ... and we find the task which is not frozen() yet, but which has already passed
> > the "set condition and wakeup", increment todo, and wait for the event. If it was
> > the last task, we will sleep until timeout.
> > 
> > I agree, this is not fatal and unlikely, but still it is a race. I think it is
> > better to move this code down, after frozen_process().
> 
> OK, I see your point.  The updated patch is appended.
> 
> > (offtopic: strictly speaking, we don't even need the "refrigerator_called", we
> >  only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
> >  to wq at the very start of the main loop).
> 
> Hmm, yes, I think so.

Ok, could we just do schedule_timeout(HZ/10) or something, but when we
_know_ we woke someone, wakeup() that task? No new variables, keep
existing logic. That should still get most of the benefits, and be two
liner, no?
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-31  8:01           ` Pavel Machek
@ 2007-07-31  9:39             ` Rafael J. Wysocki
  2007-07-31 10:00               ` Pavel Machek
  2007-07-31 10:08               ` Rafael J. Wysocki
  0 siblings, 2 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-31  9:39 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, pm list, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi,

On Tuesday, 31 July 2007 10:01, Pavel Machek wrote:
> Hi!
> 
> > > > refrigerator_called is only reset after try_to_freeze_tasks() has found it
> > > > equal to one.  There is only a small window between checking it in
> > > > wait_event_timeout() and resetting it,
> > > 
> > > Yes.
> > > 
> > > > but then we go to send freeze requests
> > > > to the remaining tasks and we count 'todo' from the start, so that shouldn't
> > > > be a problem.
> > > 
> > > ... and we find the task which is not frozen() yet, but which has already passed
> > > the "set condition and wakeup", increment todo, and wait for the event. If it was
> > > the last task, we will sleep until timeout.
> > > 
> > > I agree, this is not fatal and unlikely, but still it is a race. I think it is
> > > better to move this code down, after frozen_process().
> > 
> > OK, I see your point.  The updated patch is appended.
> > 
> > > (offtopic: strictly speaking, we don't even need the "refrigerator_called", we
> > >  only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
> > >  to wq at the very start of the main loop).
> > 
> > Hmm, yes, I think so.
> 
> Ok, could we just do schedule_timeout(HZ/10) or something, but when we
> _know_ we woke someone, wakeup() that task? No new variables, keep
> existing logic.

The logic doesn't change that much. :-)

> That should still get most of the benefits, and be two liner, no?

Well, I think we can avoid using refrigerator_called, if this is a problem, but
the patch won't be a two liner.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-31  9:39             ` Rafael J. Wysocki
@ 2007-07-31 10:00               ` Pavel Machek
  2007-07-31 10:17                 ` Rafael J. Wysocki
  2007-07-31 10:08               ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Pavel Machek @ 2007-07-31 10:00 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, pm list, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi!

> > Ok, could we just do schedule_timeout(HZ/10) or something, but when we
> > _know_ we woke someone, wakeup() that task? No new variables, keep
> > existing logic.
> 
> The logic doesn't change that much. :-)
> 
> > That should still get most of the benefits, and be two liner, no?
> 
> Well, I think we can avoid using refrigerator_called, if this is a problem, but
> the patch won't be a two liner.

Sure? The current process is ineffective because it polls. If we add
schedule_timeout(HZ/10), we'll have still correct freezer (and it will
not waste power any more), but it will delay freezing by HZ/20 on
average.

If we add wakeup() at strategic place, we should get rid of that delay...

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-31 10:08               ` Rafael J. Wysocki
@ 2007-07-31 10:02                 ` Pavel Machek
  2007-07-31 22:25                   ` [RFC][PATCH -mm 0/3] Freezer: Use wait queue instead of busy looping (updated) Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: Pavel Machek @ 2007-07-31 10:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

On Tue 2007-07-31 12:08:40, Rafael J. Wysocki wrote:
> On Tuesday, 31 July 2007 11:39, Rafael J. Wysocki wrote:
> > Hi,
> > 
> > On Tuesday, 31 July 2007 10:01, Pavel Machek wrote:
> > > Hi!
> > > 
> > > > > > refrigerator_called is only reset after try_to_freeze_tasks() has found it
> > > > > > equal to one.  There is only a small window between checking it in
> > > > > > wait_event_timeout() and resetting it,
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > but then we go to send freeze requests
> > > > > > to the remaining tasks and we count 'todo' from the start, so that shouldn't
> > > > > > be a problem.
> > > > > 
> > > > > ... and we find the task which is not frozen() yet, but which has already passed
> > > > > the "set condition and wakeup", increment todo, and wait for the event. If it was
> > > > > the last task, we will sleep until timeout.
> > > > > 
> > > > > I agree, this is not fatal and unlikely, but still it is a race. I think it is
> > > > > better to move this code down, after frozen_process().
> > > > 
> > > > OK, I see your point.  The updated patch is appended.
> > > > 
> > > > > (offtopic: strictly speaking, we don't even need the "refrigerator_called", we
> > > > >  only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
> > > > >  to wq at the very start of the main loop).
> > > > 
> > > > Hmm, yes, I think so.
> > > 
> > > Ok, could we just do schedule_timeout(HZ/10) or something, but when we
> > > _know_ we woke someone, wakeup() that task? No new variables, keep
> > > existing logic.
> > 
> > The logic doesn't change that much. :-)
> > 
> > > That should still get most of the benefits, and be two liner, no?
> > 
> > Well, I think we can avoid using refrigerator_called, if this is a problem, but
> > the patch won't be a two liner.
> 
> To be precise, we'd need to add current to the wait queue manually, which
> would require us to open code wait_event_timeout(), more or less.
> 
> Still, maybe to many things are done in this patch at a time.  I'll try to
> split it into smaller steps. :-)

Ok, whatever.

Hmm, you could be sneaky and send signals to refrigerator, that should
trigger early return from schedule_timeout()... <runs away, hides>
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-31  9:39             ` Rafael J. Wysocki
  2007-07-31 10:00               ` Pavel Machek
@ 2007-07-31 10:08               ` Rafael J. Wysocki
  2007-07-31 10:02                 ` Pavel Machek
  1 sibling, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-31 10:08 UTC (permalink / raw)
  To: linux-pm
  Cc: Nigel Cunningham, Andres Salomon, Pavel Machek, Chris Ball,
	David Woodhouse, Oleg Nesterov

On Tuesday, 31 July 2007 11:39, Rafael J. Wysocki wrote:
> Hi,
> 
> On Tuesday, 31 July 2007 10:01, Pavel Machek wrote:
> > Hi!
> > 
> > > > > refrigerator_called is only reset after try_to_freeze_tasks() has found it
> > > > > equal to one.  There is only a small window between checking it in
> > > > > wait_event_timeout() and resetting it,
> > > > 
> > > > Yes.
> > > > 
> > > > > but then we go to send freeze requests
> > > > > to the remaining tasks and we count 'todo' from the start, so that shouldn't
> > > > > be a problem.
> > > > 
> > > > ... and we find the task which is not frozen() yet, but which has already passed
> > > > the "set condition and wakeup", increment todo, and wait for the event. If it was
> > > > the last task, we will sleep until timeout.
> > > > 
> > > > I agree, this is not fatal and unlikely, but still it is a race. I think it is
> > > > better to move this code down, after frozen_process().
> > > 
> > > OK, I see your point.  The updated patch is appended.
> > > 
> > > > (offtopic: strictly speaking, we don't even need the "refrigerator_called", we
> > > >  only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
> > > >  to wq at the very start of the main loop).
> > > 
> > > Hmm, yes, I think so.
> > 
> > Ok, could we just do schedule_timeout(HZ/10) or something, but when we
> > _know_ we woke someone, wakeup() that task? No new variables, keep
> > existing logic.
> 
> The logic doesn't change that much. :-)
> 
> > That should still get most of the benefits, and be two liner, no?
> 
> Well, I think we can avoid using refrigerator_called, if this is a problem, but
> the patch won't be a two liner.

To be precise, we'd need to add current to the wait queue manually, which
would require us to open code wait_event_timeout(), more or less.

Still, maybe to many things are done in this patch at a time.  I'll try to
split it into smaller steps. :-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping
  2007-07-31 10:00               ` Pavel Machek
@ 2007-07-31 10:17                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-31 10:17 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, pm list, Chris Ball,
	David Woodhouse, Oleg Nesterov

On Tuesday, 31 July 2007 12:00, Pavel Machek wrote:
> Hi!
> 
> > > Ok, could we just do schedule_timeout(HZ/10) or something, but when we
> > > _know_ we woke someone, wakeup() that task? No new variables, keep
> > > existing logic.
> > 
> > The logic doesn't change that much. :-)
> > 
> > > That should still get most of the benefits, and be two liner, no?
> > 
> > Well, I think we can avoid using refrigerator_called, if this is a problem, but
> > the patch won't be a two liner.
> 
> Sure? The current process is ineffective because it polls. If we add
> schedule_timeout(HZ/10), we'll have still correct freezer (and it will
> not waste power any more), but it will delay freezing by HZ/20 on
> average.
> 
> If we add wakeup() at strategic place, we should get rid of that delay...

But we'll need to know what to wake up ...

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH -mm 0/3] Freezer: Use wait queue instead of busy looping (updated)
  2007-07-31 10:02                 ` Pavel Machek
@ 2007-07-31 22:25                   ` Rafael J. Wysocki
  2007-07-31 22:26                     ` [RFC][PATCH -mm 1/3] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
                                       ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-31 22:25 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

On Tuesday, 31 July 2007 12:02, Pavel Machek wrote:
> On Tue 2007-07-31 12:08:40, Rafael J. Wysocki wrote:
> > On Tuesday, 31 July 2007 11:39, Rafael J. Wysocki wrote:
> > > Hi,
> > > 
> > > On Tuesday, 31 July 2007 10:01, Pavel Machek wrote:
> > > > Hi!
> > > > 
> > > > > > > refrigerator_called is only reset after try_to_freeze_tasks() has found it
> > > > > > > equal to one.  There is only a small window between checking it in
> > > > > > > wait_event_timeout() and resetting it,
> > > > > > 
> > > > > > Yes.
> > > > > > 
> > > > > > > but then we go to send freeze requests
> > > > > > > to the remaining tasks and we count 'todo' from the start, so that shouldn't
> > > > > > > be a problem.
> > > > > > 
> > > > > > ... and we find the task which is not frozen() yet, but which has already passed
> > > > > > the "set condition and wakeup", increment todo, and wait for the event. If it was
> > > > > > the last task, we will sleep until timeout.
> > > > > > 
> > > > > > I agree, this is not fatal and unlikely, but still it is a race. I think it is
> > > > > > better to move this code down, after frozen_process().
> > > > > 
> > > > > OK, I see your point.  The updated patch is appended.
> > > > > 
> > > > > > (offtopic: strictly speaking, we don't even need the "refrigerator_called", we
> > > > > >  only need the wait_queue_head_t. try_to_freeze_tasks() just adds the "current"
> > > > > >  to wq at the very start of the main loop).
> > > > > 
> > > > > Hmm, yes, I think so.
> > > > 
> > > > Ok, could we just do schedule_timeout(HZ/10) or something, but when we
> > > > _know_ we woke someone, wakeup() that task? No new variables, keep
> > > > existing logic.
> > > 
> > > The logic doesn't change that much. :-)
> > > 
> > > > That should still get most of the benefits, and be two liner, no?
> > > 
> > > Well, I think we can avoid using refrigerator_called, if this is a problem, but
> > > the patch won't be a two liner.
> > 
> > To be precise, we'd need to add current to the wait queue manually, which
> > would require us to open code wait_event_timeout(), more or less.
> > 
> > Still, maybe to many things are done in this patch at a time.  I'll try to
> > split it into smaller steps. :-)
> 
> Ok, whatever.
> 
> Hmm, you could be sneaky and send signals to refrigerator, that should
> trigger early return from schedule_timeout()...

Hmm, did you mean *from* the refrigerator?

> <runs away, hides> 

Well, that wasn't really necessary. ;-)

The three patches in the next messages are functionally equivalent to this one.

They do the following:
* make try_to_freeze_tasks() wait instead of busy looping
* make try_to_freeze_tasks() measure the time it takes to freeze tasks
* replace the timeout with counting "waits"

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH -mm 1/3] Freezer: Use wait queue instead of busy looping
  2007-07-31 22:25                   ` [RFC][PATCH -mm 0/3] Freezer: Use wait queue instead of busy looping (updated) Rafael J. Wysocki
@ 2007-07-31 22:26                     ` Rafael J. Wysocki
  2007-08-01  7:59                       ` Pavel Machek
  2007-07-31 22:28                     ` [RFC][PATCH -mm 2/3] Freezer: Measure the time of freezing tasks Rafael J. Wysocki
  2007-07-31 22:29                     ` [RFC][PATCH -mm 3/3] Freezer: Replace the timeout Rafael J. Wysocki
  2 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-31 22:26 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

Use the observation that try_to_freeze_tasks() need not loop while waiting for
the freezing tasks to enter the refrigerator and make it use a wait queue.

The idea is that after sending freeze requests to the tasks regarded as
freezable try_to_freeze_tasks() can go to sleep and wait until at least one task
enters the refrigerator.  The first task that does it wakes up
try_to_freeze_tasks() and the procedure is repeated.  If the refrigerator is not
entered by any tasks before TIMEOUT expires, the freezing of tasks fails.

This way, try_to_freeze_tasks() doesn't occupy the CPU unnecessarily when some
freezing tasks are waiting for I/O to complete.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

Index: linux-2.6.23-rc1/kernel/power/process.c
===================================================================
--- linux-2.6.23-rc1.orig/kernel/power/process.c	2007-07-31 21:58:32.000000000 +0200
+++ linux-2.6.23-rc1/kernel/power/process.c	2007-07-31 23:00:43.000000000 +0200
@@ -19,6 +19,12 @@
  */
 #define TIMEOUT	(20 * HZ)
 
+/*
+ * Time to wait until one or more tasks enter the refrigerator after sending
+ * freeze requests to them.
+ */
+#define WAIT_TIME (HZ / 5)
+
 #define FREEZER_KERNEL_THREADS 0
 #define FREEZER_USER_SPACE 1
 
@@ -43,6 +49,18 @@ static inline void frozen_process(void)
 	clear_freeze_flag(current);
 }
 
+/*
+ * Wait queue head used by try_to_freeze_tasks() to wait for tasks to enter the
+ * refrigerator.
+ */
+static DECLARE_WAIT_QUEUE_HEAD(refrigerator_waitq);
+
+/*
+ * Used to notify try_to_freeze_tasks() that the refrigerator has been entered
+ * by a task.
+ */
+static int refrigerator_called;
+
 /* Refrigerator is place where frozen processes are stored :-). */
 void refrigerator(void)
 {
@@ -58,6 +76,10 @@ void refrigerator(void)
 		task_unlock(current);
 		return;
 	}
+
+	refrigerator_called = 1;
+	wake_up(&refrigerator_waitq);
+
 	save = current->state;
 	pr_debug("%s entered refrigerator\n", current->comm);
 
@@ -170,6 +192,7 @@ static int try_to_freeze_tasks(int freez
 	unsigned int todo;
 
 	end_time = jiffies + TIMEOUT;
+	refrigerator_called = 0;
 	do {
 		todo = 0;
 		read_lock(&tasklist_lock);
@@ -189,7 +212,16 @@ static int try_to_freeze_tasks(int freez
 				todo++;
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
-		yield();			/* Yield is okay here */
+
+		if (todo) {
+			unsigned long ret;
+
+			ret = wait_event_timeout(refrigerator_waitq,
+					refrigerator_called, WAIT_TIME);
+			if (ret)
+				refrigerator_called = 0;
+		}
+
 		if (time_after(jiffies, end_time))
 			break;
 	} while (todo);

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH -mm 2/3] Freezer: Measure the time of freezing tasks
  2007-07-31 22:25                   ` [RFC][PATCH -mm 0/3] Freezer: Use wait queue instead of busy looping (updated) Rafael J. Wysocki
  2007-07-31 22:26                     ` [RFC][PATCH -mm 1/3] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
@ 2007-07-31 22:28                     ` Rafael J. Wysocki
  2007-08-01  8:28                       ` Pavel Machek
  2007-07-31 22:29                     ` [RFC][PATCH -mm 3/3] Freezer: Replace the timeout Rafael J. Wysocki
  2 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-31 22:28 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

Measure the time of the freezing of tasks, even if it doesn't fail.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)
---
Index: linux-2.6.23-rc1/kernel/power/process.c
===================================================================
--- linux-2.6.23-rc1.orig/kernel/power/process.c	2007-07-31 23:00:43.000000000 +0200
+++ linux-2.6.23-rc1/kernel/power/process.c	2007-07-31 23:00:52.000000000 +0200
@@ -190,6 +190,11 @@ static int try_to_freeze_tasks(int freez
 	struct task_struct *g, *p;
 	unsigned long end_time;
 	unsigned int todo;
+	struct timeval start, end;
+	s64 elapsed_csecs64;
+	unsigned int elapsed_csecs;
+
+	do_gettimeofday(&start);
 
 	end_time = jiffies + TIMEOUT;
 	refrigerator_called = 0;
@@ -226,6 +231,11 @@ static int try_to_freeze_tasks(int freez
 			break;
 	} while (todo);
 
+	do_gettimeofday(&end);
+	elapsed_csecs64 = timeval_to_ns(&end) - timeval_to_ns(&start);
+	do_div(elapsed_csecs64, NSEC_PER_SEC / 100);
+	elapsed_csecs = elapsed_csecs64;
+
 	if (todo) {
 		/* This does not unfreeze processes that are already frozen
 		 * (we have slightly ugly calling convention in that respect,
@@ -233,10 +243,9 @@ static int try_to_freeze_tasks(int freez
 		 * but it cleans up leftover PF_FREEZE requests.
 		 */
 		printk("\n");
-		printk(KERN_ERR "Freezing of %s timed out after %d seconds "
+		printk(KERN_ERR "Freezing of tasks failed after %d.%d seconds "
 				"(%d tasks refusing to freeze):\n",
-				freeze_user_space ? "user space " : "tasks ",
-				TIMEOUT / HZ, todo);
+				elapsed_csecs / 100, elapsed_csecs % 100, todo);
 		show_state();
 		read_lock(&tasklist_lock);
 		do_each_thread(g, p) {
@@ -247,6 +256,9 @@ static int try_to_freeze_tasks(int freez
 			task_unlock(p);
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
+	} else {
+		printk("(elapsed %d.%d seconds) ", elapsed_csecs / 100,
+			elapsed_csecs % 100);
 	}
 
 	return todo ? -EBUSY : 0;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC][PATCH -mm 3/3] Freezer: Replace the timeout
  2007-07-31 22:25                   ` [RFC][PATCH -mm 0/3] Freezer: Use wait queue instead of busy looping (updated) Rafael J. Wysocki
  2007-07-31 22:26                     ` [RFC][PATCH -mm 1/3] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
  2007-07-31 22:28                     ` [RFC][PATCH -mm 2/3] Freezer: Measure the time of freezing tasks Rafael J. Wysocki
@ 2007-07-31 22:29                     ` Rafael J. Wysocki
  2007-08-01  8:31                       ` Pavel Machek
  2 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-07-31 22:29 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

Instead of using the global timeout, we can use a more fine grained method of
checking if the freezing of tasks should fail.  Namely, we can measure the time
in which no tasks have entered the refrigerator by counting the number of calls
to wait_event_timeout() in try_to_freeze_tasks() that have returned 0 (in a
row).

After sending freeze requests to the tasks regarded as freezable
try_to_freeze_tasks() goes to sleep and waits until at least one task enters the
refrigerator.  If the refrigerator is not entered by any tasks before WAIT_TIME
expires, try_to_freeze_tasks() increases the counter of expired timeouts and
sends freeze requests to the remaining tasks.  If the number of expired timeouts
becomes greater than MAX_WAITS, the freezing of tasks fails (the counter of
expired timeouts is reset whenever a task enters the refrigerator).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

Index: linux-2.6.23-rc1/kernel/power/process.c
===================================================================
--- linux-2.6.23-rc1.orig/kernel/power/process.c	2007-07-31 23:01:10.000000000 +0200
+++ linux-2.6.23-rc1/kernel/power/process.c	2007-07-31 23:12:34.000000000 +0200
@@ -14,17 +14,21 @@
 #include <linux/syscalls.h>
 #include <linux/freezer.h>
 
-/* 
- * Timeout for stopping processes
- */
-#define TIMEOUT	(20 * HZ)
-
 /*
  * Time to wait until one or more tasks enter the refrigerator after sending
  * freeze requests to them.
  */
 #define WAIT_TIME (HZ / 5)
 
+/*
+ * Each time after sending freeze requests to tasks the freezer will wait until
+ * some of them enter the refrigerater, but no longer than TIMEOUT.  If TIMEOUT
+ * has been exceeded, the freezer increases the number of waits by one and
+ * repeats.  If the number of waits becomes greater than MAX_WAITS, the
+ * freezing fails.
+ */
+#define MAX_WAITS 5
+
 #define FREEZER_KERNEL_THREADS 0
 #define FREEZER_USER_SPACE 1
 
@@ -188,16 +192,15 @@ static void cancel_freezing(struct task_
 static int try_to_freeze_tasks(int freeze_user_space)
 {
 	struct task_struct *g, *p;
-	unsigned long end_time;
-	unsigned int todo;
+	unsigned int todo, waits;
 	struct timeval start, end;
 	s64 elapsed_csecs64;
 	unsigned int elapsed_csecs;
 
 	do_gettimeofday(&start);
 
-	end_time = jiffies + TIMEOUT;
 	refrigerator_called = 0;
+	waits = 0;
 	do {
 		todo = 0;
 		read_lock(&tasklist_lock);
@@ -223,12 +226,14 @@ static int try_to_freeze_tasks(int freez
 
 			ret = wait_event_timeout(refrigerator_waitq,
 					refrigerator_called, WAIT_TIME);
-			if (ret)
+			if (ret) {
 				refrigerator_called = 0;
+				waits = 0;
+			} else {
+				if (++waits > MAX_WAITS)
+					break;
+			}
 		}
-
-		if (time_after(jiffies, end_time))
-			break;
 	} while (todo);
 
 	do_gettimeofday(&end);

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 1/3] Freezer: Use wait queue instead of busy looping
  2007-07-31 22:26                     ` [RFC][PATCH -mm 1/3] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
@ 2007-08-01  7:59                       ` Pavel Machek
  0 siblings, 0 replies; 28+ messages in thread
From: Pavel Machek @ 2007-08-01  7:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi!

> Use the observation that try_to_freeze_tasks() need not loop while waiting for
> the freezing tasks to enter the refrigerator and make it use a wait queue.
> 
> The idea is that after sending freeze requests to the tasks regarded as
> freezable try_to_freeze_tasks() can go to sleep and wait until at least one task
> enters the refrigerator.  The first task that does it wakes up
> try_to_freeze_tasks() and the procedure is repeated.  If the refrigerator is not
> entered by any tasks before TIMEOUT expires, the freezing of tasks fails.
> 
> This way, try_to_freeze_tasks() doesn't occupy the CPU unnecessarily when some
> freezing tasks are waiting for I/O to complete.

Ok, I guess this is as simple as it gets. ACK.
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 2/3] Freezer: Measure the time of freezing tasks
  2007-07-31 22:28                     ` [RFC][PATCH -mm 2/3] Freezer: Measure the time of freezing tasks Rafael J. Wysocki
@ 2007-08-01  8:28                       ` Pavel Machek
  0 siblings, 0 replies; 28+ messages in thread
From: Pavel Machek @ 2007-08-01  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi!

> Measure the time of the freezing of tasks, even if it doesn't fail.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

Ok, why not.

> @@ -233,10 +243,9 @@ static int try_to_freeze_tasks(int freez
>  		 * but it cleans up leftover PF_FREEZE requests.
>  		 */
>  		printk("\n");
> -		printk(KERN_ERR "Freezing of %s timed out after %d seconds "
> +		printk(KERN_ERR "Freezing of tasks failed after %d.%d seconds "

%d.%02d... or it will report bogus numbers.

>  				"(%d tasks refusing to freeze):\n",
> -				freeze_user_space ? "user space " : "tasks ",
> -				TIMEOUT / HZ, todo);
> +				elapsed_csecs / 100, elapsed_csecs % 100, todo);
>  		show_state();
>  		read_lock(&tasklist_lock);
>  		do_each_thread(g, p) {
> @@ -247,6 +256,9 @@ static int try_to_freeze_tasks(int freez
>  			task_unlock(p);
>  		} while_each_thread(g, p);
>  		read_unlock(&tasklist_lock);
> +	} else {
> +		printk("(elapsed %d.%d seconds) ", elapsed_csecs / 100,
> +			elapsed_csecs % 100);
>  	}

Same here. You have my ACK with that change.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 3/3] Freezer: Replace the timeout
  2007-07-31 22:29                     ` [RFC][PATCH -mm 3/3] Freezer: Replace the timeout Rafael J. Wysocki
@ 2007-08-01  8:31                       ` Pavel Machek
  2007-08-01 10:43                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: Pavel Machek @ 2007-08-01  8:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi!

> Instead of using the global timeout, we can use a more fine grained method of
> checking if the freezing of tasks should fail.  Namely, we can measure the time
> in which no tasks have entered the refrigerator by counting the number of calls
> to wait_event_timeout() in try_to_freeze_tasks() that have returned 0 (in a
> row).
> 
> After sending freeze requests to the tasks regarded as freezable
> try_to_freeze_tasks() goes to sleep and waits until at least one task enters the
> refrigerator.  If the refrigerator is not entered by any tasks before WAIT_TIME
> expires, try_to_freeze_tasks() increases the counter of expired timeouts and
> sends freeze requests to the remaining tasks.  If the number of expired timeouts
> becomes greater than MAX_WAITS, the freezing of tasks fails (the counter of
> expired timeouts is reset whenever a task enters the refrigerator).

I do not get logic behind this.

Old logic was "we give system 20 seconds to come into quiet state".

New logic is "if we do no progress within second, we fail"... which is
quite a big change. What happens on loaded ext3 filesystem, for
example? Bunch of userland tasks will wait on data to be synced to
disk, taking more than second, no?

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 3/3] Freezer: Replace the timeout
  2007-08-01  8:31                       ` Pavel Machek
@ 2007-08-01 10:43                         ` Rafael J. Wysocki
  2007-08-05 21:37                           ` Pavel Machek
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-08-01 10:43 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

On Wednesday, 1 August 2007 10:31, Pavel Machek wrote:
> Hi!
> 
> > Instead of using the global timeout, we can use a more fine grained method of
> > checking if the freezing of tasks should fail.  Namely, we can measure the time
> > in which no tasks have entered the refrigerator by counting the number of calls
> > to wait_event_timeout() in try_to_freeze_tasks() that have returned 0 (in a
> > row).
> > 
> > After sending freeze requests to the tasks regarded as freezable
> > try_to_freeze_tasks() goes to sleep and waits until at least one task enters the
> > refrigerator.  If the refrigerator is not entered by any tasks before WAIT_TIME
> > expires, try_to_freeze_tasks() increases the counter of expired timeouts and
> > sends freeze requests to the remaining tasks.  If the number of expired timeouts
> > becomes greater than MAX_WAITS, the freezing of tasks fails (the counter of
> > expired timeouts is reset whenever a task enters the refrigerator).
> 
> I do not get logic behind this.
> 
> Old logic was "we give system 20 seconds to come into quiet state".
> 
> New logic is "if we do no progress within second, we fail"... which is
> quite a big change.

Well, I agree, and that's why I wanted to separate this part from the two
previous patches ...

> What happens on loaded ext3 filesystem, for example? Bunch of userland tasks
> will wait on data to be synced to disk, taking more than second, no?

IMHO this only is a question of what the value of MAX_WAITS should be.
[I took 5 because it turned to be enough in my testing, but that could be 10 or
more.]

The point is that in 99.(9)% of cases the 20s timeout is unnecessary, because:
(1) most often we succeed within 1s
(2) if we are going to fail, we can say that we'll fail way before the 20s
    expires.
Now, the question is how we can check that we'll fail and this patch attempts
to use a simple machanism:
* measure the time in which no tasks have entered the refrigerator and if this
  time is long enough, we can safely assume the "blocking" tasks to be stuck
  somewhere and give up.
This isn't bullet proof, but it should cover the vast majority of cases.

Anyway, eventually, I'd like the freezer to detect failures relatively early,
so the user won't have to wait 20s each time it's going to fail.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 3/3] Freezer: Replace the timeout
  2007-08-01 10:43                         ` Rafael J. Wysocki
@ 2007-08-05 21:37                           ` Pavel Machek
  2007-08-05 22:38                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: Pavel Machek @ 2007-08-05 21:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi!

> > What happens on loaded ext3 filesystem, for example? Bunch of userland tasks
> > will wait on data to be synced to disk, taking more than second, no?
> 
> IMHO this only is a question of what the value of MAX_WAITS should be.
> [I took 5 because it turned to be enough in my testing, but that could be 10 or
> more.]
> 
> The point is that in 99.(9)% of cases the 20s timeout is unnecessary, because:
> (1) most often we succeed within 1s
> (2) if we are going to fail, we can say that we'll fail way before the 20s
>     expires.

Well, I've just reproduced 10seconds and 17seconds
time-to-freeze. Okay, I did 

make clean; time make -j 350

on kernel (2GB machine). Can you try with something similary evil?

> Anyway, eventually, I'd like the freezer to detect failures relatively early,
> so the user won't have to wait 20s each time it's going to fail.

It should not fail ;-). And failures are _really_ rare these days. Is
20second wait in case of kernel bug that bad? (FUSE case _is_ a kernel
bug, I'm just not sure how to solve it. It is still rare.)

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 3/3] Freezer: Replace the timeout
  2007-08-05 21:37                           ` Pavel Machek
@ 2007-08-05 22:38                             ` Rafael J. Wysocki
  2007-08-05 22:53                               ` Pavel Machek
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2007-08-05 22:38 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

On Sunday, 5 August 2007 23:37, Pavel Machek wrote:
> Hi!
> 
> > > What happens on loaded ext3 filesystem, for example? Bunch of userland tasks
> > > will wait on data to be synced to disk, taking more than second, no?
> > 
> > IMHO this only is a question of what the value of MAX_WAITS should be.
> > [I took 5 because it turned to be enough in my testing, but that could be 10 or
> > more.]
> > 
> > The point is that in 99.(9)% of cases the 20s timeout is unnecessary, because:
> > (1) most often we succeed within 1s
> > (2) if we are going to fail, we can say that we'll fail way before the 20s
> >     expires.
> 
> Well, I've just reproduced 10seconds and 17seconds
> time-to-freeze. Okay, I did 
> 
> make clean; time make -j 350
> 
> on kernel (2GB machine). Can you try with something similary evil?

I tested with 'make -j 100'.  More than that just makes the machine
unresponsive.
 
> > Anyway, eventually, I'd like the freezer to detect failures relatively early,
> > so the user won't have to wait 20s each time it's going to fail.
> 
> It should not fail ;-). And failures are _really_ rare these days. Is
> 20second wait in case of kernel bug that bad? (FUSE case _is_ a kernel
> bug, I'm just not sure how to solve it. It is still rare.)

Well, not very bad, but still.  Perhaps we should use a progress meter of
some kind to let the user know that something's going on. ;-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC][PATCH -mm 3/3] Freezer: Replace the timeout
  2007-08-05 22:38                             ` Rafael J. Wysocki
@ 2007-08-05 22:53                               ` Pavel Machek
  0 siblings, 0 replies; 28+ messages in thread
From: Pavel Machek @ 2007-08-05 22:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Andres Salomon, linux-pm, Chris Ball,
	David Woodhouse, Oleg Nesterov

Hi!

> > > Anyway, eventually, I'd like the freezer to detect failures relatively early,
> > > so the user won't have to wait 20s each time it's going to fail.
> > 
> > It should not fail ;-). And failures are _really_ rare these days. Is
> > 20second wait in case of kernel bug that bad? (FUSE case _is_ a kernel
> > bug, I'm just not sure how to solve it. It is still rare.)
> 
> Well, not very bad, but still.  Perhaps we should use a progress meter of
> some kind to let the user know that something's going on. ;-)

Maybe we should move the process freezing to userspace :-))).

Wait... that would do the trick. We would blame userspace for FUSE
problems, then ;-).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2007-08-05 22:53 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-25 12:01 [RFC][PATCH -mm 0/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
2007-07-25 12:03 ` [RFC][PATCH -mm 1/2] Freezer: Be more verbose Rafael J. Wysocki
2007-07-25 12:27   ` Pavel Machek
2007-07-25 12:09 ` [RFC][PATCH -mm 2/2] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
2007-07-25 12:28   ` Pavel Machek
2007-07-25 12:55     ` Rafael J. Wysocki
2007-07-25 13:29   ` Oleg Nesterov
2007-07-25 14:03     ` Rafael J. Wysocki
2007-07-25 14:24       ` Oleg Nesterov
2007-07-26 12:24         ` Rafael J. Wysocki
2007-07-26 12:43           ` Rafael J. Wysocki
2007-07-31  8:01           ` Pavel Machek
2007-07-31  9:39             ` Rafael J. Wysocki
2007-07-31 10:00               ` Pavel Machek
2007-07-31 10:17                 ` Rafael J. Wysocki
2007-07-31 10:08               ` Rafael J. Wysocki
2007-07-31 10:02                 ` Pavel Machek
2007-07-31 22:25                   ` [RFC][PATCH -mm 0/3] Freezer: Use wait queue instead of busy looping (updated) Rafael J. Wysocki
2007-07-31 22:26                     ` [RFC][PATCH -mm 1/3] Freezer: Use wait queue instead of busy looping Rafael J. Wysocki
2007-08-01  7:59                       ` Pavel Machek
2007-07-31 22:28                     ` [RFC][PATCH -mm 2/3] Freezer: Measure the time of freezing tasks Rafael J. Wysocki
2007-08-01  8:28                       ` Pavel Machek
2007-07-31 22:29                     ` [RFC][PATCH -mm 3/3] Freezer: Replace the timeout Rafael J. Wysocki
2007-08-01  8:31                       ` Pavel Machek
2007-08-01 10:43                         ` Rafael J. Wysocki
2007-08-05 21:37                           ` Pavel Machek
2007-08-05 22:38                             ` Rafael J. Wysocki
2007-08-05 22:53                               ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox