public inbox for linux-pm@vger.kernel.org
From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: linux-pm@lists.linux-foundation.org
Cc: Matthew Garrett <mjg59@srcf.ucam.org>,
	Oleg Nesterov <oleg@tv-sign.ru>, Pavel Machek <pavel@ucw.cz>,
	Miklos Szeredi <miklos@szeredi.hu>
Subject: Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
Date: Tue, 10 Jul 2007 12:04:42 +0200
Message-ID: <200707101204.42687.rjw@sisk.pl>
In-Reply-To: <200707100809.38466.rjw@sisk.pl>

On Tuesday, 10 July 2007 08:09, Rafael J. Wysocki wrote:
> On Tuesday, 10 July 2007 01:34, Pavel Machek wrote:
> > Hi!
> > 
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > The freezer fails if there are uninterruptible tasks waiting for some frozen
> > > tasks to let them continue.  Moreover, in that case try_to_freeze_tasks() loops
> > > uselessly until the timeout expires which is wasteful, so in principle we should
> > > make the freezer fail as soon as all the tasks that refuse to freeze are
> > > uninterruptible.  However, instead of failing the freezer we can try to use the
> > > time left and thaw the tasks that have already been frozen without
> > > clearing the
> > 
> > No, we can't do that:
> > 
> > Imagine we have single uninterruptible task that waits for disk. It
> > would exit uninterruptible state in 10msec, *but* you give up and
> > unfreeze all. Now, another task goes uninterruptible waiting for
> > disk and situation repeats. Livelock.
> 
> For how many times would that have to repeat before 30s of timeout expires?
> 
> Sorry, but I don't buy this argument. :-)
> 
> > Yes, this might play with races in interesting ways and help fuse,
> > but we do not want the livelock in the first place.
> 
> I think that the "livelock" will never happen.
> 
> Besides, we can add another timeout for breaking the loop from a "locked up"
> state.

Actually I like this idea. :-)

I have updated the patch to use the additional timeout, please have a look
(below).

Greetings,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>

The problem with try_to_freeze_tasks() is that it always loops until the
timeout expires, even if it is certain to fail much earlier.  Namely, if there
are uninterruptible tasks waiting for some frozen tasks to let them continue,
try_to_freeze_tasks() will certainly fail and it shouldn't waste time in such
cases.

To detect such situations, we can count the number of tasks that haven't been
frozen yet and the number of uninterruptible tasks among them.  If these two
numbers are equal and have not changed for a sufficiently long time, then most
probably some uninterruptible tasks are blocked by some frozen tasks and we
should break out of this deadlock.  Thus, it seems reasonable to thaw the tasks
that have already been frozen without clearing the freeze requests of the tasks
that are refusing to freeze.  This way, if these tasks are really blocked by the
frozen ones, they will get an extra chance to freeze themselves after we have
thawed the other tasks.  Next, we should repeat the freezing loop and so on,
until the timeout expires.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   88 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 75 insertions(+), 13 deletions(-)

Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
+++ linux-2.6.22-rc6-mm1/kernel/power/process.c
@@ -14,10 +14,11 @@
 #include <linux/syscalls.h>
 #include <linux/freezer.h>
 
-/* 
- * Timeout for stopping processes
- */
-#define TIMEOUT	(20 * HZ)
+/* Timeout for the freezing of tasks */
+#define TIMEOUT	(30 * HZ)
+
+/* Timeout for breaking the freezing loop if a lock-up state is detected */
+#define BREAK_TIMEOUT	(HZ / 10)
 
 #define FREEZER_KERNEL_THREADS 0
 #define FREEZER_USER_SPACE 1
@@ -172,15 +173,33 @@ static void cancel_freezing(struct task_
 	}
 }
 
+/**
+ *	try_to_freeze_tasks - send freeze requests to all tasks and wait
+ *		for them to enter the refrigerator
+ *	@freeze_user_space - if set, only tasks that have mm of their own are
+ *		requested to freeze
+ *
+ *	If this function fails, thaw_tasks() must be called to do the cleanup.
+ */
+
 static int try_to_freeze_tasks(int freeze_user_space)
 {
 	struct task_struct *g, *p;
-	unsigned long end_time;
-	unsigned int todo;
+	unsigned long end_time, break_time;
+	unsigned int todo, blocking, prev_blocking;
+	unsigned int i = 0;
+	char *tick = "-\\|/";
+
+	printk(" ");
 
 	end_time = jiffies + TIMEOUT;
+ Repeat:
+	break_time = 0;
+	blocking = 0;
 	do {
 		todo = 0;
+		prev_blocking = blocking;
+		blocking = 0;
 		read_lock(&tasklist_lock);
 		do_each_thread(g, p) {
 			if (frozen(p) || !freezeable(p))
@@ -197,25 +216,68 @@ static int try_to_freeze_tasks(int freez
 			} else {
 				freeze_task(p);
 			}
-			if (!freezer_should_skip(p))
+			if (!freezer_should_skip(p)) {
 				todo++;
+				if (freeze_user_space &&
+				    (p->state == TASK_UNINTERRUPTIBLE))
+					blocking++;
+			}
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
+
 		yield();			/* Yield is okay here */
+
 		if (time_after(jiffies, end_time))
 			break;
+
+		/*
+		 * Check whether we are making, or are likely to make, any progress.
+		 * If not, we had better break out of the loop.
+		 */
+		if (todo && todo == blocking && blocking == prev_blocking) {
+			if (!break_time)
+				break_time = jiffies + BREAK_TIMEOUT;
+			else if (time_after(jiffies, break_time))
+				break;
+		} else {
+			break_time = 0;
+		}
 	} while (todo);
 
+	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
+		/*
+		 * Some tasks have not been able to freeze.  They might be stuck
+		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
+		 * thaw the tasks that have frozen without clearing the freeze
+		 * requests of the remaining tasks and repeat.
+		 */
+		read_lock(&tasklist_lock);
+		do_each_thread(g, p) {
+			if (frozen(p)) {
+				p->flags &= ~PF_FROZEN;
+				wake_up_process(p);
+			}
+		} while_each_thread(g, p);
+		read_unlock(&tasklist_lock);
+
+		yield();
+
+		printk("\b%c", tick[i++%4]);
+
+		goto Repeat;
+	}
+
+	printk("\b");
+
 	if (todo) {
-		/* This does not unfreeze processes that are already frozen
-		 * (we have slightly ugly calling convention in that respect,
-		 * and caller must call thaw_processes() if something fails),
-		 * but it cleans up leftover PF_FREEZE requests.
+		/*
+		 * The freezing of tasks has failed.  List the tasks that have
+		 * refused to freeze.  This also clears all pending freeze
+		 * requests.
 		 */
 		printk("\n");
-		printk(KERN_ERR "Freezing of %s timed out after %d seconds "
+		printk(KERN_ERR "Freezing of tasks timed out after %d seconds "
 				"(%d tasks refusing to freeze):\n",
-				freeze_user_space ? "user space " : "tasks ",
 				TIMEOUT / HZ, todo);
 		show_state();
 		read_lock(&tasklist_lock);


