[RFC][PATCH -mm 0/6] Freezer update

public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC][PATCH -mm 0/6] Freezer update
@ 2007-07-09 20:29 Rafael J. Wysocki
  2007-07-09 20:31 ` [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems Rafael J. Wysocki
                   ` (5 more replies)
  0 siblings, 6 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-09 20:29 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, Oleg Nesterov

Hi,

The patches in these series update the freezer to eliminate some existing
shortcomings, so I'd like to get them, or their final versions if some
corrections are necessary, into 2.6.23.

The patches do the following:
* remove sys_sync() from the freezer and make the suspend/hibernation code
  invoke it explicitly
* prevent the freezer from sending signals to kernel threads
* increase the verbosity of the freezer slightly
* prevent new tasks from inheriting TIF_FREEZE set from the parents
* make the freezer use the freezing timeout more efficiently
* update the freezer documentation to describe, previously omitted, important
  reason for freezing tasks.
The details are in the changelogs.

Comments welcome.

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems
  2007-07-09 20:29 [RFC][PATCH -mm 0/6] Freezer update Rafael J. Wysocki
@ 2007-07-09 20:31 ` Rafael J. Wysocki
  2007-07-09 23:12   ` Pavel Machek
  2007-07-09 20:32 ` [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads Rafael J. Wysocki
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-09 20:31 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

The syncing of filesystems from within the freezer is generally not needed.
Also, if FUSE implements sync, this will cause the freezer to deadlock with it.

Change freeze_processes() so that it doesn't execute sys_sync() and make the
suspend and hibernation code path sync filesystems independently of the freezer.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c    |    4 ++++
 kernel/power/main.c    |    6 ++++++
 kernel/power/process.c |    1 -
 kernel/power/user.c    |    4 ++++
 4 files changed, 14 insertions(+), 1 deletion(-)

Index: linux-2.6.22-rc6-mm1/kernel/power/disk.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/disk.c
+++ linux-2.6.22-rc6-mm1/kernel/power/disk.c
@@ -297,6 +297,10 @@ int hibernate(void)
 	if (error)
 		goto Exit;
 
+	printk("Syncing filesystems ... ");
+	sys_sync();
+	printk("done.\n");
+
 	error = prepare_processes();
 	if (error)
 		goto Finish;
Index: linux-2.6.22-rc6-mm1/kernel/power/main.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/main.c
+++ linux-2.6.22-rc6-mm1/kernel/power/main.c
@@ -20,6 +20,7 @@
 #include <linux/resume-trace.h>
 #include <linux/freezer.h>
 #include <linux/vmstat.h>
+#include <linux/syscalls.h>
 
 #include "power.h"
 
@@ -234,9 +235,14 @@ static int enter_state(suspend_state_t s
 
 	if (!valid_state(state))
 		return -ENODEV;
+
 	if (!mutex_trylock(&pm_mutex))
 		return -EBUSY;
 
+	printk("Syncing filesystems ... ");
+	sys_sync();
+	printk("done.\n");
+
 	pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
 	if ((error = suspend_prepare()))
 		goto Unlock;
Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
+++ linux-2.6.22-rc6-mm1/kernel/power/process.c
@@ -191,7 +191,6 @@ int freeze_processes(void)
 	if (error)
 		return error;
 
-	sys_sync();
 	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
 	if (error)
 		return error;
Index: linux-2.6.22-rc6-mm1/kernel/power/user.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/user.c
+++ linux-2.6.22-rc6-mm1/kernel/power/user.c
@@ -153,6 +153,10 @@ static int snapshot_ioctl(struct inode *
 		mutex_lock(&pm_mutex);
 		error = pm_notifier_call_chain(PM_HIBERNATION_PREPARE);
 		if (!error) {
+			printk("Syncing filesystems ... ");
+			sys_sync();
+			printk("done.\n");
+
 			error = freeze_processes();
 			if (error)
 				thaw_processes();

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads
  2007-07-09 20:29 [RFC][PATCH -mm 0/6] Freezer update Rafael J. Wysocki
  2007-07-09 20:31 ` [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems Rafael J. Wysocki
@ 2007-07-09 20:32 ` Rafael J. Wysocki
  2007-07-09 23:42   ` Pavel Machek
  2007-07-10 15:00   ` Oleg Nesterov
  2007-07-09 20:33 ` [RFC][PATCH -mm 3/6] Freezer: Be more verbose Rafael J. Wysocki
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-09 20:32 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

Commit b74d0deb968e1f85942f17080eace015ce3c332c has changed
recalc_sigpending_tsk() so that it doesn't clear TIF_SIGPENDING.  For this
reason, the freezer should not send fake signals to kernel threads any more,
since otherwise some of them may run with TIF_SIGPENDING set forever if the
freezing of kernel threads fails.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/freezing-of-tasks.txt |   31 +++++----
 drivers/input/gameport/gameport.c         |    3 
 drivers/input/serio/serio.c               |    3 
 drivers/input/touchscreen/ucb1400_ts.c    |    3 
 drivers/media/dvb/dvb-core/dvb_frontend.c |    3 
 drivers/usb/core/hub.c                    |    3 
 drivers/usb/storage/usb.c                 |    5 -
 include/linux/freezer.h                   |   38 +++++++++++
 kernel/power/process.c                    |   97 +++++++++++++++++++++++-------
 kernel/signal.c                           |    1 
 10 files changed, 137 insertions(+), 50 deletions(-)

Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
+++ linux-2.6.22-rc6-mm1/kernel/power/process.c
@@ -75,21 +75,88 @@ void refrigerator(void)
 	__set_current_state(save);
 }
 
-static void freeze_task(struct task_struct *p)
+static void fake_signal_wake_up(struct task_struct *p, int resume)
 {
 	unsigned long flags;
 
-	if (!freezing(p)) {
+	spin_lock_irqsave(&p->sighand->siglock, flags);
+	signal_wake_up(p, resume);
+	spin_unlock_irqrestore(&p->sighand->siglock, flags);
+}
+
+static void send_fake_signal(struct task_struct *p)
+{
+	if (p->state == TASK_STOPPED)
+		force_sig_specific(SIGSTOP, p);
+	fake_signal_wake_up(p, p->state == TASK_STOPPED);
+}
+
+static int has_mm(struct task_struct *p)
+{
+	return (p->mm && !(p->flags & PF_BORROWED_MM));
+}
+
+/**
+ *	freeze_user_process - set TIF_FREEZE for user space process @p and
+ *			      send fake signal to it
+ *
+ *	Kernel threads should not have TIF_FREEZE set at this point, so we must
+ *	ensure that either p->mm is not NULL *and* PF_BORROWED_MM is unset, or
+ *	TIF_FRREZE is left unset.  The task_lock() is necessary to prevent races
+ *	with exit_mm() or use_mm()/unuse_mm() from occuring.
+ */
+static int freeze_user_process(struct task_struct *p)
+{
+	int ret = 1;
+
+	task_lock(p);
+	if (has_mm(p)) {
+		if (freezing(p)) {
+			if (!signal_pending(p))
+				fake_signal_wake_up(p, 0);
+			else
+				wake_up_state(p, TASK_INTERRUPTIBLE);
+		} else {
+			rmb();
+			if (!frozen(p)) {
+				set_freeze_flag(p);
+				send_fake_signal(p);
+			}
+		}
+	} else {
+		ret = 0;
+	}
+	task_unlock(p);
+	return ret;
+}
+
+/**
+ *	freeze_task - set TIF_FREEZE for task @p and either wake it up or send
+ *		      fake signal to it depending on whether or not it is a
+ *		      kernel thread
+ *
+ *	The task_lock() is necessary to prevent races with exit_mm() or
+ *	use_mm()/unuse_mm() from occuring.
+ */
+static void freeze_task(struct task_struct *p)
+{
+	task_lock(p);
+	if (freezing(p)) {
+		if (has_mm(p) && !signal_pending(p))
+			fake_signal_wake_up(p, 0);
+		else
+			wake_up_state(p, TASK_INTERRUPTIBLE);
+	} else {
 		rmb();
 		if (!frozen(p)) {
 			set_freeze_flag(p);
-			if (p->state == TASK_STOPPED)
-				force_sig_specific(SIGSTOP, p);
-			spin_lock_irqsave(&p->sighand->siglock, flags);
-			signal_wake_up(p, p->state == TASK_STOPPED);
-			spin_unlock_irqrestore(&p->sighand->siglock, flags);
+			if (has_mm(p))
+				send_fake_signal(p);
+			else
+				wake_up_state(p, TASK_INTERRUPTIBLE);
 		}
 	}
+	task_unlock(p);
 }
 
 static void cancel_freezing(struct task_struct *p)
@@ -125,22 +192,8 @@ static int try_to_freeze_tasks(int freez
 					cancel_freezing(p);
 					continue;
 				}
-				/*
-				 * Kernel threads should not have TIF_FREEZE set
-				 * at this point, so we must ensure that either
-				 * p->mm is not NULL *and* PF_BORROWED_MM is
-				 * unset, or TIF_FRREZE is left unset.
-				 * The task_lock() is necessary to prevent races
-				 * with exit_mm() or use_mm()/unuse_mm() from
-				 * occuring.
-				 */
-				task_lock(p);
-				if (!p->mm || (p->flags & PF_BORROWED_MM)) {
-					task_unlock(p);
+				if (!freeze_user_process(p))
 					continue;
-				}
-				freeze_task(p);
-				task_unlock(p);
 			} else {
 				freeze_task(p);
 			}
Index: linux-2.6.22-rc6-mm1/kernel/signal.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/signal.c
+++ linux-2.6.22-rc6-mm1/kernel/signal.c
@@ -99,7 +99,6 @@ static inline int has_pending_signals(si
 static int recalc_sigpending_tsk(struct task_struct *t)
 {
 	if (t->signal->group_stop_count > 0 ||
-	    (freezing(t)) ||
 	    PENDING(&t->pending, &t->blocked) ||
 	    PENDING(&t->signal->shared_pending, &t->blocked)) {
 		set_tsk_thread_flag(t, TIF_SIGPENDING);
Index: linux-2.6.22-rc6-mm1/include/linux/freezer.h
===================================================================
--- linux-2.6.22-rc6-mm1.orig/include/linux/freezer.h
+++ linux-2.6.22-rc6-mm1/include/linux/freezer.h
@@ -4,6 +4,7 @@
 #define FREEZER_H_INCLUDED
 
 #include <linux/sched.h>
+#include <linux/wait.h>
 
 #ifdef CONFIG_PM
 /*
@@ -126,7 +127,33 @@ static inline void set_freezable(void)
 	current->flags &= ~PF_NOFREEZE;
 }
 
-#else
+/*
+ * Freezer-friendly wrappers around wait_event_interruptible() and
+ * wait_event_interruptible_timeout(), originally defined in <linux/wait.h>
+ */
+
+#define wait_event_freezable(wq, condition)				\
+({									\
+	int __ret;							\
+	do {								\
+		__ret = wait_event_interruptible(wq, 			\
+				(condition) || freezing(current));	\
+	} while (try_to_freeze());					\
+	__ret;								\
+})
+
+
+#define wait_event_freezable_timeout(wq, condition, timeout)		\
+({									\
+	long __ret = timeout;						\
+	do {								\
+		__ret = wait_event_interruptible_timeout(wq,		\
+				(condition) || freezing(current),	\
+				__ret); 				\
+	} while (try_to_freeze());					\
+	__ret;								\
+})
+#else /* CONFIG_PM */
 static inline int frozen(struct task_struct *p) { return 0; }
 static inline int freezing(struct task_struct *p) { return 0; }
 static inline void set_freeze_flag(struct task_struct *p) {}
@@ -143,6 +170,13 @@ static inline void freezer_do_not_count(
 static inline void freezer_count(void) {}
 static inline int freezer_should_skip(struct task_struct *p) { return 0; }
 static inline void set_freezable(void) {}
-#endif
+
+#define wait_event_freezable(wq, condition)				\
+		wait_event_interruptible(wq, condition)
+
+#define wait_event_freezable_timeout(wq, condition, timeout)		\
+		wait_event_interruptible_timeout(wq, condition, timeout)
+
+#endif /* CONFIG_PM */
 
 #endif	/* FREEZER_H_INCLUDED */
Index: linux-2.6.22-rc6-mm1/drivers/input/gameport/gameport.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/drivers/input/gameport/gameport.c
+++ linux-2.6.22-rc6-mm1/drivers/input/gameport/gameport.c
@@ -448,9 +448,8 @@ static int gameport_thread(void *nothing
 	set_freezable();
 	do {
 		gameport_handle_event();
-		wait_event_interruptible(gameport_wait,
+		wait_event_freezable(gameport_wait,
 			kthread_should_stop() || !list_empty(&gameport_event_list));
-		try_to_freeze();
 	} while (!kthread_should_stop());
 
 	printk(KERN_DEBUG "gameport: kgameportd exiting\n");
Index: linux-2.6.22-rc6-mm1/drivers/input/serio/serio.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/drivers/input/serio/serio.c
+++ linux-2.6.22-rc6-mm1/drivers/input/serio/serio.c
@@ -387,9 +387,8 @@ static int serio_thread(void *nothing)
 	set_freezable();
 	do {
 		serio_handle_event();
-		wait_event_interruptible(serio_wait,
+		wait_event_freezable(serio_wait,
 			kthread_should_stop() || !list_empty(&serio_event_list));
-		try_to_freeze();
 	} while (!kthread_should_stop());
 
 	printk(KERN_DEBUG "serio: kseriod exiting\n");
Index: linux-2.6.22-rc6-mm1/drivers/input/touchscreen/ucb1400_ts.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/drivers/input/touchscreen/ucb1400_ts.c
+++ linux-2.6.22-rc6-mm1/drivers/input/touchscreen/ucb1400_ts.c
@@ -334,10 +334,9 @@ static int ucb1400_ts_thread(void *_ucb)
 			timeout = msecs_to_jiffies(10);
 		}
 
-		wait_event_interruptible_timeout(ucb->ts_wait,
+		wait_event_freezable_timeout(ucb->ts_wait,
 			ucb->irq_pending || ucb->ts_restart || kthread_should_stop(),
 			timeout);
-		try_to_freeze();
 	}
 
 	/* Send the "pen off" if we are stopping with the pen still active */
Index: linux-2.6.22-rc6-mm1/drivers/media/dvb/dvb-core/dvb_frontend.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/drivers/media/dvb/dvb-core/dvb_frontend.c
+++ linux-2.6.22-rc6-mm1/drivers/media/dvb/dvb-core/dvb_frontend.c
@@ -528,7 +528,8 @@ static int dvb_frontend_thread(void *dat
 		up(&fepriv->sem);	    /* is locked when we enter the thread... */
 restart:
 		timeout = wait_event_interruptible_timeout(fepriv->wait_queue,
-			dvb_frontend_should_wakeup(fe) || kthread_should_stop(),
+			dvb_frontend_should_wakeup(fe) || kthread_should_stop()
+				|| freezing(current),
 			fepriv->delay);
 
 		if (kthread_should_stop() || dvb_frontend_is_exiting(fe)) {
Index: linux-2.6.22-rc6-mm1/drivers/usb/core/hub.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/drivers/usb/core/hub.c
+++ linux-2.6.22-rc6-mm1/drivers/usb/core/hub.c
@@ -2729,10 +2729,9 @@ static int hub_thread(void *__unused)
 	set_freezable();
 	do {
 		hub_events();
-		wait_event_interruptible(khubd_wait,
+		wait_event_freezable(khubd_wait,
 				!list_empty(&hub_event_list) ||
 				kthread_should_stop());
-		try_to_freeze();
 	} while (!kthread_should_stop() || !list_empty(&hub_event_list));
 
 	pr_debug("%s: khubd exiting\n", usbcore_name);
Index: linux-2.6.22-rc6-mm1/drivers/usb/storage/usb.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/drivers/usb/storage/usb.c
+++ linux-2.6.22-rc6-mm1/drivers/usb/storage/usb.c
@@ -913,12 +913,9 @@ static int usb_stor_scan_thread(void * _
 	if (delay_use > 0) {
 		printk(KERN_DEBUG "usb-storage: waiting for device "
 				"to settle before scanning\n");
-retry:
-		wait_event_interruptible_timeout(us->delay_wait,
+		wait_event_freezable_timeout(us->delay_wait,
 				test_bit(US_FLIDX_DISCONNECTING, &us->flags),
 				delay_use * HZ);
-		if (try_to_freeze())
-			goto retry;
 	}
 
 	/* If the device is still connected, perform the scanning */
Index: linux-2.6.22-rc6-mm1/Documentation/power/freezing-of-tasks.txt
===================================================================
--- linux-2.6.22-rc6-mm1.orig/Documentation/power/freezing-of-tasks.txt
+++ linux-2.6.22-rc6-mm1/Documentation/power/freezing-of-tasks.txt
@@ -19,12 +19,13 @@ we only consider hibernation, but the de
 Namely, as the first step of the hibernation procedure the function
 freeze_processes() (defined in kernel/power/process.c) is called.  It executes
 try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and
-sends a fake signal to each of them.  A task that receives such a signal and has
-TIF_FREEZE set, should react to it by calling the refrigerator() function
-(defined in kernel/power/process.c), which sets the task's PF_FROZEN flag,
-changes its state to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is
-cleared for it.  Then, we say that the task is 'frozen' and therefore the set of
-functions handling this mechanism is called 'the freezer' (these functions are
+either wakes them up, if they are kernel threads, or sends fake signals to them,
+if they are user space processes.  A task that has TIF_FREEZE set, should react
+to it by calling the function called refrigerator() (defined in
+kernel/power/process.c), which sets the task's PF_FROZEN flag, changes its state
+to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it.
+Then, we say that the task is 'frozen' and therefore the set of functions
+handling this mechanism is referred to as 'the freezer' (these functions are
 defined in kernel/power/process.c and include/linux/freezer.h).  User space
 processes are generally frozen before kernel threads.
 
@@ -35,21 +36,27 @@ task enter refrigerator() if the flag is
 
 For user space processes try_to_freeze() is called automatically from the
 signal-handling code, but the freezable kernel threads need to call it
-explicitly in suitable places.  The code to do this may look like the following:
+explicitly in suitable places or use the wait_event_freezable() or
+wait_event_freezable_timeout() macros (defined in include/linux/freezer.h)
+that combine interruptible sleep with checking if TIF_FREEZE is set and calling
+try_to_freeze().  The main loop of a freezable kernel thread may look like the
+following one:
 
+	set_freezable();
 	do {
 		hub_events();
-		wait_event_interruptible(khubd_wait,
-					!list_empty(&hub_event_list));
-		try_to_freeze();
-	} while (!signal_pending(current));
+		wait_event_freezable(khubd_wait,
+				!list_empty(&hub_event_list) ||
+				kthread_should_stop());
+	} while (!kthread_should_stop() || !list_empty(&hub_event_list));
 
 (from drivers/usb/core/hub.c::hub_thread()).
 
 If a freezable kernel thread fails to call try_to_freeze() after the freezer has
 set TIF_FREEZE for it, the freezing of tasks will fail and the entire
 hibernation operation will be cancelled.  For this reason, freezable kernel
-threads must call try_to_freeze() somewhere.
+threads must call try_to_freeze() somewhere or use one of the
+wait_event_freezable() and wait_event_freezable_timeout() macros.
 
 After the system memory state has been restored from a hibernation image and
 devices have been reinitialized, the function thaw_processes() is called in

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC][PATCH -mm 3/6] Freezer: Be more verbose
  2007-07-09 20:29 [RFC][PATCH -mm 0/6] Freezer update Rafael J. Wysocki
  2007-07-09 20:31 ` [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems Rafael J. Wysocki
  2007-07-09 20:32 ` [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads Rafael J. Wysocki
@ 2007-07-09 20:33 ` Rafael J. Wysocki
  2007-07-09 23:46   ` Pavel Machek
  2007-07-09 20:34 ` [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set Rafael J. Wysocki
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-09 20:33 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

Increase the freezer's verbosity a bit, so that it's easier to read problem
reports related to it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
+++ linux-2.6.22-rc6-mm1/kernel/power/process.c
@@ -237,20 +237,23 @@ static int try_to_freeze_tasks(int freez
  */
 int freeze_processes(void)
 {
-	int error;
+	int error = 0;
 
-	printk("Stopping tasks ... ");
+	printk("Freezing user space processes ... ");
 	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
 	if (error)
-		return error;
+		goto Exit;
+	printk("done.\n");
 
+	printk("Freezing remaining freezable tasks ... ");
 	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
 	if (error)
-		return error;
-
-	printk("done.\n");
+		goto Exit;
+	printk("done.");
+ Exit:
 	BUG_ON(in_atomic());
-	return 0;
+	printk("\n");
+	return error;
 }
 
 static void thaw_tasks(int thaw_user_space)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set
  2007-07-09 20:29 [RFC][PATCH -mm 0/6] Freezer update Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2007-07-09 20:33 ` [RFC][PATCH -mm 3/6] Freezer: Be more verbose Rafael J. Wysocki
@ 2007-07-09 20:34 ` Rafael J. Wysocki
  2007-07-09 23:21   ` Pavel Machek
  2007-07-09 20:38 ` [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently Rafael J. Wysocki
  2007-07-09 20:41 ` [RFC][PATCH -mm 6/6] Freezer: Document relationship with memory shrinking Rafael J. Wysocki
  5 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-09 20:34 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

Tasks should go to the refrigerator only if explicitly requested to do that by
the freezer and not as a result of inheriting the TIF_FREEZE flag set from the
parent.  Make it happen.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/fork.c |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.22-rc6-mm1/kernel/fork.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/fork.c
+++ linux-2.6.22-rc6-mm1/kernel/fork.c
@@ -932,6 +932,7 @@ static inline void copy_flags(unsigned l
 	if (!(clone_flags & CLONE_PTRACE))
 		p->ptrace = 0;
 	p->flags = new_flags;
+	clear_freeze_flag(p);
 }
 
 asmlinkage long sys_set_tid_address(int __user *tidptr)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-09 20:29 [RFC][PATCH -mm 0/6] Freezer update Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2007-07-09 20:34 ` [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set Rafael J. Wysocki
@ 2007-07-09 20:38 ` Rafael J. Wysocki
  2007-07-09 23:34   ` Pavel Machek
  2007-07-09 20:41 ` [RFC][PATCH -mm 6/6] Freezer: Document relationship with memory shrinking Rafael J. Wysocki
  5 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-09 20:38 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

The freezer fails if there are uninterruptible tasks waiting for some frozen
tasks to let them continue.  Moreover, in that case try_to_freeze_tasks() loops
uselessly until the timeout expires which is wasteful, so in principle we should
make the freezer fail as soon as all the tasks that refuse to freeze are
uninterruptible.  However, instead of failing the freezer we can try to use the
time left and thaw the tasks that have already been frozen without clearing the
freeze requests of the tasks that are refusing to freeze.  This way, if these
tasks are really blocked by the frozen ones, they will get extra chance to
freeze when the other tasks are thawed.  Next, we can repeat the freezing loop
and so on, until the timeout expires.

This patch implements the above idea.  It has been tested by trying to freeze
tasks on a 2.2 GHz dual-core machine running 'make -j50' kernel compilation
concurrently with try_to_freeze_tasks() with the freezer hooks removed from
do_fork() and it's been able to do the job 100% of the time.  It may help with
the "freezer tends to fail if FUSE is used" issue
(cf. http://lkml.org/lkml/2007/7/3/1) and I don't see any drawback of doing it
in general.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   67 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 13 deletions(-)

Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
+++ linux-2.6.22-rc6-mm1/kernel/power/process.c
@@ -14,10 +14,8 @@
 #include <linux/syscalls.h>
 #include <linux/freezer.h>
 
-/* 
- * Timeout for stopping processes
- */
-#define TIMEOUT	(20 * HZ)
+/* Timeout for the freezing of tasks */
+#define TIMEOUT	(30 * HZ)
 
 #define FREEZER_KERNEL_THREADS 0
 #define FREEZER_USER_SPACE 1
@@ -172,15 +170,30 @@ static void cancel_freezing(struct task_
 	}
 }
 
+/**
+ *	try_to_freeze_tasks - send freeze requests to all tasks and wait for
+ *		for them to enter the refrigerator
+ *	@freeze_user_space - if set, only tasks that have mm of their own are
+ *		requested to freeze
+ *
+ *	If this function fails, thaw_tasks() must be called to do the cleanup.
+ */
+
 static int try_to_freeze_tasks(int freeze_user_space)
 {
 	struct task_struct *g, *p;
 	unsigned long end_time;
-	unsigned int todo;
+	unsigned int todo, blocking;
+	unsigned int i = 0;
+	char *tick = "-\\|/";
+
+	printk(" ");
 
 	end_time = jiffies + TIMEOUT;
+ Repeat:
 	do {
 		todo = 0;
+		blocking = 0;
 		read_lock(&tasklist_lock);
 		do_each_thread(g, p) {
 			if (frozen(p) || !freezeable(p))
@@ -197,25 +210,53 @@ static int try_to_freeze_tasks(int freez
 			} else {
 				freeze_task(p);
 			}
-			if (!freezer_should_skip(p))
+			if (!freezer_should_skip(p)) {
 				todo++;
+				if (freeze_user_space &&
+				    (p->state == TASK_UNINTERRUPTIBLE))
+					blocking++;
+			}
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
 		yield();			/* Yield is okay here */
 		if (time_after(jiffies, end_time))
 			break;
-	} while (todo);
+	} while (todo != blocking);
+
+	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
+		/*
+		 * Some tasks have not been able to freeze.  They might be stuck
+		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
+		 * thaw the tasks that have frozen without clearing the freeze
+		 * requests of the remaining tasks and repeat.
+		 */
+		read_lock(&tasklist_lock);
+		do_each_thread(g, p) {
+			if (frozen(p)) {
+				p->flags &= ~PF_FROZEN;
+				wake_up_process(p);
+			}
+		} while_each_thread(g, p);
+		read_unlock(&tasklist_lock);
+
+		yield();
+
+		printk("\b%c", tick[i++%4]);
+
+		goto Repeat;
+	}
+
+	printk("\b");
 
 	if (todo) {
-		/* This does not unfreeze processes that are already frozen
-		 * (we have slightly ugly calling convention in that respect,
-		 * and caller must call thaw_processes() if something fails),
-		 * but it cleans up leftover PF_FREEZE requests.
+		/*
+		 * The freezing of tasks has failed.  List the tasks that have
+		 * refused to freeze.  This also clears all pending freeze
+		 * requests.
 		 */
 		printk("\n");
-		printk(KERN_ERR "Freezing of %s timed out after %d seconds "
+		printk(KERN_ERR "Freezing of tasks timed out after %d seconds "
 				"(%d tasks refusing to freeze):\n",
-				freeze_user_space ? "user space " : "tasks ",
 				TIMEOUT / HZ, todo);
 		show_state();
 		read_lock(&tasklist_lock);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC][PATCH -mm 6/6] Freezer: Document relationship with memory shrinking
  2007-07-09 20:29 [RFC][PATCH -mm 0/6] Freezer update Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2007-07-09 20:38 ` [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently Rafael J. Wysocki
@ 2007-07-09 20:41 ` Rafael J. Wysocki
  2007-07-09 23:23   ` Pavel Machek
  5 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-09 20:41 UTC (permalink / raw)
  To: pm list; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, Oleg Nesterov

From: Rafael J. Wysocki <rjw@sisk.pl>

One important reason to freeze tasks, which is that we don't want them to
allocate memory after freeing it for the hibernation image, has not been
documented.  Fix it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/power/freezing-of-tasks.txt |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Index: linux-2.6.22-rc7/Documentation/power/freezing-of-tasks.txt
===================================================================
--- linux-2.6.22-rc7.orig/Documentation/power/freezing-of-tasks.txt
+++ linux-2.6.22-rc7/Documentation/power/freezing-of-tasks.txt
@@ -88,7 +88,16 @@ hibernation image has been created and b
 The majority of these are user space processes, but if any of the kernel threads
 may cause something like this to happen, they have to be freezable.
 
-2. The second reason is to prevent user space processes and some kernel threads
+2. Next, to create the hibernation image we need to free a sufficient amount of
+memory (approximately 50% of available RAM) and we need to do that before
+devices are deactivated, because we generally need them for swapping out.  Then,
+after the memory for the image has been freed, we don't want tasks to allocate
+additional memory and we prevent them from doing that by freezing them earlier.
+[Of course, this also means that device drivers should not allocate substantial
+amounts of memory from their .suspend() callbacks before hibernation, but this
+is e separate issue.]
+
+3. The third reason is to prevent user space processes and some kernel threads
 from interfering with the suspending and resuming of devices.  A user space
 process running on a second CPU while we are suspending devices may, for
 example, be troublesome and without the freezing of tasks we would need some
@@ -118,7 +127,7 @@ frozen before the driver's .suspend() ca
 thawed after the driver's .resume() callback has run, so it won't be accessing
 the device while it's suspended.
 
-3. Another reason for freezing tasks is to prevent user space processes from
+4. Another reason for freezing tasks is to prevent user space processes from
 realizing that hibernation (or suspend) operation takes place.  Ideally, user
 space processes should not notice that such a system-wide operation has occurred
 and should continue running without any problems after the restore (or resume

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems
  2007-07-09 20:31 ` [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems Rafael J. Wysocki
@ 2007-07-09 23:12   ` Pavel Machek
  2007-07-10  0:31     ` Matthew Garrett
  0 siblings, 1 reply; 40+ messages in thread
From: Pavel Machek @ 2007-07-09 23:12 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi!

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The syncing of filesystems from within the freezer is generally not needed.
> Also, if FUSE implements sync, this will cause the freezer to deadlock with it.

While this is right, I do not think it is good enough reason for a
change. I mean, sync there currently hurts noone....

								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set
  2007-07-09 20:34 ` [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set Rafael J. Wysocki
@ 2007-07-09 23:21   ` Pavel Machek
  2007-07-10  6:03     ` Rafael J. Wysocki
  0 siblings, 1 reply; 40+ messages in thread
From: Pavel Machek @ 2007-07-09 23:21 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi!

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Tasks should go to the refrigerator only if explicitly requested to do that by
> the freezer and not as a result of inheriting the TIF_FREEZE flag set from the
> parent.  Make it happen.

Umm, what prevents userspace task from escaping freezer this way? Does
tasklist_lock prevent new tasks and thus this race?

								Pavel

> --- linux-2.6.22-rc6-mm1.orig/kernel/fork.c
> +++ linux-2.6.22-rc6-mm1/kernel/fork.c
> @@ -932,6 +932,7 @@ static inline void copy_flags(unsigned l
>  	if (!(clone_flags & CLONE_PTRACE))
>  		p->ptrace = 0;
>  	p->flags = new_flags;
> +	clear_freeze_flag(p);
>  }
>  
>  asmlinkage long sys_set_tid_address(int __user *tidptr)

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 6/6] Freezer: Document relationship with memory shrinking
  2007-07-09 20:41 ` [RFC][PATCH -mm 6/6] Freezer: Document relationship with memory shrinking Rafael J. Wysocki
@ 2007-07-09 23:23   ` Pavel Machek
  0 siblings, 0 replies; 40+ messages in thread
From: Pavel Machek @ 2007-07-09 23:23 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

On Mon 2007-07-09 22:41:28, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> One important reason to freeze tasks, which is that we don't want them to
> allocate memory after freeing it for the hibernation image, has not been
> documented.  Fix it.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

ACK.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-09 20:38 ` [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently Rafael J. Wysocki
@ 2007-07-09 23:34   ` Pavel Machek
  2007-07-10  6:09     ` Rafael J. Wysocki
  0 siblings, 1 reply; 40+ messages in thread
From: Pavel Machek @ 2007-07-09 23:34 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi!

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The freezer fails if there are uninterruptible tasks waiting for some frozen
> tasks to let them continue.  Moreover, in that case try_to_freeze_tasks() loops
> uselessly until the timeout expires which is wasteful, so in principle we should
> make the freezer fail as soon as all the tasks that refuse to freeze are
> uninterruptible.  However, instead of failing the freezer we can try to use the
> time left and thaw the tasks that have already been frozen without
> clearing the

No, we can't do that:

Imagine we have single uninterruptible task that waits for disk. It
would exit uninterruptible state in 10msec, *but* you give up and
unfreeze all. Now, another task goes uninterruptible waiting for
disk and situation repeats. Livelock.

Yes, this might play with races in interresting ways and help fuse,
but we do not want the livelock in the first place.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads
  2007-07-09 20:32 ` [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads Rafael J. Wysocki
@ 2007-07-09 23:42   ` Pavel Machek
  2007-07-10  5:57     ` Rafael J. Wysocki
  2007-07-10 15:00   ` Oleg Nesterov
  1 sibling, 1 reply; 40+ messages in thread
From: Pavel Machek @ 2007-07-09 23:42 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi!

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Commit b74d0deb968e1f85942f17080eace015ce3c332c has changed
> recalc_sigpending_tsk() so that it doesn't clear TIF_SIGPENDING.  For this
> reason, the freezer should not send fake signals to kernel threads any more,
> since otherwise some of them may run with TIF_SIGPENDING set forever if the
> freezing of kernel threads fails.

Is there some bigger plan why we want this?

I mean, it seems nicer w/o signals, but OTOH it is more lines of code
and kernel/user freezing diverges. Can we just revert few lines,
instead of inserting 100 lines?

> +/*
> + * Freezer-friendly wrappers around wait_event_interruptible() and
> + * wait_event_interruptible_timeout(), originally defined in <linux/wait.h>
> + */
> +
> +#define wait_event_freezable(wq, condition)				\
> +({									\
> +	int __ret;							\
> +	do {								\
> +		__ret = wait_event_interruptible(wq, 			\
> +				(condition) || freezing(current));	\
> +	} while (try_to_freeze());					\
> +	__ret;								\
> +})
> +
> +
> +#define wait_event_freezable_timeout(wq, condition, timeout)		\
> +({									\
> +	long __ret = timeout;						\
> +	do {								\
> +		__ret = wait_event_interruptible_timeout(wq,		\
> +				(condition) || freezing(current),	\
> +				__ret); 				\
> +	} while (try_to_freeze());					\
> +	__ret;								\
> +})

Hohum, but yes, these are nice.

								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 3/6] Freezer: Be more verbose
  2007-07-09 20:33 ` [RFC][PATCH -mm 3/6] Freezer: Be more verbose Rafael J. Wysocki
@ 2007-07-09 23:46   ` Pavel Machek
  2007-07-10  6:01     ` Rafael J. Wysocki
  0 siblings, 1 reply; 40+ messages in thread
From: Pavel Machek @ 2007-07-09 23:46 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi!

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Increase the freezer's verbosity a bit, so that it's easier to read problem
> reports related to it.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  kernel/power/process.c |   17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
> ===================================================================
> --- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
> +++ linux-2.6.22-rc6-mm1/kernel/power/process.c
> @@ -237,20 +237,23 @@ static int try_to_freeze_tasks(int freez
>   */
>  int freeze_processes(void)
>  {
> -	int error;
> +	int error = 0;

You set error unconditionally few lines below.

> -	printk("Stopping tasks ... ");
> +	printk("Freezing user space processes ... ");
>  	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
>  	if (error)
> -		return error;
> +		goto Exit;
> +	printk("done.\n");
>  
> +	printk("Freezing remaining freezable tasks ... ");

can we ust do printk(" rest ... "); ? Saves one line... and suspend
process is too verbose anyway.

									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems
  2007-07-09 23:12   ` Pavel Machek
@ 2007-07-10  0:31     ` Matthew Garrett
  0 siblings, 0 replies; 40+ messages in thread
From: Matthew Garrett @ 2007-07-10  0:31 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Miklos Szeredi, pm list, Oleg Nesterov

On Tue, Jul 10, 2007 at 01:12:09AM +0200, Pavel Machek wrote:
> Hi!
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > The syncing of filesystems from within the freezer is generally not needed.
> > Also, if FUSE implements sync, this will cause the freezer to deadlock with it.
> 
> While this is right, I do not think it is good enough reason for a
> change. I mean, sync there currently hurts noone....

It's a problem if you have an ext3 filesystem loopback mounted from a 
fuse filesystem...

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads
  2007-07-09 23:42   ` Pavel Machek
@ 2007-07-10  5:57     ` Rafael J. Wysocki
  0 siblings, 0 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10  5:57 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi,

On Tuesday, 10 July 2007 01:42, Pavel Machek wrote:
> Hi!
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Commit b74d0deb968e1f85942f17080eace015ce3c332c has changed
> > recalc_sigpending_tsk() so that it doesn't clear TIF_SIGPENDING.  For this
> > reason, the freezer should not send fake signals to kernel threads any more,
> > since otherwise some of them may run with TIF_SIGPENDING set forever if the
> > freezing of kernel threads fails.
> 
> Is there some bigger plan why we want this?
> 
> I mean, it seems nicer w/o signals, but OTOH it is more lines of code
> and kernel/user freezing diverges. Can we just revert few lines,
> instead of inserting 100 lines?

I'm not sure what you mean.

AFAICS, commit b74d0deb968e1f85942f17080eace015ce3c332c is a bugfix and is not
reversible.

> > +/*
> > + * Freezer-friendly wrappers around wait_event_interruptible() and
> > + * wait_event_interruptible_timeout(), originally defined in <linux/wait.h>
> > + */
> > +
> > +#define wait_event_freezable(wq, condition)				\
> > +({									\
> > +	int __ret;							\
> > +	do {								\
> > +		__ret = wait_event_interruptible(wq, 			\
> > +				(condition) || freezing(current));	\
> > +	} while (try_to_freeze());					\
> > +	__ret;								\
> > +})
> > +
> > +
> > +#define wait_event_freezable_timeout(wq, condition, timeout)		\
> > +({									\
> > +	long __ret = timeout;						\
> > +	do {								\
> > +		__ret = wait_event_interruptible_timeout(wq,		\
> > +				(condition) || freezing(current),	\
> > +				__ret); 				\
> > +	} while (try_to_freeze());					\
> > +	__ret;								\
> > +})
> 
> Hohum, but yes, these are nice.

:-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 3/6] Freezer: Be more verbose
  2007-07-09 23:46   ` Pavel Machek
@ 2007-07-10  6:01     ` Rafael J. Wysocki
  2007-07-10 15:05       ` Pavel Machek
  0 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10  6:01 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi,

On Tuesday, 10 July 2007 01:46, Pavel Machek wrote:
> Hi!
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Increase the freezer's verbosity a bit, so that it's easier to read problem
> > reports related to it.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > ---
> >  kernel/power/process.c |   17 ++++++++++-------
> >  1 file changed, 10 insertions(+), 7 deletions(-)
> > 
> > Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
> > ===================================================================
> > --- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
> > +++ linux-2.6.22-rc6-mm1/kernel/power/process.c
> > @@ -237,20 +237,23 @@ static int try_to_freeze_tasks(int freez
> >   */
> >  int freeze_processes(void)
> >  {
> > -	int error;
> > +	int error = 0;
> 
> You set error unconditionally few lines below.

Ah, thanks.  There were a couple of versions of this patch and I forgot to
remove this.

> > -	printk("Stopping tasks ... ");
> > +	printk("Freezing user space processes ... ");
> >  	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
> >  	if (error)
> > -		return error;
> > +		goto Exit;
> > +	printk("done.\n");
> >  
> > +	printk("Freezing remaining freezable tasks ... ");
> 
> can we ust do printk(" rest ... "); ? Saves one line... and suspend
> process is too verbose anyway.

In fact, this is related to the next patch, in which try_to_freeze_tasks() is
reworked and without this it won't be clear which phase of the freezing has
failed (in case of failure, that is).

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set
  2007-07-09 23:21   ` Pavel Machek
@ 2007-07-10  6:03     ` Rafael J. Wysocki
  2007-07-10 15:05       ` Pavel Machek
  0 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10  6:03 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

On Tuesday, 10 July 2007 01:21, Pavel Machek wrote:
> Hi!
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Tasks should go to the refrigerator only if explicitly requested to do that by
> > the freezer and not as a result of inheriting the TIF_FREEZE flag set from the
> > parent.  Make it happen.
> 
> Umm, what prevents userspace task from escaping freezer this way? Does
> tasklist_lock prevent new tasks and thus this race?

I think so.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-09 23:34   ` Pavel Machek
@ 2007-07-10  6:09     ` Rafael J. Wysocki
  2007-07-10 10:04       ` Rafael J. Wysocki
  0 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10  6:09 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

On Tuesday, 10 July 2007 01:34, Pavel Machek wrote:
> Hi!
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > The freezer fails if there are uninterruptible tasks waiting for some frozen
> > tasks to let them continue.  Moreover, in that case try_to_freeze_tasks() loops
> > uselessly until the timeout expires which is wasteful, so in principle we should
> > make the freezer fail as soon as all the tasks that refuse to freeze are
> > uninterruptible.  However, instead of failing the freezer we can try to use the
> > time left and thaw the tasks that have already been frozen without
> > clearing the
> 
> No, we can't do that:
> 
> Imagine we have single uninterruptible task that waits for disk. It
> would exit uninterruptible state in 10msec, *but* you give up and
> unfreeze all. Now, another task goes uninterruptible waiting for
> disk and situation repeats. Livelock.

For how many times would that have to repeat before 30s of timeout expires?

Sorry, but I don't buy this argument. :-)

> Yes, this might play with races in interresting ways and help fuse,
> but we do not want the livelock in the first place.

I think that the "livelock" will never happen.

Besides, we can add another timeout for breaking the loop from a "locked up"
state.  Anyway, waiting for 20s (as without the patch) when it is _certain_
that we will fail doesn't make sense ...

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10  6:09     ` Rafael J. Wysocki
@ 2007-07-10 10:04       ` Rafael J. Wysocki
  2007-07-10 17:17         ` Oleg Nesterov
                           ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 10:04 UTC (permalink / raw)
  To: linux-pm; +Cc: Matthew Garrett, Oleg Nesterov, Pavel Machek, Miklos Szeredi

On Tuesday, 10 July 2007 08:09, Rafael J. Wysocki wrote:
> On Tuesday, 10 July 2007 01:34, Pavel Machek wrote:
> > Hi!
> > 
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > The freezer fails if there are uninterruptible tasks waiting for some frozen
> > > tasks to let them continue.  Moreover, in that case try_to_freeze_tasks() loops
> > > uselessly until the timeout expires which is wasteful, so in principle we should
> > > make the freezer fail as soon as all the tasks that refuse to freeze are
> > > uninterruptible.  However, instead of failing the freezer we can try to use the
> > > time left and thaw the tasks that have already been frozen without
> > > clearing the
> > 
> > No, we can't do that:
> > 
> > Imagine we have single uninterruptible task that waits for disk. It
> > would exit uninterruptible state in 10msec, *but* you give up and
> > unfreeze all. Now, another task goes uninterruptible waiting for
> > disk and situation repeats. Livelock.
> 
> For how many times would that have to repeat before 30s of timeout expires?
> 
> Sorry, but I don't buy this argument. :-)
> 
> > Yes, this might play with races in interresting ways and help fuse,
> > but we do not want the livelock in the first place.
> 
> I think that the "livelock" will never happen.
> 
> Besides, we can add another timeout for breaking the loop from a "locked up"
> state.

Actually I like this idea. :-)

I have updated the patch to use the additional timeout, please have a look
(below).

Greetings,
Rafael


---
From: Rafael J. Wysocki <rjw@sisk.pl>

There is the problem with try_to_freeze_tasks() that it always loops until the
timeout expires, even if it is certain to fail much earlier.  Namely, if there
are uninterruptible tasks waiting for some frozen tasks to let them continue,
try_to_freeze_tasks() will certainly fail and it shouldn't waste time in that
cases.

To detect such situations, we can count the number of tasks that haven't been
frozen yet and the number of uninterruptible tasks among them.  If these two
numbers haven't been changing for sufficiently long time and are equal, then
most probably some uninterruptible tasks are blocked by some frozen tasks and we
should break out of this deadlock.  Thus, it seems reasonable to thaw the tasks
that have already been frozen without clearing the freeze requests of the tasks
that are refusing to freeze.  This way, if these tasks are really blocked by the
frozen ones, they will get extra chance to freeze themselves after we have
thawed the other tasks.  Next, we should repeat the freezing loop and so on,
until the timeout expires.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   88 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 75 insertions(+), 13 deletions(-)

Index: linux-2.6.22-rc6-mm1/kernel/power/process.c
===================================================================
--- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
+++ linux-2.6.22-rc6-mm1/kernel/power/process.c
@@ -14,10 +14,11 @@
 #include <linux/syscalls.h>
 #include <linux/freezer.h>
 
-/* 
- * Timeout for stopping processes
- */
-#define TIMEOUT	(20 * HZ)
+/* Timeout for the freezing of tasks */
+#define TIMEOUT	(30 * HZ)
+
+/* Timeout for breaking the freezing loop if lock-up state is detected */
+#define BREAK_TIMEOUT	(HZ / 10)
 
 #define FREEZER_KERNEL_THREADS 0
 #define FREEZER_USER_SPACE 1
@@ -172,15 +173,33 @@ static void cancel_freezing(struct task_
 	}
 }
 
+/**
+ *	try_to_freeze_tasks - send freeze requests to all tasks and wait for
+ *		for them to enter the refrigerator
+ *	@freeze_user_space - if set, only tasks that have mm of their own are
+ *		requested to freeze
+ *
+ *	If this function fails, thaw_tasks() must be called to do the cleanup.
+ */
+
 static int try_to_freeze_tasks(int freeze_user_space)
 {
 	struct task_struct *g, *p;
-	unsigned long end_time;
-	unsigned int todo;
+	unsigned long end_time, break_time;
+	unsigned int todo, blocking, prev_blocking;
+	unsigned int i = 0;
+	char *tick = "-\\|/";
+
+	printk(" ");
 
 	end_time = jiffies + TIMEOUT;
+ Repeat:
+	break_time = 0;
+	blocking = 0;
 	do {
 		todo = 0;
+		prev_blocking = blocking;
+		blocking = 0;
 		read_lock(&tasklist_lock);
 		do_each_thread(g, p) {
 			if (frozen(p) || !freezeable(p))
@@ -197,25 +216,68 @@ static int try_to_freeze_tasks(int freez
 			} else {
 				freeze_task(p);
 			}
-			if (!freezer_should_skip(p))
+			if (!freezer_should_skip(p)) {
 				todo++;
+				if (freeze_user_space &&
+				    (p->state == TASK_UNINTERRUPTIBLE))
+					blocking++;
+			}
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
+
 		yield();			/* Yield is okay here */
+
 		if (time_after(jiffies, end_time))
 			break;
+
+		/*
+		 * Check if we are making or we are likely to make any progress.
+		 * If not, we should better break out of this.
+		 */
+		if (todo && todo == blocking && blocking == prev_blocking) {
+			if (!break_time)
+				break_time = jiffies + BREAK_TIMEOUT;
+			else if (time_after(jiffies, break_time))
+				break;
+		} else {
+			break_time = 0;
+		}
 	} while (todo);
 
+	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
+		/*
+		 * Some tasks have not been able to freeze.  They might be stuck
+		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
+		 * thaw the tasks that have frozen without clearing the freeze
+		 * requests of the remaining tasks and repeat.
+		 */
+		read_lock(&tasklist_lock);
+		do_each_thread(g, p) {
+			if (frozen(p)) {
+				p->flags &= ~PF_FROZEN;
+				wake_up_process(p);
+			}
+		} while_each_thread(g, p);
+		read_unlock(&tasklist_lock);
+
+		yield();
+
+		printk("\b%c", tick[i++%4]);
+
+		goto Repeat;
+	}
+
+	printk("\b");
+
 	if (todo) {
-		/* This does not unfreeze processes that are already frozen
-		 * (we have slightly ugly calling convention in that respect,
-		 * and caller must call thaw_processes() if something fails),
-		 * but it cleans up leftover PF_FREEZE requests.
+		/*
+		 * The freezing of tasks has failed.  List the tasks that have
+		 * refused to freeze.  This also clears all pending freeze
+		 * requests.
 		 */
 		printk("\n");
-		printk(KERN_ERR "Freezing of %s timed out after %d seconds "
+		printk(KERN_ERR "Freezing of tasks timed out after %d seconds "
 				"(%d tasks refusing to freeze):\n",
-				freeze_user_space ? "user space " : "tasks ",
 				TIMEOUT / HZ, todo);
 		show_state();
 		read_lock(&tasklist_lock);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads
  2007-07-09 20:32 ` [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads Rafael J. Wysocki
  2007-07-09 23:42   ` Pavel Machek
@ 2007-07-10 15:00   ` Oleg Nesterov
  2007-07-10 21:08     ` Rafael J. Wysocki
  1 sibling, 1 reply; 40+ messages in thread
From: Oleg Nesterov @ 2007-07-10 15:00 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, pm list

On 07/09, Rafael J. Wysocki wrote:
>
> Commit b74d0deb968e1f85942f17080eace015ce3c332c has changed
> recalc_sigpending_tsk() so that it doesn't clear TIF_SIGPENDING.  For this
> reason, the freezer should not send fake signals to kernel threads any more,
> since otherwise some of them may run with TIF_SIGPENDING set forever if the
> freezing of kernel threads fails.

I personally think it is very good to get rid of signals to kthread, regardless
of changed behaviour of recalc_sigpending_tsk().

> +static int freeze_user_process(struct task_struct *p)
> +{
> +	int ret = 1;
> +
> +	task_lock(p);
> +	if (has_mm(p)) {
> +		if (freezing(p)) {
> +			if (!signal_pending(p))
> +				fake_signal_wake_up(p, 0);
> +			else
> +				wake_up_state(p, TASK_INTERRUPTIBLE);

Why do we need the "else" branch? It is already a bug if the task sleeps
in TASK_INTERRUPTIBLE state but has signal_pending().

The same for freeze_task(). Actually, they look very similar. Imho, it would
be better to have a single function with a "user_space_only" parameter.

> @@ -99,7 +99,6 @@ static inline int has_pending_signals(si
>  static int recalc_sigpending_tsk(struct task_struct *t)
>  {
>  	if (t->signal->group_stop_count > 0 ||
> -	    (freezing(t)) ||
>  	    PENDING(&t->pending, &t->blocked) ||
>  	    PENDING(&t->signal->shared_pending, &t->blocked)) {
>  		set_tsk_thread_flag(t, TIF_SIGPENDING);

This is important (and imho good) change, but changelog says nothing about it.

I guess this should works because freeze_user_process/freeze_task repeatedly
"re-send" the signal in a loop when TIF_SIGPENDING is cleared, yes?

> +#define wait_event_freezable(wq, condition)				\
> +({									\
> +	int __ret;							\
> +	do {								\
> +		__ret = wait_event_interruptible(wq, 			\
> +				(condition) || freezing(current));	\
> +	} while (try_to_freeze());					\
> +	__ret;								\
> +})

I don't think this is right.

wait_event_freezable() should return success _only_ if "condition" == true.
What if TIF_FREEZE was cleared between freezing() and try_to_freeze() ?

> --- linux-2.6.22-rc6-mm1.orig/drivers/input/gameport/gameport.c
> +++ linux-2.6.22-rc6-mm1/drivers/input/gameport/gameport.c
> @@ -448,9 +448,8 @@ static int gameport_thread(void *nothing
>  	set_freezable();
>  	do {
>  		gameport_handle_event();
> -		wait_event_interruptible(gameport_wait,
> +		wait_event_freezable(gameport_wait,
>  			kthread_should_stop() || !list_empty(&gameport_event_list));
> -		try_to_freeze();
>  	} while (!kthread_should_stop());

Isn't it better to break this patch into 2 separate ones? The first adds
wait_event_freezable() and "fixes" gameport_thread() and friends in advance,
the second deals with TIF_SIGPENDING. (please ignore if this is not convenient).

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 3/6] Freezer: Be more verbose
  2007-07-10  6:01     ` Rafael J. Wysocki
@ 2007-07-10 15:05       ` Pavel Machek
  0 siblings, 0 replies; 40+ messages in thread
From: Pavel Machek @ 2007-07-10 15:05 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

Hi!

> > > --- linux-2.6.22-rc6-mm1.orig/kernel/power/process.c
> > > +++ linux-2.6.22-rc6-mm1/kernel/power/process.c
> > > @@ -237,20 +237,23 @@ static int try_to_freeze_tasks(int freez
> > >   */
> > >  int freeze_processes(void)
> > >  {
> > > -	int error;
> > > +	int error = 0;
> > 
> > You set error unconditionally few lines below.
> 
> Ah, thanks.  There were a couple of versions of this patch and I forgot to
> remove this.
> 
> > > -	printk("Stopping tasks ... ");
> > > +	printk("Freezing user space processes ... ");
> > >  	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
> > >  	if (error)
> > > -		return error;
> > > +		goto Exit;
> > > +	printk("done.\n");
> > >  
> > > +	printk("Freezing remaining freezable tasks ... ");
> > 
> > can we ust do printk(" rest ... "); ? Saves one line... and suspend
> > process is too verbose anyway.
> 
> In fact, this is related to the next patch, in which try_to_freeze_tasks() is
> reworked and without this it won't be clear which phase of the freezing has
> failed (in case of failure, that is).

Ok, ACK , then.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set
  2007-07-10  6:03     ` Rafael J. Wysocki
@ 2007-07-10 15:05       ` Pavel Machek
  0 siblings, 0 replies; 40+ messages in thread
From: Pavel Machek @ 2007-07-10 15:05 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, pm list, Oleg Nesterov

> On Tuesday, 10 July 2007 01:21, Pavel Machek wrote:
> > Hi!
> > 
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Tasks should go to the refrigerator only if explicitly requested to do that by
> > > the freezer and not as a result of inheriting the TIF_FREEZE flag set from the
> > > parent.  Make it happen.
> > 
> > Umm, what prevents userspace task from escaping freezer this way? Does
> > tasklist_lock prevent new tasks and thus this race?
> 
> I think so.

I think so, too. ACK.
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 10:04       ` Rafael J. Wysocki
@ 2007-07-10 17:17         ` Oleg Nesterov
  2007-07-10 20:30           ` Rafael J. Wysocki
  2007-07-10 18:50         ` Oleg Nesterov
  2007-07-10 21:13         ` bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently) Pavel Machek
  2 siblings, 1 reply; 40+ messages in thread
From: Oleg Nesterov @ 2007-07-10 17:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On 07/10, Rafael J. Wysocki wrote:
>
> > Besides, we can add another timeout for breaking the loop from a "locked up"
> > state.
> 
> Actually I like this idea. :-)
> 
> I have updated the patch to use the additional timeout, please have a look
> (below).

Q: do we really need these complications with UNINTERRUPTIBLE/blocked ?

Just watch the "todo", it it doesn't decrease during BREAK_TIMEOUT, then retry.
No?

> +	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
> +		/*
> +		 * Some tasks have not been able to freeze.  They might be stuck
> +		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
> +		 * thaw the tasks that have frozen without clearing the freeze
> +		 * requests of the remaining tasks and repeat.
> +		 */

Another question, why do we check "freeze_user_space" here?

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 10:04       ` Rafael J. Wysocki
  2007-07-10 17:17         ` Oleg Nesterov
@ 2007-07-10 18:50         ` Oleg Nesterov
  2007-07-10 19:54           ` Rafael J. Wysocki
  2007-07-10 21:13         ` bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently) Pavel Machek
  2 siblings, 1 reply; 40+ messages in thread
From: Oleg Nesterov @ 2007-07-10 18:50 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On 07/10, Rafael J. Wysocki wrote:
>
> +	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
> +		/*
> +		 * Some tasks have not been able to freeze.  They might be stuck
> +		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
> +		 * thaw the tasks that have frozen without clearing the freeze
> +		 * requests of the remaining tasks and repeat.
> +		 */
> +		read_lock(&tasklist_lock);
> +		do_each_thread(g, p) {
> +			if (frozen(p)) {
> +				p->flags &= ~PF_FROZEN;
> +				wake_up_process(p);

I just noticed we don't use thaw_process(), this means that the retry doesn't
play well with wait_event_freezable() introduced in the previous patch.

Suppose that kthread_stop(T) hangs and blocks the freezer, and T does

	while (!kthread_should_stop()) {
		wait_event_freezable(...);
		do_something();
	}

and it is freezed. We clear PF_FROZEN but not TIF_FREEZE, wait_event_freezable
checks freezing() and goes to refrigerator again.

Another problem is that we only count UNINTERRUPTIBLE tasks to make a decision
about retry, while wait_event_freezable() sleeps NTERRUPTIBLE.

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 18:50         ` Oleg Nesterov
@ 2007-07-10 19:54           ` Rafael J. Wysocki
  2007-07-10 20:35             ` Oleg Nesterov
  0 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 19:54 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On Tuesday, 10 July 2007 20:50, Oleg Nesterov wrote:
> On 07/10, Rafael J. Wysocki wrote:
> >
> > +	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
> > +		/*
> > +		 * Some tasks have not been able to freeze.  They might be stuck
> > +		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
> > +		 * thaw the tasks that have frozen without clearing the freeze
> > +		 * requests of the remaining tasks and repeat.
> > +		 */
> > +		read_lock(&tasklist_lock);
> > +		do_each_thread(g, p) {
> > +			if (frozen(p)) {
> > +				p->flags &= ~PF_FROZEN;
> > +				wake_up_process(p);
> 
> I just noticed we don't use thaw_process(), this means that the retry doesn't
> play well with wait_event_freezable() introduced in the previous patch.
> 
> Suppose that kthread_stop(T) hangs and blocks the freezer, and T does

The retry mechanism only applies to user land processes. :-)

> 	while (!kthread_should_stop()) {
> 		wait_event_freezable(...);
> 		do_something();
> 	}
> 
> and it is freezed. We clear PF_FROZEN but not TIF_FREEZE, wait_event_freezable
> checks freezing() and goes to refrigerator again.

PF_FROZEN and TIF_FREEZE are mutually exclusive as long as the thawing is
not racing with refrigerator().  I don't think that this is of concern here, but we can take
task_lock() in the freezing loop.

> Another problem is that we only count UNINTERRUPTIBLE tasks to make a decision
> about retry, while wait_event_freezable() sleeps NTERRUPTIBLE.

The retry thing doesn't cover kernel threads, because they aren't supposed to
block the freezer.  They are supposed to freeze voluntarily and to know
_exactly_ what they are doing.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 17:17         ` Oleg Nesterov
@ 2007-07-10 20:30           ` Rafael J. Wysocki
  2007-07-10 20:55             ` Oleg Nesterov
  0 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 20:30 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On Tuesday, 10 July 2007 19:17, Oleg Nesterov wrote:
> On 07/10, Rafael J. Wysocki wrote:
> >
> > > Besides, we can add another timeout for breaking the loop from a "locked up"
> > > state.
> > 
> > Actually I like this idea. :-)
> > 
> > I have updated the patch to use the additional timeout, please have a look
> > (below).
> 
> Q: do we really need these complications with UNINTERRUPTIBLE/blocked ?

Yes, I think so.

> Just watch the "todo", it it doesn't decrease during BREAK_TIMEOUT, then retry.
> No?

No, it is computed from the start in every interation and I compare the result
from the previous iteration with the current result.  Still, I don't compare 'todo'
with 'todo from the previous step' directly, because the lock up can only
happen if 'todo' is equal to 'blocking'.  For this reason, it is sufficient to
compare 'blocking' with 'prev_blocking' (ie. 'blocking' from the previous
step) and 'todo' with 'blocking' (the test if 'todo' > 0 is not essential;
well, I should remove it in accordance with my signature ;-)).

> > +	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
> > +		/*
> > +		 * Some tasks have not been able to freeze.  They might be stuck
> > +		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
> > +		 * thaw the tasks that have frozen without clearing the freeze
> > +		 * requests of the remaining tasks and repeat.
> > +		 */
> 
> Another question, why do we check "freeze_user_space" here?

Well, the assumption is that the freezable kernel threads won't lock up the
freezer in such a way (that may be overoptimistic, but I'd like to make it
for now).  IOW, the retry mechanism is only activated during the freezing
of user space (in principle it might be used in the second phase too, but
I'm not sure if that's really necessary).

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 19:54           ` Rafael J. Wysocki
@ 2007-07-10 20:35             ` Oleg Nesterov
  2007-07-10 20:57               ` Rafael J. Wysocki
  0 siblings, 1 reply; 40+ messages in thread
From: Oleg Nesterov @ 2007-07-10 20:35 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On 07/10, Rafael J. Wysocki wrote:
>
> On Tuesday, 10 July 2007 20:50, Oleg Nesterov wrote:
> > 
> > I just noticed we don't use thaw_process(), this means that the retry doesn't
> > play well with wait_event_freezable() introduced in the previous patch.
> > 
> > Suppose that kthread_stop(T) hangs and blocks the freezer, and T does
> 
> The retry mechanism only applies to user land processes. :-)
> 
> > 	while (!kthread_should_stop()) {
> > 		wait_event_freezable(...);
> > 		do_something();
> > 	}
> > 
> > and it is freezed. We clear PF_FROZEN but not TIF_FREEZE, wait_event_freezable
> > checks freezing() and goes to refrigerator again.
> 
> PF_FROZEN and TIF_FREEZE are mutually exclusive as long as the thawing is
> not racing with refrigerator().

Ah yes, I forgot. Thanks!

> > Another problem is that we only count UNINTERRUPTIBLE tasks to make a decision
> > about retry, while wait_event_freezable() sleeps NTERRUPTIBLE.
> 
> The retry thing doesn't cover kernel threads, because they aren't supposed to
> block the freezer.  They are supposed to freeze voluntarily and to know
> _exactly_ what they are doing.

Ok, I see... So, the plan is to eventually fix the "problematic" things like
kthread_stop(), yes?  Unless I missed something again, we can't reliably
kthread_stop() the thread above even if it does

	wait_event_freezable(my_condition || kthread_should_stop())

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 20:30           ` Rafael J. Wysocki
@ 2007-07-10 20:55             ` Oleg Nesterov
  2007-07-10 21:15               ` Rafael J. Wysocki
  0 siblings, 1 reply; 40+ messages in thread
From: Oleg Nesterov @ 2007-07-10 20:55 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On 07/10, Rafael J. Wysocki wrote:
>
> On Tuesday, 10 July 2007 19:17, Oleg Nesterov wrote:
> 
> > Just watch the "todo", it it doesn't decrease during BREAK_TIMEOUT, then retry.
> > No?
> 
> No, it is computed from the start in every interation and I compare the result
> from the previous iteration with the current result.  Still, I don't compare 'todo'
> with 'todo from the previous step' directly, because the lock up can only
> happen if 'todo' is equal to 'blocking'.  For this reason, it is sufficient to
> compare 'blocking' with 'prev_blocking' (ie. 'blocking' from the previous
> step) and 'todo' with 'blocking' (the test if 'todo' > 0 is not essential;
> well, I should remove it in accordance with my signature ;-)).

Yes, I see what the code does. My question was: why it is not good enough to just
compare 'todo' with 'todo from the previous step' ? If 'todo' does not decrease
during the BREAK_TIMEOUT period, probably we should retry. A fork() from user-space
should not succeed because we send a fake signal to all tasks.

> > > +	if (todo && freeze_user_space && !time_after(jiffies, end_time)) {
> > > +		/*
> > > +		 * Some tasks have not been able to freeze.  They might be stuck
> > > +		 * in TASK_UNINTERRUPTIBLE waiting for the frozen tasks.  Try to
> > > +		 * thaw the tasks that have frozen without clearing the freeze
> > > +		 * requests of the remaining tasks and repeat.
> > > +		 */
> > 
> > Another question, why do we check "freeze_user_space" here?
> 
> Well, the assumption is that the freezable kernel threads won't lock up the
> freezer in such a way (that may be overoptimistic, but I'd like to make it
> for now).

OK, understand.

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 20:35             ` Oleg Nesterov
@ 2007-07-10 20:57               ` Rafael J. Wysocki
  0 siblings, 0 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 20:57 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On Tuesday, 10 July 2007 22:35, Oleg Nesterov wrote:
> On 07/10, Rafael J. Wysocki wrote:
> >
> > On Tuesday, 10 July 2007 20:50, Oleg Nesterov wrote:
> > > 
> > > I just noticed we don't use thaw_process(), this means that the retry doesn't
> > > play well with wait_event_freezable() introduced in the previous patch.
> > > 
> > > Suppose that kthread_stop(T) hangs and blocks the freezer, and T does
> > 
> > The retry mechanism only applies to user land processes. :-)
> > 
> > > 	while (!kthread_should_stop()) {
> > > 		wait_event_freezable(...);
> > > 		do_something();
> > > 	}
> > > 
> > > and it is freezed. We clear PF_FROZEN but not TIF_FREEZE, wait_event_freezable
> > > checks freezing() and goes to refrigerator again.
> > 
> > PF_FROZEN and TIF_FREEZE are mutually exclusive as long as the thawing is
> > not racing with refrigerator().
> 
> Ah yes, I forgot. Thanks!
> 
> > > Another problem is that we only count UNINTERRUPTIBLE tasks to make a decision
> > > about retry, while wait_event_freezable() sleeps NTERRUPTIBLE.
> > 
> > The retry thing doesn't cover kernel threads, because they aren't supposed to
> > block the freezer.  They are supposed to freeze voluntarily and to know
> > _exactly_ what they are doing.
> 
> Ok, I see... So, the plan is to eventually fix the "problematic" things like
> kthread_stop(), yes?

Yes.  Last time I looked at it kthread_stop() was a moving target. ;-)

> Unless I missed something again, we can't reliably 
> kthread_stop() the thread above even if it does
> 
> 	wait_event_freezable(my_condition || kthread_should_stop())

Not sure, will check.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads
  2007-07-10 15:00   ` Oleg Nesterov
@ 2007-07-10 21:08     ` Rafael J. Wysocki
  2007-07-10 21:22       ` Oleg Nesterov
  0 siblings, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 21:08 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, pm list

On Tuesday, 10 July 2007 17:00, Oleg Nesterov wrote:
> On 07/09, Rafael J. Wysocki wrote:
> >
> > Commit b74d0deb968e1f85942f17080eace015ce3c332c has changed
> > recalc_sigpending_tsk() so that it doesn't clear TIF_SIGPENDING.  For this
> > reason, the freezer should not send fake signals to kernel threads any more,
> > since otherwise some of them may run with TIF_SIGPENDING set forever if the
> > freezing of kernel threads fails.
> 
> I personally think it is very good to get rid of signals to kthread, regardless
> of changed behaviour of recalc_sigpending_tsk().

Yes, anyway. :-)

> > +static int freeze_user_process(struct task_struct *p)
> > +{
> > +	int ret = 1;
> > +
> > +	task_lock(p);
> > +	if (has_mm(p)) {
> > +		if (freezing(p)) {
> > +			if (!signal_pending(p))
> > +				fake_signal_wake_up(p, 0);
> > +			else
> > +				wake_up_state(p, TASK_INTERRUPTIBLE);
> 
> Why do we need the "else" branch? It is already a bug if the task sleeps
> in TASK_INTERRUPTIBLE state but has signal_pending().
> 
> The same for freeze_task(). Actually, they look very similar. Imho, it would
> be better to have a single function with a "user_space_only" parameter.

OK, will do.

> > @@ -99,7 +99,6 @@ static inline int has_pending_signals(si
> >  static int recalc_sigpending_tsk(struct task_struct *t)
> >  {
> >  	if (t->signal->group_stop_count > 0 ||
> > -	    (freezing(t)) ||
> >  	    PENDING(&t->pending, &t->blocked) ||
> >  	    PENDING(&t->signal->shared_pending, &t->blocked)) {
> >  		set_tsk_thread_flag(t, TIF_SIGPENDING);
> 
> This is important (and imho good) change, but changelog says nothing about it.

Yes, I need to modify the changelog.

> I guess this should works because freeze_user_process/freeze_task repeatedly
> "re-send" the signal in a loop when TIF_SIGPENDING is cleared, yes?

Exactly.

> > +#define wait_event_freezable(wq, condition)				\
> > +({									\
> > +	int __ret;							\
> > +	do {								\
> > +		__ret = wait_event_interruptible(wq, 			\
> > +				(condition) || freezing(current));	\
> > +	} while (try_to_freeze());					\
> > +	__ret;								\
> > +})
> 
> I don't think this is right.
> 
> wait_event_freezable() should return success _only_ if "condition" == true.
> What if TIF_FREEZE was cleared between freezing() and try_to_freeze() ?

Hmm, that means we should loop while(!(condition)) .  I didn't do that
previously,  because I thought that would be a problem if (condition)
is satisfied in wait_event_interruptible(), but then changes immediately.

> > --- linux-2.6.22-rc6-mm1.orig/drivers/input/gameport/gameport.c
> > +++ linux-2.6.22-rc6-mm1/drivers/input/gameport/gameport.c
> > @@ -448,9 +448,8 @@ static int gameport_thread(void *nothing
> >  	set_freezable();
> >  	do {
> >  		gameport_handle_event();
> > -		wait_event_interruptible(gameport_wait,
> > +		wait_event_freezable(gameport_wait,
> >  			kthread_should_stop() || !list_empty(&gameport_event_list));
> > -		try_to_freeze();
> >  	} while (!kthread_should_stop());
> 
> Isn't it better to break this patch into 2 separate ones? The first adds
> wait_event_freezable() and "fixes" gameport_thread() and friends in advance,
> the second deals with TIF_SIGPENDING. (please ignore if this is not convenient).

Not really.  I thought about that, but the change in gameport_thread() and
friends is actually necessary _because_ we don't send signals to kernel
threads any more.

I'll rework the patch in a while, but I think I'll send the entire series once
again in a new thread, tomorrow.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-10 10:04       ` Rafael J. Wysocki
  2007-07-10 17:17         ` Oleg Nesterov
  2007-07-10 18:50         ` Oleg Nesterov
@ 2007-07-10 21:13         ` Pavel Machek
  2007-07-10 21:38           ` Oleg Nesterov
  2007-07-10 21:39           ` Rafael J. Wysocki
  2 siblings, 2 replies; 40+ messages in thread
From: Pavel Machek @ 2007-07-10 21:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, linux-pm, Oleg Nesterov, Miklos Szeredi

Hi!

> > > No, we can't do that:
> > > 
> > > Imagine we have single uninterruptible task that waits for disk. It
> > > would exit uninterruptible state in 10msec, *but* you give up and
> > > unfreeze all. Now, another task goes uninterruptible waiting for
> > > disk and situation repeats. Livelock.
> > 
> > For how many times would that have to repeat before 30s of timeout expires?
> > 
> > Sorry, but I don't buy this argument. :-)
> > 
> > > Yes, this might play with races in interresting ways and help fuse,
> > > but we do not want the livelock in the first place.
> > 
> > I think that the "livelock" will never happen.
> > 
> > Besides, we can add another timeout for breaking the loop from a "locked up"
> > state.
> 
> Actually I like this idea. :-)
> 
> I have updated the patch to use the additional timeout, please have a look
> (below).

Yes, this one could actually work... _really_ inefficiently.

Task here is to sort the tasks, and freeze them in such order that
freezing works, right? Yep, we do not know the dependencies
explicitely... but what you invented is bogosort.

Yes, it does increase chance that freezing succeeds, but I do not
think increase of chance is worth having bogosort in tree :-).

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently
  2007-07-10 20:55             ` Oleg Nesterov
@ 2007-07-10 21:15               ` Rafael J. Wysocki
  0 siblings, 0 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 21:15 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Matthew Garrett, linux-pm, Pavel Machek, Miklos Szeredi

On Tuesday, 10 July 2007 22:55, Oleg Nesterov wrote:
> On 07/10, Rafael J. Wysocki wrote:
> >
> > On Tuesday, 10 July 2007 19:17, Oleg Nesterov wrote:
> > 
> > > Just watch the "todo", it it doesn't decrease during BREAK_TIMEOUT, then retry.
> > > No?
> > 
> > No, it is computed from the start in every interation and I compare the result
> > from the previous iteration with the current result.  Still, I don't compare 'todo'
> > with 'todo from the previous step' directly, because the lock up can only
> > happen if 'todo' is equal to 'blocking'.  For this reason, it is sufficient to
> > compare 'blocking' with 'prev_blocking' (ie. 'blocking' from the previous
> > step) and 'todo' with 'blocking' (the test if 'todo' > 0 is not essential;
> > well, I should remove it in accordance with my signature ;-)).
> 
> Yes, I see what the code does. My question was: why it is not good enough to just
> compare 'todo' with 'todo from the previous step' ? If 'todo' does not decrease
> during the BREAK_TIMEOUT period, probably we should retry. A fork() from user-space
> should not succeed because we send a fake signal to all tasks.

Hmm, yes,  I think that's reasonable.  I'll try to do it this way

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads
  2007-07-10 21:08     ` Rafael J. Wysocki
@ 2007-07-10 21:22       ` Oleg Nesterov
  0 siblings, 0 replies; 40+ messages in thread
From: Oleg Nesterov @ 2007-07-10 21:22 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Matthew Garrett, Miklos Szeredi, Pavel Machek, pm list

On 07/10, Rafael J. Wysocki wrote:
>
> On Tuesday, 10 July 2007 17:00, Oleg Nesterov wrote:
> > > --- linux-2.6.22-rc6-mm1.orig/drivers/input/gameport/gameport.c
> > > +++ linux-2.6.22-rc6-mm1/drivers/input/gameport/gameport.c
> > > @@ -448,9 +448,8 @@ static int gameport_thread(void *nothing
> > >  	set_freezable();
> > >  	do {
> > >  		gameport_handle_event();
> > > -		wait_event_interruptible(gameport_wait,
> > > +		wait_event_freezable(gameport_wait,
> > >  			kthread_should_stop() || !list_empty(&gameport_event_list));
> > > -		try_to_freeze();
> > >  	} while (!kthread_should_stop());
> > 
> > Isn't it better to break this patch into 2 separate ones? The first adds
> > wait_event_freezable() and "fixes" gameport_thread() and friends in advance,
> > the second deals with TIF_SIGPENDING. (please ignore if this is not convenient).
> 
> Not really.  I thought about that, but the change in gameport_thread() and
> friends is actually necessary _because_ we don't send signals to kernel
> threads any more.

Yes sure. But this change looks correct even if we do send the signal, so it
can go first. It would be nice to have the actual "do not send the signal"
change in a separate patch.

But again, this is minor, please ignore if you don't like this for whatever
reason.

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-10 21:13         ` bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently) Pavel Machek
@ 2007-07-10 21:38           ` Oleg Nesterov
  2007-07-10 21:39           ` Rafael J. Wysocki
  1 sibling, 0 replies; 40+ messages in thread
From: Oleg Nesterov @ 2007-07-10 21:38 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, linux-pm, Miklos Szeredi

On 07/10, Pavel Machek wrote:
> 
> > > > No, we can't do that:
> > > > 
> > > > Imagine we have single uninterruptible task that waits for disk. It
> > > > would exit uninterruptible state in 10msec, *but* you give up and
> > > > unfreeze all. Now, another task goes uninterruptible waiting for
> > > > disk and situation repeats. Livelock.
> > > 
> > > For how many times would that have to repeat before 30s of timeout expires?
> > > 
> > > Sorry, but I don't buy this argument. :-)
> > > 
> > > > Yes, this might play with races in interresting ways and help fuse,
> > > > but we do not want the livelock in the first place.
> > > 
> > > I think that the "livelock" will never happen.
> > > 
> > > Besides, we can add another timeout for breaking the loop from a "locked up"
> > > state.
> > 
> > Actually I like this idea. :-)
> > 
> > I have updated the patch to use the additional timeout, please have a look
> > (below).
> 
> Yes, this one could actually work... _really_ inefficiently.

Why inefficiently?

I am asking because I am curious (I never used freezer for myself): how long
does it take to freeze all tasks? I can't believe we need 20 seconds unless
something goes wrong. It looks very natural to do what Rafael suggests.

Oleg.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-10 21:13         ` bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently) Pavel Machek
  2007-07-10 21:38           ` Oleg Nesterov
@ 2007-07-10 21:39           ` Rafael J. Wysocki
  2007-07-10 21:39             ` Pavel Machek
  1 sibling, 1 reply; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 21:39 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, linux-pm, Oleg Nesterov, Miklos Szeredi

Hi,

On Tuesday, 10 July 2007 23:13, Pavel Machek wrote:
> Hi!
> 
> > > > No, we can't do that:
> > > > 
> > > > Imagine we have single uninterruptible task that waits for disk. It
> > > > would exit uninterruptible state in 10msec, *but* you give up and
> > > > unfreeze all. Now, another task goes uninterruptible waiting for
> > > > disk and situation repeats. Livelock.
> > > 
> > > For how many times would that have to repeat before 30s of timeout expires?
> > > 
> > > Sorry, but I don't buy this argument. :-)
> > > 
> > > > Yes, this might play with races in interresting ways and help fuse,
> > > > but we do not want the livelock in the first place.
> > > 
> > > I think that the "livelock" will never happen.
> > > 
> > > Besides, we can add another timeout for breaking the loop from a "locked up"
> > > state.
> > 
> > Actually I like this idea. :-)
> > 
> > I have updated the patch to use the additional timeout, please have a look
> > (below).
> 
> Yes, this one could actually work... _really_ inefficiently.

Not _that_ inefficiently ...

> Task here is to sort the tasks, and freeze them in such order that
> freezing works, right? Yep, we do not know the dependencies
> explicitely... but what you invented is bogosort.

Define, please?

> Yes, it does increase chance that freezing succeeds, but I do not
> think increase of chance is worth having bogosort in tree :-).

Well.  Do you think it's better to uselessly wait for 20s?

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-10 21:39           ` Rafael J. Wysocki
@ 2007-07-10 21:39             ` Pavel Machek
  2007-07-10 22:07               ` Rafael J. Wysocki
  0 siblings, 1 reply; 40+ messages in thread
From: Pavel Machek @ 2007-07-10 21:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, linux-pm, Oleg Nesterov, Miklos Szeredi

Hi!

> > Task here is to sort the tasks, and freeze them in such order that
> > freezing works, right? Yep, we do not know the dependencies
> > explicitely... but what you invented is bogosort.
> 
> Define, please?

Uff, sorry. http://en.wikipedia.org/wiki/Bogosort .

> > Yes, it does increase chance that freezing succeeds, but I do not
> > think increase of chance is worth having bogosort in tree :-).
> 
> Well.  Do you think it's better to uselessly wait for 20s?

I'd prefer not to add more randomness in the process. If we have
freezing failure in 10% cases now, 2 retries will make it 1%... that
will be ugly to debug...

Hmm, if someone wants to retry... perhaps we should just return
specific error and let the _userland_ do the retries? It can do it as
efficiently as outer loop in freezer...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-10 21:39             ` Pavel Machek
@ 2007-07-10 22:07               ` Rafael J. Wysocki
  2007-07-10 22:21                 ` Pavel Machek
  2007-07-23  8:04                 ` Pavel Machek
  0 siblings, 2 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-10 22:07 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, linux-pm, Oleg Nesterov, Miklos Szeredi

On Tuesday, 10 July 2007 23:39, Pavel Machek wrote:
> Hi!
> 
> > > Task here is to sort the tasks, and freeze them in such order that
> > > freezing works, right? Yep, we do not know the dependencies
> > > explicitely... but what you invented is bogosort.
> > 
> > Define, please?
> 
> Uff, sorry. http://en.wikipedia.org/wiki/Bogosort .

So this thing is not bogosort, because it makes the tasks that haven't frozen
appear on the bottom.  IOW, the first (failing) freezing attempt is used as
a sieve, so the process is not completely random.

This actually is important, because the things like FUSE should be taken
care of within a couple of iterations.

> > > Yes, it does increase chance that freezing succeeds, but I do not
> > > think increase of chance is worth having bogosort in tree :-).
> > 
> > Well.  Do you think it's better to uselessly wait for 20s?
> 
> I'd prefer not to add more randomness in the process. If we have
> freezing failure in 10% cases now, 2 retries will make it 1%... that
> will be ugly to debug...

No more ugly then it's now, I'd say ...

> Hmm, if someone wants to retry... perhaps we should just return
> specific error and let the _userland_ do the retries? It can do it as
> efficiently as outer loop in freezer...

No, because _that_ would be bogosort, as defined in Wikipedia. :-)

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-10 22:07               ` Rafael J. Wysocki
@ 2007-07-10 22:21                 ` Pavel Machek
  2007-07-23  8:04                 ` Pavel Machek
  1 sibling, 0 replies; 40+ messages in thread
From: Pavel Machek @ 2007-07-10 22:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, linux-pm, Oleg Nesterov, Miklos Szeredi

Hi!

> > > > Task here is to sort the tasks, and freeze them in such order that
> > > > freezing works, right? Yep, we do not know the dependencies
> > > > explicitely... but what you invented is bogosort.
> > > 
> > > Define, please?
> > 
> > Uff, sorry. http://en.wikipedia.org/wiki/Bogosort .
> 
> So this thing is not bogosort, because it makes the tasks that haven't frozen
> appear on the bottom.  IOW, the first (failing) freezing attempt is used as
> a sieve, so the process is not completely random.
> 
> This actually is important, because the things like FUSE should be taken
> care of within a couple of iterations.

Aha, ok, I misunderstood the code. So you try to freeze the things
that could not be frozen, first? Hmm, something like that could
work...

> > Hmm, if someone wants to retry... perhaps we should just return
> > specific error and let the _userland_ do the retries? It can do it as
> > efficiently as outer loop in freezer...
> 
> No, because _that_ would be bogosort, as defined in Wikipedia. :-)

So it is me who invented bogosort, heh ;-).
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-10 22:07               ` Rafael J. Wysocki
  2007-07-10 22:21                 ` Pavel Machek
@ 2007-07-23  8:04                 ` Pavel Machek
  2007-07-23 19:16                   ` Rafael J. Wysocki
  1 sibling, 1 reply; 40+ messages in thread
From: Pavel Machek @ 2007-07-23  8:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Garrett, linux-pm, Oleg Nesterov, Miklos Szeredi

Hi!

> > > > Task here is to sort the tasks, and freeze them in such order that
> > > > freezing works, right? Yep, we do not know the dependencies
> > > > explicitely... but what you invented is bogosort.
> > > 
> > > Define, please?
> > 
> > Uff, sorry. http://en.wikipedia.org/wiki/Bogosort .
> 
> So this thing is not bogosort, because it makes the tasks that haven't frozen
> appear on the bottom.  IOW, the first (failing) freezing attempt is used as
> a sieve, so the process is not completely random.

Aha..

> > Hmm, if someone wants to retry... perhaps we should just return
> > specific error and let the _userland_ do the retries? It can do it as
> > efficiently as outer loop in freezer...
> 
> No, because _that_ would be bogosort, as defined in Wikipedia. :-)

Ok, I see. It is better than bogosort... but still ugly. Can we at
least have someone tell us it helps with fuse before we proceed?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently)
  2007-07-23  8:04                 ` Pavel Machek
@ 2007-07-23 19:16                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 40+ messages in thread
From: Rafael J. Wysocki @ 2007-07-23 19:16 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Matthew Garrett, linux-pm, Oleg Nesterov, Miklos Szeredi

Hi,

On Monday, 23 July 2007 10:04, Pavel Machek wrote:
> Hi!
> 
> > > > > Task here is to sort the tasks, and freeze them in such order that
> > > > > freezing works, right? Yep, we do not know the dependencies
> > > > > explicitely... but what you invented is bogosort.
> > > > 
> > > > Define, please?
> > > 
> > > Uff, sorry. http://en.wikipedia.org/wiki/Bogosort .
> > 
> > So this thing is not bogosort, because it makes the tasks that haven't frozen
> > appear on the bottom.  IOW, the first (failing) freezing attempt is used as
> > a sieve, so the process is not completely random.
> 
> Aha..
> 
> > > Hmm, if someone wants to retry... perhaps we should just return
> > > specific error and let the _userland_ do the retries? It can do it as
> > > efficiently as outer loop in freezer...
> > 
> > No, because _that_ would be bogosort, as defined in Wikipedia. :-)
> 
> Ok, I see. It is better than bogosort... but still ugly. Can we at
> least have someone tell us it helps with fuse before we proceed?

Well, I think Matthew has a failing system. :-)

Anyway, I have an idea how to improve it, but first I'd like to flush the
patch queue ...

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2007-07-23 19:16 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-07-09 20:29 [RFC][PATCH -mm 0/6] Freezer update Rafael J. Wysocki
2007-07-09 20:31 ` [RFC][PATCH -mm 1/6] Freezer: Do not sync filesystems Rafael J. Wysocki
2007-07-09 23:12   ` Pavel Machek
2007-07-10  0:31     ` Matthew Garrett
2007-07-09 20:32 ` [RFC][PATCH -mm 2/6] Freezer: Do not send signals to kernel threads Rafael J. Wysocki
2007-07-09 23:42   ` Pavel Machek
2007-07-10  5:57     ` Rafael J. Wysocki
2007-07-10 15:00   ` Oleg Nesterov
2007-07-10 21:08     ` Rafael J. Wysocki
2007-07-10 21:22       ` Oleg Nesterov
2007-07-09 20:33 ` [RFC][PATCH -mm 3/6] Freezer: Be more verbose Rafael J. Wysocki
2007-07-09 23:46   ` Pavel Machek
2007-07-10  6:01     ` Rafael J. Wysocki
2007-07-10 15:05       ` Pavel Machek
2007-07-09 20:34 ` [RFC][PATCH -mm 4/6] Freezer: Prevent new tasks from inheriting TIF_FREEZE set Rafael J. Wysocki
2007-07-09 23:21   ` Pavel Machek
2007-07-10  6:03     ` Rafael J. Wysocki
2007-07-10 15:05       ` Pavel Machek
2007-07-09 20:38 ` [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently Rafael J. Wysocki
2007-07-09 23:34   ` Pavel Machek
2007-07-10  6:09     ` Rafael J. Wysocki
2007-07-10 10:04       ` Rafael J. Wysocki
2007-07-10 17:17         ` Oleg Nesterov
2007-07-10 20:30           ` Rafael J. Wysocki
2007-07-10 20:55             ` Oleg Nesterov
2007-07-10 21:15               ` Rafael J. Wysocki
2007-07-10 18:50         ` Oleg Nesterov
2007-07-10 19:54           ` Rafael J. Wysocki
2007-07-10 20:35             ` Oleg Nesterov
2007-07-10 20:57               ` Rafael J. Wysocki
2007-07-10 21:13         ` bogosort (was Re: Re: [RFC][PATCH -mm 5/6] Freezer: Use freezing timeout more efficiently) Pavel Machek
2007-07-10 21:38           ` Oleg Nesterov
2007-07-10 21:39           ` Rafael J. Wysocki
2007-07-10 21:39             ` Pavel Machek
2007-07-10 22:07               ` Rafael J. Wysocki
2007-07-10 22:21                 ` Pavel Machek
2007-07-23  8:04                 ` Pavel Machek
2007-07-23 19:16                   ` Rafael J. Wysocki
2007-07-09 20:41 ` [RFC][PATCH -mm 6/6] Freezer: Document relationship with memory shrinking Rafael J. Wysocki
2007-07-09 23:23   ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox