linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] fixed sysrq & rcu patches
@ 2014-04-29 18:06 riel
  2014-04-29 18:06 ` [PATCH 1/2] sysrq: rcu-ify __handle_sysrq riel
  2014-04-29 18:06 ` [PATCH 2/2] sysrq,rcu: suppress RCU stall warnings while sysrq runs riel
  0 siblings, 2 replies; 5+ messages in thread
From: riel @ 2014-04-29 18:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, paulmck, rdunlap, richard, umgwanakikbuti

Andrew, these patches contain all the fixes from the threads. They
seem to compile on normal x86 and UML now.

Thanks to Paul, Randy, and everybody else.



* [PATCH 1/2] sysrq: rcu-ify __handle_sysrq
  2014-04-29 18:06 [PATCH 0/2] fixed sysrq & rcu patches riel
@ 2014-04-29 18:06 ` riel
  2014-04-29 18:06 ` [PATCH 2/2] sysrq,rcu: suppress RCU stall warnings while sysrq runs riel
  1 sibling, 0 replies; 5+ messages in thread
From: riel @ 2014-04-29 18:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, paulmck, rdunlap, richard, umgwanakikbuti

From: Rik van Riel <riel@redhat.com>

Echoing values into /proc/sysrq-trigger seems to be a popular way to
get information out of the kernel. However, dumping information about
thousands of processes or hundreds of CPUs to a serial console can
block IRQs for minutes, resulting in various kinds of cascading
failures.

The most common failure is due to interrupts being blocked for a very
long time. This can lead to failed IO requests and other problems the
system cannot easily recover from.

This problem is easily fixable by making __handle_sysrq use RCU
instead of spin_lock_irqsave.

This leaves the warning that RCU grace periods have not elapsed for a
long time, but the system will come back from that automatically.

It also leaves sysrq-from-irq-context when the sysrq keys are pressed,
but that is probably desired since people want that to work in situations
where the system is already hosed.

The callers of register_sysrq_key and unregister_sysrq_key appear to be
capable of sleeping.

Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Madper Xie <cxie@redhat.com>
---
 drivers/tty/sysrq.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index ce396ec..fc67a89 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -510,9 +510,8 @@ void __handle_sysrq(int key, bool check_mask)
 	struct sysrq_key_op *op_p;
 	int orig_log_level;
 	int i;
-	unsigned long flags;
 
-	spin_lock_irqsave(&sysrq_key_table_lock, flags);
+	rcu_read_lock();
 	/*
 	 * Raise the apparent loglevel to maximum so that the sysrq header
 	 * is shown to provide the user with positive feedback.  We do not
@@ -554,7 +553,7 @@ void __handle_sysrq(int key, bool check_mask)
 		printk("\n");
 		console_loglevel = orig_log_level;
 	}
-	spin_unlock_irqrestore(&sysrq_key_table_lock, flags);
+	rcu_read_unlock();
 }
 
 void handle_sysrq(int key)
@@ -1043,16 +1042,23 @@ static int __sysrq_swap_key_ops(int key, struct sysrq_key_op *insert_op_p,
                                 struct sysrq_key_op *remove_op_p)
 {
 	int retval;
-	unsigned long flags;
 
-	spin_lock_irqsave(&sysrq_key_table_lock, flags);
+	spin_lock(&sysrq_key_table_lock);
 	if (__sysrq_get_key_op(key) == remove_op_p) {
 		__sysrq_put_key_op(key, insert_op_p);
 		retval = 0;
 	} else {
 		retval = -1;
 	}
-	spin_unlock_irqrestore(&sysrq_key_table_lock, flags);
+	spin_unlock(&sysrq_key_table_lock);
+
+	/*
+	 * A concurrent __handle_sysrq either got the old op or the new op.
+	 * Wait for it to go away before returning, so the code for an old
+	 * op is not freed (eg. on module unload) while it is in use.
+	 */
+	synchronize_rcu();
+
 	return retval;
 }
 
-- 
1.8.5.3
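
The compare-and-swap rule in __sysrq_swap_key_ops can be sketched in
plain userspace C. This is only a single-threaded model, not the kernel
code: the table size, the flat key_table[] array and the register_key /
unregister_key wrappers are assumptions made for illustration, and the
locking and grace-period wait are shown as comments only. It shows that
a slot is replaced only when it currently holds the op the caller
expects to remove.

```c
#include <assert.h>
#include <stddef.h>

#define SYSRQ_TABLE_SIZE 36	/* hypothetical size, for illustration only */

struct sysrq_key_op {
	const char *help_msg;
};

/* Hypothetical flat table; the kernel maps keys to slots differently. */
static struct sysrq_key_op *key_table[SYSRQ_TABLE_SIZE];

/*
 * Replace the op in slot 'key' with insert_op, but only if the slot
 * currently holds remove_op. Returns 0 on success, -1 otherwise.
 */
static int swap_key_op(int key, struct sysrq_key_op *insert_op,
		       struct sysrq_key_op *remove_op)
{
	int retval;

	assert(key >= 0 && key < SYSRQ_TABLE_SIZE);

	/* The patch takes spin_lock(&sysrq_key_table_lock) here. */
	if (key_table[key] == remove_op) {
		key_table[key] = insert_op;
		retval = 0;
	} else {
		retval = -1;
	}
	/*
	 * The patch drops the lock and then calls synchronize_rcu(), so
	 * that a reader still using the old op finishes before the
	 * caller may free it. This model has no concurrent readers.
	 */
	return retval;
}

/* Registering swaps an empty slot (NULL) for the new op. */
static int register_key(int key, struct sysrq_key_op *op)
{
	return swap_key_op(key, op, NULL);
}

/* Unregistering swaps the expected op for an empty slot. */
static int unregister_key(int key, struct sysrq_key_op *op)
{
	return swap_key_op(key, NULL, op);
}
```

In this model, registering a second op on an occupied key fails with
-1, and so does unregistering an op that is not the one installed.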



* [PATCH 2/2] sysrq,rcu: suppress RCU stall warnings while sysrq runs
  2014-04-29 18:06 [PATCH 0/2] fixed sysrq & rcu patches riel
  2014-04-29 18:06 ` [PATCH 1/2] sysrq: rcu-ify __handle_sysrq riel
@ 2014-04-29 18:06 ` riel
  2014-05-15 23:24   ` Andrew Morton
  1 sibling, 1 reply; 5+ messages in thread
From: riel @ 2014-04-29 18:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: akpm, paulmck, rdunlap, richard, umgwanakikbuti

From: Rik van Riel <riel@redhat.com>

Some sysrq handlers can run for a long time, because they dump a lot
of data onto a serial console. Having RCU stall warnings pop up in
the middle of them only makes the problem worse.

This patch temporarily disables RCU stall warnings while a sysrq
request is handled.

Signed-off-by: Rik van Riel <riel@redhat.com>
Suggested-by: Paul McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Fix TINY_RCU build error. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 drivers/tty/sysrq.c      |  3 +++
 include/linux/rcupdate.h | 12 ++++++++++++
 kernel/rcu/update.c      | 12 ++++++++++++
 3 files changed, 27 insertions(+)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index fc67a89..38d5f9a 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -46,6 +46,7 @@
 #include <linux/jiffies.h>
 #include <linux/syscalls.h>
 #include <linux/of.h>
+#include <linux/rcupdate.h>
 
 #include <asm/ptrace.h>
 #include <asm/irq_regs.h>
@@ -511,6 +512,7 @@ void __handle_sysrq(int key, bool check_mask)
 	int orig_log_level;
 	int i;
 
+	rcu_sysrq_start();
 	rcu_read_lock();
 	/*
 	 * Raise the apparent loglevel to maximum so that the sysrq header
@@ -554,6 +556,7 @@ void __handle_sysrq(int key, bool check_mask)
 		console_loglevel = orig_log_level;
 	}
 	rcu_read_unlock();
+	rcu_sysrq_end();
 }
 
 void handle_sysrq(int key)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 00a7fd6..ec3959b 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -228,6 +228,18 @@ void rcu_idle_exit(void);
 void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 
+#ifdef CONFIG_RCU_STALL_COMMON
+void rcu_sysrq_start(void);
+void rcu_sysrq_end(void);
+#else /* #ifdef CONFIG_RCU_STALL_COMMON */
+static inline void rcu_sysrq_start(void)
+{
+}
+static inline void rcu_sysrq_end(void)
+{
+}
+#endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */
+
 #ifdef CONFIG_RCU_USER_QS
 void rcu_user_enter(void);
 void rcu_user_exit(void);
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 4c0a9b0..d22309c 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -320,6 +320,18 @@ int rcu_jiffies_till_stall_check(void)
 	return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
 }
 
+void rcu_sysrq_start(void)
+{
+	if (!rcu_cpu_stall_suppress)
+		rcu_cpu_stall_suppress = 2;
+}
+
+void rcu_sysrq_end(void)
+{
+	if (rcu_cpu_stall_suppress == 2)
+		rcu_cpu_stall_suppress = 0;
+}
+
 static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
 {
 	rcu_cpu_stall_suppress = 1;
-- 
1.8.5.3
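
The start/end pair above treats rcu_cpu_stall_suppress as a small state
machine, which can be modelled in userspace C. This is a sketch, not
the kernel build: the variable is a plain static int here, and
suppress_after_sysrq() is a hypothetical helper added only to exercise
the two functions. The value 2 marks "suppressed by sysrq", so the end
hook only clears a suppression that the start hook set itself.

```c
#include <assert.h>

/*
 * 0: stall warnings enabled
 * 1: suppressed by someone else (rcu_panic() sets this in the patch context)
 * 2: suppressed temporarily by a running sysrq handler
 */
static int rcu_cpu_stall_suppress;

static void rcu_sysrq_start(void)
{
	/* Only take over if nobody else has suppressed warnings. */
	if (!rcu_cpu_stall_suppress)
		rcu_cpu_stall_suppress = 2;
}

static void rcu_sysrq_end(void)
{
	/* Only undo our own suppression, never someone else's. */
	if (rcu_cpu_stall_suppress == 2)
		rcu_cpu_stall_suppress = 0;
}

/*
 * Hypothetical helper: run one start/end cycle from a given initial
 * value and return the value left behind afterwards.
 */
static int suppress_after_sysrq(int initial)
{
	rcu_cpu_stall_suppress = initial;
	rcu_sysrq_start();
	/* Warnings are off for the whole duration of the handler. */
	assert(rcu_cpu_stall_suppress != 0);
	rcu_sysrq_end();
	return rcu_cpu_stall_suppress;
}
```

Starting from 0, the cycle returns to 0 once the handler is done.
Starting from 1, the value stays 1, so a suppression set elsewhere
survives the sysrq request.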



* Re: [PATCH 2/2] sysrq,rcu: suppress RCU stall warnings while sysrq runs
  2014-04-29 18:06 ` [PATCH 2/2] sysrq,rcu: suppress RCU stall warnings while sysrq runs riel
@ 2014-05-15 23:24   ` Andrew Morton
  2014-05-15 23:48     ` Paul E. McKenney
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2014-05-15 23:24 UTC (permalink / raw)
  To: riel; +Cc: linux-kernel, paulmck, rdunlap, richard, umgwanakikbuti

On Tue, 29 Apr 2014 14:06:36 -0400 riel@redhat.com wrote:

> From: Rik van Riel <riel@redhat.com>
> 
> Some sysrq handlers can run for a long time, because they dump a lot
> of data onto a serial console. Having RCU stall warnings pop up in
> the middle of them only makes the problem worse.
> 
> This patch temporarily disables RCU stall warnings while a sysrq
> request is handled.
>
> ...
>
>  drivers/tty/sysrq.c      |  3 +++
>  include/linux/rcupdate.h | 12 ++++++++++++
>  kernel/rcu/update.c      | 12 ++++++++++++

OK, what's going on here.  Someone (of, I suspect, a Paulish nature)
has plucked out the rcu parts of this patch, put them in linux-next and
omitted the drivers/tty part.  Very tricky!


I have done the opposite and have staged two patches against linux-next:

http://ozlabs.org/~akpm/mmots/broken-out/sysrq-rcu-ify-__handle_sysrq.patch
http://ozlabs.org/~akpm/mmots/broken-out/sysrqrcu-suppress-rcu-stall-warnings-while-sysrq-runs.patch

Please check, review, comment, etc.


* Re: [PATCH 2/2] sysrq,rcu: suppress RCU stall warnings while sysrq runs
  2014-05-15 23:24   ` Andrew Morton
@ 2014-05-15 23:48     ` Paul E. McKenney
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2014-05-15 23:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: riel, linux-kernel, rdunlap, richard, umgwanakikbuti

On Thu, May 15, 2014 at 04:24:12PM -0700, Andrew Morton wrote:
> On Tue, 29 Apr 2014 14:06:36 -0400 riel@redhat.com wrote:
> 
> > From: Rik van Riel <riel@redhat.com>
> > 
> > Some sysrq handlers can run for a long time, because they dump a lot
> > of data onto a serial console. Having RCU stall warnings pop up in
> > the middle of them only makes the problem worse.
> > 
> > This patch temporarily disables RCU stall warnings while a sysrq
> > request is handled.
> >
> > ...
> >
> >  drivers/tty/sysrq.c      |  3 +++
> >  include/linux/rcupdate.h | 12 ++++++++++++
> >  kernel/rcu/update.c      | 12 ++++++++++++
> 
> OK, what's going on here.  Someone (of, I suspect, a Paulish nature)
> has plucked out the rcu parts of this patch, put them in linux-next and
> omitted the drivers/tty part.  Very tricky!

Sounds like something I might do...  ;-)

I intend to push this for the upcoming merge window, in case it matters.

> I have done the opposite and have staged two patches against linux-next:
> 
> http://ozlabs.org/~akpm/mmots/broken-out/sysrq-rcu-ify-__handle_sysrq.patch

I defer to Rik for this one.

> http://ozlabs.org/~akpm/mmots/broken-out/sysrqrcu-suppress-rcu-stall-warnings-while-sysrq-runs.patch

The "[paulmck@linux.vnet.ibm.com: fix TINY_RCU build error]" should be
removed, since it goes with the patch I queued.  Not that it matters
that much, I suppose.

							Thanx, Paul

> Please check, review, comment, etc.
> 


