* [PATCH RT 1/5] tcp: force a dst refcount when prequeue packet
2013-05-03 21:37 [PATCH RT 0/5] [ANNOUNCE] 3.6.11.2-rt34-rc1 Steven Rostedt
@ 2013-05-03 21:37 ` Steven Rostedt
2013-05-03 21:37 ` [PATCH RT 2/5] x86/mce: Defer mce wakeups to threads for PREEMPT_RT Steven Rostedt
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2013-05-03 21:37 UTC
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, stable-rt, Mike Galbraith, Eric Dumazet
[-- Attachment #1: 0001-tcp-force-a-dst-refcount-when-prequeue-packet.patch --]
[-- Type: text/plain, Size: 908 bytes --]
From: Eric Dumazet <edumazet@google.com>
Before leaving the RCU-protected section and adding the packet to the
prequeue, make sure the dst is refcounted.
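The idea can be illustrated outside the kernel. The following is a
minimal userspace sketch, not the kernel implementation: every toy_*
name is hypothetical, and it assumes a borrowed (non-refcounted) dst is
marked by a tag bit in the stored pointer. "Forcing" then means taking a
real reference and clearing the tag before the packet outlives the
protected section. RCU and concurrent refcount drops are deliberately
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a routing cache entry (dst). */
struct toy_dst {
	int refcnt;
};

/* Assumption: the low pointer bit marks a borrowed, non-refcounted dst. */
#define TOY_DST_NOREF ((uintptr_t)1)

struct toy_skb {
	uintptr_t refdst;	/* toy_dst pointer, possibly tagged NOREF */
};

static struct toy_dst *toy_skb_dst(const struct toy_skb *skb)
{
	return (struct toy_dst *)(skb->refdst & ~TOY_DST_NOREF);
}

static bool toy_skb_dst_is_noref(const struct toy_skb *skb)
{
	return (skb->refdst & TOY_DST_NOREF) && toy_skb_dst(skb) != NULL;
}

/* Attach a dst without a reference; only safe inside the protected section. */
static void toy_skb_dst_set_noref(struct toy_skb *skb, struct toy_dst *dst)
{
	skb->refdst = (uintptr_t)dst | TOY_DST_NOREF;
}

/* Turn a borrowed dst into an owned one before the packet is queued. */
static void toy_skb_dst_force(struct toy_skb *skb)
{
	if (toy_skb_dst_is_noref(skb)) {
		struct toy_dst *dst = toy_skb_dst(skb);

		dst->refcnt++;
		skb->refdst = (uintptr_t)dst;	/* clear the NOREF tag */
	}
}
```

Calling toy_skb_dst_force() a second time is a no-op, which is why it is
safe to call unconditionally right before queueing.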
Cc: stable-rt@vger.kernel.org
Reported-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
include/net/tcp.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1f000ff..9297897 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1015,6 +1015,7 @@ static inline bool tcp_prequeue(struct sock *sk, struct sk_buff *skb)
if (sysctl_tcp_low_latency || !tp->ucopy.task)
return false;
+ skb_dst_force(skb);
__skb_queue_tail(&tp->ucopy.prequeue, skb);
tp->ucopy.memory += skb->truesize;
if (tp->ucopy.memory > sk->sk_rcvbuf) {
--
1.7.10.4
* [PATCH RT 2/5] x86/mce: Defer mce wakeups to threads for PREEMPT_RT
From: Steven Rostedt @ 2013-05-03 21:37 UTC
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, stable-rt
[-- Attachment #1: 0002-x86-mce-Defer-mce-wakeups-to-threads-for-PREEMPT_RT.patch --]
[-- Type: text/plain, Size: 5476 bytes --]
From: Steven Rostedt <rostedt@goodmis.org>
We had a customer report a lockup on a 3.0-rt kernel that had the
following backtrace:
[ffff88107fca3e80] rt_spin_lock_slowlock at ffffffff81499113
[ffff88107fca3f40] rt_spin_lock at ffffffff81499a56
[ffff88107fca3f50] __wake_up at ffffffff81043379
[ffff88107fca3f80] mce_notify_irq at ffffffff81017328
[ffff88107fca3f90] intel_threshold_interrupt at ffffffff81019508
[ffff88107fca3fa0] smp_threshold_interrupt at ffffffff81019fc1
[ffff88107fca3fb0] threshold_interrupt at ffffffff814a1853
It hit a BUG because the lock was taken by the same owner that already
held it. The thread that was adding itself to a wait queue held the
lock when an MCE triggered. The MCE interrupt does a wake up on its
wait list and tries to grab the same lock.
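The failure mode can be modelled in a few lines of userspace C. This is
only a sketch under stated assumptions: the toy_* names are
hypothetical, the lock merely records its owner, and a hard interrupt
is modelled as running in the context of the task it interrupted. A
real sleeping lock would block where this one returns -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical task descriptor; only its identity matters here. */
struct toy_task {
	const char *name;
};

/* Owner-tracking lock, standing in for a sleeping lock. */
struct toy_sleeping_lock {
	const struct toy_task *owner;
};

static int toy_lock(struct toy_sleeping_lock *lock, const struct toy_task *curr)
{
	if (lock->owner == curr)
		return -EDEADLK;	/* owner would wait on itself forever */
	if (lock->owner != NULL)
		return -EBUSY;		/* a real sleeping lock would block here */
	lock->owner = curr;
	return 0;
}

static void toy_unlock(struct toy_sleeping_lock *lock, const struct toy_task *curr)
{
	assert(lock->owner == curr);
	lock->owner = NULL;
}

/*
 * A wake-up issued from hard interrupt context. The interrupt has no
 * task of its own, so the lock sees the interrupted task as the caller.
 */
static int toy_hardirq_wake_up(struct toy_sleeping_lock *wq_lock,
			       const struct toy_task *interrupted)
{
	int ret = toy_lock(wq_lock, interrupted);

	if (ret == 0)
		toy_unlock(wq_lock, interrupted);
	return ret;
}
```

If the interrupted task already holds wq_lock, toy_hardirq_wake_up()
reports -EDEADLK, which corresponds to the situation in the backtrace.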
NOTE: THIS IS NOT A BUG ON MAINLINE
Sorry for yelling, but as I Cc'd mainline maintainers I want them to
know that this is a PREEMPT_RT bug only. I only Cc'd them for advice.
On PREEMPT_RT the wait queue locks are converted from normal
"spin_locks" into an rt_mutex (see the rt_spin_lock_slowlock above).
These must not be taken in hard interrupt context. This usually isn't
a problem as almost all interrupts in PREEMPT_RT are converted into
schedulable threads. Unfortunately that's not the case with the MCE irq.
As wait queue locks are notorious for long hold times, we cannot
convert them to raw_spin_locks without causing issues with -rt. But
Thomas has created a "simple-wait" structure that uses raw spin locks
which may have been a good fit.
Unfortunately, wait queues are not the only issue, as mce_notify_irq()
also calls schedule_work(), which grabs the workqueue spin locks that
have the exact same issue.
Thus, this patch I'm proposing is to move the actual work of the MCE
interrupt into a helper thread that gets woken up on the MCE interrupt
and does the work in a schedulable context.
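The deferral pattern itself can be sketched in portable C11. This is an
illustration under assumptions, not the code in the diff below: the
toy_* names are hypothetical, the interrupt side only flips an atomic
flag (no locks), and the helper is shown as a single polling pass
instead of a kernel thread that sleeps until it is woken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Set by the "interrupt", consumed by the helper. */
static atomic_bool toy_notify_pending;

/* Counts how often the deferred work has run. */
static int toy_work_done;

/* The part that needs sleeping locks; only the helper may call it. */
static void toy_notify_work(void)
{
	toy_work_done++;
}

/*
 * Interrupt side: take no locks, just record that work is needed.
 * Returns true if this call newly marked work as pending.
 */
static bool toy_notify_irq(void)
{
	return !atomic_exchange(&toy_notify_pending, true);
}

/*
 * Helper side: one pass of the helper's loop, run in a context that
 * is allowed to sleep. Returns true if the deferred work ran.
 */
static bool toy_helper_poll(void)
{
	if (!atomic_exchange(&toy_notify_pending, false))
		return false;
	toy_notify_work();
	return true;
}
```

Several toy_notify_irq() calls before the helper runs collapse into one
run of the work, so the interrupt side stays cheap however often it
fires.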
NOTE: THIS PATCH ONLY CHANGES THE BEHAVIOR WHEN PREEMPT_RT IS SET
Oops, sorry for yelling again, but I want to stress that I keep the same
behavior of mainline when PREEMPT_RT is not set. Thus, this only changes
the MCE behavior when PREEMPT_RT is configured.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bigeasy@linutronix: make mce_notify_work() a proper prototype, use
kthread_run()]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
arch/x86/kernel/cpu/mcheck/mce.c | 78 +++++++++++++++++++++++++++++---------
1 file changed, 61 insertions(+), 17 deletions(-)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e8d8ad0..e31ea90 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -18,6 +18,7 @@
#include <linux/rcupdate.h>
#include <linux/kobject.h>
#include <linux/uaccess.h>
+#include <linux/kthread.h>
#include <linux/kdebug.h>
#include <linux/kernel.h>
#include <linux/percpu.h>
@@ -1308,6 +1309,63 @@ static void mce_do_trigger(struct work_struct *work)
static DECLARE_WORK(mce_trigger_work, mce_do_trigger);
+static void __mce_notify_work(void)
+{
+ /* Not more than two messages every minute */
+ static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
+
+ /* wake processes polling /dev/mcelog */
+ wake_up_interruptible(&mce_chrdev_wait);
+
+ /*
+ * There is no risk of missing notifications because
+ * work_pending is always cleared before the function is
+ * executed.
+ */
+ if (mce_helper[0] && !work_pending(&mce_trigger_work))
+ schedule_work(&mce_trigger_work);
+
+ if (__ratelimit(&ratelimit))
+ pr_info(HW_ERR "Machine check events logged\n");
+}
+
+#ifdef CONFIG_PREEMPT_RT_FULL
+struct task_struct *mce_notify_helper;
+
+static int mce_notify_helper_thread(void *unused)
+{
+ while (1) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule();
+ if (kthread_should_stop())
+ break;
+ __mce_notify_work();
+ }
+ return 0;
+}
+
+static int mce_notify_work_init(void)
+{
+ mce_notify_helper = kthread_run(mce_notify_helper_thread, NULL,
+ "mce-notify");
+ if (IS_ERR(mce_notify_helper))
+ return PTR_ERR(mce_notify_helper);
+
+ return 0;
+}
+
+static void mce_notify_work(void)
+{
+ wake_up_process(mce_notify_helper);
+}
+#else
+static void mce_notify_work(void)
+{
+ __mce_notify_work();
+}
+static inline int mce_notify_work_init(void) { return 0; }
+#endif
+
/*
* Notify the user(s) about new machine check events.
* Can be called from interrupt context, but not from machine check/NMI
@@ -1315,24 +1373,8 @@ static DECLARE_WORK(mce_trigger_work, mce_do_trigger);
*/
int mce_notify_irq(void)
{
- /* Not more than two messages every minute */
- static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
-
if (test_and_clear_bit(0, &mce_need_notify)) {
- /* wake processes polling /dev/mcelog */
- wake_up_interruptible(&mce_chrdev_wait);
-
- /*
- * There is no risk of missing notifications because
- * work_pending is always cleared before the function is
- * executed.
- */
- if (mce_helper[0] && !work_pending(&mce_trigger_work))
- schedule_work(&mce_trigger_work);
-
- if (__ratelimit(&ratelimit))
- pr_info(HW_ERR "Machine check events logged\n");
-
+ mce_notify_work();
return 1;
}
return 0;
@@ -2375,6 +2417,8 @@ static __init int mcheck_init_device(void)
/* register character device /dev/mcelog */
misc_register(&mce_chrdev_device);
+ err = mce_notify_work_init();
+
return err;
}
device_initcall_sync(mcheck_init_device);
--
1.7.10.4
* [PATCH RT 3/5] powerpc/64bit,PREEMPT_RT: Check preempt_count before preempting
From: Steven Rostedt @ 2013-05-03 21:37 UTC
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, stable-rt, Priyanka Jain
[-- Attachment #1: 0003-powerpc-64bit-PREEMPT_RT-Check-preempt_count-before-.patch --]
[-- Type: text/plain, Size: 1048 bytes --]
From: Priyanka Jain <Priyanka.Jain@freescale.com>
In ret_from_except_lite() with CONFIG_PREEMPT enabled,
add the missing check that compares the value of preempt_count
with zero before continuing with the preemption of
the current task.
If preempt_count is non-zero, restore the registers and return;
otherwise continue with the preemption process.
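Expressed in C, the decision the assembly makes after this change looks
roughly like the sketch below. The function name and the flag's bit
position are hypothetical; the sketch only shows the order of the two
checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit position; the real value comes from thread_info.h. */
#define TOY_TIF_NEED_RESCHED (1u << 2)

/*
 * Decide whether the return-to-kernel path may preempt the current
 * task: preemption must be enabled (count of zero) and a reschedule
 * must have been requested.
 */
static bool toy_should_preempt(unsigned int preempt_count,
			       unsigned int ti_flags)
{
	if (preempt_count != 0)
		return false;	/* the added check: just restore and return */
	return (ti_flags & TOY_TIF_NEED_RESCHED) != 0;
}
```

Without the first check, a task that has preemption disabled but has
the reschedule flag set would still be preempted.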
Cc: stable-rt@vger.kernel.org
Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
arch/powerpc/kernel/entry_64.S | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index a9b98cc..7af1ea7 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -596,6 +596,8 @@ resume_kernel:
#ifdef CONFIG_PREEMPT
/* Check if we need to preempt */
lwz r8,TI_PREEMPT(r9)
+ cmpwi 0,r8,0 /* if non-zero, just restore regs and return */
+ bne restore
andi. r0,r4,_TIF_NEED_RESCHED
bne+ 1f
--
1.7.10.4
* [PATCH RT 4/5] swap: Use unique local lock name for swap_lock
From: Steven Rostedt @ 2013-05-03 21:37 UTC
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Mike Galbraith
[-- Attachment #1: 0004-swap-Use-unique-local-lock-name-for-swap_lock.patch --]
[-- Type: text/plain, Size: 3519 bytes --]
From: Steven Rostedt <rostedt@goodmis.org>
From lib/Kconfig.debug on CONFIG_FORCE_WEAK_PER_CPU:
----
s390 and alpha require percpu variables in modules to be
defined weak to work around addressing range issue which
puts the following two restrictions on percpu variable
definitions.
1. percpu symbols must be unique whether static or not
2. percpu variables can't be defined inside a function
To ensure that generic code follows the above rules, this
option forces all percpu variables to be defined as weak.
----
The addition of the local IRQ swap_lock in mm/swap.c broke this config
as the name "swap_lock" is used throughout the kernel. Just do a "git
grep swap_lock" to see. The new swap_lock is a local lock, which
defines swap_lock as a per-CPU variable.
The fix was to rename swap_lock to swapvec_lock which keeps it unique.
Reported-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
mm/swap.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/mm/swap.c b/mm/swap.c
index 8ef0e84..0f9ad9d 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -42,7 +42,7 @@ static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
static DEFINE_LOCAL_IRQ_LOCK(rotate_lock);
-static DEFINE_LOCAL_IRQ_LOCK(swap_lock);
+static DEFINE_LOCAL_IRQ_LOCK(swapvec_lock);
/*
* This path almost never happens for VM activity - pages are normally
@@ -407,13 +407,13 @@ static void activate_page_drain(int cpu)
void activate_page(struct page *page)
{
if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
- struct pagevec *pvec = &get_locked_var(swap_lock,
+ struct pagevec *pvec = &get_locked_var(swapvec_lock,
activate_page_pvecs);
page_cache_get(page);
if (!pagevec_add(pvec, page))
pagevec_lru_move_fn(pvec, __activate_page, NULL);
- put_locked_var(swap_lock, activate_page_pvecs);
+ put_locked_var(swapvec_lock, activate_page_pvecs);
}
}
@@ -453,12 +453,12 @@ EXPORT_SYMBOL(mark_page_accessed);
void __lru_cache_add(struct page *page, enum lru_list lru)
{
- struct pagevec *pvec = &get_locked_var(swap_lock, lru_add_pvecs)[lru];
+ struct pagevec *pvec = &get_locked_var(swapvec_lock, lru_add_pvecs)[lru];
page_cache_get(page);
if (!pagevec_add(pvec, page))
__pagevec_lru_add(pvec, lru);
- put_locked_var(swap_lock, lru_add_pvecs);
+ put_locked_var(swapvec_lock, lru_add_pvecs);
}
EXPORT_SYMBOL(__lru_cache_add);
@@ -623,19 +623,19 @@ void deactivate_page(struct page *page)
return;
if (likely(get_page_unless_zero(page))) {
- struct pagevec *pvec = &get_locked_var(swap_lock,
+ struct pagevec *pvec = &get_locked_var(swapvec_lock,
lru_deactivate_pvecs);
if (!pagevec_add(pvec, page))
pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL);
- put_locked_var(swap_lock, lru_deactivate_pvecs);
+ put_locked_var(swapvec_lock, lru_deactivate_pvecs);
}
}
void lru_add_drain(void)
{
- lru_add_drain_cpu(local_lock_cpu(swap_lock));
- local_unlock_cpu(swap_lock);
+ lru_add_drain_cpu(local_lock_cpu(swapvec_lock));
+ local_unlock_cpu(swapvec_lock);
}
static void lru_add_drain_per_cpu(struct work_struct *dummy)
@@ -850,7 +850,7 @@ EXPORT_SYMBOL(pagevec_lookup_tag);
static int __init swap_init_locks(void)
{
local_irq_lock_init(rotate_lock);
- local_irq_lock_init(swap_lock);
+ local_irq_lock_init(swapvec_lock);
return 1;
}
early_initcall(swap_init_locks);
--
1.7.10.4
* [PATCH RT 5/5] Linux 3.6.11.2-rt34-rc1
From: Steven Rostedt @ 2013-05-03 21:37 UTC
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur
[-- Attachment #1: 0005-Linux-3.6.11.2-rt34-rc1.patch --]
[-- Type: text/plain, Size: 299 bytes --]
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
---
localversion-rt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/localversion-rt b/localversion-rt
index e1d8362..c2c1097 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt33
+-rt34-rc1
--
1.7.10.4