* [PATCH RT 0/4] Linux 3.4.82-rt103-rc1
@ 2014-04-27 14:35 Steven Rostedt
2014-04-27 14:35 ` [PATCH RT 1/4] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls Steven Rostedt
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Steven Rostedt @ 2014-04-27 14:35 UTC (permalink / raw)
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Paul Gortmaker
Dear RT Folks,
This is the RT stable review cycle of patch 3.4.82-rt103-rc1.
Please scream at me if I messed something up. Please test the patches too.
The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).
The pre-releases will not be pushed to the git repository, only the
final release is.
If all goes well, this patch will be converted to the next main release
on 4/30/2014.
Enjoy,
-- Steve
To build 3.4.82-rt103-rc1 directly, the following patches should be applied:
http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.4.tar.xz
http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.4.82.xz
http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patch-3.4.82-rt103-rc1.patch.xz
You can also build from 3.4.82-rt102 by applying the incremental patch:
http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/incr/patch-3.4.82-rt102-rt103-rc1.patch.xz
Changes from 3.4.82-rt102:
---
Marc Kleine-Budde (1):
net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls
Sebastian Andrzej Siewior (1):
fs: jbd2: pull your plug when waiting for space
Steven Rostedt (1):
cpu_chill: Add a UNINTERRUPTIBLE hrtimer_nanosleep
Steven Rostedt (Red Hat) (1):
Linux 3.4.82-rt103-rc1
----
fs/jbd2/checkpoint.c | 2 ++
kernel/hrtimer.c | 25 ++++++++++++++++++-------
localversion-rt | 2 +-
net/sched/sch_generic.c | 2 +-
4 files changed, 22 insertions(+), 9 deletions(-)
* [PATCH RT 1/4] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls
2014-04-27 14:35 [PATCH RT 0/4] Linux 3.4.82-rt103-rc1 Steven Rostedt
@ 2014-04-27 14:35 ` Steven Rostedt
2014-04-27 14:36 ` [PATCH RT 2/4] fs: jbd2: pull your plug when waiting for space Steven Rostedt
` (2 subsequent siblings)
3 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2014-04-27 14:35 UTC (permalink / raw)
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Paul Gortmaker, stable-rt, Marc Kleine-Budde
[-- Attachment #1: 0001-net-sched-dev_deactivate_many-use-msleep-1-instead-o.patch --]
[-- Type: text/plain, Size: 3290 bytes --]
3.4.82-rt103-rc1 stable review patch.
If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde <mkl@pengutronix.de>

On PREEMPT_RT enabled systems the interrupt handlers run as threads at prio 50
(by default). If a high priority userspace process tries to shut down a busy
network interface it might spin in a yield loop waiting for the device to
become idle. With the interrupt thread having a lower priority than the
looping process it might never be scheduled and so result in a deadlock on UP
systems.

With Magic SysRq the following backtrace can be produced:

> test_app R running 0 174 168 0x00000000
> [<c02c7070>] (__schedule+0x220/0x3fc) from [<c02c7870>] (preempt_schedule_irq+0x48/0x80)
> [<c02c7870>] (preempt_schedule_irq+0x48/0x80) from [<c0008fa8>] (svc_preempt+0x8/0x20)
> [<c0008fa8>] (svc_preempt+0x8/0x20) from [<c001a984>] (local_bh_enable+0x18/0x88)
> [<c001a984>] (local_bh_enable+0x18/0x88) from [<c025316c>] (dev_deactivate_many+0x220/0x264)
> [<c025316c>] (dev_deactivate_many+0x220/0x264) from [<c023be04>] (__dev_close_many+0x64/0xd4)
> [<c023be04>] (__dev_close_many+0x64/0xd4) from [<c023be9c>] (__dev_close+0x28/0x3c)
> [<c023be9c>] (__dev_close+0x28/0x3c) from [<c023f7f0>] (__dev_change_flags+0x88/0x130)
> [<c023f7f0>] (__dev_change_flags+0x88/0x130) from [<c023f904>] (dev_change_flags+0x10/0x48)
> [<c023f904>] (dev_change_flags+0x10/0x48) from [<c024c140>] (do_setlink+0x370/0x7ec)
> [<c024c140>] (do_setlink+0x370/0x7ec) from [<c024d2f0>] (rtnl_newlink+0x2b4/0x450)
> [<c024d2f0>] (rtnl_newlink+0x2b4/0x450) from [<c024cfa0>] (rtnetlink_rcv_msg+0x158/0x1f4)
> [<c024cfa0>] (rtnetlink_rcv_msg+0x158/0x1f4) from [<c0256740>] (netlink_rcv_skb+0xac/0xc0)
> [<c0256740>] (netlink_rcv_skb+0xac/0xc0) from [<c024bbd8>] (rtnetlink_rcv+0x18/0x24)
> [<c024bbd8>] (rtnetlink_rcv+0x18/0x24) from [<c02561b8>] (netlink_unicast+0x13c/0x198)
> [<c02561b8>] (netlink_unicast+0x13c/0x198) from [<c025651c>] (netlink_sendmsg+0x264/0x2e0)
> [<c025651c>] (netlink_sendmsg+0x264/0x2e0) from [<c022af98>] (sock_sendmsg+0x78/0x98)
> [<c022af98>] (sock_sendmsg+0x78/0x98) from [<c022bb50>] (___sys_sendmsg.part.25+0x268/0x278)
> [<c022bb50>] (___sys_sendmsg.part.25+0x268/0x278) from [<c022cf08>] (__sys_sendmsg+0x48/0x78)
> [<c022cf08>] (__sys_sendmsg+0x48/0x78) from [<c0009320>] (ret_fast_syscall+0x0/0x2c)

This patch works around the problem by replacing yield() with msleep(1), giving
the interrupt thread time to finish, similar to other changes contained in the
rt patch set. Using wait_for_completion() instead would probably be a better
solution.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
net/sched/sch_generic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 67fc573..455d21a 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -848,7 +848,7 @@ void dev_deactivate_many(struct list_head *head)
 	/* Wait for outstanding qdisc_run calls. */
 	list_for_each_entry(dev, head, unreg_list)
 		while (some_qdisc_is_busy(dev))
-			yield();
+			msleep(1);
 }
 
 void dev_deactivate(struct net_device *dev)
-- 
1.8.5.3
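As a rough userspace analogy for the yield()-to-msleep(1) change in patch 1/4 (this is not kernel code, only a sketch under stated assumptions): a waiter polls a busy flag and sleeps for about a millisecond between checks, so whatever has to clear the flag gets time to run. The names busy, on_alarm, sleep_1ms and wait_until_idle are made up for illustration, and a one-shot SIGALRM timer stands in for the lower-priority interrupt thread finishing its work.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical stand-in for some_qdisc_is_busy(): nonzero while work is outstanding. */
static volatile sig_atomic_t busy;

/* Stand-in for the lower-priority worker finishing: clears the busy flag. */
static void on_alarm(int sig)
{
	(void)sig;
	busy = 0;
}

/* Sleep for roughly one millisecond, loosely mirroring msleep(1). */
static void sleep_1ms(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000L };

	nanosleep(&ts, NULL);	/* an early return on a signal is harmless here */
}

/*
 * Mark the resource busy, arrange for it to become idle after delay_ms,
 * then poll with a sleep between checks instead of spinning.
 * Returns the number of sleeps taken, or -1 on setup failure.
 */
static int wait_until_idle(int delay_ms)
{
	struct sigaction sa;
	struct itimerval it;
	int sleeps = 0;

	assert(delay_ms > 0);	/* a zero timer value would never fire */

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -1;

	busy = 1;
	memset(&it, 0, sizeof(it));
	it.it_value.tv_sec = delay_ms / 1000;
	it.it_value.tv_usec = (delay_ms % 1000) * 1000;
	if (setitimer(ITIMER_REAL, &it, NULL) != 0)
		return -1;

	while (busy) {
		sleep_1ms();	/* gives up the CPU for a bounded time */
		sleeps++;
	}
	return sleeps;
}
```

Calling wait_until_idle(5) from a small main() returns once the timer has cleared the flag. The difference from a sched_yield() loop is that the waiter is guaranteed to be off the CPU for a while, regardless of how its priority compares with that of the code it is waiting for.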
* [PATCH RT 2/4] fs: jbd2: pull your plug when waiting for space
2014-04-27 14:35 [PATCH RT 0/4] Linux 3.4.82-rt103-rc1 Steven Rostedt
2014-04-27 14:35 ` [PATCH RT 1/4] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls Steven Rostedt
@ 2014-04-27 14:36 ` Steven Rostedt
2014-04-27 14:36 ` [PATCH RT 3/4] cpu_chill: Add a UNINTERRUPTIBLE hrtimer_nanosleep Steven Rostedt
2014-04-27 14:36 ` [PATCH RT 4/4] Linux 3.4.82-rt103-rc1 Steven Rostedt
3 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2014-04-27 14:36 UTC (permalink / raw)
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Paul Gortmaker, stable-rt
[-- Attachment #1: 0002-fs-jbd2-pull-your-plug-when-waiting-for-space.patch --]
[-- Type: text/plain, Size: 1161 bytes --]
3.4.82-rt103-rc1 stable review patch.
If anyone has any objections, please let me know.
------------------
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Two cps in parallel managed to stall the ext4 fs. It seems that the journal
code is either waiting for locks or sleeping waiting for something to
happen. This seems similar to what Mike observed on ext3, here is his
description:

|With an -rt kernel, and a heavy sync IO load, tasks can jam
|up on journal locks without unplugging, which can lead to
|terminal IO starvation. Unplug and schedule when waiting
|for space.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
fs/jbd2/checkpoint.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index c78841e..a4d273b 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -125,6 +125,8 @@ void __jbd2_log_wait_for_space(journal_t *journal)
 		if (journal->j_flags & JBD2_ABORT)
 			return;
 		write_unlock(&journal->j_state_lock);
+		if (current->plug)
+			io_schedule();
 		mutex_lock(&journal->j_checkpoint_mutex);
 
 		/*
-- 
1.8.5.3
* [PATCH RT 3/4] cpu_chill: Add a UNINTERRUPTIBLE hrtimer_nanosleep
2014-04-27 14:35 [PATCH RT 0/4] Linux 3.4.82-rt103-rc1 Steven Rostedt
2014-04-27 14:35 ` [PATCH RT 1/4] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls Steven Rostedt
2014-04-27 14:36 ` [PATCH RT 2/4] fs: jbd2: pull your plug when waiting for space Steven Rostedt
@ 2014-04-27 14:36 ` Steven Rostedt
2014-04-27 14:36 ` [PATCH RT 4/4] Linux 3.4.82-rt103-rc1 Steven Rostedt
3 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2014-04-27 14:36 UTC (permalink / raw)
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Paul Gortmaker, stable-rt, Ulrich Obergfell
[-- Attachment #1: 0003-cpu_chill-Add-a-UNINTERRUPTIBLE-hrtimer_nanosleep.patch --]
[-- Type: text/plain, Size: 4137 bytes --]
3.4.82-rt103-rc1 stable review patch.
If anyone has any objections, please let me know.
------------------
From: Steven Rostedt <rostedt@goodmis.org>

We hit another bug that was caused by switching cpu_chill() from
msleep() to hrtimer_nanosleep().

This time it is a livelock. The problem is that hrtimer_nanosleep()
calls schedule with the state == TASK_INTERRUPTIBLE. But this means
that if a signal is pending, the scheduler won't schedule, and will
simply change the current task state back to TASK_RUNNING. This
nullifies the whole point of cpu_chill() in the first place. That is,
if a task is spinning on a try_lock() and it preempted the owner of the
lock, if it has a signal pending, it will never give up the CPU to let
the owner of the lock run.

I made a static function __hrtimer_nanosleep() that takes a fifth
parameter "state", which determines the task state that the
nanosleep() will be in. The normal hrtimer_nanosleep() will act the
same, but cpu_chill() will call the __hrtimer_nanosleep() directly with
the TASK_UNINTERRUPTIBLE state.

cpu_chill() only cares that the first sleep happens, and does not care
about the state of the restart schedule (in hrtimer_nanosleep_restart).

Cc: stable-rt@vger.kernel.org
Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/hrtimer.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index a87d70d..5342f82 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1724,12 +1724,13 @@ void hrtimer_init_sleeper(struct hrtimer_sleeper *sl, struct task_struct *task)
 }
 EXPORT_SYMBOL_GPL(hrtimer_init_sleeper);
 
-static int __sched do_nanosleep(struct hrtimer_sleeper *t, enum hrtimer_mode mode)
+static int __sched do_nanosleep(struct hrtimer_sleeper *t, enum hrtimer_mode mode,
+				unsigned long state)
 {
 	hrtimer_init_sleeper(t, current);
 
 	do {
-		set_current_state(TASK_INTERRUPTIBLE);
+		set_current_state(state);
 		hrtimer_start_expires(&t->timer, mode);
 		if (!hrtimer_active(&t->timer))
 			t->task = NULL;
@@ -1773,7 +1774,8 @@ long __sched hrtimer_nanosleep_restart(struct restart_block *restart)
 				HRTIMER_MODE_ABS);
 	hrtimer_set_expires_tv64(&t.timer, restart->nanosleep.expires);
 
-	if (do_nanosleep(&t, HRTIMER_MODE_ABS))
+	/* cpu_chill() does not care about restart state. */
+	if (do_nanosleep(&t, HRTIMER_MODE_ABS, TASK_INTERRUPTIBLE))
 		goto out;
 
 	rmtp = restart->nanosleep.rmtp;
@@ -1790,8 +1792,10 @@ out:
 	return ret;
 }
 
-long hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp,
-		       const enum hrtimer_mode mode, const clockid_t clockid)
+static long
+__hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp,
+		    const enum hrtimer_mode mode, const clockid_t clockid,
+		    unsigned long state)
 {
 	struct restart_block *restart;
 	struct hrtimer_sleeper t;
@@ -1804,7 +1808,7 @@ long hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp,
 
 	hrtimer_init_on_stack(&t.timer, clockid, mode);
 	hrtimer_set_expires_range_ns(&t.timer, timespec_to_ktime(*rqtp), slack);
-	if (do_nanosleep(&t, mode))
+	if (do_nanosleep(&t, mode, state))
 		goto out;
 
 	/* Absolute timers do not update the rmtp value and restart: */
@@ -1831,6 +1835,12 @@ out:
 	return ret;
 }
 
+long hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp,
+		       const enum hrtimer_mode mode, const clockid_t clockid)
+{
+	return __hrtimer_nanosleep(rqtp, rmtp, mode, clockid, TASK_INTERRUPTIBLE);
+}
+
 SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
 		struct timespec __user *, rmtp)
 {
@@ -1857,7 +1867,8 @@ void cpu_chill(void)
 	unsigned int freeze_flag = current->flags & PF_NOFREEZE;
 
 	current->flags |= PF_NOFREEZE;
-	hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+	__hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC,
+			    TASK_UNINTERRUPTIBLE);
 	if (!freeze_flag)
 		current->flags &= ~PF_NOFREEZE;
 }
-- 
1.8.5.3
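The shape of the refactoring in patch 3/4 (an internal helper that takes the sleep state as an extra parameter, a public wrapper that keeps the old behaviour, and one special caller that asks for the uninterruptible variant) can be sketched in userspace C. This is only an analogy and rests on assumptions: sleep_mode, sleep_ms_mode, sleep_ms, chill_ms and arm_alarm_ms are hypothetical names, and POSIX nanosleep() returning EINTR stands in for a signal cutting a TASK_INTERRUPTIBLE sleep short. In the kernel a TASK_UNINTERRUPTIBLE sleeper is not woken by the signal at all; here the helper instead resumes for the remaining time.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical sleep modes, loosely analogous to the two kernel task states. */
enum sleep_mode { SLEEP_INTERRUPTIBLE, SLEEP_UNINTERRUPTIBLE };

/*
 * Internal helper with the extra "mode" parameter.
 * Returns 0 after the full sleep, or -1 if the sleep was cut short
 * (only possible in interruptible mode, or on an unexpected error).
 */
static int sleep_ms_mode(long ms, enum sleep_mode mode)
{
	struct timespec req, rem;

	assert(ms >= 0);
	req.tv_sec = ms / 1000;
	req.tv_nsec = (ms % 1000) * 1000000L;

	while (nanosleep(&req, &rem) == -1) {
		if (errno != EINTR || mode == SLEEP_INTERRUPTIBLE)
			return -1;
		req = rem;	/* keep sleeping for the time that is left */
	}
	return 0;
}

/* Public wrapper keeps the old behaviour: a signal ends the sleep early. */
static int sleep_ms(long ms)
{
	return sleep_ms_mode(ms, SLEEP_INTERRUPTIBLE);
}

/* Caller that must really sleep, in the spirit of cpu_chill(). */
static int chill_ms(long ms)
{
	return sleep_ms_mode(ms, SLEEP_UNINTERRUPTIBLE);
}

/* Empty handler: its only job is to make nanosleep() return EINTR. */
static void on_alarm(int sig)
{
	(void)sig;
}

/* Arm a one-shot SIGALRM after delay_ms so that a sleep gets interrupted. */
static int arm_alarm_ms(long delay_ms)
{
	struct sigaction sa;
	struct itimerval it;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -1;

	memset(&it, 0, sizeof(it));
	it.it_value.tv_sec = delay_ms / 1000;
	it.it_value.tv_usec = (delay_ms % 1000) * 1000;
	return setitimer(ITIMER_REAL, &it, NULL);
}
```

With an alarm armed shortly before the call, sleep_ms() reports that it was interrupted, while chill_ms() still completes its full delay. Existing callers of the wrapper see no change, which is the same property the patch description claims for hrtimer_nanosleep().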
* [PATCH RT 4/4] Linux 3.4.82-rt103-rc1
2014-04-27 14:35 [PATCH RT 0/4] Linux 3.4.82-rt103-rc1 Steven Rostedt
` (2 preceding siblings ...)
2014-04-27 14:36 ` [PATCH RT 3/4] cpu_chill: Add a UNINTERRUPTIBLE hrtimer_nanosleep Steven Rostedt
@ 2014-04-27 14:36 ` Steven Rostedt
3 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2014-04-27 14:36 UTC (permalink / raw)
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Paul Gortmaker
[-- Attachment #1: 0004-Linux-3.4.82-rt103-rc1.patch --]
[-- Type: text/plain, Size: 408 bytes --]
3.4.82-rt103-rc1 stable review patch.
If anyone has any objections, please let me know.
------------------
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
---
localversion-rt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index 33017cd..b809d7b 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt102
+-rt103-rc1
-- 
1.8.5.3
* [PATCH RT 0/4] Linux 3.4.82-rt103-rc1
@ 2014-03-13 10:44 Steven Rostedt
2014-03-13 10:44 ` [PATCH RT 4/4] " Steven Rostedt
0 siblings, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2014-03-13 10:44 UTC (permalink / raw)
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Paul Gortmaker
Dear RT Folks,
This is the RT stable review cycle of patch 3.4.82-rt103-rc1.
Please scream at me if I messed something up. Please test the patches too.
The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).
The pre-releases will not be pushed to the git repository, only the
final release is.
If all goes well, this patch will be converted to the next main release
on 3/18/2014.
Enjoy,
-- Steve
To build 3.4.82-rt103-rc1 directly, the following patches should be applied:
http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.4.tar.xz
http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.4.82.xz
http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patch-3.4.82-rt103-rc1.patch.xz
You can also build from 3.4.82-rt102 by applying the incremental patch:
http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/incr/patch-3.4.82-rt102-rt103-rc1.patch.xz
Changes from 3.4.82-rt102:
---
Marc Kleine-Budde (1):
net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls
Sebastian Andrzej Siewior (1):
fs: jbd2: pull your plug when waiting for space
Steven Rostedt (1):
cpu_chill: Add a UNINTERRUPTIBLE hrtimer_nanosleep
Steven Rostedt (Red Hat) (1):
Linux 3.4.82-rt103-rc1
----
fs/jbd2/checkpoint.c | 2 ++
kernel/hrtimer.c | 25 ++++++++++++++++++-------
localversion-rt | 2 +-
net/sched/sch_generic.c | 2 +-
4 files changed, 22 insertions(+), 9 deletions(-)
* [PATCH RT 4/4] Linux 3.4.82-rt103-rc1
2014-03-13 10:44 [PATCH RT 0/4] " Steven Rostedt
@ 2014-03-13 10:44 ` Steven Rostedt
0 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2014-03-13 10:44 UTC (permalink / raw)
To: linux-kernel, linux-rt-users
Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
John Kacur, Paul Gortmaker
[-- Attachment #1: 0004-Linux-3.4.82-rt103-rc1.patch --]
[-- Type: text/plain, Size: 406 bytes --]
3.4.82-rt103-rc1 stable review patch.
If anyone has any objections, please let me know.
------------------
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
---
localversion-rt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index 33017cd..b809d7b 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt102
+-rt103-rc1
-- 
1.8.5.3