stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: <stable@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Cc: Chanho Min <chanho0207@gmail.com>,
	Chanho Min <chanho.min@lge.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Ingo Molnar <mingo@elte.hu>,
	Paul Gortmaker <paul.gortmaker@windriver.com>
Subject: [v2.6.34-stable 73/77] sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW
Date: Tue, 8 Jan 2013 18:35:52 -0500	[thread overview]
Message-ID: <1357688156-25387-74-git-send-email-paul.gortmaker@windriver.com> (raw)
In-Reply-To: <1357688156-25387-1-git-send-email-paul.gortmaker@windriver.com>

From: Chanho Min <chanho0207@gmail.com>

                   -------------------
    This is a commit scheduled for the next v2.6.34 longterm release.
    http://git.kernel.org/?p=linux/kernel/git/paulg/longterm-queue-2.6.34.git
    If you see a problem with using this for longterm, please comment.
                   -------------------

commit cb297a3e433dbdcf7ad81e0564e7b804c941ff0d upstream.

This issue happens under the following conditions:

 1. preemption is off
 2. __ARCH_WANT_INTERRUPTS_ON_CTXSW is defined
 3. RT scheduling class
 4. SMP system

Sequence is as follows:

 1.suppose current task is A. start schedule()
 2.task A is enqueued pushable task at the entry of schedule()
   __schedule
    prev = rq->curr;
    ...
    put_prev_task
     put_prev_task_rt
      enqueue_pushable_task
 4.pick the task B as next task.
   next = pick_next_task(rq);
 3.rq->curr set to task B and context_switch is started.
   rq->curr = next;
 4.At the entry of context_swtich, release this cpu's rq->lock.
   context_switch
    prepare_task_switch
     prepare_lock_switch
      raw_spin_unlock_irq(&rq->lock);
 5.Shortly after rq->lock is released, interrupt is occurred and start IRQ context
 6.try_to_wake_up() which called by ISR acquires rq->lock
    try_to_wake_up
     ttwu_remote
      rq = __task_rq_lock(p)
      ttwu_do_wakeup(rq, p, wake_flags);
        task_woken_rt
 7.push_rt_task picks the task A which is enqueued before.
   task_woken_rt
    push_rt_tasks(rq)
     next_task = pick_next_pushable_task(rq)
 8.At find_lock_lowest_rq(), If double_lock_balance() returns 0,
   lowest_rq can be the remote rq.
  (But,If preemption is on, double_lock_balance always return 1 and it
   does't happen.)
   push_rt_task
    find_lock_lowest_rq
     if (double_lock_balance(rq, lowest_rq))..
 9.find_lock_lowest_rq return the available rq. task A is migrated to
   the remote cpu/rq.
   push_rt_task
    ...
    deactivate_task(rq, next_task, 0);
    set_task_cpu(next_task, lowest_rq->cpu);
    activate_task(lowest_rq, next_task, 0);
 10. But, task A is on irq context at this cpu.
     So, task A is scheduled by two cpus at the same time until restore from IRQ.
     Task A's stack is corrupted.

To fix it, don't migrate an RT task if it's still running.

Signed-off-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/CAOAMb1BHA=5fm7KTewYyke6u-8DP0iUuJMpgQw54vNeXFsGpoQ@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[PG: in 2.6.34, sched/rt.c is just sched_rt.c]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 kernel/sched_rt.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index fd8c1a3..abd5cba 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -1315,6 +1315,11 @@ static int push_rt_task(struct rq *rq)
 	if (!next_task)
 		return 0;
 
+#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
+	if (unlikely(task_running(rq, next_task)))
+		return 0;
+#endif
+
  retry:
 	if (unlikely(next_task == rq->curr)) {
 		WARN_ON(1);
-- 
1.7.12.1


  parent reply	other threads:[~2013-01-08 23:35 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-08 23:34 [v2.6.34-stable 00/77] v2.6.34.14 longterm review Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 01/77] net: sock: validate data_len before allocating skb in sock_alloc_send_pskb() Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 02/77] time: Improve sanity checking of timekeeping inputs Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 03/77] time: Avoid making adjustments if we haven't accumulated anything Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 04/77] time: Move ktime_t overflow checking into timespec_valid_strict Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 05/77] ALSA: hda_intel: ALSA HD Audio patch for Intel Patsburg DeviceIDs Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 06/77] ALSA: hda: add Vortex86MX PCI ids Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 07/77] ALSA: hda - Add support for VMware controller Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 08/77] ALSA: hda - Reduce pci id list for Intel with class id Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 09/77] ALSA: hda - ALSA HD Audio patch for Intel Panther Point DeviceIDs Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 10/77] ALSA: hda: Use position_fix=1 for Acer Aspire 5538 to enable capture on internal mic Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 11/77] cifs: fix cifs stable patch cifs-fix-oplock-break-handling-try-2.patch Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 12/77] gro: reset vlan_tci on reuse Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 13/77] md: Fix handling for devices from 2TB to 4TB in 0.90 metadata Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 14/77] md: Don't truncate size at 4TB for RAID0 and Linear Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 15/77] genalloc: stop crashing the system when destroying a pool Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 16/77] inotify: stop kernel memory leak on file creation failure Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 17/77] xfs: validate acl count Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 18/77] xfs: fix acl count validation in xfs_acl_from_disk() Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 19/77] x86, ioapic: initialize nr_ioapic_registers early in mp_register_ioapic() Paul Gortmaker
2013-01-08 23:34 ` [v2.6.34-stable 20/77] i2c-algo-bit: Generate correct i2c address sequence for 10-bit target Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 21/77] eCryptfs: Extend array bounds for all filename chars Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 22/77] PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 23/77] ARM: 7161/1: errata: no automatic store buffer drain Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 24/77] ALSA: lx6464es - fix device communication via command bus Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 25/77] SUNRPC: Ensure we return EAGAIN in xs_nospace if congestion is cleared Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 26/77] timekeeping: add arch_offset hook to ktime_get functions Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 27/77] p54spi: Add missing spin_lock_init Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 28/77] p54spi: Fix workqueue deadlock Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 29/77] nl80211: fix MAC address validation Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 30/77] staging: usbip: bugfix for deadlock Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 31/77] staging: comedi: fix oops for USB DAQ devices Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 32/77] Staging: comedi: fix signal handling in read and write Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 33/77] USB: whci-hcd: fix endian conversion in qset_clear() Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 34/77] usb: ftdi_sio: add PID for Propox ISPcable III Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 35/77] usb: option: add SIMCom SIM5218 Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 36/77] USB: usb-storage: unusual_devs entry for Kingston DT 101 G2 Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 37/77] Silencing 'killing requests for dead queue' Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 38/77] sched, x86: Avoid unnecessary overflow in sched_clock Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 39/77] x86/mpparse: Account for bus types other than ISA and PCI Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 40/77] oprofile, x86: Fix crash when unloading module (nmi timer mode) Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 41/77] genirq: Fix race condition when stopping the irq thread Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 42/77] tick-broadcast: Stop active broadcast device when replacing it Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 43/77] ALSA: sis7019 - give slow codecs more time to reset Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 44/77] ALSA: hda/realtek - Fix Oops in alc_mux_select() Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 45/77] ARM: davinci: dm646x evm: wrong register used in setup_vpif_input_channel_mode Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 46/77] oprofile: Free potentially owned tasks in case of errors Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 47/77] oprofile: Fix locking dependency in sync_start() Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 48/77] percpu: fix first chunk match in per_cpu_ptr_to_phys() Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 49/77] percpu: fix chunk range calculation Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 50/77] xfrm: Fix key lengths for rfc3686(ctr(aes)) Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 51/77] linux/log2.h: Fix rounddown_pow_of_two(1) Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 52/77] jbd/jbd2: validate sb->s_first in journal_get_superblock() Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 53/77] Make TASKSTATS require root access Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 54/77] hfs: fix hfs_find_init() sb->ext_tree NULL ptr oops Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 55/77] export __get_user_pages_fast() function Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 56/77] oprofile, x86: Fix nmi-unsafe callgraph support Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 57/77] ext4: avoid hangs in ext4_da_should_update_i_disksize() Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 58/77] USB: cdc-acm: add IDs for Motorola H24 HSPA USB module Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 59/77] udf: Fortify loading of sparing table Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 60/77] udf: Avoid run away loop when partition table length is corrupted Paul Gortmaker
2013-01-10 14:43   ` Ben Hutchings
2013-01-10 17:03     ` Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 61/77] sctp: malloc enough room for asconf-ack chunk Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 62/77] sctp: Fix list corruption resulting from freeing an association on a list Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 63/77] sctp: ABORT if receive, reassmbly, or reodering queue is not empty while closing socket Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 64/77] sctp: Enforce retransmission limit during shutdown Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 65/77] SCTP: fix race between sctp_bind_addr_free() and sctp_bind_addr_conflict() Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 66/77] KVM: x86: Prevent starting PIT timers in the absence of irqchip support Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 67/77] perf_events: Fix races in group composition Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 68/77] perf: Fix tear-down of inherited group events Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 69/77] sched: fix divide by zero at {thread_group,task}_times Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 70/77] mutex: Place lock in contended state after fastpath_lock failure Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 71/77] crypto: ghash - Avoid null pointer dereference if no key is set Paul Gortmaker
2013-01-09  2:56   ` Nick Bowler
2013-01-09 14:56     ` Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 72/77] net: Fix ip link add netns oops Paul Gortmaker
2013-01-08 23:35 ` Paul Gortmaker [this message]
2013-01-08 23:35 ` [v2.6.34-stable 74/77] rwsem: Remove redundant asmregparm annotation Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 75/77] um: Use RWSEM_GENERIC_SPINLOCK on x86 Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 76/77] x86: Get rid of asmregparm Paul Gortmaker
2013-01-08 23:35 ` [v2.6.34-stable 77/77] x86: Don't use the EFI reboot method by default Paul Gortmaker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1357688156-25387-74-git-send-email-paul.gortmaker@windriver.com \
    --to=paul.gortmaker@windriver.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=chanho.min@lge.com \
    --cc=chanho0207@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).