* 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
@ 2011-05-10  1:04 Valdis.Kletnieks
  2011-05-10  8:20 ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread

From: Valdis.Kletnieks @ 2011-05-10  1:04 UTC (permalink / raw)
To: Andrew Morton, Paul E. McKenney, Ingo Molnar, Peter Zijlstra
Cc: linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 10107 bytes --]

Seen at boot earlier today.  Interrupt, page fault, RCU, scheduler, and MM
code in the tracebacks.  Yee-hah.

[ 12.872150] usb 1-4: new high speed USB device number 3 using ehci_hcd
[ 12.986667] usb 1-4: New USB device found, idVendor=413c, idProduct=2513
[ 12.986679] usb 1-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 12.987691] hub 1-4:1.0: USB hub found
[ 12.987877] hub 1-4:1.0: 3 ports detected
[ 12.996372] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input10
[ 13.071471] udevadm used greatest stack depth: 3984 bytes left
[ 13.172129]
[ 13.172130] =======================================================
[ 13.172425] [ INFO: possible circular locking dependency detected ]
[ 13.172650] 2.6.39-rc6-mmotm0506 #1
[ 13.172773] -------------------------------------------------------
[ 13.172997] blkid/267 is trying to acquire lock:
[ 13.173009]  (&p->pi_lock){-.-.-.}, at: [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa
[ 13.173009]
[ 13.173009] but task is already holding lock:
[ 13.173009]  (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58
[ 13.173009]
[ 13.173009] which lock already depends on the new lock.
[ 13.173009]
[ 13.173009]
[ 13.173009] the existing dependency chain (in reverse order) is:
[ 13.173009]
[ 13.173009] -> #2 (rcu_node_level_0){..-...}:
[ 13.173009]        [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[ 13.173009]        [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[ 13.173009]        [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[ 13.173009]        [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[ 13.173009]        [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45
[ 13.173009]        [<ffffffff81090794>] rcu_read_unlock_special+0x8c/0x1d5
[ 13.173009]        [<ffffffff8109092c>] __rcu_read_unlock+0x4f/0xd7
[ 13.173009]        [<ffffffff81027bd3>] rcu_read_unlock+0x21/0x23
[ 13.173009]        [<ffffffff8102cc34>] cpuacct_charge+0x6c/0x75
[ 13.173009]        [<ffffffff81030cc6>] update_curr+0x101/0x12e
[ 13.173009]        [<ffffffff810311d0>] check_preempt_wakeup+0xf7/0x23b
[ 13.173009]        [<ffffffff8102acb3>] check_preempt_curr+0x2b/0x68
[ 13.173009]        [<ffffffff81031d40>] ttwu_do_wakeup+0x76/0x128
[ 13.173009]        [<ffffffff81031e49>] ttwu_do_activate.constprop.63+0x57/0x5c
[ 13.173009]        [<ffffffff81031e96>] scheduler_ipi+0x48/0x5d
[ 13.173009]        [<ffffffff810177d5>] smp_reschedule_interrupt+0x16/0x18
[ 13.173009]        [<ffffffff815710f3>] reschedule_interrupt+0x13/0x20
[ 13.173009]        [<ffffffff810b66d1>] rcu_read_unlock+0x21/0x23
[ 13.173009]        [<ffffffff810b739c>] find_get_page+0xa9/0xb9
[ 13.173009]        [<ffffffff810b8b48>] filemap_fault+0x6a/0x34d
[ 13.173009]        [<ffffffff810d1a25>] __do_fault+0x54/0x3e6
[ 13.173009]        [<ffffffff810d447a>] handle_pte_fault+0x12c/0x1ed
[ 13.173009]        [<ffffffff810d48f7>] handle_mm_fault+0x1cd/0x1e0
[ 13.173009]        [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de
[ 13.173009]        [<ffffffff8156a75f>] page_fault+0x1f/0x30
[ 13.173009]
[ 13.173009] -> #1 (&rq->lock){-.-.-.}:
[ 13.173009]        [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[ 13.173009]        [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[ 13.173009]        [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[ 13.173009]        [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[ 13.173009]        [<ffffffff815697f1>] _raw_spin_lock+0x36/0x45
[ 13.173009]        [<ffffffff81027e19>] __task_rq_lock+0x8b/0xd3
[ 13.173009]        [<ffffffff81032f7f>] wake_up_new_task+0x41/0x108
[ 13.173009]        [<ffffffff810376c3>] do_fork+0x265/0x33f
[ 13.173009]        [<ffffffff81007d02>] kernel_thread+0x6b/0x6d
[ 13.173009]        [<ffffffff8153a9dd>] rest_init+0x21/0xd2
[ 13.173009]        [<ffffffff81b1db4f>] start_kernel+0x3bb/0x3c6
[ 13.173009]        [<ffffffff81b1d29f>] x86_64_start_reservations+0xaf/0xb3
[ 13.173009]        [<ffffffff81b1d393>] x86_64_start_kernel+0xf0/0xf7
[ 13.173009]
[ 13.173009] -> #0 (&p->pi_lock){-.-.-.}:
[ 13.173009]        [<ffffffff81067788>] check_prev_add+0x68/0x20e
[ 13.173009]        [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[ 13.173009]        [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[ 13.173009]        [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[ 13.173009]        [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[ 13.173009]        [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57
[ 13.173009]        [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa
[ 13.173009]        [<ffffffff81032f3c>] wake_up_process+0x10/0x12
[ 13.173009]        [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58
[ 13.173009]        [<ffffffff81045286>] call_timer_fn+0xac/0x1e9
[ 13.173009]        [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2
[ 13.173009]        [<ffffffff8103e487>] __do_softirq+0x109/0x26a
[ 13.173009]        [<ffffffff8157144c>] call_softirq+0x1c/0x30
[ 13.173009]        [<ffffffff81003207>] do_softirq+0x44/0xf1
[ 13.173009]        [<ffffffff8103e8b9>] irq_exit+0x58/0xc8
[ 13.173009]        [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87
[ 13.173009]        [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20
[ 13.173009]        [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310
[ 13.173009]        [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243
[ 13.173009]        [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a
[ 13.173009]        [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b
[ 13.173009]        [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0
[ 13.173009]        [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de
[ 13.173009]        [<ffffffff8156a75f>] page_fault+0x1f/0x30
[ 13.173009]
[ 13.173009] other info that might help us debug this:
[ 13.173009]
[ 13.173009] Chain exists of:
[ 13.173009]   &p->pi_lock --> &rq->lock --> rcu_node_level_0
[ 13.173009]
[ 13.173009]  Possible unsafe locking scenario:
[ 13.173009]
[ 13.173009]        CPU0                    CPU1
[ 13.173009]        ----                    ----
[ 13.173009]   lock(rcu_node_level_0);
[ 13.173009]                                lock(&rq->lock);
[ 13.173009]                                lock(rcu_node_level_0);
[ 13.173009]   lock(&p->pi_lock);
[ 13.173009]
[ 13.173009]  *** DEADLOCK ***
[ 13.173009]
[ 13.173009] 3 locks held by blkid/267:
[ 13.173009]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8156cdb4>] do_page_fault+0x1f3/0x5de
[ 13.173009]  #1:  (&yield_timer){+.-...}, at: [<ffffffff810451da>] call_timer_fn+0x0/0x1e9
[ 13.173009]  #2:  (rcu_node_level_0){..-...}, at: [<ffffffff810901cc>] rcu_cpu_kthread_timer+0x27/0x58
[ 13.173009]
[ 13.173009] stack backtrace:
[ 13.173009] Pid: 267, comm: blkid Not tainted 2.6.39-rc6-mmotm0506 #1
[ 13.173009] Call Trace:
[ 13.173009]  <IRQ>  [<ffffffff8154a529>] print_circular_bug+0xc8/0xd9
[ 13.173009]  [<ffffffff81067788>] check_prev_add+0x68/0x20e
[ 13.173009]  [<ffffffff8100c861>] ? save_stack_trace+0x28/0x46
[ 13.173009]  [<ffffffff810679b9>] check_prevs_add+0x8b/0x104
[ 13.173009]  [<ffffffff81067da1>] validate_chain+0x36f/0x3ab
[ 13.173009]  [<ffffffff8106846b>] __lock_acquire+0x369/0x3e2
[ 13.173009]  [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa
[ 13.173009]  [<ffffffff81068a0f>] lock_acquire+0xfc/0x14c
[ 13.173009]  [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa
[ 13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[ 13.173009]  [<ffffffff815698ea>] _raw_spin_lock_irqsave+0x44/0x57
[ 13.173009]  [<ffffffff81032d8f>] ? try_to_wake_up+0x29/0x1aa
[ 13.173009]  [<ffffffff81032d8f>] try_to_wake_up+0x29/0x1aa
[ 13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[ 13.173009]  [<ffffffff81032f3c>] wake_up_process+0x10/0x12
[ 13.173009]  [<ffffffff810901e9>] rcu_cpu_kthread_timer+0x44/0x58
[ 13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[ 13.173009]  [<ffffffff81045286>] call_timer_fn+0xac/0x1e9
[ 13.173009]  [<ffffffff810451da>] ? del_timer+0x75/0x75
[ 13.173009]  [<ffffffff810901a5>] ? rcu_check_quiescent_state+0x82/0x82
[ 13.173009]  [<ffffffff8104556d>] run_timer_softirq+0x1aa/0x1f2
[ 13.173009]  [<ffffffff8103e487>] __do_softirq+0x109/0x26a
[ 13.173009]  [<ffffffff8106365f>] ? tick_dev_program_event+0x37/0xf6
[ 13.173009]  [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f
[ 13.173009]  [<ffffffff8157144c>] call_softirq+0x1c/0x30
[ 13.173009]  [<ffffffff81003207>] do_softirq+0x44/0xf1
[ 13.173009]  [<ffffffff8103e8b9>] irq_exit+0x58/0xc8
[ 13.173009]  [<ffffffff81017f5a>] smp_apic_timer_interrupt+0x79/0x87
[ 13.173009]  [<ffffffff81570fd3>] apic_timer_interrupt+0x13/0x20
[ 13.173009]  <EOI>  [<ffffffff810bd384>] ? get_page_from_freelist+0x114/0x310
[ 13.173009]  [<ffffffff810bd51a>] ? get_page_from_freelist+0x2aa/0x310
[ 13.173009]  [<ffffffff812220e7>] ? clear_page_c+0x7/0x10
[ 13.173009]  [<ffffffff810bd1ef>] ? prep_new_page+0x14c/0x1cd
[ 13.173009]  [<ffffffff810bd51a>] get_page_from_freelist+0x2aa/0x310
[ 13.173009]  [<ffffffff810bdf03>] __alloc_pages_nodemask+0x178/0x243
[ 13.173009]  [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99
[ 13.173009]  [<ffffffff8101fe2f>] pte_alloc_one+0x1e/0x3a
[ 13.173009]  [<ffffffff810d46b9>] ? __pmd_alloc+0x87/0x99
[ 13.173009]  [<ffffffff810d27fe>] __pte_alloc+0x22/0x14b
[ 13.173009]  [<ffffffff810d48a8>] handle_mm_fault+0x17e/0x1e0
[ 13.173009]  [<ffffffff8156cfee>] do_page_fault+0x42d/0x5de
[ 13.173009]  [<ffffffff810d915f>] ? sys_brk+0x32/0x10c
[ 13.173009]  [<ffffffff810a0e4a>] ? time_hardirqs_off+0x1b/0x2f
[ 13.173009]  [<ffffffff81065c4f>] ? trace_hardirqs_off_caller+0x3f/0x9c
[ 13.173009]  [<ffffffff812235dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 13.173009]  [<ffffffff8156a75f>] page_fault+0x1f/0x30
[ 14.010075] usb 5-1: new full speed USB device number 2 using uhci_hcd

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-10  1:04 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault Valdis.Kletnieks
@ 2011-05-10  8:20 ` Paul E. McKenney
  2011-05-10  8:57   ` Ingo Molnar
  2011-05-11 23:11   ` Valdis.Kletnieks
  0 siblings, 2 replies; 10+ messages in thread

From: Paul E. McKenney @ 2011-05-10  8:20 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

On Mon, May 09, 2011 at 09:04:36PM -0400, Valdis.Kletnieks@vt.edu wrote:
> Seen at boot earlier today.  Interrupt, page fault, RCU, scheduler, and MM code
> in the tracebacks.  Yee-hah.

Hmmm...  Here is an untested patch that breaks this particular deadlock
cycle.  It feels like there might be more where this one came from,
though in theory, if you call rcu_read_unlock() with the runqueue locks
held, it should be impossible for priority deboosting to be invoked.
Unless someone calls rcu_read_lock() with preemption enabled, then
acquires a runqueue lock, and then calls rcu_read_unlock().  I am hoping
that no one does this.

							Thanx, Paul

------------------------------------------------------------------------

rcu: Avoid acquiring rcu_node locks in timer functions

This commit switches manipulations of the rcu_node structure's ->wakemask
field to atomic operations, which allows rcu_cpu_kthread_timer() to avoid
acquiring the rcu_node lock.
This should avoid the lockdep splat quoted in the report above.

Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 5616b17..20c22c5 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1525,13 +1525,15 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
  */
 static void rcu_cpu_kthread_timer(unsigned long arg)
 {
-	unsigned long flags;
+	unsigned long old;
+	unsigned long new;
 	struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, arg);
 	struct rcu_node *rnp = rdp->mynode;
 
-	raw_spin_lock_irqsave(&rnp->lock, flags);
-	rnp->wakemask |= rdp->grpmask;
-	raw_spin_unlock_irqrestore(&rnp->lock, flags);
+	do {
+		old = rnp->wakemask;
+		new = old | rdp->grpmask;
+	} while (cmpxchg(&rnp->wakemask, old, new) != old);
 	invoke_rcu_node_kthread(rnp);
 }
 
@@ -1682,8 +1684,7 @@ static int rcu_node_kthread(void *arg)
 		wait_event_interruptible(rnp->node_wq, rnp->wakemask != 0);
 		rnp->node_kthread_status = RCU_KTHREAD_RUNNING;
 		raw_spin_lock_irqsave(&rnp->lock, flags);
-		mask = rnp->wakemask;
-		rnp->wakemask = 0;
+		mask = xchg(&rnp->wakemask, 0);
 		rcu_initiate_boost(rnp, flags);  /* releases rnp->lock. */
 		for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++, mask >>= 1) {
 			if ((mask & 0x1) == 0)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM, see:
http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-10  8:20 ` Paul E. McKenney
@ 2011-05-10  8:57   ` Ingo Molnar
  2011-05-10 16:21     ` Paul E. McKenney
  2011-05-11 23:11   ` Valdis.Kletnieks
  1 sibling, 1 reply; 10+ messages in thread

From: Ingo Molnar @ 2011-05-10  8:57 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Valdis.Kletnieks, Andrew Morton, Peter Zijlstra, linux-kernel, linux-mm

* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> -	raw_spin_lock_irqsave(&rnp->lock, flags);
> -	rnp->wakemask |= rdp->grpmask;
> -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	do {
> +		old = rnp->wakemask;
> +		new = old | rdp->grpmask;
> +	} while (cmpxchg(&rnp->wakemask, old, new) != old);

Hm, isn't this an inferior version of atomic_or_long() in essence?

Note that atomic_or_long() is x86-only, so a generic one would have to be
offered too, I suspect, atomic_cmpxchg()-driven or so - which would look
like the above loop.

Most architectures could offer atomic_or_long(), I suspect.

	Thanks,

		Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-10  8:57 ` Ingo Molnar
@ 2011-05-10 16:21   ` Paul E. McKenney
  2011-05-10 20:44     ` Ingo Molnar
  0 siblings, 1 reply; 10+ messages in thread

From: Paul E. McKenney @ 2011-05-10 16:21 UTC (permalink / raw)
To: Ingo Molnar
Cc: Valdis.Kletnieks, Andrew Morton, Peter Zijlstra, linux-kernel, linux-mm

On Tue, May 10, 2011 at 10:57:46AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > -	raw_spin_lock_irqsave(&rnp->lock, flags);
> > -	rnp->wakemask |= rdp->grpmask;
> > -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> > +	do {
> > +		old = rnp->wakemask;
> > +		new = old | rdp->grpmask;
> > +	} while (cmpxchg(&rnp->wakemask, old, new) != old);
> 
> Hm, isn't this an inferior version of atomic_or_long() in essence?
> 
> Note that atomic_or_long() is x86-only, so a generic one would have to be
> offered too, I suspect, atomic_cmpxchg()-driven or so - which would look
> like the above loop.
> 
> Most architectures could offer atomic_or_long(), I suspect.

Is the following what you had in mind?  This (untested) patch provides
only the generic function; if it matches what you had in mind, I can put
together optimized versions for a couple of the architectures.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc6c53a..e7c2e69 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -171,6 +171,9 @@ config ARCH_HAS_DEFAULT_IDLE
 config ARCH_HAS_CACHE_LINE_SIZE
 	def_bool y
 
+config ARCH_HAS_ATOMIC_OR_LONG
+	def_bool X86_64
+
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool y

diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 96c038e..2fc3222 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -34,4 +34,17 @@ static inline int atomic_inc_not_zero_hint(atomic_t *v, int hint)
 }
 #endif
 
+#ifndef CONFIG_ARCH_HAS_ATOMIC_OR_LONG
+static inline void atomic_or_long(unsigned long *v1, unsigned long v2)
+{
+	unsigned long old;
+	unsigned long new;
+
+	do {
+		old = ACCESS_ONCE(*v1);
+		new = old | v2;
+	} while (cmpxchg(v1, old, new) != old);
+}
+#endif /* #ifndef CONFIG_ARCH_HAS_ATOMIC_OR_LONG */
+
 #endif /* _LINUX_ATOMIC_H */

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 20c22c5..86f44a3 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -36,7 +36,7 @@
 #include <linux/interrupt.h>
 #include <linux/sched.h>
 #include <linux/nmi.h>
-#include <asm/atomic.h>
+#include <linux/atomic.h>
 #include <linux/bitops.h>
 #include <linux/module.h>
 #include <linux/completion.h>
@@ -1525,15 +1525,10 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
  */
 static void rcu_cpu_kthread_timer(unsigned long arg)
 {
-	unsigned long old;
-	unsigned long new;
 	struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, arg);
 	struct rcu_node *rnp = rdp->mynode;
 
-	do {
-		old = rnp->wakemask;
-		new = old | rdp->grpmask;
-	} while (cmpxchg(&rnp->wakemask, old, new) != old);
+	atomic_or_long(&rnp->wakemask, rdp->grpmask);
 	invoke_rcu_node_kthread(rnp);
 }

^ permalink raw reply related	[flat|nested] 10+ messages in thread
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-10 16:21 ` Paul E. McKenney
@ 2011-05-10 20:44   ` Ingo Molnar
  2011-05-11  7:44     ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread

From: Ingo Molnar @ 2011-05-10 20:44 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Valdis.Kletnieks, Andrew Morton, Peter Zijlstra, linux-kernel, linux-mm

* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, May 10, 2011 at 10:57:46AM +0200, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > -	raw_spin_lock_irqsave(&rnp->lock, flags);
> > > -	rnp->wakemask |= rdp->grpmask;
> > > -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> > > +	do {
> > > +		old = rnp->wakemask;
> > > +		new = old | rdp->grpmask;
> > > +	} while (cmpxchg(&rnp->wakemask, old, new) != old);
> > 
> > Hm, isn't this an inferior version of atomic_or_long() in essence?
> > 
> > Note that atomic_or_long() is x86-only, so a generic one would have to be
> > offered too, I suspect, atomic_cmpxchg()-driven or so - which would look
> > like the above loop.
> > 
> > Most architectures could offer atomic_or_long(), I suspect.
> 
> Is the following what you had in mind?  This (untested) patch provides
> only the generic function; if it matches what you had in mind, I can put
> together optimized versions for a couple of the architectures.

Yeah, something like this, except:

> +#ifndef CONFIG_ARCH_HAS_ATOMIC_OR_LONG
> +static inline void atomic_or_long(unsigned long *v1, unsigned long v2)
> +{
> +	unsigned long old;
> +	unsigned long new;
> +
> +	do {
> +		old = ACCESS_ONCE(*v1);
> +		new = old | v2;
> +	} while (cmpxchg(v1, old, new) != old);
> +}
> +#endif /* #ifndef CONFIG_ARCH_HAS_ATOMIC_OR_LONG */

Shouldn't that method work on atomic_t (or atomic64_t)?

	Thanks,

		Ingo

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-10 20:44 ` Ingo Molnar
@ 2011-05-11  7:44   ` Paul E. McKenney
  0 siblings, 0 replies; 10+ messages in thread

From: Paul E. McKenney @ 2011-05-11  7:44 UTC (permalink / raw)
To: Ingo Molnar
Cc: Valdis.Kletnieks, Andrew Morton, Peter Zijlstra, linux-kernel, linux-mm

On Tue, May 10, 2011 at 10:44:43PM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, May 10, 2011 at 10:57:46AM +0200, Ingo Molnar wrote:
> > > 
> > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > -	raw_spin_lock_irqsave(&rnp->lock, flags);
> > > > -	rnp->wakemask |= rdp->grpmask;
> > > > -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> > > > +	do {
> > > > +		old = rnp->wakemask;
> > > > +		new = old | rdp->grpmask;
> > > > +	} while (cmpxchg(&rnp->wakemask, old, new) != old);
> > > 
> > > Hm, isn't this an inferior version of atomic_or_long() in essence?
> > > 
> > > Note that atomic_or_long() is x86-only, so a generic one would have to be
> > > offered too, I suspect, atomic_cmpxchg()-driven or so - which would look
> > > like the above loop.
> > > 
> > > Most architectures could offer atomic_or_long(), I suspect.
> > 
> > Is the following what you had in mind?  This (untested) patch provides
> > only the generic function; if it matches what you had in mind, I can put
> > together optimized versions for a couple of the architectures.
> 
> Yeah, something like this, except:
> 
> > +#ifndef CONFIG_ARCH_HAS_ATOMIC_OR_LONG
> > +static inline void atomic_or_long(unsigned long *v1, unsigned long v2)
> > +{
> > +	unsigned long old;
> > +	unsigned long new;
> > +
> > +	do {
> > +		old = ACCESS_ONCE(*v1);
> > +		new = old | v2;
> > +	} while (cmpxchg(v1, old, new) != old);
> > +}
> > +#endif /* #ifndef CONFIG_ARCH_HAS_ATOMIC_OR_LONG */
> 
> Shouldn't that method work on atomic_t (or atomic64_t)?

Works for me -- and in this case it is quite easy to change the
existing uses.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-10  8:20 ` Paul E. McKenney
  2011-05-10  8:57   ` Ingo Molnar
@ 2011-05-11 23:11   ` Valdis.Kletnieks
  2011-05-12  9:47     ` Paul E. McKenney
  1 sibling, 1 reply; 10+ messages in thread

From: Valdis.Kletnieks @ 2011-05-11 23:11 UTC (permalink / raw)
To: paulmck
Cc: Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1324 bytes --]

On Tue, 10 May 2011 01:20:29 PDT, "Paul E. McKenney" said:

Would test, but it doesn't apply cleanly to my -mmotm0506 tree:

> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 5616b17..20c22c5 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1525,13 +1525,15 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
>   */
>  static void rcu_cpu_kthread_timer(unsigned long arg)
>  {
> -	unsigned long flags;
> +	unsigned long old;
> +	unsigned long new;
>  	struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, arg);
>  	struct rcu_node *rnp = rdp->mynode;
> 
> -	raw_spin_lock_irqsave(&rnp->lock, flags);
> -	rnp->wakemask |= rdp->grpmask;
> -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	do {
> +		old = rnp->wakemask;
> +		new = old | rdp->grpmask;
> +	} while (cmpxchg(&rnp->wakemask, old, new) != old);
>  	invoke_rcu_node_kthread(rnp);
>  }

My source has this:

	raw_spin_lock_irqsave(&rnp->lock, flags);
	rnp->wakemask |= rdp->grpmask;
	invoke_rcu_node_kthread(rnp);
	raw_spin_unlock_irqrestore(&rnp->lock, flags);

with the last two lines swapped from what you diffed against.  I can
easily work around that, except that it's unclear what the implications
are of invoke_rcu_node_kthread() moving outside of the IRQ save/restore
pair (or whether its being inside is the actual root cause)...

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-11 23:11 ` Valdis.Kletnieks
@ 2011-05-12  9:47 ` Paul E. McKenney
  2011-05-12 16:05   ` Valdis.Kletnieks
  0 siblings, 1 reply; 10+ messages in thread
From: Paul E. McKenney @ 2011-05-12 9:47 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

On Wed, May 11, 2011 at 07:11:34PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 10 May 2011 01:20:29 PDT, "Paul E. McKenney" said:
>
> Would test, but it doesn't apply cleanly to my -mmotm0506 tree:
>
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 5616b17..20c22c5 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -1525,13 +1525,15 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
> >   */
> >  static void rcu_cpu_kthread_timer(unsigned long arg)
> >  {
> > -	unsigned long flags;
> > +	unsigned long old;
> > +	unsigned long new;
> >  	struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, arg);
> >  	struct rcu_node *rnp = rdp->mynode;
> >
> > -	raw_spin_lock_irqsave(&rnp->lock, flags);
> > -	rnp->wakemask |= rdp->grpmask;
> > -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> > +	do {
> > +		old = rnp->wakemask;
> > +		new = old | rdp->grpmask;
> > +	} while (cmpxchg(&rnp->wakemask, old, new) != old);
> >  	invoke_rcu_node_kthread(rnp);
> >  }
>
> My source has this:
>
>         raw_spin_lock_irqsave(&rnp->lock, flags);
>         rnp->wakemask |= rdp->grpmask;
>         invoke_rcu_node_kthread(rnp);
>         raw_spin_unlock_irqrestore(&rnp->lock, flags);
>
> the last 2 lines swapped from what you diffed against.  I can easily work
> around that, except it's unclear what the implications of the invoke_rcu
> moving outside of the irq save/restore pair (or if it being inside is the
> actual root cause)...

Odd...

This looks to me like a recent -next -- I do not believe that straight
mmotm has rcu_cpu_kthread_timer() in it.  The patch should apply to the
last few days' -next kernels.
							Thanx, Paul
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-12  9:47 ` Paul E. McKenney
@ 2011-05-12 16:05 ` Valdis.Kletnieks
  2011-05-13  7:18   ` Paul E. McKenney
  0 siblings, 1 reply; 10+ messages in thread
From: Valdis.Kletnieks @ 2011-05-12 16:05 UTC (permalink / raw)
To: paulmck; +Cc: Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1680 bytes --]

On Thu, 12 May 2011 02:47:05 PDT, "Paul E. McKenney" said:
> On Wed, May 11, 2011 at 07:11:34PM -0400, Valdis.Kletnieks@vt.edu wrote:
> > My source has this:
> >
> >         raw_spin_lock_irqsave(&rnp->lock, flags);
> >         rnp->wakemask |= rdp->grpmask;
> >         invoke_rcu_node_kthread(rnp);
> >         raw_spin_unlock_irqrestore(&rnp->lock, flags);
> >
> > the last 2 lines swapped from what you diffed against.  I can easily work
> > around that, except it's unclear what the implications of the invoke_rcu
> > moving outside of the irq save/restore pair (or if it being inside is the
> > actual root cause)...
>
> Odd...
>
> This looks to me like a recent -next -- I do not believe that straight
> mmotm has rcu_cpu_kthread_timer() in it.  The patch should apply to the
> last few days' -next kernels.

Ah.  Found it.  Your tree and current linux-next include this commit:

commit 1217ed1ba5c67393293dfb0f03c353b118dadeb4
tree   a765356c8418e134de85fd05d9fe6eda41de859c
parent 29ce831000081dd757d3116bf774aafffc4b6b20

    rcu: permit rcu_read_unlock() to be called while holding runqueue locks

which includes this chunk:

@@ -1546,8 +1531,8 @@ static void rcu_cpu_kthread_timer(unsigned long arg)

 	raw_spin_lock_irqsave(&rnp->lock, flags);
 	rnp->wakemask |= rdp->grpmask;
-	invoke_rcu_node_kthread(rnp);
 	raw_spin_unlock_irqrestore(&rnp->lock, flags);
+	invoke_rcu_node_kthread(rnp);
 }

but that was committed 4 days ago, and Andrew pulled linux-next for the
-mmotm 6 days ago, so it's not in there.  The *rest* of your recent
commits appear to be in there though.
So that explains the patch failure to apply.

[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]
* Re: 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault
  2011-05-12 16:05 ` Valdis.Kletnieks
@ 2011-05-13  7:18 ` Paul E. McKenney
  0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2011-05-13 7:18 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-kernel, linux-mm

On Thu, May 12, 2011 at 12:05:24PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Thu, 12 May 2011 02:47:05 PDT, "Paul E. McKenney" said:
> > On Wed, May 11, 2011 at 07:11:34PM -0400, Valdis.Kletnieks@vt.edu wrote:
> > > My source has this:
> > >
> > >         raw_spin_lock_irqsave(&rnp->lock, flags);
> > >         rnp->wakemask |= rdp->grpmask;
> > >         invoke_rcu_node_kthread(rnp);
> > >         raw_spin_unlock_irqrestore(&rnp->lock, flags);
> > >
> > > the last 2 lines swapped from what you diffed against.  I can easily work
> > > around that, except it's unclear what the implications of the invoke_rcu
> > > moving outside of the irq save/restore pair (or if it being inside is the
> > > actual root cause)...
> >
> > Odd...
> >
> > This looks to me like a recent -next -- I do not believe that straight
> > mmotm has rcu_cpu_kthread_timer() in it.  The patch should apply to the
> > last few days' -next kernels.
>
> Ah.  Found it.  Your tree and current linux-next include this commit:
>
> commit 1217ed1ba5c67393293dfb0f03c353b118dadeb4
> tree   a765356c8418e134de85fd05d9fe6eda41de859c
> parent 29ce831000081dd757d3116bf774aafffc4b6b20
>
>     rcu: permit rcu_read_unlock() to be called while holding runqueue locks
>
> which includes this chunk:
>
> @@ -1546,8 +1531,8 @@ static void rcu_cpu_kthread_timer(unsigned long arg)
>
>  	raw_spin_lock_irqsave(&rnp->lock, flags);
>  	rnp->wakemask |= rdp->grpmask;
> -	invoke_rcu_node_kthread(rnp);
>  	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	invoke_rcu_node_kthread(rnp);
>  }
>
> but that was committed 4 days ago, and Andrew pulled linux-next for the
> -mmotm 6 days ago, so it's not in there.
> The *rest* of your recent commits appear to be in there though.
>
> So that explains the patch failure to apply.

Whew!!!  ;-)

							Thanx, Paul
end of thread, other threads:[~2011-05-13  7:18 UTC | newest]

Thread overview: 10+ messages:
2011-05-10  1:04 2.6.39-rc6-mmotm0506 - lockdep splat in RCU code on page fault Valdis.Kletnieks
2011-05-10  8:20 ` Paul E. McKenney
2011-05-10  8:57 ` Ingo Molnar
2011-05-10 16:21 ` Paul E. McKenney
2011-05-10 20:44 ` Ingo Molnar
2011-05-11  7:44 ` Paul E. McKenney
2011-05-11 23:11 ` Valdis.Kletnieks
2011-05-12  9:47 ` Paul E. McKenney
2011-05-12 16:05 ` Valdis.Kletnieks
2011-05-13  7:18 ` Paul E. McKenney