All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/apic: Fix circular locking dependency between console and hrtimer locks
@ 2020-04-07 17:09 Leon Romanovsky
  2020-04-14  5:48 ` Leon Romanovsky
  0 siblings, 1 reply; 14+ messages in thread
From: Leon Romanovsky @ 2020-04-07 17:09 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Leon Romanovsky, H. Peter Anvin, x86, Suresh Siddha, linux-kernel

From: Leon Romanovsky <leonro@mellanox.com>

clockevents_switch_state() calls printk() from under hrtimer_bases.lock.
That causes lock inversion on scheduler locks because printk() can call
into the scheduler. Lockdep puts it as:

 [  728.464312] ====================================================================================================
 [  735.312580] TSC deadline timer enabled
 [  735.324143]
 [  735.324146] ======================================================
 [  735.324148] WARNING: possible circular locking dependency detected
 [  735.324150] 5.6.0-for-upstream-dbg-2020-04-03_10-44-43-70 #1 Not tainted
 [  735.324152] ------------------------------------------------------
 [  735.324154] swapper/3/0 is trying to acquire lock:
 [  735.324155] ffffffff8442c858 ((console_sem).lock){-...}-{2:2}, at: down_trylock+0x13/0x70
 [  735.324162]
 [  735.324164] but task is already holding lock:
 [  735.324165] ffff88842dfb9958 (hrtimer_bases.lock){-.-.}-{2:2}, at: lock_hrtimer_base+0x71/0x120
 [  735.324171]
 [  735.324173] which lock already depends on the new lock.
 [  735.324174]
 [  735.324175]
 [  735.324177] the existing dependency chain (in reverse order) is:
 [  735.324179]
 [  735.324180] -> #4 (hrtimer_bases.lock){-.-.}-{2:2}:
 [  735.324186]        _raw_spin_lock_irqsave+0x3c/0x4b
 [  735.324187]        lock_hrtimer_base+0x71/0x120
 [  735.324189]        hrtimer_start_range_ns+0xc6/0xaa0
 [  735.324191]        __enqueue_rt_entity+0xc44/0xf50
 [  735.324192]        enqueue_rt_entity+0x79/0xc0
 [  735.324194]        enqueue_task_rt+0x5c/0x2e0
 [  735.324195]        activate_task+0x15a/0x2c0
 [  735.324197]        ttwu_do_activate+0xcf/0x120
 [  735.324199]        sched_ttwu_pending+0x160/0x230
 [  735.324200]        scheduler_ipi+0x1c0/0x530
 [  735.324202]        reschedule_interrupt+0xf/0x20
 [  735.324204]        tick_nohz_idle_enter+0x16c/0x250
 [  735.324205]        do_idle+0x90/0x530
 [  735.324207]        cpu_startup_entry+0x19/0x20
 [  735.324208]        start_secondary+0x2ee/0x3e0
 [  735.324210]        secondary_startup_64+0xa4/0xb0
 [  735.324211]
 [  735.324212] -> #3 (&rt_b->rt_runtime_lock){-...}-{2:2}:
 [  735.324218]        _raw_spin_lock+0x25/0x30
 [  735.324219]        rq_online_rt+0x288/0x550
 [  735.324221]        set_rq_online+0x11f/0x190
 [  735.324223]        sched_cpu_activate+0x1d4/0x390
 [  735.324225]        cpuhp_invoke_callback+0x1c5/0x1560
 [  735.324226]        cpuhp_thread_fun+0x3f8/0x6f0
 [  735.324228]        smpboot_thread_fn+0x305/0x5f0
 [  735.324229]        kthread+0x2f8/0x3b0
 [  735.324231]        ret_from_fork+0x24/0x30
 [  735.324232]
 [  735.324233] -> #2 (&rq->lock){-.-.}-{2:2}:
 [  735.324238]        _raw_spin_lock+0x25/0x30
 [  735.324240]        task_fork_fair+0x34/0x430
 [  735.324241]        sched_fork+0x48a/0xa60
 [  735.324243]        copy_process+0x15df/0x5970
 [  735.324244]        _do_fork+0x106/0xcd0
 [  735.324246]        kernel_thread+0x9e/0xe0
 [  735.324247]        rest_init+0x28/0x330
 [  735.324249]        start_kernel+0x6ac/0x6ed
 [  735.324251]        secondary_startup_64+0xa4/0xb0
 [  735.324252]
 [  735.324253] -> #1 (&p->pi_lock){-.-.}-{2:2}:
 [  735.324258]        _raw_spin_lock_irqsave+0x3c/0x4b
 [  735.324260]        try_to_wake_up+0x9a/0x1700
 [  735.324261]        up+0x7a/0xb0
 [  735.324263]        __up_console_sem+0x3c/0x70
 [  735.324264]        console_unlock+0x4f4/0xab0
 [  735.324266]        con_font_op+0x907/0x1010
 [  735.324267]        vt_ioctl+0x10a6/0x2890
 [  735.324269]        tty_ioctl+0x257/0x1240
 [  735.324270]        ksys_ioctl+0x3e9/0x1190
 [  735.324272]        __x64_sys_ioctl+0x6f/0xb0
 [  735.324273]        do_syscall_64+0xe7/0x12c0
 [  735.324275]        entry_SYSCALL_64_after_hwframe+0x49/0xb3
 [  735.324276]
 [  735.324277] -> #0 ((console_sem).lock){-...}-{2:2}:
 [  735.324283]        __lock_acquire+0x374a/0x5210
 [  735.324284]        lock_acquire+0x1b9/0x920
 [  735.324286]        _raw_spin_lock_irqsave+0x3c/0x4b
 [  735.324288]        down_trylock+0x13/0x70
 [  735.324289]        __down_trylock_console_sem+0x33/0xa0
 [  735.324291]        console_trylock+0x13/0x60
 [  735.324292]        vprintk_emit+0xec/0x370
 [  735.324294]        printk+0x9c/0xc3
 [  735.324296]        lapic_timer_set_oneshot+0x4e/0x60
 [  735.324297]        clockevents_switch_state+0x1e1/0x360
 [  735.324299]        tick_program_event+0xae/0xc0
 [  735.324301]        hrtimer_start_range_ns+0x4b6/0xaa0
 [  735.324302]        tick_nohz_idle_stop_tick+0x67c/0xa90
 [  735.324304]        do_idle+0x326/0x530
 [  735.324305]        cpu_startup_entry+0x19/0x20
 [  735.324307]        start_secondary+0x2ee/0x3e0
 [  735.324309]        secondary_startup_64+0xa4/0xb0
 [  735.324310]
 [  735.324311] other info that might help us debug this:
 [  735.324312]
 [  735.324314] Chain exists of:
 [  735.324315]   (console_sem).lock --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
 [  735.324322]
 [  735.324324]  Possible unsafe locking scenario:
 [  735.324325]
 [  735.324327]        CPU0                    CPU1
 [  735.324328]        ----                    ----
 [  735.324329]   lock(hrtimer_bases.lock);
 [  735.324333]                                lock(&rt_b->rt_runtime_lock);
 [  735.324337]                                lock(hrtimer_bases.lock);
 [  735.324341]   lock((console_sem).lock);
 [  735.324344]
 [  735.324345]  *** DEADLOCK ***
 [  735.324346]
 [  735.324348] 1 lock held by swapper/3/0:
 [  735.324349]  #0: ffff88842dfb9958 (hrtimer_bases.lock){-.-.}-{2:2}, at: lock_hrtimer_base+0x71/0x120
 [  735.324356]
 [  735.324357] stack backtrace:
 [  735.324360] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.0-for-upstream-dbg-2020-04-03_10-44-43-70 #1
 [  735.324363] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 [  735.324364] Call Trace:
 [  735.324366]  dump_stack+0xb7/0x10b
 [  735.324367]  check_noncircular+0x37f/0x460
 [  735.324369]  ? arch_stack_walk+0x7c/0xd0
 [  735.324370]  ? print_circular_bug+0x4e0/0x4e0
 [  735.324372]  ? mark_lock+0x1a4/0xb60
 [  735.324373]  ? __lock_acquire+0x374a/0x5210
 [  735.324375]  __lock_acquire+0x374a/0x5210
 [  735.324376]  ? register_lock_class+0x17e0/0x17e0
 [  735.324378]  ? register_lock_class+0x17e0/0x17e0
 [  735.324380]  lock_acquire+0x1b9/0x920
 [  735.324381]  ? down_trylock+0x13/0x70
 [  735.324383]  ? check_flags.part.29+0x450/0x450
 [  735.324384]  ? lock_downgrade+0x760/0x760
 [  735.324386]  ? vprintk_emit+0xec/0x370
 [  735.324387]  _raw_spin_lock_irqsave+0x3c/0x4b
 [  735.324389]  ? down_trylock+0x13/0x70
 [  735.324390]  down_trylock+0x13/0x70
 [  735.324392]  __down_trylock_console_sem+0x33/0xa0
 [  735.324393]  console_trylock+0x13/0x60
 [  735.324395]  vprintk_emit+0xec/0x370
 [  735.324396]  printk+0x9c/0xc3
 [  735.324398]  ? kmsg_dump_rewind_nolock+0xd9/0xd9
 [  735.324399]  lapic_timer_set_oneshot+0x4e/0x60
 [  735.324401]  clockevents_switch_state+0x1e1/0x360
 [  735.324402]  ? enqueue_hrtimer+0x116/0x310
 [  735.324404]  tick_program_event+0xae/0xc0
 [  735.324406]  hrtimer_start_range_ns+0x4b6/0xaa0
 [  735.324407]  ? hrtimer_run_softirq+0x210/0x210
 [  735.324409]  ? rcu_read_lock_sched_held+0xab/0xe0
 [  735.324410]  ? rcu_read_lock_bh_held+0xe0/0xe0
 [  735.324412]  tick_nohz_idle_stop_tick+0x67c/0xa90
 [  735.324413]  ? tsc_verify_tsc_adjust+0x71/0x290
 [  735.324415]  do_idle+0x326/0x530
 [  735.324416]  ? arch_cpu_idle_exit+0x40/0x40
 [  735.324418]  cpu_startup_entry+0x19/0x20
 [  735.324419]  start_secondary+0x2ee/0x3e0
 [  735.324421]  ? set_cpu_sibling_map+0x2f70/0x2f70
 [  735.324423]  secondary_startup_64+0xa4/0xb0
 [  760.028504] ====================================================================================================

Fix by using deferred variant of printk which doesn't call to the scheduler.

Fixes: 279f1461432c ("x86: apic: Use tsc deadline for oneshot when available")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
It is far away from my main expertise and I'm not sure that the solution
is correct, but it definitely fixed our regression.
---
 arch/x86/kernel/apic/apic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index d254cebdd3c3..6706b2cd9aec 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -353,7 +353,7 @@ static void __setup_APIC_LVTT(unsigned int clocks, int oneshot, int irqen)
 		 */
 		asm volatile("mfence" : : : "memory");

-		printk_once(KERN_DEBUG "TSC deadline timer enabled\n");
+		printk_deferred_once(KERN_DEBUG "TSC deadline timer enabled\n");
 		return;
 	}

--
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-05-01 18:23 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-07 17:09 [PATCH] x86/apic: Fix circular locking dependency between console and hrtimer locks Leon Romanovsky
2020-04-14  5:48 ` Leon Romanovsky
2020-04-14  6:24   ` Ingo Molnar
2020-04-14  7:05     ` Leon Romanovsky
2020-04-23  7:13       ` Leon Romanovsky
2020-04-27 15:35       ` Thomas Gleixner
2020-04-27 15:49         ` Leon Romanovsky
2020-04-27 11:09     ` Thomas Gleixner
2020-04-27 11:32       ` Leon Romanovsky
2020-04-27 12:59         ` Thomas Gleixner
2020-04-27 13:41           ` Leon Romanovsky
2020-04-27 15:31             ` Thomas Gleixner
2020-04-27 15:54               ` Leon Romanovsky
2020-05-01 18:22               ` [tip: x86/urgent] x86/apic: Move TSC deadline timer debug printk tip-bot2 for Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.