From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiang Liu
Date: Tue, 01 Sep 2015 07:12:34 +0000
Subject: Possible deadlock related to CPU hotplug and kernfs
Message-Id: <55E54FE2.7030601@linux.intel.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Tejun Heo, "Rafael J. Wysocki", linux hotplug mailing, Linux Kernel Mailing List, ACPI Devel Maling List

Hi Rafael and Tejun,

When running CPU hotplug tests, lockdep triggers the warning shown
below. The two possible deadlock paths are:

1) echo x > /sys/devices/system/cpu/cpux/online
        ->kernfs_fop_write()
        ->kernfs_get_active()
1.a)    ->rwsem_acquire_read(&kn->dep_map, 0, 1, _RET_IP_);
        ->cpu_up()
1.b)    ->cpu_hotplug_begin()[lock_map_acquire(&cpu_hotplug.dep_map)]

2) hardware triggers hotplug events
        ->acpi_device_hotplug()
        ->acpi_processor_remove()
2.a)    ->cpu_hotplug_begin()[lock_map_acquire(&cpu_hotplug.dep_map)]
        ->unregister_cpu()
        ->device_del()
        ->kernfs_remove_by_name_ns()
        ->__kernfs_remove()
        ->kernfs_drain()
2.b)    ->rwsem_acquire(&kn->dep_map, 0, 0, _RET_IP_)

Path 1 takes the kernfs active reference (1.a) and then the CPU hotplug
lock (1.b), while path 2 takes the CPU hotplug lock (2.a) and then waits
for the kernfs active reference to drain (2.b). So there is a possible
deadlock scenario among 1.a, 1.b, 2.a and 2.b.

I'm not familiar with kernfs, so could you please comment on:
1) whether this is a real deadlock issue?
2) any recommended way to get it fixed?

Thanks!
Gerry

Full lockdep warnings:
[ 310.309391] [ INFO: possible circular locking dependency detected ]
[ 310.316462] 4.2.0-rc8+ #7 Not tainted
[ 310.320613] -------------------------------------------------------
[ 310.327684] kworker/u288:3/388 is trying to acquire lock:
[ 310.333780] (s_active#97){++++.+}, at: [] kernfs_remove_by_name_ns+0x49/0xb0
[ 310.343885]
[ 310.343885] but task is already holding lock:
[ 310.350466] (cpu_hotplug.lock#2){+.+.+.}, at: [] cpu_hotplug_begin+0x7b/0xc0
[ 310.360564]
[ 310.360564] which lock already depends on the new lock.
[ 310.360564]
[ 310.369766]
[ 310.369766] the existing dependency chain (in reverse order) is:
[ 310.378198]
[ 310.378198] -> #3 (cpu_hotplug.lock#2){+.+.+.}:
[ 310.383821] [] lock_acquire+0xdd/0x2a0
[ 310.390591] [] mutex_lock_nested+0x70/0x3e0
[ 310.397847] [] cpu_hotplug_begin+0x7b/0xc0
[ 310.405004] [] _cpu_up+0x31/0x140
[ 310.411285] [] cpu_up+0x7c/0xa0
[ 310.417362] [] smp_init+0x86/0x88
[ 310.423647] [] kernel_init_freeable+0x171/0x286
[ 310.431292] [] kernel_init+0xe/0xe0
[ 310.437771] [] ret_from_fork+0x3f/0x70
[ 310.444540]
[ 310.444540] -> #2 (cpu_hotplug.lock){++++++}:
[ 310.449957] [] lock_acquire+0xdd/0x2a0
[ 310.456714] [] cpu_hotplug_begin+0x6d/0xc0
[ 310.463871] [] _cpu_up+0x31/0x140
[ 310.470143] [] cpu_up+0x7c/0xa0
[ 310.476228] [] smp_init+0x86/0x88
[ 310.482509] [] kernel_init_freeable+0x171/0x286
[ 310.490153] [] kernel_init+0xe/0xe0
[ 310.496628] [] ret_from_fork+0x3f/0x70
[ 310.503393]
[ 310.503393] -> #1 (cpu_add_remove_lock){+.+.+.}:
[ 310.509099] [] lock_acquire+0xdd/0x2a0
[ 310.515866] [] __might_fault+0x84/0xb0
[ 310.522635] [] kernfs_fop_write+0x8f/0x190
[ 310.529793] [] __vfs_write+0x28/0xe0
[ 310.536368] [] vfs_write+0xac/0x1a0
[ 310.542833] [] SyS_write+0x49/0xb0
[ 310.549212] [] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 310.557149]
[ 310.557149] -> #0 (s_active#97){++++.+}:
[ 310.562135] [] __lock_acquire+0x21b9/0x21c0
[ 310.569391] [] lock_acquire+0xdd/0x2a0
[ 310.576159] [] __kernfs_remove+0x231/0x330
[ 310.583318] [] kernfs_remove_by_name_ns+0x49/0xb0
[ 310.591154] [] sysfs_remove_file_ns+0x15/0x20
[ 310.598594] [] device_remove_attrs+0x3e/0x80
[ 310.605948] [] device_del+0x138/0x270
[ 310.612617] [] device_unregister+0x22/0x70
[ 310.619767] [] unregister_cpu+0x39/0x60
[ 310.626622] [] arch_unregister_cpu+0x23/0x30
[ 310.633974] [] acpi_processor_remove+0x91/0xca
[ 310.641524] [] acpi_bus_trim+0x5a/0x8d
[ 310.648292] [] acpi_bus_trim+0x38/0x8d
[ 310.655060] [] acpi_scan_device_not_present+0x1d/0x3d
[ 310.663312] [] acpi_scan_bus_check+0x29/0xa2
[ 310.670654] [] acpi_device_hotplug+0x99/0x3fa
[ 310.678103] [] acpi_hotplug_work_fn+0x1f/0x2b
[ 310.685555] [] process_one_work+0x1f1/0x7c0
[ 310.692814] [] worker_thread+0x69/0x480
[ 310.699677] [] kthread+0x11f/0x140
[ 310.706046] [] ret_from_fork+0x3f/0x70
[ 310.712815]
[ 310.712815] other info that might help us debug this:
[ 310.712815]
[ 310.721907] Chain exists of:
[ 310.721907]   s_active#97 --> cpu_hotplug.lock --> cpu_hotplug.lock#2
[ 310.721907]
[ 310.731680] Possible unsafe locking scenario:
[ 310.731680]
[ 310.738413]        CPU0                    CPU1
[ 310.743562]        ----                    ----
[ 310.748710]   lock(cpu_hotplug.lock#2);
[ 310.753261]                                lock(cpu_hotplug.lock);
[ 310.760382]                                lock(cpu_hotplug.lock#2);
[ 310.767755]   lock(s_active#97);
[ 310.771625]
[ 310.771625] *** DEADLOCK ***
[ 310.771625]
[ 310.778382] 7 locks held by kworker/u288:3/388:
[ 310.783530] #0: ("kacpi_hotplug"){.+.+.+}, at: [] process_one_work+0x166/0x7c0
[ 310.793975] #1: ((&hpw->work)){+.+.+.}, at: [] process_one_work+0x166/0x7c0
[ 310.804126] #2: (device_hotplug_lock){+.+.+.}, at: [] lock_device_hotplug+0x17/0x20
[ 310.815057] #3: (acpi_scan_lock){+.+.+.}, at: [] acpi_device_hotplug+0x36/0x3fa
[ 310.825599] #4: (cpu_add_remove_lock){+.+.+.}, at: [] cpu_maps_update_begin+0x17/0x20
[ 310.836727] #5: (cpu_hotplug.lock){++++++}, at: [] cpu_hotplug_begin+0x5/0xc0
[ 310.847073] #6: (cpu_hotplug.lock#2){+.+.+.}, at: [] cpu_hotplug_begin+0x7b/0xc0
[ 310.857774]
[ 310.857774] stack backtrace:
[ 310.862754] CPU: 11 PID: 388 Comm: kworker/u288:3 Not tainted 4.2.0-rc8+ #7
[ 310.870628] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXIN1.86B.0060.R02.1508171754 08/17/2015
[ 310.882326] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[ 310.888499] ffffffff82a39b50 ffff88042b9a38d8 ffffffff8185f0b8 0000000000000011
[ 310.897130] ffffffff82afcab0 ffff88042b9a3928 ffffffff8185c183 0000000000000007
[ 310.905762] ffff88042b9a3998 ffff88042b9a3928 ffff88042b99ab08 ffff88042b99a980
[ 310.914393] Call Trace:
[ 310.917206] [] dump_stack+0x4c/0x65
[ 310.923039] [] print_circular_bug+0x20b/0x21c
[ 310.929843] [] __lock_acquire+0x21b9/0x21c0
[ 310.936455] [] ? native_sched_clock+0x28/0x90
[ 310.943258] [] lock_acquire+0xdd/0x2a0
[ 310.949382] [] ? kernfs_remove_by_name_ns+0x49/0xb0
[ 310.956769] [] __kernfs_remove+0x231/0x330
[ 310.963280] [] ? kernfs_remove_by_name_ns+0x49/0xb0
[ 310.970669] [] ? kernfs_name_hash+0x17/0xa0
[ 310.977278] [] ? kernfs_find_ns+0x81/0x140
[ 310.983792] [] kernfs_remove_by_name_ns+0x49/0xb0
[ 310.990986] [] sysfs_remove_file_ns+0x15/0x20
[ 310.997791] [] device_remove_attrs+0x3e/0x80
[ 311.004498] [] device_del+0x138/0x270
[ 311.010524] [] ? kernfs_remove_by_name_ns+0x55/0xb0
[ 311.017914] [] device_unregister+0x22/0x70
[ 311.024427] [] unregister_cpu+0x39/0x60
[ 311.030646] [] arch_unregister_cpu+0x23/0x30
[ 311.037354] [] acpi_processor_remove+0x91/0xca
[ 311.044257] [] acpi_bus_trim+0x5a/0x8d
[ 311.050379] [] acpi_bus_trim+0x38/0x8d
[ 311.056501] [] acpi_scan_device_not_present+0x1d/0x3d
[ 311.064085] [] acpi_scan_bus_check+0x29/0xa2
[ 311.070791] [] acpi_device_hotplug+0x99/0x3fa
[ 311.077596] [] acpi_hotplug_work_fn+0x1f/0x2b
[ 311.084402] [] process_one_work+0x1f1/0x7c0
[ 311.091012] [] ? process_one_work+0x166/0x7c0
[ 311.097815] [] ? worker_thread+0xf9/0x480
[ 311.104231] [] worker_thread+0x69/0x480
[ 311.110451] [] ? process_one_work+0x7c0/0x7c0
[ 311.117256] [] kthread+0x11f/0x140
[ 311.122990] [] ? kthread_create_on_node+0x260/0x260
[ 311.130379] [] ret_from_fork+0x3f/0x70
[ 311.136502] [] ? kthread_create_on_node+0x260/0x260
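
To illustrate why the two paths at the top of this report conflict, here is a minimal userspace sketch of lock-order tracking. It is a toy model, not how lockdep is actually implemented. The names lock_class, record_order() and would_deadlock() are hypothetical and exist only for this example. S_ACTIVE stands for the kernfs kn->dep_map and CPU_HOTPLUG for cpu_hotplug.dep_map.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two dep_maps in the report. */
enum lock_class { S_ACTIVE, CPU_HOTPLUG, NR_CLASSES };

/* edge[a][b] is true once some path has acquired b while holding a. */
static bool edge[NR_CLASSES][NR_CLASSES];

/* Forget every ordering recorded so far. */
static void reset_graph(void)
{
    memset(edge, 0, sizeof(edge));
}

/* Record that `inner` was acquired while `outer` was already held. */
static void record_order(enum lock_class outer, enum lock_class inner)
{
    assert(outer < NR_CLASSES && inner < NR_CLASSES);
    edge[outer][inner] = true;
}

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++) {
        if (edge[from][next] && !seen[next] &&
            reachable((enum lock_class)next, to, seen))
            return true;
    }
    return false;
}

/*
 * Taking `inner` while holding `outer` closes a cycle if `outer` is
 * already reachable from `inner` through previously recorded orderings.
 */
static bool would_deadlock(enum lock_class outer, enum lock_class inner)
{
    bool seen[NR_CLASSES] = { false };

    return reachable(inner, outer, seen);
}
```

Replaying the report in this model: path 1 (1.a then 1.b) corresponds to record_order(S_ACTIVE, CPU_HOTPLUG). When path 2 then holds CPU_HOTPLUG (2.a) and wants S_ACTIVE (2.b), would_deadlock(CPU_HOTPLUG, S_ACTIVE) returns true, which is the same inversion the warning above describes.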