From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
Subject: Re: [PATCH RFC] spinlock: split out debugging check from spin_lock_mutex
Date: Fri, 12 Apr 2013 16:01:04 +0200
Message-ID: <516813A0.1040300@acm.org>
References: <5166BDAA.3000603@acm.org>
 <1365693486-6315-1-git-send-email-nhorman@tuxdriver.com>
 <5166F35B.1040200@acm.org>
 <20130411191409.GA9790@hmsreliant.think-freely.org>
 <5167A953.1020900@acm.org>
 <20130412113232.GA19966@hmsreliant.think-freely.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: David Miller, netdev@vger.kernel.org, Ingo Molnar
To: Neil Horman
Return-path:
Received: from jacques.telenet-ops.be ([195.130.132.50]:42897 "EHLO
 jacques.telenet-ops.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with
 ESMTP id S1751183Ab3DLOBN (ORCPT ); Fri, 12 Apr 2013 10:01:13 -0400
In-Reply-To: <20130412113232.GA19966@hmsreliant.think-freely.org>
Sender: netdev-owner@vger.kernel.org
List-ID:

On 04/12/13 13:32, Neil Horman wrote:
> On Fri, Apr 12, 2013 at 08:27:31AM +0200, Bart Van Assche wrote:
>> On 04/11/13 21:14, Neil Horman wrote:
>>> This resulted from my commit ca99ca14c which introduced a mutex_trylock
>>> operation in a path that could execute in interrupt context. When mutex
>>> debugging is enabled, the above warns the user when we are in fact
>>> executing in interrupt context.
>>>
>>> I think this is a false positive, however. The check is intended to catch
>>> users who might be issuing sleeping calls in irq context, but the use of
>>> mutex_trylock here is guaranteed not to sleep.
>>>
>>> We could fix this by replacing the DEBUG_LOCK_WARN_ON check in
>>> spin_lock_mutex with a __might_sleep call in the appropriate parent mutex
>>> operations, but for the sake of efficiency (which, it seems, is why the
>>> check was put in the spin lock code only when debug is enabled), let's
>>> split the spin_lock_mutex call into two components, where the outer
>>> component does the debug checking.
>>> Then mutex_trylock can just call the inner part, as it's safely callable
>>> from irq context.
>>
>> Sorry, but I'm not yet convinced that it's safe to invoke mutex_trylock()
>> from IRQ context. Please have a look at the implementation of
>> mutex_set_owner(), which is invoked by mutex_trylock(). mutex_set_owner()
>> stores the value of the "current" pointer into lock->owner. The value of
>> "current" does not have a meaning in IRQ context.
>
> That's irrelevant, at least as far as deadlock safety is concerned. current
> will be set to the process that was running when we were interrupted, but
> it won't change during the course of the irq handler, which is all that
> matters. The lock->owner field is used for optimistic spinning. The worst
> that can happen is that, if CONFIG_MUTEX_SPIN_ON_OWNER is configured,
> another process may wait on this mutex, spinning on the wrong task to
> release it (see mutex_spin_on_owner). That's not efficient, but it's not
> deadlock prone, and it's not even that inefficient when you consider that
> the critical path in the netpoll code is relatively short. And using the
> trylock here is certainly preferable to the memory corruption that was
> possible previously.

I think there is another issue with invoking mutex_trylock() and
mutex_unlock() from IRQ context: as far as I can see, if
CONFIG_DEBUG_MUTEXES is disabled, __mutex_unlock_common_slowpath() uses
spin_lock() rather than spin_lock_irqsave() to lock mutex.wait_lock, and
hence invoking mutex_unlock() from both non-IRQ and IRQ context is not safe.
Any thoughts about that?

With v2 of your patch and CONFIG_DEBUG_MUTEXES enabled I get the warning
below:

------------[ cut here ]------------
WARNING: at kernel/mutex.c:313 __mutex_unlock_slowpath+0x157/0x160()
Pid: 181, comm: kworker/0:1H Tainted: G O 3.9.0-rc6-debug+ #1
Call Trace:
 [] warn_slowpath_common+0x7f/0xc0
 [] warn_slowpath_null+0x1a/0x20
 [] __mutex_unlock_slowpath+0x157/0x160
 [] mutex_unlock+0xe/0x10
 [] netpoll_poll_dev+0x111/0x9a0
 [] ? __alloc_skb+0x82/0x2a0
 [] netpoll_send_skb_on_dev+0x205/0x3b0
 [] netpoll_send_udp+0x28a/0x3a0
 [] ? write_msg+0x53/0x110 [netconsole]
 [] write_msg+0xcf/0x110 [netconsole]
 [] call_console_drivers.constprop.16+0xa1/0x120
 [] console_unlock+0x3f8/0x450
 [] vprintk_emit+0x1ee/0x510
 [] dev_vprintk_emit+0x5c/0x70
 [] ? mempool_free_slab+0x17/0x20
 [] ? mempool_free_slab+0x17/0x20
 [] ? kmem_cache_free+0x1c2/0x1d0
 [] dev_printk_emit+0x39/0x40
 [] ? blk_update_request+0x3d2/0x520
 [] ? device_block+0x10/0x10 [scsi_mod]
 [] __dev_printk+0x5e/0x90
 [] dev_printk+0x45/0x50
 [] scsi_io_completion+0x277/0x6c0 [scsi_mod]
 [] scsi_finish_command+0xbd/0x120 [scsi_mod]
 [] scsi_softirq_done+0x13f/0x160 [scsi_mod]
 [] blk_done_softirq+0x80/0xa0
 [] __do_softirq+0x101/0x280
 [] irq_exit+0xb5/0xc0
 [] smp_apic_timer_interrupt+0x6e/0x99
 [] apic_timer_interrupt+0x6f/0x80
 [] ? mark_held_locks+0xb2/0x130
 [] ? _raw_spin_unlock_irq+0x3a/0x50
 [] ? _raw_spin_unlock_irq+0x30/0x50
 [] cfq_kick_queue+0x44/0x50
 [] process_one_work+0x1fd/0x510
 [] ? process_one_work+0x192/0x510
 [] worker_thread+0x10f/0x380
 [] ? busy_worker_rebind_fn+0xb0/0xb0
 [] kthread+0xdb/0xe0
 [] ? kthread_create_on_node+0x140/0x140
 [] ret_from_fork+0x7c/0xb0
 [] ? kthread_create_on_node+0x140/0x140
---[ end trace dd7421d6dfb2c4ed ]---

Bart.