From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Vrabel
Subject: xen_console_resume() may sleep with irqs disabled
Date: Wed, 19 Jun 2013 13:08:25 +0100
Message-ID: <51C19F39.4080207@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Xen-devel@lists.xen.org
Cc: Ian Campbell, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

During a suspend/resume cycle (e.g., a migration), xen_console_resume()
is called from xen_suspend() which is called with local irqs disabled by
stop_machine().

xen_console_resume() calls rebind_evtchn_irq() which attempts to lock
the irq_mapping_update_lock mutex. This produces the BUG listed below.

The lock was changed to a mutex by 773659 (xen/irq: Alter the locking to
use a mutex instead of a spinlock.).

Clearly we can't just revert this change but it's not clear to me what
the correct fix here is.

Can xen_console_resume() be deferred to later in the resume process?

David

[ 39.877210] BUG: sleeping function called from invalid context at /anfs/drall/scratch/davidvr/x86/linux/kernel/mutex.c:413
[ 39.877210] in_atomic(): 1, irqs_disabled(): 1, pid: 7, name: migration/0
[ 39.877210] no locks held by migration/0/7.
[ 39.877210] irq event stamp: 38
[ 39.877210] hardirqs last enabled at (37): [] _raw_spin_unlock_irq+0x30/0x50
[ 39.877210] hardirqs last disabled at (38): [] stop_machine_cpu_stop+0x95/0x110
[ 39.877210] softirqs last enabled at (0): [] copy_process.part.63+0x348/0x13e0
[ 39.877210] softirqs last disabled at (0): [< (null)>] (null)
[ 39.877210] CPU: 0 PID: 7 Comm: migration/0 Not tainted 3.10.0-rc6.davidvr #82
[ 39.877210] 0000000000000000 ffff88000ec8fbc8 ffffffff812cc2cf ffff88000ec8fbe8
[ 39.877210] ffffffff810723b5 ffff88000ed16800 ffff88000ec8ffd8 ffff88000ec8fc68
[ 39.877210] ffffffff812cd181 ffff88000ec8fc18 ffffffff81092b17 ffff88000ed16800
[ 39.877210] Call Trace:
[ 39.877210] [] dump_stack+0x19/0x1b
[ 39.877210] [] __might_sleep+0xf5/0x130
[ 39.877210] [] mutex_lock_nested+0x41/0x3b0
[ 39.877210] [] ? __irq_put_desc_unlock+0x27/0x50
[ 39.877210] [] ? __disable_irq_nosync+0x4c/0x70
[ 39.877210] [] ? queue_stop_cpus_work+0xc1/0xe0
[ 39.877210] [] rebind_evtchn_irq+0x42/0xc0
[ 39.877210] [] xen_console_resume+0x60/0x70
[ 39.877210] [] xen_suspend+0x92/0xb0
[ 39.877210] [] stop_machine_cpu_stop+0xab/0x110
[ 39.877210] [] ? queue_stop_cpus_work+0xe0/0xe0
[ 39.877210] [] cpu_stopper_thread+0x8b/0x140
[ 39.877210] [] ? _raw_spin_unlock_irqrestore+0x3f/0x80
[ 39.877210] [] ? trace_hardirqs_on_caller+0x105/0x1d0
[ 39.877210] [] ? trace_hardirqs_on+0xd/0x10
[ 39.877210] [] ? _raw_spin_unlock_irqrestore+0x49/0x80
[ 39.877210] [] smpboot_thread_fn+0x166/0x1f0
[ 39.877210] [] ? lg_global_unlock+0x80/0x80
[ 39.877210] [] kthread+0xdb/0xe0
[ 39.877210] [] ? __kthread_parkme+0x90/0x90
[ 39.877210] [] ret_from_fork+0x7c/0xb0
[ 39.877210] [] ? __kthread_parkme+0x90/0x90
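
For illustration only, the ordering problem described in this message
can be modelled in plain userspace C. This is a sketch, not kernel code:
every function and variable in it (model_might_sleep(), suspend_inline(),
suspend_deferred(), and so on) is hypothetical and merely mimics the call
chain in the trace. It says nothing about whether the real console can
safely be resumed later, which is the open question asked in the message.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool irqs_off;         /* models irqs_disabled() */
static int  sleep_violations; /* counts would-be "sleeping function" BUGs */
static bool console_rebound;  /* models the console event channel rebind */
static bool resume_pending;   /* hypothetical "resume console later" flag */

/* Models __might_sleep(): complain when called with irqs disabled. */
static void model_might_sleep(const char *who)
{
    if (irqs_off) {
        sleep_violations++;
        fprintf(stderr, "BUG (model): %s may sleep with irqs disabled\n", who);
    }
}

/* Models rebind_evtchn_irq(), which takes a mutex and so may sleep. */
static void model_rebind_evtchn_irq(void)
{
    model_might_sleep("rebind_evtchn_irq");
    console_rebound = true;
}

/* Models xen_console_resume(). */
static void model_console_resume(void)
{
    model_rebind_evtchn_irq();
}

/* Reported ordering: console resume runs inside the irqs-off section. */
int suspend_inline(void)
{
    sleep_violations = 0;
    console_rebound = false;

    irqs_off = true;          /* the stop_machine() callback runs here */
    model_console_resume();
    irqs_off = false;

    return sleep_violations;
}

/* Hypothetical ordering: only record the request, act after irqs are on. */
int suspend_deferred(void)
{
    sleep_violations = 0;
    console_rebound = false;

    irqs_off = true;
    resume_pending = true;    /* nothing that may sleep happens here */
    irqs_off = false;

    if (resume_pending) {
        assert(!irqs_off);    /* deferred work must run in sleepable context */
        model_console_resume();
        resume_pending = false;
    }
    return sleep_violations;
}

bool console_is_rebound(void)
{
    return console_rebound;
}
```

In this model suspend_inline() records one violation, matching the single
BUG in the log, while suspend_deferred() records none and still ends with
the console rebound. The cost of the deferred ordering in a real system
would be a window after resume in which the console is not yet usable.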