Subject: Re: [kernel.org users] [KORG] Panics on master backend
From: Peter Zijlstra
To: "J.H."
Cc: users@kernel.org, linux-kernel, Oleg Nesterov, Frank Rowand, "yong.zhang0", mingo@kernel.org, Jens Axboe, Thomas Gleixner, scameron@beardog.cce.hp.com, "James E.J. Bottomley", hch
Date: Tue, 23 Aug 2011 21:52:13 +0200
In-Reply-To: <4E53ECEF.7040109@kernel.org>
References: <4E53ECEF.7040109@kernel.org>
Message-ID: <1314129133.8002.102.camel@twins>

On Tue, 2011-08-23 at 11:09 -0700, J.H. wrote:
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.702389] <> [] try_to_wake_up+0x89/0x1ca
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.709536] [] ? wake_up_process+0x15/0x17
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.715422] [] default_wake_function+0x12/0x14
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.721659] [] __wake_up_common+0x4e/0x84
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.727489] [] ? _xfs_buf_ioend+0x2f/0x36 [xfs]
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.733852] [] complete+0x3f/0x52
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.738985] [] xfs_buf_ioend+0xc2/0xe2 [xfs]
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.745079] [] _xfs_buf_ioend+0x2f/0x36 [xfs]
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.751259] [] xfs_buf_bio_end_io+0x2a/0x37 [xfs]
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.757758] [] bio_endio+0x2d/0x2f
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.762945] [] req_bio_endio.clone.31+0x9e/0xa7
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.772811] [] blk_update_request+0x191/0x32e
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.782563] [] blk_update_bidi_request+0x21/0x6d
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.792529] [] blk_end_bidi_request+0x1f/0x5d
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.802230] [] blk_end_request+0x10/0x12
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.811446] [] scsi_io_completion+0x1e2/0x4d8
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.821002] [] ? _raw_spin_unlock_irqrestore+0x17/0x19
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.831405] [] scsi_finish_command+0xe8/0xf1
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.840948] [] scsi_softirq_done+0x109/0x112
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.850485] [] blk_done_softirq+0x73/0x87
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.859729] [] __do_softirq+0xc9/0x1b6
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.868705] [] ? paravirt_read_tsc+0x9/0xd
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.878093] [] call_softirq+0x1c/0x30
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.887025] [] do_softirq+0x46/0x84
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.895767] [] irq_exit+0x52/0xaf
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.904246] [] smp_apic_timer_interrupt+0x7c/0x8a
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.914171] [] apic_timer_interrupt+0x6e/0x80
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.923618] [] ? arch_local_irq_restore+0x6/0xd
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.933756] [] _raw_spin_unlock_irqrestore+0x17/0x19
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.943630] [] hpsa_scsi_queue_command+0x2f9/0x319 [hpsa]
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.953851] [] scsi_dispatch_cmd+0x18c/0x251
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.962917] [] scsi_request_fn+0x3cd/0x3f9
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.971781] [] __blk_run_queue+0x1b/0x1d
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.980439] [] cfq_insert_request+0x4cf/0x504
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.989469] [] __elv_add_request+0x19c/0x1e7
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145813.998481] [] blk_flush_plug_list+0x166/0x19b
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145814.007670] [] schedule+0x2a3/0x6dc
> 20110823.log:Aug 23 16:49:53 console1 port05 RXDATA: [145814.015903] [] schedule_timeout+0x36/0xe3

Whee, you must be real 'lucky' to hit this so reliably.
Both panics included the above bit, so what I think happens is that we
release rq->lock while calling that blk_flush_plug muck, it however then
goes and re-enables interrupts. *FAIL*

An interrupt happens and it (or in fact a pending softirq) then tries to
wake the current task, which is in the process of going to sleep. Since
the task is still on the cpu we spin until it's gone (assuming it's a
remote task, since interrupts aren't supposed to happen here).

Now I can fudge around this, see below, but really the whole
block/scsi/hpsa crap should *NOT* re-enable interrupts here.

It looks like the offender is scsi_request_fn(), which does:

	/*
	 * XXX(hch): This is rather suboptimal, scsi_dispatch_cmd will
	 * take the lock again.
	 */
	spin_unlock_irq(shost->host_lock);

---
 kernel/sched.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index ccacdbd..30d9788 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2630,7 +2630,6 @@ static void ttwu_queue_remote(struct task_struct *p, int cpu)
 		smp_send_reschedule(cpu);
 }
 
-#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
 static int ttwu_activate_remote(struct task_struct *p, int wake_flags)
 {
 	struct rq *rq;
@@ -2647,7 +2646,6 @@ static int ttwu_activate_remote(struct task_struct *p, int wake_flags)
 
 	return ret;
 }
-#endif /* __ARCH_WANT_INTERRUPTS_ON_CTXSW */
 #endif /* CONFIG_SMP */
 
 static void ttwu_queue(struct task_struct *p, int cpu)
@@ -2705,7 +2703,6 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	 * this task as prev, wait until its done referencing the task.
 	 */
 	while (p->on_cpu) {
-#ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW
 		/*
 		 * In case the architecture enables interrupts in
 		 * context_switch(), we cannot busy wait, since that
@@ -2713,11 +2710,11 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 		 * tries to wake up @prev. So bail and do a complete
 		 * remote wakeup.
 		 */
-		if (ttwu_activate_remote(p, wake_flags))
+		if (cpu == smp_processor_id() &&
+		    ttwu_activate_remote(p, wake_flags))
 			goto stat;
-#else
+
 		cpu_relax();
-#endif
 	}
 
 	/*
	 * Pairs with the smp_wmb() in finish_lock_switch().