Re: [Bug 11990] New: Kernel hang in spin_unlock_irq from scsi_request_fn from do_IRQ

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: bugme-daemon@bugzilla.kernel.org
Cc: linux-scsi@vger.kernel.org, yanmin_zhang@linux.intel.com
Subject: Re: [Bug 11990] New: Kernel hang in spin_unlock_irq from scsi_request_fn from do_IRQ
Date: Sun, 09 Nov 2008 09:22:47 -0600	[thread overview]
Message-ID: <1226244167.19841.3.camel@localhost.localdomain> (raw)
In-Reply-To: <bug-11990-11613@http.bugzilla.kernel.org/>

On Sat, 2008-11-08 at 19:50 -0800, bugme-daemon@bugzilla.kernel.org
wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11990
> 
>            Summary: Kernel hang in spin_unlock_irq from scsi_request_fn from
>                     do_IRQ
>            Product: IO/Storage
>            Version: 2.5
>      KernelVersion: 2.6.28-rc3
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: SCSI
>         AssignedTo: linux-scsi@vger.kernel.org
>         ReportedBy: vandrove@vc.cvut.cz
> 
> 
> Latest working kernel version: commit c8d7aa after 2.6.28-rc2
> Earliest failing kernel version: commit 920da6 after 2.6.28-rc2
> Distribution: Debian
> Hardware Environment: sata_sil24, amd64, 2cpu
> Software Environment: 64bit kernel, 32bit userspace, preemptible kernel
> Problem Description:
> 
> When I/O is under stress, from time to time CPU1 hangs, most probably due to
> endless stream of interrupts.  Backtrace printed either by kernel's softlockup
> detection or alt-sysrq-p is below (written down; I/O is dead when this
> happens).
> 
> _spin_unlock_irq + 0x30  (after sti)
> scsi_request_fn + 0x1b9  (after spin_unlock_irq(shost->host_lock) at
> not_ready:)
> blk_invoke_request_fn
> __blk_runqueue
> scsi_run_queue
> scsi_next_command
> scsi_end_request
> scsi_io_completion
> scsi_finish_command
> scsi_softirq_done
> blk_done_softirq
> __do_softirq
> call_softirq
> do_softirq
> irqexit
> do_IRQ
> ret_from_intr
> <EOI>
> native_safe_halt
> trace_hardirqs_on
> default_idle
> c1e_idle
> cpu_idle
> start_secondary
> 
> Steps to reproduce:
> 
> It seems to occur under heavy I/O (updatedb, dumping core from ~3GB app), but I
> was not able to trigger it reliably - most reliable is hard resetting box, then
> it occurs in ~80% cases when replaying journals on disks connected to
> sata_sil24 (through PMP, but problem does not seem to occur on 2.6.28-rc2 with
> Jens's PMP patches).

This looks identical to

http://bugzilla.kernel.org/show_bug.cgi?id=11898

Could you see if this refinement of the discussed patches fixes it for
you?

Thanks,

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f5d3b96..e09a661 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -606,6 +606,7 @@ static void scsi_run_queue(struct request_queue *q)
 		}
 
 		list_del_init(&sdev->starved_entry);
+		starved_head = NULL;
 		spin_unlock(shost->host_lock);
 
 		spin_lock(sdev->request_queue->queue_lock);
@@ -620,6 +621,12 @@ static void scsi_run_queue(struct request_queue *q)
 		spin_unlock(sdev->request_queue->queue_lock);
 
 		spin_lock(shost->host_lock);
+		if (unlikely(!list_empty(&sdev->starved_entry)))
+			/* 
+			 * sdev got put back on the starved list
+			 * so finish starved handling
+			 */
+			break;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);

next prev parent reply	other threads:[~2008-11-09 15:22 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-09  3:50 [Bug 11990] New: Kernel hang in spin_unlock_irq from scsi_request_fn from do_IRQ bugme-daemon
2008-11-09  3:51 ` [Bug 11990] " bugme-daemon
2008-11-09  9:57 ` bugme-daemon
2008-11-09 15:22 ` James Bottomley [this message]
2008-11-09 15:23 ` bugme-daemon
2008-11-09 19:01 ` bugme-daemon
2008-11-09 19:01 ` bugme-daemon

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:f5d3b96 dfblob:e09a661 )
 OR (
bs:"Re: [Bug 11990] New: Kernel hang in spin_unlock_irq from scsi_request_fn from do_IRQ" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1226244167.19841.3.camel@localhost.localdomain \
    --to=james.bottomley@hansenpartnership.com \
    --cc=bugme-daemon@bugzilla.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=yanmin_zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox