From: Paul Smith <paul@mad-scientist.net>
To: linux-scsi@vger.kernel.org
Subject: [2.6.27.25] Hang in SCSI sync cache when a disk is removed--?
Date: Thu, 02 Jul 2009 12:22:52 -0400
Message-ID: <1246551772.9022.7192.camel@psmith-ubeta.netezza.com>
Hi all; we are seeing a problem where, when we pull a disk out of our
disk array (even one that's not actively being used), the entire IO
subsystem in Linux hangs. Here are some details:
I have an IBM Bladecenter with an LSI EXP3000 SAS expander with 12 1TB
Seagate SAS disks. Relevant lspci output for the SAS controllers:
# lspci | grep LSI
02:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS (rev 02)
08:01.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064 PCI-X Fusion-MPT SAS (rev 03)
14:01.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064 PCI-X Fusion-MPT SAS (rev 03)
On this system we are running an embedded/custom version of Linux in a
ramdisk, based on Linux 2.6.27.25. Unfortunately it's quite difficult,
if not impossible, for us to upgrade to a newer kernel at this time;
however, if this problem rings a bell I'm happy to backport patches,
fixes, etc.
As I mentioned, when we pull one of the disks from the EXP3000 the IO
subsystem completely hangs. Since we're running on a ramdisk this
doesn't hang our system completely, but any attempt to do any disk IO
thereafter hangs, so we have to power-cycle the blade (because reboot
tries to write to the disks). This is quite reproducible in our
environment, but it is very timing-sensitive, as shown below; if we
enable too much logging it stops reproducing.
We've been in touch with some driver folks at LSI and they seem to feel
that the problem is a SCSI midlayer race condition, rather than in the
mptlinux driver itself. So I'm hoping someone here has ideas.
On a working disk pull we get log messages like this:
mptscsih: ioc1: attempting host reset! (sc=ffff8804619e2640)
mptscsih: ioc1: host reset: SUCCESS (sc=ffff8804619e2640)
mptbase: ioc1: LogInfo(0x30030501): Originator={IOP}, Code={Invalid Page}, SubCode(0x0501)
mptsas: ioc1: removing ssp device: fw_channel 0, fw_id 72, phy 11, sas_addr 0x5000c5000d2987b6
sd 3:0:11:0: [sdx] Synchronizing SCSI cache
sd 3:0:11:0: Device offlined - not ready after error recovery
sg_cmd_done: device detached
Note that the "host reset: SUCCESS" message here comes BEFORE the
"Synchronizing SCSI cache" message. On a hanging disk pull we get log
messages like this:
mptscsih: ioc1: attempting host reset! (sc=ffff8804622b48c0)
mptsas: ioc1: removing ssp device: fw_channel 0, fw_id 72, phy 11, sas_addr 0x5000c5000d2987b6
sd 3:0:11:0: [sdx] Synchronizing SCSI cache
and it hangs right here. In this situation the host reset does not
complete before we try to sync, and that appears to be the indicator of
the problem. Here's a backtrace; note we're in sd_sync_cache():
Call Trace:
[<ffffffff8048d88f>] _spin_lock_irqsave+0x1f/0x50
[<ffffffff8048daf2>] _spin_unlock_irqrestore+0x12/0x40
[<ffffffffa00080fc>] scsi_get_command+0x8c/0xc0 [scsi_mod]
[<ffffffff8048c11d>] schedule_timeout+0xad/0xf0
[<ffffffff8034df1d>] elv_next_request+0x15d/0x290
[<ffffffff8048b1ea>] wait_for_common+0xba/0x170
[<ffffffff80237460>] default_wake_function+0x0/0x10
[<ffffffff80353b77>] blk_execute_rq+0x67/0xa0
[<ffffffff80350e71>] get_request_wait+0x21/0x1d0
[<ffffffff8023e972>] vprintk+0x1f2/0x490
[<ffffffff8048dab1>] _spin_unlock_irq+0x11/0x40
[<ffffffffa000e5a4>] scsi_execute+0xf4/0x150 [scsi_mod]
[<ffffffffa000e691>] scsi_execute_req+0x91/0x100 [scsi_mod]
[<ffffffffa00f89bc>] sd_sync_cache+0xac/0x100 [sd_mod]
[<ffffffff80360000>] compat_blkdev_ioctl+0x80/0x1740
[<ffffffff80364062>] kobject_get+0x12/0x20
[<ffffffffa00fac51>] sd_shutdown+0x71/0x160 [sd_mod]
[<ffffffffa00fad7c>] sd_remove+0x3c/0x80 [sd_mod]
[<ffffffffa0012122>] scsi_bus_remove+0x42/0x60 [scsi_mod]
[<ffffffff803d8ba9>] __device_release_driver+0x99/0x100
[<ffffffff803d8d08>] device_release_driver+0x28/0x40
[<ffffffff803d8087>] bus_remove_device+0xb7/0xf0
[<ffffffff803d66c9>] device_del+0x119/0x1a0
[<ffffffffa001245c>] __scsi_remove_device+0x5c/0xb0 [scsi_mod]
[<ffffffffa00124d8>] scsi_remove_device+0x28/0x40 [scsi_mod]
[<ffffffffa00125a0>] __scsi_remove_target+0xa0/0xd0 [scsi_mod]
[<ffffffffa0012640>] __remove_child+0x0/0x30 [scsi_mod]
[<ffffffffa0012656>] __remove_child+0x16/0x30 [scsi_mod]
[<ffffffff803d5c3b>] device_for_each_child+0x3b/0x60
[<ffffffffa0012606>] scsi_remove_target+0x36/0x70 [scsi_mod]
[<ffffffffa010c5f5>] sas_rphy_remove+0x75/0x80 [scsi_transport_sas]
[<ffffffffa010c609>] sas_rphy_delete+0x9/0x20 [scsi_transport_sas]
[<ffffffffa010c642>] sas_port_delete+0x22/0x140 [scsi_transport_sas]
[<ffffffffa013c230>] mptsas_del_end_device+0x230/0x2c0 [mptsas]
[<ffffffffa013c8a1>] mptsas_hotplug_work+0x291/0xb20 [mptsas]
[<ffffffff80369c9a>] vsnprintf+0x2ea/0x7c0
[<ffffffff80287dac>] free_hot_cold_page+0x1fc/0x2f0
[<ffffffff80287ed8>] __pagevec_free+0x38/0x50
[<ffffffff8028b730>] release_pages+0x180/0x1d0
[<ffffffff80362789>] __next_cpu+0x19/0x30
[<ffffffff802321ec>] find_busiest_group+0x1dc/0x960
[<ffffffff80362789>] __next_cpu+0x19/0x30
[<ffffffff802321ec>] find_busiest_group+0x1dc/0x960
[<ffffffffa013e4a9>] mptsas_firmware_event_work+0xd29/0x1110 [mptsas]
[<ffffffff8022dc94>] update_curr+0x84/0xd0
[<ffffffff80230370>] __dequeue_entity+0x60/0x90
[<ffffffff8048dab1>] _spin_unlock_irq+0x11/0x40
[<ffffffff802364fb>] finish_task_switch+0x3b/0xd0
[<ffffffff8048b911>] thread_return+0xa3/0x662
[<ffffffffa013d780>] mptsas_firmware_event_work+0x0/0x1110 [mptsas]
[<ffffffff80250e65>] run_workqueue+0x85/0x150
[<ffffffff80250fcf>] worker_thread+0x9f/0x110
[<ffffffff802553b0>] autoremove_wake_function+0x0/0x30
[<ffffffff80250f30>] worker_thread+0x0/0x110
[<ffffffff80254ef7>] kthread+0x47/0x90
[<ffffffff80254eb0>] kthread+0x0/0x90
[<ffffffff8020d5f9>] child_rip+0xa/0x11
[<ffffffff80254eb0>] kthread+0x0/0x90
[<ffffffff80254eb0>] kthread+0x0/0x90
[<ffffffff8020d5ef>] child_rip+0x0/0x11
According to sd.c:sd_sync_cache() it's supposed to retry the
scsi_execute_req() three times and then give up, but instead it never
returns. It seems that if the host reset has not completed yet, we
find this event on the workqueue and get into some kind of deadlock
situation.
We're kind of stuck on this; does anyone have thoughts, or avenues to
investigate, that would move us forward on resolving it?
Thanks!
Thread overview: 11+ messages
2009-07-02 16:22 Paul Smith [this message]
2009-07-02 17:41 ` [2.6.27.25] Hang in SCSI sync cache when a disk is removed--? Mike Anderson
2009-07-02 20:12 ` Paul Smith
2009-07-06 18:04 ` Paul Smith
2009-07-07 6:25 ` Mike Anderson
2009-07-07 13:58 ` James Bottomley
2009-07-07 14:33 ` Paul Smith
2009-07-07 20:24 ` Desai, Kashyap
2009-07-07 20:45 ` Mike Anderson
2009-07-07 21:10 ` Mike Anderson
2009-07-21 10:16 ` Desai, Kashyap