linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: linux-scsi@vger.kernel.org
Cc: linux-ide@vger.kernel.org, Marcin Tomczak <marcin.tomczak@intel.com>
Subject: [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race
Date: Fri, 10 Feb 2012 00:44:44 -0800	[thread overview]
Message-ID: <20120210084444.25701.31139.stgit@dwillia2-linux.jf.intel.com> (raw)
In-Reply-To: <20120210084411.25701.94502.stgit@dwillia2-linux.jf.intel.com>

ata_port lifetime in libata follows the host.  In libsas it follows the
scsi_target.  Once scsi_remove_device() has caused all commands to be
completed it allows scsi_remove_target() to immediately proceed to
freeing the ata_port causing bug reports like:

[  848.393333] BUG: spinlock bad magic on CPU#4, kworker/u:2/5107
[  848.400262] general protection fault: 0000 [#1] SMP
[  848.406244] CPU 4
[  848.408310] Modules linked in: nls_utf8 ipv6 uinput i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca sg sd_mod sr_mod cdrom ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
[  848.432060]
[  848.434137] Pid: 5107, comm: kworker/u:2 Not tainted 3.2.0-isci+ #8 Intel Corporation S2600CP/S2600CP
[  848.445310] RIP: 0010:[<ffffffff8126a68c>]  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
[  848.454787] RSP: 0018:ffff8807f868dca0  EFLAGS: 00010002
[  848.461137] RAX: 0000000000000048 RBX: ffff8807fe86a630 RCX: ffffffff817d0be0
[  848.469520] RDX: 0000000000000000 RSI: ffffffff814af1cf RDI: 0000000000000002
[  848.477959] RBP: ffff8807f868dcb0 R08: 00000000ffffffff R09: 000000006b6b6b6b
[  848.486327] R10: 000000000003fb8c R11: ffffffff81a19448 R12: 6b6b6b6b6b6b6b6b
[  848.494699] R13: ffff8808027dc520 R14: 0000000000000000 R15: 000000000000001e
[  848.503067] FS:  0000000000000000(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000
[  848.512899] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  848.519710] CR2: 00007ff77d001000 CR3: 00000007f7a5d000 CR4: 00000000000406e0
[  848.528072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  848.536446] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  848.544831] Process kworker/u:2 (pid: 5107, threadinfo ffff8807f868c000, task ffff8807ff348000)
[  848.555327] Stack:
[  848.557959]  ffff8807fe86a630 ffff8807fe86a630 ffff8807f868dcd0 ffffffff8126a6e0
[  848.567072]  ffffffff817c142f ffff8807fe86a630 ffff8807f868dcf0 ffffffff8126a703
[  848.576190]  ffff8808027dc520 0000000000000286 ffff8807f868dd10 ffffffff814af1bb
[  848.585281] Call Trace:
[  848.588409]  [<ffffffff8126a6e0>] spin_bug+0x26/0x28
[  848.594357]  [<ffffffff8126a703>] do_raw_spin_unlock+0x21/0x88
[  848.601283]  [<ffffffff814af1bb>] _raw_spin_unlock_irqrestore+0x2c/0x65
[  848.609089]  [<ffffffffa001c103>] ata_scsi_port_error_handler+0x548/0x557 [libata]
[  848.618331]  [<ffffffff81061813>] ? async_schedule+0x17/0x17
[  848.625060]  [<ffffffffa004f30f>] async_sas_ata_eh+0x45/0x69 [libsas]
[  848.632655]  [<ffffffff810618aa>] async_run_entry_fn+0x97/0x125
[  848.639670]  [<ffffffff81057439>] process_one_work+0x207/0x38d
[  848.646577]  [<ffffffff8105738c>] ? process_one_work+0x15a/0x38d
[  848.653681]  [<ffffffff810576f7>] worker_thread+0x138/0x21c
[  848.660305]  [<ffffffff810575bf>] ? process_one_work+0x38d/0x38d
[  848.667493]  [<ffffffff8105b098>] kthread+0x9d/0xa5
[  848.673382]  [<ffffffff8106e1bd>] ? trace_hardirqs_on_caller+0x12f/0x166
[  848.681304]  [<ffffffff814b7704>] kernel_thread_helper+0x4/0x10
[  848.688324]  [<ffffffff814af534>] ? retint_restore_args+0x13/0x13
[  848.695530]  [<ffffffff8105affb>] ? __init_kthread_worker+0x5b/0x5b
[  848.702929]  [<ffffffff814b7700>] ? gs_change+0x13/0x13
[  848.709155] Code: 00 00 48 8d 88 38 04 00 00 44 8b 80 84 02 00 00 31 c0 e8 cf 1b 24 00 41 83 c8 ff 44 8b 4b 08 48 c7 c1 e0 0b 7d 81 4d 85 e4 74 10 <45> 8b 84 24 84 02 00 00 49 8d 8c 24 38 04 00 00 8b 53 04 48 89
[  848.732467] RIP  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
[  848.738905]  RSP <ffff8807f868dca0>
[  848.743743] ---[ end trace 143161646eee8caa ]---

...so arrange for the ata_port to have the same end of life as the domain
device.

Reported-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_ata.c       |    4 ++++
 drivers/scsi/libsas/sas_discover.c  |    5 +++++
 drivers/scsi/libsas/sas_scsi_host.c |    3 ---
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 37a9e73..852b1b1 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -661,6 +661,10 @@ static void async_sas_ata_eh(void *data, async_cookie_t cookie)
 	struct ata_port *ap = dev->sata_dev.ap;
 	struct sas_ha_struct *ha = dev->port->ha;
 
+	/* hold a reference over eh since we may be racing with final
+	 * remove once all commands are completed
+	 */
+	kref_get(&dev->kref);
 	ata_port_printk(ap, KERN_DEBUG, "sas eh calling libata port error handler");
 	ata_scsi_port_error_handler(ha->core.shost, ap);
 }
diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index 3ef7741..9367101 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -242,6 +242,11 @@ void sas_free_device(struct kref *kref)
 	if (dev->dev_type == EDGE_DEV || dev->dev_type == FANOUT_DEV)
 		kfree(dev->ex_dev.ex_phy);
 
+	if (dev_is_sata(dev) && dev->sata_dev.ap) {
+		ata_sas_port_destroy(dev->sata_dev.ap);
+		dev->sata_dev.ap = NULL;
+	}
+
 	kfree(dev);
 }
 
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 731c892..b563ff2 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -1028,9 +1028,6 @@ void sas_target_destroy(struct scsi_target *starget)
 	if (!found_dev)
 		return;
 
-	if (dev_is_sata(found_dev))
-		ata_sas_port_destroy(found_dev->sata_dev.ap);
-
 	starget->hostdata = NULL;
 	sas_put_device(found_dev);
 }


  reply	other threads:[~2012-02-10  8:44 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
2012-02-10  8:44 ` Dan Williams [this message]
2012-02-10 18:32   ` [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race Jeff Garzik
2012-02-10 19:09   ` Dan Williams
2012-02-10  8:44 ` [PATCH v8 02/13] libsas: improve debug statements Dan Williams
2012-02-10  8:44 ` [PATCH v8 03/13] libsas: async ata scanning Dan Williams
2012-02-10 18:34   ` Jeff Garzik
2012-02-10  8:45 ` [PATCH v8 04/13] libsas: set attached device type and target protocols for local phys Dan Williams
2012-02-10  8:45 ` [PATCH v8 05/13] libsas: fixup target_port_protocols for expanders that don't report sata Dan Williams
2012-02-10  8:45 ` [PATCH v8 06/13] libsas: don't recover end devices attached to disabled phys Dan Williams
2012-02-10  8:45 ` [PATCH v8 07/13] scsi_transport_sas: 'enable' phys on reset Dan Williams
2012-02-10  8:45 ` [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter Dan Williams
2012-02-29 21:55   ` James Bottomley
2012-02-29 22:40     ` Douglas Gilbert
2012-02-29 23:27       ` Dan Williams
2012-03-01  0:23         ` Douglas Gilbert
2012-03-01  0:35           ` Dan Williams
2012-02-29 23:22     ` Dan Williams
2012-03-01 14:27       ` James Bottomley
2012-02-10  8:45 ` [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context Dan Williams
2012-02-29 22:05   ` James Bottomley
2012-03-01  0:28     ` Dan Williams
2012-03-01 14:29       ` James Bottomley
2012-03-06 19:17         ` Dan Williams
2012-02-10  8:45 ` [PATCH v8 10/13] libsas: add sas_eh_abort_handler Dan Williams
2012-02-10  8:45 ` [PATCH v8 11/13] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler Dan Williams
2012-02-10  8:45 ` [PATCH v8 12/13] isci: use sas eh strategy handlers Dan Williams
2012-02-10  8:45 ` [PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure Dan Williams
2012-02-15  1:09   ` Jack Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120210084444.25701.31139.stgit@dwillia2-linux.jf.intel.com \
    --to=dan.j.williams@intel.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=marcin.tomczak@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).