linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org,
	Dariusz Majchrzak <dariusz.majchrzak@intel.com>
Subject: Re: [PATCH 12/12] scsi_transport_sas: fix delete vs scan race
Date: Sun, 20 May 2012 12:20:06 -0700	[thread overview]
Message-ID: <CAA9_cmeL5h_5xESis06pyT-7bt+K2eQrN5SR6_b25qLBSDVXvA@mail.gmail.com> (raw)
In-Reply-To: <CAA9_cmcCQtyRBEt-c8EP6wuSukqZn0Mswxi3nDG7R-88L57BjA@mail.gmail.com>

On Sat, May 5, 2012 at 2:52 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Sun, Apr 22, 2012 at 10:15 AM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
>> Async scan here means any scan in a different thread, right ... it just
>> has to be asynchronous relative to us?  So that includes the manually
>> initiated ones and hotplug ones, doesn't it?
>
> [ resend since I notice this never hit the lists ]
>
> Hmm, well no I don't think so.  This literally means the initial async
> scan, and the
> failure window is between when we skip the call to
> scsi_sysfs_add_sdev() (in scsi_add_lun() under the scan_mutex) and
> finally call scsi_sysfs_add_sdev() again via scsi_finish_async_scan().
> I don't see how that fixes it because when we fail the sequence goes:
>
> mutex_lock(scan_mutex)
> starget->parent = end_device;
> scsi_add_lun()
> mutex_unlock(scan_mutex)
>
> device_del(end_device)
>
> mutex_lock(scan_mutex)
> device_add(starget)
> <crash>
>
> As far as I can see taking the scan_mutex in sas_rphy_remove() does
> not change this failure window.  Unless I missed something?
>
> I am going to re-submit this patch as is with the proposed libsas batch for 3.5.

It turns out this patch can cause a deadlock in the scenario where we
have two hosts scanning and the "previous" host (according to the
async scan queue), experiences a device removal event.  I think the
following should be all we need:

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 01b0374..8906557 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1714,6 +1714,9 @@ static void scsi_sysfs_add_devices(struct
Scsi_Host *shost)
 {
        struct scsi_device *sdev;
        shost_for_each_device(sdev, shost) {
+               /* target removed before the device could be added */
+               if (sdev->sdev_state == SDEV_DEL)
+                       continue;
                if (!scsi_host_scan_allowed(shost) ||
                    scsi_sysfs_add_sdev(sdev) != 0)
                        __scsi_remove_device(sdev);

...since starget removal will mark the sdevs as deleted under
scan_mutex.  scsi_sysfs_add_devices can simply ignore deleted devices.
 I'll post this patch after Darek has a chance to try it out.

--
Dan

  reply	other threads:[~2012-05-20 19:20 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-13 23:36 [GIT PATCH 00/12] libsas fixes for 3.4 Dan Williams
2012-04-13 23:36 ` [PATCH 01/12] libsas: introduce sas_work to fix sas_drain_work vs sas_queue_work Dan Williams
2012-04-13 23:37 ` [PATCH 02/12] libsas: cleanup spurious calls to scsi_schedule_eh Dan Williams
2012-04-13 23:37 ` [PATCH 03/12] libata, libsas: introduce sched_eh and end_eh port ops Dan Williams
2012-04-21  6:19   ` Jeff Garzik
2012-04-22 17:30   ` James Bottomley
2012-04-23  2:33     ` Jeff Garzik
2012-04-23  8:10       ` James Bottomley
2012-04-23 19:13         ` Dan Williams
2012-04-23 22:22           ` James Bottomley
2012-04-23 22:49             ` Dan Williams
2012-04-24 10:11               ` Jacek Danecki
2012-04-23 19:41     ` Dan Williams
2012-04-26 17:21       ` Dan Williams
2012-04-13 23:37 ` [PATCH 04/12] libsas: fix sas_find_bcast_phy() in the presence of 'vacant' phys Dan Williams
2012-04-13 23:37 ` [PATCH 05/12] libsas: fix sas_get_port_device regression Dan Williams
2012-04-13 23:37 ` [PATCH 06/12] libsas: unify domain_device sas_rphy lifetimes Dan Williams
2012-04-13 23:37 ` [PATCH 07/12] libsas: fix ata_eh clobbering ex_phys via smp_ata_check_ready Dan Williams
2012-04-13 23:37 ` [PATCH 08/12] libata: make ata_print_id atomic Dan Williams
2012-04-13 23:37 ` [PATCH 09/12] libsas, libata: fix start of life for a sas ata_port Dan Williams
2012-04-21  6:20   ` Jeff Garzik
2012-04-13 23:37 ` [PATCH 10/12] scsi: fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations) Dan Williams
2012-04-21 12:22   ` James Bottomley
2012-04-22 15:24     ` Dan Williams
2012-04-13 23:37 ` [PATCH 11/12] libsas: fix false positive 'device attached' conditions Dan Williams
2012-04-22 10:53   ` James Bottomley
2012-04-22 15:56     ` Dan Williams
2012-04-13 23:37 ` [PATCH 12/12] scsi_transport_sas: fix delete vs scan race Dan Williams
2012-04-22 10:38   ` James Bottomley
2012-04-22 15:43     ` Dan Williams
2012-04-22 17:15       ` James Bottomley
2012-05-05 21:52         ` Dan Williams
2012-05-20 19:20           ` Dan Williams [this message]
2012-04-14  8:19 ` [GIT PATCH 00/12] libsas fixes for 3.4 jack_wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAA9_cmeL5h_5xESis06pyT-7bt+K2eQrN5SR6_b25qLBSDVXvA@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=dariusz.majchrzak@intel.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).