linux-scsi.vger.kernel.org archive mirror
From: Bart Van Assche <bvanassche@acm.org>
To: Hannes Reinecke <hare@suse.de>
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
	James Bottomley <jbottomley@parallels.com>,
	Mike Christie <michaelc@cs.wisc.edu>, Tejun Heo <tj@kernel.org>,
	Chanho Min <chanho.min@lge.com>
Subject: Re: [PATCH v7 4/9] Remove offline devices when removing a host
Date: Fri, 07 Dec 2012 18:21:21 +0100
Message-ID: <50C22591.7020906@acm.org>
In-Reply-To: <50C20C3F.6020003@acm.org>

On 12/07/12 16:33, Bart Van Assche wrote:
> On 12/07/12 16:10, Hannes Reinecke wrote:
>> On 12/06/2012 04:55 PM, Bart Van Assche wrote:
>>> Currently __scsi_remove_device() skips devices that are visible and
>>> offline. Make sure that these devices get removed by changing their
>>> device state into SDEV_DEL at the start of __scsi_remove_device().
>>> Also, avoid that __scsi_remove_device() gets called a second time
>>> for devices that are in state SDEV_CANCEL when scsi_forget_host()
>>> is invoked.
>>>
>>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>> Cc: James Bottomley <JBottomley@Parallels.com>
>>> Cc: Mike Christie <michaelc@cs.wisc.edu>
>>> Cc: Hannes Reinecke <hare@suse.de>
>>> Cc: Tejun Heo <tj@kernel.org>
>>> ---
>>>   drivers/scsi/scsi_scan.c  |    2 +-
>>>   drivers/scsi/scsi_sysfs.c |    4 ++--
>>>   2 files changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>>> index 3e58b22..0612fba 100644
>>> --- a/drivers/scsi/scsi_scan.c
>>> +++ b/drivers/scsi/scsi_scan.c
>>> @@ -1889,7 +1889,7 @@ void scsi_forget_host(struct Scsi_Host *shost)
>>>    restart:
>>>       spin_lock_irqsave(shost->host_lock, flags);
>>>       list_for_each_entry(sdev, &shost->__devices, siblings) {
>>> -        if (sdev->sdev_state == SDEV_DEL)
>>> +        if (scsi_device_being_removed(sdev))
>>>               continue;
>>>           spin_unlock_irqrestore(shost->host_lock, flags);
>>>           __scsi_remove_device(sdev);
>>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>>> index 2ff7ba5..4348f12 100644
>>> --- a/drivers/scsi/scsi_sysfs.c
>>> +++ b/drivers/scsi/scsi_sysfs.c
>>> @@ -959,8 +959,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
>>>       unsigned long flags;
>>>
>>>       if (sdev->is_visible) {
>>> -        if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
>>> -            return;
>>> +        WARN_ON_ONCE(scsi_device_set_state(sdev, SDEV_CANCEL) != 0 &&
>>> +                 scsi_device_set_state(sdev, SDEV_DEL) != 0);
>>>
>>>           bsg_unregister_queue(sdev->request_queue);
>>>           device_unregister(&sdev->sdev_dev);
>>>
>> Hmm. Then we would be getting a warning if the device is already in
>> SDEV_DEL, wouldn't we?
>> And what about offlined devices?
>> We should be safe to remove them, or?
>
> Hello Hannes,
>
> The intent of this patch is that __scsi_remove_device() gets invoked
> exactly once per device. This function shouldn't be invoked for devices
> already in state SDEV_DEL.
>
> Offlined devices will be transitioned directly from one of the two
> offline states into state SDEV_DEL.
>
> The above patch fixes a nasty crash by avoiding that a second
> __scsi_remove_device() call queues I/O (sd_shutdown()) after
> scsi_remove_host() has already finished.

(replying to my own e-mail)

Please ignore the above comment about sd_shutdown(); it didn't make
sense. What I would like to add is that only after I included the above
patch in my tests could the following two call stacks no longer be
triggered:

BUG: spinlock bad magic on CPU#0, kworker/0:1H/178
  lock: 0xffff880177880c28, .magic: ffff8801, .owner: <none>/-1, .owner_cpu: 2006506176
Pid: 178, comm: kworker/0:1H Tainted: G        W  O 3.7.0-rc7-debug+ #2
Call Trace:
  [<ffffffff814120ef>] spin_dump+0x8c/0x91
  [<ffffffff81412115>] spin_bug+0x21/0x26
  [<ffffffff81218aef>] do_raw_spin_lock+0x13f/0x150
  [<ffffffff81417bb8>] _raw_spin_lock_irqsave+0x78/0xa0
  [<ffffffffa0766c6c>] srp_queuecommand+0x3c/0xc80 [ib_srp]
  [<ffffffffa0002f18>] scsi_dispatch_cmd+0x148/0x310 [scsi_mod]
  [<ffffffffa000a390>] scsi_request_fn+0x320/0x520 [scsi_mod]
  [<ffffffff811ec427>] __blk_run_queue+0x37/0x50
  [<ffffffff811ec539>] blk_delay_work+0x29/0x40
  [<ffffffff81059283>] process_one_work+0x1c3/0x5c0
  [<ffffffff8105b1be>] worker_thread+0x15e/0x440
  [<ffffffff8106137b>] kthread+0xdb/0xe0
  [<ffffffff81420d5c>] ret_from_fork+0x7c/0xb0
------------[ cut here ]------------

BUG: spinlock bad magic on CPU#1, udevd/1518
  lock: 0xffff8801a2384c28, .magic: ffff8801, .owner: <none>/-1, .owner_cpu: -1519491200
Pid: 1518, comm: udevd Not tainted 3.7.0-rc8-debug+ #2
Call Trace:
  [<ffffffff81411a9d>] spin_dump+0x8c/0x91
  [<ffffffff81411ac3>] spin_bug+0x21/0x26
  [<ffffffff812184ff>] do_raw_spin_lock+0x13f/0x150
  [<ffffffff81417568>] _raw_spin_lock_irqsave+0x78/0xa0
  [<ffffffffa04a0d1c>] srp_queuecommand+0x3c/0xc80 [ib_srp]
  [<ffffffffa0002f18>] scsi_dispatch_cmd+0x148/0x310 [scsi_mod]
  [<ffffffffa000a6cc>] scsi_request_fn+0x46c/0x570 [scsi_mod]
  [<ffffffff811ebe26>] __blk_run_queue+0x46/0x60
  [<ffffffff811ebe7e>] queue_unplugged+0x3e/0xd0
  [<ffffffff811ee9c3>] blk_flush_plug_list+0x1c3/0x240
  [<ffffffff811eea58>] blk_finish_plug+0x18/0x50
  [<ffffffff8110511c>] __do_page_cache_readahead+0x24c/0x2e0
  [<ffffffff811052e9>] force_page_cache_readahead+0x79/0xb0
  [<ffffffff8110573b>] page_cache_sync_readahead+0x4b/0x50
  [<ffffffff810fad30>] generic_file_aio_read+0x590/0x710
  [<ffffffff8114b127>] do_sync_read+0xa7/0xe0
  [<ffffffff8114b878>] vfs_read+0xa8/0x170
  [<ffffffff8114b995>] sys_read+0x55/0xa0
  [<ffffffff81420782>] system_call_fastpath+0x16/0x1b
------------[ cut here ]------------
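
For readers following the state handling discussed in the quoted patch, here
is a minimal user-space sketch of the idea, not kernel code. Everything
prefixed toy_ is hypothetical. The transition table is an assumption drawn
only from the description in this thread (the CANCEL transition is refused
for offline devices, while the DEL transition is accepted for them); the
real rules live in scsi_device_set_state(). The sketch also assumes that
scsi_device_being_removed() from patch 3/9 is true for both SDEV_CANCEL and
SDEV_DEL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy states; the names echo the kernel's SDEV_* states but are hypothetical. */
enum toy_state {
	TOY_RUNNING,
	TOY_OFFLINE,
	TOY_TRANSPORT_OFFLINE,
	TOY_CANCEL,
	TOY_DEL,
};

struct toy_dev {
	enum toy_state state;
	int removals;	/* how often toy_remove() ran for this device */
};

/*
 * ASSUMPTION: CANCEL is only reachable from RUNNING, and DEL is reachable
 * from both offline states and from CANCEL. Returns 0 on success, -1 if
 * the transition is refused.
 */
static int toy_set_state(struct toy_dev *d, enum toy_state next)
{
	bool ok = false;

	assert(d != NULL);
	switch (next) {
	case TOY_CANCEL:
		ok = d->state == TOY_RUNNING;
		break;
	case TOY_DEL:
		ok = d->state == TOY_OFFLINE ||
		     d->state == TOY_TRANSPORT_OFFLINE ||
		     d->state == TOY_CANCEL;
		break;
	default:
		break;
	}
	if (!ok)
		return -1;
	d->state = next;
	return 0;
}

/* ASSUMPTION: mirrors what scsi_device_being_removed() is taken to check. */
static bool toy_being_removed(const struct toy_dev *d)
{
	assert(d != NULL);
	return d->state == TOY_CANCEL || d->state == TOY_DEL;
}

/*
 * Try CANCEL first, fall back to DEL, and warn only if both are refused.
 * Returns true if the warning fired.
 */
static bool toy_remove(struct toy_dev *d)
{
	bool warned = toy_set_state(d, TOY_CANCEL) != 0 &&
		      toy_set_state(d, TOY_DEL) != 0;

	if (warned)
		fprintf(stderr, "warning: device was already being removed\n");
	d->removals++;
	/* Teardown would happen here; a cancelled device then ends up deleted. */
	if (d->state == TOY_CANCEL)
		toy_set_state(d, TOY_DEL);
	return warned;
}

/* Skip devices already in CANCEL or DEL so each device is removed once. */
static void toy_forget_host(struct toy_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (toy_being_removed(&devs[i]))
			continue;
		toy_remove(&devs[i]);
	}
}
```

Under these assumptions, calling toy_forget_host() on an array holding one
running, one offline and one already-cancelled device removes the first two
exactly once and leaves the third alone. Calling toy_remove() directly on a
device that is already in TOY_DEL makes the warning fire, which is the case
Hannes asked about and the reason the caller has to skip such devices.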

Bart.
