From: Bart Van Assche <bvanassche@acm.org>
To: Hannes Reinecke <hare@suse.de>
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
James Bottomley <jbottomley@parallels.com>,
Mike Christie <michaelc@cs.wisc.edu>, Tejun Heo <tj@kernel.org>,
Chanho Min <chanho.min@lge.com>
Subject: Re: [PATCH v7 4/9] Remove offline devices when removing a host
Date: Fri, 07 Dec 2012 18:21:21 +0100
Message-ID: <50C22591.7020906@acm.org>
In-Reply-To: <50C20C3F.6020003@acm.org>

On 12/07/12 16:33, Bart Van Assche wrote:
> On 12/07/12 16:10, Hannes Reinecke wrote:
>> On 12/06/2012 04:55 PM, Bart Van Assche wrote:
>>> Currently __scsi_remove_device() skips devices that are visible and
>>> offline. Make sure that these devices get removed by changing their
>>> device state into SDEV_DEL at the start of __scsi_remove_device().
>>> Also, avoid that __scsi_remove_device() gets called a second time
>>> for devices that are in state SDEV_CANCEL when scsi_forget_host()
>>> is invoked.
>>>
>>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>>> Cc: James Bottomley <JBottomley@Parallels.com>
>>> Cc: Mike Christie <michaelc@cs.wisc.edu>
>>> Cc: Hannes Reinecke <hare@suse.de>
>>> Cc: Tejun Heo <tj@kernel.org>
>>> ---
>>> drivers/scsi/scsi_scan.c | 2 +-
>>> drivers/scsi/scsi_sysfs.c | 4 ++--
>>> 2 files changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>>> index 3e58b22..0612fba 100644
>>> --- a/drivers/scsi/scsi_scan.c
>>> +++ b/drivers/scsi/scsi_scan.c
>>> @@ -1889,7 +1889,7 @@ void scsi_forget_host(struct Scsi_Host *shost)
>>> restart:
>>>  	spin_lock_irqsave(shost->host_lock, flags);
>>>  	list_for_each_entry(sdev, &shost->__devices, siblings) {
>>> -		if (sdev->sdev_state == SDEV_DEL)
>>> +		if (scsi_device_being_removed(sdev))
>>>  			continue;
>>>  		spin_unlock_irqrestore(shost->host_lock, flags);
>>>  		__scsi_remove_device(sdev);
>>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>>> index 2ff7ba5..4348f12 100644
>>> --- a/drivers/scsi/scsi_sysfs.c
>>> +++ b/drivers/scsi/scsi_sysfs.c
>>> @@ -959,8 +959,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
>>>  	unsigned long flags;
>>>
>>>  	if (sdev->is_visible) {
>>> -		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
>>> -			return;
>>> +		WARN_ON_ONCE(scsi_device_set_state(sdev, SDEV_CANCEL) != 0 &&
>>> +			     scsi_device_set_state(sdev, SDEV_DEL) != 0);
>>>
>>>  		bsg_unregister_queue(sdev->request_queue);
>>>  		device_unregister(&sdev->sdev_dev);
>>>
>> Hmm. Then we would be getting a warning if the device is already in
>> SDEV_DEL, wouldn't we?
>> And what about offlined devices?
>> We should be safe to remove them, or?
>
> Hello Hannes,
>
> The intent of this patch is that __scsi_remove_device() gets invoked
> exactly once per device. This function shouldn't be invoked for devices
> already in state SDEV_DEL.
>
> Offlined devices will be transitioned directly from one of the two
> offline states into state SDEV_DEL.
>
> The above patch fixes a nasty crash by avoiding that a second
> __scsi_remove_device() call queues I/O (sd_shutdown()) after
> scsi_remove_host() has already finished.

(replying to my own e-mail)

Please ignore the above comment about sd_shutdown(); it didn't make
sense. What I would like to add is that only after I included the above
patch in my tests could the following two call stacks no longer be
triggered:

BUG: spinlock bad magic on CPU#0, kworker/0:1H/178
lock: 0xffff880177880c28, .magic: ffff8801, .owner: <none>/-1,
.owner_cpu: 2006506176
Pid: 178, comm: kworker/0:1H Tainted: G W O 3.7.0-rc7-debug+ #2
Call Trace:
[<ffffffff814120ef>] spin_dump+0x8c/0x91
[<ffffffff81412115>] spin_bug+0x21/0x26
[<ffffffff81218aef>] do_raw_spin_lock+0x13f/0x150
[<ffffffff81417bb8>] _raw_spin_lock_irqsave+0x78/0xa0
[<ffffffffa0766c6c>] srp_queuecommand+0x3c/0xc80 [ib_srp]
[<ffffffffa0002f18>] scsi_dispatch_cmd+0x148/0x310 [scsi_mod]
[<ffffffffa000a390>] scsi_request_fn+0x320/0x520 [scsi_mod]
[<ffffffff811ec427>] __blk_run_queue+0x37/0x50
[<ffffffff811ec539>] blk_delay_work+0x29/0x40
[<ffffffff81059283>] process_one_work+0x1c3/0x5c0
[<ffffffff8105b1be>] worker_thread+0x15e/0x440
[<ffffffff8106137b>] kthread+0xdb/0xe0
[<ffffffff81420d5c>] ret_from_fork+0x7c/0xb0
------------[ cut here ]------------
BUG: spinlock bad magic on CPU#1, udevd/1518
lock: 0xffff8801a2384c28, .magic: ffff8801, .owner: <none>/-1,
.owner_cpu: -1519491200
Pid: 1518, comm: udevd Not tainted 3.7.0-rc8-debug+ #2
Call Trace:
[<ffffffff81411a9d>] spin_dump+0x8c/0x91
[<ffffffff81411ac3>] spin_bug+0x21/0x26
[<ffffffff812184ff>] do_raw_spin_lock+0x13f/0x150
[<ffffffff81417568>] _raw_spin_lock_irqsave+0x78/0xa0
[<ffffffffa04a0d1c>] srp_queuecommand+0x3c/0xc80 [ib_srp]
[<ffffffffa0002f18>] scsi_dispatch_cmd+0x148/0x310 [scsi_mod]
[<ffffffffa000a6cc>] scsi_request_fn+0x46c/0x570 [scsi_mod]
[<ffffffff811ebe26>] __blk_run_queue+0x46/0x60
[<ffffffff811ebe7e>] queue_unplugged+0x3e/0xd0
[<ffffffff811ee9c3>] blk_flush_plug_list+0x1c3/0x240
[<ffffffff811eea58>] blk_finish_plug+0x18/0x50
[<ffffffff8110511c>] __do_page_cache_readahead+0x24c/0x2e0
[<ffffffff811052e9>] force_page_cache_readahead+0x79/0xb0
[<ffffffff8110573b>] page_cache_sync_readahead+0x4b/0x50
[<ffffffff810fad30>] generic_file_aio_read+0x590/0x710
[<ffffffff8114b127>] do_sync_read+0xa7/0xe0
[<ffffffff8114b878>] vfs_read+0xa8/0x170
[<ffffffff8114b995>] sys_read+0x55/0xa0
[<ffffffff81420782>] system_call_fastpath+0x16/0x1b
------------[ cut here ]------------
Bart.
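
The state handling discussed in this message can be illustrated with a
small userspace model. This is only a sketch, not kernel code: the enum,
the model_* helper names and the reduced transition table are
assumptions made for illustration. The table encodes just what the
thread states (an offline device cannot enter the cancel state but can
go directly to the deleted state), and model_being_removed() assumes
that scsi_device_being_removed() from patch 3/9 is true for both the
cancel and the deleted state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a subset of the SDEV_* states. */
enum model_state {
	M_CREATED, M_RUNNING, M_OFFLINE, M_TRANSPORT_OFFLINE, M_CANCEL, M_DEL
};

/*
 * Assumed, reduced transition table. Returns 0 if the transition is
 * allowed (and applies it), -1 if it is refused.
 */
static int model_set_state(enum model_state *cur, enum model_state next)
{
	bool ok = false;

	assert(cur != NULL);
	switch (next) {
	case M_CANCEL:
		/* Offline devices are refused here. */
		ok = (*cur == M_CREATED || *cur == M_RUNNING);
		break;
	case M_DEL:
		/* Offline devices may go directly to "deleted". */
		ok = (*cur == M_CREATED || *cur == M_RUNNING ||
		      *cur == M_OFFLINE || *cur == M_TRANSPORT_OFFLINE ||
		      *cur == M_CANCEL);
		break;
	default:
		break;
	}
	if (!ok)
		return -1;
	*cur = next;
	return 0;
}

/* Assumed meaning of scsi_device_being_removed(): cancel or deleted. */
static bool model_being_removed(enum model_state s)
{
	return s == M_CANCEL || s == M_DEL;
}

/* Behaviour before the patch: bail out if "cancel" is refused.
 * Returns true if the removal was carried out. */
static bool model_remove_old(enum model_state *s)
{
	if (model_set_state(s, M_CANCEL) != 0)
		return false;		/* offline device is skipped */
	model_set_state(s, M_DEL);	/* teardown would happen before this */
	return true;
}

/* Behaviour with the patch: fall back to "deleted" and only warn if
 * both transitions are refused. Returns true if a warning would fire. */
static bool model_remove_new(enum model_state *s)
{
	bool warned = model_set_state(s, M_CANCEL) != 0 &&
		      model_set_state(s, M_DEL) != 0;

	if (*s != M_DEL)
		model_set_state(s, M_DEL);
	return warned;
}
```

In this model an offline device is left untouched by model_remove_old(),
while model_remove_new() moves it to the deleted state without a
warning. Calling model_remove_new() a second time on the same device
does report a warning, which is why the scsi_forget_host() loop has to
skip devices for which model_being_removed() is true.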