linux-scsi.vger.kernel.org archive mirror
From: Bart Van Assche <bvanassche@acm.org>
To: Michael Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>,
	Hannes Reinecke <hare@suse.de>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH v11 6/9] Make scsi_remove_host() wait until error handling finished
Date: Thu, 30 Jan 2014 20:46:40 +0100
Message-ID: <52EAAC20.2040607@acm.org>
In-Reply-To: <FBC61EDC-F703-4E3F-9980-70D41EE9484F@cs.wisc.edu>

On 06/25/13 18:13, Michael Christie wrote:
> On Jun 25, 2013, at 10:31 AM, Bart Van Assche <bvanassche@acm.org> wrote:
>> On 06/25/13 15:45, James Bottomley wrote:
>>> On Tue, 2013-06-25 at 11:01 +0200, Bart Van Assche wrote:
>>>> There is a difference though between moving the EH kthread_stop() call
>>>> and the patch at the start of this thread: moving the EH kthread_stop()
>>>> call does not prevent an ioctl like SG_SCSI_RESET from triggering an eh_*
>>>> callback after scsi_remove_host() has finished. However, the
>>>> scsi_begin_eh() / scsi_end_eh() functions do prevent an ioctl from
>>>> causing an eh_* callback to be invoked after scsi_remove_device() has
>>>> finished.
>>>
>>> OK, but this doesn't tell me what you're trying to achieve.
>>>
>>> An eh function is allowable as long as the host hasn't had the release
>>> callback executed.  That means you must have a reference to the
>>> device/host to execute the eh function, which is currently guaranteed
>>> for all invocations.
>>
>> That raises a new question: how is an LLD expected to clean up resources
>> without triggering a race condition? What you wrote means that it's not
>> safe for an LLD to start cleaning up the resources needed by the eh_*
>> callbacks immediately after scsi_remove_device() returns since it is not
>> guaranteed that at that time all references to the device have already
>> been dropped.
>
> A callback in the device/target/host (whatever is needed) release function
> would do this, right? If I understand James correctly, I think he suggested
> something like this in another mail.

(replying to an e-mail of seven months ago - see also
http://thread.gmane.org/gmane.linux.scsi/82572/focus=82822)

Hello Mike,

Sorry, but I'm afraid that having the SCSI core invoke a callback
function from a device, target or host release function would create a
new challenge: making sure that all these callback functions have
finished before the module that contains the SCSI host template and the
callback implementation is unloaded. That challenge is not specific to
the SCSI infrastructure; it is a general problem that has not yet been
solved in the Linux kernel. See e.g. "[PATCH] kobject: provide
kobject_put_wait to fix module unload race"
(http://thread.gmane.org/gmane.linux.kernel/1622885) for a more general
discussion about how to ensure that kobject release functions have been
invoked before the module that owns the release function is unloaded.

Or maybe this just means that I misunderstood you?
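
The module unload race described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: struct
model_obj and all model_* functions are hypothetical names made up for this
example, pthreads stand in for kernel primitives, and nothing here is taken
from the kobject_put_wait patch or from the SCSI core. The point it shows is
that the last put runs code owned by the "module", so an unload path has to
wait until that release callback has returned.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted kernel object. */
struct model_obj {
	pthread_mutex_t lock;
	pthread_cond_t  release_done;	/* signalled after release() returned */
	int             refcount;
	bool            released;
	void          (*release)(struct model_obj *obj);
};

static void model_obj_init(struct model_obj *obj,
			   void (*release)(struct model_obj *obj))
{
	pthread_mutex_init(&obj->lock, NULL);
	pthread_cond_init(&obj->release_done, NULL);
	obj->refcount = 1;		/* the creator holds the first reference */
	obj->released = false;
	obj->release = release;
}

static void model_obj_get(struct model_obj *obj)
{
	pthread_mutex_lock(&obj->lock);
	assert(obj->refcount > 0);
	obj->refcount++;
	pthread_mutex_unlock(&obj->lock);
}

/* Drop one reference; the last put runs the release callback. */
static void model_obj_put(struct model_obj *obj)
{
	bool last;

	pthread_mutex_lock(&obj->lock);
	assert(obj->refcount > 0);
	last = (--obj->refcount == 0);
	pthread_mutex_unlock(&obj->lock);
	if (!last)
		return;

	obj->release(obj);		/* code owned by the "module" runs here */

	/* In this model the storage of obj outlives the release callback. */
	pthread_mutex_lock(&obj->lock);
	obj->released = true;
	pthread_cond_broadcast(&obj->release_done);
	pthread_mutex_unlock(&obj->lock);
}

/* Drop the caller's reference, then block until release() has returned. */
static void model_obj_put_wait(struct model_obj *obj)
{
	model_obj_put(obj);
	pthread_mutex_lock(&obj->lock);
	while (!obj->released)
		pthread_cond_wait(&obj->release_done, &obj->lock);
	pthread_mutex_unlock(&obj->lock);
}

/* Example release callback: only counts how often it ran. */
static int model_release_runs;

static void model_release(struct model_obj *obj)
{
	(void)obj;
	model_release_runs++;
}
```

In this model an "unload" path would call model_obj_put_wait() instead of
model_obj_put(), so that it cannot proceed while another holder's final put
is still inside model_release().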

In case it is not clear why I'm reviving this discussion: now that the
"improved eh timeout handler" patch is upstream (commit
e494f6a728394ab0df194342549ee20e6f0752df), there is an additional way in
which the SCSI core can invoke an EH function concurrently with
scsi_remove_host() or after it has finished, namely from the TMF work
queue (tmf_work_q).
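
As an illustration of the begin/end guard idea quoted earlier in this
message, here is a minimal userspace sketch. It assumes that every path
that may invoke an EH callback (EH thread, ioctl, timeout work) goes
through the same guard. struct model_host and the model_* functions are
hypothetical names; this is not the code of scsi_begin_eh() /
scsi_end_eh() from the patch, only a model of the general technique using
pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of a host; not the kernel's struct Scsi_Host. */
struct model_host {
	pthread_mutex_t lock;
	pthread_cond_t  eh_idle;	/* signalled when eh_active drops to zero */
	int             eh_active;	/* EH callbacks currently running */
	int             eh_admitted;	/* EH callbacks that were allowed to start */
	bool            removing;	/* set once removal has started */
};

static void model_host_init(struct model_host *h)
{
	pthread_mutex_init(&h->lock, NULL);
	pthread_cond_init(&h->eh_idle, NULL);
	h->eh_active = 0;
	h->eh_admitted = 0;
	h->removing = false;
}

/* Returns true if the caller may invoke an EH callback now. */
static bool model_begin_eh(struct model_host *h)
{
	bool ok;

	pthread_mutex_lock(&h->lock);
	ok = !h->removing;
	if (ok) {
		h->eh_active++;
		h->eh_admitted++;
	}
	pthread_mutex_unlock(&h->lock);
	return ok;
}

/* Must be called once for every successful model_begin_eh(). */
static void model_end_eh(struct model_host *h)
{
	pthread_mutex_lock(&h->lock);
	assert(h->eh_active > 0);
	if (--h->eh_active == 0)
		pthread_cond_broadcast(&h->eh_idle);
	pthread_mutex_unlock(&h->lock);
}

/* Any EH entry point would be wrapped like this. */
static bool model_try_eh(struct model_host *h)
{
	if (!model_begin_eh(h))
		return false;		/* host is going away: skip the callback */
	/* ... the eh_* callback of the LLD would run here ... */
	model_end_eh(h);
	return true;
}

/* Refuse new EH callbacks, then wait until the running ones have finished. */
static void model_remove_host(struct model_host *h)
{
	pthread_mutex_lock(&h->lock);
	h->removing = true;
	while (h->eh_active > 0)
		pthread_cond_wait(&h->eh_idle, &h->lock);
	pthread_mutex_unlock(&h->lock);
}
```

Once model_remove_host() has returned in this model, no EH callback is
running and none can start, so the caller could release the resources those
callbacks need.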

Thanks,

Bart.


Thread overview: 52+ messages
2013-06-12 12:48 [PATCH v11 0/9] More device removal fixes Bart Van Assche
2013-06-12 12:49 ` [PATCH v11 1/9] Fix race between starved list and device removal Bart Van Assche
2013-06-24 15:38   ` James Bottomley
2013-06-24 16:16     ` Bart Van Assche
2013-06-24 16:23       ` James Bottomley
2013-06-24 17:24     ` Mike Christie
2013-06-24 17:49       ` James Bottomley
2013-06-12 12:51 ` [PATCH v11 2/9] Remove get_device() / put_device() pair from scsi_request_fn() Bart Van Assche
2013-06-24  1:29   ` Mike Christie
2013-06-24  2:36   ` James Bottomley
2013-06-24  7:13     ` Bart Van Assche
2013-06-24 13:34       ` James Bottomley
2013-06-24 15:43         ` Bart Van Assche
2013-06-12 12:52 ` [PATCH v11 3/9] Avoid calling __scsi_remove_device() twice Bart Van Assche
2013-06-23 21:35   ` Mike Christie
2013-06-24  6:29     ` Bart Van Assche
2013-06-24 17:38   ` James Bottomley
2013-06-25  8:37     ` Bart Van Assche
2013-06-25 13:44       ` James Bottomley
2013-06-25 15:23         ` Bart Van Assche
2013-06-12 12:53 ` [PATCH v11 4/9] Disallow changing the device state via sysfs into "deleted" Bart Van Assche
2013-06-24  1:05   ` Mike Christie
2013-06-24  6:35     ` Bart Van Assche
2013-06-24 17:59   ` James Bottomley
2013-06-25  8:41     ` Bart Van Assche
2013-06-25 13:42       ` James Bottomley
2013-06-12 12:54 ` [PATCH v11 5/9] Avoid saving/restoring interrupt state inside scsi_remove_host() Bart Van Assche
2013-06-24  1:06   ` Mike Christie
2013-06-12 12:55 ` [PATCH v11 6/9] Make scsi_remove_host() wait until error handling finished Bart Van Assche
2013-06-24  1:15   ` Mike Christie
2013-06-24  6:49     ` Bart Van Assche
2013-06-24 19:19   ` James Bottomley
2013-06-24 20:04     ` Mike Christie
2013-06-24 22:27       ` James Bottomley
2013-06-25  2:26         ` Mike Christie
2013-06-25  2:56           ` Michael Christie
2013-06-25  9:01         ` Bart Van Assche
2013-06-25 13:45           ` James Bottomley
2013-06-25 15:31             ` Bart Van Assche
2013-06-25 16:13               ` Michael Christie
2013-06-25 17:40                 ` James Bottomley
2013-06-25 17:47                   ` Bart Van Assche
2014-01-30 19:46                 ` Bart Van Assche [this message]
2014-01-31  5:58                   ` James Bottomley
2014-01-31  7:52                     ` Bart Van Assche
2013-06-25 11:13         ` Bart Van Assche
2013-06-12 12:56 ` [PATCH v11 7/9] Avoid that scsi_device_set_state() triggers a race Bart Van Assche
2013-06-12 12:57 ` [PATCH v11 8/9] Save and restore host_scribble during error handling Bart Van Assche
2013-06-24  1:21   ` Mike Christie
2013-06-24  2:08     ` James Bottomley
2013-06-12 12:58 ` [PATCH v11 9/9] Avoid reenabling I/O after the transport became offline Bart Van Assche
  -- strict thread matches above, loose matches on Subject: below --
2013-06-24 10:17 RE:[PATCH v11 6/9] Make scsi_remove_host() wait until error handling finished Jack Wang
2013-06-24 10:53 ` [PATCH " Bart Van Assche
