All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bart Van Assche <bvanassche@acm.org>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael Christie <michaelc@cs.wisc.edu>,
	Hannes Reinecke <hare@suse.de>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH v11 6/9] Make scsi_remove_host() wait until error handling finished
Date: Fri, 31 Jan 2014 08:52:16 +0100	[thread overview]
Message-ID: <52EB5630.2010401@acm.org> (raw)
In-Reply-To: <1391147936.2181.160.camel@dabdike.int.hansenpartnership.com>

On 01/31/14 06:58, James Bottomley wrote:
> On Thu, 2014-01-30 at 20:46 +0100, Bart Van Assche wrote:
>> On 06/25/13 18:13, Michael Christie wrote:
>> Sorry but I'm afraid that making the SCSI core invoke a callback
>> function from a device, target or host release function would create a
>> new challenge, namely making sure that all these callback functions have
>> finished before the module is unloaded that contains the SCSI host
>> template and the code implementing such a callback function. That
>> challenge is not specific to the SCSI infrastructure but is a general
>> question that has not yet been solved in the Linux kernel (see e.g.
>> "[PATCH] kobject: provide kobject_put_wait to fix module unload race"
>> for a more general discussion about how to ensure that kobject release
>> functions are invoked before the module is unloaded that owns the
>> release function - http://thread.gmane.org/gmane.linux.kernel/1622885).
> 
> For callbacks, that's easy: it's module_get/module_put

Adding a module_get() call in scsi_add_host() and a module_put() call in
scsi_host_dev_release() would cause a behavior change that would be
reported by end users as "system hangs during shutdown". Today it is
possible to unload a kernel module via rmmod that created one or more
SCSI hosts. Since user space does not remove SCSI hosts during system
shutdown, trying to unload a SCSI LLD kernel module that had created one
or more SCSI hosts would cause the module unload to fail because of a
non-zero module reference count.

>> In case it is not clear why I'm reviving this discussion: now that the
>> "improved eh timeout handler" patch is upstream (commit
>> e494f6a728394ab0df194342549ee20e6f0752df) there is an additional way in
>> which the SCSI core can invoke an EH function concurrently with or after
>> scsi_remove_host() has finished, namely from the TMF work queue
>> (tmf_work_q).
> 
> But the fundamental guarantee is that the eh thread for the host (the eh
> context if you will) has to be dead before the host can be removed and
> the module unloaded.  The thread doesn't die until all the work is done.

Maybe I'm misunderstanding something, but as far as I can see
kthread_stop(shost->ehandler) is invoked from scsi_host_dev_release().
That last function is called by the last scsi_host_put(). And LLD's
invoke scsi_host_put() after scsi_remove_host(). In other words, the
SCSI error handler thread and TMF work queue are still active when
scsi_remove_host() returns.

Should I repost the patch at the start of this thread ?

Thanks,

Bart.

  reply	other threads:[~2014-01-31  7:52 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-12 12:48 [PATCH v11 0/9] More device removal fixes Bart Van Assche
2013-06-12 12:49 ` [PATCH v11 1/9] Fix race between starved list and device removal Bart Van Assche
2013-06-24 15:38   ` James Bottomley
2013-06-24 16:16     ` Bart Van Assche
2013-06-24 16:23       ` James Bottomley
2013-06-24 17:24     ` Mike Christie
2013-06-24 17:49       ` James Bottomley
2013-06-12 12:51 ` [PATCH v11 2/9] Remove get_device() / put_device() pair from scsi_request_fn() Bart Van Assche
2013-06-24  1:29   ` Mike Christie
2013-06-24  2:36   ` James Bottomley
2013-06-24  7:13     ` Bart Van Assche
2013-06-24 13:34       ` James Bottomley
2013-06-24 15:43         ` Bart Van Assche
2013-06-12 12:52 ` [PATCH v11 3/9] Avoid calling __scsi_remove_device() twice Bart Van Assche
2013-06-23 21:35   ` Mike Christie
2013-06-24  6:29     ` Bart Van Assche
2013-06-24 17:38   ` James Bottomley
2013-06-25  8:37     ` Bart Van Assche
2013-06-25 13:44       ` James Bottomley
2013-06-25 15:23         ` Bart Van Assche
2013-06-12 12:53 ` [PATCH v11 4/9] Disallow changing the device state via sysfs into "deleted" Bart Van Assche
2013-06-24  1:05   ` Mike Christie
2013-06-24  6:35     ` Bart Van Assche
2013-06-24 17:59   ` James Bottomley
2013-06-25  8:41     ` Bart Van Assche
2013-06-25 13:42       ` James Bottomley
2013-06-12 12:54 ` [PATCH v11 5/9] Avoid saving/restoring interrupt state inside scsi_remove_host() Bart Van Assche
2013-06-24  1:06   ` Mike Christie
2013-06-12 12:55 ` [PATCH v11 6/9] Make scsi_remove_host() wait until error handling finished Bart Van Assche
2013-06-24  1:15   ` Mike Christie
2013-06-24  6:49     ` Bart Van Assche
2013-06-24 19:19   ` James Bottomley
2013-06-24 20:04     ` Mike Christie
2013-06-24 22:27       ` James Bottomley
2013-06-25  2:26         ` Mike Christie
2013-06-25  2:56           ` Michael Christie
2013-06-25  9:01         ` Bart Van Assche
2013-06-25 13:45           ` James Bottomley
2013-06-25 15:31             ` Bart Van Assche
2013-06-25 16:13               ` Michael Christie
2013-06-25 17:40                 ` James Bottomley
2013-06-25 17:47                   ` Bart Van Assche
2014-01-30 19:46                 ` Bart Van Assche
2014-01-31  5:58                   ` James Bottomley
2014-01-31  7:52                     ` Bart Van Assche [this message]
2013-06-25 11:13         ` Bart Van Assche
2013-06-12 12:56 ` PATCH v11 7/9] Avoid that scsi_device_set_state() triggers a race Bart Van Assche
2013-06-12 12:57 ` [PATCH v11 8/9] Save and restore host_scribble during error handling Bart Van Assche
2013-06-24  1:21   ` Mike Christie
2013-06-24  2:08     ` James Bottomley
2013-06-12 12:58 ` [PATCH v11 9/9] Avoid reenabling I/O after the transport became offline Bart Van Assche
  -- strict thread matches above, loose matches on Subject: below --
2013-06-24 10:17 RE:[PATCH v11 6/9] Make scsi_remove_host() wait until error handling finished Jack Wang
2013-06-24 10:53 ` [PATCH " Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52EB5630.2010401@acm.org \
    --to=bvanassche@acm.org \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=hare@suse.de \
    --cc=linux-scsi@vger.kernel.org \
    --cc=michaelc@cs.wisc.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.