All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hannes Reinecke <hare@suse.de>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Nic Bellinger <nab@daterainc.com>,
	target-devel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [PATCH 09/11] target_core_alua: Use workqueue for ALUA transitioning
Date: Thu, 17 Oct 2013 08:52:26 +0200	[thread overview]
Message-ID: <525F892A.9060309@suse.de> (raw)
In-Reply-To: <1381961177.19256.656.camel@haakon3.risingtidesystems.com>

On 10/17/2013 12:06 AM, Nicholas A. Bellinger wrote:
> On Wed, 2013-10-16 at 09:20 +0200, Hannes Reinecke wrote:
>> Use a workqueue for processing ALUA state transitions; this allows
>> us to process implicit delay properly.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>  drivers/target/target_core_alua.c | 174 +++++++++++++++++++++++++++-----------
>>  include/target/target_core_base.h |   4 +
>>  2 files changed, 128 insertions(+), 50 deletions(-)
>>
> 
> <SNIP>
> 
>> +static int core_alua_do_transition_tg_pt(
>> +	struct t10_alua_tg_pt_gp *tg_pt_gp,
>> +	int new_state,
>> +	int explicit)
>> +{
>> +	struct se_device *dev = tg_pt_gp->tg_pt_gp_dev;
>> +	DECLARE_COMPLETION_ONSTACK(wait);
>> +
>> +	/* Nothing to be done here */
>> +	if (atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == new_state)
>> +		return 0;
>> +
>> +	if (new_state == ALUA_ACCESS_STATE_TRANSITION)
>> +		return -EAGAIN;
>> +
>> +	/*
>> +	 * Flush any pending transitions
>> +	 */
>> +	if (!explicit && tg_pt_gp->tg_pt_gp_implicit_trans_secs &&
>> +	    atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) ==
>> +	    ALUA_ACCESS_STATE_TRANSITION) {
>> +		/* Just in case */
>> +		tg_pt_gp->tg_pt_gp_alua_pending_state = new_state;
>> +		tg_pt_gp->tg_pt_gp_transition_complete = &wait;
>> +		flush_delayed_work(&tg_pt_gp->tg_pt_gp_transition_work);
>> +		wait_for_completion(&wait);
>> +		tg_pt_gp->tg_pt_gp_transition_complete = NULL;
>> +		return 0;
>> +	}
>> +
>> +	/*
>> +	 * Save the old primary ALUA access state, and set the current state
>> +	 * to ALUA_ACCESS_STATE_TRANSITION.
>> +	 */
>> +	tg_pt_gp->tg_pt_gp_alua_previous_state =
>> +		atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state);
>> +	tg_pt_gp->tg_pt_gp_alua_pending_state = new_state;
>> +
>> +	atomic_set(&tg_pt_gp->tg_pt_gp_alua_access_state,
>> +			ALUA_ACCESS_STATE_TRANSITION);
>> +	tg_pt_gp->tg_pt_gp_alua_access_status = (explicit) ?
>> +				ALUA_STATUS_ALTERED_BY_EXPLICIT_STPG :
>> +				ALUA_STATUS_ALTERED_BY_IMPLICIT_ALUA;
>> +
>> +	/*
>> +	 * Check for the optional ALUA primary state transition delay
>> +	 */
>> +	if (tg_pt_gp->tg_pt_gp_trans_delay_msecs != 0)
>> +		msleep_interruptible(tg_pt_gp->tg_pt_gp_trans_delay_msecs);
>> +
>> +	/*
>> +	 * Take a reference for workqueue item
>> +	 */
>> +	spin_lock(&dev->t10_alua.tg_pt_gps_lock);
>> +	atomic_inc(&tg_pt_gp->tg_pt_gp_ref_cnt);
>> +	smp_mb__after_atomic_inc();
>> +	spin_unlock(&dev->t10_alua.tg_pt_gps_lock);
>> +
>> +	if (!explicit && tg_pt_gp->tg_pt_gp_implicit_trans_secs) {
>> +		unsigned long transition_tmo;
>> +
>> +		transition_tmo = tg_pt_gp->tg_pt_gp_implicit_trans_secs * HZ;
>> +		queue_delayed_work(tg_pt_gp->tg_pt_gp_dev->tmr_wq,
>> +				   &tg_pt_gp->tg_pt_gp_transition_work,
>> +				   transition_tmo);
>> +	} else {
>> +		tg_pt_gp->tg_pt_gp_transition_complete = &wait;
>> +		queue_delayed_work(tg_pt_gp->tg_pt_gp_dev->tmr_wq,
>> +				   &tg_pt_gp->tg_pt_gp_transition_work, 0);
>> +		wait_for_completion(&wait);
>> +		tg_pt_gp->tg_pt_gp_transition_complete = NULL;
>> +	}
> 
> Mmm, the trans_delay_msecs delay seems a bit out of place now..
> 
> How about dropping it's usage with msleep_interruptible() above, and
> instead combining it with the delay passed into queue_delayed_work()..?
> 
Yeah, we could.
Actually I was thinking of opening up the implicit transition time
to be a general transitioning time, to be used for both implicit and
explicit.
Reasoning here is that one could kick off a userland tool once
'transitioning' mode has been requested.
That tool would then go ahead and do whatever is required to switch
paths around etc, and could write the final ALUA state into configfs.
That would then terminate the workqueue and the system would
continue with the new state.
If the userland tool fails to execute in time the system would
revert to the requested state as it does now.

The only problem with that approach is that it's currently
impossible to have an atomic implicit transition for several tpgs.

You can to an atomic _explicit_ transition, as SET TARGET PORT
GROUPS _can_ carry information about all target port groups.
(Mind you, scsi_dh_alua doesn't use it that way, it only sends
information for the active/optimized target port group).
But you cannot achieve a similar operation via configfs; there you
can only set one tpg at a time, and each of this will potentially
trigger a move to transitioning.

While this is not a violation of the spec, it is certainly
confusing. I'd rather have a way to set the new state for _all_ tpgs
at once and only then kick off the transitioning mechanism.

Any ideas how this could be achieved?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2013-10-17  6:52 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-16  7:20 [PATCH 00/11] target_core_mod: ALUA updates Hannes Reinecke
2013-10-16  7:20 ` [PATCH 01/11] target core: rename (ex,im)plict -> (ex,im)plicit Hannes Reinecke
2013-10-16 21:13   ` Nicholas A. Bellinger
2013-10-16  7:20 ` [PATCH 02/11] target_core_alua: Store supported ALUA states Hannes Reinecke
2013-10-16 21:19   ` Nicholas A. Bellinger
2013-10-17  5:48     ` Hannes Reinecke
2013-10-16  7:20 ` [PATCH 03/11] target_core_alua: Make supported states configurable Hannes Reinecke
2013-10-16 21:38   ` Nicholas A. Bellinger
2013-10-16  7:20 ` [PATCH 04/11] target_core_alua: Rename ALUA_ACCESS_STATE_OPTIMIZED Hannes Reinecke
2013-10-16  7:20 ` [PATCH 05/11] target_core_alua: spellcheck Hannes Reinecke
2013-10-16  7:20 ` [PATCH 06/11] target_core_alua: Validate ALUA state transition Hannes Reinecke
2013-10-16 21:40   ` Nicholas A. Bellinger
2013-10-16  7:20 ` [PATCH 07/11] target_core_alua: Allocate ALUA metadata on demand Hannes Reinecke
2013-10-16 21:42   ` Nicholas A. Bellinger
2013-10-16  7:20 ` [PATCH 08/11] target_core_alua: store old and pending ALUA state Hannes Reinecke
2013-10-16  7:20 ` [PATCH 09/11] target_core_alua: Use workqueue for ALUA transitioning Hannes Reinecke
2013-10-16 22:06   ` Nicholas A. Bellinger
2013-10-17  6:52     ` Hannes Reinecke [this message]
2013-10-18 19:04       ` Nicholas A. Bellinger
2013-10-16 22:09 ` [PATCH 00/11] target_core_mod: ALUA updates Nicholas A. Bellinger
2013-11-06 20:54 ` Nicholas A. Bellinger
2013-11-12 21:49   ` Nicholas A. Bellinger
2013-11-13  6:46     ` Hannes Reinecke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=525F892A.9060309@suse.de \
    --to=hare@suse.de \
    --cc=linux-scsi@vger.kernel.org \
    --cc=nab@daterainc.com \
    --cc=nab@linux-iscsi.org \
    --cc=target-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.