From: Hannes Reinecke <hare@suse.de>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Johannes Thumshirn <jthumshirn@suse.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, Alexander Graf <agraf@suse.de>
Subject: Re: [Qemu-devel] [PATCH RFC 0/8] scsi-disk: Active/passive ALUA support
Date: Tue, 15 Dec 2015 07:49:42 +0100
Message-ID: <566FB806.2070301@suse.de>
In-Reply-To: <20151215030257.GA30291@stefanha-x1.localdomain>

On 12/15/2015 04:02 AM, Stefan Hajnoczi wrote:
> On Mon, Dec 14, 2015 at 08:35:43AM +0100, Hannes Reinecke wrote:
>> On 12/14/2015 08:24 AM, Stefan Hajnoczi wrote:
>>> On Thu, Dec 10, 2015 at 10:13:17AM +0100, Hannes Reinecke wrote:
>>>> On 12/10/2015 09:26 AM, Stefan Hajnoczi wrote:
>>>>> On Fri, Nov 27, 2015 at 03:58:58PM +0100, Hannes Reinecke wrote:
>>>>>> here's now an updated version to enable ALUA and simplified
>>>>>> active/passive multipath support for qemu.
>>>>>>
>>>>>> This patchset relies on having _two_ block devices configured,
>>>>>> and two SCSI disks pointing to those block devices with the
>>>>>> _same_ 'wwn' property and unique 'port_group' properties.
>>>>>> I know, this is a bit of a nasty hack, but I hope to add
>>>>>> proper multipath support (with several SCSI devices pointing /
>>>>>> linking to the same block device) in the near future.
>>>>>>
>>>>>> It also implements an 'alua_policy', which allows for simulating
>>>>>> an 'active/passive' multipath setup.
>>>>>>
>>>>>> And for testing I've implemented a 'block_disconnect' HMP command,
>>>>>> which simulates a link failure for the attached devices.
>>>>>>
>>>>>> I wouldn't object if someone declares this a gross hack, but with
>>>>>> it I can finally simulate real-life multipath failover and do
>>>>>> some functional multipath-tools testing without having to resort
>>>>>> to using real hardware.
>>>>>
>>>>> I'm not familiar with how ALUA works but have been thinking about a
>>>>> multipath problem:
>>>>>
>>>>> If the host has SCSI disks that are marked 'offline' then QEMU will
>>>>> refuse to start up since it cannot open the block device (ENXIO).
>>>>>
>>>> Define 'offline'.
>>>> If this means the ALUA state 'offline' then we wouldn't have to worry; ALUA
>>>> state 'offline' essentially means "Yeah, there's something here, but I won't
>>>> tell you and you cannot access it.".
>>>> And any transitions to and from 'offline' are essentially vendor-specific.
>>>> In short: Do not use it.
>>>>
>>>> If OTOH it means the 'block_disconnect' state, this is something which
>>>> should be (and needs to be) implemented in the HBA emulation for
>>>> simulating a link failure.
>>>> qemu itself should be able to access the device and it should start up
>>>> perfectly normally, so we shouldn't get any ENXIO errors.
>>>>
>>>> (Obviously, if _all_ disks are in 'disconnect' state the guest wouldn't
>>>> start up as it cannot read any data. But that's beside the point.)
>>>
>>> I'm referring to scsi_device_set_state(scmd->device, SDEV_OFFLINE) in
>>> Linux.  This is the state where the host block device cannot be opened
>>> or accessed.
>>>
>> Which means the device is declared dead by the SCSI stack.
>> And qemu does _very_ well not to start in these circumstances.
>>
>> However, this behaviour is not influenced nor modified by the ALUA patchset
>> but is rather a different topic.
>>
>> <rambling>
>> Setting a device 'offline' is the final step in SCSI EH, which means that
>> SCSI EH has exhausted its options and doesn't know how to fix the device.
>> However, in modern systems this typically happens when SCSI EH kicks in
>> during a (transport) link disconnect, as then every single step in SCSI EH
>> will fail. (Which also means that SCSI EH is woefully inadequate for FC, but
>> that's a different topic.)
>> But as this is a transport issue, _all_ respective drivers should be aware
>> of this, and should have been modified _not_ to start SCSI EH when the
>> transport link is severed.
>> So the very fact that SCSI EH is started means that there's an issue with
>> the driver, which really needs to be fixed first.
>> Hence I think qemu is right here, as the underlying reason for the 'offline'
>> device should be fixed first.
>> </rambling>
>
> Interesting, thanks for explaining.
>
And thinking about it some more, we _could_ map the SCSI 'offline'
state onto ALUA 'offline', and allow qemu to start nevertheless.
(It could get the interesting bits like the INQUIRY data from sysfs,
so it doesn't _actually_ have to do I/O on startup.)
Then we could have an HMP/QMP command for resetting the SCSI status
back to 'running', which should allow I/O to start properly.
Hmm. Lemme see ...
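
As a rough illustration of that mapping (a sketch only, not code from the
patchset): the snippet below reads the SCSI device state and a couple of
INQUIRY-derived strings from sysfs and translates the state into an ALUA
access state code. The function names, the 'sysfs_root' parameter and the
choice to report every state other than 'running' and 'offline' as
'unavailable' are assumptions made for the example; the numeric codes are
the ALUA asymmetric access states from SPC.

```python
from pathlib import Path

# ALUA asymmetric access state codes (SPC-3 and later).
ALUA_ACTIVE_OPTIMIZED = 0x0
ALUA_UNAVAILABLE = 0x3
ALUA_OFFLINE = 0xE


def sdev_state_to_alua(sdev_state):
    """Map a Linux SCSI device state string to an ALUA state code.

    Hypothetical policy: 'running' is reported as active/optimized,
    'offline' as ALUA offline, and anything else as unavailable.
    """
    state = sdev_state.strip()
    if state == "running":
        return ALUA_ACTIVE_OPTIMIZED
    if state == "offline":
        return ALUA_OFFLINE
    return ALUA_UNAVAILABLE


def read_sdev_attr(disk, attr, sysfs_root="/sys/block"):
    """Read one attribute of the SCSI device backing a block device."""
    path = Path(sysfs_root) / disk / "device" / attr
    return path.read_text().strip()


def probe_disk(disk, sysfs_root="/sys/block"):
    """Collect identity and ALUA state without issuing any I/O to the disk."""
    return {
        "vendor": read_sdev_attr(disk, "vendor", sysfs_root),
        "model": read_sdev_attr(disk, "model", sysfs_root),
        "alua_state": sdev_state_to_alua(
            read_sdev_attr(disk, "state", sysfs_root)
        ),
    }
```

On a Linux host, probe_disk("sda") would read /sys/block/sda/device/state
and friends. Resetting the device would then amount to writing 'running'
to that same 'state' attribute, which is what a monitor command along the
lines described above would presumably have to do.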

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
