* HDS multipathing prioritizer not doing what it should
From: Christian Schausberger @ 2012-05-10  7:28 UTC
  To: dm-devel



Hi all,


I think I found a bug in the HDS prioritizer module at 
http://git.kernel.org/gitweb.cgi?p=linux/storage/multipath/hare/multipath-tools.git;a=blob_plain;f=libmultipath/prioritizers/hds.c;hb=HEAD

In there the following is stated for assigning the priority:

* CONTROLLER ODD and LDEV ODD: PRIORITY 1
* CONTROLLER ODD and LDEV EVEN: PRIORITY 0
* CONTROLLER EVEN and LDEV ODD: PRIORITY 0
* CONTROLLER EVEN and LDEV EVEN: PRIORITY 1
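
The four cases above reduce to a parity comparison. A minimal sketch in C, assuming nothing about how hds.c actually implements it (the function name `hds_parity_prio` is hypothetical and not taken from the module):

```c
#include <stdbool.h>

/* Hypothetical helper, not taken from hds.c: returns 1 when the
 * controller number and the LDEV number have the same parity and
 * 0 otherwise, which matches the four cases listed above. */
static int hds_parity_prio(unsigned int ctrl, unsigned int ldev)
{
	bool ctrl_odd = (ctrl % 2) != 0;
	bool ldev_odd = (ldev % 2) != 0;

	return ctrl_odd == ldev_odd ? 1 : 0;
}
```

The result depends only on whether the two numbers are both odd or both even, so any shift of the controller number by one flips every priority.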

When watching multipathing with debug output one can see that the 
controllers returned are 1 and 2:

May 08 14:44:00 | sdo: hds prio: VENDOR:  HITACHI
May 08 14:44:00 | sdo: hds prio: PRODUCT: DF600F
May 08 14:44:00 | sdo: hds prio: SERIAL:  0x0089
May 08 14:44:00 | sdo: hds prio: LDEV:    0x0004
May 08 14:44:00 | sdo: hds prio: CTRL:    1 <= This is really controller 0
May 08 14:44:00 | sdo: hds prio: PORT:    C
May 08 14:44:00 | sdo: hds prio: CTRL ODD, LDEV EVEN, PRIO 0
May 08 14:44:00 | sdo: hds prio = 0

May 08 14:44:00 | sdk: hds prio: VENDOR:  HITACHI
May 08 14:44:00 | sdk: hds prio: PRODUCT: DF600F
May 08 14:44:00 | sdk: hds prio: SERIAL:  0x0089
May 08 14:44:00 | sdk: hds prio: LDEV:    0x0004
May 08 14:44:00 | sdk: hds prio: CTRL:    2 <= This is really controller 1
May 08 14:44:00 | sdk: hds prio: PORT:    C
May 08 14:44:00 | sdk: hds prio: CTRL EVEN, LDEV EVEN, PRIO 1
May 08 14:44:00 | sdk: hds prio = 1

This looks fine, but afaik HDS starts counting controllers from 0 (so 
actually I have 0 and 1). So when assigning LUN ownership in the 
storage, a LUN with an active/passive path will actually always be 
accessed through the wrong controller. This has a huge performance 
penalty when the system is under stress, because of the additional 
overhead generated by this.

To sum up, the priorities are exactly the reverse of what they should be:

LUN 0, mapped with ownership on Controller 0 (CONTROLLER EVEN and LDEV
EVEN), will be accessed through Controller 1.
LUN 1, mapped with ownership on Controller 1 (CONTROLLER ODD and LDEV
ODD), will be accessed through Controller 0.
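
The swap can be reproduced with a small sketch in C. It assumes, as described above, that the controller number seen in the debug output is exactly one higher than the array's own zero-based number; all function names are hypothetical and none of this is taken from hds.c:

```c
#include <stdbool.h>

/* Hypothetical helper (not from hds.c): 1 if both numbers have the
 * same parity, 0 otherwise. */
static int parity_prio(unsigned int ctrl, unsigned int ldev)
{
	bool ctrl_odd = (ctrl % 2) != 0;
	bool ldev_odd = (ldev % 2) != 0;

	return ctrl_odd == ldev_odd ? 1 : 0;
}

/* Priority computed from the controller number as it appears in the
 * debug log (1 or 2 on this array). */
static int prio_as_reported(unsigned int reported_ctrl, unsigned int ldev)
{
	return parity_prio(reported_ctrl, ldev);
}

/* Priority after converting to the array's zero-based numbering.
 * ASSUMPTION: reported_ctrl >= 1 and is exactly one higher than the
 * controller number used when assigning LUN ownership. */
static int prio_zero_based(unsigned int reported_ctrl, unsigned int ldev)
{
	return parity_prio(reported_ctrl - 1, ldev);
}
```

For the two paths in the log (LDEV 0x0004), sdo (CTRL 1) gets priority 0 as reported and 1 after the conversion, while sdk (CTRL 2) gets 1 as reported and 0 after the conversion.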

I am not quite sure where to fix this. It looks like the code was
contributed by Hitachi in 2006. Back then they may have started the
numbering of the controllers at 1. The AMS and the new HUS systems
start at 0, though.


If you can tell me how I can help, I will be glad to do so.

Thanks,
Christian




* Re: HDS multipathing prioritizer not doing what it should
From: Hannes Reinecke @ 2012-05-10 12:48 UTC
  To: device-mapper development

On 05/10/2012 09:28 AM, Christian Schausberger wrote:
> Hi all,
> 
> 
> I think I found a bug in the HDS prioritizer module at
> http://git.kernel.org/gitweb.cgi?p=linux/storage/multipath/hare/multipath-tools.git;a=blob_plain;f=libmultipath/prioritizers/hds.c;hb=HEAD
> 
> In there the following is stated for assigning the priority:
> 
> * CONTROLLER ODD and LDEV ODD: PRIORITY 1
> * CONTROLLER ODD and LDEV EVEN: PRIORITY 0
> * CONTROLLER EVEN and LDEV ODD: PRIORITY 0
> * CONTROLLER EVEN and LDEV EVEN: PRIORITY 1
> 
> When watching multipathing with debug output one can see that the
> controllers returned are 1 and 2:
> 
> May 08 14:44:00 | sdo: hds prio: VENDOR:  HITACHI
> May 08 14:44:00 | sdo: hds prio: PRODUCT: DF600F         
> May 08 14:44:00 | sdo: hds prio: SERIAL:  0x0089
> May 08 14:44:00 | sdo: hds prio: LDEV:    0x0004
> May 08 14:44:00 | sdo: hds prio: CTRL:    1 <= This is really controller 0
> May 08 14:44:00 | sdo: hds prio: PORT:    C
> May 08 14:44:00 | sdo: hds prio: CTRL ODD, LDEV EVEN, PRIO 0
> May 08 14:44:00 | sdo: hds prio = 0
> 
> May 08 14:44:00 | sdk: hds prio: VENDOR:  HITACHI
> May 08 14:44:00 | sdk: hds prio: PRODUCT: DF600F         
> May 08 14:44:00 | sdk: hds prio: SERIAL:  0x0089
> May 08 14:44:00 | sdk: hds prio: LDEV:    0x0004
> May 08 14:44:00 | sdk: hds prio: CTRL:    2 <= This is really controller 1
> May 08 14:44:00 | sdk: hds prio: PORT:    C
> May 08 14:44:00 | sdk: hds prio: CTRL EVEN, LDEV EVEN, PRIO 1
> May 08 14:44:00 | sdk: hds prio = 1
> 
> This looks fine, but afaik HDS starts counting controllers from 0
> (so actually I have 0 and 1). So when assigning LUN ownership in the
> storage, a LUN with an active/passive path will actually always be
> accessed through the wrong controller. This has a huge performance
> penalty when the system is under stress, because of the additional
> overhead generated by this.
> 
Have you tested whether the situation improves when the priority is
reversed?

I'd be very much surprised if it did, though.

I suspect more the internal queue size of the Hitachi to be a
problem here. I've seen instances where we overload the internal
queue size, causing the array to seize up.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


* Re: HDS multipathing prioritizer not doing what it should
From: Christian Schausberger @ 2012-05-11  7:47 UTC
  To: dm-devel

On 05/10/2012 09:28 AM, Christian Schausberger wrote:
>>  Hi all,
>>
>>
>>  I think I found a bug in the HDS prioritizer module at
>>  http://git.kernel.org/gitweb.cgi?p=linux/storage/multipath/hare/multipath-tools.git;a=blob_plain;f=libmultipath/prioritizers/hds.c;hb=HEAD
>>
>>  In there the following is stated for assigning the priority:
>>
>>  * CONTROLLER ODD and LDEV ODD: PRIORITY 1
>>  * CONTROLLER ODD and LDEV EVEN: PRIORITY 0
>>  * CONTROLLER EVEN and LDEV ODD: PRIORITY 0
>>  * CONTROLLER EVEN and LDEV EVEN: PRIORITY 1
>>
>>  When watching multipathing with debug output one can see that the
>>  controllers returned are 1 and 2:
>>
>>  May 08 14:44:00 | sdo: hds prio: VENDOR:  HITACHI
>>  May 08 14:44:00 | sdo: hds prio: PRODUCT: DF600F
>>  May 08 14:44:00 | sdo: hds prio: SERIAL:  0x0089
>>  May 08 14:44:00 | sdo: hds prio: LDEV:    0x0004
>>  May 08 14:44:00 | sdo: hds prio: CTRL:    1 <= This is really controller 0
>>  May 08 14:44:00 | sdo: hds prio: PORT:    C
>>  May 08 14:44:00 | sdo: hds prio: CTRL ODD, LDEV EVEN, PRIO 0
>>  May 08 14:44:00 | sdo: hds prio = 0
>>
>>  May 08 14:44:00 | sdk: hds prio: VENDOR:  HITACHI
>>  May 08 14:44:00 | sdk: hds prio: PRODUCT: DF600F
>>  May 08 14:44:00 | sdk: hds prio: SERIAL:  0x0089
>>  May 08 14:44:00 | sdk: hds prio: LDEV:    0x0004
>>  May 08 14:44:00 | sdk: hds prio: CTRL:    2 <= This is really controller 1
>>  May 08 14:44:00 | sdk: hds prio: PORT:    C
>>  May 08 14:44:00 | sdk: hds prio: CTRL EVEN, LDEV EVEN, PRIO 1
>>  May 08 14:44:00 | sdk: hds prio = 1
>>
>>  This looks fine, but afaik HDS starts counting controllers from 0
>>  (so actually I have 0 and 1). So when assigning LUN ownership in the
>>  storage, a LUN with an active/passive path will actually always be
>>  accessed through the wrong controller. This has a huge performance
>>  penalty when the system is under stress, because of the additional
>>  overhead generated by this.
>>

>Have you tested whether the situation improves when the priority is
>reversed?

>I'd be very much surprised if it did, though.

>I suspect more the internal queue size of the Hitachi to be a
>problem here. I've seen instances where we overload the internal
>queue size, causing the array to seize up.

>Cheers,

>Hannes

Yes. With the priority reversed within the storage, throughput goes from
4.5 GB/s to 6 GB/s. Mind you, this is without any other changes to the
host or storage.

I agree that in normal operation the load balancing and the active/active
mode of the storage iron this out. But in this setup (a Lustre filesystem
tuned for sustained bandwidth) those features actually decrease
performance and are not used. That is why the wrong priority makes such a
difference.

Christian
