From: Christian Schausberger
Subject: HDS multipathing prioritizer not doing what it should
Date: Thu, 10 May 2012 09:28:06 +0200
Message-ID: <4FAB6E06.9020709@ips.at>
Reply-To: device-mapper development
To: dm-devel@redhat.com

Hi all,


I think I found a bug in the HDS prioritizer module at http://git.kernel.org/gitweb.cgi?p=linux/storage/multipath/hare/multipath-tools.git;a=blob_plain;f=libmultipath/prioritizers/hds.c;hb=HEAD

In there the following is stated for assigning the priority:


* CONTROLLER ODD and LDEV ODD: PRIORITY 1
* CONTROLLER ODD and LDEV EVEN: PRIORITY 0
* CONTROLLER EVEN and LDEV ODD: PRIORITY 0
* CONTROLLER EVEN and LDEV EVEN: PRIORITY 1
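
The table above boils down to "priority 1 when controller and LDEV have the same parity, otherwise 0". A minimal sketch of that rule follows. The function name `hds_parity_prio` and its signature are hypothetical and are not taken from hds.c; the sketch only restates the table.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, NOT the code from hds.c: returns the priority
 * from the table above for a numeric controller and LDEV value.
 */
static int hds_parity_prio(unsigned int ctrl, unsigned int ldev)
{
    bool ctrl_odd = (ctrl & 1u) != 0; /* lowest bit set means odd */
    bool ldev_odd = (ldev & 1u) != 0;

    /* Same parity (odd/odd or even/even) gives priority 1, else 0. */
    return (ctrl_odd == ldev_odd) ? 1 : 0;
}
```

With the values from the debug output in this mail, `hds_parity_prio(1, 0x0004)` yields 0 and `hds_parity_prio(2, 0x0004)` yields 1.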

When watching multipathing with debug output one can see that the controllers returned are 1 and 2:

May 08 14:44:00 | sdo: hds prio: VENDOR:  HITACHI
May 08 14:44:00 | sdo: hds prio: PRODUCT: DF600F         
May 08 14:44:00 | sdo: hds prio: SERIAL:  0x0089
May 08 14:44:00 | sdo: hds prio: LDEV:    0x0004
May 08 14:44:00 | sdo: hds prio: CTRL:    1                        <= This is really controller 0
May 08 14:44:00 | sdo: hds prio: PORT:    C
May 08 14:44:00 | sdo: hds prio: CTRL ODD, LDEV EVEN, PRIO 0
May 08 14:44:00 | sdo: hds prio = 0

May 08 14:44:00 | sdk: hds prio: VENDOR:  HITACHI
May 08 14:44:00 | sdk: hds prio: PRODUCT: DF600F         
May 08 14:44:00 | sdk: hds prio: SERIAL:  0x0089
May 08 14:44:00 | sdk: hds prio: LDEV:    0x0004
May 08 14:44:00 | sdk: hds prio: CTRL:    2                        <= This is really controller 1
May 08 14:44:00 | sdk: hds prio: PORT:    C
May 08 14:44:00 | sdk: hds prio: CTRL EVEN, LDEV EVEN, PRIO 1
May 08 14:44:00 | sdk: hds prio = 1

This looks fine, but AFAIK HDS starts counting controllers from 0 (so I actually have controllers 0 and 1). As a result, when LUN ownership is assigned in the storage, a LUN with active/passive paths will always be accessed through the wrong (non-owning) controller. This carries a huge performance penalty when the system is under stress, because of the additional overhead it generates.

To sum this up, the priority is exactly swapped from what it should be:

LUN 0 mapped with ownership on Controller 0 (CONTROLLER EVEN and LDEV EVEN) will be accessed on Controller 1
LUN 1 mapped with ownership on Controller 1 (CONTROLLER ODD and LDEV ODD) will be accessed on Controller 0
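
To make the swap concrete, here is a small sketch comparing the priority computed from the controller number as reported in the debug output with the priority that would result from the array's zero-based numbering. It assumes the reported value is simply the array's controller number plus one, which is my reading of the log and not something confirmed from hds.c. All function names are hypothetical.

```c
/* Parity rule from the priority table (hypothetical helper, not hds.c). */
static int parity_prio(unsigned int ctrl, unsigned int ldev)
{
    return ((ctrl & 1u) == (ldev & 1u)) ? 1 : 0;
}

/*
 * Priority as currently computed, under the ASSUMPTION that the
 * prioritizer sees (array controller number + 1), i.e. 1 and 2.
 */
static int prio_as_reported(unsigned int array_ctrl, unsigned int ldev)
{
    return parity_prio(array_ctrl + 1u, ldev);
}

/* Priority if the array's own zero-based controller number were used. */
static int prio_zero_based(unsigned int array_ctrl, unsigned int ldev)
{
    return parity_prio(array_ctrl, ldev);
}
```

For LDEV 0x0004, `prio_as_reported` prefers array controller 1, while `prio_zero_based` prefers array controller 0, which is the owning controller in the example above. Whether the right fix is to adjust the controller number or to flip the table is an open question this sketch does not answer.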

I am not quite sure where to fix this. It looks like the code was contributed by Hitachi in 2006. Perhaps controller numbering started at 1 back then. The AMS and the new HUS systems start at 0, though.


If you can tell me how I can help, I am glad to do so.

Thanks,
Christian

--
-=IPS GmbH=-

Mit freundlichen Grüßen / Best Regards

Christian Schausberger, MSc

  Systems Engineer

IPS Vertriebsgesellschaft für innovative EDV-Produkte und - Systeme GmbH
Franzosengraben 10
A-1030 Wien

T#: +43 1 796 86 86 - 57
F#: +43 1 796 86 86 - 15
M#: +43 664 88 45 46 11
@#: schausberger@ips.at
