From: Patrick Mansfield <patmans@us.ibm.com>
To: Frederik Schueler <fs@lowpingbastards.de>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: new qla2xxx driver breaks SAN setup with 2 controllers
Date: Tue, 23 Aug 2005 13:00:40 -0700
Message-ID: <20050823200040.GA8310@us.ibm.com>
In-Reply-To: <20050823112535.GB13391@mail.lowpingbastards.de>

On Tue, Aug 23, 2005 at 01:25:35PM +0200, Frederik Schueler wrote:
> hello,
> 
> we are experiencing problems with the new qlogic driver in 2.6.12 on
> a set of servers with qla2310 HBAs.
> 
> The problem is as follows:
> 
> The Infotrend storage array we are using has two controllers, each
> of them has two virtual discs with a couple of partitions exported
> as shared storage.
> 
> The controllers are linked inside of the storage box, each controller
> has one qlogic fabric switch attached, and half of the servers are
> connected to the lefthand switch, the other half is connected to the
> righthand switch.
> 
> Now, with the qlogic driver in 2.6.11.12, we can access all shares
> on both controllers from every server, while the new driver only
> allows access to the controller whose switch the server is directly
> attached to, thus depriving the servers of half of their shared
> storage devices.
> 
> Example: on server s05, we have a boot device (lun 3 on primary
> controller), and 2 shared storages (lun 9 on primary, lun 10 on
> secondary controller).

Using the scsiadd script implies that you are attaching or otherwise
modifying the storage after the driver has loaded. Is that correct?

There is a fix for scanning initiated from user space in this change:

http://www.kernel.org/git/?p=linux/kernel/git/jejb/scsi-rc-fixes-2.6.git;a=commit;h=5c44cd2afad3f7b015542187e147a820600172f1

The above fix is in the current 2.6 git tree. Does that fix your problem?

If so, reloading the driver should also rescan correctly (even in
2.6.12.5).

Unplugging and replugging the cable might also fix the problem.
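
As a rough sketch of a user-space scan that does not go through scsiadd:
the SCSI midlayer exposes a per-host "scan" attribute in sysfs that takes
"channel id lun" (with "-" as a wildcard). The helper name scsi_scan and
the SYSFS_ROOT override below are assumptions made for illustration, and
I am assuming the attribute is present on your kernel.

```shell
# Hypothetical helper: request a scan of one channel/id/lun on a SCSI host.
# SYSFS_ROOT is an illustrative override (not a kernel feature) so the
# function can be tried against a scratch directory; it defaults to /sys.
scsi_scan() {
    host=$1; channel=$2; id=$3; lun=$4
    scan_file="${SYSFS_ROOT:-/sys}/class/scsi_host/host${host}/scan"

    # Refuse to continue if the attribute is missing or not writable.
    if [ ! -w "$scan_file" ]; then
        echo "cannot write $scan_file" >&2
        return 1
    fi

    # The attribute expects "channel id lun"; "-" means "all".
    printf '%s %s %s\n' "$channel" "$id" "$lun" > "$scan_file"
}
```

For the missing device in your example this would be "scsi_scan 0 0 1 10"
as root, or "scsi_scan 0 - - -" to scan everything on host0.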

> With 2.6.11.12, this looks as follows:
> 
> s05:~# cat /proc/scsi/scsi
> Attached devices:
> Host: scsi0 Channel: 00 Id: 00 Lun: 03
>   Vendor: IFT      Model: A16F-R1211       Rev: 334B
>   Type:   Direct-Access                    ANSI SCSI revision: 03
> Host: scsi0 Channel: 00 Id: 00 Lun: 09
>   Vendor: IFT      Model: A16F-R1211       Rev: 334B
>   Type:   Direct-Access                    ANSI SCSI revision: 03
> Host: scsi0 Channel: 00 Id: 01 Lun: 10
>   Vendor: IFT      Model: A16F-R1211       Rev: 334B
>   Type:   Direct-Access                    ANSI SCSI revision: 03
> 
> 
> and the driver sees everything:
> 
> s05:~# cat /proc/scsi/qla2xxx/0
> QLogic PCI to Fibre Channel Host Adapter for QLA2310:
>         Firmware version 3.03.08 IPX, Driver version 8.00.02b4-k
> ISP: ISP2300, Serial# R74545
> Request Queue = 0xcf940000, Response Queue = 0xcf980000
> Request Queue count = 2048, Response Queue count = 512
> Total number of active commands = 0
> Total number of interrupts = 1117762
>     Device queue depth = 0x20
> Number of free request entries = 964
> Number of mailbox timeouts = 0
> Number of ISP aborts = 0
> Number of loop resyncs = 0
> Number of retries for empty slots = 0
> Number of reqs in pending_q= 0, retry_q= 0, done_q= 0, scsi_retry_q= 0
> Host adapter:loop state = <READY>, flags = 0x1a03
> Dpc flags = 0x0
> MBX flags = 0x0
> Link down Timeout = 030
> Port down retry = 030
> Login retry count = 030
> Commands retried with dropped frame(s) = 0
> Product ID = 4953 5020 2020 0001
> 
> 
> SCSI Device Information:
> scsi-qla0-adapter-node=200000e08b1bd113;
> scsi-qla0-adapter-port=210000e08b1bd113;
> scsi-qla0-target-0=210000d023800002;
> scsi-qla0-target-1=210000d023600002;
> 
> SCSI LUN Information:
> (Id:Lun)  * - indicates lun is not registered with the OS.
> ( 0: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:81 00
> ( 0: 3): Total reqs 470693, Pending reqs 0, flags 0x0, 0:0:81 00
> ( 0: 9): Total reqs 227717, Pending reqs 0, flags 0x0, 0:0:81 00
> ( 0:11): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00
> ( 0:13): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00
> ( 1: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:82 00
> ( 1:10): Total reqs 12, Pending reqs 0, flags 0x0, 0:0:82 00
> ( 1:12): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00
> ( 1:14): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00
> 
> 
> while on 2.6.12.5 and 2.6.13-rc6 it looks like this:
> 
> sm05:~# scsiadd -a 0 0 0 9
> Attached devices:
> Host: scsi0 Channel: 00 Id: 00 Lun: 03
>   Vendor: IFT      Model: A16F-R1211       Rev: 334B
>   Type:   Direct-Access                    ANSI SCSI revision: 03
> Host: scsi0 Channel: 00 Id: 00 Lun: 09
>   Vendor: IFT      Model: A16F-R1211       Rev: 334B
>   Type:   Direct-Access                    ANSI SCSI revision: 03
> 
> 
> sm05:~# scsiadd -a 0 0 1 10
> Attached devices:
> Host: scsi0 Channel: 00 Id: 00 Lun: 03
>   Vendor: IFT      Model: A16F-R1211       Rev: 334B
>   Type:   Direct-Access                    ANSI SCSI revision: 03
> Host: scsi0 Channel: 00 Id: 00 Lun: 09
>   Vendor: IFT      Model: A16F-R1211       Rev: 334B
>   Type:   Direct-Access                    ANSI SCSI revision: 03
> 
> 
> unfortunately, the proc interface was removed:

Why, is some data missing?

Also try using lsscsi.
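
lsscsi reports each device as a host:channel:id:lun tuple. If it is not
installed, something similar can be pulled out of the /proc/scsi/scsi
listing. This is only a sketch: the function name proc_scsi_to_hctl is
made up here, and it assumes the "Host: scsiN Channel: .. Id: .. Lun: .."
line format shown in your output.

```shell
# Hypothetical helper: read a /proc/scsi/scsi style listing on stdin and
# print one host:channel:id:lun tuple per attached device.
proc_scsi_to_hctl() {
    awk '/^Host: scsi[0-9]+ / {
        host = $2
        sub(/^scsi/, "", host)          # "scsi0" -> "0"
        # %d drops the leading zeros, so "03" is printed as 3
        printf "%d:%d:%d:%d\n", host, $4, $6, $8
    }'
}
```

Run it as "proc_scsi_to_hctl < /proc/scsi/scsi". For the 2.6.11.12
listing you quoted it prints 0:0:0:3, 0:0:0:9 and 0:0:1:10, which makes
it easy to diff the device set between the two kernels.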

> s05:/sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/0000:02:02.0/host0# find .
> .
> ./rport-0:0-1
> ./rport-0:0-1/power
> ./rport-0:0-1/power/state
> ./rport-0:0-0
> ./rport-0:0-0/target0:0:0
> ./rport-0:0-0/target0:0:0/0:0:0:9
> ./rport-0:0-0/target0:0:0/0:0:0:9/ioerr_cnt
> ./rport-0:0-0/target0:0:0/0:0:0:9/iodone_cnt
> ./rport-0:0-0/target0:0:0/0:0:0:9/iorequest_cnt
> ./rport-0:0-0/target0:0:0/0:0:0:9/iocounterbits
> ./rport-0:0-0/target0:0:0/0:0:0:9/timeout
> ./rport-0:0-0/target0:0:0/0:0:0:9/state
> ./rport-0:0-0/target0:0:0/0:0:0:9/delete
> ./rport-0:0-0/target0:0:0/0:0:0:9/rescan
> ./rport-0:0-0/target0:0:0/0:0:0:9/rev
> ./rport-0:0-0/target0:0:0/0:0:0:9/model
> ./rport-0:0-0/target0:0:0/0:0:0:9/vendor
> ./rport-0:0-0/target0:0:0/0:0:0:9/scsi_level
> ./rport-0:0-0/target0:0:0/0:0:0:9/type
> ./rport-0:0-0/target0:0:0/0:0:0:9/queue_type
> ./rport-0:0-0/target0:0:0/0:0:0:9/queue_depth
> ./rport-0:0-0/target0:0:0/0:0:0:9/device_blocked
> ./rport-0:0-0/target0:0:0/0:0:0:9/bus
> ./rport-0:0-0/target0:0:0/0:0:0:9/driver
> ./rport-0:0-0/target0:0:0/0:0:0:9/block
> ./rport-0:0-0/target0:0:0/0:0:0:9/power
> ./rport-0:0-0/target0:0:0/0:0:0:9/power/state
> ./rport-0:0-0/target0:0:0/0:0:0:3
> ./rport-0:0-0/target0:0:0/0:0:0:3/ioerr_cnt
> ./rport-0:0-0/target0:0:0/0:0:0:3/iodone_cnt
> ./rport-0:0-0/target0:0:0/0:0:0:3/iorequest_cnt
> ./rport-0:0-0/target0:0:0/0:0:0:3/iocounterbits
> ./rport-0:0-0/target0:0:0/0:0:0:3/timeout
> ./rport-0:0-0/target0:0:0/0:0:0:3/state
> ./rport-0:0-0/target0:0:0/0:0:0:3/delete
> ./rport-0:0-0/target0:0:0/0:0:0:3/rescan
> ./rport-0:0-0/target0:0:0/0:0:0:3/rev
> ./rport-0:0-0/target0:0:0/0:0:0:3/model
> ./rport-0:0-0/target0:0:0/0:0:0:3/vendor
> ./rport-0:0-0/target0:0:0/0:0:0:3/scsi_level
> ./rport-0:0-0/target0:0:0/0:0:0:3/type
> ./rport-0:0-0/target0:0:0/0:0:0:3/queue_type
> ./rport-0:0-0/target0:0:0/0:0:0:3/queue_depth
> ./rport-0:0-0/target0:0:0/0:0:0:3/device_blocked
> ./rport-0:0-0/target0:0:0/0:0:0:3/bus
> ./rport-0:0-0/target0:0:0/0:0:0:3/driver
> ./rport-0:0-0/target0:0:0/0:0:0:3/block
> ./rport-0:0-0/target0:0:0/0:0:0:3/power
> ./rport-0:0-0/target0:0:0/0:0:0:3/power/state
> ./rport-0:0-0/target0:0:0/power
> ./rport-0:0-0/target0:0:0/power/state
> ./rport-0:0-0/power
> ./rport-0:0-0/power/state
> ./nvram
> ./fw_dump
> ./power
> ./power/state
> 
> 
> apparently the targets on rport-0:0-1 are not scanned at all, and
> so the devices on the secondary controller are not reachable.
> 
> Placing an additional link between the two fabric switches doubled
> the number of targets, but did not solve our problem.
> It seems to us that the 2.6.12+ driver does not allow access to
> controllers not directly attached to the very same fabric switch.
> 
> how can this be fixed?
> 
> 
> Best regards
> Frederik Schueler

-- Patrick Mansfield

Thread overview: 9+ messages
2005-08-23 11:25 new qla2xxx driver breaks SAN setup with 2 controllers Frederik Schueler
2005-08-23 20:00 ` Patrick Mansfield [this message]
2005-08-24  9:55   ` Frederik Schueler
2005-08-24 10:01     ` Christoph Hellwig
2005-08-24 12:48       ` Frederik Schueler
2005-08-24 12:50         ` Christoph Hellwig
2005-08-24 13:08           ` Frederik Schueler
2005-08-24 17:03             ` Patrick Mansfield
2005-08-25 11:42               ` Frederik Schueler
