From: Steven Dake <sdake@mvista.com>
To: Fabien Salvi <fabien@cri74.org>
Cc: Linux SCSI list <linux-scsi@vger.kernel.org>
Subject: Re: Fibre-Channel Access : interuptive access
Date: Mon, 25 Nov 2002 12:37:07 -0700
Message-ID: <3DE27BE3.2090402@mvista.com>
In-Reply-To: 3DE261E5.3F61EBC4@cri74.org
Fabien,
What you want is hotswap support. The kernel has basic support for
hotswap but only if a device is not in use. Search the archives for
hotswap.
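As a rough illustration of that basic support: 2.4-era kernels accept
add-single-device and remove-single-device commands written to
/proc/scsi/scsi. The sketch below only builds the command string. The
helper name is made up for this example, and the address (host 0,
channel 0, id 0, lun 7) is taken from the log quoted further down. The
partition has to be unmounted first, because this path only works for a
device that is not in use.

```python
def scsi_proc_command(action, host, channel, scsi_id, lun):
    """Build a command line for /proc/scsi/scsi (hypothetical helper).

    action is "add" or "remove"; the four numbers are the SCSI address
    as printed in the kernel log (host, channel, id, lun).
    """
    if action not in ("add", "remove"):
        raise ValueError("action must be 'add' or 'remove'")
    return f"scsi {action}-single-device {host} {channel} {scsi_id} {lun}"


# Address copied from the quoted log: host 0 channel 0 id 0 lun 7.
cmd = scsi_proc_command("remove", 0, 0, 0, 7)
print(cmd)
```

Run as root, the resulting string would be written to /proc/scsi/scsi
(for example with echo "..." > /proc/scsi/scsi) before the switch is
rebooted, and the matching "add" command afterwards.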
I'm currently working on forced block device removal that works even if
the device is in use and properly shuts down open files in the VFS,
RAID, and filesystem mount layers. This is what you really need when
hotswap happens, but it just isn't ready yet.
The correct way to configure your system so it stays up through this
type of failure is to have two HBAs and two switches, with each HBA
going through a separate switch. This way, if a link, HBA, or switch
fails, there is automatic failover.
Then create a RAID 1 array across both HBAs. In the case of a switch
failure, the RAID subsystem will automatically correct any errors and
rebuild the arrays when the disks are reinserted. Or you could use the
RAID multipathing personality to create a multipath across the two HBAs
to the same device.
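A minimal sketch of the two options, assuming mdadm is the tool used to
build the array. It only assembles the command lines and does not run
them. The device names are hypothetical: /dev/sda1 stands for what is
reached through the first HBA and /dev/sdb1 for what is reached through
the second.

```python
def mdadm_create(md_device, level, paths):
    """Return the argv for creating an md array (hypothetical helper).

    level "1" mirrors two different disks, one behind each HBA.
    level "multipath" treats the entries as two paths to the same device.
    """
    if level not in ("1", "multipath"):
        raise ValueError("level must be '1' or 'multipath'")
    if len(paths) < 2:
        raise ValueError("need at least two devices or paths")
    return [
        "mdadm", "--create", md_device,
        f"--level={level}",
        f"--raid-devices={len(paths)}",
        *paths,
    ]


# Hypothetical device names, one per HBA.
mirror = mdadm_create("/dev/md0", "1", ["/dev/sda1", "/dev/sdb1"])
multipath = mdadm_create("/dev/md0", "multipath", ["/dev/sda1", "/dev/sdb1"])

print(" ".join(mirror))
print(" ".join(multipath))
```

With RAID 1 the two names must be different disks or LUNs, so a dead
path only degrades the mirror. With multipath they are the same LUN seen
twice, and the md layer switches paths when one of them fails.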
Hope this helps.
-steve
Fabien Salvi wrote:
>Hello,
>
>We have a Fibre Channel HBA (Qlogic 2200F) with a Sanbox2 switch connected
>to a storage enclosure with a CMD 7240 RAID FC-to-SCSI controller.
>
>We use qla2x00 (v6.01) driver.
>
>When I reboot the FC switch, access is interrupted for 1 minute.
>If I still have a partition on the external enclosure mounted during the
>reboot, the server fails with a "semi-crash" of Linux:
>
>I can still log in to it, but fsync is impossible, access to the data
>is not possible after the switch reboot, and rebooting the server blocks...
>So I must do a hard reset.
>
>Well, you will tell me this is not really abnormal, but what can
>I do to limit the damage?
>Is there a way to prevent access to the partition while the switch reboots?
>When there is a timeout on NFS mounts, it is still possible to reboot
>normally and to get the data back once NFS is OK. Is there a similar
>solution for Fibre Channel SCSI?
>
>Here are the logs (I use Reiserfs filesystem) :
>
>Nov 25 16:26:27 d4 kernel: scsi(0): LOOP DOWN detected
>Nov 25 16:27:07 d4 kernel: SCSI disk error : host 0 channel 0 id 0 lun 7 return code = 10000
>Nov 25 16:27:07 d4 kernel: I/O error: dev 08:01, sector 50056
>Nov 25 16:27:07 d4 kernel: journal-601, buffer write failed
>Nov 25 16:27:07 d4 kernel: kernel BUG at prints.c:334!
>Nov 25 16:27:07 d4 kernel: invalid operand: 0000
>Nov 25 16:27:07 d4 kernel: CPU: 0
>Nov 25 16:27:07 d4 kernel: EIP: 0010:[reiserfs_panic+41/96] Not tainted
>Nov 25 16:27:07 d4 kernel: EFLAGS: 00010286
>Nov 25 16:27:07 d4 kernel: eax: 00000024 ebx: c02764c0 ecx: c7fb0000 edx: 00000000
>Nov 25 16:27:07 d4 kernel: esi: c3470400 edi: 00000000 ebp: c3470400 esp: c7fb1ee4
>Nov 25 16:27:07 d4 kernel: ds: 0018 es: 0018 ss: 0018
>Nov 25 16:27:07 d4 kernel: Process kupdated (pid: 7, stackpage=c7fb1000)
>Nov 25 16:27:07 d4 kernel: Stack: c027495a c031f0c0 c02764c0 c7fb1f08 c888c798 00000003 c01a83cf c3470400
>Nov 25 16:27:07 d4 kernel: c02764c0 00000011 00000012 00000010 00000000 c888c7cc c888c7c0 00000004
>Nov 25 16:27:07 d4 kernel: 00000000 00000012 c7a032c0 c01abcfe c3470400 c888c798 00000001 c7fb1fa4
>Nov 25 16:27:07 d4 kernel: Call Trace: [flush_commit_list+687/928] [do_journal_end+1982/2704] [flush_old_commits+287/320] [reiserfs_write_super+21/32] [sync_supers+191/240]
>Nov 25 16:27:07 d4 kernel: [sync_old_buffers+12/64] [kupdate+213/256] [kernel_thread+40/64]
>Nov 25 16:27:07 d4 kernel:
>Nov 25 16:27:07 d4 kernel: Code: 0f 0b 4e 01 60 49 27 c0 68 c0 f0 31 c0 85 f6 74 16 0f b7 46
>Nov 25 16:27:08 d4 kernel: SCSI disk error : host 0 channel 0 id 0 lun 7 return code = 10000
>Nov 25 16:27:08 d4 kernel: I/O error: dev 08:01, sector 50064
>Nov 25 16:27:09 d4 kernel: SCSI disk error : host 0 channel 0 id 0 lun 7 return code = 10000
>Nov 25 16:27:09 d4 kernel: I/O error: dev 08:01, sector 50072
>Nov 25 16:27:30 d4 kernel: scsi(0): LOOP UP detected
>
>
>Thanks a lot for your help !
>
>--
>Fabien