From: Andreas Olsowski <andreas.olsowski@leuphana.de>
To: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: megasas stops I/O when running kernel as dom0 under xen4.1/4.2
Date: Thu, 11 Aug 2011 15:59:39 +0200
Message-ID: <4E43E04B.8010401@leuphana.de>
Hello xen-devel,
as one of the people using Dell servers I am aware that the LSI megaraid
drivers are quite old in the current 2.6.32 pvops tree,
but it seems that, once again, I have run into problems that are
rarer than the usual "can't find disk" issues (of which I had none, ever).
The situation:
--------------
I have 2 dom0 kernels, 2.6.32.44 and 3.0.1, that work fine when booted
bare-metal. I can run stress -m 40 -d 4 -i 1 for hours on end without
any error occurring.
The 2.6.32.44 kernels use version 00.00.05.30 megasas modules.
When I boot that kernel on my R610 servers under xen (4.1 and 4.2) the
kernels work fine too. I create 10 virtual machines, each running 4
"stress -m 40" and can do disk I/O on my local storage as much as I want to.
But on my Dell R710 system things don't look so good.
Booted bare-metal, both kernels work fine.
When I boot them as dom0 under xen, everything seems to be okay at first.
Then I create my 10 virtual machines that put some load on the memory.
And as soon as I do I/O to the local disk (even an "ls /usr/src/" can
suffice), I/O freezes and the system stops responding to anything that
requires disk access.
After a while the kernel will start spewing out error messages:
#### lots of these
sd 0:2:0:0: [sda] megasas: RESET -83318 cmd=2a retries=0
megaraid_sas: HBA reset handler invoked without an internal reset condition.
megasas: [ 0]waiting for 16 commands to complete
megaraid_sas: no more pending commands remain after reset handling.
megasas: reset successful
###
### then some of these
sd 0:2:0:0: Device offlined - not ready after error recovery
###
### goes on to
sd 0:2:0:0: [sda] Unhandled error code
sd 0:2:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
sd 0:2:0:0: [sda] CDB: Write(10): 2a 00 08 45 6f 00 00 01 88 00
end_request: I/O error, dev sda, sector 138768128
Buffer I/O error on device sda2, logical block 5138912
lost page write due to I/O error on sda2
Buffer I/O error on device sda2, logical block 5138913
###
### and finally these, as often as one tries to access the disk
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
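As a cross-check on the log above, the failed command can be decoded by hand: the CDB bytes carry the same start sector that end_request reports. A minimal Python sketch, assuming the standard 10-byte SCSI read/write layout (opcode in byte 0, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and 512-byte logical blocks; the helper name decode_rw10 is made up for illustration and is not part of any driver or tool:

```python
def decode_rw10(cdb_hex: str) -> dict:
    """Decode a 10-byte SCSI Read(10)/Write(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is accepted
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcodes = {0x28: "Read(10)", 0x2A: "Write(10)"}
    if cdb[0] not in opcodes:
        raise ValueError(f"not a Read(10)/Write(10) opcode: {cdb[0]:#04x}")
    return {
        "opcode": opcodes[cdb[0]],
        # Bytes 2..5: logical block address, big-endian
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: number of blocks to transfer, big-endian
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# CDB copied from the "CDB: Write(10)" line in the log
info = decode_rw10("2a 00 08 45 6f 00 00 01 88 00")
print(info["opcode"], info["lba"], info["blocks"])
# Assumption: 512-byte logical blocks
print(info["blocks"] * 512, "bytes")
```

The decoded LBA is 138768128, the same sector named in the end_request line, and the transfer length is 392 blocks. So the write that timed out is the one that produced the subsequent buffer I/O errors on sda2.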
If a kernel works fine on one set of servers (Dell R610 with LSI Logic /
Symbios Logic LSI MegaSAS 9260 (rev 05) raid controllers) and crashes on
another server (Dell R710 with an LSI Logic / Symbios Logic MegaRAID SAS
1078 (rev 04) raid controller),
it would seem logical to assume that the kernel does not support the
hardware properly.
But when run bare-metal, no errors occur.
I for one have run out of things to try. The R710 worked fine before I
upgraded its firmware to the most current versions and went from
xen4.0.1 to xen4.1/4.2.
So I put it to you, fine sirs of xen-devel:
is it:
A.) a hardware problem, because the software works on different hardware
or
B.) a xen problem, because the hardware runs fine in a non-virtualized
scenario with the same kernel
Or is it something else entirely?
Help, input, questions and suggestions are, as always, greatly appreciated.
With best regards
--
Andreas Olsowski
Leuphana Universität Lüneburg
Rechen- und Medienzentrum
Scharnhorststraße 1, C7.015
21335 Lüneburg
Tel: ++49 4131 677 1309