From: Andreas Olsowski <andreas.olsowski@leuphana.de>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
xen-devel@lists.xensource.com
Subject: Re: [SPAM] Re: kernel BUG at arch/x86/xen/mmu.c:1860!
Date: Thu, 10 Mar 2011 14:45:02 +0100
Message-ID: <4D78D5DE.4000609@leuphana.de>
In-Reply-To: <4D77DC0A.9090705@leuphana.de>
All Xen 4.1.0 tests were run on server1 (netcatarina).
All but one of the Xen 4.0.1 tests were run on server2 (memoryana).
Why I had to rerun one of the server2 tests on server1 is explained
below.
Here are my test results:
======================================================
Kernel 2.6.32.28 without XEN:
about 50 successful runs of Teck Choon Giam's "test.sh" script
(modified to handle 10 test volumes and sleep for 2 seconds)
multipathd restarted successfully
multipath module loaded/unloaded successfully
lvm2 restarted successfully
======================================================
Kernel 2.6.38 without XEN:
about 20 successful runs of "test.sh"
multipathd restarted successfully
multipath module loaded/unloaded successfully
lvm2 restarted successfully
======================================================
Kernel 2.6.32.28 with XEN 4.0.1:
at about loop 2 for volume 7 of "test.sh" it stopped doing ... well, anything.
There was no output on the screen and no entry in either syslog or dmesg.
I left it hanging for about 15 minutes until I decided to write this one
off as a side effect of the same underlying problem.
All lvm2 tools stopped working and I couldn't shut it down.
Killing the hanging process ended it properly.
I did a cold reset of the server, as I wanted to see the discussed BUG
again, but I failed here.
It would seem like my server2 has some kind of addressing error:
pci 0000:04:00.1: BAR 6: address space collision of device ....
0000:04:00.1: is one of my QLogic HBAs
And since I use centralized FC storage ... who knows what side effects
happened here.
Interestingly enough, I had no problems with kernel 2.6.38 on this machine.
So I downgraded server1, which never showed this message, to Xen 4.0.1
and ran the test:
after 2 loops, at volume 5, I hit "kernel BUG at arch/x86/xen/mmu.c" again.
======================================================
Kernel 2.6.38 with XEN 4.0.1:
100 runs of test.sh without error
multipathd restarted successfully
multipath module loaded/unloaded successfully
lvm2 stop/start ok
======================================================
Kernel 2.6.32.28 with XEN 4.1.0-rc7:
booted at first:
crash after only 5 iterations of "test.sh"
http://pastebin.com/uNL7ehZ8
Later, after having booted 2.6.38 on this server to test it with Xen
4.1, I encountered a different error at boot time:
BUG: unable to handle kernel paging request at ffff8800cc3e5f48
Only have pictures of it:
http://141.39.208.101/err1.png
http://141.39.208.101/err2.png
I then did a cold boot of the server, as this has proven to make it boot
in the past.
When this did not help, I stopped the test.sh running on my other
server, because the hang came when lvm2 was started and the servers use
shared storage.
Apparently this helped: the server booted fine after another cold reset.
After that I hit an error again at loop 10 of "test.sh". This time it was
not the "kernel BUG at arch/x86/xen/mmu.c" but, once more,
"BUG: unable to handle kernel paging request at ffff8800cc61ce010"
http://141.39.208.101/err3.png
http://141.39.208.101/err4.png
======================================================
Kernel 2.6.38 with XEN 4.1.0-rc7:
100 runs of test.sh without error
multipathd restarted successfully
multipath module loaded/unloaded successfully
lvm2 stop/start ok
======================================================
Summary
======================================================
So that's two different errors I have encountered:
one is the "kernel BUG at arch/x86/xen/mmu.c", the other is
"BUG: unable to handle kernel paging request".
Both only occur with 2.6.32 when running under either Xen 4.0.1 or 4.1.
On its own the kernel works fine.
Kernel 2.6.38 ran fine on both hypervisors as well as on its own.
One other issue occurred that I didn't expect:
With the same .config (make oldconfig), 2.6.38 left my screen black
after loading the kernel, on both hypervisors.
The servers worked just fine, I just didn't see any output on their VGA
ports.
I hope this information helps you to hunt this bug down as it
effectively makes the "default" Xen unusable in server situations where
the device mapper is involved.
It is puzzling to me why no one noticed it last year. Am I the only
one running Xen on server hardware (Dell R610, 710 and 2950) with
centralized storage (FibreChannel or iSCSI) and using it as an environment
for production?
Is multipathing two links to a centralized storage and using LVM2 to
split it up for virtual machines running on two or more servers really
such a rare thing to find Xen running on?
Btw, who is currently working on the Remus implementation?
If you should need any more testing from me, feel free to ask.
Best regards.
--
Andreas Olsowski