* Dom0 panic on xen 3.4.3
@ 2010-08-20 18:15 Cris Daniluk
2010-08-20 23:59 ` Pasi Kärkkäinen
0 siblings, 1 reply; 8+ messages in thread
From: Cris Daniluk @ 2010-08-20 18:15 UTC (permalink / raw)
To: xen-devel
I'm running Xen 3.4.3 final and observing the following kernel panic
when I fire up the 9th domU. dom0 is CentOS 5.5 running a XenLinux
kernel. grub.conf entry is below, followed by the dump. The kernel
crash is immediately after the 9th domU is unpaused.
title Xen 3.4.3 CentOS (2.6.18-194.11.1.el5xen)
root (hd0,0)
kernel /xen-3.4.3.gz dom0_mem=768M loglvl=all guest_loglvl=all
com1=38400
0,8n1 console=com1 sync_console noreboot
module /vmlinuz-2.6.18-194.11.1.el5xen ro root=/dev/vg_internal/lv_base
console=hvc0,38400 norhgb netloop.nloopbacks=100
module /mpp-2.6.18-194.11.1.el5xen.img
title CentOS (2.6.18-194.11.1.el5) with MPP
root (hd0,0)
Unable to handle kernel paging request at ffff88002e864000 RIP:
[<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
PGD 11a5067 PUD 11a6067 PMD 131b067 PTE 0
Oops: 0000 [1] SMP
last sysfs file: /devices/xen-backend/vbd-2-51712/statistics/wr_sect
CPU 2
Modules linked in: xt_tcpudp xt_state ip_conntrack nfnetlink
xt_physdev bridge nfs lockd fscache nfs_acl sunrpc iptable_filter
ip_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad
ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio
cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2
scsi_transport_iscsi dm_round_robin dm_multipath scsi_dh video
backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery
asus_acpi ac blkbk netbk blktap pciback parport_pc lp parport joydev
cdc_ether serial_core usbnet pcspkr bnx2 ide_cd i2c_i801 e1000e
i2c_core cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache
dm_snapshot dm_zero dm_mirror dm_log dm_mod mppVhba(U) ata_piix libata
shpchp megaraid_sas mppUpper(U) sg sd_mod scsi_mod ext3 jbd uhci_hcd
ohci_hcd ehci_hcd
Pid: 0, comm: swapper Tainted: G 2.6.18-194.11.1.el5xen #1
RIP: e030:[<ffffffff8041a95f>] [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
RSP: e02b:ffff88000112fdd0 EFLAGS: 00010246
RAX: 0000000000000042 RBX: ffff88001ef04480 RCX: 0000000000000b50
RDX: ffff8800238f0d00 RSI: ffff88002e864000 RDI: ffff880027bc0000
RBP: 0000000000000000 R08: ffff8800238f0d10 R09: 0000000000000b92
R10: 0000000000000b50 R11: 0000000000000000 R12: 0000000000000042
R13: 0000000000000042 R14: 0000000000000000 R15: ffff880027bc0000
FS: 00002b2e6ab313f0(0000) GS:ffffffff805d2100(0000) knlGS:0000000000000000
CS: e033 DS: 002b ES: 002b
Process swapper (pid: 0, threadinfo ffff880006186000, task ffff8800000657e0)
Stack: ffff88002c8b27c0 ffff88002b7c8b80 0000000000000b50 ffff88001e5abe80
ffff8800289cf500 ffff88001ef04480 ffff880001be3200 ffffffff8836b732
ffff8800289cf000 0000000000000042
Call Trace:
<IRQ> [<ffffffff8836b732>] :netbk:netif_be_start_xmit+0x241/0x471
[<ffffffff8041fbde>] dev_hard_start_xmit+0x1b7/0x28a
[<ffffffff8042fff5>] __qdisc_run+0x136/0x1f9
[<ffffffff803b348c>] unmask_evtchn+0x2d/0xd7
[<ffffffff80420f33>] net_tx_action+0xc9/0xf1
[<ffffffff80212cd3>] __do_softirq+0x8d/0x13b
[<ffffffff80260da4>] call_softirq+0x1c/0x278
[<ffffffff8026e0c1>] do_softirq+0x31/0x98
[<ffffffff8026df4d>] do_IRQ+0xec/0xf5
[<ffffffff803b3e14>] evtchn_do_upcall+0x13b/0x1fb
[<ffffffff802608d6>] do_hypervisor_callback+0x1e/0x2c
<EOI> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
[<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
[<ffffffff8026f4eb>] raw_safe_halt+0x84/0xa8
[<ffffffff8026ca80>] xen_idle+0x38/0x4a
[<ffffffff8024add7>] cpu_idle+0x97/0xba
Code: f3 a4 0f 84 a2 00 00 00 45 01 d4 49 89 ff 41 ff c6 45 89 cd
RIP [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
RSP <ffff88000112fdd0>
CR2: ffff88002e864000
test6: no IPv6 routers present
<0>Kernel panic - not syncing: Fatal exception
BUG: warning at
arch/x86_64/kernel/genapic_xen.c:92/xen_send_IPI_mask() (Tainted: G
)
* Re: Dom0 panic on xen 3.4.3
2010-08-20 18:15 Dom0 panic on xen 3.4.3 Cris Daniluk
@ 2010-08-20 23:59 ` Pasi Kärkkäinen
2010-08-21 0:20 ` Cris Daniluk
0 siblings, 1 reply; 8+ messages in thread
From: Pasi Kärkkäinen @ 2010-08-20 23:59 UTC (permalink / raw)
To: Cris Daniluk; +Cc: xen-devel
On Fri, Aug 20, 2010 at 02:15:25PM -0400, Cris Daniluk wrote:
> I'm running Xen 3.4.3 final and observing the following kernel panic
> when I fire up the 9th domU. dom0 is CentOS 5.5 running a XenLinux
> kernel. grub.conf entry is below, followed by the dump. The kernel
> crash is immediately after the 9th domU is unpaused.
>
> title Xen 3.4.3 CentOS (2.6.18-194.11.1.el5xen)
> root (hd0,0)
> kernel /xen-3.4.3.gz dom0_mem=768M loglvl=all guest_loglvl=all
> com1=38400
> 0,8n1 console=com1 sync_console noreboot
> module /vmlinuz-2.6.18-194.11.1.el5xen ro root=/dev/vg_internal/lv_base
> console=hvc0,38400 norhgb netloop.nloopbacks=100
> module /mpp-2.6.18-194.11.1.el5xen.img
> title CentOS (2.6.18-194.11.1.el5) with MPP
> root (hd0,0)
>
So I assume this crash doesn't happen if you use the el5 default Xen hypervisor?
Have you tried the latest linux-2.6.18-xen (from xen.org)?
-- Pasi
>
>
> Unable to handle kernel paging request at ffff88002e864000 RIP:
> [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
> PGD 11a5067 PUD 11a6067 PMD 131b067 PTE 0
> Oops: 0000 [1] SMP
> last sysfs file: /devices/xen-backend/vbd-2-51712/statistics/wr_sect
> CPU 2
> Modules linked in: xt_tcpudp xt_state ip_conntrack nfnetlink
> xt_physdev bridge nfs lockd fscache nfs_acl sunrpc iptable_filter
> ip_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad
> ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio
> cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2
> scsi_transport_iscsi dm_round_robin dm_multipath scsi_dh video
> backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery
> asus_acpi ac blkbk netbk blktap pciback parport_pc lp parport joydev
> cdc_ether serial_core usbnet pcspkr bnx2 ide_cd i2c_i801 e1000e
> i2c_core cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache
> dm_snapshot dm_zero dm_mirror dm_log dm_mod mppVhba(U) ata_piix libata
> shpchp megaraid_sas mppUpper(U) sg sd_mod scsi_mod ext3 jbd uhci_hcd
> ohci_hcd ehci_hcd
> Pid: 0, comm: swapper Tainted: G 2.6.18-194.11.1.el5xen #1
> RIP: e030:[<ffffffff8041a95f>] [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
> RSP: e02b:ffff88000112fdd0 EFLAGS: 00010246
> RAX: 0000000000000042 RBX: ffff88001ef04480 RCX: 0000000000000b50
> RDX: ffff8800238f0d00 RSI: ffff88002e864000 RDI: ffff880027bc0000
> RBP: 0000000000000000 R08: ffff8800238f0d10 R09: 0000000000000b92
> R10: 0000000000000b50 R11: 0000000000000000 R12: 0000000000000042
> R13: 0000000000000042 R14: 0000000000000000 R15: ffff880027bc0000
> FS: 00002b2e6ab313f0(0000) GS:ffffffff805d2100(0000) knlGS:0000000000000000
> CS: e033 DS: 002b ES: 002b
> Process swapper (pid: 0, threadinfo ffff880006186000, task ffff8800000657e0)
> Stack: ffff88002c8b27c0 ffff88002b7c8b80 0000000000000b50 ffff88001e5abe80
> ffff8800289cf500 ffff88001ef04480 ffff880001be3200 ffffffff8836b732
> ffff8800289cf000 0000000000000042
> Call Trace:
> <IRQ> [<ffffffff8836b732>] :netbk:netif_be_start_xmit+0x241/0x471
> [<ffffffff8041fbde>] dev_hard_start_xmit+0x1b7/0x28a
> [<ffffffff8042fff5>] __qdisc_run+0x136/0x1f9
> [<ffffffff803b348c>] unmask_evtchn+0x2d/0xd7
> [<ffffffff80420f33>] net_tx_action+0xc9/0xf1
> [<ffffffff80212cd3>] __do_softirq+0x8d/0x13b
> [<ffffffff80260da4>] call_softirq+0x1c/0x278
> [<ffffffff8026e0c1>] do_softirq+0x31/0x98
> [<ffffffff8026df4d>] do_IRQ+0xec/0xf5
> [<ffffffff803b3e14>] evtchn_do_upcall+0x13b/0x1fb
> [<ffffffff802608d6>] do_hypervisor_callback+0x1e/0x2c
> <EOI> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
> [<ffffffff8026f4eb>] raw_safe_halt+0x84/0xa8
> [<ffffffff8026ca80>] xen_idle+0x38/0x4a
> [<ffffffff8024add7>] cpu_idle+0x97/0xba
>
>
> Code: f3 a4 0f 84 a2 00 00 00 45 01 d4 49 89 ff 41 ff c6 45 89 cd
> RIP [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
> RSP <ffff88000112fdd0>
> CR2: ffff88002e864000
> test6: no IPv6 routers present
> <0>Kernel panic - not syncing: Fatal exception
> BUG: warning at
> arch/x86_64/kernel/genapic_xen.c:92/xen_send_IPI_mask() (Tainted: G
> )
>
* Re: Dom0 panic on xen 3.4.3
2010-08-20 23:59 ` Pasi Kärkkäinen
@ 2010-08-21 0:20 ` Cris Daniluk
2010-08-21 10:12 ` Pasi Kärkkäinen
0 siblings, 1 reply; 8+ messages in thread
From: Cris Daniluk @ 2010-08-21 0:20 UTC (permalink / raw)
To: Pasi Kärkkäinen; +Cc: xen-devel
On Fri, Aug 20, 2010 at 7:59 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> On Fri, Aug 20, 2010 at 02:15:25PM -0400, Cris Daniluk wrote:
>> I'm running Xen 3.4.3 final and observing the following kernel panic
>> when I fire up the 9th domU. dom0 is CentOS 5.5 running a XenLinux
>> kernel. grub.conf entry is below, followed by the dump. The kernel
>> crash is immediately after the 9th domU is unpaused.
>>
>> title Xen 3.4.3 CentOS (2.6.18-194.11.1.el5xen)
>> root (hd0,0)
>> kernel /xen-3.4.3.gz dom0_mem=768M loglvl=all guest_loglvl=all
>> com1=38400
>> 0,8n1 console=com1 sync_console noreboot
>> module /vmlinuz-2.6.18-194.11.1.el5xen ro root=/dev/vg_internal/lv_base
>> console=hvc0,38400 norhgb netloop.nloopbacks=100
>> module /mpp-2.6.18-194.11.1.el5xen.img
>> title CentOS (2.6.18-194.11.1.el5) with MPP
>> root (hd0,0)
>>
>
> so I assume this crash doesn't happen if you use the el5 default xen hypervisor?
>
> Have you tried using latest linux-2.6.18-xen (from xen.org) ?
>
> -- Pasi
>
The dom0 kernel is the el5 default, but I have not tried running their
hypervisor since it is so dated. I'll give it a shot with the latest
xen.org kernel.
Thanks,
Cris
* Re: Dom0 panic on xen 3.4.3
2010-08-21 0:20 ` Cris Daniluk
@ 2010-08-21 10:12 ` Pasi Kärkkäinen
2010-08-21 22:27 ` Cris Daniluk
0 siblings, 1 reply; 8+ messages in thread
From: Pasi Kärkkäinen @ 2010-08-21 10:12 UTC (permalink / raw)
To: Cris Daniluk; +Cc: xen-devel
On Fri, Aug 20, 2010 at 08:20:54PM -0400, Cris Daniluk wrote:
> On Fri, Aug 20, 2010 at 7:59 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> > On Fri, Aug 20, 2010 at 02:15:25PM -0400, Cris Daniluk wrote:
> >> I'm running Xen 3.4.3 final and observing the following kernel panic
> >> when I fire up the 9th domU. dom0 is CentOS 5.5 running a XenLinux
> >> kernel. grub.conf entry is below, followed by the dump. The kernel
> >> crash is immediately after the 9th domU is unpaused.
> >>
> >> title Xen 3.4.3 CentOS (2.6.18-194.11.1.el5xen)
> >> root (hd0,0)
> >> kernel /xen-3.4.3.gz dom0_mem=768M loglvl=all guest_loglvl=all
> >> com1=38400
> >> 0,8n1 console=com1 sync_console noreboot
> >> module /vmlinuz-2.6.18-194.11.1.el5xen ro root=/dev/vg_internal/lv_base
> >> console=hvc0,38400 norhgb netloop.nloopbacks=100
> >> module /mpp-2.6.18-194.11.1.el5xen.img
> >> title CentOS (2.6.18-194.11.1.el5) with MPP
> >> root (hd0,0)
> >>
> >
> > so I assume this crash doesn't happen if you use the el5 default xen hypervisor?
> >
> > Have you tried using latest linux-2.6.18-xen (from xen.org) ?
> >
> > -- Pasi
> >
>
> The dom0 kernel is the el5 default, but I have not tried running their
> hypervisor since it is so dated. I'll give a shot with the latest
> xen.org kernel.
>
Well, the version number looks old (3.1.2), but it has a lot of fixes and
backports from newer Xen versions.
-- Pasi
* Re: Dom0 panic on xen 3.4.3
2010-08-21 10:12 ` Pasi Kärkkäinen
@ 2010-08-21 22:27 ` Cris Daniluk
2010-11-01 10:28 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Cris Daniluk @ 2010-08-21 22:27 UTC (permalink / raw)
To: Pasi Kärkkäinen; +Cc: xen-devel
On Sat, Aug 21, 2010 at 6:12 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> On Fri, Aug 20, 2010 at 08:20:54PM -0400, Cris Daniluk wrote:
>> On Fri, Aug 20, 2010 at 7:59 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
>> > On Fri, Aug 20, 2010 at 02:15:25PM -0400, Cris Daniluk wrote:
>> >> I'm running Xen 3.4.3 final and observing the following kernel panic
>> >> when I fire up the 9th domU. dom0 is CentOS 5.5 running a XenLinux
>> >> kernel. grub.conf entry is below, followed by the dump. The kernel
>> >> crash is immediately after the 9th domU is unpaused.
>> >>
>> >> title Xen 3.4.3 CentOS (2.6.18-194.11.1.el5xen)
>> >> root (hd0,0)
>> >> kernel /xen-3.4.3.gz dom0_mem=768M loglvl=all guest_loglvl=all
>> >> com1=38400
>> >> 0,8n1 console=com1 sync_console noreboot
>> >> module /vmlinuz-2.6.18-194.11.1.el5xen ro root=/dev/vg_internal/lv_base
>> >> console=hvc0,38400 norhgb netloop.nloopbacks=100
>> >> module /mpp-2.6.18-194.11.1.el5xen.img
>> >> title CentOS (2.6.18-194.11.1.el5) with MPP
>> >> root (hd0,0)
>> >>
>> >
>> > so I assume this crash doesn't happen if you use the el5 default xen hypervisor?
>> >
>> > Have you tried using latest linux-2.6.18-xen (from xen.org) ?
>> >
>> > -- Pasi
>> >
>>
>> The dom0 kernel is the el5 default, but I have not tried running their
>> hypervisor since it is so dated. I'll give a shot with the latest
>> xen.org kernel.
>>
>
> Well the version number looks old (3.1.2), but it has a lot of fixes and backports
> from newer xen versions.
>
> -- Pasi
>
>
Having the same issue with the EL5 RPM as well. Latest 2.6.18 on
xen.org seems to be a little broken. The LSI controller isn't getting
detected after the megaraid_sas module loads. The initrds look
identical between the two, so I'm assuming there is some obscure bug
that was fixed and backported into the EL5 kernel but not Jeremy's. I
suppose I can try a newer pvops kernel if you think that might be
relevant.
Seems to be fairly consistently crashing regardless of Xen version,
though. Also, interestingly, it seems it may be more about the VMs
starting at once than the specific number of VMs.
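
One way to check whether simultaneous starts matter more than the guest count is to start the guests one at a time with a pause in between and compare that against starting them all together. The following is only a sketch under assumptions: the config paths, the 30-second delay and the helper names are illustrative and not taken from this setup. By default it only prints the `xm create` commands it would run.

```shell
#!/bin/sh
# Hypothetical test harness: start guests one at a time with a pause
# between them, to compare against starting them all at once.
# Config paths and the delay are assumptions, not values from the thread.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands
DELAY="${DELAY:-30}"      # seconds to wait between guest starts

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start every config given as an argument, sleeping between starts.
start_staggered() {
    first=1
    for cfg in "$@"; do
        if [ "$first" = "0" ]; then
            run sleep "$DELAY"
        fi
        first=0
        run xm create "$cfg"
    done
}

start_staggered /etc/xen/guest1 /etc/xen/guest2 /etc/xen/guest3
```

Running the script with `DRY_RUN=0` executes the commands instead of printing them. If dom0 survives the staggered start but panics when the same guests are started back to back, that points at the concurrent start rather than the ninth guest as such.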
Cris
* Re: Dom0 panic on xen 3.4.3
2010-08-21 22:27 ` Cris Daniluk
@ 2010-11-01 10:28 ` Paolo Bonzini
2010-12-28 0:50 ` prickett233
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2010-11-01 10:28 UTC (permalink / raw)
To: Cris Daniluk; +Cc: Drew Jones, dwu, xen-devel
On 08/22/2010 12:27 AM, Cris Daniluk wrote:
> Having the same issue with the EL5 RPM as well. Latest 2.6.18 on
> xen.org seems to be a little broken. The LSI controller isn't getting
> detected after the megaraid_sas module loads. The initrds look
> identical between the two, so I'm assuming there is some obscure bug
> that was fixed and backported into the EL5 kernel but not Jeremy's. I
> suppose I can try a newer pvops kernel if you think that might be
> relevant.
>
> Seems to be fairly consistently crashing regardless of Xen version,
> though. Also interestingly it seems it may be more about the VMs
> starting at once than the specific number of VMs.
Hi Cris,
I am a virtualization engineer at Red Hat. We encountered this bug
recently but we are not able to reproduce it consistently. Can you do
so? If so, I could try giving you a test EL5 kernel (so that you can
boot) with upstream's netback driver to test it.
Thanks,
Paolo
* Re: Dom0 panic on xen 3.4.3
2010-11-01 10:28 ` Paolo Bonzini
@ 2010-12-28 0:50 ` prickett233
2010-12-28 10:38 ` Pasi Kärkkäinen
0 siblings, 1 reply; 8+ messages in thread
From: prickett233 @ 2010-12-28 0:50 UTC (permalink / raw)
To: xen-devel
We also see this same problem: the dom0 crashes when multiple domUs start
up at the same time. Our setup is a pair of CentOS 5.5 servers running
LVM/DRBD/Xen. We have tried the default EL Xen version 3.0.3-105 as well
as newer versions (3.4.0 and 3.4.3) from the GITCO repo
[ http://www.gitco.de/repo/ ], and the crashes occur regardless of Xen
version.
From the kernel crash output we found a related post here [
http://lists.linbit.com/pipermail/drbd-user/2009-March/011652.html ] that
discusses a problem with I/O and network interface affecting DRBD. By
turning off scatter/gather on the dom0's network interface we no longer
experience crashes in the dom0.
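
For reference, scatter/gather is one of the offload settings that `ethtool -K` can toggle. The sketch below is an assumption about how such a workaround might be applied, not the exact commands used here: the interface name `eth0` and the helper name `sg_off` are illustrative. If dom0's `eth0` is a virtual interface set up by Xen's bridging scripts, the physical NIC may have a different name. By default the helper only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: build the ethtool command that disables
# scatter/gather on one interface. With DRY_RUN=1 (the default) the
# command is only printed; with DRY_RUN=0 it is executed (needs root).
DRY_RUN="${DRY_RUN:-1}"

sg_off() {
    iface="$1"
    cmd="ethtool -K $iface sg off"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# "ethtool -k eth0" (lowercase k) lists the current offload settings,
# which is useful to check before and after the change.
sg_off eth0
```

A setting changed this way does not survive a reboot or a driver reload, so it has to be reapplied, for example from a boot script.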
We came across this crash when migrating our Xen VMs to newer servers; on
our older servers the dom0 did not crash. As the issue appears to be
related to the network driver/chipset, here are the NIC differences
between the two sets of servers, for the record and in the hope that they
are useful to others:
----
Old servers (CentOS 5.3) without dom0 crashing, uses tg3 network driver:
$ uname -a
Linux ldx12020 2.6.18-128.4.1.el5xen #1 SMP Tue Aug 4 20:51:12 EDT 2009
x86_64 x86_64 x86_64 GNU/Linux
$ lspci
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit
Ethernet PCI Express
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit
Ethernet PCI Express
$ ethtool -i eth0
driver: tg3
version: 3.93
firmware-version: 5722-v3.07, ASFIPMI v6.02
bus-info: 0000:03:00.0
----
New servers with dom0 crashing, uses bnx2 network driver:
$ uname -a
Linux tnx176 2.6.18-194.26.1.el5xen #1 SMP Tue Nov 9 13:35:30 EST 2010
x86_64 x86_64 x86_64 GNU/Linux
$ lspci
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
Gigabit Ethernet (rev 20)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
Gigabit Ethernet (rev 20)
$ ethtool -i eth0
Cannot get driver information: Operation not supported
dmesg output: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.2 (Aug
21, 2009)
----
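
Where `ethtool -i` reports "Operation not supported", the bound driver can often still be read from sysfs. This is a minimal sketch, assuming the kernel exposes a `device/driver` symlink for the interface. The helper name `nic_driver` and the `SYSFS_ROOT` override are illustrative additions. A purely virtual interface has no such link, in which case the physical interface name should be tried instead.

```shell
#!/bin/sh
# Hypothetical helper: report the kernel driver bound to a NIC by
# reading the sysfs "driver" symlink. SYSFS_ROOT is an assumption
# added so the function can be pointed at a test tree.
nic_driver() {
    link="${SYSFS_ROOT:-/sys}/class/net/$1/device/driver"
    # Virtual interfaces have no backing device, hence no driver link.
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}

nic_driver eth0 || echo "no driver link found for eth0"
```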
Hopefully we can see this bug fixed.
Cheers,
Paul
* Re: Re: Dom0 panic on xen 3.4.3
2010-12-28 0:50 ` prickett233
@ 2010-12-28 10:38 ` Pasi Kärkkäinen
0 siblings, 0 replies; 8+ messages in thread
From: Pasi Kärkkäinen @ 2010-12-28 10:38 UTC (permalink / raw)
To: prickett233; +Cc: xen-devel
On Mon, Dec 27, 2010 at 04:50:04PM -0800, prickett233 wrote:
>
> We also experience this same problem with the dom0 crashing when multiple
> domU's are starting up at the same time. Our setup consists of a dual pair
> CentOS 5.5 running LVM/DRBD/Xen and have tried the default EL Xen version
> 3.0.3-105 including other newer versions (3.4.0 and 3.4.3) from the GITCO
> repo [ http://www.gitco.de/repo/ ] with the crashing still occurring
> independent of Xen version.
>
> From the kernel crash output we found a related post here [
> http://lists.linbit.com/pipermail/drbd-user/2009-March/011652.html ] that
> discusses a problem with I/O and network interface affecting DRBD. By
> turning off scatter/gather on the dom0's network interface we no longer
> experience crashes in the dom0.
>
> We came across this crashing issue when migrating our Xen VM's to newer
> servers, but when running on our older servers the dom0 did not crash. As
> this issue appears to be related to the network driver/chipset, for the
> record and hopefully our results will be useful to others, the difference
> between these servers NIC is;
>
> ----
>
> Old servers (CentOS 5.3) without dom0 crashing, uses tg3 network driver:
>
> $ uname -a
> Linux ldx12020 2.6.18-128.4.1.el5xen #1 SMP Tue Aug 4 20:51:12 EDT 2009
> x86_64 x86_64 x86_64 GNU/Linux
>
> $ lspci
> 03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit
> Ethernet PCI Express
> 04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit
> Ethernet PCI Express
>
> $ ethtool -i eth0
> driver: tg3
> version: 3.93
> firmware-version: 5722-v3.07, ASFIPMI v6.02
> bus-info: 0000:03:00.0
>
> ----
>
> New servers with dom0 crashing, uses bnx2 network driver:
>
> $ uname -a
> Linux tnx176 2.6.18-194.26.1.el5xen #1 SMP Tue Nov 9 13:35:30 EST 2010
> x86_64 x86_64 x86_64 GNU/Linux
>
> $ lspci
> 01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
> 02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
>
> $ ethtool -i eth0
> Cannot get driver information: Operation not supported
>
> dmesg output: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.2 (Aug
> 21, 2009)
>
> ----
>
> Hopefully we can see this bug fixed.
>
>
Did you file a bug about this in Red Hat Bugzilla?
Do you have a serial console set up and logging both the hypervisor
and dom0 linux kernel, so you can see what errors you get when it crashes?
-- Pasi