From: Claus Rosenberger <claus.rosenberger@rocnet.de>
To: xen-devel@lists.xensource.com
Subject: Re: Freeze with 2.6.32.19 and xen-4.0.1rc5
Date: Sun, 22 Aug 2010 00:08:53 +0200
Message-ID: <4C704E75.2050200@rocnet.de>
In-Reply-To: <20100821140234.GX2804@reaktio.net>
On 21.08.2010 16:02, Pasi Kärkkäinen wrote:
> On Sat, Aug 21, 2010 at 03:47:57PM +0200, Claus Rosenberger wrote:
>> Hi,
>>
>> i have big trouble with a Debian Lenny dom0 and latest kernel 2.6.32.19
>> with xen-4.0.1rc5. Due some reason the system freezes from time to time.
>> I used kernel 2.6.31.9 with xen-3.4.2 before. The machine doesn't write
>> anything to serial console so there are no errors or something like that.
>>
>> Perhaps there is something to see from the logs ...
>>
> Hello,
>
> A couple of questions:
>
> - Do you use PCI passthru?
I tried it, but I have disabled it for now to avoid mixing up too many issues.
> - Is there something special happening when it freezes?
Last time it happened while creating filesystems, so perhaps it is
related to disk activity. At the end of this mail I describe the disk
problems in more detail.
> - Does it freeze at regular intervals, at the same time/uptime, or randomly?
It is random; sometimes it happens, sometimes it does not.
> - By freezing you mean it doesn't respond to anything? Or does it reboot?
When it freezes I cannot do anything. I can only connect via iAMT and
reboot, nothing else.
> - Can you try using the old 2.6.31.9 kernel with the new xen hypervisor?
Sure.
> -- Pasi
>
>
>> Configuration Grub
>>
>> title Xen 4.0-amd64 / Debian GNU/Linux, kernel 2.6.32.19
>> root (hd0,0)
>> kernel /boot/xen-4.0-amd64.gz dom0_mem=524288 cpufreq=xen
>> cpuidle console=com1 com1=115200,8n1,0xf1c0,0 sync_console
> Try adding "loglvl=all guest_loglvl=all" for xen.gz.
Sure.
>> module /boot/vmlinuz-2.6.32.19 root=/dev/md0 ro console=tty0
>> console=hvc0
>> module /boot/initrd.img-2.6.32.19
>>
> And try adding "nomodeset" for dom0 kernel (vmlinuz).
What is that parameter for?
I replaced the disk because there was an error on the previous one. There
is now a brand-new disk on sata2, and I see the following in my console
log. I cannot believe it is a disk problem; perhaps it is a disk
controller problem instead, or something in the kernel. I will add the
parameters and power-cycle the machine to start from scratch.
Claus
[17392.097849] sd 1:0:0:0: [sdb] Unhandled error code
[17392.100047] BUG: soft lockup - CPU#0 stuck for 66s! [swapper:0]
[17392.100049] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables
xen_evtchn xenfs 8021q garp bridge stp coretemp lm85 hwmon_vid loop
evdev video output tpm_tis tpm snd_pcsp tpm_bios psmouse snd_pcm
serio_raw snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core
processor button acpi_processor ext3 jbd mbcache dm_mirror
dm_region_hash dm_log dm_snapshot dm_mod raid1 md_mod sd_mod crc_t10dif
ata_piix ehci_hcd uhci_hcd ata_generic libata usbcore nls_base scsi_mod
e1000e thermal fan thermal_sys [last unloaded: scsi_wait_scan]
[17392.100088] CPU 0:
[17392.100089] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables
xen_evtchn xenfs 8021q garp bridge stp coretemp lm85 hwmon_vid loop
evdev video output tpm_tis tpm snd_pcsp tpm_bios psmouse snd_pcm
serio_raw snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core
processor button acpi_processor ext3 jbd mbcache dm_mirror
dm_region_hash dm_log dm_snapshot dm_mod raid1 md_mod sd_mod crc_t10dif
ata_piix ehci_hcd uhci_hcd ata_generic libata usbcore nls_base scsi_mod
e1000e thermal fan thermal_sys [last unloaded: scsi_wait_scan]
[17392.100120] Pid: 0, comm: swapper Not tainted 2.6.32.19 #2
[17392.100122] RIP: e030:[<ffffffff8100928a>] [<ffffffff8100928a>]
hypercall_page+0x28a/0x1001
[17392.100129] RSP: e02b:ffff880002f38df8 EFLAGS: 00000a07
[17392.100130] RAX: 0000000000000000 RBX: ffffc900081d2060 RCX:
ffffffff8100928a
[17392.100132] RDX: 0000000000000001 RSI: ffffc900081d51c0 RDI:
0000000000000001
[17392.100134] RBP: ffffc900081d2198 R08: 0000000000000000 R09:
0000000000000000
[17392.100135] R10: 0000000000015640 R11: 0000000000000a07 R12:
0000000000000003
[17392.100137] R13: 0000000000004620 R14: 0000000000000021 R15:
6db6db6db6db6db7
[17392.100142] FS: 00007f3b93f6a6e0(0000) GS:ffff880002f35000(0000)
knlGS:0000000000000000
[17392.100144] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[17392.100145] CR2: 00007f3b93f69000 CR3: 000000001efba000 CR4:
0000000000002660
[17392.100147] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[17392.100149] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[17392.100151] Call Trace:
[17392.100153] <IRQ> [<ffffffff811fd74a>] ? net_tx_action+0x294/0x9be
[17392.100160] [<ffffffff8100eadf>] ? xen_restore_fl_direct_end+0x0/0x1
[17392.100164] [<ffffffff8109786c>] ? check_for_new_grace_period+0x9e/0xa8
[17392.100167] [<ffffffff81052bc3>] ? tasklet_action+0x77/0xd3
[17392.100170] [<ffffffff81054352>] ? __do_softirq+0xe0/0x1a2
[17392.100173] [<ffffffff811ee9ff>] ? __xen_evtchn_do_upcall+0x12a/0x16c
[17392.100176] [<ffffffff81012bec>] ? call_softirq+0x1c/0x30
[17392.100179] [<ffffffff81014813>] ? do_softirq+0x3f/0x7c
[17392.100181] [<ffffffff810541b3>] ? irq_exit+0x36/0x79
[17392.100184] [<ffffffff811eeeb0>] ? xen_evtchn_do_upcall+0x35/0x42
[17392.100186] [<ffffffff81012c3e>] ? xen_do_hypervisor_callback+0x1e/0x30
[17392.100187] <EOI> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
[17392.100191] [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
[17392.100194] [<ffffffff8100e454>] ? xen_safe_halt+0xc/0x15
[17392.100196] [<ffffffff8100bf15>] ? xen_idle+0x35/0x40
[17392.100199] [<ffffffff81010c13>] ? cpu_idle+0xa3/0xdd
[17392.100203] [<ffffffff814f3cdb>] ? start_kernel+0x3da/0x3e5
[17392.100205] [<ffffffff814f5b83>] ? xen_start_kernel+0x5e6/0x5ea
[17392.103027] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
frozen
[17392.103031] ata1.00: failed command: READ DMA EXT
[17392.103035] ata1.00: cmd 25/00:00:5d:88:39/00:04:3d:00:00/e0 tag 0
dma 524288 in
[17392.103036] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[17392.103037] ata1.00: status: { DRDY }
[17392.103052] ata1.00: hard resetting link
[17392.372433] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_TIMEOUT
[17392.376416] sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 3d 39 7f dd 00 04
00 00
[17392.380089] end_request: I/O error, dev sdb, sector 1027178461
[17392.384569] raid1: Disk failure on sdb3, disabling device.
[17392.384569] raid1: Operation continuing on 1 devices.
[17392.389710] BUG: soft lockup - CPU#1 stuck for 66s! [scsi_eh_1:538]
[17392.389710] Modules linked in:
[17392.393352] md: md2: resync done.
[17392.393348] nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack
xt_physdev iptable_filter ip_tables x_tables xen_evtchn xenfs 8021q garp
bridge stp coretemp lm85 hwmon_vid loop evdev video output tpm_tis tpm
snd_pcsp tpm_bios psmouse snd_pcm serio_raw snd_timer snd soundcore
snd_page_alloc i2c_i801 i2c_core processor button acpi_processor ext3
jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod raid1
md_mod sd_mod crc_t10dif ata_piix ehci_hcd uhci_hcd ata_generic libata
usbcore nls_base scsi_mod e1000e thermal fan thermal_sys [last unloaded:
scsi_wait_scan]
[17392.420099] CPU 1:
[17392.420099] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables
xen_evtchn xenfs 8021q garp bridge stp coretemp lm85 hwmon_vid loop
evdev video output tpm_tis tpm snd_pcsp tpm_bios psmouse snd_pcm
serio_raw snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core
processor button acpi_processor ext3 jbd mbcache dm_mirror
dm_region_hash dm_log dm_snapshot dm_mod raid1 md_mod sd_mod crc_t10dif
ata_piix ehci_hcd uhci_hcd ata_generic libata usbcore nls_base scsi_mod
e1000e thermal fan thermal_sys [last unloaded: scsi_wait_scan]
[17392.448018] Pid: 538, comm: scsi_eh_1 Not tainted 2.6.32.19 #2
[17392.452071] RIP: e030:[<ffffffff8100922a>] [<ffffffff8100922a>]
hypercall_page+0x22a/0x1001
[17392.456082] RSP: e02b:ffff88000205bbc8 EFLAGS: 00000246
[17392.460069] RAX: 0000000000040000 RBX: ffff880002353000 RCX:
ffffffff8100922a
[17392.464074] RDX: 000000000000d729 RSI: 0000000000000000 RDI:
0000000000000000
[17392.464074] RBP: ffff880002312000 R08: 0000000000000001 R09:
00000000000000fa
[17392.468023] R10: ffff88000206d170 R11: 0000000000000246 R12:
ffff880002338000
[17392.468023] R13: ffff880002353048 R14: ffff88001e7f0900 R15:
ffff880002698000
[17392.468023] FS: 00007f3b93f6a6e0(0000) GS:ffff880002f52000(0000)
knlGS:0000000000000000
[17392.472075] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[17392.472075] CR2: 00007f3b9356d1a4 CR3: 000000001f657000 CR4:
0000000000002660
[17392.472075] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[17392.476070] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[17392.476070] Call Trace:
[17392.476070] [<ffffffff8100e41d>] ? xen_force_evtchn_callback+0x9/0xa
[17392.476070] [<ffffffff8100eaf2>] ? check_events+0x12/0x20
[17392.480079] ata1.01: hard resetting link
[17392.480099] [<ffffffff8100ea99>] ? xen_irq_enable_direct_end+0x0/0x7
[17392.480099] [<ffffffffa003a19c>] ? scsi_request_fn+0x3b9/0x4da
[scsi_mod]
[17392.480099] [<ffffffff8117d7b6>] ? __blk_run_queue+0x35/0x66
[17392.484070] [<ffffffff8117d88d>] ? blk_run_queue+0x20/0x32
[17392.484070] [<ffffffffa0039826>] ? scsi_run_queue+0x2da/0x370 [scsi_mod]
[17392.488084] [<ffffffff810e6294>] ? kmem_cache_free+0x71/0xa4
[17392.488084] [<ffffffffa003a4a5>] ? scsi_next_command+0x2d/0x39
[scsi_mod]
[17392.488084] [<ffffffffa003adfc>] ? scsi_io_completion+0x1ed/0x416
[scsi_mod]
[17392.488084] [<ffffffffa0037a7a>] ? scsi_eh_flush_done_q+0xec/0x10d
[scsi_mod]
[17392.488084] [<ffffffffa00a7223>] ? ata_scsi_error+0x5e9/0x681 [libata]
[17392.488084] [<ffffffffa0038a3d>] ? scsi_error_handler+0xec/0x5a9
[scsi_mod]
[17392.496345] [<ffffffffa0038951>] ? scsi_error_handler+0x0/0x5a9
[scsi_mod]
[17392.496345] [<ffffffff810652ad>] ? kthread+0x75/0x7d
[17392.496345] [<ffffffff81012aea>] ? child_rip+0xa/0x20
[17392.496345] [<ffffffff81011ca1>] ? int_ret_from_sys_call+0x7/0x1b
[17392.500072] [<ffffffff8101245d>] ? retint_restore_args+0x5/0x6
[17392.500072] [<ffffffff81012ae0>] ? child_rip+0x0/0x20
[17392.956361] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[17392.957486] ata1.01: SATA link down (SStatus 0 SControl 300)
[17392.972769] ata1.00: configured for UDMA/133
[17392.973795] ata1.00: device reported invalid CHS sector 0
[17392.974743] ata1: EH complete
[17393.149020] md: checkpointing resync of md2.
[17393.482545] RAID1 conf printout:
[17393.483122] --- wd:1 rd:2
[17393.483585] disk 0, wo:0, o:1, dev:sda3
[17393.484259] disk 1, wo:1, o:0, dev:sdb3
[17393.492056] RAID1 conf printout:
[17393.492628] --- wd:1 rd:2
[17393.493108] disk 0, wo:0, o:1, dev:sda3
[17393.494841] md: resync of RAID array md2
[17393.495559] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[17393.496573] md: using maximum available idle IO bandwidth (but not
more than 200000 KB/sec) for resync.
[17393.498235] md: using 128k window, over a total of 958020096 blocks.
[17393.498466] md: resuming resync of md2 from checkpoint.
[17393.498466] md: md2: resync done.
[17393.824165] RAID1 conf printout:
[17393.825572] --- wd:1 rd:2
[17393.826761] disk 0, wo:0, o:1, dev:sda3
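
As a cross-check of the log above, here is a minimal Python sketch that decodes the two failed commands. It assumes the standard 10-byte SCSI READ(10)/WRITE(10) CDB layout, the field order libata prints in its "cmd" line (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device), and 512-byte sectors. The function names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def decode_rw10_cdb(cdb_hex):
    """Decode a READ(10)/WRITE(10) CDB given as space-separated hex bytes.

    Returns (opcode, lba, blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] not in (0x28, 0x2A):
        raise ValueError("expected a 10-byte READ(10)/WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return cdb[0], lba, blocks


def decode_ata_taskfile(cmd_field):
    """Decode the taskfile libata prints after 'cmd' for an LBA48 command.

    Returns (command, lba, sectors).
    """
    groups = [[int(b, 16) for b in g.split(":")] for g in cmd_field.split("/")]
    command = groups[0][0]
    _feature, nsect, lbal, lbam, lbah = groups[1]
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = groups[2]
    # LBA48: the "hob" (high order) bytes form the upper 24 bits.
    lba = int.from_bytes(
        bytes([hob_lbah, hob_lbam, hob_lbal, lbah, lbam, lbal]), "big"
    )
    sectors = (hob_nsect << 8) | nsect
    return command, lba, sectors


if __name__ == "__main__":
    # Values copied from the console log above.
    op, lba, blocks = decode_rw10_cdb("2a 00 3d 39 7f dd 00 04 00 00")
    print(hex(op), lba, blocks, blocks * SECTOR_BYTES)

    cmd, lba48, sectors = decode_ata_taskfile("25/00:00:5d:88:39/00:04:3d:00:00/e0")
    print(hex(cmd), lba48, sectors, sectors * SECTOR_BYTES)
```

The Write(10) CDB decodes to the same sector that end_request reports (1027178461). Both commands are 1024-sector transfers, which matches the "dma 524288" in the ata1.00 line if sectors are 512 bytes.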