* BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
@ 2011-03-30 14:14 Gerd Jakobovitsch
2011-03-31 2:44 ` Teck Choon Giam
0 siblings, 1 reply; 9+ messages in thread
From: Gerd Jakobovitsch @ 2011-03-30 14:14 UTC (permalink / raw)
To: xen-devel
[-- Attachment #1.1: Type: text/plain, Size: 10122 bytes --]
Hello all,
I used to run Xen 4.0 with kernel 2.6.32.24 on CentOS 5.5 with relative
success, but the bug in mmu.c appeared once in a while. Therefore, I'm
looking for a more stable option.
I compiled and ran the newly released Xen 4.1 with pvops kernel
2.6.32.35 on CentOS 5.5. When trying to start a VM, the following bugs
appeared in dmesg. After that, the xl and xm commands no longer respond:
[ 145.749573] alloc irq_desc for 2209 on node -1
[ 145.749581] alloc kstat_irqs on node -1
[ 145.883515] block tda: sector-size: 512 capacity: 262144
[ 145.889952] general protection fault: 0000 [#1] SMP
[ 145.890109] last sysfs file: /sys/block/tda/removable
[ 145.890164] CPU 7
[ 145.890252] Modules linked in: bridge stp nfs fscache nfs_acl
auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev
xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
[ 145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1
PowerEdge M610
[ 145.891184] RIP: e030:[<ffffffff81281e79>] [<ffffffff81281e79>]
blktap_device_end_request+0x4e/0x63
[ 145.891296] RSP: e02b:ffff880064061cd8 EFLAGS: 00010046
[ 145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX:
0000000000000028
[ 145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
0000000000000000
[ 145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09:
ffff88007da42948
[ 145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12:
ffff88007da427f8
[ 145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15:
ffff88007c75f3a8
[ 145.891651] FS: 00007ff33d9a4730(0000) GS:ffff8800189e5000(0000)
knlGS:0000000000000000
[ 145.891714] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4:
0000000000002660
[ 145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 145.892171] Process tapdisk2 (pid: 5179, threadinfo ffff880064060000,
task ffff88007c272d60)
[ 145.892669] Stack:
[ 145.892934] ffff88007c272d60 0000000000000000 0000000000000000
0000000000000000
[ 145.893334] <0> ffff880064061e88 ffffffff812815ae ffff880064061e58
ffffffff811d234f
[ 145.894035] <0> ffff88007e9bbfc0 ffff88007c75f398 00000001ffffffff
0000000000000000
[ 145.895015] Call Trace:
[ 145.895286] [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
[ 145.895566] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
[ 145.895844] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
[ 145.896124] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
[ 145.896403] [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
[ 145.896682] [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
[ 145.896958] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
[ 145.897234] [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
[ 145.897513] [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
[ 145.897786] [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
[ 145.898060] [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
[ 145.898338] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
[ 145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03 00 00
e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 8b 44
24 40 <48> 8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 c3
[ 145.902008] RIP [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
[ 145.902321] RSP <ffff880064061cd8>
[ 145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
[ 262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181, ffff88007c75f520
[ 262.100965] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1
[ 262.101232] Call Trace:
[ 262.101497] [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
[ 262.101762] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
[ 262.102028] [<ffffffff811fcc14>] __make_request+0x5e/0x402
[ 262.102294] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
[ 262.102563] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
[ 262.102832] [<ffffffff811156d8>] ? bio_init+0x18/0x32
[ 262.103099] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
[ 262.103366] [<ffffffff81111574>] submit_bh+0xf7/0x11a
[ 262.103631] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
[ 262.103898] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
[ 262.104165] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
[ 262.104433] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
[ 262.104702] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
[ 262.104972] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
[ 262.105240] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
[ 262.105507] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
[ 262.105778] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
[ 262.106045] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
[ 262.106315] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
[ 262.106584] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
[ 262.106854] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
[ 262.107120] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
[ 262.107385] [<ffffffff810f2374>] sys_read+0x47/0x70
[ 262.107652] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
[ 262.107918] sending NMI to all CPUs:
[ 262.108189] BUG: unable to handle kernel paging request at
ffffffffff5fb310
[ 262.108526] IP: [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
[ 262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
[ 262.109235] Oops: 0002 [#2] SMP
[ 262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
[ 262.109830] CPU 4
[ 262.110121] Modules linked in: bridge stp nfs fscache nfs_acl
auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev
xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
[ 262.111520] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1
PowerEdge M610
[ 262.112008] RIP: e030:[<ffffffff8102c7d1>] [<ffffffff8102c7d1>]
flat_send_IPI_mask+0x6a/0xc0
[ 262.112535] RSP: e02b:ffff88006778f968 EFLAGS: 00010086
[ 262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX:
00000000000160a0
[ 262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI:
ffffffff81816020
[ 262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09:
0000000000000004
[ 262.113605] R10: 0000000000000002 R11: 0000000000000004 R12:
0000000000000002
[ 262.113877] R13: 0000000000000800 R14: 00000000000000ff R15:
0000000000000000
[ 262.114149] FS: 00007fa78bcc5710(0063) GS:ffff88001898e000(0000)
knlGS:0000000000000000
[ 262.114636] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4:
0000000000002660
[ 262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 262.115707] Process vol_id (pid: 5181, threadinfo ffff88006778e000,
task ffff88007db86250)
[ 262.116194] Stack:
[ 262.116451] 0000000000000000 0000000076e9ecd0 0000000000000000
0000000076e9ecd0
[ 262.116825] <0> ffff88006778f998 ffffffff8102c841 ffff88006778f9b8
ffffffff81029f0d
[ 262.117485] <0> ffff88007c75f520 ffff88007c75f520 ffff88006778f9f8
ffffffff81219eb3
[ 262.118396] Call Trace:
[ 262.118657] [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
[ 262.118925] [<ffffffff81029f0d>]
arch_trigger_all_cpu_backtrace+0x45/0x66
[ 262.119195] [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
[ 262.119463] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
[ 262.119730] [<ffffffff811fcc14>] __make_request+0x5e/0x402
[ 262.119996] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
[ 262.120264] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
[ 262.120532] [<ffffffff811156d8>] ? bio_init+0x18/0x32
[ 262.120799] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
[ 262.121066] [<ffffffff81111574>] submit_bh+0xf7/0x11a
[ 262.121333] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
[ 262.121602] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
[ 262.121870] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
[ 262.122137] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
[ 262.127766] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
[ 262.128025] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
[ 262.128288] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
[ 262.128548] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
[ 262.128810] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
[ 262.129070] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
[ 262.129329] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
[ 262.129590] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
[ 262.129851] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
[ 262.130110] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
[ 262.130368] [<ffffffff810f2374>] sys_read+0x47/0x70
[ 262.130624] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
[ 262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a ff 90
58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 f0 c1
e0 18 <89> 04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 06
[ 262.133866] RIP [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
[ 262.134164] RSP <ffff88006778f968>
[ 262.134419] CR2: ffffffffff5fb310
[ 262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
--
Gerd Jakobovitsch
Product Engineer
---------------------------------------------------------
ALOG Data Centers do Brasil
Excellence in Hosting Projects
Rua Dr. Miguel Couto, 58 -- 01008-010 -- São Paulo - SP
Phone: (11) 3524-4970 / (11) 7152-0815
http://www.alog.com.br
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-03-30 14:14 BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM Gerd Jakobovitsch
@ 2011-03-31 2:44 ` Teck Choon Giam
2011-04-08 15:06 ` Gerd Jakobovitsch
0 siblings, 1 reply; 9+ messages in thread
From: Teck Choon Giam @ 2011-03-31 2:44 UTC (permalink / raw)
To: Gerd Jakobovitsch; +Cc: xen-devel
On Wed, Mar 30, 2011 at 10:14 PM, Gerd Jakobovitsch <gerd@alog.com.br> wrote:
>
> Hello all,
>
> I used to run Xen 4.0 with kernel 2.6.32.24 on CentOS 5.5 with relative success, but the bug in mmu.c appeared once in a while. Therefore, I'm looking for a more stable option.
> I compiled and ran the newly released Xen 4.1 with pvops kernel 2.6.32.35 on CentOS 5.5. When trying to start a VM, the following bugs appeared in dmesg. After that, the xl and xm commands no longer respond:
>
> [full kernel trace snipped; quoted verbatim from the original message above]
>
Can you try to recompile your PVOPS kernel with CONFIG_DEBUG_PAGEALLOC=y?
You can read more about this BUG at
http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
I initially hit this BUG back in Dec 2010:
http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
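For anyone unsure how to enable the option, the change to the kernel .config can be sketched as below. The temp-directory simulation and the sample .config contents are illustrative only (not from this thread); in a real pvops tree you would make the same edit via `make menuconfig` (Kernel hacking -> Debug page memory allocations) and then rebuild.

```shell
# Simulate enabling CONFIG_DEBUG_PAGEALLOC on a throwaway .config
# so the commands can be verified without a kernel tree.
cd "$(mktemp -d)"
printf '%s\n' 'CONFIG_DEBUG_KERNEL=y' \
              '# CONFIG_DEBUG_PAGEALLOC is not set' > .config

# Flip the option on, as the menuconfig step would do (GNU sed assumed):
sed -i 's/^# CONFIG_DEBUG_PAGEALLOC is not set/CONFIG_DEBUG_PAGEALLOC=y/' .config

grep '^CONFIG_DEBUG_PAGEALLOC=y' .config
# -> CONFIG_DEBUG_PAGEALLOC=y

# In a real tree, rebuild and boot the new kernel afterwards, e.g.:
#   make -j"$(nproc)" && make modules_install install
```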
Thanks.
Kindest regards,
Giam Teck Choon
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-03-31 2:44 ` Teck Choon Giam
@ 2011-04-08 15:06 ` Gerd Jakobovitsch
2011-04-08 15:44 ` Teck Choon Giam
0 siblings, 1 reply; 9+ messages in thread
From: Gerd Jakobovitsch @ 2011-04-08 15:06 UTC (permalink / raw)
To: Teck Choon Giam; +Cc: xen-devel
[-- Attachment #1.1: Type: text/plain, Size: 17836 bytes --]
On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
> On Wed, Mar 30, 2011 at 10:14 PM, Gerd Jakobovitsch<gerd@alog.com.br> wrote:
>> Hello all,
>>
>> I used to run Xen 4.0 with kernel 2.6.32.24 on CentOS 5.5 with relative success, but the bug in mmu.c appeared once in a while. Therefore, I'm looking for a more stable option.
>> I compiled and ran the newly released Xen 4.1 with pvops kernel 2.6.32.35 on CentOS 5.5. When trying to start a VM, the following bugs appeared in dmesg. After that, the xl and xm commands no longer respond:
>>
>> [full kernel trace snipped; quoted verbatim from the original message above]
>>
> Can you try to recompile your PVOPS kernel with CONFIG_DEBUG_PAGEALLOC=y?
>
> You can read more about this BUG at
> http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
>
> I initially hit this BUG sometime Dec 2010... ...
> http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
>
> Thanks.
>
> Kindest regards,
> Giam Teck Choon
Sorry for the delayed answer. The problem I'm facing now is not related
to the mmu bug - I still see that one on systems with Xen 4.0.2 /
kernel 2.6.32.24. Newer kernels have new bugs that are much more
troublesome, since I cannot run a single VM instance.
With DEBUG_PAGEALLOC enabled, the main difference is that the system
reboots shortly after trying to start a VM:
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach' message
(uuid = 0)
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach response'
message (uuid = 0)
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open' message
(uuid = 0)
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
open('/storage5/linux-centos-5-64b-base-7253/hda')
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size: pre
sector_shift [134217728] post sector_shift [262144]
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
/storage5/linux-centos-5-64b-base-rip/hda (1 users, state: 0x00000001,
type: 0)
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
/storage5/linux-centos-5-64b-base/hda: 0
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open response'
message (uuid = 0)
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
sector-size: 512 capacity: 262144
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general protection
fault: 0000 [#1] SMP DEBUG_PAGEALLOC
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs file:
/sys/block/tda/removable
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules linked in:
nfs fscache nfs_acl auth_rpcgss bridge stp ocfs2 ocfs2_dlmfs
ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs
arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport
dm_round_robin lockd sunrpc crc32c bonding iscsi_tcp libiscsi_tcp bnx2i
libiscsi scsi_transport_iscsi cnic uio dm_multipath bnx2 megaraid_sas
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566, comm:
tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
e030:[<ffffffff8129fb89>] [<ffffffff8129fb89>]
blktap_device_end_request+0x4e/0x63
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
e02b:ffff88006da5dcd8 EFLAGS: 00010046
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX:
6b6b6b6b6b6b6b6b RBX: ffff88006566c000 RCX: 0000000000000000
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX:
0000000000000000 RSI: 0000000000000000 RDI: ffff88006d8c7980
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP:
ffff88006da5dcf8 R08: ffffffff817e66a0 R09: ffff88007d775790
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10:
ffffffff810ccfe4 R11: ffff8800280d8f60 R12: ffff88006720e7f8
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13:
ffff88006d8c77e0 R14: 0000000000000000 R15: ffff88006d8c77f0
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
00007f86d75a3730(0000) GS:ffff8800280c7000(0000) knlGS:0000000000000000
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS: e033 DS: 0000
ES: 0000 CR0: 000000008005003b
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2:
0000000045614ed8 CR3: 000000006d911000 CR4: 0000000000002660
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process tapdisk2
(pid: 15566, threadinfo ffff88006da5c000, task ffff88007d7750f0)
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807] ffff88007d7750f0
0000000000000000 0000000000000000 0000000000000000
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074] <0>
ffff88006da5de88 ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574] <0>
00000000ffffffff ffff88006d8c77e0 00000001816fab27 0000000000000000
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522]
[<ffffffff8129f2c0>] blktap_ring_ioctl+0x183/0x2d8
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767]
[<ffffffff8100fedd>] ? xen_force_evtchn_callback+0xd/0xf
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012]
[<ffffffff811ebe9b>] ? inode_has_perm+0xa1/0xb3
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260]
[<ffffffff8101064f>] ? xen_restore_fl_direct_end+0x0/0x1
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509]
[<ffffffff81081d84>] ? lock_release+0x1b8/0x1c3
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756]
[<ffffffff81233d2c>] ? _raw_spin_unlock+0xab/0xb2
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999]
[<ffffffff8155618d>] ? _spin_unlock+0x26/0x2a
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248]
[<ffffffff81134aa1>] ? aio_read_evt+0x87/0x13a
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493]
[<ffffffff81134b36>] ? aio_read_evt+0x11c/0x13a
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740]
[<ffffffff81233ecc>] ? _raw_spin_lock+0x77/0x12f
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986]
[<ffffffff811ec054>] ? file_has_perm+0xb4/0xc6
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237]
[<ffffffff81112444>] vfs_ioctl+0x5e/0x77
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480]
[<ffffffff81112959>] do_vfs_ioctl+0x484/0x4d5
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726]
[<ffffffff81112a01>] sys_ioctl+0x57/0x7a
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970]
[<ffffffff81013d42>] system_call_fastpath+0x16/0x1b
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61 f4 ff
ff 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b 54 24 60 44
89 f6 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40 <48> 8b b8 70 04 00 00 e8
1d 65 2b 00 5b 41 5c 41 5d 41 5e c9 c3
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
[<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577] RSP <ffff88006da5dcd8>
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end trace
da218b929afc63f7 ]---
Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl: tap-err:tap_ctl_read_message:
failure reading message
Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl: tap-err:tap_ctl_send_and_receive:
failed to receive 'unknown' message
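One detail worth noting in both oopses (an editorial observation, not part of the original mail): RAX holds 0x6b6b6b6b6b6b6b6b at the faulting instruction. 0x6b is the slab allocator's POISON_FREE byte (see include/linux/poison.h), so blktap_device_end_request is most likely dereferencing an already-freed request structure, i.e. a use-after-free in the blktap2 path. A minimal sketch of that pattern check:

```python
# POISON_FREE (0x6b) is the byte the Linux slab allocator writes over freed
# objects when slab poisoning is active (include/linux/poison.h).
POISON_FREE = 0x6B

def is_slab_poison(value: int, width: int = 8) -> bool:
    """True if every byte of a register-sized value is the free-poison byte."""
    return all(((value >> (8 * i)) & 0xFF) == POISON_FREE for i in range(width))

rax = 0x6B6B6B6B6B6B6B6B   # RAX value reported in the oops
print(is_slab_poison(rax))  # a register full of poison: use-after-free signature
```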
--
*Gerd Jakobovitsch
Product Engineer*
---------------------------------------------------------
**ALOG Data Centers do Brasil**
*Excellence in Hosting Projects*
Rua Dr. Miguel Couto, 58 -- 01008-010 -- São Paulo - SP
Phone: (11) 3524-4970 / (11) 7152-0815
*http://www.alog.com.br* <http://www.alog.com.br/>
*"How are our services? Click here
<https://www.surveymonkey.com/s/faleconosco> and tell us. We want to
hear your opinion!"*
[-- Attachment #1.2.1: Type: text/html, Size: 23203 bytes --]
[-- Attachment #1.2.2: alog --]
[-- Type: image/png, Size: 32911 bytes --]
[-- Attachment #1.2.3: graphics2 --]
[-- Type: image/png, Size: 42793 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-04-08 15:06 ` Gerd Jakobovitsch
@ 2011-04-08 15:44 ` Teck Choon Giam
2011-04-08 20:51 ` Gerd Jakobovitsch
2011-04-08 21:38 ` Gerd Jakobovitsch
0 siblings, 2 replies; 9+ messages in thread
From: Teck Choon Giam @ 2011-04-08 15:44 UTC (permalink / raw)
To: Gerd Jakobovitsch; +Cc: xen-devel
[-- Attachment #1.1: Type: text/plain, Size: 17762 bytes --]
On Fri, Apr 8, 2011 at 11:06 PM, Gerd Jakobovitsch <gerd@alog.com.br> wrote:
> On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
>
> On Wed, Mar 30, 2011 at 10:14 PM, Gerd Jakobovitsch <gerd@alog.com.br> <gerd@alog.com.br> wrote:
>
> Hello all,
>
> I used to run xen4.0 kernel 2.6.32.24 over CentOS 5.5, with a relative success, but the bug at mmu.c appeared once at a while. Therefore, I'm looking for a more stable option.
> I compiled and ran the newly released xen 4.1, with kernel PVOPS 2.6.32.35 over CentOS 5.5. When trying to start a VM, the following bugs appeared at dmesg. After that, xl and xm commands do not longer respond:
>
> [ 145.749573] alloc irq_desc for 2209 on node -1
> [ 145.749581] alloc kstat_irqs on node -1
> [ 145.883515] block tda: sector-size: 512 capacity: 262144
> [ 145.889952] general protection fault: 0000 [#1] SMP
> [ 145.890109] last sysfs file: /sys/block/tda/removable
> [ 145.890164] CPU 7
> [ 145.890252] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
> [ 145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1 PowerEdge M610
> [ 145.891184] RIP: e030:[<ffffffff81281e79>] [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
> [ 145.891296] RSP: e02b:ffff880064061cd8 EFLAGS: 00010046
> [ 145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX: 0000000000000028
> [ 145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09: ffff88007da42948
> [ 145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12: ffff88007da427f8
> [ 145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15: ffff88007c75f3a8
> [ 145.891651] FS: 00007ff33d9a4730(0000) GS:ffff8800189e5000(0000) knlGS:0000000000000000
> [ 145.891714] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4: 0000000000002660
> [ 145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 145.892171] Process tapdisk2 (pid: 5179, threadinfo ffff880064060000, task ffff88007c272d60)
> [ 145.892669] Stack:
> [ 145.892934] ffff88007c272d60 0000000000000000 0000000000000000 0000000000000000
> [ 145.893334] <0> ffff880064061e88 ffffffff812815ae ffff880064061e58 ffffffff811d234f
> [ 145.894035] <0> ffff88007e9bbfc0 ffff88007c75f398 00000001ffffffff 0000000000000000
> [ 145.895015] Call Trace:
> [ 145.895286] [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
> [ 145.895566] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
> [ 145.895844] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
> [ 145.896124] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
> [ 145.896403] [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
> [ 145.896682] [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
> [ 145.896958] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
> [ 145.897234] [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
> [ 145.897513] [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
> [ 145.897786] [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
> [ 145.898060] [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
> [ 145.898338] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
> [ 145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03 00 00 e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 8b 44 24 40 <48> 8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 c3
> [ 145.902008] RIP [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
> [ 145.902321] RSP <ffff880064061cd8>
> [ 145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
> [ 262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181, ffff88007c75f520
> [ 262.100965] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1
> [ 262.101232] Call Trace:
> [ 262.101497] [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
> [ 262.101762] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
> [ 262.102028] [<ffffffff811fcc14>] __make_request+0x5e/0x402
> [ 262.102294] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
> [ 262.102563] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
> [ 262.102832] [<ffffffff811156d8>] ? bio_init+0x18/0x32
> [ 262.103099] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
> [ 262.103366] [<ffffffff81111574>] submit_bh+0xf7/0x11a
> [ 262.103631] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
> [ 262.103898] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
> [ 262.104165] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
> [ 262.104433] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
> [ 262.104702] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
> [ 262.104972] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
> [ 262.105240] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
> [ 262.105507] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
> [ 262.105778] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
> [ 262.106045] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
> [ 262.106315] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
> [ 262.106584] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
> [ 262.106854] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
> [ 262.107120] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
> [ 262.107385] [<ffffffff810f2374>] sys_read+0x47/0x70
> [ 262.107652] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
> [ 262.107918] sending NMI to all CPUs:
> [ 262.108189] BUG: unable to handle kernel paging request at ffffffffff5fb310
> [ 262.108526] IP: [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
> [ 262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
> [ 262.109235] Oops: 0002 [#2] SMP
> [ 262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
> [ 262.109830] CPU 4
> [ 262.110121] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
> [ 262.111520] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1 PowerEdge M610
> [ 262.112008] RIP: e030:[<ffffffff8102c7d1>] [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
> [ 262.112535] RSP: e02b:ffff88006778f968 EFLAGS: 00010086
> [ 262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX: 00000000000160a0
> [ 262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI: ffffffff81816020
> [ 262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09: 0000000000000004
> [ 262.113605] R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000002
> [ 262.113877] R13: 0000000000000800 R14: 00000000000000ff R15: 0000000000000000
> [ 262.114149] FS: 00007fa78bcc5710(0063) GS:ffff88001898e000(0000) knlGS:0000000000000000
> [ 262.114636] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4: 0000000000002660
> [ 262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 262.115707] Process vol_id (pid: 5181, threadinfo ffff88006778e000, task ffff88007db86250)
> [ 262.116194] Stack:
> [ 262.116451] 0000000000000000 0000000076e9ecd0 0000000000000000 0000000076e9ecd0
> [ 262.116825] <0> ffff88006778f998 ffffffff8102c841 ffff88006778f9b8 ffffffff81029f0d
> [ 262.117485] <0> ffff88007c75f520 ffff88007c75f520 ffff88006778f9f8 ffffffff81219eb3
> [ 262.118396] Call Trace:
> [ 262.118657] [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
> [ 262.118925] [<ffffffff81029f0d>] arch_trigger_all_cpu_backtrace+0x45/0x66
> [ 262.119195] [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
> [ 262.119463] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
> [ 262.119730] [<ffffffff811fcc14>] __make_request+0x5e/0x402
> [ 262.119996] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
> [ 262.120264] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
> [ 262.120532] [<ffffffff811156d8>] ? bio_init+0x18/0x32
> [ 262.120799] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
> [ 262.121066] [<ffffffff81111574>] submit_bh+0xf7/0x11a
> [ 262.121333] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
> [ 262.121602] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
> [ 262.121870] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
> [ 262.122137] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
> [ 262.127766] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
> [ 262.128025] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
> [ 262.128288] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
> [ 262.128548] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
> [ 262.128810] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
> [ 262.129070] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
> [ 262.129329] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
> [ 262.129590] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
> [ 262.129851] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
> [ 262.130110] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
> [ 262.130368] [<ffffffff810f2374>] sys_read+0x47/0x70
> [ 262.130624] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
> [ 262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 f0 c1 e0 18 <89> 04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 06
> [ 262.133866] RIP [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
> [ 262.134164] RSP <ffff88006778f968>
> [ 262.134419] CR2: ffffffffff5fb310
> [ 262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
>
>
> Can you try to recompile your PVOPS kernel with CONFIG_DEBUG_PAGEALLOC=y?
>
> You can read more about this BUG athttp://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
>
> I initially hit this BUG sometime Dec 2010... ...http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
>
> Thanks.
>
> Kindest regards,
> Giam Teck Choon
>
> Sorry for the delayed answer. The problem I'm facing now is not related to
> the mmu bug - that one I still am seeing at systems with xen 4.0.2 / kernel
> 2.6.32.24. Newer kernels have new bugs, much more troublesome, since I can
> not run a single VM instance.
>
> Adding DEBUG_PAGEALLOC, the main difference is that the system is rebooting
> shortly after trying to start up a VM:
>
Ok. Sorry, I didn't read your log message carefully before.
>
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach' message
> (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach response'
> message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open' message
> (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
> open('/storage5/linux-centos-5-64b-base-7253/hda')
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size: pre
> sector_shift [134217728] post sector_shift [262144]
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
> /storage5/linux-centos-5-64b-base-rip/hda (1 users, state: 0x00000001, type:
> 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> /storage5/linux-centos-5-64b-base/hda: 0
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open response'
> message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
> sector-size: 512 capacity: 262144
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general protection
> fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs file:
> /sys/block/tda/removable
>
Just curious: what type of storage are you using for your VMs?
Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules linked in: nfs
> fscache nfs_acl auth_rpcgss bridge stp ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb
> ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs arptable_filter
> arp_tables xt_esp ipt_ah xt_physdev xt_multiport dm_round_robin lockd sunrpc
> crc32c bonding iscsi_tcp libiscsi_tcp bnx2i libiscsi scsi_transport_iscsi
> cnic uio dm_multipath bnx2 megaraid_sas
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566, comm:
> tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
> e030:[<ffffffff8129fb89>] [<ffffffff8129fb89>]
> blktap_device_end_request+0x4e/0x63
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
> e02b:ffff88006da5dcd8 EFLAGS: 00010046
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX: 6b6b6b6b6b6b6b6b
> RBX: ffff88006566c000 RCX: 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX: 0000000000000000
> RSI: 0000000000000000 RDI: ffff88006d8c7980
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP: ffff88006da5dcf8
> R08: ffffffff817e66a0 R09: ffff88007d775790
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10: ffffffff810ccfe4
> R11: ffff8800280d8f60 R12: ffff88006720e7f8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13: ffff88006d8c77e0
> R14: 0000000000000000 R15: ffff88006d8c77f0
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
> 00007f86d75a3730(0000) GS:ffff8800280c7000(0000) knlGS:0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS: e033 DS: 0000 ES:
> 0000 CR0: 000000008005003b
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2: 0000000045614ed8
> CR3: 000000006d911000 CR4: 0000000000002660
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process tapdisk2 (pid:
> 15566, threadinfo ffff88006da5c000, task ffff88007d7750f0)
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807] ffff88007d7750f0
> 0000000000000000 0000000000000000 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074] <0> ffff88006da5de88
> ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574] <0> 00000000ffffffff
> ffff88006d8c77e0 00000001816fab27 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522] [<ffffffff8129f2c0>]
> blktap_ring_ioctl+0x183/0x2d8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767] [<ffffffff8100fedd>]
> ? xen_force_evtchn_callback+0xd/0xf
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012] [<ffffffff811ebe9b>]
> ? inode_has_perm+0xa1/0xb3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260] [<ffffffff8101064f>]
> ? xen_restore_fl_direct_end+0x0/0x1
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509] [<ffffffff81081d84>]
> ? lock_release+0x1b8/0x1c3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756] [<ffffffff81233d2c>]
> ? _raw_spin_unlock+0xab/0xb2
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999] [<ffffffff8155618d>]
> ? _spin_unlock+0x26/0x2a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248] [<ffffffff81134aa1>]
> ? aio_read_evt+0x87/0x13a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493] [<ffffffff81134b36>]
> ? aio_read_evt+0x11c/0x13a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740] [<ffffffff81233ecc>]
> ? _raw_spin_lock+0x77/0x12f
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986] [<ffffffff811ec054>]
> ? file_has_perm+0xb4/0xc6
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237] [<ffffffff81112444>]
> vfs_ioctl+0x5e/0x77
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480] [<ffffffff81112959>]
> do_vfs_ioctl+0x484/0x4d5
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726] [<ffffffff81112a01>]
> sys_ioctl+0x57/0x7a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970] [<ffffffff81013d42>]
> system_call_fastpath+0x16/0x1b
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61 f4 ff ff
> 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b 54 24 60 44 89 f6
> 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40 <48> 8b b8 70 04 00 00 e8 1d 65 2b 00
> 5b 41 5c 41 5d 41 5e c9 c3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
> [<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577] RSP
> <ffff88006da5dcd8>
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end trace
> da218b929afc63f7 ]---
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl: tap-err:tap_ctl_read_message:
> failure reading message
> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl: tap-err:tap_ctl_send_and_receive:
> failed to receive 'unknown' message
>
>
This looks to me like a blktap/blktap2 driver issue... so you are right:
this is a different BUG from the one I encountered, as the logs differ.
Sorry, I didn't read your log carefully before replying earlier. Have
you tried using a plain LVM volume for your VM to see whether the BUG
still reproduces? I am guessing you are using a different kind of storage.
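(Illustrating the suggestion above: switching the domU disk line from a file-backed image served by tapdisk2 to a raw LVM volume served by blkback takes blktap out of the I/O path entirely. The volume name below is hypothetical, not from the thread; Xen domU configs use Python syntax.)

```python
# Hypothetical domU config fragment for isolating the blktap2 crash.
# Current setup: file-backed image, served through tapdisk2/blktap2:
disk = ['tap:aio:/storage5/linux-centos-5-64b-base/hda,xvda,w']
# Test setup: plain LVM logical volume via blkback, bypassing blktap
# (uncomment and adjust the volume path to use it):
# disk = ['phy:/dev/VolGroup00/vm-test,xvda,w']
```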
Thanks.
Kindest regards,
Giam Teck Choon
[-- Attachment #1.2: Type: text/html, Size: 20190 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-04-08 15:44 ` Teck Choon Giam
@ 2011-04-08 20:51 ` Gerd Jakobovitsch
2011-04-08 21:38 ` Gerd Jakobovitsch
1 sibling, 0 replies; 9+ messages in thread
From: Gerd Jakobovitsch @ 2011-04-08 20:51 UTC (permalink / raw)
To: Teck Choon Giam; +Cc: xen-devel
[-- Attachment #1.1: Type: text/plain, Size: 27836 bytes --]
On 04/08/2011 12:44 PM, Teck Choon Giam wrote:
>
>
> On Fri, Apr 8, 2011 at 11:06 PM, Gerd Jakobovitsch <gerd@alog.com.br
> <mailto:gerd@alog.com.br>> wrote:
>
> On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
>> On Wed, Mar 30, 2011 at 10:14 PM, Gerd Jakobovitsch<gerd@alog.com.br> <mailto:gerd@alog.com.br> wrote:
>>> Hello all,
>>>
>>> I used to run xen4.0 kernel 2.6.32.24 over CentOS 5.5, with a relative success, but the bug at mmu.c appeared once at a while. Therefore, I'm looking for a more stable option.
>>> I compiled and ran the newly released xen 4.1, with kernel PVOPS 2.6.32.35 over CentOS 5.5. When trying to start a VM, the following bugs appeared at dmesg. After that, xl and xm commands do not longer respond:
>>>
>>> [ 145.749573] alloc irq_desc for 2209 on node -1
>>> [ 145.749581] alloc kstat_irqs on node -1
>>> [ 145.883515] block tda: sector-size: 512 capacity: 262144
>>> [ 145.889952] general protection fault: 0000 [#1] SMP
>>> [ 145.890109] last sysfs file: /sys/block/tda/removable
>>> [ 145.890164] CPU 7
>>> [ 145.890252] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>> [ 145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1 PowerEdge M610
>>> [ 145.891184] RIP: e030:[<ffffffff81281e79>] [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
>>> [ 145.891296] RSP: e02b:ffff880064061cd8 EFLAGS: 00010046
>>> [ 145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX: 0000000000000028
>>> [ 145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>>> [ 145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09: ffff88007da42948
>>> [ 145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12: ffff88007da427f8
>>> [ 145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15: ffff88007c75f3a8
>>> [ 145.891651] FS: 00007ff33d9a4730(0000) GS:ffff8800189e5000(0000) knlGS:0000000000000000
>>> [ 145.891714] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [ 145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4: 0000000000002660
>>> [ 145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [ 145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [ 145.892171] Process tapdisk2 (pid: 5179, threadinfo ffff880064060000, task ffff88007c272d60)
>>> [ 145.892669] Stack:
>>> [ 145.892934] ffff88007c272d60 0000000000000000 0000000000000000 0000000000000000
>>> [ 145.893334]<0> ffff880064061e88 ffffffff812815ae ffff880064061e58 ffffffff811d234f
>>> [ 145.894035]<0> ffff88007e9bbfc0 ffff88007c75f398 00000001ffffffff 0000000000000000
>>> [ 145.895015] Call Trace:
>>> [ 145.895286] [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
>>> [ 145.895566] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>> [ 145.895844] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>> [ 145.896124] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>> [ 145.896403] [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
>>> [ 145.896682] [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
>>> [ 145.896958] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>> [ 145.897234] [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
>>> [ 145.897513] [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
>>> [ 145.897786] [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
>>> [ 145.898060] [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
>>> [ 145.898338] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>> [ 145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03 00 00 e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 8b 44 24 40<48> 8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 c3
>>> [ 145.902008] RIP [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
>>> [ 145.902321] RSP<ffff880064061cd8>
>>> [ 145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
>>> [ 262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181, ffff88007c75f520
>>> [ 262.100965] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1
>>> [ 262.101232] Call Trace:
>>> [ 262.101497] [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
>>> [ 262.101762] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>> [ 262.102028] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>> [ 262.102294] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> [ 262.102563] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
>>> [ 262.102832] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>> [ 262.103099] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>> [ 262.103366] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>> [ 262.103631] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
>>> [ 262.103898] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>> [ 262.104165] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>> [ 262.104433] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
>>> [ 262.104702] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>> [ 262.104972] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
>>> [ 262.105240] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>> [ 262.105507] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
>>> [ 262.105778] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
>>> [ 262.106045] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>> [ 262.106315] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
>>> [ 262.106584] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
>>> [ 262.106854] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
>>> [ 262.107120] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>> [ 262.107385] [<ffffffff810f2374>] sys_read+0x47/0x70
>>> [ 262.107652] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>> [ 262.107918] sending NMI to all CPUs:
>>> [ 262.108189] BUG: unable to handle kernel paging request at ffffffffff5fb310
>>> [ 262.108526] IP: [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>> [ 262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
>>> [ 262.109235] Oops: 0002 [#2] SMP
>>> [ 262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
>>> [ 262.109830] CPU 4
>>> [ 262.110121] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>> [ 262.111520] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1 PowerEdge M610
>>> [ 262.112008] RIP: e030:[<ffffffff8102c7d1>] [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>> [ 262.112535] RSP: e02b:ffff88006778f968 EFLAGS: 00010086
>>> [ 262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX: 00000000000160a0
>>> [ 262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI: ffffffff81816020
>>> [ 262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09: 0000000000000004
>>> [ 262.113605] R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000002
>>> [ 262.113877] R13: 0000000000000800 R14: 00000000000000ff R15: 0000000000000000
>>> [ 262.114149] FS: 00007fa78bcc5710(0063) GS:ffff88001898e000(0000) knlGS:0000000000000000
>>> [ 262.114636] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [ 262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4: 0000000000002660
>>> [ 262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [ 262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [ 262.115707] Process vol_id (pid: 5181, threadinfo ffff88006778e000, task ffff88007db86250)
>>> [ 262.116194] Stack:
>>> [ 262.116451] 0000000000000000 0000000076e9ecd0 0000000000000000 0000000076e9ecd0
>>> [ 262.116825]<0> ffff88006778f998 ffffffff8102c841 ffff88006778f9b8 ffffffff81029f0d
>>> [ 262.117485]<0> ffff88007c75f520 ffff88007c75f520 ffff88006778f9f8 ffffffff81219eb3
>>> [ 262.118396] Call Trace:
>>> [ 262.118657] [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
>>> [ 262.118925] [<ffffffff81029f0d>] arch_trigger_all_cpu_backtrace+0x45/0x66
>>> [ 262.119195] [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
>>> [ 262.119463] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>> [ 262.119730] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>> [ 262.119996] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> [ 262.120264] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
>>> [ 262.120532] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>> [ 262.120799] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>> [ 262.121066] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>> [ 262.121333] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
>>> [ 262.121602] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>> [ 262.121870] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>> [ 262.122137] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
>>> [ 262.127766] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>> [ 262.128025] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
>>> [ 262.128288] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>> [ 262.128548] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
>>> [ 262.128810] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
>>> [ 262.129070] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>> [ 262.129329] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
>>> [ 262.129590] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
>>> [ 262.129851] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
>>> [ 262.130110] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>> [ 262.130368] [<ffffffff810f2374>] sys_read+0x47/0x70
>>> [ 262.130624] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>> [ 262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 f0 c1 e0 18<89> 04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 06
>>> [ 262.133866] RIP [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>> [ 262.134164] RSP<ffff88006778f968>
>>> [ 262.134419] CR2: ffffffffff5fb310
>>> [ 262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
>>>
>> Can you try to recompile your PVOPS kernel with CONFIG_DEBUG_PAGEALLOC=y?
>>
>> You can read more about this BUG at
>> http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
>>
>> I initially hit this BUG sometime in Dec 2010:
>> http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
>>
>> Thanks.
>>
>> Kindest regards,
>> Giam Teck Choon
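
[For readers following along: enabling the suggested option on a 2.6.32
tree typically looks like the sketch below. The source directory name and
job count are illustrative, and `scripts/config` is the helper shipped in
the kernel tree since 2.6.26.]

```shell
# Enable page-allocation debugging in an existing kernel .config
# (illustrative path; run from the top of the kernel source tree)
cd linux-2.6.32.35
./scripts/config --enable DEBUG_PAGEALLOC
make oldconfig       # resolve any newly exposed options
make -j4 && make modules_install install
```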
> Sorry for the delayed answer. The problem I'm facing now is not
> related to the mmu bug - I still see that one on systems running
> xen 4.0.2 / kernel 2.6.32.24. Newer kernels have new bugs that are
> much more troublesome, since I cannot run a single VM instance.
>
> With DEBUG_PAGEALLOC enabled, the main difference is that the system
> reboots shortly after trying to start a VM:
>
>
> Ok. Sorry, I didn't read your log message carefully before.
>
>
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach'
> message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach
> response' message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open'
> message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
> open('/storage5/linux-centos-5-64b-base-7253/hda')
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size:
> pre sector_shift [134217728] post sector_shift [262144]
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
> /storage5/linux-centos-5-64b-base-rip/hda (1 users, state:
> 0x00000001, type: 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> /storage5/linux-centos-5-64b-base/hda: 0
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open
> response' message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
> sector-size: 512 capacity: 262144
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general
> protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs
> file: /sys/block/tda/removable
>
>
> Just curious... what type of storage are you using for your VMs?
>
>
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules
> linked in: nfs fscache nfs_acl auth_rpcgss bridge stp ocfs2
> ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager
> ocfs2_stackglue configfs arptable_filter arp_tables xt_esp ipt_ah
> xt_physdev xt_multiport dm_round_robin lockd sunrpc crc32c bonding
> iscsi_tcp libiscsi_tcp bnx2i libiscsi scsi_transport_iscsi cnic
> uio dm_multipath bnx2 megaraid_sas
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566,
> comm: tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
> e030:[<ffffffff8129fb89>] [<ffffffff8129fb89>]
> blktap_device_end_request+0x4e/0x63
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
> e02b:ffff88006da5dcd8 EFLAGS: 00010046
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX:
> 6b6b6b6b6b6b6b6b RBX: ffff88006566c000 RCX: 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX:
> 0000000000000000 RSI: 0000000000000000 RDI: ffff88006d8c7980
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP:
> ffff88006da5dcf8 R08: ffffffff817e66a0 R09: ffff88007d775790
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10:
> ffffffff810ccfe4 R11: ffff8800280d8f60 R12: ffff88006720e7f8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13:
> ffff88006d8c77e0 R14: 0000000000000000 R15: ffff88006d8c77f0
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
> 00007f86d75a3730(0000) GS:ffff8800280c7000(0000)
> knlGS:0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS: e033 DS:
> 0000 ES: 0000 CR0: 000000008005003b
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2:
> 0000000045614ed8 CR3: 000000006d911000 CR4: 0000000000002660
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0:
> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3:
> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process
> tapdisk2 (pid: 15566, threadinfo ffff88006da5c000, task
> ffff88007d7750f0)
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807]
> ffff88007d7750f0 0000000000000000 0000000000000000 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074] <0>
> ffff88006da5de88 ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574] <0>
> 00000000ffffffff ffff88006d8c77e0 00000001816fab27 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522]
> [<ffffffff8129f2c0>] blktap_ring_ioctl+0x183/0x2d8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767]
> [<ffffffff8100fedd>] ? xen_force_evtchn_callback+0xd/0xf
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012]
> [<ffffffff811ebe9b>] ? inode_has_perm+0xa1/0xb3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260]
> [<ffffffff8101064f>] ? xen_restore_fl_direct_end+0x0/0x1
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509]
> [<ffffffff81081d84>] ? lock_release+0x1b8/0x1c3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756]
> [<ffffffff81233d2c>] ? _raw_spin_unlock+0xab/0xb2
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999]
> [<ffffffff8155618d>] ? _spin_unlock+0x26/0x2a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248]
> [<ffffffff81134aa1>] ? aio_read_evt+0x87/0x13a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493]
> [<ffffffff81134b36>] ? aio_read_evt+0x11c/0x13a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740]
> [<ffffffff81233ecc>] ? _raw_spin_lock+0x77/0x12f
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986]
> [<ffffffff811ec054>] ? file_has_perm+0xb4/0xc6
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237]
> [<ffffffff81112444>] vfs_ioctl+0x5e/0x77
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480]
> [<ffffffff81112959>] do_vfs_ioctl+0x484/0x4d5
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726]
> [<ffffffff81112a01>] sys_ioctl+0x57/0x7a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970]
> [<ffffffff81013d42>] system_call_fastpath+0x16/0x1b
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61
> f4 ff ff 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b
> 54 24 60 44 89 f6 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40 <48> 8b
> b8 70 04 00 00 e8 1d 65 2b 00 5b 41 5c 41 5d 41 5e c9 c3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
> [<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577] RSP
> <ffff88006da5dcd8>
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end
> trace da218b929afc63f7 ]---
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
> tap-err:tap_ctl_read_message: failure reading message
> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
> tap-err:tap_ctl_send_and_receive: failed to receive 'unknown' message
>
>
> This looks like a blktap/blktap2 driver issue to me... so you are
> right, this is a different BUG from the one I encountered, since the
> logs differ. Sorry, I didn't read your log carefully before replying.
> Have you tried using plain LVM for your VMs to reproduce the BUG? I
> guess you are using a different kind of storage.
>
> Thanks.
>
> Kindest regards,
> Giam Teck Choon
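
[As an aside, for anyone wanting to try the LVM suggestion above: a
phy:-backed disk line in a Xen domU config would look like the fragment
below. The volume group and logical volume names are hypothetical.]

```
# Xen domU config fragment (illustrative device names):
disk = [ 'phy:/dev/vg0/vm-test-disk,xvda,w' ]
```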
Thank you for your response.
First of all, the mmu bug occurred in my scenario without using LVM or
multipath. We access the storage via NFS, using file-based images
exported via tapdisk2; this is the same scenario we are testing now.
I have now recompiled the kernel without the kernel debugging options.
It is possible to run some VMs, but only for a very short period;
afterwards the system reboots by itself. As expected, there is very
little information in /var/log/messages, since debugging is off:
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: I/O queue driver: lio
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: received 'attach' message
(uuid = 8)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: sending 'attach response'
message (uuid = 8)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: received 'open' message
(uuid = 8)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: block-aio
open('/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/hda')
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]:
open(/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/hda)
with O_DIRECT
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: Image size: pre
sector_shift [134217728] post sector_shift [262144]
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: opened image
/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/hda (1
users, state: 0x00000001, type: 0)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: VBD CHAIN:
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]:
/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/hda: 0
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13300]: sending 'open response'
message (uuid = 8)
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5375.648125] block tdi:
sector-size: 512 capacity: 262144
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: I/O queue driver: lio
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: received 'attach' message
(uuid = 9)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: sending 'attach response'
message (uuid = 9)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: received 'open' message
(uuid = 9)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: block-aio
open('/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/xvda')
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]:
open(/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/xvda)
with O_DIRECT
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: Image size: pre
sector_shift [10737418240] post sector_shift [20971520]
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: opened image
/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/xvda (1
users, state: 0x00000001, type: 0)
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: VBD CHAIN:
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]:
/storage5_nfs/3/CD996633-linux-centos-5-64b-base-rip-sx-7411/xvda: 0
Apr 8 17:42:45 r2b16ch2x28p2 tapdisk2[13315]: sending 'open response'
message (uuid = 9)
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5375.762941] block tdj:
sector-size: 512 capacity: 20971520
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5375.990088] device
tap-tec7411.0 entered promiscuous mode
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5375.990172] virtbr: port
10(tap-tec7411.0) entering forwarding state
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5376.243558] device tec7411.0
entered promiscuous mode
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5376.251202] virtbr: port
11(tec7411.0) entering forwarding state
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5376.364483] virtbr: port
10(tap-tec7411.0) entering disabled state
Apr 8 17:42:45 r2b16ch2x28p2 kernel: [ 5376.382219] virtbr: port
10(tap-tec7411.0) entering forwarding state
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10544]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10544]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10557]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10557]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11448]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11448]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11462]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11462]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12348]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12348]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12363]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12363]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13300]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13300]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13315]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13315]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7182]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7182]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7196]: received 'pid' message
(uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7196]: sending 'pid response'
message (uuid = 0)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7182]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7182]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7182]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7196]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7196]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[7196]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10544]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10544]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10544]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10557]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10557]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[10557]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11448]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11448]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11448]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11462]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11462]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[11462]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12348]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12348]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12348]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12363]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12363]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[12363]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13300]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13300]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13300]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13315]: received 'list' message
(uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13315]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13315]: sending 'list response'
message (uuid = 65535)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13300]: received 'resume' message
(uuid = 8)
Apr 8 17:42:47 r2b16ch2x28p2 tapdisk2[13300]: sending 'resume response'
message (uuid = 8)
Apr 8 17:43:03 r2b16ch2x28p2 kernel: [ 5393.862981] blkback: ring-ref
8, event-channel 5, protocol 1 (x86_64-abi)
Apr 8 17:43:03 r2b16ch2x28p2 kernel: [ 5393.925163] blkback: ring-ref
9, event-channel 6, protocol 1 (x86_64-abi)
Apr 8 17:49:12 r2b16ch2x28p2 syslogd 1.4.1: restart.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-04-08 15:44 ` Teck Choon Giam
2011-04-08 20:51 ` Gerd Jakobovitsch
@ 2011-04-08 21:38 ` Gerd Jakobovitsch
2011-04-12 10:59 ` Konrad Rzeszutek Wilk
1 sibling, 1 reply; 9+ messages in thread
From: Gerd Jakobovitsch @ 2011-04-08 21:38 UTC (permalink / raw)
To: Teck Choon Giam; +Cc: xen-devel
One more follow-up:
Another kernel BUG report, this time with no kernel debugging enabled:
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761622] BUG: unable to
handle kernel NULL pointer dereference at 00000000000002f0
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761781] IP:
[<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761892] PGD 710a3067 PUD
724c6067 PMD 0
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762076] Oops: 0000 [#1] SMP
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762212] last sysfs file:
/sys/devices/vbd-6-51712/statistics/wr_sect
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762271] CPU 5
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762363] Modules linked in:
bnx2 xt_mac bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter
arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding
dm_multipath megaraid_sas [last unloaded: bnx2]
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763350] Pid: 7781, comm:
tapdisk2 Not tainted 2.6.32.36 #5 PowerEdge M610
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763410] RIP:
e030:[<ffffffff8121e7f9>] [<ffffffff8121e7f9>]
blktap_device_end_request+0x4e/0x6c
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763519] RSP:
e02b:ffff88006ed49cf8 EFLAGS: 00010046
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763574] RAX:
0000000000000000 RBX: ffff88005e6fc3e0 RCX: ffffffff811a3de6
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763635] RDX:
ffff88005e6fc3e0 RSI: ffffffff8149bed6 RDI: ffff880070f42178
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763694] RBP:
ffff880070f42010 R08: ffffffff81661840 R09: 00000001002c9435
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763971] R10:
0000000000000000 R11: ffff88005e6378f0 R12: ffff88005e6378f0
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764244] R13:
ffff880070f42000 R14: 0000000000000000 R15: 0000000000000001
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764522] FS:
00007fcfd34b0730(0000) GS:ffff880015fe7000(0000) knlGS:0000000000000000
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765011] CS: e033 DS: 0000
ES: 0000 CR0: 000000008005003b
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765280] CR2:
00000000000002f0 CR3: 000000006ed46000 CR4: 0000000000002660
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765554] DR0:
0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765829] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766105] Process tapdisk2
(pid: 7781, threadinfo ffff88006ed48000, task ffff880079486270)
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766602] Stack:
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766868] ffff88007a81dee0
ffff880079486270 0000000000000000 0000000000000001
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767266] <0>
0000000000000001 ffffffff8121e0b9 0000000000000003 0000000000000000
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767967] <0>
ffff880070f42000 0000bda50000bda7 ffff88005e6fc3e0 fffffffd00000000
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.768921] Call Trace:
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769193]
[<ffffffff8121e0b9>] ? blktap_ring_ioctl+0x159/0x290
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769474]
[<ffffffff8149c50a>] ? error_exit+0x2a/0x60
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769750]
[<ffffffff810133dd>] ? retint_restore_args+0x5/0x6
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770030]
[<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770308]
[<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770592]
[<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770874]
[<ffffffff8100eca5>] ? xen_force_evtchn_callback+0x9/0xa
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771157]
[<ffffffff8100f342>] ? check_events+0x12/0x20
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771435]
[<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771715]
[<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771993]
[<ffffffff8100fcea>] ? xen_spin_lock_slow+0xb7/0xf8
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772275]
[<ffffffff810c825a>] ? vfs_ioctl+0x55/0x6b
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772554]
[<ffffffff810c8775>] ? do_vfs_ioctl+0x492/0x4e5
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772834]
[<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773116]
[<ffffffff810c8819>] ? sys_ioctl+0x51/0x70
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773392]
[<ffffffff81012a02>] ? system_call_fastpath+0x16/0x1b
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773671] Code: e8 39 f6 ff
ff 49 8b 44 24 40 48 8b b8 f0 02 00 00 e8 e6 d6 27 00 4c 89 e7 41 8b 54
24 60 44 89 f6 e8 b9 59 f8 ff 49 8b 44 24 40 <48> 8b b8 f0 02 00 00 e8
02 14 df ff 66 90 ff 14 25 78 8d 65 81
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777198] RIP
[<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777524] RSP <ffff88006ed49cf8>
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777796] CR2: 00000000000002f0
Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.778067] ---[ end trace
a71b80c14de09da1 ]---
On 04/08/2011 12:44 PM, Teck Choon Giam wrote:
>
>
> On Fri, Apr 8, 2011 at 11:06 PM, Gerd Jakobovitsch <gerd@alog.com.br
> <mailto:gerd@alog.com.br>> wrote:
>
> On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
>> On Wed, Mar 30, 2011 at 10:14 PM, Gerd Jakobovitsch<gerd@alog.com.br> <mailto:gerd@alog.com.br> wrote:
>>> Hello all,
>>>
>>> I used to run xen4.0 kernel 2.6.32.24 over CentOS 5.5, with a relative success, but the bug at mmu.c appeared once at a while. Therefore, I'm looking for a more stable option.
>>> I compiled and ran the newly released xen 4.1, with kernel PVOPS 2.6.32.35 over CentOS 5.5. When trying to start a VM, the following bugs appeared at dmesg. After that, xl and xm commands do not longer respond:
>>>
>>> [ 145.749573] alloc irq_desc for 2209 on node -1
>>> [ 145.749581] alloc kstat_irqs on node -1
>>> [ 145.883515] block tda: sector-size: 512 capacity: 262144
>>> [ 145.889952] general protection fault: 0000 [#1] SMP
>>> [ 145.890109] last sysfs file: /sys/block/tda/removable
>>> [ 145.890164] CPU 7
>>> [ 145.890252] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>> [ 145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1 PowerEdge M610
>>> [ 145.891184] RIP: e030:[<ffffffff81281e79>] [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
>>> [ 145.891296] RSP: e02b:ffff880064061cd8 EFLAGS: 00010046
>>> [ 145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX: 0000000000000028
>>> [ 145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>>> [ 145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09: ffff88007da42948
>>> [ 145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12: ffff88007da427f8
>>> [ 145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15: ffff88007c75f3a8
>>> [ 145.891651] FS: 00007ff33d9a4730(0000) GS:ffff8800189e5000(0000) knlGS:0000000000000000
>>> [ 145.891714] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [ 145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4: 0000000000002660
>>> [ 145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [ 145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [ 145.892171] Process tapdisk2 (pid: 5179, threadinfo ffff880064060000, task ffff88007c272d60)
>>> [ 145.892669] Stack:
>>> [ 145.892934] ffff88007c272d60 0000000000000000 0000000000000000 0000000000000000
>>> [ 145.893334]<0> ffff880064061e88 ffffffff812815ae ffff880064061e58 ffffffff811d234f
>>> [ 145.894035]<0> ffff88007e9bbfc0 ffff88007c75f398 00000001ffffffff 0000000000000000
>>> [ 145.895015] Call Trace:
>>> [ 145.895286] [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
>>> [ 145.895566] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>> [ 145.895844] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>> [ 145.896124] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>> [ 145.896403] [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
>>> [ 145.896682] [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
>>> [ 145.896958] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>> [ 145.897234] [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
>>> [ 145.897513] [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
>>> [ 145.897786] [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
>>> [ 145.898060] [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
>>> [ 145.898338] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>> [ 145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03 00 00 e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 8b 44 24 40<48> 8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 c3
>>> [ 145.902008] RIP [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
>>> [ 145.902321] RSP<ffff880064061cd8>
>>> [ 145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
>>> [ 262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181, ffff88007c75f520
>>> [ 262.100965] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1
>>> [ 262.101232] Call Trace:
>>> [ 262.101497] [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
>>> [ 262.101762] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>> [ 262.102028] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>> [ 262.102294] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> [ 262.102563] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
>>> [ 262.102832] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>> [ 262.103099] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>> [ 262.103366] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>> [ 262.103631] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
>>> [ 262.103898] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>> [ 262.104165] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>> [ 262.104433] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
>>> [ 262.104702] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>> [ 262.104972] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
>>> [ 262.105240] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>> [ 262.105507] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
>>> [ 262.105778] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
>>> [ 262.106045] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>> [ 262.106315] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
>>> [ 262.106584] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
>>> [ 262.106854] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
>>> [ 262.107120] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>> [ 262.107385] [<ffffffff810f2374>] sys_read+0x47/0x70
>>> [ 262.107652] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>> [ 262.107918] sending NMI to all CPUs:
>>> [ 262.108189] BUG: unable to handle kernel paging request at ffffffffff5fb310
>>> [ 262.108526] IP: [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>> [ 262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
>>> [ 262.109235] Oops: 0002 [#2] SMP
>>> [ 262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
>>> [ 262.109830] CPU 4
>>> [ 262.110121] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>> [ 262.111520] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1 PowerEdge M610
>>> [ 262.112008] RIP: e030:[<ffffffff8102c7d1>] [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>> [ 262.112535] RSP: e02b:ffff88006778f968 EFLAGS: 00010086
>>> [ 262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX: 00000000000160a0
>>> [ 262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI: ffffffff81816020
>>> [ 262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09: 0000000000000004
>>> [ 262.113605] R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000002
>>> [ 262.113877] R13: 0000000000000800 R14: 00000000000000ff R15: 0000000000000000
>>> [ 262.114149] FS: 00007fa78bcc5710(0063) GS:ffff88001898e000(0000) knlGS:0000000000000000
>>> [ 262.114636] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [ 262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4: 0000000000002660
>>> [ 262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [ 262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [ 262.115707] Process vol_id (pid: 5181, threadinfo ffff88006778e000, task ffff88007db86250)
>>> [ 262.116194] Stack:
>>> [ 262.116451] 0000000000000000 0000000076e9ecd0 0000000000000000 0000000076e9ecd0
>>> [ 262.116825]<0> ffff88006778f998 ffffffff8102c841 ffff88006778f9b8 ffffffff81029f0d
>>> [ 262.117485]<0> ffff88007c75f520 ffff88007c75f520 ffff88006778f9f8 ffffffff81219eb3
>>> [ 262.118396] Call Trace:
>>> [ 262.118657] [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
>>> [ 262.118925] [<ffffffff81029f0d>] arch_trigger_all_cpu_backtrace+0x45/0x66
>>> [ 262.119195] [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
>>> [ 262.119463] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>> [ 262.119730] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>> [ 262.119996] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> [ 262.120264] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
>>> [ 262.120532] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>> [ 262.120799] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>> [ 262.121066] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>> [ 262.121333] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
>>> [ 262.121602] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>> [ 262.121870] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>> [ 262.122137] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
>>> [ 262.127766] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>> [ 262.128025] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
>>> [ 262.128288] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>> [ 262.128548] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
>>> [ 262.128810] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
>>> [ 262.129070] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>> [ 262.129329] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
>>> [ 262.129590] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
>>> [ 262.129851] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
>>> [ 262.130110] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>> [ 262.130368] [<ffffffff810f2374>] sys_read+0x47/0x70
>>> [ 262.130624] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>> [ 262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 f0 c1 e0 18 <89> 04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 06
>>> [ 262.133866] RIP [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>> [ 262.134164] RSP <ffff88006778f968>
>>> [ 262.134419] CR2: ffffffffff5fb310
>>> [ 262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
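An editorial aside on the trace above: the RAX value 6b6b6b6b6b6b6b6b is the slab POISON_FREE fill byte (0x6b, from include/linux/poison.h) repeated across the whole register, which points at blktap_device_end_request dereferencing an already-freed object. A quick sanity check of that pattern:

```shell
# RAX from the oops above is 0x6b repeated eight times; 0x6b is POISON_FREE
# in include/linux/poison.h, the pattern slab debugging writes into freed
# objects.
rax=6b6b6b6b6b6b6b6b
# printf reuses the format once per argument, so this emits "6b" 8 times.
[ "$rax" = "$(printf '6b%.0s' 1 2 3 4 5 6 7 8)" ] && \
    echo "RAX is all POISON_FREE bytes -> likely use-after-free"
```

The DEBUG_PAGEALLOC run later in the thread shows the same all-0x6b RAX, which fits that reading.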
>>>
>> Can you try to recompile your PVOPS kernel with CONFIG_DEBUG_PAGEALLOC=y?
>>
>> You can read more about this BUG at
>> http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
>>
>> I initially hit this BUG sometime Dec 2010... ...
>> http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
>>
>> Thanks.
>>
>> Kindest regards,
>> Giam Teck Choon
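As an aside, Giam's CONFIG_DEBUG_PAGEALLOC suggestion boils down to flipping one option and rebuilding. A minimal sketch, operating on a stand-in .config since the real tree path (e.g. /usr/src/linux-2.6.32.35) is an assumption:

```shell
# Sketch: enable CONFIG_DEBUG_PAGEALLOC in a kernel .config.  In a real
# pvops tree you would run the sed against .config and then:
#   make oldconfig && make && make modules_install install
printf '# CONFIG_DEBUG_PAGEALLOC is not set\n' > demo.config  # stand-in file
sed -i 's/^# CONFIG_DEBUG_PAGEALLOC is not set$/CONFIG_DEBUG_PAGEALLOC=y/' demo.config
grep '^CONFIG_DEBUG_PAGEALLOC=y' demo.config   # prints CONFIG_DEBUG_PAGEALLOC=y
```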
> Sorry for the delayed answer. The problem I'm facing now is not
> related to the mmu bug - I still see that one on systems
> with xen 4.0.2 / kernel 2.6.32.24. The newer kernels have new,
> much more troublesome bugs, since I cannot run a single VM instance.
>
> With DEBUG_PAGEALLOC enabled, the main difference is that the system
> reboots shortly after I try to start up a VM:
>
>
> OK. Sorry, I didn't read your log message carefully before.
>
>
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach'
> message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach
> response' message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open'
> message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
> open('/storage5/linux-centos-5-64b-base-7253/hda')
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size:
> pre sector_shift [134217728] post sector_shift [262144]
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
> /storage5/linux-centos-5-64b-base-rip/hda (1 users, state:
> 0x00000001, type: 0)
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> /storage5/linux-centos-5-64b-base/hda: 0
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open
> response' message (uuid = 0)
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
> sector-size: 512 capacity: 262144
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general
> protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs
> file: /sys/block/tda/removable
>
>
> Just curious: what type of storage are you using for your VMs?
>
>
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules
> linked in: nfs fscache nfs_acl auth_rpcgss bridge stp ocfs2
> ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager
> ocfs2_stackglue configfs arptable_filter arp_tables xt_esp ipt_ah
> xt_physdev xt_multiport dm_round_robin lockd sunrpc crc32c bonding
> iscsi_tcp libiscsi_tcp bnx2i libiscsi scsi_transport_iscsi cnic
> uio dm_multipath bnx2 megaraid_sas
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566,
> comm: tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
> e030:[<ffffffff8129fb89>] [<ffffffff8129fb89>]
> blktap_device_end_request+0x4e/0x63
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
> e02b:ffff88006da5dcd8 EFLAGS: 00010046
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX:
> 6b6b6b6b6b6b6b6b RBX: ffff88006566c000 RCX: 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX:
> 0000000000000000 RSI: 0000000000000000 RDI: ffff88006d8c7980
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP:
> ffff88006da5dcf8 R08: ffffffff817e66a0 R09: ffff88007d775790
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10:
> ffffffff810ccfe4 R11: ffff8800280d8f60 R12: ffff88006720e7f8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13:
> ffff88006d8c77e0 R14: 0000000000000000 R15: ffff88006d8c77f0
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
> 00007f86d75a3730(0000) GS:ffff8800280c7000(0000)
> knlGS:0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS: e033 DS:
> 0000 ES: 0000 CR0: 000000008005003b
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2:
> 0000000045614ed8 CR3: 000000006d911000 CR4: 0000000000002660
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0:
> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3:
> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process
> tapdisk2 (pid: 15566, threadinfo ffff88006da5c000, task
> ffff88007d7750f0)
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807]
> ffff88007d7750f0 0000000000000000 0000000000000000 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074] <0>
> ffff88006da5de88 ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574] <0>
> 00000000ffffffff ffff88006d8c77e0 00000001816fab27 0000000000000000
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522]
> [<ffffffff8129f2c0>] blktap_ring_ioctl+0x183/0x2d8
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767]
> [<ffffffff8100fedd>] ? xen_force_evtchn_callback+0xd/0xf
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012]
> [<ffffffff811ebe9b>] ? inode_has_perm+0xa1/0xb3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260]
> [<ffffffff8101064f>] ? xen_restore_fl_direct_end+0x0/0x1
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509]
> [<ffffffff81081d84>] ? lock_release+0x1b8/0x1c3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756]
> [<ffffffff81233d2c>] ? _raw_spin_unlock+0xab/0xb2
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999]
> [<ffffffff8155618d>] ? _spin_unlock+0x26/0x2a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248]
> [<ffffffff81134aa1>] ? aio_read_evt+0x87/0x13a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493]
> [<ffffffff81134b36>] ? aio_read_evt+0x11c/0x13a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740]
> [<ffffffff81233ecc>] ? _raw_spin_lock+0x77/0x12f
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986]
> [<ffffffff811ec054>] ? file_has_perm+0xb4/0xc6
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237]
> [<ffffffff81112444>] vfs_ioctl+0x5e/0x77
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480]
> [<ffffffff81112959>] do_vfs_ioctl+0x484/0x4d5
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726]
> [<ffffffff81112a01>] sys_ioctl+0x57/0x7a
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970]
> [<ffffffff81013d42>] system_call_fastpath+0x16/0x1b
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61
> f4 ff ff 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b
> 54 24 60 44 89 f6 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40 <48> 8b
> b8 70 04 00 00 e8 1d 65 2b 00 5b 41 5c 41 5d 41 5e c9 c3
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
> [<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577] RSP
> <ffff88006da5dcd8>
> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end
> trace da218b929afc63f7 ]---
> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
> tap-err:tap_ctl_read_message: failure reading message
> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
> tap-err:tap_ctl_send_and_receive: failed to receive 'unknown' message
>
>
> Looks like a blktap/blktap2 driver issue to me... so you are right:
> this is a different BUG from the one I encountered, as the logs are
> different. Sorry, I didn't read your log carefully before replying.
> Have you tried using a plain LVM volume for your VM to reproduce the
> BUG? I guess you are using a different kind of storage.
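For concreteness, the plain-LVM comparison Giam asks about could be set up roughly as follows; the volume group vg0, the LV name, and the size are assumptions, not details from the thread:

```shell
# Hypothetical LVM-backed test disk that bypasses blktap/tapdisk entirely.
# lvcreate needs a real volume group, so it is allowed to fail on a scratch box.
lvcreate -L 4G -n testvm vg0 2>/dev/null || true
# Xen guest config fragment using the phy: backend instead of tap:aio:
cat > testvm-disk.cfg <<'EOF'
disk = [ 'phy:/dev/vg0/testvm,xvda,w' ]
EOF
grep -c 'phy:' testvm-disk.cfg   # prints 1
```

If the oops does not reproduce against the phy: disk, that narrows the fault to the blktap2 path rather than the storage below it.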
>
> Thanks.
>
> Kindest regards,
> Giam Teck Choon
--
Gerd Jakobovitsch
Product Engineer
---------------------------------------------------------
ALOG Data Centers do Brasil
Excellence in Hosting Projects
Rua Dr. Miguel Couto, 58 -- 01008-010 -- São Paulo - SP
Phone: (11) 3524-4970 / (11) 7152-0815
http://www.alog.com.br <http://www.alog.com.br/>
"How are our services? Click here
<https://www.surveymonkey.com/s/faleconosco> and tell us. We want to
hear your opinion!"
[-- Attachment #1.2.1: Type: text/html, Size: 33091 bytes --]
[-- Attachment #1.2.2: alog --]
[-- Type: image/png, Size: 32911 bytes --]
[-- Attachment #1.2.3: graphics2 --]
[-- Type: image/png, Size: 42793 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-04-08 21:38 ` Gerd Jakobovitsch
@ 2011-04-12 10:59 ` Konrad Rzeszutek Wilk
2011-04-12 11:19 ` Gerd Jakobovitsch
0 siblings, 1 reply; 9+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-04-12 10:59 UTC (permalink / raw)
To: Gerd Jakobovitsch; +Cc: xen-devel, Teck Choon Giam
On Fri, Apr 08, 2011 at 06:38:38PM -0300, Gerd Jakobovitsch wrote:
> One more follow-up:
>
> Another kernel bug report, with no kernel debug activated:
This is just when the guest starts, right? Not after it has been running for a long time?
>
> r2b16ch2x28p2 kernel: [ 3243.777796] CR2: 00000000000002f0
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761622] BUG: unable to
> handle kernel NULL pointer dereference at 00000000000002f0
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761781] IP:
> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761892] PGD 710a3067
> PUD 724c6067 PMD 0
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762076] Oops: 0000 [#1] SMP
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762212] last sysfs
> file: /sys/devices/vbd-6-51712/statistics/wr_sect
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762271] CPU 5
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762363] Modules linked
> in: bnx2 xt_mac bridge stp nfs fscache nfs_acl auth_rpcgss
> arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport
> lockd sunrpc bonding dm_multipath megaraid_sas [last unloaded: bnx2]
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763350] Pid: 7781,
> comm: tapdisk2 Not tainted 2.6.32.36 #5 PowerEdge M610
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763410] RIP:
> e030:[<ffffffff8121e7f9>] [<ffffffff8121e7f9>]
> blktap_device_end_request+0x4e/0x6c
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763519] RSP:
> e02b:ffff88006ed49cf8 EFLAGS: 00010046
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763574] RAX:
> 0000000000000000 RBX: ffff88005e6fc3e0 RCX: ffffffff811a3de6
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763635] RDX:
> ffff88005e6fc3e0 RSI: ffffffff8149bed6 RDI: ffff880070f42178
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763694] RBP:
> ffff880070f42010 R08: ffffffff81661840 R09: 00000001002c9435
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763971] R10:
> 0000000000000000 R11: ffff88005e6378f0 R12: ffff88005e6378f0
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764244] R13:
> ffff880070f42000 R14: 0000000000000000 R15: 0000000000000001
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764522] FS:
> 00007fcfd34b0730(0000) GS:ffff880015fe7000(0000)
> knlGS:0000000000000000
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765011] CS: e033 DS:
> 0000 ES: 0000 CR0: 000000008005003b
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765280] CR2:
> 00000000000002f0 CR3: 000000006ed46000 CR4: 0000000000002660
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765554] DR0:
> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765829] DR3:
> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766105] Process
> tapdisk2 (pid: 7781, threadinfo ffff88006ed48000, task
> ffff880079486270)
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766602] Stack:
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766868]
> ffff88007a81dee0 ffff880079486270 0000000000000000 0000000000000001
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767266] <0>
> 0000000000000001 ffffffff8121e0b9 0000000000000003 0000000000000000
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767967] <0>
> ffff880070f42000 0000bda50000bda7 ffff88005e6fc3e0 fffffffd00000000
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.768921] Call Trace:
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769193]
> [<ffffffff8121e0b9>] ? blktap_ring_ioctl+0x159/0x290
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769474]
> [<ffffffff8149c50a>] ? error_exit+0x2a/0x60
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769750]
> [<ffffffff810133dd>] ? retint_restore_args+0x5/0x6
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770030]
> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770308]
> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770592]
> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770874]
> [<ffffffff8100eca5>] ? xen_force_evtchn_callback+0x9/0xa
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771157]
> [<ffffffff8100f342>] ? check_events+0x12/0x20
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771435]
> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771715]
> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771993]
> [<ffffffff8100fcea>] ? xen_spin_lock_slow+0xb7/0xf8
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772275]
> [<ffffffff810c825a>] ? vfs_ioctl+0x55/0x6b
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772554]
> [<ffffffff810c8775>] ? do_vfs_ioctl+0x492/0x4e5
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772834]
> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773116]
> [<ffffffff810c8819>] ? sys_ioctl+0x51/0x70
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773392]
> [<ffffffff81012a02>] ? system_call_fastpath+0x16/0x1b
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773671] Code: e8 39 f6
> ff ff 49 8b 44 24 40 48 8b b8 f0 02 00 00 e8 e6 d6 27 00 4c 89 e7 41
> 8b 54 24 60 44 89 f6 e8 b9 59 f8 ff 49 8b 44 24 40 <48> 8b b8 f0 02
> 00 00 e8 02 14 df ff 66 90 ff 14 25 78 8d 65 81
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777198] RIP
> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777524] RSP <ffff88006ed49cf8>
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777796] CR2: 00000000000002f0
> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.778067] ---[ end trace
> a71b80c14de09da1 ]---
>
> On 04/08/2011 12:44 PM, Teck Choon Giam wrote:
> >
> >
> >On Fri, Apr 8, 2011 at 11:06 PM, Gerd Jakobovitsch
> ><gerd@alog.com.br <mailto:gerd@alog.com.br>> wrote:
> >
> > On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
> >> On Wed, Mar 30, 2011 at 10:14 PM, Gerd Jakobovitsch<gerd@alog.com.br> <mailto:gerd@alog.com.br> wrote:
> >>> Hello all,
> >>>
> >>> I used to run xen4.0 kernel 2.6.32.24 over CentOS 5.5, with a relative success, but the bug at mmu.c appeared once at a while. Therefore, I'm looking for a more stable option.
> >>> I compiled and ran the newly released xen 4.1, with kernel PVOPS 2.6.32.35 over CentOS 5.5. When trying to start a VM, the following bugs appeared at dmesg. After that, xl and xm commands do not longer respond:
> >>>
> >>> [ 145.749573] alloc irq_desc for 2209 on node -1
> >>> [ 145.749581] alloc kstat_irqs on node -1
> >>> [ 145.883515] block tda: sector-size: 512 capacity: 262144
> >>> [ 145.889952] general protection fault: 0000 [#1] SMP
> >>> [ 145.890109] last sysfs file: /sys/block/tda/removable
> >>> [ 145.890164] CPU 7
> >>> [ 145.890252] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
> >>> [ 145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1 PowerEdge M610
> >>> [ 145.891184] RIP: e030:[<ffffffff81281e79>] [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
> >>> [ 145.891296] RSP: e02b:ffff880064061cd8 EFLAGS: 00010046
> >>> [ 145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX: 0000000000000028
> >>> [ 145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >>> [ 145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09: ffff88007da42948
> >>> [ 145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12: ffff88007da427f8
> >>> [ 145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15: ffff88007c75f3a8
> >>> [ 145.891651] FS: 00007ff33d9a4730(0000) GS:ffff8800189e5000(0000) knlGS:0000000000000000
> >>> [ 145.891714] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>> [ 145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4: 0000000000002660
> >>> [ 145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>> [ 145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>> [ 145.892171] Process tapdisk2 (pid: 5179, threadinfo ffff880064060000, task ffff88007c272d60)
> >>> [ 145.892669] Stack:
> >>> [ 145.892934] ffff88007c272d60 0000000000000000 0000000000000000 0000000000000000
> >>> [ 145.893334]<0> ffff880064061e88 ffffffff812815ae ffff880064061e58 ffffffff811d234f
> >>> [ 145.894035]<0> ffff88007e9bbfc0 ffff88007c75f398 00000001ffffffff 0000000000000000
> >>> [ 145.895015] Call Trace:
> >>> [ 145.895286] [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
> >>> [ 145.895566] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
> >>> [ 145.895844] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
> >>> [ 145.896124] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
> >>> [ 145.896403] [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
> >>> [ 145.896682] [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
> >>> [ 145.896958] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
> >>> [ 145.897234] [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
> >>> [ 145.897513] [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
> >>> [ 145.897786] [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
> >>> [ 145.898060] [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
> >>> [ 145.898338] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
> >>> [ 145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03 00 00 e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 8b 44 24 40 <48> 8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 c3
> >>> [ 145.902008] RIP [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
> >>> [ 145.902321] RSP <ffff880064061cd8>
> >>> [ 145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
> >>> [ 262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181, ffff88007c75f520
> >>> [ 262.100965] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1
> >>> [ 262.101232] Call Trace:
> >>> [ 262.101497] [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
> >>> [ 262.101762] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
> >>> [ 262.102028] [<ffffffff811fcc14>] __make_request+0x5e/0x402
> >>> [ 262.102294] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
> >>> [ 262.102563] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
> >>> [ 262.102832] [<ffffffff811156d8>] ? bio_init+0x18/0x32
> >>> [ 262.103099] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
> >>> [ 262.103366] [<ffffffff81111574>] submit_bh+0xf7/0x11a
> >>> [ 262.103631] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
> >>> [ 262.103898] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
> >>> [ 262.104165] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
> >>> [ 262.104433] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
> >>> [ 262.104702] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
> >>> [ 262.104972] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
> >>> [ 262.105240] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
> >>> [ 262.105507] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
> >>> [ 262.105778] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
> >>> [ 262.106045] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
> >>> [ 262.106315] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
> >>> [ 262.106584] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
> >>> [ 262.106854] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
> >>> [ 262.107120] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
> >>> [ 262.107385] [<ffffffff810f2374>] sys_read+0x47/0x70
> >>> [ 262.107652] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
> >>> [ 262.107918] sending NMI to all CPUs:
> >>> [ 262.108189] BUG: unable to handle kernel paging request at ffffffffff5fb310
> >>> [ 262.108526] IP: [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
> >>> [ 262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
> >>> [ 262.109235] Oops: 0002 [#2] SMP
> >>> [ 262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
> >>> [ 262.109830] CPU 4
> >>> [ 262.110121] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
> >>> [ 262.111520] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1 PowerEdge M610
> >>> [ 262.112008] RIP: e030:[<ffffffff8102c7d1>] [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
> >>> [ 262.112535] RSP: e02b:ffff88006778f968 EFLAGS: 00010086
> >>> [ 262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX: 00000000000160a0
> >>> [ 262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI: ffffffff81816020
> >>> [ 262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09: 0000000000000004
> >>> [ 262.113605] R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000002
> >>> [ 262.113877] R13: 0000000000000800 R14: 00000000000000ff R15: 0000000000000000
> >>> [ 262.114149] FS: 00007fa78bcc5710(0063) GS:ffff88001898e000(0000) knlGS:0000000000000000
> >>> [ 262.114636] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>> [ 262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4: 0000000000002660
> >>> [ 262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>> [ 262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>> [ 262.115707] Process vol_id (pid: 5181, threadinfo ffff88006778e000, task ffff88007db86250)
> >>> [ 262.116194] Stack:
> >>> [ 262.116451] 0000000000000000 0000000076e9ecd0 0000000000000000 0000000076e9ecd0
> >>> [ 262.116825]<0> ffff88006778f998 ffffffff8102c841 ffff88006778f9b8 ffffffff81029f0d
> >>> [ 262.117485]<0> ffff88007c75f520 ffff88007c75f520 ffff88006778f9f8 ffffffff81219eb3
> >>> [ 262.118396] Call Trace:
> >>> [ 262.118657] [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
> >>> [ 262.118925] [<ffffffff81029f0d>] arch_trigger_all_cpu_backtrace+0x45/0x66
> >>> [ 262.119195] [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
> >>> [ 262.119463] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
> >>> [ 262.119730] [<ffffffff811fcc14>] __make_request+0x5e/0x402
> >>> [ 262.119996] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
> >>> [ 262.120264] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
> >>> [ 262.120532] [<ffffffff811156d8>] ? bio_init+0x18/0x32
> >>> [ 262.120799] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
> >>> [ 262.121066] [<ffffffff81111574>] submit_bh+0xf7/0x11a
> >>> [ 262.121333] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
> >>> [ 262.121602] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
> >>> [ 262.121870] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
> >>> [ 262.122137] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
> >>> [ 262.127766] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
> >>> [ 262.128025] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
> >>> [ 262.128288] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
> >>> [ 262.128548] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
> >>> [ 262.128810] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
> >>> [ 262.129070] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
> >>> [ 262.129329] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
> >>> [ 262.129590] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
> >>> [ 262.129851] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
> >>> [ 262.130110] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
> >>> [ 262.130368] [<ffffffff810f2374>] sys_read+0x47/0x70
> >>> [ 262.130624] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
> >>> [ 262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 f0 c1 e0 18 <89> 04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 06
> >>> [ 262.133866] RIP [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
> >>> [ 262.134164] RSP <ffff88006778f968>
> >>> [ 262.134419] CR2: ffffffffff5fb310
> >>> [ 262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
> >>>
> >> Can you try to recompile your PVOPS kernel with CONFIG_DEBUG_PAGEALLOC=y?
> >>
> >> You can read more about this BUG at
> >> http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
> >>
> >> I initially hit this BUG sometime Dec 2010... ...
> >> http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
> >>
> >> Thanks.
> >>
> >> Kindest regards,
> >> Giam Teck Choon
> > Sorry for the delayed answer. The problem I'm facing now is not
> > related to the mmu bug - I still see that one on systems
> > with xen 4.0.2 / kernel 2.6.32.24. The newer kernels have new,
> > much more troublesome bugs, since I cannot run a single VM instance.
> >
> > With DEBUG_PAGEALLOC enabled, the main difference is that the system
> > reboots shortly after I try to start up a VM:
> >
> >
> >OK. Sorry, I didn't read your log message carefully before.
> >
> >
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach'
> > message (uuid = 0)
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach
> > response' message (uuid = 0)
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open'
> > message (uuid = 0)
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
> > open('/storage5/linux-centos-5-64b-base-7253/hda')
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> > open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size:
> >pre sector_shift [134217728] post sector_shift [262144]
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
> > /storage5/linux-centos-5-64b-base-rip/hda (1 users, state:
> > 0x00000001, type: 0)
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
> > /storage5/linux-centos-5-64b-base/hda: 0
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open
> > response' message (uuid = 0)
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
> > sector-size: 512 capacity: 262144
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general
> > protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs
> > file: /sys/block/tda/removable
> >
> >
> >Just curious: what type of storage are you using for your VMs?
> >
> >
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules
> > linked in: nfs fscache nfs_acl auth_rpcgss bridge stp ocfs2
> > ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager
> > ocfs2_stackglue configfs arptable_filter arp_tables xt_esp ipt_ah
> > xt_physdev xt_multiport dm_round_robin lockd sunrpc crc32c bonding
> > iscsi_tcp libiscsi_tcp bnx2i libiscsi scsi_transport_iscsi cnic
> > uio dm_multipath bnx2 megaraid_sas
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566,
> > comm: tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
> > e030:[<ffffffff8129fb89>] [<ffffffff8129fb89>]
> > blktap_device_end_request+0x4e/0x63
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
> > e02b:ffff88006da5dcd8 EFLAGS: 00010046
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX:
> > 6b6b6b6b6b6b6b6b RBX: ffff88006566c000 RCX: 0000000000000000
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX:
> > 0000000000000000 RSI: 0000000000000000 RDI: ffff88006d8c7980
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP:
> > ffff88006da5dcf8 R08: ffffffff817e66a0 R09: ffff88007d775790
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10:
> > ffffffff810ccfe4 R11: ffff8800280d8f60 R12: ffff88006720e7f8
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13:
> > ffff88006d8c77e0 R14: 0000000000000000 R15: ffff88006d8c77f0
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
> >00007f86d75a3730(0000) GS:ffff8800280c7000(0000)
> > knlGS:0000000000000000
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS: e033 DS:
> > 0000 ES: 0000 CR0: 000000008005003b
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2:
> > 0000000045614ed8 CR3: 000000006d911000 CR4: 0000000000002660
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0:
> > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3:
> > 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process
> > tapdisk2 (pid: 15566, threadinfo ffff88006da5c000, task
> > ffff88007d7750f0)
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807]
> >ffff88007d7750f0 0000000000000000 0000000000000000
> >0000000000000000
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074] <0>
> > ffff88006da5de88 ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574] <0>
> > 00000000ffffffff ffff88006d8c77e0 00000001816fab27 0000000000000000
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522]
> >[<ffffffff8129f2c0>] blktap_ring_ioctl+0x183/0x2d8
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767]
> >[<ffffffff8100fedd>] ? xen_force_evtchn_callback+0xd/0xf
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012]
> >[<ffffffff811ebe9b>] ? inode_has_perm+0xa1/0xb3
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260]
> >[<ffffffff8101064f>] ? xen_restore_fl_direct_end+0x0/0x1
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509]
> >[<ffffffff81081d84>] ? lock_release+0x1b8/0x1c3
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756]
> >[<ffffffff81233d2c>] ? _raw_spin_unlock+0xab/0xb2
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999]
> >[<ffffffff8155618d>] ? _spin_unlock+0x26/0x2a
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248]
> >[<ffffffff81134aa1>] ? aio_read_evt+0x87/0x13a
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493]
> >[<ffffffff81134b36>] ? aio_read_evt+0x11c/0x13a
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740]
> >[<ffffffff81233ecc>] ? _raw_spin_lock+0x77/0x12f
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986]
> >[<ffffffff811ec054>] ? file_has_perm+0xb4/0xc6
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237]
> >[<ffffffff81112444>] vfs_ioctl+0x5e/0x77
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480]
> >[<ffffffff81112959>] do_vfs_ioctl+0x484/0x4d5
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726]
> >[<ffffffff81112a01>] sys_ioctl+0x57/0x7a
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970]
> >[<ffffffff81013d42>] system_call_fastpath+0x16/0x1b
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61
> > f4 ff ff 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b
> > 54 24 60 44 89 f6 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40 <48> 8b
> > b8 70 04 00 00 e8 1d 65 2b 00 5b 41 5c 41 5d 41 5e c9 c3
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
> >[<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577] RSP
> > <ffff88006da5dcd8>
> > Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end
> > trace da218b929afc63f7 ]---
> > Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
> > Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
> > tap-err:tap_ctl_read_message: failure reading message
> > Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
> > tap-err:tap_ctl_send_and_receive: failed to receive 'unknown' message
> >
> >
> >This looks like a blktap/blktap2 driver issue to me, so you are
> >right: this is a different BUG from the one I encountered, as the
> >logs are different. Sorry, I didn't read your log carefully before
> >replying previously. Have you tried using normal LVM for your VMs
> >to reproduce the BUG, since I guess you are using different storage?
> >
> >Thanks.
> >
> >Kindest regards,
> >Giam Teck Choon
>
> --
>
> Gerd Jakobovitsch
> Product Engineer
> ---------------------------------------------------------
> ALOG Data Centers do Brasil
> Excellence in Hosting Projects
> Rua Dr. Miguel Couto, 58 -- 01008-010 -- São Paulo - SP
> Phone: (11) 3524-4970 / (11) 7152-0815
> http://www.alog.com.br <http://www.alog.com.br/>
>
> "How are our services? Click here
> <https://www.surveymonkey.com/s/faleconosco> and tell us. We want
> to hear your opinion!"
>
>
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-04-12 10:59 ` Konrad Rzeszutek Wilk
@ 2011-04-12 11:19 ` Gerd Jakobovitsch
2011-04-12 16:41 ` Teck Choon Giam
0 siblings, 1 reply; 9+ messages in thread
From: Gerd Jakobovitsch @ 2011-04-12 11:19 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Teck Choon Giam
Yes. It is the same scenario as the previous bug.
On 04/12/2011 07:59 AM, Konrad Rzeszutek Wilk wrote:
> On Fri, Apr 08, 2011 at 06:38:38PM -0300, Gerd Jakobovitsch wrote:
>> One more follow-up:
>>
>> Another kernel bug report, with no kernel debug activated:
> This is just with the guest starting, right? Not running for a long time?
>
>> r2b16ch2x28p2 kernel: [ 3243.777796] CR2: 00000000000002f0
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761622] BUG: unable to
>> handle kernel NULL pointer dereference at 00000000000002f0
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761781] IP:
>> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761892] PGD 710a3067
>> PUD 724c6067 PMD 0
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762076] Oops: 0000 [#1] SMP
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762212] last sysfs
>> file: /sys/devices/vbd-6-51712/statistics/wr_sect
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762271] CPU 5
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762363] Modules linked
>> in: bnx2 xt_mac bridge stp nfs fscache nfs_acl auth_rpcgss
>> arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport
>> lockd sunrpc bonding dm_multipath megaraid_sas [last unloaded: bnx2]
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763350] Pid: 7781,
>> comm: tapdisk2 Not tainted 2.6.32.36 #5 PowerEdge M610
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763410] RIP:
>> e030:[<ffffffff8121e7f9>] [<ffffffff8121e7f9>]
>> blktap_device_end_request+0x4e/0x6c
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763519] RSP:
>> e02b:ffff88006ed49cf8 EFLAGS: 00010046
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763574] RAX:
>> 0000000000000000 RBX: ffff88005e6fc3e0 RCX: ffffffff811a3de6
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763635] RDX:
>> ffff88005e6fc3e0 RSI: ffffffff8149bed6 RDI: ffff880070f42178
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763694] RBP:
>> ffff880070f42010 R08: ffffffff81661840 R09: 00000001002c9435
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763971] R10:
>> 0000000000000000 R11: ffff88005e6378f0 R12: ffff88005e6378f0
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764244] R13:
>> ffff880070f42000 R14: 0000000000000000 R15: 0000000000000001
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764522] FS:
>> 00007fcfd34b0730(0000) GS:ffff880015fe7000(0000)
>> knlGS:0000000000000000
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765011] CS: e033 DS:
>> 0000 ES: 0000 CR0: 000000008005003b
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765280] CR2:
>> 00000000000002f0 CR3: 000000006ed46000 CR4: 0000000000002660
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765554] DR0:
>> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765829] DR3:
>> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766105] Process
>> tapdisk2 (pid: 7781, threadinfo ffff88006ed48000, task
>> ffff880079486270)
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766602] Stack:
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766868]
>> ffff88007a81dee0 ffff880079486270 0000000000000000 0000000000000001
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767266]<0>
>> 0000000000000001 ffffffff8121e0b9 0000000000000003 0000000000000000
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767967]<0>
>> ffff880070f42000 0000bda50000bda7 ffff88005e6fc3e0 fffffffd00000000
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.768921] Call Trace:
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769193]
>> [<ffffffff8121e0b9>] ? blktap_ring_ioctl+0x159/0x290
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769474]
>> [<ffffffff8149c50a>] ? error_exit+0x2a/0x60
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769750]
>> [<ffffffff810133dd>] ? retint_restore_args+0x5/0x6
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770030]
>> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770308]
>> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770592]
>> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770874]
>> [<ffffffff8100eca5>] ? xen_force_evtchn_callback+0x9/0xa
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771157]
>> [<ffffffff8100f342>] ? check_events+0x12/0x20
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771435]
>> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771715]
>> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771993]
>> [<ffffffff8100fcea>] ? xen_spin_lock_slow+0xb7/0xf8
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772275]
>> [<ffffffff810c825a>] ? vfs_ioctl+0x55/0x6b
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772554]
>> [<ffffffff810c8775>] ? do_vfs_ioctl+0x492/0x4e5
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772834]
>> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773116]
>> [<ffffffff810c8819>] ? sys_ioctl+0x51/0x70
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773392]
>> [<ffffffff81012a02>] ? system_call_fastpath+0x16/0x1b
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773671] Code: e8 39 f6
>> ff ff 49 8b 44 24 40 48 8b b8 f0 02 00 00 e8 e6 d6 27 00 4c 89 e7 41
>> 8b 54 24 60 44 89 f6 e8 b9 59 f8 ff 49 8b 44 24 40<48> 8b b8 f0 02
>> 00 00 e8 02 14 df ff 66 90 ff 14 25 78 8d 65 81
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777198] RIP
>> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777524] RSP<ffff88006ed49cf8>
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777796] CR2: 00000000000002f0
>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.778067] ---[ end trace
>> a71b80c14de09da1 ]---
>>
>> On 04/08/2011 12:44 PM, Teck Choon Giam wrote:
>>>
>>> On Fri, Apr 8, 2011 at 11:06 PM, Gerd Jakobovitsch
>>> <gerd@alog.com.br<mailto:gerd@alog.com.br>> wrote:
>>>
>>> On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
>>>> On Wed, Mar 30, 2011 at 10:14 PM, Gerd Jakobovitsch<gerd@alog.com.br> <mailto:gerd@alog.com.br> wrote:
>>>>> Hello all,
>>>>>
>>>>> I used to run Xen 4.0 with kernel 2.6.32.24 over CentOS 5.5 with relative success, but the bug in mmu.c appeared once in a while. Therefore, I'm looking for a more stable option.
>>>>> I compiled and ran the newly released Xen 4.1, with pvops kernel 2.6.32.35, over CentOS 5.5. When trying to start a VM, the following bugs appeared in dmesg. After that, the xl and xm commands no longer respond:
>>>>>
>>>>> [ 145.749573] alloc irq_desc for 2209 on node -1
>>>>> [ 145.749581] alloc kstat_irqs on node -1
>>>>> [ 145.883515] block tda: sector-size: 512 capacity: 262144
>>>>> [ 145.889952] general protection fault: 0000 [#1] SMP
>>>>> [ 145.890109] last sysfs file: /sys/block/tda/removable
>>>>> [ 145.890164] CPU 7
>>>>> [ 145.890252] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>>>> [ 145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1 PowerEdge M610
>>>>> [ 145.891184] RIP: e030:[<ffffffff81281e79>] [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
>>>>> [ 145.891296] RSP: e02b:ffff880064061cd8 EFLAGS: 00010046
>>>>> [ 145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX: 0000000000000028
>>>>> [ 145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>>>>> [ 145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09: ffff88007da42948
>>>>> [ 145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12: ffff88007da427f8
>>>>> [ 145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15: ffff88007c75f3a8
>>>>> [ 145.891651] FS: 00007ff33d9a4730(0000) GS:ffff8800189e5000(0000) knlGS:0000000000000000
>>>>> [ 145.891714] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>> [ 145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4: 0000000000002660
>>>>> [ 145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>> [ 145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>> [ 145.892171] Process tapdisk2 (pid: 5179, threadinfo ffff880064060000, task ffff88007c272d60)
>>>>> [ 145.892669] Stack:
>>>>> [ 145.892934] ffff88007c272d60 0000000000000000 0000000000000000 0000000000000000
>>>>> [ 145.893334]<0> ffff880064061e88 ffffffff812815ae ffff880064061e58 ffffffff811d234f
>>>>> [ 145.894035]<0> ffff88007e9bbfc0 ffff88007c75f398 00000001ffffffff 0000000000000000
>>>>> [ 145.895015] Call Trace:
>>>>> [ 145.895286] [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
>>>>> [ 145.895566] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>>>> [ 145.895844] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>>>> [ 145.896124] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>>>> [ 145.896403] [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
>>>>> [ 145.896682] [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
>>>>> [ 145.896958] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>>>> [ 145.897234] [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
>>>>> [ 145.897513] [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
>>>>> [ 145.897786] [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
>>>>> [ 145.898060] [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
>>>>> [ 145.898338] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>> [ 145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03 00 00 e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 8b 44 24 40<48> 8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 c3
>>>>> [ 145.902008] RIP [<ffffffff81281e79>] blktap_device_end_request+0x4e/0x63
>>>>> [ 145.902321] RSP<ffff880064061cd8>
>>>>> [ 145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
>>>>> [ 262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181, ffff88007c75f520
>>>>> [ 262.100965] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1
>>>>> [ 262.101232] Call Trace:
>>>>> [ 262.101497] [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
>>>>> [ 262.101762] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>>>> [ 262.102028] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>>>> [ 262.102294] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
>>>>> [ 262.102563] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
>>>>> [ 262.102832] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>>>> [ 262.103099] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>>>> [ 262.103366] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>>>> [ 262.103631] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
>>>>> [ 262.103898] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>>>> [ 262.104165] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>>>> [ 262.104433] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
>>>>> [ 262.104702] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>>>> [ 262.104972] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
>>>>> [ 262.105240] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>>>> [ 262.105507] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
>>>>> [ 262.105778] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
>>>>> [ 262.106045] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>>>> [ 262.106315] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
>>>>> [ 262.106584] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
>>>>> [ 262.106854] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
>>>>> [ 262.107120] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>>>> [ 262.107385] [<ffffffff810f2374>] sys_read+0x47/0x70
>>>>> [ 262.107652] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>> [ 262.107918] sending NMI to all CPUs:
>>>>> [ 262.108189] BUG: unable to handle kernel paging request at ffffffffff5fb310
>>>>> [ 262.108526] IP: [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>>>> [ 262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
>>>>> [ 262.109235] Oops: 0002 [#2] SMP
>>>>> [ 262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
>>>>> [ 262.109830] CPU 4
>>>>> [ 262.110121] Modules linked in: bridge stp nfs fscache nfs_acl auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>>>> [ 262.111520] Pid: 5181, comm: vol_id Tainted: G D 2.6.32.35 #1 PowerEdge M610
>>>>> [ 262.112008] RIP: e030:[<ffffffff8102c7d1>] [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>>>> [ 262.112535] RSP: e02b:ffff88006778f968 EFLAGS: 00010086
>>>>> [ 262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX: 00000000000160a0
>>>>> [ 262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI: ffffffff81816020
>>>>> [ 262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09: 0000000000000004
>>>>> [ 262.113605] R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000002
>>>>> [ 262.113877] R13: 0000000000000800 R14: 00000000000000ff R15: 0000000000000000
>>>>> [ 262.114149] FS: 00007fa78bcc5710(0063) GS:ffff88001898e000(0000) knlGS:0000000000000000
>>>>> [ 262.114636] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>> [ 262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4: 0000000000002660
>>>>> [ 262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>> [ 262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>> [ 262.115707] Process vol_id (pid: 5181, threadinfo ffff88006778e000, task ffff88007db86250)
>>>>> [ 262.116194] Stack:
>>>>> [ 262.116451] 0000000000000000 0000000076e9ecd0 0000000000000000 0000000076e9ecd0
>>>>> [ 262.116825]<0> ffff88006778f998 ffffffff8102c841 ffff88006778f9b8 ffffffff81029f0d
>>>>> [ 262.117485]<0> ffff88007c75f520 ffff88007c75f520 ffff88006778f9f8 ffffffff81219eb3
>>>>> [ 262.118396] Call Trace:
>>>>> [ 262.118657] [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
>>>>> [ 262.118925] [<ffffffff81029f0d>] arch_trigger_all_cpu_backtrace+0x45/0x66
>>>>> [ 262.119195] [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
>>>>> [ 262.119463] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>>>> [ 262.119730] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>>>> [ 262.119996] [<ffffffff8101019f>] ? xen_restore_fl_direct_end+0x0/0x1
>>>>> [ 262.120264] [<ffffffff811fa171>] generic_make_request+0x258/0x2f4
>>>>> [ 262.120532] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>>>> [ 262.120799] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>>>> [ 262.121066] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>>>> [ 262.121333] [<ffffffff8111448f>] block_read_full_page+0x246/0x264
>>>>> [ 262.121602] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>>>> [ 262.121870] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>>>> [ 262.122137] [<ffffffff810ba73d>] ? add_to_page_cache_locked+0xa0/0xca
>>>>> [ 262.127766] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>>>> [ 262.128025] [<ffffffff810c1d36>] __do_page_cache_readahead+0x144/0x177
>>>>> [ 262.128288] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>>>> [ 262.128548] [<ffffffff810c20d7>] page_cache_sync_readahead+0x38/0x3a
>>>>> [ 262.128810] [<ffffffff810bb833>] generic_file_aio_read+0x24c/0x5c1
>>>>> [ 262.129070] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>>>> [ 262.129329] [<ffffffff81068a02>] ? autoremove_wake_function+0x0/0x38
>>>>> [ 262.129590] [<ffffffff811d701c>] ? selinux_file_permission+0x5c/0x10e
>>>>> [ 262.129851] [<ffffffff811ce9c4>] ? security_file_permission+0x11/0x13
>>>>> [ 262.130110] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>>>> [ 262.130368] [<ffffffff810f2374>] sys_read+0x47/0x70
>>>>> [ 262.130624] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>> [ 262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 f0 c1 e0 18<89> 04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 06
>>>>> [ 262.133866] RIP [<ffffffff8102c7d1>] flat_send_IPI_mask+0x6a/0xc0
>>>>> [ 262.134164] RSP<ffff88006778f968>
>>>>> [ 262.134419] CR2: ffffffffff5fb310
>>>>> [ 262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
>>>>>
>>>> Can you try to recompile your PVOPS kernel with CONFIG_DEBUG_PAGEALLOC=y?
>>>>
>>>> You can read more about this BUG at
>>>> http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
>>>>
>>>> I initially hit this BUG sometime in Dec 2010:
>>>> http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
>>>>
>>>> Thanks.
>>>>
>>>> Kindest regards,
>>>> Giam Teck Choon
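For reference, CONFIG_DEBUG_PAGEALLOC is a plain kernel config symbol; a minimal sketch of the fragment to merge into the build's .config (then run `make oldconfig` and rebuild) would be:

```
# Poison and unmap freed pages so use-after-free accesses fault immediately.
# Requires CONFIG_DEBUG_KERNEL; only available on architectures that
# support it. Expect a noticeable runtime slowdown while debugging.
CONFIG_DEBUG_PAGEALLOC=y
```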
>>> Sorry for the delayed answer. The problem I'm facing now is not
>>> related to the mmu bug - that one I'm still seeing on systems
>>> with Xen 4.0.2 / kernel 2.6.32.24. Newer kernels have new, much
>>> more troublesome bugs, since I cannot run a single VM instance.
>>>
>>> With DEBUG_PAGEALLOC added, the main difference is that the system
>>> reboots shortly after trying to start a VM:
>>>
>>>
>>> OK. Sorry, I didn't read your log message carefully previously.
>>>
>>>
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach'
>>> message (uuid = 0)
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach
>>> response' message (uuid = 0)
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open'
>>> message (uuid = 0)
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
>>> open('/storage5/linux-centos-5-64b-base-7253/hda')
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
>>> open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size:
>>> pre sector_shift [134217728] post sector_shift [262144]
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
>>> /storage5/linux-centos-5-64b-base-rip/hda (1 users, state:
>>> 0x00000001, type: 0)
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
>>> /storage5/linux-centos-5-64b-base/hda: 0
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open
>>> response' message (uuid = 0)
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
>>> sector-size: 512 capacity: 262144
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general
>>> protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs
>>> file: /sys/block/tda/removable
>>>
>>>
>>> Just curious: what type of storage are you using for your VMs?
>>>
>>>
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules
>>> linked in: nfs fscache nfs_acl auth_rpcgss bridge stp ocfs2
>>> ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager
>>> ocfs2_stackglue configfs arptable_filter arp_tables xt_esp ipt_ah
>>> xt_physdev xt_multiport dm_round_robin lockd sunrpc crc32c bonding
>>> iscsi_tcp libiscsi_tcp bnx2i libiscsi scsi_transport_iscsi cnic
>>> uio dm_multipath bnx2 megaraid_sas
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566,
>>> comm: tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
>>> e030:[<ffffffff8129fb89>] [<ffffffff8129fb89>]
>>> blktap_device_end_request+0x4e/0x63
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
>>> e02b:ffff88006da5dcd8 EFLAGS: 00010046
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX:
>>> 6b6b6b6b6b6b6b6b RBX: ffff88006566c000 RCX: 0000000000000000
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX:
>>> 0000000000000000 RSI: 0000000000000000 RDI: ffff88006d8c7980
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP:
>>> ffff88006da5dcf8 R08: ffffffff817e66a0 R09: ffff88007d775790
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10:
>>> ffffffff810ccfe4 R11: ffff8800280d8f60 R12: ffff88006720e7f8
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13:
>>> ffff88006d8c77e0 R14: 0000000000000000 R15: ffff88006d8c77f0
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
>>> 00007f86d75a3730(0000) GS:ffff8800280c7000(0000)
>>> knlGS:0000000000000000
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS: e033 DS:
>>> 0000 ES: 0000 CR0: 000000008005003b
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2:
>>> 0000000045614ed8 CR3: 000000006d911000 CR4: 0000000000002660
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0:
>>> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3:
>>> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process
>>> tapdisk2 (pid: 15566, threadinfo ffff88006da5c000, task
>>> ffff88007d7750f0)
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807]
>>> ffff88007d7750f0 0000000000000000 0000000000000000
>>> 0000000000000000
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074]<0>
>>> ffff88006da5de88 ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574]<0>
>>> 00000000ffffffff ffff88006d8c77e0 00000001816fab27 0000000000000000
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522]
>>> [<ffffffff8129f2c0>] blktap_ring_ioctl+0x183/0x2d8
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767]
>>> [<ffffffff8100fedd>] ? xen_force_evtchn_callback+0xd/0xf
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012]
>>> [<ffffffff811ebe9b>] ? inode_has_perm+0xa1/0xb3
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260]
>>> [<ffffffff8101064f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509]
>>> [<ffffffff81081d84>] ? lock_release+0x1b8/0x1c3
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756]
>>> [<ffffffff81233d2c>] ? _raw_spin_unlock+0xab/0xb2
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999]
>>> [<ffffffff8155618d>] ? _spin_unlock+0x26/0x2a
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248]
>>> [<ffffffff81134aa1>] ? aio_read_evt+0x87/0x13a
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493]
>>> [<ffffffff81134b36>] ? aio_read_evt+0x11c/0x13a
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740]
>>> [<ffffffff81233ecc>] ? _raw_spin_lock+0x77/0x12f
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986]
>>> [<ffffffff811ec054>] ? file_has_perm+0xb4/0xc6
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237]
>>> [<ffffffff81112444>] vfs_ioctl+0x5e/0x77
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480]
>>> [<ffffffff81112959>] do_vfs_ioctl+0x484/0x4d5
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726]
>>> [<ffffffff81112a01>] sys_ioctl+0x57/0x7a
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970]
>>> [<ffffffff81013d42>] system_call_fastpath+0x16/0x1b
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61
>>> f4 ff ff 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b
>>> 54 24 60 44 89 f6 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40<48> 8b
>>> b8 70 04 00 00 e8 1d 65 2b 00 5b 41 5c 41 5d 41 5e c9 c3
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
>>> [<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577] RSP
>>> <ffff88006da5dcd8>
>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end
>>> trace da218b929afc63f7 ]---
>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
>>> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
>>> tap-err:tap_ctl_read_message: failure reading message
>>> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
>>> tap-err:tap_ctl_send_and_receive: failed to receive 'unknown' message
>>>
>>>
>>> This looks like a blktap/blktap2 driver issue to me, so you are
>>> right: this is a different BUG from the one I encountered, as the
>>> logs are different. Sorry, I didn't read your log carefully before
>>> replying previously. Have you tried using normal LVM for your VMs
>>> to reproduce the BUG, since I guess you are using different storage?
>>>
>>> Thanks.
>>>
>>> Kindest regards,
>>> Giam Teck Choon
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM
2011-04-12 11:19 ` Gerd Jakobovitsch
@ 2011-04-12 16:41 ` Teck Choon Giam
0 siblings, 0 replies; 9+ messages in thread
From: Teck Choon Giam @ 2011-04-12 16:41 UTC (permalink / raw)
To: Gerd Jakobovitsch; +Cc: xen-devel, Konrad Rzeszutek Wilk
On Tue, Apr 12, 2011 at 7:19 PM, Gerd Jakobovitsch <gerd@alog.com.br> wrote:
> Yes. It is the same scenario from previous bug.
>
>
> On 04/12/2011 07:59 AM, Konrad Rzeszutek Wilk wrote:
>>
>> On Fri, Apr 08, 2011 at 06:38:38PM -0300, Gerd Jakobovitsch wrote:
>>>
>>> One more follow-up:
>>>
>>> Another kernel bug report, with no kernel debug activated:
>>
>> This is just with the guest starting right? Not running for a long time?
Since you are using NFS-backed storage, note that the latest
xen/stable-2.6.32 branch has such a fix backported:
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=ae333e97552c81ab10395ad1ffc6d6daaadb144a
Are you able to rerun your test with this?
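Before rerunning, it is worth confirming that a backported fix commit is actually contained in the branch being built; `git merge-base --is-ancestor` does exactly this. The sketch below demonstrates the check against a throwaway repository (the temporary repo and its commits are purely illustrative; against the real tree you would test the backport hash versus origin/xen/stable-2.6.32):

```shell
# Demonstration of checking whether a fix commit is an ancestor of the
# branch under test. The throwaway repo is illustrative; for the real
# check, substitute the backport hash and origin/xen/stable-2.6.32.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "backported fix"
fix=$(git rev-parse HEAD)      # stand-in for the real backport hash
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "later work on the branch"
if git merge-base --is-ancestor "$fix" HEAD; then
    echo "fix present"         # safe to build and test this branch
else
    echo "fix missing"
fi
```

The command exits 0 only when the first argument is reachable from the second, so it can gate a build script directly.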
>>
>>> r2b16ch2x28p2 kernel: [ 3243.777796] CR2: 00000000000002f0
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761622] BUG: unable to
>>> handle kernel NULL pointer dereference at 00000000000002f0
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761781] IP:
>>> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.761892] PGD 710a3067
>>> PUD 724c6067 PMD 0
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762076] Oops: 0000 [#1] SMP
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762212] last sysfs
>>> file: /sys/devices/vbd-6-51712/statistics/wr_sect
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762271] CPU 5
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.762363] Modules linked
>>> in: bnx2 xt_mac bridge stp nfs fscache nfs_acl auth_rpcgss
>>> arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport
>>> lockd sunrpc bonding dm_multipath megaraid_sas [last unloaded: bnx2]
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763350] Pid: 7781,
>>> comm: tapdisk2 Not tainted 2.6.32.36 #5 PowerEdge M610
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763410] RIP:
>>> e030:[<ffffffff8121e7f9>] [<ffffffff8121e7f9>]
>>> blktap_device_end_request+0x4e/0x6c
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763519] RSP:
>>> e02b:ffff88006ed49cf8 EFLAGS: 00010046
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763574] RAX:
>>> 0000000000000000 RBX: ffff88005e6fc3e0 RCX: ffffffff811a3de6
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763635] RDX:
>>> ffff88005e6fc3e0 RSI: ffffffff8149bed6 RDI: ffff880070f42178
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763694] RBP:
>>> ffff880070f42010 R08: ffffffff81661840 R09: 00000001002c9435
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.763971] R10:
>>> 0000000000000000 R11: ffff88005e6378f0 R12: ffff88005e6378f0
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764244] R13:
>>> ffff880070f42000 R14: 0000000000000000 R15: 0000000000000001
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.764522] FS:
>>> 00007fcfd34b0730(0000) GS:ffff880015fe7000(0000)
>>> knlGS:0000000000000000
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765011] CS: e033 DS:
>>> 0000 ES: 0000 CR0: 000000008005003b
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765280] CR2:
>>> 00000000000002f0 CR3: 000000006ed46000 CR4: 0000000000002660
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765554] DR0:
>>> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.765829] DR3:
>>> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766105] Process
>>> tapdisk2 (pid: 7781, threadinfo ffff88006ed48000, task
>>> ffff880079486270)
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766602] Stack:
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.766868]
>>> ffff88007a81dee0 ffff880079486270 0000000000000000 0000000000000001
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767266]<0>
>>> 0000000000000001 ffffffff8121e0b9 0000000000000003 0000000000000000
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.767967]<0>
>>> ffff880070f42000 0000bda50000bda7 ffff88005e6fc3e0 fffffffd00000000
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.768921] Call Trace:
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769193]
>>> [<ffffffff8121e0b9>] ? blktap_ring_ioctl+0x159/0x290
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769474]
>>> [<ffffffff8149c50a>] ? error_exit+0x2a/0x60
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.769750]
>>> [<ffffffff810133dd>] ? retint_restore_args+0x5/0x6
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770030]
>>> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770308]
>>> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770592]
>>> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.770874]
>>> [<ffffffff8100eca5>] ? xen_force_evtchn_callback+0x9/0xa
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771157]
>>> [<ffffffff8100f342>] ? check_events+0x12/0x20
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771435]
>>> [<ffffffff811846b6>] ? selinux_file_ioctl+0x0/0x3d
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771715]
>>> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.771993]
>>> [<ffffffff8100fcea>] ? xen_spin_lock_slow+0xb7/0xf8
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772275]
>>> [<ffffffff810c825a>] ? vfs_ioctl+0x55/0x6b
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772554]
>>> [<ffffffff810c8775>] ? do_vfs_ioctl+0x492/0x4e5
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.772834]
>>> [<ffffffff8100f32f>] ? xen_restore_fl_direct_end+0x0/0x1
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773116]
>>> [<ffffffff810c8819>] ? sys_ioctl+0x51/0x70
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773392]
>>> [<ffffffff81012a02>] ? system_call_fastpath+0x16/0x1b
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.773671] Code: e8 39 f6
>>> ff ff 49 8b 44 24 40 48 8b b8 f0 02 00 00 e8 e6 d6 27 00 4c 89 e7 41
>>> 8b 54 24 60 44 89 f6 e8 b9 59 f8 ff 49 8b 44 24 40<48> 8b b8 f0 02
>>> 00 00 e8 02 14 df ff 66 90 ff 14 25 78 8d 65 81
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777198] RIP
>>> [<ffffffff8121e7f9>] blktap_device_end_request+0x4e/0x6c
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777524]
>>> RSP<ffff88006ed49cf8>
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.777796] CR2:
>>> 00000000000002f0
>>> Apr 8 18:41:58 r2b16ch2x28p2 kernel: [ 3243.778067] ---[ end trace
>>> a71b80c14de09da1 ]---
>>>
>>> On 04/08/2011 12:44 PM, Teck Choon Giam wrote:
>>>>
>>>> On Fri, Apr 8, 2011 at 11:06 PM, Gerd Jakobovitsch
>>>> <gerd@alog.com.br<mailto:gerd@alog.com.br>> wrote:
>>>>
>>>> On 03/30/2011 11:44 PM, Teck Choon Giam wrote:
>>>>>
>>>>> On Wed, Mar 30, 2011 at 10:14 PM, Gerd
>>>>> Jakobovitsch<gerd@alog.com.br> <mailto:gerd@alog.com.br> wrote:
>>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> I used to run xen4.0 kernel 2.6.32.24 over CentOS 5.5, with
>>>>>> relative success, but the bug at mmu.c appeared once in a while. Therefore,
>>>>>> I'm looking for a more stable option.
>>>>>> I compiled and ran the newly released xen 4.1, with the PVOPS kernel
>>>>>> 2.6.32.35 over CentOS 5.5. When trying to start a VM, the following bugs
>>>>>> appeared in dmesg. After that, the xl and xm commands no longer respond:
>>>>>>
>>>>>> [ 145.749573] alloc irq_desc for 2209 on node -1
>>>>>> [ 145.749581] alloc kstat_irqs on node -1
>>>>>> [ 145.883515] block tda: sector-size: 512 capacity: 262144
>>>>>> [ 145.889952] general protection fault: 0000 [#1] SMP
>>>>>> [ 145.890109] last sysfs file: /sys/block/tda/removable
>>>>>> [ 145.890164] CPU 7
>>>>>> [ 145.890252] Modules linked in: bridge stp nfs fscache nfs_acl
>>>>>> auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport
>>>>>> lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>>>>> [ 145.891125] Pid: 5179, comm: tapdisk2 Not tainted 2.6.32.35 #1
>>>>>> PowerEdge M610
>>>>>> [ 145.891184] RIP: e030:[<ffffffff81281e79>] [<ffffffff81281e79>]
>>>>>> blktap_device_end_request+0x4e/0x63
>>>>>> [ 145.891296] RSP: e02b:ffff880064061cd8 EFLAGS: 00010046
>>>>>> [ 145.891351] RAX: 6b6b6b6b6b6b6b6b RBX: ffff88007d264690 RCX:
>>>>>> 0000000000000028
>>>>>> [ 145.891410] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
>>>>>> 0000000000000000
>>>>>> [ 145.891469] RBP: ffff880064061cf8 R08: 0000000064061c98 R09:
>>>>>> ffff88007da42948
>>>>>> [ 145.891528] R10: ffffea0000000008 R11: 0000000001f60080 R12:
>>>>>> ffff88007da427f8
>>>>>> [ 145.891587] R13: ffff88007c75f398 R14: 0000000000000000 R15:
>>>>>> ffff88007c75f3a8
>>>>>> [ 145.891651] FS: 00007ff33d9a4730(0000)
>>>>>> GS:ffff8800189e5000(0000) knlGS:0000000000000000
>>>>>> [ 145.891714] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>> [ 145.891771] CR2: 0000000002594cc8 CR3: 000000007be61000 CR4:
>>>>>> 0000000000002660
>>>>>> [ 145.891830] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>>>> 0000000000000000
>>>>>> [ 145.891890] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>>>>>> 0000000000000400
>>>>>> [ 145.892171] Process tapdisk2 (pid: 5179, threadinfo
>>>>>> ffff880064060000, task ffff88007c272d60)
>>>>>> [ 145.892669] Stack:
>>>>>> [ 145.892934] ffff88007c272d60 0000000000000000 0000000000000000
>>>>>> 0000000000000000
>>>>>> [ 145.893334]<0> ffff880064061e88 ffffffff812815ae
>>>>>> ffff880064061e58 ffffffff811d234f
>>>>>> [ 145.894035]<0> ffff88007e9bbfc0 ffff88007c75f398
>>>>>> 00000001ffffffff 0000000000000000
>>>>>> [ 145.895015] Call Trace:
>>>>>> [ 145.895286] [<ffffffff812815ae>] blktap_ring_ioctl+0x183/0x2d8
>>>>>> [ 145.895566] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>>>>> [ 145.895844] [<ffffffff811d234f>] ? inode_has_perm+0x77/0x89
>>>>>> [ 145.896124] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>>>>> [ 145.896403] [<ffffffff81219d28>] ? _raw_spin_unlock+0xab/0xb2
>>>>>> [ 145.896682] [<ffffffff81529311>] ? _spin_unlock+0x9/0xb
>>>>>> [ 145.896958] [<ffffffff81219e24>] ? _raw_spin_lock+0x77/0x12f
>>>>>> [ 145.897234] [<ffffffff811d2415>] ? file_has_perm+0xb4/0xc6
>>>>>> [ 145.897513] [<ffffffff810fe868>] vfs_ioctl+0x5e/0x77
>>>>>> [ 145.897786] [<ffffffff810fed7d>] do_vfs_ioctl+0x484/0x4d5
>>>>>> [ 145.898060] [<ffffffff810fee25>] sys_ioctl+0x57/0x7a
>>>>>> [ 145.898338] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>>> [ 145.898614] Code: e8 5f f4 ff ff 49 8b 44 24 40 48 8b b8 80 03
>>>>>> 00 00 e8 64 75 2a 00 41 8b 54 24 60 44 89 f6 4c 89 e7 e8 b5 89 f7 ff 49 8b
>>>>>> 44 24 40<48> 8b b8 80 03 00 00 e8 23 74 2a 00 5b 41 5c 41 5d 41 5e c9 c3
>>>>>> [ 145.902008] RIP [<ffffffff81281e79>]
>>>>>> blktap_device_end_request+0x4e/0x63
>>>>>> [ 145.902321] RSP<ffff880064061cd8>
>>>>>> [ 145.902585] ---[ end trace 2800cfa5aa85ca0a ]---
>>>>>> [ 262.100689] BUG: spinlock lockup on CPU#4, vol_id/5181,
>>>>>> ffff88007c75f520
>>>>>> [ 262.100965] Pid: 5181, comm: vol_id Tainted: G D
>>>>>> 2.6.32.35 #1
>>>>>> [ 262.101232] Call Trace:
>>>>>> [ 262.101497] [<ffffffff81219eae>] _raw_spin_lock+0x101/0x12f
>>>>>> [ 262.101762] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>>>>> [ 262.102028] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>>>>> [ 262.102294] [<ffffffff8101019f>] ?
>>>>>> xen_restore_fl_direct_end+0x0/0x1
>>>>>> [ 262.102563] [<ffffffff811fa171>]
>>>>>> generic_make_request+0x258/0x2f4
>>>>>> [ 262.102832] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>>>>> [ 262.103099] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>>>>> [ 262.103366] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>>>>> [ 262.103631] [<ffffffff8111448f>]
>>>>>> block_read_full_page+0x246/0x264
>>>>>> [ 262.103898] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>>>>> [ 262.104165] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>>>>> [ 262.104433] [<ffffffff810ba73d>] ?
>>>>>> add_to_page_cache_locked+0xa0/0xca
>>>>>> [ 262.104702] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>>>>> [ 262.104972] [<ffffffff810c1d36>]
>>>>>> __do_page_cache_readahead+0x144/0x177
>>>>>> [ 262.105240] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>>>>> [ 262.105507] [<ffffffff810c20d7>]
>>>>>> page_cache_sync_readahead+0x38/0x3a
>>>>>> [ 262.105778] [<ffffffff810bb833>]
>>>>>> generic_file_aio_read+0x24c/0x5c1
>>>>>> [ 262.106045] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>>>>> [ 262.106315] [<ffffffff81068a02>] ?
>>>>>> autoremove_wake_function+0x0/0x38
>>>>>> [ 262.106584] [<ffffffff811d701c>] ?
>>>>>> selinux_file_permission+0x5c/0x10e
>>>>>> [ 262.106854] [<ffffffff811ce9c4>] ?
>>>>>> security_file_permission+0x11/0x13
>>>>>> [ 262.107120] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>>>>> [ 262.107385] [<ffffffff810f2374>] sys_read+0x47/0x70
>>>>>> [ 262.107652] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>>> [ 262.107918] sending NMI to all CPUs:
>>>>>> [ 262.108189] BUG: unable to handle kernel paging request at
>>>>>> ffffffffff5fb310
>>>>>> [ 262.108526] IP: [<ffffffff8102c7d1>]
>>>>>> flat_send_IPI_mask+0x6a/0xc0
>>>>>> [ 262.108832] PGD 1003067 PUD 1004067 PMD 18b7067 PTE 0
>>>>>> [ 262.109235] Oops: 0002 [#2] SMP
>>>>>> [ 262.109565] last sysfs file: /sys/class/blktap2/blktap1/dev
>>>>>> [ 262.109830] CPU 4
>>>>>> [ 262.110121] Modules linked in: bridge stp nfs fscache nfs_acl
>>>>>> auth_rpcgss arptable_filter arp_tables xt_esp ipt_ah xt_physdev xt_multiport
>>>>>> lockd sunrpc bonding dm_multipath bnx2 megaraid_sas
>>>>>> [ 262.111520] Pid: 5181, comm: vol_id Tainted: G D
>>>>>> 2.6.32.35 #1 PowerEdge M610
>>>>>> [ 262.112008] RIP: e030:[<ffffffff8102c7d1>] [<ffffffff8102c7d1>]
>>>>>> flat_send_IPI_mask+0x6a/0xc0
>>>>>> [ 262.112535] RSP: e02b:ffff88006778f968 EFLAGS: 00010086
>>>>>> [ 262.112800] RAX: 00000000ff000000 RBX: ffffffff81790060 RCX:
>>>>>> 00000000000160a0
>>>>>> [ 262.113068] RDX: ffff88001898e000 RSI: 0000000000000002 RDI:
>>>>>> ffffffff81816020
>>>>>> [ 262.113337] RBP: ffff88006778f988 R08: 0000000000000000 R09:
>>>>>> 0000000000000004
>>>>>> [ 262.113605] R10: 0000000000000002 R11: 0000000000000004 R12:
>>>>>> 0000000000000002
>>>>>> [ 262.113877] R13: 0000000000000800 R14: 00000000000000ff R15:
>>>>>> 0000000000000000
>>>>>> [ 262.114149] FS: 00007fa78bcc5710(0063)
>>>>>> GS:ffff88001898e000(0000) knlGS:0000000000000000
>>>>>> [ 262.114636] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>> [ 262.114902] CR2: ffffffffff5fb310 CR3: 00000000641b4000 CR4:
>>>>>> 0000000000002660
>>>>>> [ 262.115171] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>>>>> 0000000000000000
>>>>>> [ 262.115438] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>>>>>> 0000000000000400
>>>>>> [ 262.115707] Process vol_id (pid: 5181, threadinfo
>>>>>> ffff88006778e000, task ffff88007db86250)
>>>>>> [ 262.116194] Stack:
>>>>>> [ 262.116451] 0000000000000000 0000000076e9ecd0 0000000000000000
>>>>>> 0000000076e9ecd0
>>>>>> [ 262.116825]<0> ffff88006778f998 ffffffff8102c841
>>>>>> ffff88006778f9b8 ffffffff81029f0d
>>>>>> [ 262.117485]<0> ffff88007c75f520 ffff88007c75f520
>>>>>> ffff88006778f9f8 ffffffff81219eb3
>>>>>> [ 262.118396] Call Trace:
>>>>>> [ 262.118657] [<ffffffff8102c841>] flat_send_IPI_all+0x1a/0x56
>>>>>> [ 262.118925] [<ffffffff81029f0d>]
>>>>>> arch_trigger_all_cpu_backtrace+0x45/0x66
>>>>>> [ 262.119195] [<ffffffff81219eb3>] _raw_spin_lock+0x106/0x12f
>>>>>> [ 262.119463] [<ffffffff815293e6>] _spin_lock_irq+0x1e/0x20
>>>>>> [ 262.119730] [<ffffffff811fcc14>] __make_request+0x5e/0x402
>>>>>> [ 262.119996] [<ffffffff8101019f>] ?
>>>>>> xen_restore_fl_direct_end+0x0/0x1
>>>>>> [ 262.120264] [<ffffffff811fa171>]
>>>>>> generic_make_request+0x258/0x2f4
>>>>>> [ 262.120532] [<ffffffff811156d8>] ? bio_init+0x18/0x32
>>>>>> [ 262.120799] [<ffffffff811fbec8>] submit_bio+0xd0/0xd9
>>>>>> [ 262.121066] [<ffffffff81111574>] submit_bh+0xf7/0x11a
>>>>>> [ 262.121333] [<ffffffff8111448f>]
>>>>>> block_read_full_page+0x246/0x264
>>>>>> [ 262.121602] [<ffffffff81117c13>] ? blkdev_get_block+0x0/0x4d
>>>>>> [ 262.121870] [<ffffffff815292c6>] ? _spin_unlock_irq+0x1e/0x20
>>>>>> [ 262.122137] [<ffffffff810ba73d>] ?
>>>>>> add_to_page_cache_locked+0xa0/0xca
>>>>>> [ 262.127766] [<ffffffff81116ef9>] blkdev_readpage+0x13/0x15
>>>>>> [ 262.128025] [<ffffffff810c1d36>]
>>>>>> __do_page_cache_readahead+0x144/0x177
>>>>>> [ 262.128288] [<ffffffff810c1f8f>] ondemand_readahead+0x126/0x18e
>>>>>> [ 262.128548] [<ffffffff810c20d7>]
>>>>>> page_cache_sync_readahead+0x38/0x3a
>>>>>> [ 262.128810] [<ffffffff810bb833>]
>>>>>> generic_file_aio_read+0x24c/0x5c1
>>>>>> [ 262.129070] [<ffffffff810f1808>] do_sync_read+0xe2/0x126
>>>>>> [ 262.129329] [<ffffffff81068a02>] ?
>>>>>> autoremove_wake_function+0x0/0x38
>>>>>> [ 262.129590] [<ffffffff811d701c>] ?
>>>>>> selinux_file_permission+0x5c/0x10e
>>>>>> [ 262.129851] [<ffffffff811ce9c4>] ?
>>>>>> security_file_permission+0x11/0x13
>>>>>> [ 262.130110] [<ffffffff810f1f7b>] vfs_read+0xab/0x167
>>>>>> [ 262.130368] [<ffffffff810f2374>] sys_read+0x47/0x70
>>>>>> [ 262.130624] [<ffffffff81013d02>] system_call_fastpath+0x16/0x1b
>>>>>> [ 262.130883] Code: 8b 05 b4 95 7e 00 83 fe 02 44 8b 68 34 75 0a
>>>>>> ff 90 58 01 00 00 eb 0e f3 90 8b 04 25 00 b3 5f ff f6 c4 10 75 f2 44 89 f0
>>>>>> c1 e0 18<89> 04 25 10 b3 5f ff 41 83 fc 02 74 08 44 89 e0 44 09 e8 eb 06
>>>>>> [ 262.133866] RIP [<ffffffff8102c7d1>]
>>>>>> flat_send_IPI_mask+0x6a/0xc0
>>>>>> [ 262.134164] RSP<ffff88006778f968>
>>>>>> [ 262.134419] CR2: ffffffffff5fb310
>>>>>> [ 262.134673] ---[ end trace 2800cfa5aa85ca0b ]---
>>>>>>
>>>>> Can you try to recompile your PVOPS kernel with
>>>>> CONFIG_DEBUG_PAGEALLOC=y?
>>>>>
>>>>> You can read more about this BUG at
>>>>>
>>>>> http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01756.html
>>>>>
>>>>> I initially hit this BUG sometime Dec 2010... ...
>>>>>
>>>>> http://lists.xensource.com/archives/html/xen-devel/2010-12/msg01501.html
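For reference, the mechanical part of that suggestion is flipping one line in the kernel `.config` before rebuilding. A minimal sketch on a scratch file (on a real source tree you would follow this with `make oldconfig` and a full rebuild):

```shell
# Enable CONFIG_DEBUG_PAGEALLOC in a kernel config (scratch file here;
# on a real source tree, run 'make oldconfig' and rebuild afterwards).
cfg=config.demo
printf '# CONFIG_DEBUG_PAGEALLOC is not set\n' > "$cfg"
sed -i 's/^# CONFIG_DEBUG_PAGEALLOC is not set$/CONFIG_DEBUG_PAGEALLOC=y/' "$cfg"
grep '^CONFIG_DEBUG_PAGEALLOC=y' "$cfg"
```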
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Kindest regards,
>>>>> Giam Teck Choon
>>>>
>>>> Sorry for the delayed answer. The problem I'm facing now is not
>>>> related to the mmu bug - that one I am still seeing on systems
>>>> with xen 4.0.2 / kernel 2.6.32.24. Newer kernels have new bugs,
>>>> much more troublesome, since I cannot run a single VM instance.
>>>>
>>>> With DEBUG_PAGEALLOC added, the main difference is that the system
>>>> reboots shortly after trying to start up a VM:
>>>>
>>>>
>>>> Ok. Sorry, I didn't read your log message carefully previously.
>>>>
>>>>
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: I/O queue driver: lio
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'attach'
>>>> message (uuid = 0)
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'attach
>>>> response' message (uuid = 0)
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: received 'open'
>>>> message (uuid = 0)
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: block-aio
>>>> open('/storage5/linux-centos-5-64b-base-7253/hda')
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
>>>> open(/storage5/linux-centos-5-64b-base-7253/hda) with O_DIRECT
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: Image size:
>>>> pre sector_shift [134217728] post sector_shift [262144]
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: opened image
>>>> /storage5/linux-centos-5-64b-base-rip/hda (1 users, state:
>>>> 0x00000001, type: 0)
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: VBD CHAIN:
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]:
>>>> /storage5/linux-centos-5-64b-base/hda: 0
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15563]: sending 'open
>>>> response' message (uuid = 0)
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.644887] block tda:
>>>> sector-size: 512 capacity: 262144
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657328] general
>>>> protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657379] last sysfs
>>>> file: /sys/block/tda/removable
>>>>
>>>>
>>>> Just curious... ... what type of storage are you using for your VMs?
>>>>
>>>>
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657400] CPU 0
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657421] Modules
>>>> linked in: nfs fscache nfs_acl auth_rpcgss bridge stp ocfs2
>>>> ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager
>>>> ocfs2_stackglue configfs arptable_filter arp_tables xt_esp ipt_ah
>>>> xt_physdev xt_multiport dm_round_robin lockd sunrpc crc32c bonding
>>>> iscsi_tcp libiscsi_tcp bnx2i libiscsi scsi_transport_iscsi cnic
>>>> uio dm_multipath bnx2 megaraid_sas
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657736] Pid: 15566,
>>>> comm: tapdisk2 Not tainted 2.6.32.36 #4 PowerEdge M610
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657763] RIP:
>>>> e030:[<ffffffff8129fb89>] [<ffffffff8129fb89>]
>>>> blktap_device_end_request+0x4e/0x63
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657808] RSP:
>>>> e02b:ffff88006da5dcd8 EFLAGS: 00010046
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657833] RAX:
>>>> 6b6b6b6b6b6b6b6b RBX: ffff88006566c000 RCX: 0000000000000000
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657860] RDX:
>>>> 0000000000000000 RSI: 0000000000000000 RDI: ffff88006d8c7980
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.657887] RBP:
>>>> ffff88006da5dcf8 R08: ffffffff817e66a0 R09: ffff88007d775790
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658136] R10:
>>>> ffffffff810ccfe4 R11: ffff8800280d8f60 R12: ffff88006720e7f8
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658384] R13:
>>>> ffff88006d8c77e0 R14: 0000000000000000 R15: ffff88006d8c77f0
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.658635] FS:
>>>> 00007f86d75a3730(0000) GS:ffff8800280c7000(0000)
>>>> knlGS:0000000000000000
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659107] CS: e033 DS:
>>>> 0000 ES: 0000 CR0: 000000008005003b
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659351] CR2:
>>>> 0000000045614ed8 CR3: 000000006d911000 CR4: 0000000000002660
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659600] DR0:
>>>> 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.659847] DR3:
>>>> 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660100] Process
>>>> tapdisk2 (pid: 15566, threadinfo ffff88006da5c000, task
>>>> ffff88007d7750f0)
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660574] Stack:
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.660807]
>>>> ffff88007d7750f0 0000000000000000 0000000000000000
>>>> 0000000000000000
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661074]<0>
>>>> ffff88006da5de88 ffffffff8129f2c0 ffffffff8100fedd 000000016da5ddc8
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.661574]<0>
>>>> 00000000ffffffff ffff88006d8c77e0 00000001816fab27 0000000000000000
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662287] Call Trace:
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662522]
>>>> [<ffffffff8129f2c0>] blktap_ring_ioctl+0x183/0x2d8
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.662767]
>>>> [<ffffffff8100fedd>] ? xen_force_evtchn_callback+0xd/0xf
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663012]
>>>> [<ffffffff811ebe9b>] ? inode_has_perm+0xa1/0xb3
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663260]
>>>> [<ffffffff8101064f>] ? xen_restore_fl_direct_end+0x0/0x1
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663509]
>>>> [<ffffffff81081d84>] ? lock_release+0x1b8/0x1c3
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663756]
>>>> [<ffffffff81233d2c>] ? _raw_spin_unlock+0xab/0xb2
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.663999]
>>>> [<ffffffff8155618d>] ? _spin_unlock+0x26/0x2a
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664248]
>>>> [<ffffffff81134aa1>] ? aio_read_evt+0x87/0x13a
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664493]
>>>> [<ffffffff81134b36>] ? aio_read_evt+0x11c/0x13a
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664740]
>>>> [<ffffffff81233ecc>] ? _raw_spin_lock+0x77/0x12f
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.664986]
>>>> [<ffffffff811ec054>] ? file_has_perm+0xb4/0xc6
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665237]
>>>> [<ffffffff81112444>] vfs_ioctl+0x5e/0x77
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665480]
>>>> [<ffffffff81112959>] do_vfs_ioctl+0x484/0x4d5
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665726]
>>>> [<ffffffff81112a01>] sys_ioctl+0x57/0x7a
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.665970]
>>>> [<ffffffff81013d42>] system_call_fastpath+0x16/0x1b
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.666214] Code: e8 61
>>>> f4 ff ff 49 8b 44 24 40 48 8b b8 70 04 00 00 e8 79 67 2b 00 41 8b
>>>> 54 24 60 44 89 f6 4c 89 e7 e8 76 3b f7 ff 49 8b 44 24 40<48> 8b
>>>> b8 70 04 00 00 e8 1d 65 2b 00 5b 41 5c 41 5d 41 5e c9 c3
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667320] RIP
>>>> [<ffffffff8129fb89>] blktap_device_end_request+0x4e/0x63
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.667577] RSP
>>>> <ffff88006da5dcd8>
>>>> Apr 8 12:00:43 r2b16ch2x28p2 kernel: [ 8879.668069] ---[ end
>>>> trace da218b929afc63f7 ]---
>>>> Apr 8 12:00:43 r2b16ch2x28p2 tapdisk2[15584]: I/O queue driver: lio
>>>> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
>>>> tap-err:tap_ctl_read_message: failure reading message
>>>> Apr 8 12:00:48 r2b16ch2x28p2 tap-ctl:
>>>> tap-err:tap_ctl_send_and_receive: failed to receive 'unknown' message
>>>>
>>>>
>>>> This looks like a blktap/blktap2 driver-related issue to me... so
>>>> you are right... ... this is a different BUG from the one I
>>>> encountered, as the logs are different. Sorry, I didn't read your
>>>> log carefully before replying previously. Have you tried using
>>>> normal LVM for your VM to reproduce the BUG? I am guessing you are
>>>> using a different kind of storage.
>>>>
>>>> Thanks.
>>>>
>>>> Kindest regards,
>>>> Giam Teck Choon
>>>
>
>
>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2011-04-12 16:41 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-30 14:14 BUG at xen4.1/kernel 2.6.32.35 at a CentOS 5.5 when starting a VM Gerd Jakobovitsch
2011-03-31 2:44 ` Teck Choon Giam
2011-04-08 15:06 ` Gerd Jakobovitsch
2011-04-08 15:44 ` Teck Choon Giam
2011-04-08 20:51 ` Gerd Jakobovitsch
2011-04-08 21:38 ` Gerd Jakobovitsch
2011-04-12 10:59 ` Konrad Rzeszutek Wilk
2011-04-12 11:19 ` Gerd Jakobovitsch
2011-04-12 16:41 ` Teck Choon Giam
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).