From: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: xen dom0 2.6.32.15 kernel BUG at drivers/xen/grant-table.c:583
Date: Mon, 14 Jun 2010 14:44:36 +0200
Message-ID: <4C162434.5050708@nets.rwth-aachen.de>
In-Reply-To: <4C161FFF.4050102@nets.rwth-aachen.de>
On 14.06.2010 14:26, Arnd Hannemann wrote:
> Hi,
>
> On 14.06.2010 12:57, Stefano Stabellini wrote:
>> On Mon, 14 Jun 2010, Arnd Hannemann wrote:
>>> Hi,
>>>
>>> we get regular but hard-to-reproduce kernel panics (see below) with the latest
>>> "xen/stable-2.6.32.x" git tree. Reproducing one takes a day or two of starting domUs.
>>>
>>> Any idea, anyone?
>>>
>>
>> this CS from origin/xen/dom0/gntdev should fix your problem:
>>
>> sstabellini@kaball-desktop:~/xensource/linux-pvops-latest$ git show ad469f0da31bc16b945f9a06710b9d45434d0091
>> commit ad469f0da31bc16b945f9a06710b9d45434d0091
>> Author: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
>> Date: Wed Jun 9 12:34:02 2010 -0700
>>
>> xen/gntdev: use spinlocks rather than rwsem for locking
>>
>> The mmu notifier mechanism calls its callbacks with an rcu lock,
>> which disables preemption. This means we cannot use any blocking
>> synchronization for locking.
>>
>> Convert all the rwsemas to plain spinlocks. This requires that
>> the memory allocation and copying to/from userspace be split
>> from the actual datastructure updates since they can't be done
>> under spinlock.
>>
>> Signed-off-by: Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>
>> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
>>
>
> Unfortunately, this patch does not seem to help. We get a very similar
> backtrace after one hour of stress testing with a script that starts and stops
> domUs in a loop.
>
> Maybe the problem is the hypervisor itself?
> We are currently using 4.0.1-rc2-pre (we updated from 4.0.0 because of what we
> believed was the same problem, although we had no working netconsole back then).
FYI: I got lucky and reproduced the error within only 15 minutes, with the following hypervisor version:
(XEN) Xen version 4.0.1-rc3-pre (samsel@umic-mesh.net) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) Mon Jun 14 12:43:49 CEST 2010
(XEN) Latest ChangeSet: Fri Jun 11 14:04:36 2010 +0100 21203:3903d95733f7
Traceback below:
Jun 14 14:38:14 vmhost2 [ 201.636188] ------------[ cut here ]------------
Jun 14 14:38:14 vmhost2 [ 201.636272] kernel BUG at drivers/xen/grant-table.c:583!
Jun 14 14:38:14 vmhost2 [ 201.636345] invalid opcode: 0000 [#1] SMP
Jun 14 14:38:14 vmhost2 [ 201.636503] last sysfs file: /sys/devices/virtual/net/br0/bridge/topology_change_detected
Jun 14 14:38:14 vmhost2 [ 201.636596] Modules linked in: netconsole raid0 md_mod rtc_cmos rtc_core rtc_lib thermal processor ipv6 thermal_sys hwmon button acpi_processor sr_mod pl2303 cdrom usbserial evdev
Jun 14 14:38:14 vmhost2 [ 201.637553]
Jun 14 14:38:14 vmhost2 [ 201.637619] Pid: 0, comm: swapper Not tainted (2.6.32.15-xen4.0.0-dom0-stefano #2) System Product Name
Jun 14 14:38:14 vmhost2 [ 201.637715] EIP: 0061:[<c120f170>] EFLAGS: 00010282 CPU: 0
Jun 14 14:38:14 vmhost2 [ 201.637792] EIP is at gnttab_copy_grant_page+0x1f0/0x260
Jun 14 14:38:14 vmhost2 [ 201.637864] EAX: ffffffea EBX: c153be84 ECX: 00000001 EDX: 00000000
Jun 14 14:38:14 vmhost2 [ 201.637937] ESI: 00007ff0 EDI: 0000000f EBP: c290d120 ESP: c153be50
Jun 14 14:38:14 vmhost2 [ 201.638022] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Jun 14 14:38:14 vmhost2 [ 201.638096] Process swapper (pid: 0, ti=c153a000 task=c1543760 task.ti=c153a000)
Jun 14 14:38:14 vmhost2 [ 201.638187] Stack:
Jun 14 14:38:14 vmhost2 [ 201.638251] 00000000 00213e1c c28f20c0 0002c189 ec189000 ecd95944 0000000f ec189000
Jun 14 14:38:14 vmhost2 [ 201.638634] <0> 00000000 eb406000 00000000 0000000f ece40000 13e1c001 00000000 0002c189
Jun 14 14:38:14 vmhost2 [ 201.639115] <0> 00000000 c1627a8c c16277c8 c1627a8c 000068c4 c12200c1 00000000 ebce8000
Jun 14 14:38:14 vmhost2 [ 201.639655] Call Trace:
Jun 14 14:38:14 vmhost2 [ 201.639729] [<c12200c1>] ? net_tx_action+0x1d1/0x9b0
Jun 14 14:38:14 vmhost2 [ 201.639805] [<c135e4e0>] ? process_backlog+0x90/0xa0
Jun 14 14:38:14 vmhost2 [ 201.639882] [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 14 14:38:14 vmhost2 [ 201.639956] [<c103c378>] ? __do_softirq+0x88/0x110
Jun 14 14:38:14 vmhost2 [ 201.640032] [<c1210057>] ? __xen_evtchn_do_upcall+0xd7/0x160
Jun 14 14:38:14 vmhost2 [ 201.640108] [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 14 14:38:14 vmhost2 [ 201.640184] [<c121063a>] ? xen_evtchn_do_upcall+0x2a/0x40
Jun 14 14:38:14 vmhost2 [ 201.640261] [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 14 14:38:14 vmhost2 [ 201.640336] [<c10013a7>] ? hypercall_page+0x3a7/0x1010
Jun 14 14:38:14 vmhost2 [ 201.640411] [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 14 14:38:14 vmhost2 [ 201.640486] [<c100382c>] ? xen_idle+0x1c/0x30
Jun 14 14:38:14 vmhost2 [ 201.640560] [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 14 14:38:14 vmhost2 [ 201.640635] [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 14 14:38:14 vmhost2 [ 201.640710] [<c1578367>] ? unknown_bootoption+0x0/0x190
Jun 14 14:38:14 vmhost2 [ 201.640786] [<c157b0e6>] ? xen_start_kernel+0x624/0x62c
Jun 14 14:38:14 vmhost2 [ 201.640857] Code: 8d 5c 24 34 c1 e0 0c 83 c8 01 89 44 24 34 8b 44 24 0c c7 44 24 40 00 00 00 00 89 44 24 3c e8 b8 1e df ff 85 c0 0f 84 2c ff ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 0f 0b eb fe 8b 54 24 04 8b 44 24 0c e8
Jun 14 14:38:14 vmhost2 [ 201.643843] EIP: [<c120f170>] gnttab_copy_grant_page+0x1f0/0x260 SS:ESP 0069:c153be50
Jun 14 14:38:14 vmhost2 [ 201.644028] ---[ end trace af6399fb7ba91a18 ]---
Jun 14 14:38:14 vmhost2 [ 201.644098] Kernel panic - not syncing: Fatal exception in interrupt
Jun 14 14:38:14 vmhost2 [ 201.644173] Pid: 0, comm: swapper Tainted: G D 2.6.32.15-xen4.0.0-dom0-stefano #2
Jun 14 14:38:14 vmhost2 [ 201.644265] Call Trace:
Jun 14 14:38:14 vmhost2 [ 201.644336] [<c141d3b5>] ? panic+0x42/0xe1
Jun 14 14:38:14 vmhost2 [ 201.644408] [<c100cc56>] ? oops_end+0x96/0xa0
Jun 14 14:38:14 vmhost2 [ 201.644481] [<c100a73f>] ? do_invalid_op+0x7f/0x90
Jun 14 14:38:14 vmhost2 [ 201.644555] [<c120f170>] ? gnttab_copy_grant_page+0x1f0/0x260
Jun 14 14:38:14 vmhost2 [ 201.644632] [<c13de9b0>] ? br_nf_pre_routing_finish+0x0/0x310
Jun 14 14:38:14 vmhost2 [ 201.644709] [<c137ae82>] ? nf_hook_slow+0x62/0xe0
Jun 14 14:38:14 vmhost2 [ 201.644784] [<c10741e4>] ? __alloc_pages_nodemask+0xe4/0x5b0
Jun 14 14:38:14 vmhost2 [ 201.644860] [<c106271d>] ? handle_IRQ_event+0x5d/0xc0
Jun 14 14:38:14 vmhost2 [ 201.644935] [<c141faa6>] ? error_code+0x66/0x6c
Jun 14 14:38:14 vmhost2 [ 201.645009] [<c137007b>] ? dev_graft_qdisc+0x5b/0x70
Jun 14 14:38:14 vmhost2 [ 201.645083] [<c100a6c0>] ? do_invalid_op+0x0/0x90
Jun 14 14:38:14 vmhost2 [ 201.645157] [<c120f170>] ? gnttab_copy_grant_page+0x1f0/0x260
Jun 14 14:38:14 vmhost2 [ 201.645234] [<c12200c1>] ? net_tx_action+0x1d1/0x9b0
Jun 14 14:38:14 vmhost2 [ 201.645308] [<c135e4e0>] ? process_backlog+0x90/0xa0
Jun 14 14:38:14 vmhost2 [ 201.645382] [<c103bc2e>] ? tasklet_action+0x9e/0xb0
Jun 14 14:38:14 vmhost2 [ 201.645455] [<c103c378>] ? __do_softirq+0x88/0x110
Jun 14 14:38:14 vmhost2 [ 201.645529] [<c1210057>] ? __xen_evtchn_do_upcall+0xd7/0x160
Jun 14 14:38:14 vmhost2 [ 201.645604] [<c103c43d>] ? do_softirq+0x3d/0x40
Jun 14 14:38:14 vmhost2 [ 201.645677] [<c121063a>] ? xen_evtchn_do_upcall+0x2a/0x40
Jun 14 14:38:14 vmhost2 [ 201.645754] [<c1009da7>] ? xen_do_upcall+0x7/0xc
Jun 14 14:38:14 vmhost2 [ 201.645830] [<c10013a7>] ? hypercall_page+0x3a7/0x1010
Jun 14 14:38:14 vmhost2 [ 201.645904] [<c10061ef>] ? xen_safe_halt+0xf/0x20
Jun 14 14:38:14 vmhost2 [ 201.645989] [<c100382c>] ? xen_idle+0x1c/0x30
Jun 14 14:38:14 vmhost2 [ 201.646063] [<c10081fa>] ? cpu_idle+0x3a/0x60
Jun 14 14:38:14 vmhost2 [ 201.646139] [<c15787ef>] ? start_kernel+0x2c6/0x2cb
Jun 14 14:38:14 vmhost2 [ 201.646213] [<c1578367>] ? unknown_bootoption+0x0/0x190
Jun 14 14:38:14 vmhost2 [ 201.646288] [<c157b0e6>] ? xen_start_kernel+0x624/0x62c