* Reproducible system crash
@ 2005-03-15 14:36 ` Peter Joanes
0 siblings, 0 replies; 15+ messages in thread
From: Peter Joanes @ 2005-03-15 14:36 UTC (permalink / raw)
To: xen-devel
Hello,
My system instantly reboots every time a particular program is run as a user
(unprivileged) process in dom0. I haven't tried running it under a domU
kernel yet and, of course, can't even be sure that it is a Xen-specific
problem.
I'm using Xen 2.0.4 with kernel 2.6.10 on a recent Gentoo installation (gcc
3.4.3).
Unfortunately the program at issue is proprietary - it's IE6 running on wine
(v. 20050211).
I had copied over a previously working installation (user's ".wine"
directory), but I get the same result at the end of the ie6setup install
stage of the instructions at:
http://frankscorner.org/index.php?p=ie6
I'd be grateful for any suggestions of ways to debug this.
- Peter Joanes.
-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: Reproducible system crash
@ 2005-03-15 16:04 Ian Pratt
2005-03-15 16:30 ` Peter Joanes
0 siblings, 1 reply; 15+ messages in thread
From: Ian Pratt @ 2005-03-15 16:04 UTC (permalink / raw)
To: Peter Joanes, xen-devel; +Cc: ian.pratt
> My system instantly reboots every time a particular program is
> run as a user
> (unprivileged) process in dom0.
Are you running this as root or is the binary suid root?
Ian
> I haven't tried running it
> under a domU
> kernel yet, and, of course, can't even be sure that it is a
> xen-specific
> problem.
> I'm using Xen 2.0.4 with kernel 2.6.10 on a recent Gentoo
> installation (gcc
> 3.4.3).
>
> Unfortunately the program at issue is proprietary - it's IE6
> running on wine
> (v. 20050211).
> I had copied over a previously working installation (user's ".wine"
> directory), but I get the same result at the end of the
> ie6setup install
> stage of the instructions at:
> http://frankscorner.org/index.php?p=ie6
>
> I'd be grateful for any suggestions of ways to debug this.
>
>
> - Peter Joanes.
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
@ 2005-03-15 16:30 ` Peter Joanes
0 siblings, 0 replies; 15+ messages in thread
From: Peter Joanes @ 2005-03-15 16:30 UTC (permalink / raw)
To: xen-devel
On Tuesday 15 March 2005 16:04, Ian Pratt wrote:
> > My system instantly reboots every time a particular program is
> > run as a user
> > (unprivileged) process in dom0.
>
> Are you running this as root or is the binary suid root?
No it isn't. One thing that I had thought must be coincidental, but that might
be worth mentioning anyway was that at the end of the IE6 install, the
message said that the computer needed to be restarted, and it was upon
clicking away that box that the whole machine rebooted. Is it possible that
the Wine environment is prevented from trapping a call that does this?
- Peter.
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: Reproducible system crash
@ 2005-03-15 16:44 Ian Pratt
0 siblings, 0 replies; 15+ messages in thread
From: Ian Pratt @ 2005-03-15 16:44 UTC (permalink / raw)
To: Peter Joanes, xen-devel; +Cc: ian.pratt
> > > My system instantly reboots every time a particular program is
> > > run as a user
> > > (unprivileged) process in dom0.
> >
> > Are you running this as root or is the binary suid root?
>
> No it isn't. One thing that I had thought must be
> coincidental, but that might
> be worth mentioning anyway was that at the end of the IE6
> install, the
> message said that the computer needed to be restarted, and it
> was upon
> clicking away that box that the whole machine rebooted. Is it
> possible that
> the Wine environment is prevented from trapping a call that does this?
I'm not entirely surprised that wine doesn't run, but I am concerned
about it crashing the machine if it is indeed non root. Could you try
running it under strace to see what it is doing?
Thanks,
Ian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
@ 2005-03-16 14:29 ` Peter Joanes
0 siblings, 0 replies; 15+ messages in thread
From: Peter Joanes @ 2005-03-16 14:29 UTC (permalink / raw)
To: xen-devel
On Tuesday 15 March 2005 20:12, Ian Pratt wrote:
> Do you know if wine uses vm86 mode?
Google seems to think so.
> Can you get a serial line on the machines?
Yes. [Actually this took a little while because I didn't realise that Xen
manages the serial port, so the dom0 8250 module can't be loaded.]
Here's the end of the output (and I have the whole capture if that would
help):
writev(4, [{"^\0\0\0\2\0\0\0D\0\0\0\24\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 64}, {"*\0", 2}], 2) = 66
read(5, "\0\0\0\0 \0\0\0\1\0\0\0 \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 64) = 64
read(5, "n\0a\0t\0i\0v\0e\0,\0 \0b\0u\0i\0l\0t\0i\0n\0\0\0", 32) = 32
rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
open("/usr/lib/wine/apphelp.dll.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/wine/apphelp.dll.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_BLOCK, NULL, [RTMIN], 8) = 0
(XEN) BUG at domain.c:142
(XEN) CPU: 0
(XEN) EIP: 0808:[<fc505a0e>]
(XEN) EFLAGS: 00011296
(XEN) eax: 00000000 ebx: fc5fd9e0 ecx: 00000000 edx: 00000000
(XEN) esi: 000e0003 edi: c010a960 ebp: 77acdfa0 esp: fc503f9c
(XEN) ds: 0810 es: 0810 fs: 0810 gs: 0810 ss: 0810
(XEN) Stack trace from ESP=fc503f9c:
(XEN) fc52a478 fc52a4ce 0000008e 000e0003 [fc505930] fc5fd9e0 000e0003 77ef5ca4
(XEN) 0000007b 00000000 ffffffff c010a960 77acdfa0 ffffffff 000e0003 c010990a
(XEN) 00000061 00011296 d2278000 00000069 0000007b 0000007b 0000003b 00000033
(XEN) fc5fd9e0
(XEN) Call Trace from ESP=fc503f9c: [<fc505930>]
****************************************
CPU0 FATAL TRAP: vector = 6 (invalid operand)
[error_code=0000]
Aieee! CPU0 is toast...
****************************************
Reboot in five seconds...
- Peter.
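
Editorial note: the bracketed values on the "Call Trace" line are the addresses one would look up in the hypervisor's symbol table. The following is a minimal Python sketch, not part of the original thread, for pulling them out of a captured serial log. The function name and the assumption that the trace ends at the row of asterisks are illustrative only.

```python
import re

# Matches bracketed call-trace entries such as [<fc505930>].
TRACE_ADDR = re.compile(r"\[<([0-9a-fA-F]{8})>\]")

def call_trace_addresses(console_log: str) -> list[int]:
    """Collect addresses from the 'Call Trace' line and its continuations."""
    addrs = []
    in_trace = False
    for line in console_log.splitlines():
        if "Call Trace" in line:
            in_trace = True
        elif line.startswith("*"):
            # Assumed end marker: the banner of asterisks before FATAL TRAP.
            in_trace = False
        if in_trace:
            addrs.extend(int(a, 16) for a in TRACE_ADDR.findall(line))
    return addrs

if __name__ == "__main__":
    sample = "(XEN) Call Trace from ESP=fc503f9c: [<fc505930>]\n****\n"
    print([hex(a) for a in call_trace_addresses(sample)])
```

Continuation lines that an email client wrapped without the "(XEN)" prefix are still picked up, because the scan only stops at the asterisk banner.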
^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: Reproducible system crash
@ 2005-03-16 16:08 Ian Pratt
2005-03-16 16:22 ` Keir Fraser
2005-03-16 18:00 ` Peter Joanes
0 siblings, 2 replies; 15+ messages in thread
From: Ian Pratt @ 2005-03-16 16:08 UTC (permalink / raw)
To: Peter Joanes, xen-devel; +Cc: ian.pratt
> -----Original Message-----
> From: xen-devel-admin@lists.sourceforge.net
> [mailto:xen-devel-admin@lists.sourceforge.net] On Behalf Of
> Peter Joanes
> Sent: 16 March 2005 14:30
> To: xen-devel@lists.sourceforge.net
> Subject: Re: [Xen-devel] Reproducible system crash
>
> On Tuesday 15 March 2005 20:12, Ian Pratt wrote:
> > Do you know if wine uses vm86 mode?
> Google seems to think so.
>
> > Can you get a serial line on the machines?
> Yes. [Actually this took a little while because I didn't
> realise that Xen
> manages the serial port, so the dom0 8250 module can't be loaded.]
> Here's the end of the output (and I have the whole capture if
> that would
> help):
It looks like the domain 0 kernel is trying to do something illegal, and
Xen is killing it. It would be interesting to know whether this still
happens with the unstable.bk tree.
Please can you try this again, but this time building xen with debug=y
i.e.
cd xen; make clean; make debug=y; cd .. ; make dist
We might learn a little more about what's going wrong.
Thanks,
Ian
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
2005-03-16 16:08 Reproducible system crash Ian Pratt
@ 2005-03-16 16:22 ` Keir Fraser
2005-03-16 18:39 ` Peter Joanes
2005-03-16 19:35 ` Kip Macy
2005-03-16 18:00 ` Peter Joanes
1 sibling, 2 replies; 15+ messages in thread
From: Keir Fraser @ 2005-03-16 16:22 UTC (permalink / raw)
To: Ian Pratt; +Cc: Peter Joanes, ian.pratt, xen-devel
On 16 Mar 2005, at 16:08, Ian Pratt wrote:
> It looks like the domain 0 kernel is trying to do something illegal,
> and
> Xen is killing it. It would be interesting to know whether this still
> happens with the unstable.bk tree.
>
> Please can you try this again, but this time building xen with debug=y
> i.e.
> cd xen; make clean; make debug=y; cd .. ; make dist
>
> We might learn a little more about what's going wrong.
The stack backtrace also indicates that the code at address 0xc010990a
in XenLinux may also be interesting (that is the code point that caused
the crash).
'objdump -d vmlinux' and then grep the output for that address....
-- Keir
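
Editorial note: Keir picks 0xc010990a out of the raw stack dump by eye. A rough Python sketch of the same idea follows, not part of the original thread. It lists the stack words that fall in a range where the guest kernel might live. The bounds 0xc0100000 and 0xc0400000 are assumptions chosen to cover the addresses seen in this thread, not values taken from Xen or XenLinux, and a hit is only a candidate: it may equally be a data pointer.

```python
import re

# Any standalone 8-digit hex word, with or without surrounding brackets.
HEX_WORD = re.compile(r"\b[0-9a-fA-F]{8}\b")

def guest_kernel_candidates(stack_dump, lo=0xC0100000, hi=0xC0400000):
    """Return stack words in [lo, hi), in the order they appear.

    lo and hi are hypothetical bounds for the guest kernel image.
    """
    words = (int(w, 16) for w in HEX_WORD.findall(stack_dump))
    return [w for w in words if lo <= w < hi]

if __name__ == "__main__":
    dump = (
        "(XEN) fc52a478 fc52a4ce 0000008e 000e0003 [fc505930] fc5fd9e0 000e0003 77ef5ca4\n"
        "(XEN) 0000007b 00000000 ffffffff c010a960 77acdfa0 ffffffff 000e0003 c010990a\n"
    )
    print([hex(w) for w in guest_kernel_candidates(dump)])
```

Each candidate can then be looked up in the `objdump -d vmlinux` output as suggested above.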
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
@ 2005-03-16 18:00 ` Peter Joanes
2005-03-16 18:20 ` Keir Fraser
0 siblings, 1 reply; 15+ messages in thread
From: Peter Joanes @ 2005-03-16 18:00 UTC (permalink / raw)
To: xen-devel
On Wednesday 16 March 2005 16:08, Ian Pratt wrote:
> Please can you try this again, but this time building xen with debug=y
I have used 2.0-testing for this, with the same dom0 kernel as before -- I
haven't got round to patching my kernel source tree again yet so that it'll
work with a newer xen.
This was the output this time:
rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
open("/usr/lib/wine/apphelp.dll.so", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/wine/apphelp.dll.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_BLOCK, NULL, [RTMIN], 8) = 0
(XEN) DOM0: (file=memory.c, line=409) Bad L1 type settings 0100
(XEN) DOM0: (file=memory.c, line=1759) ptwr: Could not re-validate l1 page
(XEN)
(XEN) BUG at domain.c:143
(XEN) CPU: 0
(XEN) EIP: 0808:[<fc505e0e>]
(XEN) EFLAGS: 00011292
(XEN) eax: 00000000 ebx: fcff99e0 ecx: 00000000 edx: 00000000
(XEN) esi: 00000010 edi: ffb51000 ebp: 00000004 esp: fc503e84
(XEN) ds: 0810 es: 0810 fs: 0810 gs: 0810 ss: 0810
(XEN) Stack trace from ESP=fc503e84:
(XEN) fc5323f8 fc532497 0000008f 00000000 fcff0000 00000000 00000010 [fc521b76]
(XEN) ffb50000 00000000 000006df [fc52279c] 00000000 c0386f08 00000008 80000002
(XEN) fc57a180 00000004 fcff0004 00000001 00000ffc e0000000 fcff99e0 fd45b740
(XEN) fcff9a54 00000010 fcff9a54 fcff99e0 00000000 ffb50000 00000000 d29d9ffd
(XEN) fef4a764 00000001 c0109929 07e27061 00001096 00049aa2 07e27061 d29d8ffc
(XEN) 00000010 00000000 00007e28 [fc52394d] 00000001 fc503f6c 00000000 000000b5
(XEN) [fc518117] f1ce2900 0000002e 000002da 00000010 07e28061 05b40000 00000000
(XEN) 00000001 feffbb68 07e27063 fcff99e0 d29d8ffc fc503fb8 00000000 [fc52c568]
(XEN) d29d8ffc 00000000 c010a960 77acdfa0 [fc511972] fc551b40 00000000 fcff99e0
(XEN) ffffffff c010a960 77acdfa0 [fc530c3e] fc503fb8 77ef5ca4 0000007b 00000000
(XEN) ffffffff c010a960 77acdfa0 d29d9010 000e0003 c0109b38 00000061 00011296
(XEN) d29d9000 00000069 0000007b 0000007b 0000003b 00000033 fcff99e0
(XEN) Call Trace from ESP=fc503e84: [<fc521b76>] [<fc52279c>] [<fc52394d>] [<fc518117>] [<fc52c568>] [<fc511972>]
(XEN) [<fc530c3e>]
****************************************
CPU0 FATAL TRAP: vector = 6 (invalid operand)
[error_code=0000]
Aieee! CPU0 is toast...
****************************************
Reboot in five seconds...
- Peter.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
2005-03-16 18:00 ` Peter Joanes
@ 2005-03-16 18:20 ` Keir Fraser
2005-03-16 19:07 ` Peter Joanes
0 siblings, 1 reply; 15+ messages in thread
From: Keir Fraser @ 2005-03-16 18:20 UTC (permalink / raw)
To: Peter Joanes; +Cc: xen-devel
On 16 Mar 2005, at 18:00, Peter Joanes wrote:
> (XEN) DOM0: (file=memory.c, line=409) Bad L1 type settings 0100
> (XEN) DOM0: (file=memory.c, line=1759) ptwr: Could not re-validate l1
> page
> (XEN)
Someone is attempting to set _PAGE_GLOBAL bit in one of the page-table
entries.
XenLinux should never do this -- do you have any suspicious-looking
modules installed in your kernel that may need fixing? Alternatively we
could silently drop the bit rather than killing the guest. :-)
It would be interesting to know, if you kill the test that fails above
in the Xen code, whether your problems all go away. It's the test that
starts:
if ( unlikely(l1v & (_PAGE_GLOBAL|_PAGE_PAT)) )
-- Keir
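
Editorial note: the "0100" in "Bad L1 type settings 0100" lines up with bit 8 of a page-table entry, which is how Keir reads it as _PAGE_GLOBAL. The Python sketch below, not part of the original thread, decodes the low flag bits of a 32-bit non-PAE x86 PTE. It assumes the value in the Xen message is printed in hexadecimal, and the helper names are hypothetical.

```python
# Low flag bits of a 32-bit (non-PAE) x86 page-table entry,
# using the Linux-style names that appear in this thread.
PTE_FLAGS = {
    0x001: "_PAGE_PRESENT",
    0x002: "_PAGE_RW",
    0x004: "_PAGE_USER",
    0x008: "_PAGE_PWT",
    0x010: "_PAGE_PCD",
    0x020: "_PAGE_ACCESSED",
    0x040: "_PAGE_DIRTY",
    0x080: "_PAGE_PAT",
    0x100: "_PAGE_GLOBAL",
}

def decode_pte_flags(value):
    """Return the names of the flag bits set in value, lowest bit first."""
    return [name for bit, name in PTE_FLAGS.items() if value & bit]

def rejected_bits(value):
    """Mirror the quoted test: l1v & (_PAGE_GLOBAL | _PAGE_PAT)."""
    return value & (0x100 | 0x080)

if __name__ == "__main__":
    reported = int("0100", 16)  # value from the Xen message, assumed hex
    print(decode_pte_flags(reported))
```

Running the same decoder over a full PTE value shows at a glance which other bits accompany the offending one.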
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
@ 2005-03-16 18:39 ` Peter Joanes
2005-03-16 18:55 ` Keir Fraser
0 siblings, 1 reply; 15+ messages in thread
From: Peter Joanes @ 2005-03-16 18:39 UTC (permalink / raw)
To: xen-devel
On Wednesday 16 March 2005 16:22, Keir Fraser wrote:
> The stack backtrace also indicates that the code at address 0xc010990a
> in XenLinux may also be interesting (that is the code point that caused
> the crash).
> 'objdump -d vmlinux' and then grep the output for that address....
Here's what I think is the relevant section:
c0109900: 1e push %ds
c0109901: 50 push %eax
c0109902: 31 c0 xor %eax,%eax
c0109904: 55 push %ebp
c0109905: 57 push %edi
c0109906: 56 push %esi
c0109907: 52 push %edx
c0109908: 48 dec %eax
c0109909: 51 push %ecx
c010990a: 53 push %ebx
c010990b: fc cld
c010990c: 8c c1 mov %es,%ecx
c010990e: 8b 7c 24 20 mov 0x20(%esp),%edi
c0109912: 8b 54 24 24 mov 0x24(%esp),%edx
c0109916: 89 44 24 24 mov %eax,0x24(%esp)
c010991a: 89 4c 24 20 mov %ecx,0x20(%esp)
c010991e: b9 7b 00 00 00 mov $0x7b,%ecx
c0109923: 8e d9 mov %ecx,%ds
c0109925: 8e c1 mov %ecx,%es
c0109927: 89 e0 mov %esp,%eax
c0109929: 8b 35 9c cf 2f c0 mov 0xc02fcf9c,%esi
c010992f: 8a 5e 01 mov 0x1(%esi),%bl
c0109932: 88 5c 24 2e mov %bl,0x2e(%esp)
c0109936: ff d7 call *%edi
c0109938: e9 3b fd ff ff jmp 0xc0109678
- Peter.
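
Editorial note: instead of grepping for one exact address, the disassembly can be parsed so that any address inside an instruction resolves to that instruction. This is a small Python sketch, not part of the original thread. It assumes the usual `objdump -d` line layout (address, colon, hex bytes, mnemonic), and the function names are illustrative.

```python
import re

# One instruction line: address, colon, space-separated hex bytes, text.
INSN = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} )+)\s*(.*)$")

def parse_disassembly(text):
    """Return a list of (address, size_in_bytes, instruction_text)."""
    insns = []
    for line in text.splitlines():
        m = INSN.match(line)
        if m:
            addr = int(m.group(1), 16)
            size = len(m.group(2).split())
            insns.append((addr, size, m.group(3).strip()))
    return insns

def instruction_at(insns, target):
    """Return (start_address, text) of the instruction covering target."""
    for addr, size, text in insns:
        if addr <= target < addr + size:
            return addr, text
    return None

if __name__ == "__main__":
    listing = (
        "c0109909: 51 push %ecx\n"
        "c010990a: 53 push %ebx\n"
        "c010990b: fc cld\n"
    )
    print(instruction_at(parse_disassembly(listing), 0xC010990A))
```

The lookup also works for addresses that land in the middle of a multi-byte instruction, which a plain grep would miss.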
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
2005-03-16 18:39 ` Peter Joanes
@ 2005-03-16 18:55 ` Keir Fraser
2005-03-17 12:33 ` Peter Joanes
0 siblings, 1 reply; 15+ messages in thread
From: Keir Fraser @ 2005-03-16 18:55 UTC (permalink / raw)
To: Peter Joanes; +Cc: xen-devel
On 16 Mar 2005, at 18:39, Peter Joanes wrote:
> On Wednesday 16 March 2005 16:22, Keir Fraser wrote:
>> The stack backtrace also indicates that the code at address 0xc010990a
>> in XenLinux may also be interesting (that is the code point that
>> caused
>> the crash).
>> 'objdump -d vmlinux' and then grep the output for that address....
>
> Here's what I think is the relevant section:
Not so useful unfortunately. Here's a slightly harder thing to try:
1. Edit linux/include/asm-xen/asm-i386/pgtable-2level.h, and change
set_pte_batched() to use xen_l1_pgentry_update rather than
queue_l1_pgentry_update. Also, kill current definition of set_pte() and
redefine it to the same as the new set_pte_batched().
2. Edit xen/arch/x86/memory.c, at the test that fails for you ('Bad L1
type'). In the error path, before returning 0, add:
extern void show_guest_stack(void);
show_guest_stack();
This should cause us to get a backtrace of the guest kernel state when
the bad PTE is written. If you send a link to that along with your
vmlinux file then we can work out where the _PAGE_GLOBAL bit is coming
from.
-- Keir
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
@ 2005-03-16 19:07 ` Peter Joanes
0 siblings, 0 replies; 15+ messages in thread
From: Peter Joanes @ 2005-03-16 19:07 UTC (permalink / raw)
To: xen-devel
On Wednesday 16 March 2005 18:20, Keir Fraser wrote:
> Someone is attempting to set _PAGE_GLOBAL bit in one of the page-table
> entries.
> XenLinux should never do this -- do you have any suspicious-looking
> modules installed in your kernel that may need fixing?
Nothing that doesn't come with the vanilla sources; lsmod says:
Module                  Size  Used by
pppoatm                 4640  1
nfsd                   92328  9
exportfs                5024  1 nfsd
lockd                  61416  2 nfsd
sunrpc                132676  12 nfsd,lockd
ipt_MASQUERADE          2528  1
iptable_nat            23240  2 ipt_MASQUERADE
ip_conntrack           40340  2 ipt_MASQUERADE,iptable_nat
ipt_REJECT              5632  9
iptable_filter          2752  1
ip_tables              16640  4 ipt_MASQUERADE,iptable_nat,ipt_REJECT,iptable_filter
bridge                 45944  0
speedtch               10600  0
firmware_class          7552  1 speedtch
usb_atm                13360  2 speedtch
uhci_hcd               30736  0
usbcore               106616  4 speedtch,usb_atm,uhci_hcd
br2684                  7236  0
atm                    37528  5 pppoatm,usb_atm,br2684
ppp_generic            21396  5 pppoatm
slhc                    6304  1 ppp_generic
loop                   12968  0
> It would be interesting to know, if you kill the test that fails above
> in the Xen code, whether your problems all go away. It's the test that
> starts:
> if ( unlikely(l1v & (_PAGE_GLOBAL|_PAGE_PAT)) )
I commented out that part of memory.c, and it still crashes. This time the end
of the output is:
open("/usr/lib/wine/apphelp.dll.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_BLOCK, NULL, [RTMIN], 8) = 0
(XEN) (file=extable.c, line=71) Pre-exception: fc530a7e -> fc530b34
(XEN) (file=traps.c, line=463) Page fault: fc530b49 -> fc505d30
(XEN) BUG at domain.c:143
(XEN) CPU: 0
(XEN) EIP: 0808:[<fc505e0e>]
(XEN) EFLAGS: 00011296
(XEN) eax: 00000000 ebx: fcff99e0 ecx: 00000000 edx: 00000000
(XEN) esi: 000e0003 edi: c010a960 ebp: 77acdfa0 esp: fc503f9c
(XEN) ds: 0810 es: 0810 fs: 0810 gs: 0810 ss: 0810
(XEN) Stack trace from ESP=fc503f9c:
(XEN) fc5323c8 fc532467 0000008f 000e0003 [fc505d30] fcff99e0 000e0003 77ef5ca4
(XEN) 0000007b 00000000 ffffffff c010a960 77acdfa0 00000000 000e0003 c0109904
(XEN) 00000061 00011246 d2411000 00000069 0000007b 0000007b 0000003b 00000033
(XEN) fcff99e0
(XEN) Call Trace from ESP=fc503f9c: [<fc505d30>]
****************************************
CPU0 FATAL TRAP: vector = 6 (invalid operand)
[error_code=0000]
Aieee! CPU0 is toast...
****************************************
Reboot in five seconds...
- Peter.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
2005-03-16 16:22 ` Keir Fraser
2005-03-16 18:39 ` Peter Joanes
@ 2005-03-16 19:35 ` Kip Macy
1 sibling, 0 replies; 15+ messages in thread
From: Kip Macy @ 2005-03-16 19:35 UTC (permalink / raw)
To: xen-devel, Peter Joanes
Real men don't use debuggers ;-) but for those of us who do:
gdb vmlinux
(gdb) x/i 0xc010990a
will do the same thing. If you have symbols compiled in, you can also do:
(gdb) info line *0xc010990a
and get the exact line of C code.
-Kip
> The stack backtrace also indicates that the code at address 0xc010990a
> in XenLinux may also be interesting (that is the code point that caused
> the crash).
> 'objdump -d vmlinux' and then grep the output for that address....
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
@ 2005-03-17 12:33 ` Peter Joanes
2005-03-17 13:17 ` Keir Fraser
0 siblings, 1 reply; 15+ messages in thread
From: Peter Joanes @ 2005-03-17 12:33 UTC (permalink / raw)
To: xen-devel
On Wednesday 16 March 2005 18:55, Keir wrote:
> ... Here's a slightly harder thing to try:
>
> 1. Edit linux/include/asm-xen/asm-i386/pgtable-2level.h, and change
> set_pte_batched() to use xen_l1_pgentry_update rather than
> queue_l1_pgentry_update. Also, kill current definition of set_pte() and
> redefine it to the same as the new set_pte_batched().
s/pgentry/entry/ in the above? -- That's what I've used.
> 2. Edit xen/arch/x86/memory.c, at the test that fails for you ('Bad L1
> type'). In the error path, before returning 0, add:
> extern void show_guest_stack(void);
> show_guest_stack();
I left out the definition, it was somewhere else.
> This should cause us to get a backtrace of the guest kernel state when
> the bad PTE is written. If you send a link to that along with your
> vmlinux file then we can work out where the _PAGE_GLOBAL bit is coming
> from.
You can temporarily get the vmlinux from <http://joanes.net/vmlinux>. The
output contains just a couple of extra lines:
open("/usr/lib/wine/apphelp.dll.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_BLOCK, NULL, [RTMIN], 8) = 0
(XEN) (file=extable.c, line=71) Pre-exception: fc530abe -> fc530b74
(XEN) DOM0: (file=memory.c, line=409) Bad L1 type settings 0100
(XEN) Guest EIP is c0109904
(XEN)
(XEN) DOM0: (file=memory.c, line=1760) ptwr: Could not re-validate l1 page
(XEN)
(XEN) BUG at domain.c:143
(XEN) CPU: 0
(XEN) EIP: 0808:[<fc505e0e>]
(XEN) EFLAGS: 00011292
(XEN) eax: 00000000 ebx: fcff99e0 ecx: 00000000 edx: 00000000
(XEN) esi: 00000010 edi: ff951000 ebp: 00000010 esp: fc503e84
(XEN) ds: 0810 es: 0810 fs: 0810 gs: 0810 ss: 0810
(XEN) Stack trace from ESP=fc503e84:
(XEN) fc532408 fc5324a7 0000008f 00000000 fcff0000 00000000 00000010 [fc521b86]
(XEN) ff950000 00000000 000006e0 [fc5227ac] 00000000 c0386f08 00000008 80000001
(XEN) ff913448 00000010 fcff0010 00000004 00000ff0 e0000000 fcff99e0 00000010
(XEN) fcff9a54 00000010 00000008 fcff99e0 00000000 ff950000 00000000 ce42affd
(XEN) fef390a8 00000001 c0109929 0c3d6061 00001096 0030b033 0c3d6061 ce429ffc
(XEN) 00000010 00000000 0000c3d7 [fc52395d] 00000001 fc503f6c 00000000 0000005c
(XEN) [fc518117] 2d47dca4 0000003e 000002e0 00000010 0c3d7061 05c00000 00000000
(XEN) 00000001 feffbb80 0c3d6063 fcff99e0 ce429ffc fc503fb8 ce521ff9 [fc52c578]
(XEN) ce429ffc 00000000 c010a960 77acdfa0 [fc511972] 000e0003 0000a960 fcff99e0
(XEN) ffffffff c010a960 77acdfa0 [fc530c4e] fc503fb8 77ef5ca4 0000007b 00000000
(XEN) ffffffff c010a960 77acdfa0 00000000 000e0003 c0109904 00000061 00011246
(XEN) ce42a000 00000069 0000007b 0000007b 0000003b 00000033 fcff99e0
(XEN) Call Trace from ESP=fc503e84: [<fc521b86>] [<fc5227ac>] [<fc52395d>] [<fc518117>] [<fc52c578>] [<fc511972>]
(XEN) [<fc530c4e>]
****************************************
CPU0 FATAL TRAP: vector = 6 (invalid operand)
[error_code=0000]
Aieee! CPU0 is toast...
****************************************
Reboot in five seconds...
- Peter.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Reproducible system crash
2005-03-17 12:33 ` Peter Joanes
@ 2005-03-17 13:17 ` Keir Fraser
0 siblings, 0 replies; 15+ messages in thread
From: Keir Fraser @ 2005-03-17 13:17 UTC (permalink / raw)
To: Peter Joanes; +Cc: xen-devel
On 17 Mar 2005, at 12:33, Peter Joanes wrote:
> On Wednesday 16 March 2005 18:55, Keir wrote:
>> ... Here's a slightly harder thing to try:
>>
>> 1. Edit linux/include/asm-xen/asm-i386/pgtable-2level.h, and change
>> set_pte_batched() to use xen_l1_pgentry_update rather than
>> queue_l1_pgentry_update. Also, kill current definition of set_pte()
>> and
>> redefine it to the same as the new set_pte_batched().
> s/pgentry/entry/ in the above? -- That's what I've used
Yes.
>> 2. Edit xen/arch/x86/memory.c, at the test that fails for you ('Bad L1
>> type'). In the error path, before returning 0, add:
>> extern void show_guest_stack(void);
>> show_guest_stack();
> I left out the definition, it was somewhere else.
Okay.
The kernel is still using writable pagetables however (you can tell
because of the 'could not re-validate L1 page' error message). Perhaps
you ran the wrong kernel, or didn't redefine set_pte() correctly in the
right place?
-- Keir
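
Editorial note: Keir's check (the "ptwr" message means the writable-pagetable path is still in use) can be applied mechanically to a captured console log. This is a minimal Python sketch, not part of the original thread. Treating any "(XEN)" line containing "ptwr:" as the indicator is an assumption drawn from the messages quoted here, not from the Xen source.

```python
def ptwr_messages(console_log):
    """Return the Xen console lines that mention the ptwr path."""
    return [
        line
        for line in console_log.splitlines()
        if line.startswith("(XEN)") and "ptwr:" in line
    ]

def still_using_writable_pagetables(console_log):
    """True if at least one ptwr message appears in the log."""
    return bool(ptwr_messages(console_log))

if __name__ == "__main__":
    log = (
        "(XEN) DOM0: (file=memory.c, line=409) Bad L1 type settings 0100\n"
        "(XEN) DOM0: (file=memory.c, line=1760) ptwr: Could not re-validate l1 page\n"
    )
    print(still_using_writable_pagetables(log))
```

A log from a kernel that really has set_pte() redefined would be expected to show the "Bad L1 type" line without any accompanying ptwr line.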
^ permalink raw reply [flat|nested] 15+ messages in thread