* SIGILL in grub in guest on 4365
@ 2007-02-01 12:33 ` Gregory Haskins
From: Gregory Haskins @ 2007-02-01 12:33 UTC
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Hi All,
New to the list and project...hoping to make a meaningful contribution here someday :)
I am in the process of coming up to speed on the KVM project (very cool BTW). I found that it was extremely simple to get set up and running. While setting it up, I hit an issue running a SUSE Linux Enterprise Desktop 10 (x86_64) guest. Basically, if you try to run GRUB, the grub process dies immediately with an illegal-instruction signal (SIGILL). Stack traces (without symbols) indicate it was in the sync() call in libc. I have worked around this temporarily by installing LILO under rescue mode...but I figured what better way to learn the code than to try to debug and fix this issue.
My assumption is that an illegal opcode will cause either a VM exit or an illegal-opcode exception delivered to the host. This in turn would cause either the KVM_RUN ioctl to return (presumably with an EXCEPTION reason) or a signal to be delivered to QEMU. Problem is, I am fairly stumped at this point trying to prove this is true.
So my questions are:
1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?
2) If they do cause a host exception/exit, what is the path that would handle this?
I put breakpoints in QEMU in all the obvious places (e.g. looking for VM exits in kvm_run(), host-to-guest exception generation points, and/or signal handlers). I have also straced QEMU and it doesn't appear to be taking any signals other than SIGIO. My next step will be to start sprinkling printfs in the QEMU/KVM code and/or debugging/LTT'ing the kernel, but I figured I would ping the group for suggestions first. Any pointers out there?
Another possibility is that the guest is not generating a real illegal-op and the bug is that QEMU/KVM is accidentally injecting the exception condition (due to corruption, etc.), which would explain why I can't seem to find it being explicitly handled. It's too early to say right now, of course.
Thanks!
-Greg
PS: Other than the grub issue, I have been successfully hosting a 64-bit SLED guest on KVM for days now, so we are pretty close to being able to add it to your list of working guests.
* Re: SIGILL in grub in guest on 4365
@ 2007-02-01 12:46 ` Avi Kivity
From: Avi Kivity @ 2007-02-01 12:46 UTC
To: Gregory Haskins; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Gregory Haskins wrote:
> Hi All,
> New to the list and project...hoping to make a meaningful contribution here someday :)
>
> I am in the process of coming up to speed on the KVM project (very cool BTW). I found that it was extremely simple to get set up and running. While setting it up, I hit an issue running a SUSE Linux Enterprise Desktop 10 (x86_64) guest. Basically, if you try to run GRUB, the grub process dies immediately with an illegal-instruction signal (SIGILL). Stack traces (without symbols) indicate it was in the sync() call in libc. I have worked around this temporarily by installing LILO under rescue mode...but I figured what better way to learn the code than to try to debug and fix this issue.
>
> My assumption is that an illegal opcode will cause either a VM exit or an illegal-opcode exception delivered to the host. This in turn would cause either the KVM_RUN ioctl to return (presumably with an EXCEPTION reason) or a signal to be delivered to QEMU. Problem is, I am fairly stumped at this point trying to prove this is true.
>
Well, you can't prove it's true, since it's false :)
> So my questions are:
>
> 1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?
>
An illegal opcode in the guest is handled normally by generating #UD in
the guest, without host involvement at all.
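For reference, this is roughly what the guest-side path looks like from user space: the CPU raises #UD, the guest kernel turns it into SIGILL for the faulting process, and nothing outside the guest sees it. Below is a minimal sketch, assuming a Linux guest and GCC-style inline asm; the function name undefined_opcode_raises_sigill() is made up for illustration, and on non-x86 builds it only simulates the fault with raise().

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;

/* SIGILL handler: jump back to the point saved by sigsetjmp(). */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(recover_point, 1);
}

/*
 * Hypothetical helper: executes an undefined opcode and reports whether
 * the kernel delivered SIGILL. Returns 1 if SIGILL was caught, else 0.
 */
int undefined_opcode_raises_sigill(void)
{
    struct sigaction sa, old;
    volatile int caught = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    if (sigsetjmp(recover_point, 1) == 0) {
#if defined(__x86_64__) || defined(__i386__)
        /* ud2 is architecturally guaranteed to raise #UD. */
        __asm__ volatile("ud2");
#else
        /* Assumption: no known-undefined opcode here, so simulate it. */
        raise(SIGILL);
#endif
    } else {
        caught = 1; /* reached via siglongjmp() from the handler */
    }

    sigaction(SIGILL, &old, NULL); /* restore the previous disposition */
    return caught;
}
```

Running this inside the guest and under strace on the host QEMU process should show the signal only in the guest, which matches the strace observation quoted below.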
> 2) If they do cause a host exception/exit, what is the path that would handle this?
>
> I put breakpoints in QEMU in all the obvious places (e.g. looking for VM exits in kvm_run(), host-to-guest exception generation points, and/or signal handlers). I have also straced QEMU and it doesn't appear to be taking any signals other than SIGIO. My next step will be to start sprinkling printfs in the QEMU/KVM code and/or debugging/LTT'ing the kernel, but I figured I would ping the group for suggestions first. Any pointers out there?
>
> Another possibility is that the guest is not generating a real illegal-op and the bug is that QEMU/KVM is accidentally injecting the exception condition (due to corruption, etc.), which would explain why I can't seem to find it being explicitly handled. It's too early to say right now, of course.
>
My guess is that some horrible bug in the mmu is causing the guest to
jump to some random page and actually execute undefined opcodes.
[btw, running FC5's grub works as expected here]
> Thanks!
> -Greg
>
> PS: Other than the grub issue, I have been successfully hosting a 64-bit SLED guest on KVM for days now, so we are pretty close to being able to add it to your list of working guests.
>
>
Great! I'm looking forward to that, as well as to your contributions.
--
error compiling committee.c: too many arguments to function
* Re: SIGILL in grub in guest on 4365
@ 2007-02-01 12:58 ` Gregory Haskins
From: Gregory Haskins @ 2007-02-01 12:58 UTC
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
>>> On Thu, Feb 1, 2007 at 7:46 AM, in message <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>,
Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> Gregory Haskins wrote:
>
>> 1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?
>>
>
> An illegal opcode in the guest is handled normally by generating #UD in
> the guest, without host involvement at all.
>
Ah, thank you. Sounds like I need to do some more reading in the Intel docs ;) Despite my failings here, I did learn a lot about KVM/QEMU in the process so it wasn't wasted effort.
>
> My guess is that some horrible bug in the mmu is causing the guest to
> jump to some random page and actually execute undefined opcodes.
Yuk...that makes sense though. I can run sync() from other applications just fine so it's probably dependent on the locations resolved by the dynamic linker. I have some ideas for a tool that might be helpful here. More to follow.
Thanks for the info and quick reply, BTW.
-Greg
* Re: SIGILL in grub in guest on 4365
@ 2007-02-01 13:05 ` Avi Kivity
From: Avi Kivity @ 2007-02-01 13:05 UTC
To: Gregory Haskins; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
Gregory Haskins wrote:
>>>> On Thu, Feb 1, 2007 at 7:46 AM, in message <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>,
>>>>
> Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
>
>> Gregory Haskins wrote:
>>
>>
>>> 1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?
>>
>>>
>>>
>> An illegal opcode in the guest is handled normally by generating #UD in
>> the guest, without host involvement at all.
>>
>>
>
> Ah, thank you. Sounds like I need to do some more reading in the Intel docs ;) Despite my failings here, I did learn a lot about KVM/QEMU in the process so it wasn't wasted effort.
>
>
VT allows both methods. Basically you tell it for which exceptions you
want a vmexit.
In kvm, we only intercept page faults (for mmu virtualization), and, if
guest debugging is enabled, breakpoint exceptions. We let the guest
handle the rest.
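To make the "tell it for which exceptions you want a vmexit" part concrete: VT exposes a 32-bit exception bitmap, one bit per exception vector, and a set bit means that exception in the guest causes a vmexit. The following is only a rough sketch of the policy described above, not the actual kvm code. The helper names are made up for illustration, and using vector 3 (#BP) for "breakpoint exceptions" is an assumption; a real implementation might intercept #DB (vector 1) as well or instead.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural x86 exception vector numbers. */
enum {
    VECTOR_DB = 1,  /* #DB, debug exception */
    VECTOR_BP = 3,  /* #BP, breakpoint (int3) */
    VECTOR_UD = 6,  /* #UD, invalid opcode */
    VECTOR_PF = 14  /* #PF, page fault */
};

/* Bit in the 32-bit exception bitmap that corresponds to a vector. */
static uint32_t exception_bit(unsigned vector)
{
    assert(vector < 32);
    return UINT32_C(1) << vector;
}

/*
 * Hypothetical helper: always intercept page faults (needed for mmu
 * virtualization); intercept breakpoints only when guest debugging is
 * enabled. Everything else, including #UD, is left to the guest.
 */
uint32_t build_exception_bitmap(int guest_debug_enabled)
{
    uint32_t bitmap = exception_bit(VECTOR_PF);

    if (guest_debug_enabled)
        bitmap |= exception_bit(VECTOR_BP); /* assumption: #BP only */
    return bitmap;
}

/* Nonzero if an exception with this vector would cause a vmexit. */
int exception_causes_vmexit(uint32_t bitmap, unsigned vector)
{
    return (bitmap & exception_bit(vector)) != 0;
}
```

With this policy, exception_causes_vmexit(build_exception_bitmap(0), VECTOR_UD) is 0, which is why the illegal opcode never shows up in QEMU or in the KVM_RUN return path.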
--
error compiling committee.c: too many arguments to function