public inbox for kvm@vger.kernel.org
* SIGILL in grub in guest on 4365
From: Gregory Haskins @ 2007-02-01 12:33 UTC
  To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f


Hi All,
  New to the list and project...hoping to make a meaningful contribution here someday :)

I am in the process of coming up to speed on the KVM project (very cool BTW).  I found that it was extremely simple to get set up and running.  During the course of setting it up, I found an issue running a SUSE Linux Enterprise Desktop 10 (x86_64) guest.  Basically, if you try to run GRUB, the grub process dies immediately with an illegal instruction (SIGILL) signal.  Non-symbol stack traces indicate it was in the sync() call in libc.  I have worked around this temporarily by installing LILO under rescue mode...but I figured what better way to learn the code than to try to debug and fix this issue.

My assumption is that an illegal-opcode will cause either a vm-exit or an illegal-opcode exception down to the host.  This in turn would cause either the KVM_RUN ioctl to return (presumably with an EXCEPTION reason) or a signal to be delivered to QEMU.  Problem is, I am fairly stumped at this point trying to prove this is true.

So my questions are: 

1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?  

2) If they do cause a host exception/exit, what is that path that would handle this?

I put breakpoints in QEMU in all the obvious places (e.g. looking for VM-Exits in kvm_run(), host-to-guest exception generation points, and/or signal handlers).  I have also straced QEMU and it doesn't appear to be taking any signals other than SIGIO.  My next step will be to start sprinkling printfs in the QEMU/KVM code and/or debugging/LTT'ing the kernel, but I figured I would ping the group for suggestions first.  Any pointers out there?
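
For reference, a rough sketch of the strace check in Python. It counts signal-delivery lines in saved strace output, so a stray SIGILL or SIGSEGV delivered to QEMU would stand out next to the expected SIGIO. The function name count_signals and the sample lines in the usage note are made up for illustration, and the pattern assumes the usual "--- SIGNAME ..." delivery lines that strace prints.

```python
import re
from collections import Counter

# strace reports a delivered signal on its own line, for example:
#   --- SIGIO (I/O possible) @ 0 (0) ---
#   [pid  42] --- SIGILL {si_signo=SIGILL, ...} ---
# The optional prefix covers the per-process tags added when following children.
SIGNAL_LINE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?--- (SIG[A-Z0-9]+) ")


def count_signals(lines):
    """Count delivered signals by name in an iterable of strace output lines."""
    counts = Counter()
    for line in lines:
        match = SIGNAL_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_signals.py < qemu.strace
    for name, count in sorted(count_signals(sys.stdin).items()):
        print(name, count)
```

If only SIGIO shows up in the counts, the host process is not seeing the SIGILL, which points at the fault being raised and handled inside the guest.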

Another possibility is that the guest is not generating a real illegal-op and the bug is that QEMU/KVM is accidentally injecting the exception condition (due to corruption, etc.), which would explain why I can't seem to find it being explicitly handled.  It's too early to say right now, of course.

Thanks!
-Greg

PS: Other than the grub issue, I have been successfully hosting a 64-bit SLED guest on KVM for days now, so we are pretty close to being able to add it to your list of working guests.


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier.
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642


* Re: SIGILL in grub in guest on 4365
From: Avi Kivity @ 2007-02-01 12:46 UTC
  To: Gregory Haskins; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Gregory Haskins wrote:
> Hi All,
>   New to the list and project...hoping to make a meaningful contribution here someday :)
>
> I am in the process of coming up to speed on the KVM project (very cool BTW).  I found that it was extremely simple to get set up and running.  During the course of setting it up, I found an issue running a SUSE Linux Enterprise Desktop 10 (x86_64) guest.  Basically, if you try to run GRUB, the grub process dies immediately with an illegal instruction (SIGILL) signal.  Non-symbol stack traces indicate it was in the sync() call in libc.  I have worked around this temporarily by installing LILO under rescue mode...but I figured what better way to learn the code than to try to debug and fix this issue.
>
> My assumption is that an illegal-opcode will cause either a vm-exit or an illegal-opcode exception down to the host.  This in turn would cause either the KVM_RUN ioctl to return (presumably with an EXCEPTION reason) or a signal to be delivered to QEMU.  Problem is, I am fairly stumped at this point trying to prove this is true.
>   

Well, you can't prove it's true, since it's false :)


> So my questions are: 
>
> 1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?  
>   

An illegal opcode in the guest is handled normally by generating #UD in 
the guest, without host involvement at all.
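
As a small illustration of that guest-side path: the guest kernel's own #UD handler turns the fault into SIGILL for the offending process, which is why nothing shows up on the host. The table below is a sketch covering a few common vectors, and the names GUEST_SIGNAL_FOR_VECTOR and guest_signal are made up for the example. The vector numbers are architectural; the signals are what a Linux guest typically delivers for a user-mode fault.

```python
# x86 exception vector -> (mnemonic, signal a Linux guest kernel typically
# delivers when the fault is raised by a user-space process).
# Simplified: a page fault only becomes SIGSEGV when the kernel cannot
# resolve it (most page faults are handled silently).
GUEST_SIGNAL_FOR_VECTOR = {
    0: ("#DE", "SIGFPE"),    # divide error
    3: ("#BP", "SIGTRAP"),   # breakpoint (int3)
    6: ("#UD", "SIGILL"),    # invalid opcode, the case in this thread
    13: ("#GP", "SIGSEGV"),  # general protection fault
    14: ("#PF", "SIGSEGV"),  # unresolvable page fault
}


def guest_signal(vector):
    """Return (mnemonic, signal) for a vector, or None if not in the table."""
    return GUEST_SIGNAL_FOR_VECTOR.get(vector)


if __name__ == "__main__":
    print(guest_signal(6))
```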


> 2) If they do cause a host exception/exit, what is that path that would handle this?
>
> I put breakpoints in QEMU in all the obvious places (e.g. looking for VM-Exits in kvm_run(), host-to-guest exception generation points, and/or signal handlers).  I have also straced QEMU and it doesn't appear to be taking any signals other than SIGIO.  My next step will be to start sprinkling printfs in the QEMU/KVM code and/or debugging/LTT'ing the kernel, but I figured I would ping the group for suggestions first.  Any pointers out there?
>
> Another possibility is that the guest is not generating a real illegal-op and the bug is that QEMU/KVM is accidentally injecting the exception condition (due to corruption, etc.), which would explain why I can't seem to find it being explicitly handled.  It's too early to say right now, of course.
>   

My guess is that some horrible bug in the mmu is causing the guest to 
jump to some random page and actually execute undefined opcodes. 

[btw, running FC5's grub works as expected here]

> Thanks!
> -Greg
>
> PS: Other than the grub issue, I have been successfully hosting a 64-bit SLED guest on KVM for days now, so we are pretty close to being able to add it to your list of working guests.
>
>   

Great!  I'm looking forward to that, as well as to your contributions.


-- 
error compiling committee.c: too many arguments to function



* Re: SIGILL in grub in guest on 4365
From: Gregory Haskins @ 2007-02-01 12:58 UTC
  To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

>>> On Thu, Feb 1, 2007 at  7:46 AM, in message <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>,
Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote: 
> Gregory Haskins wrote:
>
>> 1) Is this how illegal-op would be handled, or would that stay entirely in
>> the domain of the guest?
>>
> 
> An illegal opcode in the guest is handled normally by generating #UD in 
> the guest, without host involvement at all.
> 

Ah, thank you.  Sounds like I need to do some more reading in the Intel docs ;)  Despite my failings here, I did learn a lot about KVM/QEMU in the process, so it wasn't wasted effort.

> 
> My guess is that some horrible bug in the mmu is causing the guest to 
> jump to some random page and actually execute undefined opcodes. 

Yuk...that makes sense though.  I can run sync() from other applications just fine, so it's probably dependent on the locations resolved by the dynamic linker.  I have some ideas for a tool that might be helpful here.  More to follow.


Thanks for the info and quick reply, BTW.
-Greg



* Re: SIGILL in grub in guest on 4365
From: Avi Kivity @ 2007-02-01 13:05 UTC
  To: Gregory Haskins; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Gregory Haskins wrote:
>>>> On Thu, Feb 1, 2007 at  7:46 AM, in message <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>,
> Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
>> Gregory Haskins wrote:
>>
>>> 1) Is this how illegal-op would be handled, or would that stay entirely in
>>> the domain of the guest?
>>
>> An illegal opcode in the guest is handled normally by generating #UD in
>> the guest, without host involvement at all.
>
> Ah, thank you.  Sounds like I need to do some more reading in the Intel docs ;)  Despite my failings here, I did learn a lot about KVM/QEMU in the process, so it wasn't wasted effort.
>

VT allows both methods.  Basically you tell it for which exceptions you 
want a vmexit.

In kvm, we only intercept page faults (for mmu virtualization), and, if 
guest debugging is enabled, breakpoint exceptions.  We let the guest 
handle the rest.
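
To make the bitmap idea concrete, here is a minimal model in Python. It is only a sketch of the mechanism, not the kvm code: the function names are made up, and treating "breakpoint exceptions" as both #DB (vector 1) and #BP (vector 3) is an assumption. The vector numbers themselves are architectural. Bit N set in the bitmap means exception vector N causes a vmexit; a clear bit means the exception is delivered straight to the guest.

```python
# Architectural x86 exception vector numbers.
DB_VECTOR = 1   # debug exception
BP_VECTOR = 3   # breakpoint (int3)
UD_VECTOR = 6   # invalid opcode
PF_VECTOR = 14  # page fault


def exception_bitmap(guest_debug=False):
    """Build a 32-bit bitmap where bit N set means vector N causes a vmexit."""
    # Page faults are always intercepted, for mmu virtualization.
    bitmap = 1 << PF_VECTOR
    if guest_debug:
        # Assumption: "breakpoint exceptions" covers both #DB and #BP.
        bitmap |= (1 << DB_VECTOR) | (1 << BP_VECTOR)
    return bitmap


def causes_vmexit(bitmap, vector):
    """True if the given exception vector is intercepted by the host."""
    # Simplified: real hardware also applies error-code mask/match
    # filtering to page faults, which this model ignores.
    return bool(bitmap & (1 << vector))


if __name__ == "__main__":
    bitmap = exception_bitmap()
    print(hex(bitmap), causes_vmexit(bitmap, UD_VECTOR))
```

With the default bitmap, causes_vmexit(bitmap, UD_VECTOR) is False, which matches the behaviour described here: #UD never leaves the guest.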

-- 
error compiling committee.c: too many arguments to function

