* SIGILL in grub in guest on 4365
       [not found] ` <45C197CA0200005A0001EE9A@mcclure.wal.novell.com>
@ 2007-02-01 12:33   ` Gregory Haskins
       [not found]     ` <45C197E6.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Gregory Haskins @ 2007-02-01 12:33 UTC
To: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Hi All,

New to the list and project...hoping to make a meaningful contribution here someday :)

I am in the process of coming up to speed on the KVM project (very cool BTW). I found that it was extremely simple to get set up and running. During the course of setting it up, I found an issue running a SUSE Linux Enterprise Desktop 10 (x86_64) guest. Basically, if you try to run GRUB, the grub process dies immediately as it takes an illegal instruction (SIGILL) signal. Non-symbol stack traces indicate it was in the sync() call in libc. I have worked around this temporarily by installing LILO under rescue mode...but I figured what better way to learn the code than to try to debug and fix this issue.

My assumption is that an illegal opcode will cause either a vm-exit or an illegal-opcode exception down to the host. This in turn would cause either the KVM_RUN ioctl to return (presumably with an EXCEPTION reason) or a signal to be delivered to QEMU. Problem is, I am fairly stumped at this point trying to prove this is true.

So my questions are:

1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?

2) If they do cause a host exception/exit, what is the path that would handle this?

I put breakpoints in QEMU in all the obvious places (e.g. looking for VM-Exits in kvm_run(), host-to-guest exception generation points, and/or signal handlers). I have also straced QEMU and it doesn't appear to be taking any signals other than SIGIO. My next step will be to start sprinkling printfs in the QEMU/KVM code and/or debugging/LTT'ing the kernel, but I figured I would ping the group for suggestions first. Any pointers out there?

Another possibility is that the guest is not generating a real illegal-op and the bug is that QEMU/KVM is accidentally injecting the exception condition (due to corruption, etc.), and that explains why I can't seem to find it being explicitly handled. It's too early to say right now, of course.

Thanks!
-Greg

PS: Other than the grub issue, I have been successfully hosting a 64-bit SLED guest on KVM for days now, so we are pretty close to being able to add it to your list of working guests.

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier.
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
[parent not found: <45C197E6.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>]
* Re: SIGILL in grub in guest on 4365
       [not found]     ` <45C197E6.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
@ 2007-02-01 12:46       ` Avi Kivity
       [not found]         ` <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Avi Kivity @ 2007-02-01 12:46 UTC
To: Gregory Haskins; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Gregory Haskins wrote:
> Hi All,
> New to the list and project...hoping to make a meaningful contribution here someday :)
>
> I am in the process of coming up to speed on the KVM project (very cool BTW). I found that it was extremely simple to get set up and running. During the course of setting it up, I found an issue running a SUSE Linux Enterprise Desktop 10 (x86_64) guest. Basically, if you try to run GRUB, the grub process dies immediately as it takes an illegal instruction (SIGILL) signal. Non-symbol stack traces indicate it was in the sync() call in libc. I have worked around this temporarily by installing LILO under rescue mode...but I figured what better way to learn the code than to try to debug and fix this issue.
>
> My assumption is that an illegal opcode will cause either a vm-exit or an illegal-opcode exception down to the host. This in turn would cause either the KVM_RUN ioctl to return (presumably with an EXCEPTION reason) or a signal to be delivered to QEMU. Problem is, I am fairly stumped at this point trying to prove this is true.
>

Well, you can't prove it's true, since it's false :)

> So my questions are:
>
> 1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?
>

An illegal opcode in the guest is handled normally by generating #UD in the guest, without host involvement at all.

> 2) If they do cause a host exception/exit, what is the path that would handle this?
>
> I put breakpoints in QEMU in all the obvious places (e.g. looking for VM-Exits in kvm_run(), host-to-guest exception generation points, and/or signal handlers). I have also straced QEMU and it doesn't appear to be taking any signals other than SIGIO. My next step will be to start sprinkling printfs in the QEMU/KVM code and/or debugging/LTT'ing the kernel, but I figured I would ping the group for suggestions first. Any pointers out there?
>
> Another possibility is that the guest is not generating a real illegal-op and the bug is that QEMU/KVM is accidentally injecting the exception condition (due to corruption, etc.), and that explains why I can't seem to find it being explicitly handled. It's too early to say right now, of course.
>

My guess is that some horrible bug in the mmu is causing the guest to jump to some random page and actually execute undefined opcodes.

[btw, running FC5's grub works as expected here]

> Thanks!
> -Greg
>
> PS: Other than the grub issue, I have been successfully hosting a 64-bit SLED guest on KVM for days now, so we are pretty close to being able to add it to your list of working guests.
>

Great! I'm looking forward to that, as well as to your contributions.

-- 
error compiling committee.c: too many arguments to function
[parent not found: <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>]
* Re: SIGILL in grub in guest on 4365
       [not found]         ` <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
@ 2007-02-01 12:58           ` Gregory Haskins
       [not found]             ` <45C19DBA.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Gregory Haskins @ 2007-02-01 12:58 UTC
To: Avi Kivity; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

>>> On Thu, Feb 1, 2007 at 7:46 AM, in message <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>, Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
> Gregory Haskins wrote:
>
>> 1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?
>>
>
> An illegal opcode in the guest is handled normally by generating #UD in the guest, without host involvement at all.
>

Ah, thank you. Sounds like I need to do some more reading in the Intel docs ;) Despite my failings here, I did learn a lot about KVM/QEMU in the process, so it wasn't wasted effort.

>
> My guess is that some horrible bug in the mmu is causing the guest to jump to some random page and actually execute undefined opcodes.

Yuk...that makes sense though. I can run sync() from other applications just fine, so it's probably dependent on the locations resolved by the dynamic linker. I have some ideas for a tool that might be helpful here. More to follow.

Thanks for the info and quick reply, BTW.

-Greg
[parent not found: <45C19DBA.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>]
* Re: SIGILL in grub in guest on 4365
       [not found]             ` <45C19DBA.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
@ 2007-02-01 13:05               ` Avi Kivity
  0 siblings, 0 replies; 4+ messages in thread
From: Avi Kivity @ 2007-02-01 13:05 UTC
To: Gregory Haskins; +Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

Gregory Haskins wrote:
>>>> On Thu, Feb 1, 2007 at 7:46 AM, in message <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>, Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org> wrote:
>> Gregory Haskins wrote:
>>
>>> 1) Is this how illegal-op would be handled, or would that stay entirely in the domain of the guest?
>>>
>>
>> An illegal opcode in the guest is handled normally by generating #UD in the guest, without host involvement at all.
>>
>
> Ah, thank you. Sounds like I need to do some more reading in the Intel docs ;) Despite my failings here, I did learn a lot about KVM/QEMU in the process so it wasnt wasted effort.
>

VT allows both methods. Basically you tell it for which exceptions you want a vmexit. In kvm, we only intercept page faults (for mmu virtualization), and, if guest debugging is enabled, breakpoint exceptions. We let the guest handle the rest.

-- 
error compiling committee.c: too many arguments to function
end of thread, other threads:[~2007-02-01 13:05 UTC | newest]
Thread overview: 4+ messages
[not found] <45C11A7E0200005A0001EE4E@mcclure.wal.novell.com>
[not found] ` <45C197CA0200005A0001EE9A@mcclure.wal.novell.com>
2007-02-01 12:33 ` SIGILL in grub in guest on 4365 Gregory Haskins
[not found] ` <45C197E6.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
2007-02-01 12:46 ` Avi Kivity
[not found] ` <45C1E116.1070802-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
2007-02-01 12:58 ` Gregory Haskins
[not found] ` <45C19DBA.BA47.005A.0-Et1tbQHTxzrQT0dZR+AlfA@public.gmane.org>
2007-02-01 13:05 ` Avi Kivity