From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: 32-on-64 sysenter for pvops
Date: Tue, 04 Mar 2008 07:38:05 -0800
Message-ID: <47CD6CDD.5010409@goop.org>
References: <47CCA07A.50902@goop.org> <47CD1881.76E4.0078.0@novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <47CD1881.76E4.0078.0@novell.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Beulich
Cc: Keir Fraser, Xen-devel, Ian Campbell
List-Id: xen-devel@lists.xenproject.org

Jan Beulich wrote:
>> Anyway, a couple of questions. It seems that the stack frame that Xen's
>> sysenter generates is not exactly the same as the one the kernel
>> expects, so the direct access to the threadinfo structure doesn't work
>> properly. What's the difference in the frames?
>>
>
> The frame is a normal interrupt frame (but not completely/properly filled
> in - the implication of course is that the stack has been switched, other
> than native sysenter would do), which is why the code in our kernels just
> is a special preamble to system_call:
>

Yes, I copied that code more or less unchanged.

> ...
> ENDPROC(ia32_sysenter_target)
>
> # pv sysenter call handler stub
> ENTRY(ia32pv_sysenter_target)
> 	RING0_INT_FRAME
> 	movl $__USER_DS,16(%esp)
> 	movl %ebp,12(%esp)
> 	movl $__USER_CS,4(%esp)
> 	addl $4,%esp
>
>> I guess the other reason for the separate PV Xen sysenter entrypoint is
>> to deal with sysexit not working. I addressed this by implementing a
>> sysexit pvop using iret, though I think I could just set the TIF_IRET
>> flag in threadinfo.
>>
>
> Either should work, but as pointed out above letting it just fall through
> to system_call seems even easier.
>

It means you need to duplicate more code.
My variant just has the Xen-specific stack setup on entry, but then it can
just fall back to the normal path.

>> The sysenter path tries to enable interrupts immediately. Unfortunately
>> this doesn't work in a paravirt environment, because not enough kernel
>> state has been set up at that point (namely, pointing %fs to the kernel
>> percpu data segment). To fix this, defer ENABLE_INTERRUPTS until after
>> the kernel state has been set up.
>>
>
> seems bogus: The sysenter handler in our kernels gets called with
> interrupts enabled, which is as safe as int $80 going through a trap gate
> (i.e. the rest of the kernel needs to be prepared to deal with interrupts
> being enabled here anyway).

It's a principled fix. It's true that there's only a visible problem
when making the Xen sysenter address point to the normal sysenter target
- which doesn't work because of the different calling convention. But
if it did work (ie, Xen - or another hypervisor - produced the same
frame as the normal sysenter instruction), then ENABLE_INTERRUPTS would
fail because it's being called before the kernel's percpu segment has
been set up.

So given that ENABLE_INTERRUPTS needs to happen later, I set up
xen_sysenter_target to enter with events masked, so that it's as similar
to the hardware instruction as possible, and interrupts enabled are in
the same place in both cases.

    J