* changing definition of paravirt_ops.iret
@ 2007-05-21 16:27 Jeremy Fitzhardinge
2007-05-21 16:46 ` Chris Wright
2007-05-21 18:37 ` Zachary Amsden
0 siblings, 2 replies; 5+ messages in thread
From: Jeremy Fitzhardinge @ 2007-05-21 16:27 UTC (permalink / raw)
To: Zachary Amsden, Rusty Russell, Chris Wright; +Cc: Virtualization Mailing List
I'm implementing a more efficient version of the Xen iret paravirt_op,
so that it can use the real iret instruction where possible. I really
need to get access to per-cpu variables, so I can set the event mask
state in the vcpu_info structure, but unfortunately at the point where
INTERRUPT_RETURN is used in entry.S, the usermode %fs has already been
restored.
How would you feel if we changed paravirt_ops.iret to make it also
responsible for restoring %fs?
In other words, change RESTORE_REGS to skip %fs, and then native_iret
would look like:
1: popl %fs
iret
with the normal exception stuff. Fortunately, %fs is already the first
thing to be saved and last to be restored, so there's no major
rearrangements.
Ideally I'd also like a register to play with, but that would require
rearranging pt_regs, which is a bit tricky.
Also, INTERRUPT_RETURN already has a bug, in which it needs to deal with
its own iret exceptions.
J
diff -r e13ec2ed67aa arch/i386/kernel/entry.S
--- a/arch/i386/kernel/entry.S Mon May 21 16:56:20 2007 +0100
+++ b/arch/i386/kernel/entry.S Mon May 21 17:21:37 2007 +0100
@@ -59,6 +59,9 @@
* INTERRUPT_RETURN (aka. "iret")
* GET_CR0_INTO_EAX (aka. "movl %cr0, %eax")
* ENABLE_INTERRUPTS_SYSEXIT (aka "sti; sysexit").
+ *
+ * INTERRUPT_RETURN is responsible for restoring the usermode %fs
+ * from the stack.
*
* For DISABLE_INTERRUPTS/ENABLE_INTERRUPTS (aka "cli"/"sti"), you must
* specify what registers can be overwritten (CLBR_NONE, CLBR_EAX/EDX/ECX/ANY).
@@ -166,21 +169,15 @@ 2: popl %es; \
2: popl %es; \
CFI_ADJUST_CFA_OFFSET -4;\
/*CFI_RESTORE es;*/\
-3: popl %fs; \
- CFI_ADJUST_CFA_OFFSET -4;\
- /*CFI_RESTORE fs;*/\
.pushsection .fixup,"ax"; \
4: movl $0,(%esp); \
jmp 1b; \
5: movl $0,(%esp); \
jmp 2b; \
-6: movl $0,(%esp); \
- jmp 3b; \
.section __ex_table,"a";\
.align 4; \
.long 1b,4b; \
.long 2b,5b; \
- .long 3b,6b; \
.popsection
#define RING0_INT_FRAME \
@@ -406,19 +403,14 @@ restore_nocheck_notrace:
RESTORE_REGS
addl $4, %esp # skip orig_eax/error_code
CFI_ADJUST_CFA_OFFSET -4
-1: INTERRUPT_RETURN
-.section .fixup,"ax"
+ INTERRUPT_RETURN
+
iret_exc:
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
pushl $0 # no error code
pushl $do_iret_error
jmp error_code
-.previous
-.section __ex_table,"a"
- .align 4
- .long 1b,iret_exc
-.previous
CFI_RESTORE_STATE
ldt_ss:
@@ -863,21 +855,22 @@ nmi_espfix_stack:
RESTORE_REGS
lss 12+4(%esp), %esp # back to espfix stack
CFI_ADJUST_CFA_OFFSET -24
-1: INTERRUPT_RETURN
- CFI_ENDPROC
+ INTERRUPT_RETURN
+ CFI_ENDPROC
+KPROBE_END(nmi)
+
+#ifdef CONFIG_PARAVIRT
+ENTRY(native_iret)
+1: popl %fs
+2: iret
+.pushsection .fixup,"ax"
+3: movl $0,(%esp)
+ jmp 1b
.section __ex_table,"a"
.align 4
- .long 1b,iret_exc
-.previous
-KPROBE_END(nmi)
-
-#ifdef CONFIG_PARAVIRT
-ENTRY(native_iret)
-1: iret
-.section __ex_table,"a"
- .align 4
- .long 1b,iret_exc
-.previous
+ .long 1b,3b
+ .long 2b,iret_exc
+.popsection
END(native_iret)
ENTRY(native_irq_enable_sysexit)
diff -r e13ec2ed67aa arch/i386/xen/enlighten.c
--- a/arch/i386/xen/enlighten.c Mon May 21 16:56:20 2007 +0100
+++ b/arch/i386/xen/enlighten.c Mon May 21 17:21:37 2007 +0100
@@ -851,7 +851,6 @@ static unsigned xen_patch(u8 type, u16 c
SITE(irq_disable);
SITE(save_fl);
SITE(restore_fl);
- SITE(iret);
#undef SITE
patch_site:
@@ -935,7 +934,7 @@ static const struct paravirt_ops xen_par
.read_tsc = native_read_tsc,
.read_pmc = native_read_pmc,
- .iret = (void *)&hypercall_page[__HYPERVISOR_iret],
+ .iret = xen_iret,
.irq_enable_sysexit = NULL, /* never called */
.load_tr_desc = paravirt_nop,
diff -r e13ec2ed67aa include/asm-i386/irqflags.h
--- a/include/asm-i386/irqflags.h Mon May 21 16:56:20 2007 +0100
+++ b/include/asm-i386/irqflags.h Mon May 21 17:21:37 2007 +0100
@@ -106,7 +106,18 @@ static inline unsigned long __raw_local_
#define DISABLE_INTERRUPTS(clobbers) cli
#define ENABLE_INTERRUPTS(clobbers) sti
#define ENABLE_INTERRUPTS_SYSEXIT sti; sysexit
-#define INTERRUPT_RETURN iret
+#define INTERRUPT_RETURN \
+1: popl %fs; \
+2: iret; \
+.pushsection .fixup,"ax"; \
+3: movl $0,(%esp); \
+ jmp 1b; \
+.section __ex_table,"a"; \
+ .align 4; \
+ .long 1b,3b; \
+ .long 2b,iret_exc; \
+.popsection
+
#define GET_CR0_INTO_EAX movl %cr0, %eax
#endif /* __ASSEMBLY__ */
#endif /* CONFIG_PARAVIRT */
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: changing definition of paravirt_ops.iret
2007-05-21 16:27 changing definition of paravirt_ops.iret Jeremy Fitzhardinge
@ 2007-05-21 16:46 ` Chris Wright
2007-05-21 16:53 ` Jeremy Fitzhardinge
2007-05-21 18:37 ` Zachary Amsden
1 sibling, 1 reply; 5+ messages in thread
From: Chris Wright @ 2007-05-21 16:46 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Chris Wright, Virtualization Mailing List
* Jeremy Fitzhardinge (jeremy@goop.org) wrote:
> I'm implementing a more efficient version of the Xen iret paravirt_op,
> so that it can use the real iret instruction where possible. I really
> need to get access to per-cpu variables, so I can set the event mask
> state in the vcpu_info structure, but unfortunately at the point where
> INTERRUPT_RETURN is used in entry.S, the usermode %fs has already been
> restored.
>
> How would you feel if we changed paravirt_ops.iret to make it also
> responsible for restoring %fs?
This is definitely ad-hoc semantic change, but I don't see a beter
way to do it (other than have iret be restore_regs_and_iret, which
isn't really an improvement).
thanks,
-chris
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: changing definition of paravirt_ops.iret
2007-05-21 16:46 ` Chris Wright
@ 2007-05-21 16:53 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 5+ messages in thread
From: Jeremy Fitzhardinge @ 2007-05-21 16:53 UTC (permalink / raw)
To: Chris Wright; +Cc: Virtualization Mailing List
Chris Wright wrote:
> This is definitely ad-hoc semantic change, but I don't see a beter
> way to do it (other than have iret be restore_regs_and_iret, which
> isn't really an improvement).
I just realized its a bit nastier than that. There's also the issue of
removing the error_code from the stack, which is above %fs. So it
becomes "restore %fs, remove error_code and iret".
Also, the INTERRUPT_RETURN in nmi_espfix_stack needs us to actually push
%fs and an error code, just so that it can be removed (but I have no
idea whether this is actually correct; it's very much Zach territory).
J
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: changing definition of paravirt_ops.iret
2007-05-21 16:27 changing definition of paravirt_ops.iret Jeremy Fitzhardinge
2007-05-21 16:46 ` Chris Wright
@ 2007-05-21 18:37 ` Zachary Amsden
2007-05-22 0:08 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 5+ messages in thread
From: Zachary Amsden @ 2007-05-21 18:37 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Chris Wright, Virtualization Mailing List, Eric W. Biederman
Jeremy Fitzhardinge wrote:
> I'm implementing a more efficient version of the Xen iret paravirt_op,
> so that it can use the real iret instruction where possible. I really
> need to get access to per-cpu variables, so I can set the event mask
> state in the vcpu_info structure, but unfortunately at the point where
> INTERRUPT_RETURN is used in entry.S, the usermode %fs has already been
> restored.
>
> How would you feel if we changed paravirt_ops.iret to make it also
> responsible for restoring %fs?
>
> In other words, change RESTORE_REGS to skip %fs, and then native_iret
> would look like:
>
> 1: popl %fs
> iret
>
> with the normal exception stuff. Fortunately, %fs is already the first
> thing to be saved and last to be restored, so there's no major
> rearrangements.
>
> Ideally I'd also like a register to play with, but that would require
> rearranging pt_regs, which is a bit tricky.
>
Strongly nacked. If you need %fs and a free register, just push / pop
them yourself. We use push / pop to get %eax free for IRET in the VMI ROM.
A change in IRET call convention is performance critical, and has severe
negative effects on VMI, since we then need to wrap the IRET entry point
in the ROM with more code in Linux - so iret is a jump to fs restore,
call to ROM gobbledygook that misaligns the call/return stack and rots
performance in some hitherto unkown case - perhaps on native.
There is absolutely no argument that re-arranging entry.S like this is
nightmarish and makes the code near unmaintainable without carefully
considering the effects on paravirt-ops hypervisors.
If you really absolutely must avoid reloading %fs for your own
performance, I can suggest several alternative schemes that, while more
complex, might also make native exception handling faster, thus
justifying the hack-to-bitty-pieces of entry.S
Zach
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: changing definition of paravirt_ops.iret
2007-05-21 18:37 ` Zachary Amsden
@ 2007-05-22 0:08 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 5+ messages in thread
From: Jeremy Fitzhardinge @ 2007-05-22 0:08 UTC (permalink / raw)
To: Zachary Amsden
Cc: Chris Wright, Virtualization Mailing List, Eric W. Biederman
Zachary Amsden wrote:
> Strongly nacked. If you need %fs and a free register, just push / pop
> them yourself. We use push / pop to get %eax free for IRET in the VMI
> ROM.
Yeah, it doesn't interact nicely with either the error_code or the nmi
stack switch thing.
J
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2007-05-22 0:08 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-21 16:27 changing definition of paravirt_ops.iret Jeremy Fitzhardinge
2007-05-21 16:46 ` Chris Wright
2007-05-21 16:53 ` Jeremy Fitzhardinge
2007-05-21 18:37 ` Zachary Amsden
2007-05-22 0:08 ` Jeremy Fitzhardinge
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).