* [PATCH 0/3] Split up pv-ops
From: Alexander Graf @ 2009-11-18 0:13 UTC
To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin
Paravirt ops is currently all-or-nothing: it either replaces a lot of Linux
internal code or none at all. There are users, however, that don't need
everything pv-ops delivers.
On KVM, for example, we're perfectly fine not using the PV MMU and thus not
touching any MMU code. That way we don't have to make pv-ops itself faster;
we simply don't compile the MMU parts in!
This patchset splits pv-ops into several smaller config options, one per
feature category, and then converts the KVM pv-ops code to use only the
bits it requires, lowering overhead.
Alexander Graf (3):
Split paravirt ops by functionality
Only export selected pv-ops feature structs
Split the KVM pv-ops support by feature
arch/x86/Kconfig | 72 +++++++++++++++++++++++++-
arch/x86/include/asm/apic.h | 2 +-
arch/x86/include/asm/desc.h | 4 +-
arch/x86/include/asm/fixmap.h | 2 +-
arch/x86/include/asm/highmem.h | 2 +-
arch/x86/include/asm/io_32.h | 4 +-
arch/x86/include/asm/io_64.h | 2 +-
arch/x86/include/asm/irqflags.h | 21 ++++++--
arch/x86/include/asm/mmu_context.h | 4 +-
arch/x86/include/asm/msr.h | 4 +-
arch/x86/include/asm/paravirt.h | 44 ++++++++++++++++-
arch/x86/include/asm/paravirt_types.h | 12 +++++
arch/x86/include/asm/pgalloc.h | 2 +-
arch/x86/include/asm/pgtable-3level_types.h | 2 +-
arch/x86/include/asm/pgtable.h | 2 +-
arch/x86/include/asm/processor.h | 2 +-
arch/x86/include/asm/required-features.h | 2 +-
arch/x86/include/asm/smp.h | 2 +-
arch/x86/include/asm/system.h | 13 +++--
arch/x86/include/asm/tlbflush.h | 4 +-
arch/x86/kernel/head_64.S | 2 +-
arch/x86/kernel/kvm.c | 22 ++++++---
arch/x86/kernel/paravirt.c | 37 +++++++++++--
arch/x86/kernel/tsc.c | 2 +-
arch/x86/kernel/vsmp_64.c | 2 +-
arch/x86/xen/Kconfig | 2 +-
26 files changed, 219 insertions(+), 50 deletions(-)
* [PATCH 1/3] Split paravirt ops by functionality
From: Alexander Graf @ 2009-11-18 0:13 UTC
To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin

Currently when using paravirt ops it's an all-or-nothing option. We can either
use pv-ops for CPU, MMU, timing, etc. or not at all.

Now there are some use cases where we don't need the full feature set, but only
a small chunk of it. KVM is a pretty prominent example for this.

So let's make everything a bit more fine-grained. We already have a splitting
by function groups, namely "cpu", "mmu", "time", "irq", "apic" and "spinlock".

Taking that existing splitting and extending it to only compile in the PV
capable bits sounded like a natural fit. That way we don't get performance hits
in MMU code from using the KVM PV clock, which only needs the TIME parts of
pv-ops.

We define a new CONFIG_PARAVIRT_ALL option that basically does the same thing
CONFIG_PARAVIRT did before this splitting. We move all users of
CONFIG_PARAVIRT to CONFIG_PARAVIRT_ALL, so they behave the same way they did
before.

So here it is - the splitting! I would have made the patch smaller, but this
was the closest I could get to atomic (for bisect) while staying sane.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/x86/Kconfig                            |   47 ++++++++++++++++++++++++--
 arch/x86/include/asm/apic.h                 |    2 +-
 arch/x86/include/asm/desc.h                 |    4 +-
 arch/x86/include/asm/fixmap.h               |    2 +-
 arch/x86/include/asm/highmem.h              |    2 +-
 arch/x86/include/asm/io_32.h                |    4 ++-
 arch/x86/include/asm/io_64.h                |    2 +-
 arch/x86/include/asm/irqflags.h             |   21 +++++++++---
 arch/x86/include/asm/mmu_context.h          |    4 +-
 arch/x86/include/asm/msr.h                  |    4 +-
 arch/x86/include/asm/paravirt.h             |   44 ++++++++++++++++++++++++-
 arch/x86/include/asm/paravirt_types.h       |   12 +++++++
 arch/x86/include/asm/pgalloc.h              |    2 +-
 arch/x86/include/asm/pgtable-3level_types.h |    2 +-
 arch/x86/include/asm/pgtable.h              |    2 +-
 arch/x86/include/asm/processor.h            |    2 +-
 arch/x86/include/asm/required-features.h    |    2 +-
 arch/x86/include/asm/smp.h                  |    2 +-
 arch/x86/include/asm/system.h               |   13 +++++--
 arch/x86/include/asm/tlbflush.h             |    4 +-
 arch/x86/kernel/head_64.S                   |    2 +-
 arch/x86/kernel/paravirt.c                  |    2 +
 arch/x86/kernel/tsc.c                       |    2 +-
 arch/x86/kernel/vsmp_64.c                   |    2 +-
 arch/x86/xen/Kconfig                        |    2 +-
 25 files changed, 149 insertions(+), 38 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0c7b699..8c150b6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -350,7 +350,7 @@ endif

 config X86_VSMP
 	bool "ScaleMP vSMP"
-	select PARAVIRT
+	select PARAVIRT_ALL
 	depends on X86_64 && PCI
 	depends on X86_EXTENDED_PLATFORM
 	---help---
@@ -493,7 +493,7 @@ source "arch/x86/xen/Kconfig"

 config VMI
 	bool "VMI Guest support (DEPRECATED)"
-	select PARAVIRT
+	select PARAVIRT_ALL
 	depends on X86_32
 	---help---
 	  VMI provides a paravirtualized interface to the VMware ESX server
@@ -512,7 +512,6 @@ config VMI

 config KVM_CLOCK
 	bool "KVM paravirtualized clock"
-	select PARAVIRT
 	select PARAVIRT_CLOCK
 	---help---
 	  Turning on this option will allow you to run a paravirtualized clock
@@ -523,7 +522,7 @@ config KVM_CLOCK

 config KVM_GUEST
 	bool "KVM Guest support"
-	select PARAVIRT
+	select PARAVIRT_ALL
 	---help---
 	  This option enables various optimizations for running under the KVM
 	  hypervisor.
@@ -551,8 +550,48 @@ config PARAVIRT_SPINLOCKS

 	  If you are unsure how to answer this question, answer N.

+config PARAVIRT_CPU
+	bool
+	select PARAVIRT
+	default n
+
+config PARAVIRT_TIME
+	bool
+	select PARAVIRT
+	default n
+
+config PARAVIRT_IRQ
+	bool
+	select PARAVIRT
+	default n
+
+config PARAVIRT_APIC
+	bool
+	select PARAVIRT
+	default n
+
+config PARAVIRT_MMU
+	bool
+	select PARAVIRT
+	default n
+
+#
+# This is a placeholder to activate the old "include all pv-ops functionality"
+# behavior. If you're using this I'd recommend looking through your code to see
+# if you can be more specific. It probably saves you a few cycles!
+#
+config PARAVIRT_ALL
+	bool
+	select PARAVIRT_CPU
+	select PARAVIRT_TIME
+	select PARAVIRT_IRQ
+	select PARAVIRT_APIC
+	select PARAVIRT_MMU
+	default n
+
 config PARAVIRT_CLOCK
 	bool
+	select PARAVIRT_TIME
 	default n

 endif

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 474d80d..b54c24a 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -81,7 +81,7 @@ static inline bool apic_from_smp_config(void)
 /*
  * Basic functions accessing APICs.
  */
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_APIC
 #include <asm/paravirt.h>
 #endif

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index e8de2f6..cf65891 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -78,7 +78,7 @@ static inline int desc_empty(const void *ptr)
 	return !(desc[0] | desc[1]);
 }

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_CPU
 #include <asm/paravirt.h>
 #else
 #define load_TR_desc() native_load_tr_desc()
@@ -108,7 +108,7 @@ static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
 static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries)
 {
 }
-#endif	/* CONFIG_PARAVIRT */
+#endif	/* CONFIG_PARAVIRT_CPU */

 #define store_ldt(ldt) asm("sldt %0" : "=m"(ldt))

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 14f9890..5f29317 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -156,7 +156,7 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
 void native_set_fixmap(enum fixed_addresses idx,
 		       phys_addr_t phys, pgprot_t flags);

-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
 static inline void __set_fixmap(enum fixed_addresses idx,
 				phys_addr_t phys, pgprot_t flags)
 {
diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h
index 014c2b8..458d785 100644
--- a/arch/x86/include/asm/highmem.h
+++ b/arch/x86/include/asm/highmem.h
@@ -66,7 +66,7 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
 void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot);
 struct page *kmap_atomic_to_page(void *ptr);

-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
 #define kmap_atomic_pte(page, type)	kmap_atomic(page, type)
 #endif

diff --git a/arch/x86/include/asm/io_32.h b/arch/x86/include/asm/io_32.h
index a299900..a263c6f 100644
--- a/arch/x86/include/asm/io_32.h
+++ b/arch/x86/include/asm/io_32.h
@@ -109,7 +109,9 @@ extern void io_delay_init(void);

 #if defined(CONFIG_PARAVIRT)
 #include <asm/paravirt.h>
-#else
+#endif
+
+#ifndef CONFIG_PARAVIRT_CPU

 static inline void slow_down_io(void)
 {
diff --git a/arch/x86/include/asm/io_64.h b/arch/x86/include/asm/io_64.h
index 2440678..82c6eae 100644
--- a/arch/x86/include/asm/io_64.h
+++ b/arch/x86/include/asm/io_64.h
@@ -40,7 +40,7 @@ extern void native_io_delay(void);
 extern int io_delay_type;
 extern void io_delay_init(void);

-#if defined(CONFIG_PARAVIRT)
+#if defined(CONFIG_PARAVIRT_CPU)
 #include <asm/paravirt.h>
 #else

diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 9e2b952..b8d8f4c 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -58,9 +58,11 @@ static inline void native_halt(void)

 #ifdef CONFIG_PARAVIRT
 #include <asm/paravirt.h>
-#else
+#endif
+
 #ifndef __ASSEMBLY__

+#ifndef CONFIG_PARAVIRT_IRQ
 static inline unsigned long __raw_local_save_flags(void)
 {
 	return native_save_fl();
@@ -110,12 +112,17 @@ static inline unsigned long __raw_local_irq_save(void)

 	return flags;
 }
-#else
+#endif /* CONFIG_PARAVIRT_IRQ */
+
+#else /* __ASSEMBLY__ */

+#ifndef CONFIG_PARAVIRT_IRQ
 #define ENABLE_INTERRUPTS(x)	sti
 #define DISABLE_INTERRUPTS(x)	cli
+#endif /* !CONFIG_PARAVIRT_IRQ */

 #ifdef CONFIG_X86_64
+#ifndef CONFIG_PARAVIRT_CPU
 #define SWAPGS	swapgs
 /*
  * Currently paravirt can't handle swapgs nicely when we
@@ -128,8 +135,6 @@ static inline unsigned long __raw_local_irq_save(void)
  */
 #define SWAPGS_UNSAFE_STACK	swapgs

-#define PARAVIRT_ADJUST_EXCEPTION_FRAME	/*  */
-
 #define INTERRUPT_RETURN	iretq
 #define USERGS_SYSRET64				\
 	swapgs;					\
@@ -141,16 +146,22 @@ static inline unsigned long __raw_local_irq_save(void)
 	swapgs;					\
 	sti;					\
 	sysexit
+#endif /* !CONFIG_PARAVIRT_CPU */
+
+#ifndef CONFIG_PARAVIRT_IRQ
+#define PARAVIRT_ADJUST_EXCEPTION_FRAME	/*  */
+#endif /* !CONFIG_PARAVIRT_IRQ */

 #else
+#ifndef CONFIG_PARAVIRT_CPU
 #define INTERRUPT_RETURN		iret
 #define ENABLE_INTERRUPTS_SYSEXIT	sti; sysexit
 #define GET_CR0_INTO_EAX	movl %cr0, %eax
+#endif /* !CONFIG_PARAVIRT_CPU */

 #endif

 #endif /* __ASSEMBLY__ */
-#endif /* CONFIG_PARAVIRT */

 #ifndef __ASSEMBLY__
 #define raw_local_save_flags(flags)				\
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 4a2d4e0..a209e67 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,14 +6,14 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
 #include <asm-generic/mm_hooks.h>

 static inline void paravirt_activate_mm(struct mm_struct *prev,
 					struct mm_struct *next)
 {
 }
-#endif	/* !CONFIG_PARAVIRT */
+#endif	/* !CONFIG_PARAVIRT_MMU */

 /*
  * Used for LDT copy/destruction.
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 7e2b6ba..80ec5a5 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -123,7 +123,7 @@ static inline unsigned long long native_read_pmc(int counter)
 	return EAX_EDX_VAL(val, low, high);
 }

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_CPU
 #include <asm/paravirt.h>
 #else
 #include <linux/errno.h>
@@ -234,7 +234,7 @@ do {							\

 #define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))

-#endif	/* !CONFIG_PARAVIRT */
+#endif	/* !CONFIG_PARAVIRT_CPU */

 #define checking_wrmsrl(msr, val) wrmsr_safe((msr), (u32)(val),		\
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index efb3899..e543098 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -18,6 +18,7 @@ static inline int paravirt_enabled(void)
 	return pv_info.paravirt_enabled;
 }

+#ifdef CONFIG_PARAVIRT_CPU
 static inline void load_sp0(struct tss_struct *tss,
 			    struct thread_struct *thread)
 {
@@ -58,7 +59,9 @@ static inline void write_cr0(unsigned long x)
 {
 	PVOP_VCALL1(pv_cpu_ops.write_cr0, x);
 }
+#endif /* CONFIG_PARAVIRT_CPU */

+#ifdef CONFIG_PARAVIRT_MMU
 static inline unsigned long read_cr2(void)
 {
 	return PVOP_CALL0(unsigned long, pv_mmu_ops.read_cr2);
@@ -78,7 +81,9 @@ static inline void write_cr3(unsigned long x)
 {
 	PVOP_VCALL1(pv_mmu_ops.write_cr3, x);
 }
+#endif /* CONFIG_PARAVIRT_MMU */

+#ifdef CONFIG_PARAVIRT_CPU
 static inline unsigned long read_cr4(void)
 {
 	return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr4);
@@ -92,8 +97,9 @@ static inline void write_cr4(unsigned long x)
 {
 	PVOP_VCALL1(pv_cpu_ops.write_cr4, x);
 }
+#endif /* CONFIG_PARAVIRT_CPU */

-#ifdef CONFIG_X86_64
+#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_CPU)
 static inline unsigned long read_cr8(void)
 {
 	return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr8);
@@ -105,6 +111,7 @@ static inline void write_cr8(unsigned long x)
 }
 #endif

+#ifdef CONFIG_PARAVIRT_IRQ
 static inline void raw_safe_halt(void)
 {
 	PVOP_VCALL0(pv_irq_ops.safe_halt);
@@ -114,14 +121,18 @@ static inline void halt(void)
 {
 	PVOP_VCALL0(pv_irq_ops.safe_halt);
 }
+#endif /* CONFIG_PARAVIRT_IRQ */

+#ifdef CONFIG_PARAVIRT_CPU
 static inline void wbinvd(void)
 {
 	PVOP_VCALL0(pv_cpu_ops.wbinvd);
 }
+#endif

 #define get_kernel_rpl()  (pv_info.kernel_rpl)

+#ifdef CONFIG_PARAVIRT_CPU
 static inline u64 paravirt_read_msr(unsigned msr, int *err)
 {
 	return PVOP_CALL2(u64, pv_cpu_ops.read_msr, msr, err);
@@ -224,12 +235,16 @@ do {						\
 } while (0)

 #define rdtscll(val) (val = paravirt_read_tsc())
+#endif /* CONFIG_PARAVIRT_CPU */

+#ifdef CONFIG_PARAVIRT_TIME
 static inline unsigned long long paravirt_sched_clock(void)
 {
 	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
 }
+#endif /* CONFIG_PARAVIRT_TIME */

+#ifdef CONFIG_PARAVIRT_CPU
 static inline unsigned long long paravirt_read_pmc(int counter)
 {
 	return PVOP_CALL1(u64, pv_cpu_ops.read_pmc, counter);
@@ -345,8 +360,9 @@ static inline void slow_down_io(void)
 	pv_cpu_ops.io_delay();
 #endif
 }
+#endif /* CONFIG_PARAVIRT_CPU */

-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_APIC)
 static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip,
 				    unsigned long start_esp)
 {
@@ -355,6 +371,7 @@ static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip,
 }
 #endif

+#ifdef CONFIG_PARAVIRT_MMU
 static inline void paravirt_activate_mm(struct mm_struct *prev,
 					struct mm_struct *next)
 {
@@ -698,7 +715,9 @@ static inline void pmd_clear(pmd_t *pmdp)
 	set_pmd(pmdp, __pmd(0));
 }
 #endif	/* CONFIG_X86_PAE */
+#endif /* CONFIG_PARAVIRT_MMU */

+#ifdef CONFIG_PARAVIRT_CPU
 #define  __HAVE_ARCH_START_CONTEXT_SWITCH
 static inline void arch_start_context_switch(struct task_struct *prev)
 {
@@ -709,7 +728,9 @@ static inline void arch_end_context_switch(struct task_struct *next)
 {
 	PVOP_VCALL1(pv_cpu_ops.end_context_switch, next);
 }
+#endif /* CONFIG_PARAVIRT_CPU */

+#ifdef CONFIG_PARAVIRT_MMU
 #define  __HAVE_ARCH_ENTER_LAZY_MMU_MODE
 static inline void arch_enter_lazy_mmu_mode(void)
 {
@@ -728,6 +749,7 @@ static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
 {
 	pv_mmu_ops.set_fixmap(idx, phys, flags);
 }
+#endif /* CONFIG_PARAVIRT_MMU */

 #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS)

@@ -838,6 +860,7 @@ static __always_inline void __raw_spin_unlock(struct raw_spinlock *lock)
 #define __PV_IS_CALLEE_SAVE(func)			\
 	((struct paravirt_callee_save) { func })

+#ifdef CONFIG_PARAVIRT_IRQ
 static inline unsigned long __raw_local_save_flags(void)
 {
 	return PVOP_CALLEE0(unsigned long, pv_irq_ops.save_fl);
@@ -866,6 +889,7 @@ static inline unsigned long __raw_local_irq_save(void)
 	raw_local_irq_disable();
 	return f;
 }
+#endif /* CONFIG_PARAVIRT_IRQ */

 /* Make sure as little as possible of this mess escapes. */
@@ -948,10 +972,13 @@ extern void default_banner(void);
 #define PARA_INDIRECT(addr)	*%cs:addr
 #endif

+#ifdef CONFIG_PARAVIRT_CPU
 #define INTERRUPT_RETURN						\
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE,	\
 		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_iret))
+#endif /* CONFIG_PARAVIRT_CPU */

+#ifdef CONFIG_PARAVIRT_IRQ
 #define DISABLE_INTERRUPTS(clobbers)					\
 	PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers,	\
 		  PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE);		\
@@ -963,13 +990,17 @@ extern void default_banner(void);
 		  PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE);		\
 		  call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_enable);	\
 		  PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)
+#endif /* CONFIG_PARAVIRT_IRQ */

+#ifdef CONFIG_PARAVIRT_CPU
 #define USERGS_SYSRET32							\
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret32),	\
 		  CLBR_NONE,						\
 		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret32))
+#endif /* CONFIG_PARAVIRT_CPU */

 #ifdef CONFIG_X86_32
+#ifdef CONFIG_PARAVIRT_CPU
 #define GET_CR0_INTO_EAX				\
 	push %ecx; push %edx;				\
 	call PARA_INDIRECT(pv_cpu_ops+PV_CPU_read_cr0);	\
@@ -979,10 +1010,12 @@ extern void default_banner(void);
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit),	\
 		  CLBR_NONE,						\
 		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit))
+#endif /* CONFIG_PARAVIRT_CPU */

 #else	/* !CONFIG_X86_32 */

+#ifdef CONFIG_PARAVIRT_CPU
 /*
  * If swapgs is used while the userspace stack is still current,
  * there's no way to call a pvop. The PV replacement *must* be
@@ -1002,17 +1035,23 @@ extern void default_banner(void);
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE,	\
 		  call PARA_INDIRECT(pv_cpu_ops+PV_CPU_swapgs)		\
 		 )
+#endif /* CONFIG_PARAVIRT_CPU */

+#ifdef CONFIG_PARAVIRT_MMU
 #define GET_CR2_INTO_RCX				\
 	call PARA_INDIRECT(pv_mmu_ops+PV_MMU_read_cr2);	\
 	movq %rax, %rcx;				\
 	xorq %rax, %rax;
+#endif /* CONFIG_PARAVIRT_MMU */

+#ifdef CONFIG_PARAVIRT_IRQ
 #define PARAVIRT_ADJUST_EXCEPTION_FRAME					\
 	PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_adjust_exception_frame), \
 		  CLBR_NONE,						\
 		  call PARA_INDIRECT(pv_irq_ops+PV_IRQ_adjust_exception_frame))
+#endif /* CONFIG_PARAVIRT_IRQ */

+#ifdef CONFIG_PARAVIRT_CPU
 #define USERGS_SYSRET64							\
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64),	\
 		  CLBR_NONE,						\
@@ -1022,6 +1061,7 @@ extern void default_banner(void);
 	PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit),	\
 		  CLBR_NONE,						\
 		  jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit))
+#endif /* CONFIG_PARAVIRT_CPU */

 #endif	/* CONFIG_X86_32 */

 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 9357473..e190450 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -343,12 +343,24 @@ struct paravirt_patch_template {

 extern struct pv_info pv_info;
 extern struct pv_init_ops pv_init_ops;
+#ifdef CONFIG_PARAVIRT_TIME
 extern struct pv_time_ops pv_time_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_CPU
 extern struct pv_cpu_ops pv_cpu_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_IRQ
 extern struct pv_irq_ops pv_irq_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_APIC
 extern struct pv_apic_ops pv_apic_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_MMU
 extern struct pv_mmu_ops pv_mmu_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
 extern struct pv_lock_ops pv_lock_ops;
+#endif

 #define PARAVIRT_PATCH(x)					\
 	(offsetof(struct paravirt_patch_template, x) / sizeof(void *))

diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index 0e8c2a0..94cce3d 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -7,7 +7,7 @@ static inline int  __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
 #include <asm/paravirt.h>
 #else
 #define paravirt_pgd_alloc(mm)	__paravirt_pgd_alloc(mm)
diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h
index 1bd5876..be58e74 100644
--- a/arch/x86/include/asm/pgtable-3level_types.h
+++ b/arch/x86/include/asm/pgtable-3level_types.h
@@ -18,7 +18,7 @@ typedef union {
 } pte_t;
 #endif	/* !__ASSEMBLY__ */

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
 #define SHARED_KERNEL_PMD	(pv_info.shared_kernel_pmd)
 #else
 #define SHARED_KERNEL_PMD	1
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index af6fd36..b68edfc 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -26,7 +26,7 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 extern spinlock_t pgd_lock;
 extern struct list_head pgd_list;

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
 #include <asm/paravirt.h>
 #else  /* !CONFIG_PARAVIRT */
 #define set_pte(ptep, pte)		native_set_pte(ptep, pte)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c3429e8..a42a807 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -571,7 +571,7 @@ static inline void native_swapgs(void)
 #endif
 }

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_CPU
 #include <asm/paravirt.h>
 #else
 #define __cpuid			native_cpuid
diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
index 64cf2d2..f68edf2 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -48,7 +48,7 @@
 #endif

 #ifdef CONFIG_X86_64
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
 /* Paravirtualized systems may not have PSE or PGE available */
 #define NEED_PSE	0
 #define NEED_PGE	0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 1e79678..fdd889a 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -66,7 +66,7 @@ struct smp_ops {
 extern void set_cpu_sibling_map(int cpu);

 #ifdef CONFIG_SMP
-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_APIC
 #define startup_ipi_hook(phys_apicid, start_eip, start_esp) do { } while (0)
 #endif
 extern struct smp_ops smp_ops;
diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h
index f08f973..63ca93c 100644
--- a/arch/x86/include/asm/system.h
+++ b/arch/x86/include/asm/system.h
@@ -302,13 +302,18 @@ static inline void native_wbinvd(void)

 #ifdef CONFIG_PARAVIRT
 #include <asm/paravirt.h>
-#else
-#define read_cr0()	(native_read_cr0())
-#define write_cr0(x)	(native_write_cr0(x))
+#endif/* CONFIG_PARAVIRT */
+
+#ifndef CONFIG_PARAVIRT_MMU
 #define read_cr2()	(native_read_cr2())
 #define write_cr2(x)	(native_write_cr2(x))
 #define read_cr3()	(native_read_cr3())
 #define write_cr3(x)	(native_write_cr3(x))
+#endif /* CONFIG_PARAVIRT_MMU */
+
+#ifndef CONFIG_PARAVIRT_CPU
+#define read_cr0()	(native_read_cr0())
+#define write_cr0(x)	(native_write_cr0(x))
 #define read_cr4()	(native_read_cr4())
 #define read_cr4_safe()	(native_read_cr4_safe())
 #define write_cr4(x)	(native_write_cr4(x))
@@ -322,7 +327,7 @@ static inline void native_wbinvd(void)
 /* Clear the 'TS' bit */
 #define clts()		(native_clts())

-#endif/* CONFIG_PARAVIRT */
+#endif /* CONFIG_PARAVIRT_CPU */

 #define stts() write_cr0(read_cr0() | X86_CR0_TS)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 7f3eba0..89e055c 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -7,7 +7,7 @@
 #include <asm/processor.h>
 #include <asm/system.h>

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
 #include <asm/paravirt.h>
 #else
 #define __flush_tlb() __native_flush_tlb()
@@ -162,7 +162,7 @@ static inline void reset_lazy_tlbstate(void)

 #endif	/* SMP */

-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
 #define flush_tlb_others(mask, mm, va)	native_flush_tlb_others(mask, mm, va)
 #endif

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 780cd92..1284d8d 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -20,7 +20,7 @@
 #include <asm/processor-flags.h>
 #include <asm/percpu.h>

-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
 #include <asm/asm-offsets.h>
 #include <asm/paravirt.h>
 #else
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1b1739d..c8530bd 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -155,12 +155,14 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
 	else if (opfunc == _paravirt_ident_64)
 		ret = paravirt_patch_ident_64(insnbuf, len);

+#ifdef CONFIG_PARAVIRT_CPU
 	else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
 		 type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit) ||
 		 type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret32) ||
 		 type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret64))
 		/* If operation requires a jmp, then jmp */
 		ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len);
+#endif
 	else
 		/* Otherwise call the function; assume target could
 		   clobber any caller-save reg */
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index cd982f4..96aad98 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -66,7 +66,7 @@ u64 native_sched_clock(void)
 /* We need to define a real function for sched_clock, to override the
    weak default version */
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_TIME
 unsigned long long sched_clock(void)
 {
 	return paravirt_sched_clock();
diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index a1d804b..23f4612 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -22,7 +22,7 @@
 #include <asm/paravirt.h>
 #include <asm/setup.h>

-#if defined CONFIG_PCI && defined CONFIG_PARAVIRT
+#if defined CONFIG_PCI && defined CONFIG_PARAVIRT_IRQ
 /*
  * Interrupt control on vSMPowered systems:
  * ~AC is a shadow of IF. If IF is 'on' AC should be 'off'
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index b83e119..eef41bd 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -4,7 +4,7 @@
 config XEN
 	bool "Xen guest support"
-	select PARAVIRT
+	select PARAVIRT_ALL
 	select PARAVIRT_CLOCK
 	depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
 	depends on X86_CMPXCHG && X86_TSC
-- 
1.6.0.2
* Re: [PATCH 1/3] Split paravirt ops by functionality
From: Jeremy Fitzhardinge @ 2009-11-19 14:59 UTC
To: Alexander Graf
Cc: kvm list, Nick Piggin, Glauber Costa, Avi Kivity, virtualization

On 11/18/09 08:13, Alexander Graf wrote:
> Currently when using paravirt ops it's an all-or-nothing option. We can either
> use pv-ops for CPU, MMU, timing, etc. or not at all.
>
> Now there are some use cases where we don't need the full feature set, but only
> a small chunk of it. KVM is a pretty prominent example for this.
>
> So let's make everything a bit more fine-grained. We already have a splitting
> by function groups, namely "cpu", "mmu", "time", "irq", "apic" and "spinlock".
>
> Taking that existing splitting and extending it to only compile in the PV
> capable bits sounded like a natural fit. That way we don't get performance hits
> in MMU code from using the KVM PV clock which only needs the TIME parts of
> pv-ops.
>
> We define a new CONFIG_PARAVIRT_ALL option that basically does the same thing
> the CONFIG_PARAVIRT did before this splitting. We move all users of
> CONFIG_PARAVIRT to CONFIG_PARAVIRT_ALL, so they behave the same way they did
> before.
>
> So here it is - the splitting! I would have made the patch smaller, but this
> was the closest I could get to atomic (for bisect) while staying sane.

The basic idea seems pretty sane. I'm wondering how much compile-test
coverage you've given all these extra config options; there are now a lot
more combinations, and your use of select is particularly worrying because
it doesn't propagate dependencies properly. For example, does this actually
work?
> config XEN
> 	bool "Xen guest support"
> -	select PARAVIRT
> +	select PARAVIRT_ALL
> 	select PARAVIRT_CLOCK
> 	depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
> 	depends on X86_CMPXCHG && X86_TSC

Does selecting PARAVIRT_ALL end up selecting all the other PARAVIRT_*? Can
you reassure me?

Also, I think VMI is the only serious user of PARAVIRT_APIC, so we can mark
that to go when VMI does.

What ends up using plain CONFIG_PARAVIRT? Do we still need it?

    J
/* !CONFIG_PARAVIRT_CPU */ > + > +#ifndef CONFIG_PARAVIRT_IRQ > +#define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */ > +#endif /* !CONFIG_PARAVIRT_IRQ */ > > #else > +#ifndef CONFIG_PARAVIRT_CPU > #define INTERRUPT_RETURN iret > #define ENABLE_INTERRUPTS_SYSEXIT sti; sysexit > #define GET_CR0_INTO_EAX movl %cr0, %eax > +#endif /* !CONFIG_PARAVIRT_CPU */ > #endif > > > #endif /* __ASSEMBLY__ */ > -#endif /* CONFIG_PARAVIRT */ > > #ifndef __ASSEMBLY__ > #define raw_local_save_flags(flags) \ > diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h > index 4a2d4e0..a209e67 100644 > --- a/arch/x86/include/asm/mmu_context.h > +++ b/arch/x86/include/asm/mmu_context.h > @@ -6,14 +6,14 @@ > #include <asm/pgalloc.h> > #include <asm/tlbflush.h> > #include <asm/paravirt.h> > -#ifndef CONFIG_PARAVIRT > +#ifndef CONFIG_PARAVIRT_MMU > #include <asm-generic/mm_hooks.h> > > static inline void paravirt_activate_mm(struct mm_struct *prev, > struct mm_struct *next) > { > } > -#endif /* !CONFIG_PARAVIRT */ > +#endif /* !CONFIG_PARAVIRT_MMU */ > > /* > * Used for LDT copy/destruction. 
> diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h > index 7e2b6ba..80ec5a5 100644 > --- a/arch/x86/include/asm/msr.h > +++ b/arch/x86/include/asm/msr.h > @@ -123,7 +123,7 @@ static inline unsigned long long native_read_pmc(int counter) > return EAX_EDX_VAL(val, low, high); > } > > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_CPU > #include <asm/paravirt.h> > #else > #include <linux/errno.h> > @@ -234,7 +234,7 @@ do { \ > > #define rdtscpll(val, aux) (val) = native_read_tscp(&(aux)) > > -#endif /* !CONFIG_PARAVIRT */ > +#endif /* !CONFIG_PARAVIRT_CPU */ > > > #define checking_wrmsrl(msr, val) wrmsr_safe((msr), (u32)(val), \ > diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h > index efb3899..e543098 100644 > --- a/arch/x86/include/asm/paravirt.h > +++ b/arch/x86/include/asm/paravirt.h > @@ -18,6 +18,7 @@ static inline int paravirt_enabled(void) > return pv_info.paravirt_enabled; > } > > +#ifdef CONFIG_PARAVIRT_CPU > static inline void load_sp0(struct tss_struct *tss, > struct thread_struct *thread) > { > @@ -58,7 +59,9 @@ static inline void write_cr0(unsigned long x) > { > PVOP_VCALL1(pv_cpu_ops.write_cr0, x); > } > +#endif /* CONFIG_PARAVIRT_CPU */ > > +#ifdef CONFIG_PARAVIRT_MMU > static inline unsigned long read_cr2(void) > { > return PVOP_CALL0(unsigned long, pv_mmu_ops.read_cr2); > @@ -78,7 +81,9 @@ static inline void write_cr3(unsigned long x) > { > PVOP_VCALL1(pv_mmu_ops.write_cr3, x); > } > +#endif /* CONFIG_PARAVIRT_MMU */ > > +#ifdef CONFIG_PARAVIRT_CPU > static inline unsigned long read_cr4(void) > { > return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr4); > @@ -92,8 +97,9 @@ static inline void write_cr4(unsigned long x) > { > PVOP_VCALL1(pv_cpu_ops.write_cr4, x); > } > +#endif /* CONFIG_PARAVIRT_CPU */ > > -#ifdef CONFIG_X86_64 > +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_CPU) > static inline unsigned long read_cr8(void) > { > return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr8); > @@ 
-105,6 +111,7 @@ static inline void write_cr8(unsigned long x) > } > #endif > > +#ifdef CONFIG_PARAVIRT_IRQ > static inline void raw_safe_halt(void) > { > PVOP_VCALL0(pv_irq_ops.safe_halt); > @@ -114,14 +121,18 @@ static inline void halt(void) > { > PVOP_VCALL0(pv_irq_ops.safe_halt); > } > +#endif /* CONFIG_PARAVIRT_IRQ */ > > +#ifdef CONFIG_PARAVIRT_CPU > static inline void wbinvd(void) > { > PVOP_VCALL0(pv_cpu_ops.wbinvd); > } > +#endif > > #define get_kernel_rpl() (pv_info.kernel_rpl) > > +#ifdef CONFIG_PARAVIRT_CPU > static inline u64 paravirt_read_msr(unsigned msr, int *err) > { > return PVOP_CALL2(u64, pv_cpu_ops.read_msr, msr, err); > @@ -224,12 +235,16 @@ do { \ > } while (0) > > #define rdtscll(val) (val = paravirt_read_tsc()) > +#endif /* CONFIG_PARAVIRT_CPU */ > > +#ifdef CONFIG_PARAVIRT_TIME > static inline unsigned long long paravirt_sched_clock(void) > { > return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock); > } > +#endif /* CONFIG_PARAVIRT_TIME */ > > +#ifdef CONFIG_PARAVIRT_CPU > static inline unsigned long long paravirt_read_pmc(int counter) > { > return PVOP_CALL1(u64, pv_cpu_ops.read_pmc, counter); > @@ -345,8 +360,9 @@ static inline void slow_down_io(void) > pv_cpu_ops.io_delay(); > #endif > } > +#endif /* CONFIG_PARAVIRT_CPU */ > > -#ifdef CONFIG_SMP > +#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_APIC) > static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip, > unsigned long start_esp) > { > @@ -355,6 +371,7 @@ static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip, > } > #endif > > +#ifdef CONFIG_PARAVIRT_MMU > static inline void paravirt_activate_mm(struct mm_struct *prev, > struct mm_struct *next) > { > @@ -698,7 +715,9 @@ static inline void pmd_clear(pmd_t *pmdp) > set_pmd(pmdp, __pmd(0)); > } > #endif /* CONFIG_X86_PAE */ > +#endif /* CONFIG_PARAVIRT_MMU */ > > +#ifdef CONFIG_PARAVIRT_CPU > #define __HAVE_ARCH_START_CONTEXT_SWITCH > static inline void 
arch_start_context_switch(struct task_struct *prev) > { > @@ -709,7 +728,9 @@ static inline void arch_end_context_switch(struct task_struct *next) > { > PVOP_VCALL1(pv_cpu_ops.end_context_switch, next); > } > +#endif /* CONFIG_PARAVIRT_CPU */ > > +#ifdef CONFIG_PARAVIRT_MMU > #define __HAVE_ARCH_ENTER_LAZY_MMU_MODE > static inline void arch_enter_lazy_mmu_mode(void) > { > @@ -728,6 +749,7 @@ static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx, > { > pv_mmu_ops.set_fixmap(idx, phys, flags); > } > +#endif /* CONFIG_PARAVIRT_MMU */ > > #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS) > > @@ -838,6 +860,7 @@ static __always_inline void __raw_spin_unlock(struct raw_spinlock *lock) > #define __PV_IS_CALLEE_SAVE(func) \ > ((struct paravirt_callee_save) { func }) > > +#ifdef CONFIG_PARAVIRT_IRQ > static inline unsigned long __raw_local_save_flags(void) > { > return PVOP_CALLEE0(unsigned long, pv_irq_ops.save_fl); > @@ -866,6 +889,7 @@ static inline unsigned long __raw_local_irq_save(void) > raw_local_irq_disable(); > return f; > } > +#endif /* CONFIG_PARAVIRT_IRQ */ > > > /* Make sure as little as possible of this mess escapes. 
*/ > @@ -948,10 +972,13 @@ extern void default_banner(void); > #define PARA_INDIRECT(addr) *%cs:addr > #endif > > +#ifdef CONFIG_PARAVIRT_CPU > #define INTERRUPT_RETURN \ > PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE, \ > jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_iret)) > +#endif /* CONFIG_PARAVIRT_CPU */ > > +#ifdef CONFIG_PARAVIRT_IRQ > #define DISABLE_INTERRUPTS(clobbers) \ > PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers, \ > PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \ > @@ -963,13 +990,17 @@ extern void default_banner(void); > PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \ > call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_enable); \ > PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);) > +#endif /* CONFIG_PARAVIRT_IRQ */ > > +#ifdef CONFIG_PARAVIRT_CPU > #define USERGS_SYSRET32 \ > PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret32), \ > CLBR_NONE, \ > jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret32)) > +#endif /* CONFIG_PARAVIRT_CPU */ > > #ifdef CONFIG_X86_32 > +#ifdef CONFIG_PARAVIRT_CPU > #define GET_CR0_INTO_EAX \ > push %ecx; push %edx; \ > call PARA_INDIRECT(pv_cpu_ops+PV_CPU_read_cr0); \ > @@ -979,10 +1010,12 @@ extern void default_banner(void); > PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), \ > CLBR_NONE, \ > jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit)) > +#endif /* CONFIG_PARAVIRT_CPU */ > > > #else /* !CONFIG_X86_32 */ > > +#ifdef CONFIG_PARAVIRT_CPU > /* > * If swapgs is used while the userspace stack is still current, > * there's no way to call a pvop. 
The PV replacement *must* be > @@ -1002,17 +1035,23 @@ extern void default_banner(void); > PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE, \ > call PARA_INDIRECT(pv_cpu_ops+PV_CPU_swapgs) \ > ) > +#endif /* CONFIG_PARAVIRT_CPU */ > > +#ifdef CONFIG_PARAVIRT_MMU > #define GET_CR2_INTO_RCX \ > call PARA_INDIRECT(pv_mmu_ops+PV_MMU_read_cr2); \ > movq %rax, %rcx; \ > xorq %rax, %rax; > +#endif /* CONFIG_PARAVIRT_MMU */ > > +#ifdef CONFIG_PARAVIRT_IRQ > #define PARAVIRT_ADJUST_EXCEPTION_FRAME \ > PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_adjust_exception_frame), \ > CLBR_NONE, \ > call PARA_INDIRECT(pv_irq_ops+PV_IRQ_adjust_exception_frame)) > +#endif /* CONFIG_PARAVIRT_IRQ */ > > +#ifdef CONFIG_PARAVIRT_CPU > #define USERGS_SYSRET64 \ > PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64), \ > CLBR_NONE, \ > @@ -1022,6 +1061,7 @@ extern void default_banner(void); > PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), \ > CLBR_NONE, \ > jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit)) > +#endif /* CONFIG_PARAVIRT_CPU */ > #endif /* CONFIG_X86_32 */ > > #endif /* __ASSEMBLY__ */ > diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h > index 9357473..e190450 100644 > --- a/arch/x86/include/asm/paravirt_types.h > +++ b/arch/x86/include/asm/paravirt_types.h > @@ -343,12 +343,24 @@ struct paravirt_patch_template { > > extern struct pv_info pv_info; > extern struct pv_init_ops pv_init_ops; > +#ifdef CONFIG_PARAVIRT_TIME > extern struct pv_time_ops pv_time_ops; > +#endif > +#ifdef CONFIG_PARAVIRT_CPU > extern struct pv_cpu_ops pv_cpu_ops; > +#endif > +#ifdef CONFIG_PARAVIRT_IRQ > extern struct pv_irq_ops pv_irq_ops; > +#endif > +#ifdef CONFIG_PARAVIRT_APIC > extern struct pv_apic_ops pv_apic_ops; > +#endif > +#ifdef CONFIG_PARAVIRT_MMU > extern struct pv_mmu_ops pv_mmu_ops; > +#endif > +#ifdef CONFIG_PARAVIRT_SPINLOCKS > extern struct pv_lock_ops pv_lock_ops; > +#endif > That's unpleasantly noisy, but I guess 
we just have to blame cpp's syntax. > > #define PARAVIRT_PATCH(x) \ > (offsetof(struct paravirt_patch_template, x) / sizeof(void *)) > diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h > index 0e8c2a0..94cce3d 100644 > --- a/arch/x86/include/asm/pgalloc.h > +++ b/arch/x86/include/asm/pgalloc.h > @@ -7,7 +7,7 @@ > > static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; } > > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_MMU > #include <asm/paravirt.h> > #else > #define paravirt_pgd_alloc(mm) __paravirt_pgd_alloc(mm) > diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h > index 1bd5876..be58e74 100644 > --- a/arch/x86/include/asm/pgtable-3level_types.h > +++ b/arch/x86/include/asm/pgtable-3level_types.h > @@ -18,7 +18,7 @@ typedef union { > } pte_t; > #endif /* !__ASSEMBLY__ */ > > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_MMU > #define SHARED_KERNEL_PMD (pv_info.shared_kernel_pmd) > #else > #define SHARED_KERNEL_PMD 1 > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h > index af6fd36..b68edfc 100644 > --- a/arch/x86/include/asm/pgtable.h > +++ b/arch/x86/include/asm/pgtable.h > @@ -26,7 +26,7 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]; > extern spinlock_t pgd_lock; > extern struct list_head pgd_list; > > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_MMU > #include <asm/paravirt.h> > #else /* !CONFIG_PARAVIRT */ > #define set_pte(ptep, pte) native_set_pte(ptep, pte) > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > index c3429e8..a42a807 100644 > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -571,7 +571,7 @@ static inline void native_swapgs(void) > #endif > } > > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_CPU > #include <asm/paravirt.h> > #else > #define __cpuid native_cpuid > diff --git 
a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h > index 64cf2d2..f68edf2 100644 > --- a/arch/x86/include/asm/required-features.h > +++ b/arch/x86/include/asm/required-features.h > @@ -48,7 +48,7 @@ > #endif > > #ifdef CONFIG_X86_64 > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_MMU > /* Paravirtualized systems may not have PSE or PGE available */ > #define NEED_PSE 0 > #define NEED_PGE 0 > diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h > index 1e79678..fdd889a 100644 > --- a/arch/x86/include/asm/smp.h > +++ b/arch/x86/include/asm/smp.h > @@ -66,7 +66,7 @@ struct smp_ops { > extern void set_cpu_sibling_map(int cpu); > > #ifdef CONFIG_SMP > -#ifndef CONFIG_PARAVIRT > +#ifndef CONFIG_PARAVIRT_APIC > #define startup_ipi_hook(phys_apicid, start_eip, start_esp) do { } while (0) > #endif > extern struct smp_ops smp_ops; > diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h > index f08f973..63ca93c 100644 > --- a/arch/x86/include/asm/system.h > +++ b/arch/x86/include/asm/system.h > @@ -302,13 +302,18 @@ static inline void native_wbinvd(void) > > #ifdef CONFIG_PARAVIRT > #include <asm/paravirt.h> > -#else > -#define read_cr0() (native_read_cr0()) > -#define write_cr0(x) (native_write_cr0(x)) > +#endif/* CONFIG_PARAVIRT */ > + > +#ifndef CONFIG_PARAVIRT_MMU > #define read_cr2() (native_read_cr2()) > #define write_cr2(x) (native_write_cr2(x)) > #define read_cr3() (native_read_cr3()) > #define write_cr3(x) (native_write_cr3(x)) > +#endif /* CONFIG_PARAVIRT_MMU */ > + > +#ifndef CONFIG_PARAVIRT_CPU > +#define read_cr0() (native_read_cr0()) > +#define write_cr0(x) (native_write_cr0(x)) > #define read_cr4() (native_read_cr4()) > #define read_cr4_safe() (native_read_cr4_safe()) > #define write_cr4(x) (native_write_cr4(x)) > @@ -322,7 +327,7 @@ static inline void native_wbinvd(void) > /* Clear the 'TS' bit */ > #define clts() (native_clts()) > > -#endif/* CONFIG_PARAVIRT */ > +#endif /* 
CONFIG_PARAVIRT_CPU */ > > #define stts() write_cr0(read_cr0() | X86_CR0_TS) > > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h > index 7f3eba0..89e055c 100644 > --- a/arch/x86/include/asm/tlbflush.h > +++ b/arch/x86/include/asm/tlbflush.h > @@ -7,7 +7,7 @@ > #include <asm/processor.h> > #include <asm/system.h> > > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_MMU > #include <asm/paravirt.h> > #else > #define __flush_tlb() __native_flush_tlb() > @@ -162,7 +162,7 @@ static inline void reset_lazy_tlbstate(void) > > #endif /* SMP */ > > -#ifndef CONFIG_PARAVIRT > +#ifndef CONFIG_PARAVIRT_MMU > #define flush_tlb_others(mask, mm, va) native_flush_tlb_others(mask, mm, va) > #endif > > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S > index 780cd92..1284d8d 100644 > --- a/arch/x86/kernel/head_64.S > +++ b/arch/x86/kernel/head_64.S > @@ -20,7 +20,7 @@ > #include <asm/processor-flags.h> > #include <asm/percpu.h> > > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_MMU > #include <asm/asm-offsets.h> > #include <asm/paravirt.h> > #else > diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c > index 1b1739d..c8530bd 100644 > --- a/arch/x86/kernel/paravirt.c > +++ b/arch/x86/kernel/paravirt.c > @@ -155,12 +155,14 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf, > else if (opfunc == _paravirt_ident_64) > ret = paravirt_patch_ident_64(insnbuf, len); > > +#ifdef CONFIG_PARAVIRT_CPU > else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) || > type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit) || > type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret32) || > type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret64)) > /* If operation requires a jmp, then jmp */ > ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len); > +#endif > else > /* Otherwise call the function; assume target could > clobber any caller-save reg */ > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c > index cd982f4..96aad98 
100644 > --- a/arch/x86/kernel/tsc.c > +++ b/arch/x86/kernel/tsc.c > @@ -66,7 +66,7 @@ u64 native_sched_clock(void) > > /* We need to define a real function for sched_clock, to override the > weak default version */ > -#ifdef CONFIG_PARAVIRT > +#ifdef CONFIG_PARAVIRT_TIME > unsigned long long sched_clock(void) > { > return paravirt_sched_clock(); > diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c > index a1d804b..23f4612 100644 > --- a/arch/x86/kernel/vsmp_64.c > +++ b/arch/x86/kernel/vsmp_64.c > @@ -22,7 +22,7 @@ > #include <asm/paravirt.h> > #include <asm/setup.h> > > -#if defined CONFIG_PCI && defined CONFIG_PARAVIRT > +#if defined CONFIG_PCI && defined CONFIG_PARAVIRT_IRQ > /* > * Interrupt control on vSMPowered systems: > * ~AC is a shadow of IF. If IF is 'on' AC should be 'off' > diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig > index b83e119..eef41bd 100644 > --- a/arch/x86/xen/Kconfig > +++ b/arch/x86/xen/Kconfig > @@ -4,7 +4,7 @@ > > config XEN > bool "Xen guest support" > - select PARAVIRT > + select PARAVIRT_ALL > select PARAVIRT_CLOCK > depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS) > depends on X86_CMPXCHG && X86_TSC > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] Split paravirt ops by functionality 2009-11-19 14:59 ` Jeremy Fitzhardinge @ 2009-11-19 15:21 ` Alexander Graf 0 siblings, 0 replies; 13+ messages in thread From: Alexander Graf @ 2009-11-19 15:21 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: kvm list, Nick Piggin, Glauber Costa, Avi Kivity, virtualization Jeremy Fitzhardinge wrote: > On 11/18/09 08:13, Alexander Graf wrote: > >> Currently when using paravirt ops it's an all-or-nothing option. We can either >> use pv-ops for CPU, MMU, timing, etc. or not at all. >> >> Now there are some use cases where we don't need the full feature set, but only >> a small chunk of it. KVM is a pretty prominent example for this. >> >> So let's make everything a bit more fine-grained. We already have a splitting >> by function groups, namely "cpu", "mmu", "time", "irq", "apic" and "spinlock". >> >> Taking that existing splitting and extending it to only compile in the PV >> capable bits sounded like a natural fit. That way we don't get performance hits >> in MMU code from using the KVM PV clock which only needs the TIME parts of >> pv-ops. >> >> We define a new CONFIG_PARAVIRT_ALL option that basically does the same thing >> the CONFIG_PARAVIRT did before this splitting. We move all users of >> CONFIG_PARAVIRT to CONFIG_PARAVIRT_ALL, so they behave the same way they did >> before. >> >> So here it is - the splitting! I would have made the patch smaller, but this >> was the closest I could get to atomic (for bisect) while staying sane. >> >> > > The basic idea seems pretty sane. I'm wondering how much compile test > coverage you've given all these extra config options; there are now a > lot more combinations, and your use of select is particularly worrying > because they don't propagate dependencies properly. > Uh - I don't see where there should be any dependencies. > For example, does this actually work? 
> >> config XEN >> bool "Xen guest support" >> - select PARAVIRT >> + select PARAVIRT_ALL >> select PARAVIRT_CLOCK >> depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS) >> depends on X86_CMPXCHG && X86_TSC >> >> > Does selecting PARAVIRT_ALL end up selecting all the other PARAVIRT_*? > > Can you reassure me? > > +config PARAVIRT_ALL > + bool > + select PARAVIRT_CPU > + select PARAVIRT_TIME > + select PARAVIRT_IRQ > + select PARAVIRT_APIC > + select PARAVIRT_MMU > + default n > + So selecting PARAVIRT_ALL selects all the other split PARAVIRT parts that in turn select PARAVIRT. I tested a compile on x86_64 with Xen DomU support enabled and it compiled fine :-). > Also, I think VMI is the only serious user of PARAVIRT_APIC, so we can > mark that to go when VMI does. > Sounds good :-). So the patch even serves as a helper for anyone who'll remove that support later. > What ends up using plain CONFIG_PARAVIRT? Do we still need it? > It's used for an info field that says which hypervisor we're running on and as config option to know if we need to include the binary patch magic. As soon as a single sub-paravirt option is selected, we need that to make sure we have the framework in place. Alex ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 2/3] Only export selected pv-ops feature structs 2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf 2009-11-18 0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf @ 2009-11-18 0:13 ` Alexander Graf 2009-11-18 0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf ` (2 subsequent siblings) 4 siblings, 0 replies; 13+ messages in thread From: Alexander Graf @ 2009-11-18 0:13 UTC (permalink / raw) To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin To really check for sure that we're not using any pv-ops code by accident, we should make sure that we don't even export the structures used to access pv-ops exported functions. So let's surround the pv-ops structs by #ifdefs. Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/x86/kernel/paravirt.c | 35 +++++++++++++++++++++++++++++------ 1 files changed, 29 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index c8530bd..0619e7c 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -124,11 +124,21 @@ static void *get_call_destination(u8 type) { struct paravirt_patch_template tmpl = { .pv_init_ops = pv_init_ops, +#ifdef CONFIG_PARAVIRT_TIME .pv_time_ops = pv_time_ops, +#endif +#ifdef CONFIG_PARAVIRT_CPU .pv_cpu_ops = pv_cpu_ops, +#endif +#ifdef CONFIG_PARAVIRT_IRQ .pv_irq_ops = pv_irq_ops, +#endif +#ifdef CONFIG_PARAVIRT_APIC .pv_apic_ops = pv_apic_ops, +#endif +#ifdef CONFIG_PARAVIRT_MMU .pv_mmu_ops = pv_mmu_ops, +#endif #ifdef CONFIG_PARAVIRT_SPINLOCKS .pv_lock_ops = pv_lock_ops, #endif @@ -185,6 +195,7 @@ unsigned paravirt_patch_insns(void *insnbuf, unsigned len, return insn_len; } +#ifdef CONFIG_PARAVIRT_MMU static void native_flush_tlb(void) { __native_flush_tlb(); @@ -203,6 +214,7 @@ static void native_flush_tlb_single(unsigned long addr) { __native_flush_tlb_single(addr); } +#endif /* CONFIG_PARAVIRT_MMU */ /* These are in entry.S */ extern void native_iret(void); @@ 
-284,6 +296,7 @@ enum paravirt_lazy_mode paravirt_get_lazy_mode(void) return percpu_read(paravirt_lazy_mode); } +#ifdef CONFIG_PARAVIRT_MMU void arch_flush_lazy_mmu_mode(void) { preempt_disable(); @@ -295,6 +308,7 @@ void arch_flush_lazy_mmu_mode(void) preempt_enable(); } +#endif /* CONFIG_PARAVIRT_MMU */ struct pv_info pv_info = { .name = "bare hardware", @@ -306,11 +320,16 @@ struct pv_info pv_info = { struct pv_init_ops pv_init_ops = { .patch = native_patch, }; +EXPORT_SYMBOL_GPL(pv_info); +#ifdef CONFIG_PARAVIRT_TIME struct pv_time_ops pv_time_ops = { .sched_clock = native_sched_clock, }; +EXPORT_SYMBOL_GPL(pv_time_ops); +#endif +#ifdef CONFIG_PARAVIRT_IRQ struct pv_irq_ops pv_irq_ops = { .save_fl = __PV_IS_CALLEE_SAVE(native_save_fl), .restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl), @@ -322,7 +341,10 @@ struct pv_irq_ops pv_irq_ops = { .adjust_exception_frame = paravirt_nop, #endif }; +EXPORT_SYMBOL (pv_irq_ops); +#endif +#ifdef CONFIG_PARAVIRT_CPU struct pv_cpu_ops pv_cpu_ops = { .cpuid = native_cpuid, .get_debugreg = native_get_debugreg, @@ -383,12 +405,17 @@ struct pv_cpu_ops pv_cpu_ops = { .start_context_switch = paravirt_nop, .end_context_switch = paravirt_nop, }; +EXPORT_SYMBOL (pv_cpu_ops); +#endif +#ifdef CONFIG_PARAVIRT_APIC struct pv_apic_ops pv_apic_ops = { #ifdef CONFIG_X86_LOCAL_APIC .startup_ipi_hook = paravirt_nop, #endif }; +EXPORT_SYMBOL_GPL(pv_apic_ops); +#endif #if defined(CONFIG_X86_32) && !defined(CONFIG_X86_PAE) /* 32-bit pagetable entries */ @@ -398,6 +425,7 @@ struct pv_apic_ops pv_apic_ops = { #define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_64) #endif +#ifdef CONFIG_PARAVIRT_MMU struct pv_mmu_ops pv_mmu_ops = { .read_cr2 = native_read_cr2, @@ -470,10 +498,5 @@ struct pv_mmu_ops pv_mmu_ops = { .set_fixmap = native_set_fixmap, }; - -EXPORT_SYMBOL_GPL(pv_time_ops); -EXPORT_SYMBOL (pv_cpu_ops); EXPORT_SYMBOL (pv_mmu_ops); -EXPORT_SYMBOL_GPL(pv_apic_ops); -EXPORT_SYMBOL_GPL(pv_info); -EXPORT_SYMBOL (pv_irq_ops); +#endif -- 
1.6.0.2
* [PATCH 3/3] Split the KVM pv-ops support by feature 2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf 2009-11-18 0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf 2009-11-18 0:13 ` [PATCH 2/3] Only export selected pv-ops feature structs Alexander Graf @ 2009-11-18 0:13 ` Alexander Graf 2009-11-18 1:33 ` Rusty Russell 2009-11-19 7:42 ` [PATCH 0/3] Split up pv-ops Avi Kivity 2009-12-03 14:52 ` Alexander Graf 4 siblings, 1 reply; 13+ messages in thread From: Alexander Graf @ 2009-11-18 0:13 UTC (permalink / raw) To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin Currently selecting KVM guest support enabled multiple features at once that not everyone necessarily wants to have, namely: - PV MMU - zero io delay - apic detection workaround Let's split them off so we don't drag in the full pv-ops framework just to detect we're running on KVM. That gives us more chances to tweak performance! Signed-off-by: Alexander Graf <agraf@suse.de> --- arch/x86/Kconfig | 29 ++++++++++++++++++++++++++++- arch/x86/kernel/kvm.c | 22 +++++++++++++++------- 2 files changed, 43 insertions(+), 8 deletions(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8c150b6..97d4f92 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -522,11 +522,38 @@ config KVM_CLOCK config KVM_GUEST bool "KVM Guest support" - select PARAVIRT_ALL + select PARAVIRT ---help--- This option enables various optimizations for running under the KVM hypervisor. +config KVM_IODELAY + bool "KVM IO-delay support" + depends on KVM_GUEST + select PARAVIRT_CPU + ---help--- + Usually we wait for PIO access to complete. When inside KVM there's + no need to do that, as we know that we're not going through a bus, + but process PIO requests instantly. + + This option disables PIO waits, but drags in CPU-bound pv-ops. Thus + you will probably get more speed loss than speedup using this option. + + If in doubt, say N. 
+ +config KVM_MMU + bool "KVM PV MMU support" + depends on KVM_GUEST + select PARAVIRT_MMU + ---help--- + This option enables the paravirtualized MMU for KVM. In most cases + it's pretty useless and shouldn't be used. + + It will only cost you performance, because it drags in pv-ops for + memory management. + + If in doubt, say N. + source "arch/x86/lguest/Kconfig" config PARAVIRT diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 63b0ec8..7e0207f 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -29,6 +29,16 @@ #include <linux/hardirq.h> #include <asm/timer.h> +#ifdef CONFIG_KVM_IODELAY +/* + * No need for any "IO delay" on KVM + */ +static void kvm_io_delay(void) +{ +} +#endif /* CONFIG_KVM_IODELAY */ + +#ifdef CONFIG_KVM_MMU #define MMU_QUEUE_SIZE 1024 struct kvm_para_state { @@ -43,13 +53,6 @@ static struct kvm_para_state *kvm_para_state(void) return &per_cpu(para_state, raw_smp_processor_id()); } -/* - * No need for any "IO delay" on KVM - */ -static void kvm_io_delay(void) -{ -} - static void kvm_mmu_op(void *buffer, unsigned len) { int r; @@ -194,15 +197,19 @@ static void kvm_leave_lazy_mmu(void) mmu_queue_flush(state); paravirt_leave_lazy_mmu(); } +#endif /* CONFIG_KVM_MMU */ static void __init paravirt_ops_setup(void) { pv_info.name = "KVM"; pv_info.paravirt_enabled = 1; +#ifdef CONFIG_KVM_IODELAY if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY)) pv_cpu_ops.io_delay = kvm_io_delay; +#endif +#ifdef CONFIG_KVM_MMU if (kvm_para_has_feature(KVM_FEATURE_MMU_OP)) { pv_mmu_ops.set_pte = kvm_set_pte; pv_mmu_ops.set_pte_at = kvm_set_pte_at; @@ -226,6 +233,7 @@ static void __init paravirt_ops_setup(void) pv_mmu_ops.lazy_mode.enter = kvm_enter_lazy_mmu; pv_mmu_ops.lazy_mode.leave = kvm_leave_lazy_mmu; } +#endif /* CONFIG_KVM_MMU */ #ifdef CONFIG_X86_IO_APIC no_timer_check = 1; #endif -- 1.6.0.2 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] Split the KVM pv-ops support by feature
  2009-11-18  0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf
@ 2009-11-18  1:33   ` Rusty Russell
  2009-11-18  1:37     ` Alexander Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Rusty Russell @ 2009-11-18 1:33 UTC (permalink / raw)
To: virtualization
Cc: Alexander Graf, kvm list, Nick Piggin, Glauber Costa, Avi Kivity,
	Jeremy Fitzhardinge

On Wed, 18 Nov 2009 10:43:12 am Alexander Graf wrote:
> Currently, selecting KVM guest support enables multiple features at once
> that not everyone necessarily wants to have, namely:

These patches make perfect sense, but please make sure Jeremy Fitzhardinge
(CC'd) is in the loop, as he split the structs in the first place.

Thanks!
Rusty.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] Split the KVM pv-ops support by feature
  2009-11-18  1:33 ` Rusty Russell
@ 2009-11-18  1:37   ` Alexander Graf
  0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2009-11-18 1:37 UTC (permalink / raw)
To: Rusty Russell
Cc: virtualization, kvm list, Nick Piggin, Glauber Costa, Avi Kivity,
	Jeremy Fitzhardinge

On 18.11.2009, at 02:33, Rusty Russell wrote:

> On Wed, 18 Nov 2009 10:43:12 am Alexander Graf wrote:
>> Currently, selecting KVM guest support enables multiple features at
>> once that not everyone necessarily wants to have, namely:
>
> These patches make perfect sense, but please make sure Jeremy
> Fitzhardinge (CC'd) is in the loop, as he split the structs in the
> first place.

Oh sure, if you know of other people I should CC, please tell me their
mail addresses too :-).

Jeremy, did you get the patchset? I can send it to you if you're not
subscribed to virtualization@.

Alex

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [PATCH 0/3] Split up pv-ops
  2009-11-18  0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
                   ` (2 preceding siblings ...)
  2009-11-18  0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf
@ 2009-11-19  7:42 ` Avi Kivity
  2009-12-03 14:52 ` Alexander Graf
  4 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2009-11-19 7:42 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvm list, virtualization, Glauber Costa, Nick Piggin

On 11/18/2009 02:13 AM, Alexander Graf wrote:
> Paravirt ops is currently only capable of either replacing a lot of Linux
> internal code or none at all. There are users that don't need all of the
> possibilities pv-ops delivers, though.
>
> On KVM for example we're perfectly fine not using the PV MMU, thus not
> touching any MMU code. That way we don't have to improve pv-ops to become
> fast, we just don't compile the MMU parts in!
>
> This patchset splits pv-ops into several smaller config options split by
> feature category and then converts the KVM pv-ops code to use only the
> bits that are required, lowering overhead.
>
> Alexander Graf (3):
>   Split paravirt ops by functionality
>   Only export selected pv-ops feature structs
>   Split the KVM pv-ops support by feature

The whole thing looks good to me.  Let's wait for Jeremy to ack though.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [PATCH 0/3] Split up pv-ops
  2009-11-18  0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
                   ` (3 preceding siblings ...)
  2009-11-19  7:42 ` [PATCH 0/3] Split up pv-ops Avi Kivity
@ 2009-12-03 14:52 ` Alexander Graf
  2009-12-03 15:00   ` Avi Kivity
  4 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2009-12-03 14:52 UTC (permalink / raw)
To: kvm list; +Cc: Nick Piggin, Glauber Costa, Avi Kivity, virtualization, rusty

Alexander Graf wrote:
> Paravirt ops is currently only capable of either replacing a lot of Linux
> internal code or none at all. There are users that don't need all of the
> possibilities pv-ops delivers, though.
>
> On KVM for example we're perfectly fine not using the PV MMU, thus not
> touching any MMU code. That way we don't have to improve pv-ops to become
> fast, we just don't compile the MMU parts in!
>
> This patchset splits pv-ops into several smaller config options split by
> feature category and then converts the KVM pv-ops code to use only the
> bits that are required, lowering overhead.

So has this ended up in some tree yet?

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [PATCH 0/3] Split up pv-ops
  2009-12-03 14:52 ` Alexander Graf
@ 2009-12-03 15:00   ` Avi Kivity
  2009-12-03 15:04     ` Alexander Graf
  0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2009-12-03 15:00 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm list, Nick Piggin, Glauber Costa, virtualization, rusty

On 12/03/2009 04:52 PM, Alexander Graf wrote:
> Alexander Graf wrote:
>> Paravirt ops is currently only capable of either replacing a lot of Linux
>> internal code or none at all. There are users that don't need all of the
>> possibilities pv-ops delivers, though.
>>
>> On KVM for example we're perfectly fine not using the PV MMU, thus not
>> touching any MMU code. That way we don't have to improve pv-ops to become
>> fast, we just don't compile the MMU parts in!
>>
>> This patchset splits pv-ops into several smaller config options split by
>> feature category and then converts the KVM pv-ops code to use only the
>> bits that are required, lowering overhead.
>>
> So has this ended up in some tree yet?

Don't think so.  I suggest you copy lkml and Ingo.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [PATCH 0/3] Split up pv-ops
  2009-12-03 15:00 ` Avi Kivity
@ 2009-12-03 15:04   ` Alexander Graf
  2009-12-03 15:07     ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2009-12-03 15:04 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm list, Nick Piggin, Glauber Costa, virtualization, rusty

Avi Kivity wrote:
> On 12/03/2009 04:52 PM, Alexander Graf wrote:
>> Alexander Graf wrote:
>>> Paravirt ops is currently only capable of either replacing a lot of
>>> Linux internal code or none at all. There are users that don't need
>>> all of the possibilities pv-ops delivers, though.
>>>
>>> On KVM for example we're perfectly fine not using the PV MMU, thus
>>> not touching any MMU code. That way we don't have to improve pv-ops
>>> to become fast, we just don't compile the MMU parts in!
>>>
>>> This patchset splits pv-ops into several smaller config options
>>> split by feature category and then converts the KVM pv-ops code to
>>> use only the bits that are required, lowering overhead.
>>>
>> So has this ended up in some tree yet?
>
> Don't think so.  I suggest you copy lkml and Ingo.

Sending off the complete set again? Rebased against what?

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [PATCH 0/3] Split up pv-ops
  2009-12-03 15:04 ` Alexander Graf
@ 2009-12-03 15:07   ` Avi Kivity
  0 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2009-12-03 15:07 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm list, Nick Piggin, Glauber Costa, virtualization, rusty

On 12/03/2009 05:04 PM, Alexander Graf wrote:
>> Don't think so.  I suggest you copy lkml and Ingo.
>>
> Sending off the complete set again?

Yes.

> Rebased against what?
>
tip's x86/paravirt seems like a good choice (though only one patch is in
there at present).

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 13+ messages in thread
end of thread, other threads:[~2009-12-03 15:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-18  0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
2009-11-18  0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf
2009-11-19 14:59   ` Jeremy Fitzhardinge
2009-11-19 15:21     ` Alexander Graf
2009-11-18  0:13 ` [PATCH 2/3] Only export selected pv-ops feature structs Alexander Graf
2009-11-18  0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf
2009-11-18  1:33   ` Rusty Russell
2009-11-18  1:37     ` Alexander Graf
2009-11-19  7:42 ` [PATCH 0/3] Split up pv-ops Avi Kivity
2009-12-03 14:52 ` Alexander Graf
2009-12-03 15:00   ` Avi Kivity
2009-12-03 15:04     ` Alexander Graf
2009-12-03 15:07       ` Avi Kivity