* [PATCH 0/3] Split up pv-ops
@ 2009-11-18 0:13 Alexander Graf
2009-11-18 0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Alexander Graf @ 2009-11-18 0:13 UTC (permalink / raw)
To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin
Paravirt ops is currently an all-or-nothing choice: it either replaces a lot of
Linux-internal code or none at all. There are users, though, that don't need all
of the possibilities pv-ops delivers.
On KVM, for example, we're perfectly fine not using the PV MMU and thus not
touching any MMU code. That way we don't have to make pv-ops faster there,
we just don't compile the MMU parts in!
This patchset splits pv-ops into several smaller config options, divided by
feature category, and then converts the KVM pv-ops code to use only the
bits that are required, lowering overhead.
Alexander Graf (3):
Split paravirt ops by functionality
Only export selected pv-ops feature structs
Split the KVM pv-ops support by feature
arch/x86/Kconfig | 72 +++++++++++++++++++++++++-
arch/x86/include/asm/apic.h | 2 +-
arch/x86/include/asm/desc.h | 4 +-
arch/x86/include/asm/fixmap.h | 2 +-
arch/x86/include/asm/highmem.h | 2 +-
arch/x86/include/asm/io_32.h | 4 +-
arch/x86/include/asm/io_64.h | 2 +-
arch/x86/include/asm/irqflags.h | 21 ++++++--
arch/x86/include/asm/mmu_context.h | 4 +-
arch/x86/include/asm/msr.h | 4 +-
arch/x86/include/asm/paravirt.h | 44 ++++++++++++++++-
arch/x86/include/asm/paravirt_types.h | 12 +++++
arch/x86/include/asm/pgalloc.h | 2 +-
arch/x86/include/asm/pgtable-3level_types.h | 2 +-
arch/x86/include/asm/pgtable.h | 2 +-
arch/x86/include/asm/processor.h | 2 +-
arch/x86/include/asm/required-features.h | 2 +-
arch/x86/include/asm/smp.h | 2 +-
arch/x86/include/asm/system.h | 13 +++--
arch/x86/include/asm/tlbflush.h | 4 +-
arch/x86/kernel/head_64.S | 2 +-
arch/x86/kernel/kvm.c | 22 ++++++---
arch/x86/kernel/paravirt.c | 37 +++++++++++--
arch/x86/kernel/tsc.c | 2 +-
arch/x86/kernel/vsmp_64.c | 2 +-
arch/x86/xen/Kconfig | 2 +-
26 files changed, 219 insertions(+), 50 deletions(-)
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/3] Split paravirt ops by functionality
2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
@ 2009-11-18 0:13 ` Alexander Graf
2009-11-19 14:59 ` Jeremy Fitzhardinge
2009-11-18 0:13 ` [PATCH 2/3] Only export selected pv-ops feature structs Alexander Graf
` (3 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2009-11-18 0:13 UTC (permalink / raw)
To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin
Currently paravirt ops is an all-or-nothing option: we either use pv-ops for
CPU, MMU, timing, etc., or not at all.
There are use cases where we don't need the full feature set, but only a small
chunk of it. KVM is a pretty prominent example of this.
So let's make everything a bit more fine-grained. We already have a split by
function groups, namely "cpu", "mmu", "time", "irq", "apic" and "spinlock".
Taking that existing split and extending it so that only the PV-capable bits
are compiled in sounded like a natural fit. That way we don't take performance
hits in MMU code just because we use the KVM PV clock, which only needs the
TIME parts of pv-ops.
We define a new CONFIG_PARAVIRT_ALL option that basically does the same thing
CONFIG_PARAVIRT did before this split. We move all users of CONFIG_PARAVIRT
to CONFIG_PARAVIRT_ALL, so they behave the same way they did before.
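Condensed from the Kconfig hunk below (defaults and comments omitted), the resulting select chain looks like this: every fine-grained option pulls in the core PARAVIRT plumbing, and PARAVIRT_ALL fans out to every group for legacy users.

```kconfig
config PARAVIRT_TIME
	bool
	select PARAVIRT

config PARAVIRT_ALL
	bool
	select PARAVIRT_CPU
	select PARAVIRT_TIME
	select PARAVIRT_IRQ
	select PARAVIRT_APIC
	select PARAVIRT_MMU

config PARAVIRT_CLOCK
	bool
	select PARAVIRT_TIME
```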
So here it is - the splitting! I would have made the patch smaller, but this
was the closest I could get to atomic (for bisect) while staying sane.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/x86/Kconfig | 47 ++++++++++++++++++++++++--
arch/x86/include/asm/apic.h | 2 +-
arch/x86/include/asm/desc.h | 4 +-
arch/x86/include/asm/fixmap.h | 2 +-
arch/x86/include/asm/highmem.h | 2 +-
arch/x86/include/asm/io_32.h | 4 ++-
arch/x86/include/asm/io_64.h | 2 +-
arch/x86/include/asm/irqflags.h | 21 +++++++++---
arch/x86/include/asm/mmu_context.h | 4 +-
arch/x86/include/asm/msr.h | 4 +-
arch/x86/include/asm/paravirt.h | 44 ++++++++++++++++++++++++-
arch/x86/include/asm/paravirt_types.h | 12 +++++++
arch/x86/include/asm/pgalloc.h | 2 +-
arch/x86/include/asm/pgtable-3level_types.h | 2 +-
arch/x86/include/asm/pgtable.h | 2 +-
arch/x86/include/asm/processor.h | 2 +-
arch/x86/include/asm/required-features.h | 2 +-
arch/x86/include/asm/smp.h | 2 +-
arch/x86/include/asm/system.h | 13 +++++--
arch/x86/include/asm/tlbflush.h | 4 +-
arch/x86/kernel/head_64.S | 2 +-
arch/x86/kernel/paravirt.c | 2 +
arch/x86/kernel/tsc.c | 2 +-
arch/x86/kernel/vsmp_64.c | 2 +-
arch/x86/xen/Kconfig | 2 +-
25 files changed, 149 insertions(+), 38 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0c7b699..8c150b6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -350,7 +350,7 @@ endif
config X86_VSMP
bool "ScaleMP vSMP"
- select PARAVIRT
+ select PARAVIRT_ALL
depends on X86_64 && PCI
depends on X86_EXTENDED_PLATFORM
---help---
@@ -493,7 +493,7 @@ source "arch/x86/xen/Kconfig"
config VMI
bool "VMI Guest support (DEPRECATED)"
- select PARAVIRT
+ select PARAVIRT_ALL
depends on X86_32
---help---
VMI provides a paravirtualized interface to the VMware ESX server
@@ -512,7 +512,6 @@ config VMI
config KVM_CLOCK
bool "KVM paravirtualized clock"
- select PARAVIRT
select PARAVIRT_CLOCK
---help---
Turning on this option will allow you to run a paravirtualized clock
@@ -523,7 +522,7 @@ config KVM_CLOCK
config KVM_GUEST
bool "KVM Guest support"
- select PARAVIRT
+ select PARAVIRT_ALL
---help---
This option enables various optimizations for running under the KVM
hypervisor.
@@ -551,8 +550,48 @@ config PARAVIRT_SPINLOCKS
If you are unsure how to answer this question, answer N.
+config PARAVIRT_CPU
+ bool
+ select PARAVIRT
+ default n
+
+config PARAVIRT_TIME
+ bool
+ select PARAVIRT
+ default n
+
+config PARAVIRT_IRQ
+ bool
+ select PARAVIRT
+ default n
+
+config PARAVIRT_APIC
+ bool
+ select PARAVIRT
+ default n
+
+config PARAVIRT_MMU
+ bool
+ select PARAVIRT
+ default n
+
+#
+# This is a placeholder to activate the old "include all pv-ops functionality"
+# behavior. If you're using this I'd recommend looking through your code to see
+# if you can be more specific. It probably saves you a few cycles!
+#
+config PARAVIRT_ALL
+ bool
+ select PARAVIRT_CPU
+ select PARAVIRT_TIME
+ select PARAVIRT_IRQ
+ select PARAVIRT_APIC
+ select PARAVIRT_MMU
+ default n
+
config PARAVIRT_CLOCK
bool
+ select PARAVIRT_TIME
default n
endif
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 474d80d..b54c24a 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -81,7 +81,7 @@ static inline bool apic_from_smp_config(void)
/*
* Basic functions accessing APICs.
*/
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_APIC
#include <asm/paravirt.h>
#endif
diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index e8de2f6..cf65891 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -78,7 +78,7 @@ static inline int desc_empty(const void *ptr)
return !(desc[0] | desc[1]);
}
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_CPU
#include <asm/paravirt.h>
#else
#define load_TR_desc() native_load_tr_desc()
@@ -108,7 +108,7 @@ static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries)
{
}
-#endif /* CONFIG_PARAVIRT */
+#endif /* CONFIG_PARAVIRT_CPU */
#define store_ldt(ldt) asm("sldt %0" : "=m"(ldt))
diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 14f9890..5f29317 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -156,7 +156,7 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
void native_set_fixmap(enum fixed_addresses idx,
phys_addr_t phys, pgprot_t flags);
-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
static inline void __set_fixmap(enum fixed_addresses idx,
phys_addr_t phys, pgprot_t flags)
{
diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h
index 014c2b8..458d785 100644
--- a/arch/x86/include/asm/highmem.h
+++ b/arch/x86/include/asm/highmem.h
@@ -66,7 +66,7 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot);
struct page *kmap_atomic_to_page(void *ptr);
-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
#define kmap_atomic_pte(page, type) kmap_atomic(page, type)
#endif
diff --git a/arch/x86/include/asm/io_32.h b/arch/x86/include/asm/io_32.h
index a299900..a263c6f 100644
--- a/arch/x86/include/asm/io_32.h
+++ b/arch/x86/include/asm/io_32.h
@@ -109,7 +109,9 @@ extern void io_delay_init(void);
#if defined(CONFIG_PARAVIRT)
#include <asm/paravirt.h>
-#else
+#endif
+
+#ifndef CONFIG_PARAVIRT_CPU
static inline void slow_down_io(void)
{
diff --git a/arch/x86/include/asm/io_64.h b/arch/x86/include/asm/io_64.h
index 2440678..82c6eae 100644
--- a/arch/x86/include/asm/io_64.h
+++ b/arch/x86/include/asm/io_64.h
@@ -40,7 +40,7 @@ extern void native_io_delay(void);
extern int io_delay_type;
extern void io_delay_init(void);
-#if defined(CONFIG_PARAVIRT)
+#if defined(CONFIG_PARAVIRT_CPU)
#include <asm/paravirt.h>
#else
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 9e2b952..b8d8f4c 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -58,9 +58,11 @@ static inline void native_halt(void)
#ifdef CONFIG_PARAVIRT
#include <asm/paravirt.h>
-#else
+#endif
+
#ifndef __ASSEMBLY__
+#ifndef CONFIG_PARAVIRT_IRQ
static inline unsigned long __raw_local_save_flags(void)
{
return native_save_fl();
@@ -110,12 +112,17 @@ static inline unsigned long __raw_local_irq_save(void)
return flags;
}
-#else
+#endif /* CONFIG_PARAVIRT_IRQ */
+
+#else /* __ASSEMBLY__ */
+#ifndef CONFIG_PARAVIRT_IRQ
#define ENABLE_INTERRUPTS(x) sti
#define DISABLE_INTERRUPTS(x) cli
+#endif /* !CONFIG_PARAVIRT_IRQ */
#ifdef CONFIG_X86_64
+#ifndef CONFIG_PARAVIRT_CPU
#define SWAPGS swapgs
/*
* Currently paravirt can't handle swapgs nicely when we
@@ -128,8 +135,6 @@ static inline unsigned long __raw_local_irq_save(void)
*/
#define SWAPGS_UNSAFE_STACK swapgs
-#define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */
-
#define INTERRUPT_RETURN iretq
#define USERGS_SYSRET64 \
swapgs; \
@@ -141,16 +146,22 @@ static inline unsigned long __raw_local_irq_save(void)
swapgs; \
sti; \
sysexit
+#endif /* !CONFIG_PARAVIRT_CPU */
+
+#ifndef CONFIG_PARAVIRT_IRQ
+#define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */
+#endif /* !CONFIG_PARAVIRT_IRQ */
#else
+#ifndef CONFIG_PARAVIRT_CPU
#define INTERRUPT_RETURN iret
#define ENABLE_INTERRUPTS_SYSEXIT sti; sysexit
#define GET_CR0_INTO_EAX movl %cr0, %eax
+#endif /* !CONFIG_PARAVIRT_CPU */
#endif
#endif /* __ASSEMBLY__ */
-#endif /* CONFIG_PARAVIRT */
#ifndef __ASSEMBLY__
#define raw_local_save_flags(flags) \
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 4a2d4e0..a209e67 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -6,14 +6,14 @@
#include <asm/pgalloc.h>
#include <asm/tlbflush.h>
#include <asm/paravirt.h>
-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
#include <asm-generic/mm_hooks.h>
static inline void paravirt_activate_mm(struct mm_struct *prev,
struct mm_struct *next)
{
}
-#endif /* !CONFIG_PARAVIRT */
+#endif /* !CONFIG_PARAVIRT_MMU */
/*
* Used for LDT copy/destruction.
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 7e2b6ba..80ec5a5 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -123,7 +123,7 @@ static inline unsigned long long native_read_pmc(int counter)
return EAX_EDX_VAL(val, low, high);
}
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_CPU
#include <asm/paravirt.h>
#else
#include <linux/errno.h>
@@ -234,7 +234,7 @@ do { \
#define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))
-#endif /* !CONFIG_PARAVIRT */
+#endif /* !CONFIG_PARAVIRT_CPU */
#define checking_wrmsrl(msr, val) wrmsr_safe((msr), (u32)(val), \
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index efb3899..e543098 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -18,6 +18,7 @@ static inline int paravirt_enabled(void)
return pv_info.paravirt_enabled;
}
+#ifdef CONFIG_PARAVIRT_CPU
static inline void load_sp0(struct tss_struct *tss,
struct thread_struct *thread)
{
@@ -58,7 +59,9 @@ static inline void write_cr0(unsigned long x)
{
PVOP_VCALL1(pv_cpu_ops.write_cr0, x);
}
+#endif /* CONFIG_PARAVIRT_CPU */
+#ifdef CONFIG_PARAVIRT_MMU
static inline unsigned long read_cr2(void)
{
return PVOP_CALL0(unsigned long, pv_mmu_ops.read_cr2);
@@ -78,7 +81,9 @@ static inline void write_cr3(unsigned long x)
{
PVOP_VCALL1(pv_mmu_ops.write_cr3, x);
}
+#endif /* CONFIG_PARAVIRT_MMU */
+#ifdef CONFIG_PARAVIRT_CPU
static inline unsigned long read_cr4(void)
{
return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr4);
@@ -92,8 +97,9 @@ static inline void write_cr4(unsigned long x)
{
PVOP_VCALL1(pv_cpu_ops.write_cr4, x);
}
+#endif /* CONFIG_PARAVIRT_CPU */
-#ifdef CONFIG_X86_64
+#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_CPU)
static inline unsigned long read_cr8(void)
{
return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr8);
@@ -105,6 +111,7 @@ static inline void write_cr8(unsigned long x)
}
#endif
+#ifdef CONFIG_PARAVIRT_IRQ
static inline void raw_safe_halt(void)
{
PVOP_VCALL0(pv_irq_ops.safe_halt);
@@ -114,14 +121,18 @@ static inline void halt(void)
{
PVOP_VCALL0(pv_irq_ops.safe_halt);
}
+#endif /* CONFIG_PARAVIRT_IRQ */
+#ifdef CONFIG_PARAVIRT_CPU
static inline void wbinvd(void)
{
PVOP_VCALL0(pv_cpu_ops.wbinvd);
}
+#endif
#define get_kernel_rpl() (pv_info.kernel_rpl)
+#ifdef CONFIG_PARAVIRT_CPU
static inline u64 paravirt_read_msr(unsigned msr, int *err)
{
return PVOP_CALL2(u64, pv_cpu_ops.read_msr, msr, err);
@@ -224,12 +235,16 @@ do { \
} while (0)
#define rdtscll(val) (val = paravirt_read_tsc())
+#endif /* CONFIG_PARAVIRT_CPU */
+#ifdef CONFIG_PARAVIRT_TIME
static inline unsigned long long paravirt_sched_clock(void)
{
return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
}
+#endif /* CONFIG_PARAVIRT_TIME */
+#ifdef CONFIG_PARAVIRT_CPU
static inline unsigned long long paravirt_read_pmc(int counter)
{
return PVOP_CALL1(u64, pv_cpu_ops.read_pmc, counter);
@@ -345,8 +360,9 @@ static inline void slow_down_io(void)
pv_cpu_ops.io_delay();
#endif
}
+#endif /* CONFIG_PARAVIRT_CPU */
-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_APIC)
static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip,
unsigned long start_esp)
{
@@ -355,6 +371,7 @@ static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip,
}
#endif
+#ifdef CONFIG_PARAVIRT_MMU
static inline void paravirt_activate_mm(struct mm_struct *prev,
struct mm_struct *next)
{
@@ -698,7 +715,9 @@ static inline void pmd_clear(pmd_t *pmdp)
set_pmd(pmdp, __pmd(0));
}
#endif /* CONFIG_X86_PAE */
+#endif /* CONFIG_PARAVIRT_MMU */
+#ifdef CONFIG_PARAVIRT_CPU
#define __HAVE_ARCH_START_CONTEXT_SWITCH
static inline void arch_start_context_switch(struct task_struct *prev)
{
@@ -709,7 +728,9 @@ static inline void arch_end_context_switch(struct task_struct *next)
{
PVOP_VCALL1(pv_cpu_ops.end_context_switch, next);
}
+#endif /* CONFIG_PARAVIRT_CPU */
+#ifdef CONFIG_PARAVIRT_MMU
#define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
static inline void arch_enter_lazy_mmu_mode(void)
{
@@ -728,6 +749,7 @@ static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
{
pv_mmu_ops.set_fixmap(idx, phys, flags);
}
+#endif /* CONFIG_PARAVIRT_MMU */
#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS)
@@ -838,6 +860,7 @@ static __always_inline void __raw_spin_unlock(struct raw_spinlock *lock)
#define __PV_IS_CALLEE_SAVE(func) \
((struct paravirt_callee_save) { func })
+#ifdef CONFIG_PARAVIRT_IRQ
static inline unsigned long __raw_local_save_flags(void)
{
return PVOP_CALLEE0(unsigned long, pv_irq_ops.save_fl);
@@ -866,6 +889,7 @@ static inline unsigned long __raw_local_irq_save(void)
raw_local_irq_disable();
return f;
}
+#endif /* CONFIG_PARAVIRT_IRQ */
/* Make sure as little as possible of this mess escapes. */
@@ -948,10 +972,13 @@ extern void default_banner(void);
#define PARA_INDIRECT(addr) *%cs:addr
#endif
+#ifdef CONFIG_PARAVIRT_CPU
#define INTERRUPT_RETURN \
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE, \
jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_iret))
+#endif /* CONFIG_PARAVIRT_CPU */
+#ifdef CONFIG_PARAVIRT_IRQ
#define DISABLE_INTERRUPTS(clobbers) \
PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers, \
PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \
@@ -963,13 +990,17 @@ extern void default_banner(void);
PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \
call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_enable); \
PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)
+#endif /* CONFIG_PARAVIRT_IRQ */
+#ifdef CONFIG_PARAVIRT_CPU
#define USERGS_SYSRET32 \
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret32), \
CLBR_NONE, \
jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret32))
+#endif /* CONFIG_PARAVIRT_CPU */
#ifdef CONFIG_X86_32
+#ifdef CONFIG_PARAVIRT_CPU
#define GET_CR0_INTO_EAX \
push %ecx; push %edx; \
call PARA_INDIRECT(pv_cpu_ops+PV_CPU_read_cr0); \
@@ -979,10 +1010,12 @@ extern void default_banner(void);
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), \
CLBR_NONE, \
jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit))
+#endif /* CONFIG_PARAVIRT_CPU */
#else /* !CONFIG_X86_32 */
+#ifdef CONFIG_PARAVIRT_CPU
/*
* If swapgs is used while the userspace stack is still current,
* there's no way to call a pvop. The PV replacement *must* be
@@ -1002,17 +1035,23 @@ extern void default_banner(void);
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE, \
call PARA_INDIRECT(pv_cpu_ops+PV_CPU_swapgs) \
)
+#endif /* CONFIG_PARAVIRT_CPU */
+#ifdef CONFIG_PARAVIRT_MMU
#define GET_CR2_INTO_RCX \
call PARA_INDIRECT(pv_mmu_ops+PV_MMU_read_cr2); \
movq %rax, %rcx; \
xorq %rax, %rax;
+#endif /* CONFIG_PARAVIRT_MMU */
+#ifdef CONFIG_PARAVIRT_IRQ
#define PARAVIRT_ADJUST_EXCEPTION_FRAME \
PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_adjust_exception_frame), \
CLBR_NONE, \
call PARA_INDIRECT(pv_irq_ops+PV_IRQ_adjust_exception_frame))
+#endif /* CONFIG_PARAVIRT_IRQ */
+#ifdef CONFIG_PARAVIRT_CPU
#define USERGS_SYSRET64 \
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64), \
CLBR_NONE, \
@@ -1022,6 +1061,7 @@ extern void default_banner(void);
PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), \
CLBR_NONE, \
jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit))
+#endif /* CONFIG_PARAVIRT_CPU */
#endif /* CONFIG_X86_32 */
#endif /* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 9357473..e190450 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -343,12 +343,24 @@ struct paravirt_patch_template {
extern struct pv_info pv_info;
extern struct pv_init_ops pv_init_ops;
+#ifdef CONFIG_PARAVIRT_TIME
extern struct pv_time_ops pv_time_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_CPU
extern struct pv_cpu_ops pv_cpu_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_IRQ
extern struct pv_irq_ops pv_irq_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_APIC
extern struct pv_apic_ops pv_apic_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_MMU
extern struct pv_mmu_ops pv_mmu_ops;
+#endif
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
extern struct pv_lock_ops pv_lock_ops;
+#endif
#define PARAVIRT_PATCH(x) \
(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index 0e8c2a0..94cce3d 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -7,7 +7,7 @@
static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
#include <asm/paravirt.h>
#else
#define paravirt_pgd_alloc(mm) __paravirt_pgd_alloc(mm)
diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h
index 1bd5876..be58e74 100644
--- a/arch/x86/include/asm/pgtable-3level_types.h
+++ b/arch/x86/include/asm/pgtable-3level_types.h
@@ -18,7 +18,7 @@ typedef union {
} pte_t;
#endif /* !__ASSEMBLY__ */
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
#define SHARED_KERNEL_PMD (pv_info.shared_kernel_pmd)
#else
#define SHARED_KERNEL_PMD 1
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index af6fd36..b68edfc 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -26,7 +26,7 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
extern spinlock_t pgd_lock;
extern struct list_head pgd_list;
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
#include <asm/paravirt.h>
#else /* !CONFIG_PARAVIRT */
#define set_pte(ptep, pte) native_set_pte(ptep, pte)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c3429e8..a42a807 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -571,7 +571,7 @@ static inline void native_swapgs(void)
#endif
}
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_CPU
#include <asm/paravirt.h>
#else
#define __cpuid native_cpuid
diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
index 64cf2d2..f68edf2 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -48,7 +48,7 @@
#endif
#ifdef CONFIG_X86_64
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
/* Paravirtualized systems may not have PSE or PGE available */
#define NEED_PSE 0
#define NEED_PGE 0
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 1e79678..fdd889a 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -66,7 +66,7 @@ struct smp_ops {
extern void set_cpu_sibling_map(int cpu);
#ifdef CONFIG_SMP
-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_APIC
#define startup_ipi_hook(phys_apicid, start_eip, start_esp) do { } while (0)
#endif
extern struct smp_ops smp_ops;
diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h
index f08f973..63ca93c 100644
--- a/arch/x86/include/asm/system.h
+++ b/arch/x86/include/asm/system.h
@@ -302,13 +302,18 @@ static inline void native_wbinvd(void)
#ifdef CONFIG_PARAVIRT
#include <asm/paravirt.h>
-#else
-#define read_cr0() (native_read_cr0())
-#define write_cr0(x) (native_write_cr0(x))
+#endif/* CONFIG_PARAVIRT */
+
+#ifndef CONFIG_PARAVIRT_MMU
#define read_cr2() (native_read_cr2())
#define write_cr2(x) (native_write_cr2(x))
#define read_cr3() (native_read_cr3())
#define write_cr3(x) (native_write_cr3(x))
+#endif /* CONFIG_PARAVIRT_MMU */
+
+#ifndef CONFIG_PARAVIRT_CPU
+#define read_cr0() (native_read_cr0())
+#define write_cr0(x) (native_write_cr0(x))
#define read_cr4() (native_read_cr4())
#define read_cr4_safe() (native_read_cr4_safe())
#define write_cr4(x) (native_write_cr4(x))
@@ -322,7 +327,7 @@ static inline void native_wbinvd(void)
/* Clear the 'TS' bit */
#define clts() (native_clts())
-#endif/* CONFIG_PARAVIRT */
+#endif /* CONFIG_PARAVIRT_CPU */
#define stts() write_cr0(read_cr0() | X86_CR0_TS)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 7f3eba0..89e055c 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -7,7 +7,7 @@
#include <asm/processor.h>
#include <asm/system.h>
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
#include <asm/paravirt.h>
#else
#define __flush_tlb() __native_flush_tlb()
@@ -162,7 +162,7 @@ static inline void reset_lazy_tlbstate(void)
#endif /* SMP */
-#ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PARAVIRT_MMU
#define flush_tlb_others(mask, mm, va) native_flush_tlb_others(mask, mm, va)
#endif
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 780cd92..1284d8d 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -20,7 +20,7 @@
#include <asm/processor-flags.h>
#include <asm/percpu.h>
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_MMU
#include <asm/asm-offsets.h>
#include <asm/paravirt.h>
#else
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1b1739d..c8530bd 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -155,12 +155,14 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
else if (opfunc == _paravirt_ident_64)
ret = paravirt_patch_ident_64(insnbuf, len);
+#ifdef CONFIG_PARAVIRT_CPU
else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit) ||
type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret32) ||
type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret64))
/* If operation requires a jmp, then jmp */
ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len);
+#endif
else
/* Otherwise call the function; assume target could
clobber any caller-save reg */
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index cd982f4..96aad98 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -66,7 +66,7 @@ u64 native_sched_clock(void)
/* We need to define a real function for sched_clock, to override the
weak default version */
-#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_PARAVIRT_TIME
unsigned long long sched_clock(void)
{
return paravirt_sched_clock();
diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
index a1d804b..23f4612 100644
--- a/arch/x86/kernel/vsmp_64.c
+++ b/arch/x86/kernel/vsmp_64.c
@@ -22,7 +22,7 @@
#include <asm/paravirt.h>
#include <asm/setup.h>
-#if defined CONFIG_PCI && defined CONFIG_PARAVIRT
+#if defined CONFIG_PCI && defined CONFIG_PARAVIRT_IRQ
/*
* Interrupt control on vSMPowered systems:
* ~AC is a shadow of IF. If IF is 'on' AC should be 'off'
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index b83e119..eef41bd 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -4,7 +4,7 @@
config XEN
bool "Xen guest support"
- select PARAVIRT
+ select PARAVIRT_ALL
select PARAVIRT_CLOCK
depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
depends on X86_CMPXCHG && X86_TSC
--
1.6.0.2
* [PATCH 2/3] Only export selected pv-ops feature structs
2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
2009-11-18 0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf
@ 2009-11-18 0:13 ` Alexander Graf
2009-11-18 0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf
` (2 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2009-11-18 0:13 UTC (permalink / raw)
To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin
To be really sure that we're not using any pv-ops code by accident, we should
also not export the structures used to access pv-ops functions.
So let's surround the pv-ops structs by #ifdefs.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/x86/kernel/paravirt.c | 35 +++++++++++++++++++++++++++++------
1 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index c8530bd..0619e7c 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -124,11 +124,21 @@ static void *get_call_destination(u8 type)
{
struct paravirt_patch_template tmpl = {
.pv_init_ops = pv_init_ops,
+#ifdef CONFIG_PARAVIRT_TIME
.pv_time_ops = pv_time_ops,
+#endif
+#ifdef CONFIG_PARAVIRT_CPU
.pv_cpu_ops = pv_cpu_ops,
+#endif
+#ifdef CONFIG_PARAVIRT_IRQ
.pv_irq_ops = pv_irq_ops,
+#endif
+#ifdef CONFIG_PARAVIRT_APIC
.pv_apic_ops = pv_apic_ops,
+#endif
+#ifdef CONFIG_PARAVIRT_MMU
.pv_mmu_ops = pv_mmu_ops,
+#endif
#ifdef CONFIG_PARAVIRT_SPINLOCKS
.pv_lock_ops = pv_lock_ops,
#endif
@@ -185,6 +195,7 @@ unsigned paravirt_patch_insns(void *insnbuf, unsigned len,
return insn_len;
}
+#ifdef CONFIG_PARAVIRT_MMU
static void native_flush_tlb(void)
{
__native_flush_tlb();
@@ -203,6 +214,7 @@ static void native_flush_tlb_single(unsigned long addr)
{
__native_flush_tlb_single(addr);
}
+#endif /* CONFIG_PARAVIRT_MMU */
/* These are in entry.S */
extern void native_iret(void);
@@ -284,6 +296,7 @@ enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
return percpu_read(paravirt_lazy_mode);
}
+#ifdef CONFIG_PARAVIRT_MMU
void arch_flush_lazy_mmu_mode(void)
{
preempt_disable();
@@ -295,6 +308,7 @@ void arch_flush_lazy_mmu_mode(void)
preempt_enable();
}
+#endif /* CONFIG_PARAVIRT_MMU */
struct pv_info pv_info = {
.name = "bare hardware",
@@ -306,11 +320,16 @@ struct pv_info pv_info = {
struct pv_init_ops pv_init_ops = {
.patch = native_patch,
};
+EXPORT_SYMBOL_GPL(pv_info);
+#ifdef CONFIG_PARAVIRT_TIME
struct pv_time_ops pv_time_ops = {
.sched_clock = native_sched_clock,
};
+EXPORT_SYMBOL_GPL(pv_time_ops);
+#endif
+#ifdef CONFIG_PARAVIRT_IRQ
struct pv_irq_ops pv_irq_ops = {
.save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
.restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
@@ -322,7 +341,10 @@ struct pv_irq_ops pv_irq_ops = {
.adjust_exception_frame = paravirt_nop,
#endif
};
+EXPORT_SYMBOL (pv_irq_ops);
+#endif
+#ifdef CONFIG_PARAVIRT_CPU
struct pv_cpu_ops pv_cpu_ops = {
.cpuid = native_cpuid,
.get_debugreg = native_get_debugreg,
@@ -383,12 +405,17 @@ struct pv_cpu_ops pv_cpu_ops = {
.start_context_switch = paravirt_nop,
.end_context_switch = paravirt_nop,
};
+EXPORT_SYMBOL (pv_cpu_ops);
+#endif
+#ifdef CONFIG_PARAVIRT_APIC
struct pv_apic_ops pv_apic_ops = {
#ifdef CONFIG_X86_LOCAL_APIC
.startup_ipi_hook = paravirt_nop,
#endif
};
+EXPORT_SYMBOL_GPL(pv_apic_ops);
+#endif
#if defined(CONFIG_X86_32) && !defined(CONFIG_X86_PAE)
/* 32-bit pagetable entries */
@@ -398,6 +425,7 @@ struct pv_apic_ops pv_apic_ops = {
#define PTE_IDENT __PV_IS_CALLEE_SAVE(_paravirt_ident_64)
#endif
+#ifdef CONFIG_PARAVIRT_MMU
struct pv_mmu_ops pv_mmu_ops = {
.read_cr2 = native_read_cr2,
@@ -470,10 +498,5 @@ struct pv_mmu_ops pv_mmu_ops = {
.set_fixmap = native_set_fixmap,
};
-
-EXPORT_SYMBOL_GPL(pv_time_ops);
-EXPORT_SYMBOL (pv_cpu_ops);
EXPORT_SYMBOL (pv_mmu_ops);
-EXPORT_SYMBOL_GPL(pv_apic_ops);
-EXPORT_SYMBOL_GPL(pv_info);
-EXPORT_SYMBOL (pv_irq_ops);
+#endif
--
1.6.0.2
* [PATCH 3/3] Split the KVM pv-ops support by feature
2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
2009-11-18 0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf
2009-11-18 0:13 ` [PATCH 2/3] Only export selected pv-ops feature structs Alexander Graf
@ 2009-11-18 0:13 ` Alexander Graf
2009-11-18 1:33 ` Rusty Russell
2009-11-19 7:42 ` [PATCH 0/3] Split up pv-ops Avi Kivity
2009-12-03 14:52 ` Alexander Graf
4 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2009-11-18 0:13 UTC (permalink / raw)
To: kvm list; +Cc: virtualization, Glauber Costa, Avi Kivity, Nick Piggin
Currently, selecting KVM guest support enables multiple features at once that
not everyone necessarily wants, namely:
- PV MMU
- zero io delay
- apic detection workaround
Let's split them off so we don't drag in the full pv-ops framework just to
detect that we're running on KVM. That gives us more room to tweak performance!
Signed-off-by: Alexander Graf <agraf@suse.de>
---
arch/x86/Kconfig | 29 ++++++++++++++++++++++++++++-
arch/x86/kernel/kvm.c | 22 +++++++++++++++-------
2 files changed, 43 insertions(+), 8 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8c150b6..97d4f92 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -522,11 +522,38 @@ config KVM_CLOCK
config KVM_GUEST
bool "KVM Guest support"
- select PARAVIRT_ALL
+ select PARAVIRT
---help---
This option enables various optimizations for running under the KVM
hypervisor.
+config KVM_IODELAY
+ bool "KVM IO-delay support"
+ depends on KVM_GUEST
+ select PARAVIRT_CPU
+ ---help---
+ Usually the kernel waits for PIO accesses to complete. Inside KVM
+ there is no need for that: we know we are not going through a real
+ bus, and PIO requests are processed instantly.
+
+ This option disables PIO waits, but drags in CPU-bound pv-ops, so
+ you will probably lose more speed than you gain by enabling it.
+
+ If in doubt, say N.
+
+config KVM_MMU
+ bool "KVM PV MMU support"
+ depends on KVM_GUEST
+ select PARAVIRT_MMU
+ ---help---
+ This option enables the paravirtualized MMU for KVM. In most cases
+ it's pretty useless and shouldn't be used.
+
+ It will only cost you performance, because it drags in pv-ops for
+ memory management.
+
+ If in doubt, say N.
+
source "arch/x86/lguest/Kconfig"
config PARAVIRT
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 63b0ec8..7e0207f 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -29,6 +29,16 @@
#include <linux/hardirq.h>
#include <asm/timer.h>
+#ifdef CONFIG_KVM_IODELAY
+/*
+ * No need for any "IO delay" on KVM
+ */
+static void kvm_io_delay(void)
+{
+}
+#endif /* CONFIG_KVM_IODELAY */
+
+#ifdef CONFIG_KVM_MMU
#define MMU_QUEUE_SIZE 1024
struct kvm_para_state {
@@ -43,13 +53,6 @@ static struct kvm_para_state *kvm_para_state(void)
return &per_cpu(para_state, raw_smp_processor_id());
}
-/*
- * No need for any "IO delay" on KVM
- */
-static void kvm_io_delay(void)
-{
-}
-
static void kvm_mmu_op(void *buffer, unsigned len)
{
int r;
@@ -194,15 +197,19 @@ static void kvm_leave_lazy_mmu(void)
mmu_queue_flush(state);
paravirt_leave_lazy_mmu();
}
+#endif /* CONFIG_KVM_MMU */
static void __init paravirt_ops_setup(void)
{
pv_info.name = "KVM";
pv_info.paravirt_enabled = 1;
+#ifdef CONFIG_KVM_IODELAY
if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
pv_cpu_ops.io_delay = kvm_io_delay;
+#endif
+#ifdef CONFIG_KVM_MMU
if (kvm_para_has_feature(KVM_FEATURE_MMU_OP)) {
pv_mmu_ops.set_pte = kvm_set_pte;
pv_mmu_ops.set_pte_at = kvm_set_pte_at;
@@ -226,6 +233,7 @@ static void __init paravirt_ops_setup(void)
pv_mmu_ops.lazy_mode.enter = kvm_enter_lazy_mmu;
pv_mmu_ops.lazy_mode.leave = kvm_leave_lazy_mmu;
}
+#endif /* CONFIG_KVM_MMU */
#ifdef CONFIG_X86_IO_APIC
no_timer_check = 1;
#endif
--
1.6.0.2
* Re: [PATCH 3/3] Split the KVM pv-ops support by feature
2009-11-18 0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf
@ 2009-11-18 1:33 ` Rusty Russell
2009-11-18 1:37 ` Alexander Graf
0 siblings, 1 reply; 13+ messages in thread
From: Rusty Russell @ 2009-11-18 1:33 UTC (permalink / raw)
To: virtualization
Cc: Alexander Graf, kvm list, Nick Piggin, Glauber Costa, Avi Kivity,
Jeremy Fitzhardinge
On Wed, 18 Nov 2009 10:43:12 am Alexander Graf wrote:
> Currently, selecting KVM guest support enables multiple features at once that
> not everyone necessarily wants, namely:
These patches make perfect sense, but please make sure Jeremy Fitzhardinge
(CC'd) is in the loop, as he split the structs in the first place.
Thanks!
Rusty.
* Re: [PATCH 3/3] Split the KVM pv-ops support by feature
2009-11-18 1:33 ` Rusty Russell
@ 2009-11-18 1:37 ` Alexander Graf
0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2009-11-18 1:37 UTC (permalink / raw)
To: Rusty Russell
Cc: virtualization, kvm list, Nick Piggin, Glauber Costa, Avi Kivity,
Jeremy Fitzhardinge
On 18.11.2009, at 02:33, Rusty Russell wrote:
> On Wed, 18 Nov 2009 10:43:12 am Alexander Graf wrote:
>> Currently, selecting KVM guest support enables multiple features at
>> once that not everyone necessarily wants, namely:
>
> These patches make perfect sense, but please make sure Jeremy
> Fitzhardinge
> (CC'd) is in the loop, as he split the structs in the first place.
Oh sure, and if you know of other people I should CC, please tell me
their mail addresses too :-).
Jeremy, did you get the patchset? I can send it to you if you're not
subscribed to virtualization@.
Alex
* Re: [PATCH 0/3] Split up pv-ops
2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
` (2 preceding siblings ...)
2009-11-18 0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf
@ 2009-11-19 7:42 ` Avi Kivity
2009-12-03 14:52 ` Alexander Graf
4 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2009-11-19 7:42 UTC (permalink / raw)
To: Alexander Graf; +Cc: kvm list, virtualization, Glauber Costa, Nick Piggin
On 11/18/2009 02:13 AM, Alexander Graf wrote:
> Paravirt ops is currently only capable of either replacing a lot of Linux
> internal code or none at all. There are users that don't need all of the
> possibilities pv-ops delivers, though.
>
> On KVM for example we're perfectly fine not using the PV MMU, thus not
> touching any MMU code. That way we don't have to improve pv-ops to become
> fast, we just don't compile the MMU parts in!
>
> This patchset splits pv-ops into several smaller config options split by
> feature category and then converts the KVM pv-ops code to use only the
> bits that are required, lowering overhead.
>
> Alexander Graf (3):
> Split paravirt ops by functionality
> Only export selected pv-ops feature structs
> Split the KVM pv-ops support by feature
>
>
The whole thing looks good to me. Let's wait for Jeremy to ack though.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [PATCH 1/3] Split paravirt ops by functionality
2009-11-18 0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf
@ 2009-11-19 14:59 ` Jeremy Fitzhardinge
2009-11-19 15:21 ` Alexander Graf
0 siblings, 1 reply; 13+ messages in thread
From: Jeremy Fitzhardinge @ 2009-11-19 14:59 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm list, Nick Piggin, Glauber Costa, Avi Kivity, virtualization
On 11/18/09 08:13, Alexander Graf wrote:
> Currently when using paravirt ops it's an all-or-nothing option. We can either
> use pv-ops for CPU, MMU, timing, etc. or not at all.
>
> Now there are some use cases where we don't need the full feature set, but only
> a small chunk of it. KVM is a pretty prominent example for this.
>
> So let's make everything a bit more fine-grained. We already have a splitting
> by function groups, namely "cpu", "mmu", "time", "irq", "apic" and "spinlock".
>
> Taking that existing splitting and extending it to only compile in the PV
> capable bits sounded like a natural fit. That way we don't get performance hits
> in MMU code from using the KVM PV clock which only needs the TIME parts of
> pv-ops.
>
> We define a new CONFIG_PARAVIRT_ALL option that basically does the same thing
> the CONFIG_PARAVIRT did before this splitting. We move all users of
> CONFIG_PARAVIRT to CONFIG_PARAVIRT_ALL, so they behave the same way they did
> before.
>
> So here it is - the splitting! I would have made the patch smaller, but this
> was the closest I could get to atomic (for bisect) while staying sane.
>
The basic idea seems pretty sane. I'm wondering how much compile-test
coverage you've given all these extra config options; there are now a
lot more combinations, and your use of select is particularly worrying
because select doesn't propagate dependencies properly.
For example, does this actually work?
> config XEN
> bool "Xen guest support"
> - select PARAVIRT
> + select PARAVIRT_ALL
> select PARAVIRT_CLOCK
> depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
> depends on X86_CMPXCHG && X86_TSC
>
Does selecting PARAVIRT_ALL end up selecting all the other PARAVIRT_*?
Can you reassure me?
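For what it's worth, the core of the worry is that Kconfig's `select` forces a
symbol on without evaluating that symbol's own `depends on` clauses. A minimal
hypothetical fragment (symbol names invented purely for illustration):

```kconfig
# Hypothetical example: FOO depends on BAR, but `select` bypasses
# that check, so BAZ=y yields FOO=y even with BAR=n; an invalid
# configuration that only shows up at build time.
config BAR
	bool "Bar support"

config FOO
	bool "Foo feature"
	depends on BAR		# honored for user selection only

config BAZ
	bool "Baz driver"
	select FOO		# forces FOO on regardless of BAR
```

With the patch's PARAVIRT_* symbols this is mostly benign today, since they
carry no dependencies of their own, but any `depends on` added to one of them
later would be silently bypassed by `select PARAVIRT_ALL`.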
Also, I think VMI is the only serious user of PARAVIRT_APIC, so we can
mark that to go when VMI does.
What ends up using plain CONFIG_PARAVIRT? Do we still need it?
J
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
> arch/x86/Kconfig | 47 ++++++++++++++++++++++++--
> arch/x86/include/asm/apic.h | 2 +-
> arch/x86/include/asm/desc.h | 4 +-
> arch/x86/include/asm/fixmap.h | 2 +-
> arch/x86/include/asm/highmem.h | 2 +-
> arch/x86/include/asm/io_32.h | 4 ++-
> arch/x86/include/asm/io_64.h | 2 +-
> arch/x86/include/asm/irqflags.h | 21 +++++++++---
> arch/x86/include/asm/mmu_context.h | 4 +-
> arch/x86/include/asm/msr.h | 4 +-
> arch/x86/include/asm/paravirt.h | 44 ++++++++++++++++++++++++-
> arch/x86/include/asm/paravirt_types.h | 12 +++++++
> arch/x86/include/asm/pgalloc.h | 2 +-
> arch/x86/include/asm/pgtable-3level_types.h | 2 +-
> arch/x86/include/asm/pgtable.h | 2 +-
> arch/x86/include/asm/processor.h | 2 +-
> arch/x86/include/asm/required-features.h | 2 +-
> arch/x86/include/asm/smp.h | 2 +-
> arch/x86/include/asm/system.h | 13 +++++--
> arch/x86/include/asm/tlbflush.h | 4 +-
> arch/x86/kernel/head_64.S | 2 +-
> arch/x86/kernel/paravirt.c | 2 +
> arch/x86/kernel/tsc.c | 2 +-
> arch/x86/kernel/vsmp_64.c | 2 +-
> arch/x86/xen/Kconfig | 2 +-
> 25 files changed, 149 insertions(+), 38 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 0c7b699..8c150b6 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -350,7 +350,7 @@ endif
>
> config X86_VSMP
> bool "ScaleMP vSMP"
> - select PARAVIRT
> + select PARAVIRT_ALL
> depends on X86_64 && PCI
> depends on X86_EXTENDED_PLATFORM
> ---help---
> @@ -493,7 +493,7 @@ source "arch/x86/xen/Kconfig"
>
> config VMI
> bool "VMI Guest support (DEPRECATED)"
> - select PARAVIRT
> + select PARAVIRT_ALL
> depends on X86_32
> ---help---
> VMI provides a paravirtualized interface to the VMware ESX server
> @@ -512,7 +512,6 @@ config VMI
>
> config KVM_CLOCK
> bool "KVM paravirtualized clock"
> - select PARAVIRT
> select PARAVIRT_CLOCK
> ---help---
> Turning on this option will allow you to run a paravirtualized clock
> @@ -523,7 +522,7 @@ config KVM_CLOCK
>
> config KVM_GUEST
> bool "KVM Guest support"
> - select PARAVIRT
> + select PARAVIRT_ALL
> ---help---
> This option enables various optimizations for running under the KVM
> hypervisor.
> @@ -551,8 +550,48 @@ config PARAVIRT_SPINLOCKS
>
> If you are unsure how to answer this question, answer N.
>
> +config PARAVIRT_CPU
> + bool
> + select PARAVIRT
> + default n
> +
> +config PARAVIRT_TIME
> + bool
> + select PARAVIRT
> + default n
> +
> +config PARAVIRT_IRQ
> + bool
> + select PARAVIRT
> + default n
> +
> +config PARAVIRT_APIC
> + bool
> + select PARAVIRT
> + default n
> +
> +config PARAVIRT_MMU
> + bool
> + select PARAVIRT
> + default n
> +
> +#
> +# This is a placeholder to activate the old "include all pv-ops functionality"
> +# behavior. If you're using this I'd recommend looking through your code to see
> +# if you can be more specific. It probably saves you a few cycles!
> +#
> +config PARAVIRT_ALL
> + bool
> + select PARAVIRT_CPU
> + select PARAVIRT_TIME
> + select PARAVIRT_IRQ
> + select PARAVIRT_APIC
> + select PARAVIRT_MMU
> + default n
> +
> config PARAVIRT_CLOCK
> bool
> + select PARAVIRT_TIME
> default n
>
> endif
> diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
> index 474d80d..b54c24a 100644
> --- a/arch/x86/include/asm/apic.h
> +++ b/arch/x86/include/asm/apic.h
> @@ -81,7 +81,7 @@ static inline bool apic_from_smp_config(void)
> /*
> * Basic functions accessing APICs.
> */
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_APIC
> #include <asm/paravirt.h>
> #endif
>
> diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
> index e8de2f6..cf65891 100644
> --- a/arch/x86/include/asm/desc.h
> +++ b/arch/x86/include/asm/desc.h
> @@ -78,7 +78,7 @@ static inline int desc_empty(const void *ptr)
> return !(desc[0] | desc[1]);
> }
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_CPU
> #include <asm/paravirt.h>
> #else
> #define load_TR_desc() native_load_tr_desc()
> @@ -108,7 +108,7 @@ static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
> static inline void paravirt_free_ldt(struct desc_struct *ldt, unsigned entries)
> {
> }
> -#endif /* CONFIG_PARAVIRT */
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> #define store_ldt(ldt) asm("sldt %0" : "=m"(ldt))
>
> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
> index 14f9890..5f29317 100644
> --- a/arch/x86/include/asm/fixmap.h
> +++ b/arch/x86/include/asm/fixmap.h
> @@ -156,7 +156,7 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte);
> void native_set_fixmap(enum fixed_addresses idx,
> phys_addr_t phys, pgprot_t flags);
>
> -#ifndef CONFIG_PARAVIRT
> +#ifndef CONFIG_PARAVIRT_MMU
> static inline void __set_fixmap(enum fixed_addresses idx,
> phys_addr_t phys, pgprot_t flags)
> {
> diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h
> index 014c2b8..458d785 100644
> --- a/arch/x86/include/asm/highmem.h
> +++ b/arch/x86/include/asm/highmem.h
> @@ -66,7 +66,7 @@ void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
> void *kmap_atomic_prot_pfn(unsigned long pfn, enum km_type type, pgprot_t prot);
> struct page *kmap_atomic_to_page(void *ptr);
>
> -#ifndef CONFIG_PARAVIRT
> +#ifndef CONFIG_PARAVIRT_MMU
> #define kmap_atomic_pte(page, type) kmap_atomic(page, type)
> #endif
>
> diff --git a/arch/x86/include/asm/io_32.h b/arch/x86/include/asm/io_32.h
> index a299900..a263c6f 100644
> --- a/arch/x86/include/asm/io_32.h
> +++ b/arch/x86/include/asm/io_32.h
> @@ -109,7 +109,9 @@ extern void io_delay_init(void);
>
> #if defined(CONFIG_PARAVIRT)
> #include <asm/paravirt.h>
> -#else
> +#endif
> +
> +#ifndef CONFIG_PARAVIRT_CPU
>
> static inline void slow_down_io(void)
> {
> diff --git a/arch/x86/include/asm/io_64.h b/arch/x86/include/asm/io_64.h
> index 2440678..82c6eae 100644
> --- a/arch/x86/include/asm/io_64.h
> +++ b/arch/x86/include/asm/io_64.h
> @@ -40,7 +40,7 @@ extern void native_io_delay(void);
> extern int io_delay_type;
> extern void io_delay_init(void);
>
> -#if defined(CONFIG_PARAVIRT)
> +#if defined(CONFIG_PARAVIRT_CPU)
> #include <asm/paravirt.h>
> #else
>
> diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> index 9e2b952..b8d8f4c 100644
> --- a/arch/x86/include/asm/irqflags.h
> +++ b/arch/x86/include/asm/irqflags.h
> @@ -58,9 +58,11 @@ static inline void native_halt(void)
>
> #ifdef CONFIG_PARAVIRT
> #include <asm/paravirt.h>
> -#else
> +#endif
> +
> #ifndef __ASSEMBLY__
>
> +#ifndef CONFIG_PARAVIRT_IRQ
> static inline unsigned long __raw_local_save_flags(void)
> {
> return native_save_fl();
> @@ -110,12 +112,17 @@ static inline unsigned long __raw_local_irq_save(void)
>
> return flags;
> }
> -#else
> +#endif /* CONFIG_PARAVIRT_IRQ */
> +
> +#else /* __ASSEMBLY__ */
>
> +#ifndef CONFIG_PARAVIRT_IRQ
> #define ENABLE_INTERRUPTS(x) sti
> #define DISABLE_INTERRUPTS(x) cli
> +#endif /* !CONFIG_PARAVIRT_IRQ */
>
> #ifdef CONFIG_X86_64
> +#ifndef CONFIG_PARAVIRT_CPU
> #define SWAPGS swapgs
> /*
> * Currently paravirt can't handle swapgs nicely when we
> @@ -128,8 +135,6 @@ static inline unsigned long __raw_local_irq_save(void)
> */
> #define SWAPGS_UNSAFE_STACK swapgs
>
> -#define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */
> -
> #define INTERRUPT_RETURN iretq
> #define USERGS_SYSRET64 \
> swapgs; \
> @@ -141,16 +146,22 @@ static inline unsigned long __raw_local_irq_save(void)
> swapgs; \
> sti; \
> sysexit
> +#endif /* !CONFIG_PARAVIRT_CPU */
> +
> +#ifndef CONFIG_PARAVIRT_IRQ
> +#define PARAVIRT_ADJUST_EXCEPTION_FRAME /* */
> +#endif /* !CONFIG_PARAVIRT_IRQ */
>
> #else
> +#ifndef CONFIG_PARAVIRT_CPU
> #define INTERRUPT_RETURN iret
> #define ENABLE_INTERRUPTS_SYSEXIT sti; sysexit
> #define GET_CR0_INTO_EAX movl %cr0, %eax
> +#endif /* !CONFIG_PARAVIRT_CPU */
> #endif
>
>
> #endif /* __ASSEMBLY__ */
> -#endif /* CONFIG_PARAVIRT */
>
> #ifndef __ASSEMBLY__
> #define raw_local_save_flags(flags) \
> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
> index 4a2d4e0..a209e67 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -6,14 +6,14 @@
> #include <asm/pgalloc.h>
> #include <asm/tlbflush.h>
> #include <asm/paravirt.h>
> -#ifndef CONFIG_PARAVIRT
> +#ifndef CONFIG_PARAVIRT_MMU
> #include <asm-generic/mm_hooks.h>
>
> static inline void paravirt_activate_mm(struct mm_struct *prev,
> struct mm_struct *next)
> {
> }
> -#endif /* !CONFIG_PARAVIRT */
> +#endif /* !CONFIG_PARAVIRT_MMU */
>
> /*
> * Used for LDT copy/destruction.
> diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
> index 7e2b6ba..80ec5a5 100644
> --- a/arch/x86/include/asm/msr.h
> +++ b/arch/x86/include/asm/msr.h
> @@ -123,7 +123,7 @@ static inline unsigned long long native_read_pmc(int counter)
> return EAX_EDX_VAL(val, low, high);
> }
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_CPU
> #include <asm/paravirt.h>
> #else
> #include <linux/errno.h>
> @@ -234,7 +234,7 @@ do { \
>
> #define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))
>
> -#endif /* !CONFIG_PARAVIRT */
> +#endif /* !CONFIG_PARAVIRT_CPU */
>
>
> #define checking_wrmsrl(msr, val) wrmsr_safe((msr), (u32)(val), \
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index efb3899..e543098 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -18,6 +18,7 @@ static inline int paravirt_enabled(void)
> return pv_info.paravirt_enabled;
> }
>
> +#ifdef CONFIG_PARAVIRT_CPU
> static inline void load_sp0(struct tss_struct *tss,
> struct thread_struct *thread)
> {
> @@ -58,7 +59,9 @@ static inline void write_cr0(unsigned long x)
> {
> PVOP_VCALL1(pv_cpu_ops.write_cr0, x);
> }
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> +#ifdef CONFIG_PARAVIRT_MMU
> static inline unsigned long read_cr2(void)
> {
> return PVOP_CALL0(unsigned long, pv_mmu_ops.read_cr2);
> @@ -78,7 +81,9 @@ static inline void write_cr3(unsigned long x)
> {
> PVOP_VCALL1(pv_mmu_ops.write_cr3, x);
> }
> +#endif /* CONFIG_PARAVIRT_MMU */
>
> +#ifdef CONFIG_PARAVIRT_CPU
> static inline unsigned long read_cr4(void)
> {
> return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr4);
> @@ -92,8 +97,9 @@ static inline void write_cr4(unsigned long x)
> {
> PVOP_VCALL1(pv_cpu_ops.write_cr4, x);
> }
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> -#ifdef CONFIG_X86_64
> +#if defined(CONFIG_X86_64) && defined(CONFIG_PARAVIRT_CPU)
> static inline unsigned long read_cr8(void)
> {
> return PVOP_CALL0(unsigned long, pv_cpu_ops.read_cr8);
> @@ -105,6 +111,7 @@ static inline void write_cr8(unsigned long x)
> }
> #endif
>
> +#ifdef CONFIG_PARAVIRT_IRQ
> static inline void raw_safe_halt(void)
> {
> PVOP_VCALL0(pv_irq_ops.safe_halt);
> @@ -114,14 +121,18 @@ static inline void halt(void)
> {
> PVOP_VCALL0(pv_irq_ops.safe_halt);
> }
> +#endif /* CONFIG_PARAVIRT_IRQ */
>
> +#ifdef CONFIG_PARAVIRT_CPU
> static inline void wbinvd(void)
> {
> PVOP_VCALL0(pv_cpu_ops.wbinvd);
> }
> +#endif
>
> #define get_kernel_rpl() (pv_info.kernel_rpl)
>
> +#ifdef CONFIG_PARAVIRT_CPU
> static inline u64 paravirt_read_msr(unsigned msr, int *err)
> {
> return PVOP_CALL2(u64, pv_cpu_ops.read_msr, msr, err);
> @@ -224,12 +235,16 @@ do { \
> } while (0)
>
> #define rdtscll(val) (val = paravirt_read_tsc())
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> +#ifdef CONFIG_PARAVIRT_TIME
> static inline unsigned long long paravirt_sched_clock(void)
> {
> return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
> }
> +#endif /* CONFIG_PARAVIRT_TIME */
>
> +#ifdef CONFIG_PARAVIRT_CPU
> static inline unsigned long long paravirt_read_pmc(int counter)
> {
> return PVOP_CALL1(u64, pv_cpu_ops.read_pmc, counter);
> @@ -345,8 +360,9 @@ static inline void slow_down_io(void)
> pv_cpu_ops.io_delay();
> #endif
> }
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> -#ifdef CONFIG_SMP
> +#if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_APIC)
> static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip,
> unsigned long start_esp)
> {
> @@ -355,6 +371,7 @@ static inline void startup_ipi_hook(int phys_apicid, unsigned long start_eip,
> }
> #endif
>
> +#ifdef CONFIG_PARAVIRT_MMU
> static inline void paravirt_activate_mm(struct mm_struct *prev,
> struct mm_struct *next)
> {
> @@ -698,7 +715,9 @@ static inline void pmd_clear(pmd_t *pmdp)
> set_pmd(pmdp, __pmd(0));
> }
> #endif /* CONFIG_X86_PAE */
> +#endif /* CONFIG_PARAVIRT_MMU */
>
> +#ifdef CONFIG_PARAVIRT_CPU
> #define __HAVE_ARCH_START_CONTEXT_SWITCH
> static inline void arch_start_context_switch(struct task_struct *prev)
> {
> @@ -709,7 +728,9 @@ static inline void arch_end_context_switch(struct task_struct *next)
> {
> PVOP_VCALL1(pv_cpu_ops.end_context_switch, next);
> }
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> +#ifdef CONFIG_PARAVIRT_MMU
> #define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
> static inline void arch_enter_lazy_mmu_mode(void)
> {
> @@ -728,6 +749,7 @@ static inline void __set_fixmap(unsigned /* enum fixed_addresses */ idx,
> {
> pv_mmu_ops.set_fixmap(idx, phys, flags);
> }
> +#endif /* CONFIG_PARAVIRT_MMU */
>
> #if defined(CONFIG_SMP) && defined(CONFIG_PARAVIRT_SPINLOCKS)
>
> @@ -838,6 +860,7 @@ static __always_inline void __raw_spin_unlock(struct raw_spinlock *lock)
> #define __PV_IS_CALLEE_SAVE(func) \
> ((struct paravirt_callee_save) { func })
>
> +#ifdef CONFIG_PARAVIRT_IRQ
> static inline unsigned long __raw_local_save_flags(void)
> {
> return PVOP_CALLEE0(unsigned long, pv_irq_ops.save_fl);
> @@ -866,6 +889,7 @@ static inline unsigned long __raw_local_irq_save(void)
> raw_local_irq_disable();
> return f;
> }
> +#endif /* CONFIG_PARAVIRT_IRQ */
>
>
> /* Make sure as little as possible of this mess escapes. */
> @@ -948,10 +972,13 @@ extern void default_banner(void);
> #define PARA_INDIRECT(addr) *%cs:addr
> #endif
>
> +#ifdef CONFIG_PARAVIRT_CPU
> #define INTERRUPT_RETURN \
> PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE, \
> jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_iret))
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> +#ifdef CONFIG_PARAVIRT_IRQ
> #define DISABLE_INTERRUPTS(clobbers) \
> PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers, \
> PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \
> @@ -963,13 +990,17 @@ extern void default_banner(void);
> PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE); \
> call PARA_INDIRECT(pv_irq_ops+PV_IRQ_irq_enable); \
> PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)
> +#endif /* CONFIG_PARAVIRT_IRQ */
>
> +#ifdef CONFIG_PARAVIRT_CPU
> #define USERGS_SYSRET32 \
> PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret32), \
> CLBR_NONE, \
> jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_usergs_sysret32))
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> #ifdef CONFIG_X86_32
> +#ifdef CONFIG_PARAVIRT_CPU
> #define GET_CR0_INTO_EAX \
> push %ecx; push %edx; \
> call PARA_INDIRECT(pv_cpu_ops+PV_CPU_read_cr0); \
> @@ -979,10 +1010,12 @@ extern void default_banner(void);
> PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), \
> CLBR_NONE, \
> jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit))
> +#endif /* CONFIG_PARAVIRT_CPU */
>
>
> #else /* !CONFIG_X86_32 */
>
> +#ifdef CONFIG_PARAVIRT_CPU
> /*
> * If swapgs is used while the userspace stack is still current,
> * there's no way to call a pvop. The PV replacement *must* be
> @@ -1002,17 +1035,23 @@ extern void default_banner(void);
> PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_swapgs), CLBR_NONE, \
> call PARA_INDIRECT(pv_cpu_ops+PV_CPU_swapgs) \
> )
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> +#ifdef CONFIG_PARAVIRT_MMU
> #define GET_CR2_INTO_RCX \
> call PARA_INDIRECT(pv_mmu_ops+PV_MMU_read_cr2); \
> movq %rax, %rcx; \
> xorq %rax, %rax;
> +#endif /* CONFIG_PARAVIRT_MMU */
>
> +#ifdef CONFIG_PARAVIRT_IRQ
> #define PARAVIRT_ADJUST_EXCEPTION_FRAME \
> PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_adjust_exception_frame), \
> CLBR_NONE, \
> call PARA_INDIRECT(pv_irq_ops+PV_IRQ_adjust_exception_frame))
> +#endif /* CONFIG_PARAVIRT_IRQ */
>
> +#ifdef CONFIG_PARAVIRT_CPU
> #define USERGS_SYSRET64 \
> PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_usergs_sysret64), \
> CLBR_NONE, \
> @@ -1022,6 +1061,7 @@ extern void default_banner(void);
> PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_irq_enable_sysexit), \
> CLBR_NONE, \
> jmp PARA_INDIRECT(pv_cpu_ops+PV_CPU_irq_enable_sysexit))
> +#endif /* CONFIG_PARAVIRT_CPU */
> #endif /* CONFIG_X86_32 */
>
> #endif /* __ASSEMBLY__ */
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index 9357473..e190450 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -343,12 +343,24 @@ struct paravirt_patch_template {
>
> extern struct pv_info pv_info;
> extern struct pv_init_ops pv_init_ops;
> +#ifdef CONFIG_PARAVIRT_TIME
> extern struct pv_time_ops pv_time_ops;
> +#endif
> +#ifdef CONFIG_PARAVIRT_CPU
> extern struct pv_cpu_ops pv_cpu_ops;
> +#endif
> +#ifdef CONFIG_PARAVIRT_IRQ
> extern struct pv_irq_ops pv_irq_ops;
> +#endif
> +#ifdef CONFIG_PARAVIRT_APIC
> extern struct pv_apic_ops pv_apic_ops;
> +#endif
> +#ifdef CONFIG_PARAVIRT_MMU
> extern struct pv_mmu_ops pv_mmu_ops;
> +#endif
> +#ifdef CONFIG_PARAVIRT_SPINLOCKS
> extern struct pv_lock_ops pv_lock_ops;
> +#endif
>
That's unpleasantly noisy, but I guess we just have to blame cpp's syntax.
>
> #define PARAVIRT_PATCH(x) \
> (offsetof(struct paravirt_patch_template, x) / sizeof(void *))
> diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
> index 0e8c2a0..94cce3d 100644
> --- a/arch/x86/include/asm/pgalloc.h
> +++ b/arch/x86/include/asm/pgalloc.h
> @@ -7,7 +7,7 @@
>
> static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_MMU
> #include <asm/paravirt.h>
> #else
> #define paravirt_pgd_alloc(mm) __paravirt_pgd_alloc(mm)
> diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h
> index 1bd5876..be58e74 100644
> --- a/arch/x86/include/asm/pgtable-3level_types.h
> +++ b/arch/x86/include/asm/pgtable-3level_types.h
> @@ -18,7 +18,7 @@ typedef union {
> } pte_t;
> #endif /* !__ASSEMBLY__ */
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_MMU
> #define SHARED_KERNEL_PMD (pv_info.shared_kernel_pmd)
> #else
> #define SHARED_KERNEL_PMD 1
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index af6fd36..b68edfc 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -26,7 +26,7 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
> extern spinlock_t pgd_lock;
> extern struct list_head pgd_list;
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_MMU
> #include <asm/paravirt.h>
> #else /* !CONFIG_PARAVIRT */
> #define set_pte(ptep, pte) native_set_pte(ptep, pte)
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index c3429e8..a42a807 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -571,7 +571,7 @@ static inline void native_swapgs(void)
> #endif
> }
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_CPU
> #include <asm/paravirt.h>
> #else
> #define __cpuid native_cpuid
> diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
> index 64cf2d2..f68edf2 100644
> --- a/arch/x86/include/asm/required-features.h
> +++ b/arch/x86/include/asm/required-features.h
> @@ -48,7 +48,7 @@
> #endif
>
> #ifdef CONFIG_X86_64
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_MMU
> /* Paravirtualized systems may not have PSE or PGE available */
> #define NEED_PSE 0
> #define NEED_PGE 0
> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> index 1e79678..fdd889a 100644
> --- a/arch/x86/include/asm/smp.h
> +++ b/arch/x86/include/asm/smp.h
> @@ -66,7 +66,7 @@ struct smp_ops {
> extern void set_cpu_sibling_map(int cpu);
>
> #ifdef CONFIG_SMP
> -#ifndef CONFIG_PARAVIRT
> +#ifndef CONFIG_PARAVIRT_APIC
> #define startup_ipi_hook(phys_apicid, start_eip, start_esp) do { } while (0)
> #endif
> extern struct smp_ops smp_ops;
> diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h
> index f08f973..63ca93c 100644
> --- a/arch/x86/include/asm/system.h
> +++ b/arch/x86/include/asm/system.h
> @@ -302,13 +302,18 @@ static inline void native_wbinvd(void)
>
> #ifdef CONFIG_PARAVIRT
> #include <asm/paravirt.h>
> -#else
> -#define read_cr0() (native_read_cr0())
> -#define write_cr0(x) (native_write_cr0(x))
> +#endif/* CONFIG_PARAVIRT */
> +
> +#ifndef CONFIG_PARAVIRT_MMU
> #define read_cr2() (native_read_cr2())
> #define write_cr2(x) (native_write_cr2(x))
> #define read_cr3() (native_read_cr3())
> #define write_cr3(x) (native_write_cr3(x))
> +#endif /* CONFIG_PARAVIRT_MMU */
> +
> +#ifndef CONFIG_PARAVIRT_CPU
> +#define read_cr0() (native_read_cr0())
> +#define write_cr0(x) (native_write_cr0(x))
> #define read_cr4() (native_read_cr4())
> #define read_cr4_safe() (native_read_cr4_safe())
> #define write_cr4(x) (native_write_cr4(x))
> @@ -322,7 +327,7 @@ static inline void native_wbinvd(void)
> /* Clear the 'TS' bit */
> #define clts() (native_clts())
>
> -#endif/* CONFIG_PARAVIRT */
> +#endif /* CONFIG_PARAVIRT_CPU */
>
> #define stts() write_cr0(read_cr0() | X86_CR0_TS)
>
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index 7f3eba0..89e055c 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -7,7 +7,7 @@
> #include <asm/processor.h>
> #include <asm/system.h>
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_MMU
> #include <asm/paravirt.h>
> #else
> #define __flush_tlb() __native_flush_tlb()
> @@ -162,7 +162,7 @@ static inline void reset_lazy_tlbstate(void)
>
> #endif /* SMP */
>
> -#ifndef CONFIG_PARAVIRT
> +#ifndef CONFIG_PARAVIRT_MMU
> #define flush_tlb_others(mask, mm, va) native_flush_tlb_others(mask, mm, va)
> #endif
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 780cd92..1284d8d 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -20,7 +20,7 @@
> #include <asm/processor-flags.h>
> #include <asm/percpu.h>
>
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_MMU
> #include <asm/asm-offsets.h>
> #include <asm/paravirt.h>
> #else
> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index 1b1739d..c8530bd 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -155,12 +155,14 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
> else if (opfunc == _paravirt_ident_64)
> ret = paravirt_patch_ident_64(insnbuf, len);
>
> +#ifdef CONFIG_PARAVIRT_CPU
> else if (type == PARAVIRT_PATCH(pv_cpu_ops.iret) ||
> type == PARAVIRT_PATCH(pv_cpu_ops.irq_enable_sysexit) ||
> type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret32) ||
> type == PARAVIRT_PATCH(pv_cpu_ops.usergs_sysret64))
> /* If operation requires a jmp, then jmp */
> ret = paravirt_patch_jmp(insnbuf, opfunc, addr, len);
> +#endif
> else
> /* Otherwise call the function; assume target could
> clobber any caller-save reg */
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index cd982f4..96aad98 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -66,7 +66,7 @@ u64 native_sched_clock(void)
>
> /* We need to define a real function for sched_clock, to override the
> weak default version */
> -#ifdef CONFIG_PARAVIRT
> +#ifdef CONFIG_PARAVIRT_TIME
> unsigned long long sched_clock(void)
> {
> return paravirt_sched_clock();
> diff --git a/arch/x86/kernel/vsmp_64.c b/arch/x86/kernel/vsmp_64.c
> index a1d804b..23f4612 100644
> --- a/arch/x86/kernel/vsmp_64.c
> +++ b/arch/x86/kernel/vsmp_64.c
> @@ -22,7 +22,7 @@
> #include <asm/paravirt.h>
> #include <asm/setup.h>
>
> -#if defined CONFIG_PCI && defined CONFIG_PARAVIRT
> +#if defined CONFIG_PCI && defined CONFIG_PARAVIRT_IRQ
> /*
> * Interrupt control on vSMPowered systems:
> * ~AC is a shadow of IF. If IF is 'on' AC should be 'off'
> diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
> index b83e119..eef41bd 100644
> --- a/arch/x86/xen/Kconfig
> +++ b/arch/x86/xen/Kconfig
> @@ -4,7 +4,7 @@
>
> config XEN
> bool "Xen guest support"
> - select PARAVIRT
> + select PARAVIRT_ALL
> select PARAVIRT_CLOCK
> depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
> depends on X86_CMPXCHG && X86_TSC
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] Split paravirt ops by functionality
2009-11-19 14:59 ` Jeremy Fitzhardinge
@ 2009-11-19 15:21 ` Alexander Graf
0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2009-11-19 15:21 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: kvm list, Nick Piggin, Glauber Costa, Avi Kivity, virtualization
Jeremy Fitzhardinge wrote:
> On 11/18/09 08:13, Alexander Graf wrote:
>
>> Currently when using paravirt ops it's an all-or-nothing option. We can either
>> use pv-ops for CPU, MMU, timing, etc. or not at all.
>>
>> Now there are some use cases where we don't need the full feature set, but only
>> a small chunk of it. KVM is a pretty prominent example for this.
>>
>> So let's make everything a bit more fine-grained. We already have a splitting
>> by function groups, namely "cpu", "mmu", "time", "irq", "apic" and "spinlock".
>>
>> Taking that existing splitting and extending it to only compile in the PV
>> capable bits sounded like a natural fit. That way we don't get performance hits
>> in MMU code from using the KVM PV clock which only needs the TIME parts of
>> pv-ops.
>>
>> We define a new CONFIG_PARAVIRT_ALL option that basically does the same thing
>> CONFIG_PARAVIRT did before this splitting. We move all users of
>> CONFIG_PARAVIRT to CONFIG_PARAVIRT_ALL, so they behave the same way they did
>> before.
>>
>> So here it is - the splitting! I would have made the patch smaller, but this
>> was the closest I could get to atomic (for bisect) while staying sane.
>>
>>
>
> The basic idea seems pretty sane. I'm wondering how much compile test
> coverage you've given all these extra config options; there are now a
> lot more combinations, and your use of select is particularly worrying
> because select doesn't propagate dependencies properly.
>
Uh - I don't see where there should be any dependencies.
> For example, does this actually work?
>
>> config XEN
>> bool "Xen guest support"
>> - select PARAVIRT
>> + select PARAVIRT_ALL
>> select PARAVIRT_CLOCK
>> depends on X86_64 || (X86_32 && X86_PAE && !X86_VISWS)
>> depends on X86_CMPXCHG && X86_TSC
>>
>>
> Does selecting PARAVIRT_ALL end up selecting all the other PARAVIRT_*?
>
> Can you reassure me?
>
> +config PARAVIRT_ALL
> + bool
> + select PARAVIRT_CPU
> + select PARAVIRT_TIME
> + select PARAVIRT_IRQ
> + select PARAVIRT_APIC
> + select PARAVIRT_MMU
> + default n
> +
So selecting PARAVIRT_ALL selects all the other split PARAVIRT parts
that in turn select PARAVIRT. I tested a compile on x86_64 with Xen DomU
support enabled and it compiled fine :-).
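For reference, the select chain being described would look roughly like this (a condensed sketch in Kconfig syntax; the option names are from the patch, but the prompts and remaining sub-options are abbreviated here):

```
config PARAVIRT
	bool "Enable paravirtualization code"

config PARAVIRT_CPU
	bool
	select PARAVIRT

config PARAVIRT_MMU
	bool
	select PARAVIRT

config PARAVIRT_ALL
	bool
	select PARAVIRT_CPU
	select PARAVIRT_MMU
	# ... likewise for PARAVIRT_TIME, PARAVIRT_IRQ, PARAVIRT_APIC
```

Note that select forces a symbol on without evaluating that symbol's own "depends on" clauses, which is the propagation concern raised above.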
> Also, I think VMI is the only serious user of PARAVIRT_APIC, so we can
> mark that to go when VMI does.
>
Sounds good :-). So the patch even serves as a helper for anyone who'll
remove that support later.
> What ends up using plain CONFIG_PARAVIRT? Do we still need it?
>
It's used for an info field that says which hypervisor we're running on,
and as a config option to tell whether we need to include the binary
patching magic. As soon as a single sub-paravirt option is selected, we
need it to make sure the framework is in place.
Alex
* Re: [PATCH 0/3] Split up pv-ops
2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
` (3 preceding siblings ...)
2009-11-19 7:42 ` [PATCH 0/3] Split up pv-ops Avi Kivity
@ 2009-12-03 14:52 ` Alexander Graf
2009-12-03 15:00 ` Avi Kivity
4 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2009-12-03 14:52 UTC (permalink / raw)
To: kvm list; +Cc: Nick Piggin, Glauber Costa, Avi Kivity, virtualization, rusty
Alexander Graf wrote:
> Paravirt ops is currently only capable of either replacing a lot of Linux
> internal code or none at all. There are users that don't need all of the
> possibilities pv-ops delivers though.
>
> On KVM for example we're perfectly fine not using the PV MMU, thus not
> touching any MMU code. That way we don't have to improve pv-ops to become
> fast, we just don't compile the MMU parts in!
>
> This patchset splits pv-ops into several smaller config options split by
> feature category and then converts the KVM pv-ops code to use only the
> bits that are required, lowering overhead.
>
So has this ended up in some tree yet?
* Re: [PATCH 0/3] Split up pv-ops
2009-12-03 14:52 ` Alexander Graf
@ 2009-12-03 15:00 ` Avi Kivity
2009-12-03 15:04 ` Alexander Graf
0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2009-12-03 15:00 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm list, Nick Piggin, Glauber Costa, virtualization, rusty
On 12/03/2009 04:52 PM, Alexander Graf wrote:
> Alexander Graf wrote:
>
>> Paravirt ops is currently only capable of either replacing a lot of Linux
>> internal code or none at all. The are users that don't need all of the
>> possibilities pv-ops delivers though.
>>
>> On KVM for example we're perfectly fine not using the PV MMU, thus not
>> touching any MMU code. That way we don't have to improve pv-ops to become
>> fast, we just don't compile the MMU parts in!
>>
>> This patchset splits pv-ops into several smaller config options split by
>> feature category and then converts the KVM pv-ops code to use only the
>> bits that are required, lowering overhead.
>>
>>
> So has this ended up in some tree yet?
>
Don't think so. I suggest you copy lkml and Ingo.
--
error compiling committee.c: too many arguments to function
* Re: [PATCH 0/3] Split up pv-ops
2009-12-03 15:00 ` Avi Kivity
@ 2009-12-03 15:04 ` Alexander Graf
2009-12-03 15:07 ` Avi Kivity
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2009-12-03 15:04 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm list, Nick Piggin, Glauber Costa, virtualization, rusty
Avi Kivity wrote:
> On 12/03/2009 04:52 PM, Alexander Graf wrote:
>> Alexander Graf wrote:
>>
>>> Paravirt ops is currently only capable of either replacing a lot of
>>> Linux
>>> internal code or none at all. There are users that don't need all of the
>>> possibilities pv-ops delivers though.
>>>
>>> On KVM for example we're perfectly fine not using the PV MMU, thus not
>>> touching any MMU code. That way we don't have to improve pv-ops to
>>> become
>>> fast, we just don't compile the MMU parts in!
>>>
>>> This patchset splits pv-ops into several smaller config options
>>> split by
>>> feature category and then converts the KVM pv-ops code to use only the
>>> bits that are required, lowering overhead.
>>>
>>>
>> So has this ended up in some tree yet?
>>
>
> Don't think so. I suggest you copy lkml and Ingo.
Sending off the complete set again? Rebased against what?
* Re: [PATCH 0/3] Split up pv-ops
2009-12-03 15:04 ` Alexander Graf
@ 2009-12-03 15:07 ` Avi Kivity
0 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2009-12-03 15:07 UTC (permalink / raw)
To: Alexander Graf
Cc: kvm list, Nick Piggin, Glauber Costa, virtualization, rusty
On 12/03/2009 05:04 PM, Alexander Graf wrote:
>> Don't think so. I suggest you copy lkml and Ingo.
>>
> Sending off the complete set again?
Yes.
> Rebased against what?
>
tip's x86/paravirt seems like a good choice (though only one patch is in
there at present).
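For the record, rebasing onto that branch would look something like the following (the git commands are standard; the tip remote URL and branch layout are the usual convention and should be double-checked before use):

```shell
# Add and fetch the tip tree, then replay the series on its paravirt branch
git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
git fetch tip
git rebase tip/x86/paravirt

# Regenerate the 3-patch series with a cover letter for resending
git format-patch -3 --cover-letter -o pvops-resend/
```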
--
error compiling committee.c: too many arguments to function
end of thread, newest: 2009-12-03 15:07 UTC
Thread overview: 13+ messages
2009-11-18 0:13 [PATCH 0/3] Split up pv-ops Alexander Graf
2009-11-18 0:13 ` [PATCH 1/3] Split paravirt ops by functionality Alexander Graf
2009-11-19 14:59 ` Jeremy Fitzhardinge
2009-11-19 15:21 ` Alexander Graf
2009-11-18 0:13 ` [PATCH 2/3] Only export selected pv-ops feature structs Alexander Graf
2009-11-18 0:13 ` [PATCH 3/3] Split the KVM pv-ops support by feature Alexander Graf
2009-11-18 1:33 ` Rusty Russell
2009-11-18 1:37 ` Alexander Graf
2009-11-19 7:42 ` [PATCH 0/3] Split up pv-ops Avi Kivity
2009-12-03 14:52 ` Alexander Graf
2009-12-03 15:00 ` Avi Kivity
2009-12-03 15:04 ` Alexander Graf
2009-12-03 15:07 ` Avi Kivity