* [PATCH 1/2] x86/hyperv: Add callback filter to cpumask_to_vpset()
2023-03-27 13:16 [PATCH 0/2] x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes Michael Kelley
@ 2023-03-27 13:16 ` Michael Kelley
2023-03-27 13:16 ` [PATCH 2/2] x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes Michael Kelley
2023-04-13 1:34 ` [PATCH 0/2] " Wei Liu
2 siblings, 0 replies; 4+ messages in thread
From: Michael Kelley @ 2023-03-27 13:16 UTC
To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
arnd, x86, linux-kernel, linux-hyperv, linux-arch
Cc: mikelley
When copying CPUs from a Linux cpumask to a Hyper-V VPset,
cpumask_to_vpset() currently has a "_noself" variant that doesn't copy
the current CPU to the VPset. Generalize this variant by replacing it
with a "_skip" variant having a callback function that is invoked for
each CPU to decide if that CPU should be copied. Update the one caller
of cpumask_to_vpset_noself() to use the new "_skip" variant instead.
No functional change.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
---
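For readers unfamiliar with the pattern, the callback filter described in
the commit message can be sketched in user space as below. This is only an
illustration under stated assumptions, not the kernel code: a 64-bit word
stands in for both the cpumask and the VPset, and every demo_* name is
hypothetical. The real function also handles sparse banks and VP_INVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; these are not the kernel's types. */
#define DEMO_NR_CPUS 64

typedef bool (*demo_skip_fn)(int cpu);

/* Stand-in for smp_processor_id(); fixed here for illustration. */
static int demo_this_cpu = 2;

static bool demo_cpu_is_self(int cpu)
{
	return cpu == demo_this_cpu;
}

/*
 * Copy every CPU set in 'mask' into '*out', except CPUs for which
 * 'skip' returns true. A NULL 'skip' means no CPU is skipped.
 * Returns the number of CPUs copied.
 */
static int demo_mask_copy_skip(uint64_t *out, uint64_t mask, demo_skip_fn skip)
{
	int cpu, copied = 0;

	assert(out != NULL);
	*out = 0;
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;	/* CPU not in the source mask */
		if (skip && skip(cpu))
			continue;	/* filtered out by the callback */
		*out |= UINT64_C(1) << cpu;
		copied++;
	}
	return copied;
}
```

Passing NULL reproduces the plain copy, and passing demo_cpu_is_self
reproduces the old "_noself" behavior, which is why a single helper can
serve both callers.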
arch/x86/hyperv/hv_apic.c | 12 ++++++++----
include/asm-generic/mshyperv.h | 22 ++++++++++++++--------
2 files changed, 22 insertions(+), 12 deletions(-)
diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index fb8b2c0..1fbda2f 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -96,6 +96,11 @@ static void hv_apic_eoi_write(u32 reg, u32 val)
wrmsr(HV_X64_MSR_EOI, val, 0);
}
+static bool cpu_is_self(int cpu)
+{
+ return cpu == smp_processor_id();
+}
+
/*
* IPI implementation on Hyper-V.
*/
@@ -128,10 +133,9 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
*/
if (!cpumask_equal(mask, cpu_present_mask) || exclude_self) {
ipi_arg->vp_set.format = HV_GENERIC_SET_SPARSE_4K;
- if (exclude_self)
- nr_bank = cpumask_to_vpset_noself(&(ipi_arg->vp_set), mask);
- else
- nr_bank = cpumask_to_vpset(&(ipi_arg->vp_set), mask);
+
+ nr_bank = cpumask_to_vpset_skip(&(ipi_arg->vp_set), mask,
+ exclude_self ? cpu_is_self : NULL);
/*
* 'nr_bank <= 0' means some CPUs in cpumask can't be
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index afcd9ae..402a8c1 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -210,10 +210,9 @@ static inline int hv_cpu_number_to_vp_number(int cpu_number)
static inline int __cpumask_to_vpset(struct hv_vpset *vpset,
const struct cpumask *cpus,
- bool exclude_self)
+ bool (*func)(int cpu))
{
int cpu, vcpu, vcpu_bank, vcpu_offset, nr_bank = 1;
- int this_cpu = smp_processor_id();
int max_vcpu_bank = hv_max_vp_index / HV_VCPUS_PER_SPARSE_BANK;
/* vpset.valid_bank_mask can represent up to HV_MAX_SPARSE_VCPU_BANKS banks */
@@ -232,7 +231,7 @@ static inline int __cpumask_to_vpset(struct hv_vpset *vpset,
* Some banks may end up being empty but this is acceptable.
*/
for_each_cpu(cpu, cpus) {
- if (exclude_self && cpu == this_cpu)
+ if (func && func(cpu))
continue;
vcpu = hv_cpu_number_to_vp_number(cpu);
if (vcpu == VP_INVAL)
@@ -248,17 +247,24 @@ static inline int __cpumask_to_vpset(struct hv_vpset *vpset,
return nr_bank;
}
+/*
+ * Convert a Linux cpumask into a Hyper-V VPset. In the _skip variant,
+ * 'func' is called for each CPU present in cpumask. If 'func' returns
+ * true, that CPU is skipped -- i.e., that CPU from cpumask is *not*
+ * added to the Hyper-V VPset. If 'func' is NULL, no CPUs are
+ * skipped.
+ */
static inline int cpumask_to_vpset(struct hv_vpset *vpset,
const struct cpumask *cpus)
{
- return __cpumask_to_vpset(vpset, cpus, false);
+ return __cpumask_to_vpset(vpset, cpus, NULL);
}
-static inline int cpumask_to_vpset_noself(struct hv_vpset *vpset,
- const struct cpumask *cpus)
+static inline int cpumask_to_vpset_skip(struct hv_vpset *vpset,
+ const struct cpumask *cpus,
+ bool (*func)(int cpu))
{
- WARN_ON_ONCE(preemptible());
- return __cpumask_to_vpset(vpset, cpus, true);
+ return __cpumask_to_vpset(vpset, cpus, func);
}
void hyperv_report_panic(struct pt_regs *regs, long err, bool in_die);
--
1.8.3.1
* [PATCH 2/2] x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes
From: Michael Kelley @ 2023-03-27 13:16 UTC
To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
arnd, x86, linux-kernel, linux-hyperv, linux-arch
Cc: mikelley
In the case where page tables are not freed, native_flush_tlb_multi()
does not do a remote TLB flush on CPUs in lazy TLB mode because the
CPU will flush itself at the next context switch. By comparison, the
Hyper-V enlightened TLB flush does not exclude CPUs in lazy TLB mode
and so performs unnecessary flushes.
If we're not freeing page tables, add logic to test for lazy TLB
mode when adding CPUs to the input argument to the Hyper-V TLB
flush hypercall. Exclude lazy TLB mode CPUs so the behavior
matches native_flush_tlb_multi() and the unnecessary flushes are
avoided. Handle both the <=64 vCPU case and the _ex case for >64
vCPUs.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
---
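The selection rule from the commit message can be modeled in user space as
below. This is a sketch under stated assumptions rather than the kernel
implementation: a bool array stands in for the per-CPU
cpu_tlbstate_shared.is_lazy flag, a 64-bit word stands in for the cpumask,
and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 8

/* Stand-in for the per-CPU cpu_tlbstate_shared.is_lazy flag. */
static bool demo_is_lazy[DEMO_NR_CPUS];

static bool demo_cpu_is_lazy(int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	return demo_is_lazy[cpu];
}

/*
 * Return the set of CPUs that would be passed to the flush hypercall.
 * Lazy CPUs are dropped only when page tables were not freed; when
 * 'freed_tables' is true, no filter is applied and every CPU in
 * 'mask' is kept.
 */
static uint64_t demo_flush_targets(uint64_t mask, bool freed_tables)
{
	bool (*skip)(int cpu) = freed_tables ? NULL : demo_cpu_is_lazy;
	uint64_t targets = 0;
	int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;	/* CPU not in the source mask */
		if (skip && skip(cpu))
			continue;	/* lazy CPU: flushes itself later */
		targets |= UINT64_C(1) << cpu;
	}
	return targets;
}
```

With CPUs 1 and 3 marked lazy and a mask covering CPUs 0-3, the model
selects CPUs 0 and 2 when page tables are not freed, and all four CPUs
when they are.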
arch/x86/hyperv/mmu.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index 0ad2378..8460bd3 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -52,6 +52,11 @@ static inline int fill_gva_list(u64 gva_list[], int offset,
return gva_n - offset;
}
+static bool cpu_is_lazy(int cpu)
+{
+ return per_cpu(cpu_tlbstate_shared.is_lazy, cpu);
+}
+
static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
const struct flush_tlb_info *info)
{
@@ -60,6 +65,7 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
struct hv_tlb_flush *flush;
u64 status;
unsigned long flags;
+ bool do_lazy = !info->freed_tables;
trace_hyperv_mmu_flush_tlb_multi(cpus, info);
@@ -112,6 +118,8 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
goto do_ex_hypercall;
for_each_cpu(cpu, cpus) {
+ if (do_lazy && cpu_is_lazy(cpu))
+ continue;
vcpu = hv_cpu_number_to_vp_number(cpu);
if (vcpu == VP_INVAL) {
local_irq_restore(flags);
@@ -198,7 +206,8 @@ static u64 hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
flush->hv_vp_set.valid_bank_mask = 0;
flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
- nr_bank = cpumask_to_vpset(&(flush->hv_vp_set), cpus);
+ nr_bank = cpumask_to_vpset_skip(&flush->hv_vp_set, cpus,
+ info->freed_tables ? NULL : cpu_is_lazy);
if (nr_bank < 0)
return HV_STATUS_INVALID_PARAMETER;
--
1.8.3.1
* Re: [PATCH 0/2] x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes
From: Wei Liu @ 2023-04-13 1:34 UTC
To: Michael Kelley
Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
arnd, x86, linux-kernel, linux-hyperv, linux-arch
On Mon, Mar 27, 2023 at 06:16:05AM -0700, Michael Kelley wrote:
> The Hyper-V enlightened TLB remote flush function does not exclude
> lazy TLB mode CPUs like the equivalent native function. Limited
> telemetry shows that up to 80% of the CPUs being flushed are in
> lazy mode, so flushing them is unnecessary and wasteful.
>
> The best place to exclude the lazy TLB mode CPUs is when copying
> the Linux cpumask to the Hyper-V VPset data structure, since the
> copying already processes CPUs one-by-one. Currently this copying
> function has the capability to exclude the calling CPU. Generalize
> this exclusion functionality to exclude CPUs based on a callback
> function that is invoked for each CPU. Then for TLB flushing,
> use this callback function to check the lazy TLB mode status of
> each targeted CPU.
>
> Patch 1 of this series does the generalization, and fixes up the
> one caller of the existing "exclude self" capability.
>
> Patch 2 then implements the exclusion based on lazy TLB mode,
> using the generalization from Patch 1.
>
> Michael Kelley (2):
> x86/hyperv: Add callback filter to cpumask_to_vpset()
> x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes
>
> arch/x86/hyperv/hv_apic.c | 12 ++++++++----
> arch/x86/hyperv/mmu.c | 11 ++++++++++-
> include/asm-generic/mshyperv.h | 22 ++++++++++++++--------
> 3 files changed, 32 insertions(+), 13 deletions(-)
Applied to hyperv-next. Thanks.
>
> --
> 1.8.3.1
>