* [RFC PATCH 0/7] x86: Use a single fixed ASID per domain
@ 2026-04-15 13:32 Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 1/7] vmx: Introduce vcpu single context VPID invalidation Teddy Astie
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Jason Andryuk, Anthony PERARD, Michal Orzel, Julien Grall,
	Stefano Stabellini, Dario Faggioli, Juergen Gross, George Dunlap,
	Tim Deegan

(ASID can be replaced with VPID for Intel)

This patch series is a reorganized version of [1], though it shares the same goal.
Not much logic remains from v1/v2 [2], and the series is structured quite
differently from v3 [1].

Currently in Xen, in order to perform a TLB flush for a guest, we just increment
the ASID, which has the same effect as flushing the TLB. When we run out of ASIDs,
we flush the entire TLB and start over.
This is done per pCPU, thus ASID management happens on a per-pCPU basis.
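
Roughly, the existing per-pCPU logic in xen/arch/x86/hvm/asid.c boils down to the
following (simplified sketch, field names shortened):

    /* Current scheme: per-pCPU generations and round-robin ASIDs. */
    if ( asid->generation != data->core_asid_generation )
    {
        if ( data->next_asid > data->max_asid )    /* out of ASIDs */
        {
            data->core_asid_generation++;   /* invalidates all stale ASIDs */
            data->next_asid = 1;
            need_flush = true;              /* full TLB flush on this pCPU */
        }
        asid->asid = data->next_asid++;
        asid->generation = data->core_asid_generation;
    }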

However, AMD SEV requires the ASID to be configured in advance and to be unique per
domain, as it is bound to the encryption key used for the domain.

This patch series proposes a new ASID model where a single fixed ASID is assigned
to the domain, so that all vCPUs share that ASID. Moreover, the TLB for a vCPU is
always flushed in these conditions (see the sketch after the list):
* the vCPU is migrated to another pCPU
(the remote pCPU's TLB may be out of sync or lack flushes made previously)
* a different vCPU of the same domain previously ran on this pCPU
(the TLB entries come from the previous vCPU, which may have a different CR3)
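
In code, the two rules above reduce to (patches 3 and 5 respectively):

    /* On migration, in common scheduler code: */
    if ( old_cpu != new_cpu )
        v->needs_tlb_flush = true;

    /* Right before VM entry, in the per-arch vmenter helper: */
    if ( curr->domain->latest_vcpu[cpu] != curr->vcpu_id )
        curr->needs_tlb_flush = true;
    curr->domain->latest_vcpu[cpu] = curr->vcpu_id;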

Albeit centered on an SEV vulnerability, [4] gives further justification for why
these rules are needed for correctness.

It is also important for making hypervisor-side guest broadcast TLB flushing
realistic, as a single fixed ASID per domain allows a broadcast TLB flush to
target that ASID.

Moreover, the AMD programmer's manual also explicitly requires this for correctly
virtualizing INVLPGB in guests (through bit 090h:7 ("Enable INVLPGB/TLBSYNC.") in
the VMCB):
> A guest that executes a legal INVLPGB that is not intercepted will have the requested
> ASID field replaced by the current ASID and the valid ASID bit set before doing the
> broadcast invalidation.  Because of its broadcast nature, the ASID field must be global
> and all processors must allocate the same ASID to the same Guest for proper operation.
> Hypervisors that do not support a global ASID must intercept the Guest usage of INVLPGB,
> if enabled, for proper behavior.

In order to avoid making ASID management too complex, transition to only using a
fixed ASID per domain. If no ASID is available (e.g. we are out of ASIDs, or ASID
usage is disabled), we use ASID=1 as a placeholder (which is still a valid ASID),
and always flush the TLB on context switch if the vCPU is using ASID=1.
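
Concretely, with patch 6, allocation falls back to the placeholder and
hvm_do_resume() then forces the flush:

    /* hvm_asid_alloc(): fall back to the shared placeholder ASID. */
    if ( !asid_enabled )
    {
        asid->asid = 1;
        return 0;
    }

    /* hvm_do_resume(): ASID 1 may be shared across domains, always flush. */
    if ( unlikely(currd->arch.hvm.asid.asid == 1) )
        v->needs_tlb_flush = true;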

Also add a way to specify a preferred minimum ASID; it is meant to be used later
on to avoid clobbering the low ASID space of SEV-ES, favoring allocation of ASIDs
above that range.
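
As an illustration of the intended use (hypothetical sketch, min_sev_es_asid is
not part of this series), a SEV-ES-aware caller could do:

    /* Illustrative only: favor ASIDs above the SEV-ES range. */
    rc = hvm_asid_alloc_range(&d->arch.hvm.asid, min_sev_es_asid, asid_count - 1);
    if ( rc )
        rc = hvm_asid_alloc(&d->arch.hvm.asid);  /* fall back to any free ASID */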

[1] x86/hvm: Introduce Xen-wide ASID allocator
https://lore.kernel.org/xen-devel/cover.1750770621.git.teddy.astie@vates.tech/

[2] x86/hvm: Introduce Xen-wide ASID allocator
https://lore.kernel.org/xen-devel/cover.1723574652.git.vaishali.thakkar@vates.tech/

[4] TLB Poisoning Attacks on AMD Secure Encrypted Virtualization
https://dl.acm.org/doi/epdf/10.1145/3485832.3485876

Teddy Astie (7):
  vmx: Introduce vcpu single context VPID invalidation
  common: Track latest pCPU that ran the vCPU
  common: Introduce needs_tlb_flush vcpu field
  x86: Set v->needs_tlb_flush when needed
  x86/hvm: Flush TLB on vCPU overlaps on the same pCPU
  x86/hvm: Transition to needs_tlb_flush logic, use per-domain ASID
  hvm: Allow specifying a prefered asid minimum

 docs/misc/xen-command-line.pandoc      |   2 +-
 xen/arch/x86/flushtlb.c                |  22 +--
 xen/arch/x86/hvm/asid.c                | 177 +++++++++++--------------
 xen/arch/x86/hvm/emulate.c             |   2 +-
 xen/arch/x86/hvm/hvm.c                 |  25 +++-
 xen/arch/x86/hvm/nestedhvm.c           |   7 +-
 xen/arch/x86/hvm/svm/asid.c            |  67 +++++++---
 xen/arch/x86/hvm/svm/nestedsvm.c       |   2 +-
 xen/arch/x86/hvm/svm/svm.c             |  47 +++++--
 xen/arch/x86/hvm/svm/svm.h             |   4 -
 xen/arch/x86/hvm/vmx/vmcs.c            |   6 +-
 xen/arch/x86/hvm/vmx/vmx.c             |  77 ++++++-----
 xen/arch/x86/hvm/vmx/vvmx.c            |   4 +-
 xen/arch/x86/include/asm/flushtlb.h    |   7 -
 xen/arch/x86/include/asm/hvm/asid.h    |  27 ++--
 xen/arch/x86/include/asm/hvm/domain.h  |   1 +
 xen/arch/x86/include/asm/hvm/hvm.h     |  15 +--
 xen/arch/x86/include/asm/hvm/svm.h     |   5 +
 xen/arch/x86/include/asm/hvm/vcpu.h    |  10 +-
 xen/arch/x86/include/asm/hvm/vmx/vmx.h |  23 +++-
 xen/arch/x86/mm/hap/hap.c              |   9 +-
 xen/arch/x86/mm/p2m.c                  |   7 +-
 xen/arch/x86/mm/paging.c               |   2 +-
 xen/arch/x86/mm/shadow/multi.c         |  12 +-
 xen/common/domain.c                    |   8 ++
 xen/common/sched/core.c                |   5 +
 xen/include/xen/sched.h                |   6 +
 27 files changed, 319 insertions(+), 260 deletions(-)

-- 
2.52.0



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* [RFC PATCH 1/7] vmx: Introduce vcpu single context VPID invalidation
  2026-04-15 13:32 [RFC PATCH 0/7] x86: Use a single fixed ASID per domain Teddy Astie
@ 2026-04-15 13:32 ` Teddy Astie
  2026-05-04 15:41   ` Jan Beulich
  2026-04-15 13:32 ` [RFC PATCH 2/7] common: Track latest pCPU that ran the vCPU Teddy Astie
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné

Introduce vpid_sync_vcpu_context() to do a single-context invalidation on
the VPID attached to the vCPU, as an alternative to the per-GVA and
all-context invalidations.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
 xen/arch/x86/include/asm/hvm/vmx/vmx.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
index da04752e17..3524cb3536 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
@@ -452,6 +452,27 @@ static inline void ept_sync_all(void)
 
 void ept_sync_domain(struct p2m_domain *p2m);
 
+static inline void vpid_sync_vcpu_context(const struct vcpu *v)
+{
+    int type = INVVPID_SINGLE_CONTEXT;
+
+    /*
+     * Use single-context invalidation directly when the hardware
+     * supports it.
+     */
+    if ( likely(cpu_has_vmx_vpid_invvpid_single_context) )
+        goto execute_invvpid;
+
+    /*
+     * If single context invalidation is not supported, we escalate to
+     * use all context invalidation.
+     */
+    type = INVVPID_ALL_CONTEXT;
+
+execute_invvpid:
+    __invvpid(type, v->arch.hvm.n1asid.asid, 0);
+}
+
 static inline void vpid_sync_vcpu_gva(struct vcpu *v, unsigned long gva)
 {
     int type = INVVPID_INDIVIDUAL_ADDR;
-- 
2.52.0



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* [RFC PATCH 2/7] common: Track latest pCPU that ran the vCPU
  2026-04-15 13:32 [RFC PATCH 0/7] x86: Use a single fixed ASID per domain Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 1/7] vmx: Introduce vcpu single context VPID invalidation Teddy Astie
@ 2026-04-15 13:32 ` Teddy Astie
  2026-05-04 15:49   ` Jan Beulich
  2026-04-15 13:32 ` [RFC PATCH 3/7] common: Introduce needs_tlb_flush vcpu field Teddy Astie
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Jason Andryuk, Anthony PERARD, Michal Orzel, Julien Grall,
	Stefano Stabellini

Track which pCPU each vCPU of a domain last ran on. This will
be used to know whether a TLB flush is required
when the vCPU is migrated to another pCPU.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
I wonder if there is a way to move

    curr->domain->latest_vcpu[cpu] = curr->vcpu_id

into (at least more) common code?
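
One possible shape, as an untested sketch (helper name hypothetical):

    /* e.g. in xen/include/xen/sched.h */
    static inline void vcpu_mark_latest(struct vcpu *v)
    {
        v->domain->latest_vcpu[smp_processor_id()] = v->vcpu_id;
    }

so that the SVM/VMX vmenter helpers only differ in where they call it.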

 xen/arch/x86/hvm/svm/svm.c | 3 +++
 xen/arch/x86/hvm/vmx/vmx.c | 3 +++
 xen/common/domain.c        | 8 ++++++++
 xen/include/xen/sched.h    | 4 ++++
 4 files changed, 18 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index ced6166847..58e927ae04 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -967,6 +967,7 @@ void asmlinkage svm_vmenter_helper(void)
     const struct cpu_user_regs *regs = guest_cpu_user_regs();
     struct vcpu *curr = current;
     struct vmcb_struct *vmcb = curr->arch.hvm.svm.vmcb;
+    unsigned int cpu = smp_processor_id();
 
     ASSERT(hvmemul_cache_disabled(curr));
 
@@ -977,6 +978,8 @@ void asmlinkage svm_vmenter_helper(void)
 
     svm_sync_vmcb(curr, vmcb_needs_vmsave);
 
+    curr->domain->latest_vcpu[cpu] = curr->vcpu_id;
+
     vmcb->rax = regs->rax;
     vmcb->rip = regs->rip;
     vmcb->rsp = regs->rsp;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 269ca56433..ec0a790336 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -4934,6 +4934,7 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
     u32 new_asid, old_asid;
     struct hvm_vcpu_asid *p_asid;
     bool need_flush;
+    unsigned int cpu = smp_processor_id();
 
     ASSERT(hvmemul_cache_disabled(curr));
 
@@ -4977,6 +4978,8 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
     if ( unlikely(need_flush) )
         vpid_sync_all();
 
+    currd->latest_vcpu[cpu] = curr->vcpu_id;
+
     if ( paging_mode_hap(curr->domain) )
     {
         struct ept_data *ept = &p2m_get_hostp2m(currd)->ept;
diff --git a/xen/common/domain.c b/xen/common/domain.c
index bb9e210c28..7867166411 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -758,6 +758,7 @@ static void _domain_destroy(struct domain *d)
     rangeset_domain_destroy(d);
 
     free_cpumask_var(d->dirty_cpumask);
+    xfree(d->latest_vcpu);
 
     xsm_free_security_domain(d);
 
@@ -992,6 +993,13 @@ struct domain *domain_create(domid_t domid,
     if ( !zalloc_cpumask_var(&d->dirty_cpumask) )
         goto fail;
 
+    err = -ENOMEM;
+    d->latest_vcpu = xmalloc_array(int, nr_cpu_ids);
+    if ( !d->latest_vcpu )
+        goto fail;
+    for ( unsigned int i = 0; i < nr_cpu_ids; i++ )
+        d->latest_vcpu[i] = -1;
+
     rangeset_domain_initialise(d);
 
     if ( is_idle_domain(d) )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 212c7d765c..4b8ae21b51 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -567,6 +567,10 @@ struct domain
     /* Bitmask of CPUs which are holding onto this domain's state. */
     cpumask_var_t    dirty_cpumask;
 
+    /* Latest vCPU of this domain that ran on each pCPU
+     * (-1 if no vCPU has run on that pCPU yet). */
+    int *latest_vcpu;
+
     struct arch_domain arch;
 
     void *ssid; /* sHype security subject identifier */
-- 
2.52.0



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* [RFC PATCH 3/7] common: Introduce needs_tlb_flush vcpu field
  2026-04-15 13:32 [RFC PATCH 0/7] x86: Use a single fixed ASID per domain Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 1/7] vmx: Introduce vcpu single context VPID invalidation Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 2/7] common: Track latest pCPU that ran the vCPU Teddy Astie
@ 2026-04-15 13:32 ` Teddy Astie
  2026-05-04 15:54   ` Jan Beulich
  2026-04-15 13:32 ` [RFC PATCH 4/7] x86: Set v->needs_tlb_flush when needed Teddy Astie
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Teddy Astie, Dario Faggioli, Juergen Gross, George Dunlap,
	Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Julien Grall, Roger Pau Monné, Stefano Stabellini

This field is meant to be used to schedule a TLB flush on the vCPU
before waking it up. It can be set from another vCPU at any
time.

Schedule a TLB flush when the vCPU is migrated to another CPU.
This is needed as the vCPU-related TLB entries may be out of sync
with what happened on another core.

Currently, no architecture uses this mechanism; subsequent patches
wire it up for x86.
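
For illustration, an architecture consumes the flag on its VM entry path,
e.g. (as done for SVM later in this series):

    if ( test_and_clear_bool(curr->needs_tlb_flush) )
        svm_vcpu_set_tlb_control(curr);  /* flush the TLB on the next VMRUN */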

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
 xen/common/sched/core.c | 5 +++++
 xen/include/xen/sched.h | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index a57d5dd929..f8e615b3af 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1188,7 +1188,12 @@ static void sched_unit_migrate_finish(struct sched_unit *unit)
 
     /* Wake on new CPU. */
     for_each_sched_unit_vcpu ( unit, v )
+    {
+        if ( old_cpu != new_cpu )
+            /* Migrating to another CPU needs TLB flush */
+            v->needs_tlb_flush = true;
         vcpu_wake(v);
+    }
 }
 
 static bool sched_check_affinity_broken(const struct sched_unit *unit)
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 4b8ae21b51..a26c571015 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -224,6 +224,8 @@ struct vcpu
     bool             defer_shutdown;
     /* VCPU is paused following shutdown request (d->is_shutting_down)? */
     bool             paused_for_shutdown;
+    /* VCPU needs its TLB flushed before waking. */
+    bool             needs_tlb_flush;
     /* VCPU need affinity restored */
     uint8_t          affinity_broken;
 #define VCPU_AFFINITY_OVERRIDE    0x01
-- 
2.52.0



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* [RFC PATCH 4/7] x86: Set v->needs_tlb_flush when needed
  2026-04-15 13:32 [RFC PATCH 0/7] x86: Use a single fixed ASID per domain Teddy Astie
                   ` (2 preceding siblings ...)
  2026-04-15 13:32 ` [RFC PATCH 3/7] common: Introduce needs_tlb_flush vcpu field Teddy Astie
@ 2026-04-15 13:32 ` Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 5/7] x86/hvm: Flush TLB on vCPU overlaps on the same pCPU Teddy Astie
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Jason Andryuk, Tim Deegan

Set v->needs_tlb_flush wherever a TLB flush is expected
to be scheduled on the vCPU.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
The goal here is to avoid too much noise in [1], hence it currently
coexists with hvm_asid_flush_vcpu(), but [1] will drop
hvm_asid_flush_vcpu() and only keep needs_tlb_flush.

[1] x86/hvm: Transition to needs_tlb_flush logic, use  per-domain ASID

 xen/arch/x86/flushtlb.c        | 4 ++++
 xen/arch/x86/hvm/emulate.c     | 1 +
 xen/arch/x86/hvm/hvm.c         | 1 +
 xen/arch/x86/hvm/svm/svm.c     | 5 +++++
 xen/arch/x86/hvm/vmx/vmcs.c    | 1 +
 xen/arch/x86/hvm/vmx/vmx.c     | 3 +++
 xen/arch/x86/hvm/vmx/vvmx.c    | 1 +
 xen/arch/x86/mm/p2m.c          | 4 ++++
 xen/arch/x86/mm/paging.c       | 1 +
 xen/arch/x86/mm/shadow/multi.c | 1 +
 10 files changed, 22 insertions(+)

diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 23721bb52c..8ee2385bba 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -324,7 +324,11 @@ unsigned int guest_flush_tlb_flags(const struct domain *d)
 void guest_flush_tlb_mask(const struct domain *d, const cpumask_t *mask)
 {
     unsigned int flags = guest_flush_tlb_flags(d);
+    struct vcpu *v;
 
     if ( flags )
         flush_mask(mask, flags);
+
+    for_each_vcpu(d, v)
+        v->needs_tlb_flush = true;
 }
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f3aae158e9..3bc1d321cc 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2657,6 +2657,7 @@ static int cf_check hvmemul_tlb_op(
         if ( x86emul_invpcid_type(aux) != X86_INVPCID_INDIV_ADDR )
         {
             hvm_asid_flush_vcpu(current);
+            current->needs_tlb_flush = true;
             break;
         }
         aux = x86emul_invpcid_pcid(aux);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4a81afce02..0f0b0e242f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1613,6 +1613,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
     struct domain *d = v->domain;
 
     hvm_asid_flush_vcpu(v);
+    v->needs_tlb_flush = true;
 
     spin_lock_init(&v->arch.hvm.tm_lock);
     INIT_LIST_HEAD(&v->arch.hvm.tm_list);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 58e927ae04..64c08432fd 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -138,6 +138,8 @@ static void cf_check svm_update_guest_cr(
         {
             if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
                 hvm_asid_flush_vcpu(v);
+
+            v->needs_tlb_flush = true;
         }
         else if ( nestedhvm_vmswitch_in_progress(v) )
             ; /* CR3 switches during VMRUN/VMEXIT do not flush the TLB. */
@@ -944,6 +946,7 @@ static void noreturn cf_check svm_do_resume(void)
         hvm_migrate_pirqs(v);
         /* Migrating to another ASID domain.  Request a new ASID. */
         hvm_asid_flush_vcpu(v);
+        v->needs_tlb_flush = true;
     }
 
     if ( !vcpu_guestmode && !vlapic_hw_disabled(vlapic) )
@@ -2306,6 +2309,8 @@ static void cf_check svm_invlpg(struct vcpu *v, unsigned long linear)
 {
     /* Safe fallback. Take a new ASID. */
     hvm_asid_flush_vcpu(v);
+    /* Schedule a tlb flush on the VCPU. */
+    v->needs_tlb_flush = true;
 }
 
 static bool cf_check svm_get_pending_event(
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 8e52ef4d49..4efe13e07f 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1904,6 +1904,7 @@ void cf_check vmx_do_resume(void)
         v->arch.hvm.vmx.hostenv_migrated = 1;
 
         hvm_asid_flush_vcpu(v);
+        v->needs_tlb_flush = true;
     }
 
     debug_state = v->domain->debugger_attached
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index ec0a790336..0e4f9f9c3d 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1511,6 +1511,7 @@ static void cf_check vmx_handle_cd(struct vcpu *v, unsigned long value)
 
             wbinvd();               /* flush possibly polluted cache */
             hvm_asid_flush_vcpu(v); /* invalidate memory type cached in TLB */
+            v->needs_tlb_flush = true; /* invalidate memory type cached in TLB */
             v->arch.hvm.vmx.cache_mode = CACHE_MODE_NO_FILL;
         }
         else
@@ -1520,6 +1521,7 @@ static void cf_check vmx_handle_cd(struct vcpu *v, unsigned long value)
             if ( !is_iommu_enabled(v->domain) || iommu_snoop )
                 vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
             hvm_asid_flush_vcpu(v); /* no need to flush cache */
+            v->needs_tlb_flush = true;
         }
     }
 }
@@ -1872,6 +1874,7 @@ static void cf_check vmx_update_guest_cr(
 
         if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
             hvm_asid_flush_vcpu(v);
+        v->needs_tlb_flush = true;
         break;
 
     default:
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index e4cdfe55c1..16d6f1d61b 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1254,6 +1254,7 @@ static void virtual_vmentry(struct cpu_user_regs *regs)
         if ( nvmx->guest_vpid != new_vpid )
         {
             hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
+            v->needs_tlb_flush = true;
             nvmx->guest_vpid = new_vpid;
         }
     }
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fddecdf978..910623ac93 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -25,6 +25,7 @@
 #include <asm/p2m.h>
 #include <asm/mem_sharing.h>
 #include <asm/hvm/nestedhvm.h>
+#include <asm/hvm/vcpu.h>
 #include <asm/altp2m.h>
 #include <asm/vm_event.h>
 #include <xsm/xsm.h>
@@ -1439,6 +1440,7 @@ p2m_flush(struct vcpu *v, struct p2m_domain *p2m)
     vcpu_nestedhvm(v).nv_p2m = NULL;
     p2m_flush_table(p2m);
     hvm_asid_flush_vcpu(v);
+    v->needs_tlb_flush = true;
 }
 
 void
@@ -1498,6 +1500,7 @@ static void assign_np2m(struct vcpu *v, struct p2m_domain *p2m)
 static void nvcpu_flush(struct vcpu *v)
 {
     hvm_asid_flush_vcpu(v);
+    v->needs_tlb_flush = true;
     vcpu_nestedhvm(v).stale_np2m = true;
 }
 
@@ -1618,6 +1621,7 @@ void np2m_schedule(int dir)
             {
                 /* This vCPU's np2m was flushed while it was not runnable */
                 hvm_asid_flush_core();
+                curr->needs_tlb_flush = true;
                 vcpu_nestedhvm(curr).nv_p2m = NULL;
             }
             else
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 2396f81ad5..b0b3bef753 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -939,6 +939,7 @@ void paging_update_nestedmode(struct vcpu *v)
         /* TODO: shadow-on-shadow */
         v->arch.paging.nestedmode = NULL;
     hvm_asid_flush_vcpu(v);
+    v->needs_tlb_flush = true;
 }
 
 int __init paging_set_allocation(struct domain *d, unsigned int pages,
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 80cd3299fa..2df2842138 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3165,6 +3165,7 @@ sh_update_linear_entries(struct vcpu *v)
      * without this change, it would fetch the wrong value due to a stale TLB.
      */
     sh_flush_local(d);
+    v->needs_tlb_flush = true;
 }
 
 static pagetable_t cf_check sh_update_cr3(struct vcpu *v, bool noflush)
-- 
2.52.0



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* [RFC PATCH 5/7] x86/hvm: Flush TLB on vCPU overlaps on the same pCPU
  2026-04-15 13:32 [RFC PATCH 0/7] x86: Use a single fixed ASID per domain Teddy Astie
                   ` (3 preceding siblings ...)
  2026-04-15 13:32 ` [RFC PATCH 4/7] x86: Set v->needs_tlb_flush when needed Teddy Astie
@ 2026-04-15 13:32 ` Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 6/7] x86/hvm: Transition to needs_tlb_flush logic, use per-domain ASID Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 7/7] hvm: Allow specifying a prefered asid minimum Teddy Astie
  6 siblings, 0 replies; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné,
	Jason Andryuk

When using the same ASID/VPID for all vCPUs of a domain, we need
to make sure that a context switch between 2 vCPUs of the same
domain on the same pCPU doesn't miss a TLB flush, as they may be
using different CR3s.

Flush the TLB if the latest vCPU that ran on this core differs
from the one we are about to run.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
Would it be preferable to move this to common code (or shared logic for x86)?

 xen/arch/x86/hvm/svm/svm.c | 8 ++++++++
 xen/arch/x86/hvm/vmx/vmx.c | 7 +++++++
 2 files changed, 15 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 64c08432fd..8714fb18ec 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -981,6 +981,14 @@ void asmlinkage svm_vmenter_helper(void)
 
     svm_sync_vmcb(curr, vmcb_needs_vmsave);
 
+    /*
+     * Check if we were the latest vCPU of this domain that ran on this pCPU.
+     * If not, flush the TLB, as the TLB entries are the ones of the previous
+     * vCPU. vCPU migration from one CPU to another always implies a TLB flush.
+     */
+    if ( curr->domain->latest_vcpu[cpu] != curr->vcpu_id )
+        curr->needs_tlb_flush = true;
+
     curr->domain->latest_vcpu[cpu] = curr->vcpu_id;
 
     vmcb->rax = regs->rax;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 0e4f9f9c3d..dceff2f221 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -4980,6 +4980,13 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
 
     if ( unlikely(need_flush) )
         vpid_sync_all();
+    /*
+     * Check if we were the latest vCPU of this domain that ran on this pCPU.
+     * If not, flush the TLB, as the TLB entries are the ones of the previous
+     * vCPU. vCPU migration from one CPU to another always implies a TLB flush.
+     */
+    if ( currd->latest_vcpu[cpu] != curr->vcpu_id )
+        curr->needs_tlb_flush = true;
 
     currd->latest_vcpu[cpu] = curr->vcpu_id;
 
-- 
2.52.0



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* [RFC PATCH 6/7] x86/hvm: Transition to needs_tlb_flush logic, use per-domain ASID
  2026-04-15 13:32 [RFC PATCH 0/7] x86: Use a single fixed ASID per domain Teddy Astie
                   ` (4 preceding siblings ...)
  2026-04-15 13:32 ` [RFC PATCH 5/7] x86/hvm: Flush TLB on vCPU overlaps on the same pCPU Teddy Astie
@ 2026-04-15 13:32 ` Teddy Astie
  2026-04-15 13:32 ` [RFC PATCH 7/7] hvm: Allow specifying a prefered asid minimum Teddy Astie
  6 siblings, 0 replies; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel
  Cc: Teddy Astie, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, Jason Andryuk, Tim Deegan

Change the ASID model so that all vCPUs of a domain share the same ASID,
as required by AMD SEV and broadcast TLB flushing features (AMD INVLPGB).

ASID 1 is reserved as a placeholder for "no domain ASID", and is used when
either ASIDs are not supported or no more ASIDs are available for use.
In this case, we always flush the TLB when switching from and to such a
domain's vCPUs.

Moreover, centralize the TLB flushing logic around needs_tlb_flush: if a
full TLB flush needs to be performed for the vCPU, it is done either
through SVM tlb_control or VMX INVVPID before entering the guest.

As a result, drop the ASID tickling logic, which is now redundant with the
needs_tlb_flush mechanism introduced previously. Also take the opportunity
to drop some helpers that are unused now that FLUSH_HVM_ASID_CORE is gone.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
Nested virt makes some aspects tricky and hard to reason about,
especially since there are 2 ASIDs in that case and the TLB flushing
logic is complex.
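
For reference, the ASID/VPID loaded at VM entry is picked as follows
(vmx_vmenter_helper below; SVM is analogous):

    if ( nestedhvm_vcpu_in_guestmode(curr) )
        p_asid = &vcpu_nestedhvm(curr).nv_n2asid;  /* L2: nested ASID */
    else
        p_asid = &currd->arch.hvm.asid;            /* L1: per-domain ASID */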

 docs/misc/xen-command-line.pandoc      |   2 +-
 xen/arch/x86/flushtlb.c                |  18 +--
 xen/arch/x86/hvm/asid.c                | 169 +++++++++++--------------
 xen/arch/x86/hvm/emulate.c             |   1 -
 xen/arch/x86/hvm/hvm.c                 |  24 +++-
 xen/arch/x86/hvm/nestedhvm.c           |   7 +-
 xen/arch/x86/hvm/svm/asid.c            |  67 +++++++---
 xen/arch/x86/hvm/svm/nestedsvm.c       |   2 +-
 xen/arch/x86/hvm/svm/svm.c             |  31 +++--
 xen/arch/x86/hvm/svm/svm.h             |   4 -
 xen/arch/x86/hvm/vmx/vmcs.c            |   5 +-
 xen/arch/x86/hvm/vmx/vmx.c             |  64 +++++-----
 xen/arch/x86/hvm/vmx/vvmx.c            |   3 +-
 xen/arch/x86/include/asm/flushtlb.h    |   7 -
 xen/arch/x86/include/asm/hvm/asid.h    |  26 ++--
 xen/arch/x86/include/asm/hvm/domain.h  |   1 +
 xen/arch/x86/include/asm/hvm/hvm.h     |  15 +--
 xen/arch/x86/include/asm/hvm/svm.h     |   5 +
 xen/arch/x86/include/asm/hvm/vcpu.h    |  10 +-
 xen/arch/x86/include/asm/hvm/vmx/vmx.h |   4 +-
 xen/arch/x86/mm/hap/hap.c              |   9 +-
 xen/arch/x86/mm/p2m.c                  |   3 -
 xen/arch/x86/mm/paging.c               |   1 -
 xen/arch/x86/mm/shadow/multi.c         |  11 +-
 24 files changed, 228 insertions(+), 261 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 6c77129732..099b718df9 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -208,7 +208,7 @@ to appropriate auditing by Xen.  Argo is disabled by default.
 > Default: `true`
 
 Permit Xen to use Address Space Identifiers.  This is an optimisation which
-tags the TLB entries with an ID per vcpu.  This allows for guest TLB flushes
+tags the TLB entries with an ID per domain.  This allows for guest TLB flushes
 to be performed without the overhead of a complete TLB flush.
 
 ### async-show-all (x86)
diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 8ee2385bba..f60d2df8fd 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -13,6 +13,7 @@
 #include <xen/softirq.h>
 #include <asm/cache.h>
 #include <asm/flushtlb.h>
+#include <asm/hvm/hvm.h>
 #include <asm/invpcid.h>
 #include <asm/nops.h>
 #include <asm/page.h>
@@ -117,7 +118,6 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
 
     if ( tlb_clk_enabled )
         t = pre_flush();
-    hvm_flush_guest_tlbs();
 
     old_cr4 = read_cr4();
     ASSERT(!(old_cr4 & X86_CR4_PCIDE) || !(old_cr4 & X86_CR4_PGE));
@@ -221,9 +221,6 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
             do_tlb_flush();
     }
 
-    if ( flags & FLUSH_HVM_ASID_CORE )
-        hvm_flush_guest_tlbs();
-
     if ( flags & (FLUSH_CACHE_EVICT | FLUSH_CACHE_WRITEBACK) )
     {
         const struct cpuinfo_x86 *c = &current_cpu_data;
@@ -313,21 +310,12 @@ void cache_writeback(const void *addr, unsigned int size)
     asm volatile ("sfence" ::: "memory");
 }
 
-unsigned int guest_flush_tlb_flags(const struct domain *d)
-{
-    bool shadow = paging_mode_shadow(d);
-    bool asid = is_hvm_domain(d) && (cpu_has_svm || shadow);
-
-    return (shadow ? FLUSH_TLB : 0) | (asid ? FLUSH_HVM_ASID_CORE : 0);
-}
-
 void guest_flush_tlb_mask(const struct domain *d, const cpumask_t *mask)
 {
-    unsigned int flags = guest_flush_tlb_flags(d);
     struct vcpu *v;
 
-    if ( flags )
-        flush_mask(mask, flags);
+    if ( paging_mode_shadow(d) )
+        flush_tlb_mask(mask);
 
     for_each_vcpu(d, v)
         v->needs_tlb_flush = true;
diff --git a/xen/arch/x86/hvm/asid.c b/xen/arch/x86/hvm/asid.c
index 935cae3901..1a21125161 100644
--- a/xen/arch/x86/hvm/asid.c
+++ b/xen/arch/x86/hvm/asid.c
@@ -5,138 +5,121 @@
  * Copyright (c) 2009, Citrix Systems, Inc.
  */
 
+#include <xen/errno.h>
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/param.h>
-#include <xen/sched.h>
-#include <xen/smp.h>
-#include <xen/percpu.h>
+#include <xen/spinlock.h>
+#include <xen/xvmalloc.h>
+
+#include <asm/bitops.h>
 #include <asm/hvm/asid.h>
 
 /* Xen command-line option to enable ASIDs */
 static bool __read_mostly opt_asid_enabled = true;
 boolean_param("asid", opt_asid_enabled);
 
+bool __read_mostly asid_enabled = false;
+static unsigned long __ro_after_init *asid_bitmap;
+static unsigned long __ro_after_init asid_count;
+static DEFINE_SPINLOCK(asid_lock);
+
 /*
- * ASIDs partition the physical TLB.  In the current implementation ASIDs are
- * introduced to reduce the number of TLB flushes.  Each time the guest's
- * virtual address space changes (e.g. due to an INVLPG, MOV-TO-{CR3, CR4}
- * operation), instead of flushing the TLB, a new ASID is assigned.  This
- * reduces the number of TLB flushes to at most 1/#ASIDs.  The biggest
- * advantage is that hot parts of the hypervisor's code and data retain in
- * the TLB.
- *
  * Sketch of the Implementation:
+ * ASIDs are assigned uniquely per domain and don't change during the lifecycle of the
+ * domain. Once vCPUs are initialized and up, we assign the same ASID to all vCPUs
+ * of that domain at their first VMRUN. In order to process a TLB flush on a vCPU, we
+ * set needs_tlb_flush to schedule a TLB flush for the next VMRUN (e.g. using the TLB
+ * control field of the VMCB).
  *
- * ASIDs are a CPU-local resource.  As preemption of ASIDs is not possible,
- * ASIDs are assigned in a round-robin scheme.  To minimize the overhead of
- * ASID invalidation, at the time of a TLB flush,  ASIDs are tagged with a
- * 64-bit generation.  Only on a generation overflow the code needs to
- * invalidate all ASID information stored at the VCPUs with are run on the
- * specific physical processor.  This overflow appears after about 2^80
- * host processor cycles, so we do not optimize this case, but simply disable
- * ASID useage to retain correctness.
+ * We reserve ASID=1 as the ASID used when no other is available (or when ASID use is
+ * disabled). Multiple domains may use this ASID, thus we need to systematically
+ * flush the TLB when switching between vCPUs with ASID=1.
  */
 
-/* Per-CPU ASID management. */
-struct hvm_asid_data {
-   uint64_t core_asid_generation;
-   uint32_t next_asid;
-   uint32_t max_asid;
-   bool disabled;
-};
-
-static DEFINE_PER_CPU(struct hvm_asid_data, hvm_asid_data);
-
-void hvm_asid_init(unsigned int nasids)
+int __init hvm_asid_init(unsigned long nasids)
 {
-    static int8_t __ro_after_init g_disabled = -1;
-    struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
+    ASSERT(nasids);
 
-    data->max_asid = nasids - 1;
-    data->disabled = !opt_asid_enabled || (nasids <= 1);
+    asid_count = nasids;
+    asid_enabled = opt_asid_enabled && (nasids > 1);
 
-    if ( g_disabled < 0 )
-    {
-        g_disabled = data->disabled;
-        printk("HVM: ASIDs %sabled\n", data->disabled ? "dis" : "en");
-    }
-    else if ( g_disabled != data->disabled )
-        printk("HVM: CPU%u: ASIDs %sabled\n", smp_processor_id(),
-               data->disabled ? "dis" : "en");
+    asid_bitmap = xvzalloc_array(unsigned long, BITS_TO_LONGS(asid_count + 1));
+    if ( !asid_bitmap )
+        return -ENOMEM;
 
-    /* Zero indicates 'invalid generation', so we start the count at one. */
-    data->core_asid_generation = 1;
+    printk("HVM: ASIDs %sabled (count=%lu)\n", asid_enabled ? "en" : "dis", asid_count);
 
-    /* Zero indicates 'ASIDs disabled', so we start the count at one. */
-    data->next_asid = 1;
-}
+    /* ASIDs 0 and 1 are reserved; mark them as permanently used. */
+    set_bit(0, asid_bitmap);
+    set_bit(1, asid_bitmap);
 
-void hvm_asid_flush_vcpu_asid(struct hvm_vcpu_asid *asid)
-{
-    write_atomic(&asid->generation, 0);
+    return 0;
 }
 
-void hvm_asid_flush_vcpu(struct vcpu *v)
+int hvm_asid_alloc(struct hvm_asid *asid)
 {
-    hvm_asid_flush_vcpu_asid(&v->arch.hvm.n1asid);
-    hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
-}
+    unsigned long new_asid;
 
-void hvm_asid_flush_core(void)
-{
-    struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
+    if ( !asid_enabled )
+    {
+        asid->asid = 1;
+        return 0;
+    }
 
-    if ( data->disabled )
-        return;
+    spin_lock(&asid_lock);
+    new_asid = find_first_zero_bit(asid_bitmap, asid_count);
+    if ( new_asid >= asid_count )
+    {
+        spin_unlock(&asid_lock);
+        return -ENOSPC;
+    }
 
-    if ( likely(++data->core_asid_generation != 0) )
-        return;
+    set_bit(new_asid, asid_bitmap);
 
-    /*
-     * ASID generations are 64 bit.  Overflow of generations never happens.
-     * For safety, we simply disable ASIDs, so correctness is established; it
-     * only runs a bit slower.
-     */
-    printk("HVM: ASID generation overrun. Disabling ASIDs.\n");
-    data->disabled = 1;
+    asid->asid = new_asid;
+    spin_unlock(&asid_lock);
+    return 0;
 }
 
-bool hvm_asid_handle_vmenter(struct hvm_vcpu_asid *asid)
+int hvm_asid_alloc_range(struct hvm_asid *asid, unsigned long min, unsigned long max)
 {
-    struct hvm_asid_data *data = &this_cpu(hvm_asid_data);
+    unsigned long new_asid;
+
+    if ( WARN_ON(min >= asid_count) )
+        return -EINVAL;
 
-    /* On erratum #170 systems we must flush the TLB. 
-     * Generation overruns are taken here, too. */
-    if ( data->disabled )
-        goto disabled;
+    if ( !asid_enabled )
+        return -EOPNOTSUPP;
 
-    /* Test if VCPU has valid ASID. */
-    if ( read_atomic(&asid->generation) == data->core_asid_generation )
-        return 0;
+    spin_lock(&asid_lock);
+    new_asid = find_next_zero_bit(asid_bitmap, asid_count, min);
+    if ( new_asid > max || new_asid >= asid_count )
+    {
+        spin_unlock(&asid_lock);
+        return -ENOSPC;
+    }
 
-    /* If there are no free ASIDs, need to go to a new generation */
-    if ( unlikely(data->next_asid > data->max_asid) )
-    {
-        hvm_asid_flush_core();
-        data->next_asid = 1;
-        if ( data->disabled )
-            goto disabled;
-    }
+    set_bit(new_asid, asid_bitmap);
 
-    /* Now guaranteed to be a free ASID. */
-    asid->asid = data->next_asid++;
-    write_atomic(&asid->generation, data->core_asid_generation);
+    asid->asid = new_asid;
+    spin_unlock(&asid_lock);
+    return 0;
+}
 
-    /*
-     * When we assign ASID 1, flush all TLB entries as we are starting a new
-     * generation, and all old ASID allocations are now stale. 
-     */
-    return (asid->asid == 1);
+void hvm_asid_free(struct hvm_asid *asid)
+{
+    ASSERT(asid->asid);
 
- disabled:
-    asid->asid = 0;
-    return 0;
+    if ( !asid_enabled || asid->asid == 1 )
+        return;
+
+    ASSERT(asid->asid < asid_count);
+
+    spin_lock(&asid_lock);
+    WARN_ON(!test_bit(asid->asid, asid_bitmap));
+    clear_bit(asid->asid, asid_bitmap);
+    spin_unlock(&asid_lock);
 }
 
 /*
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 3bc1d321cc..efdf236df1 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2656,7 +2656,6 @@ static int cf_check hvmemul_tlb_op(
     case x86emul_invpcid:
         if ( x86emul_invpcid_type(aux) != X86_INVPCID_INDIV_ADDR )
         {
-            hvm_asid_flush_vcpu(current);
             current->needs_tlb_flush = true;
             break;
         }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0f0b0e242f..2ab625d6e0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -520,11 +520,13 @@ static bool hvm_get_pending_event(struct vcpu *v, struct x86_event *info)
 
 void hvm_do_resume(struct vcpu *v)
 {
+    struct domain *currd = v->domain;
+
     check_wakeup_from_wait();
 
     pt_restore_timer(v);
 
-    if ( has_vpci(v->domain) && vpci_process_pending(v) )
+    if ( has_vpci(currd) && vpci_process_pending(v) )
     {
         raise_softirq(SCHEDULE_SOFTIRQ);
         return;
@@ -559,6 +561,13 @@ void hvm_do_resume(struct vcpu *v)
             v->arch.monitor.next_interrupt_enabled = false;
         }
     }
+
+    if ( unlikely(currd->arch.hvm.asid.asid == 1) )
+        /*
+         * As ASID=1 may be shared across multiple domains, we can't easily track
+         * the state of the TLB, so we always flush the TLB before resuming this vCPU.
+         */
+        v->needs_tlb_flush = true;
 }
 
 static int cf_check hvm_print_line(
@@ -714,6 +723,10 @@ int hvm_domain_initialise(struct domain *d,
     if ( rc )
         goto fail2;
 
+    rc = hvm_asid_alloc(&d->arch.hvm.asid);
+    if ( rc )
+        goto fail2;
+
     rc = alternative_call(hvm_funcs.domain_initialise, d);
     if ( rc != 0 )
         goto fail2;
@@ -794,7 +807,7 @@ void hvm_domain_destroy(struct domain *d)
         list_del(&ioport->list);
         xfree(ioport);
     }
-
+    hvm_asid_free(&d->arch.hvm.asid);
     destroy_vpci_mmcfg(d);
 }
 
@@ -1612,7 +1625,6 @@ int hvm_vcpu_initialise(struct vcpu *v)
     int rc;
     struct domain *d = v->domain;
 
-    hvm_asid_flush_vcpu(v);
     v->needs_tlb_flush = true;
 
     spin_lock_init(&v->arch.hvm.tm_lock);
@@ -4085,6 +4097,11 @@ static void hvm_s3_resume(struct domain *d)
     }
 }
 
+int hvm_flush_tlb(const unsigned long *vcpu_bitmap)
+{
+    return current->domain->arch.paging.flush_tlb(vcpu_bitmap);
+}
+
 static int hvmop_flush_tlb_all(void)
 {
     if ( !is_hvm_domain(current->domain) )
@@ -5464,4 +5481,3 @@ int hvm_copy_context_and_params(struct domain *dst, struct domain *src)
  * indent-tabs-mode: nil
  * End:
  */
-
diff --git a/xen/arch/x86/hvm/nestedhvm.c b/xen/arch/x86/hvm/nestedhvm.c
index bddd77d810..61e866b771 100644
--- a/xen/arch/x86/hvm/nestedhvm.c
+++ b/xen/arch/x86/hvm/nestedhvm.c
@@ -12,6 +12,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/event.h>  /* for local_event_delivery_(en|dis)able */
 #include <asm/paging.h> /* for paging_mode_hap() */
+#include <asm/hvm/asid.h>
 
 static unsigned long *shadow_io_bitmap[3];
 
@@ -36,13 +37,11 @@ nestedhvm_vcpu_reset(struct vcpu *v)
     hvm_unmap_guest_frame(nv->nv_vvmcx, 1);
     nv->nv_vvmcx = NULL;
     nv->nv_vvmcxaddr = INVALID_PADDR;
-    nv->nv_flushp2m = 0;
+    nv->nv_flushp2m = true;
     nv->nv_p2m = NULL;
     nv->stale_np2m = false;
     nv->np2m_generation = 0;
 
-    hvm_asid_flush_vcpu_asid(&nv->nv_n2asid);
-
     alternative_vcall(hvm_funcs.nhvm_vcpu_reset, v);
 
     /* vcpu is in host mode */
@@ -86,7 +85,7 @@ static void cf_check nestedhvm_flushtlb_ipi(void *info)
      * This is cheaper than flush_tlb_local() and has
      * the same desired effect.
      */
-    hvm_asid_flush_core();
+    WARN_ON(hvm_flush_tlb(NULL));
     vcpu_nestedhvm(v).nv_p2m = NULL;
     vcpu_nestedhvm(v).stale_np2m = true;
 }
diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c
index 53aa5d0512..44d2138895 100644
--- a/xen/arch/x86/hvm/svm/asid.c
+++ b/xen/arch/x86/hvm/svm/asid.c
@@ -1,39 +1,46 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * asid.c: handling ASIDs in SVM.
+ * asid.c: handling ASIDs/VPIDs.
  * Copyright (c) 2007, Advanced Micro Devices, Inc.
  */
 
+#include <xen/cpumask.h>
+
 #include <asm/amd.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/svm.h>
+#include <asm/processor.h>
 
 #include "svm.h"
 #include "vmcb.h"
 
-void svm_asid_init(const struct cpuinfo_x86 *c)
+void __init svm_asid_init(void)
 {
-    unsigned int nasids = 0;
+    unsigned int cpu, nasids = cpuid_ebx(0x8000000aU);
+
+    if ( !nasids )
+        nasids = 1;
 
-    /* Check for erratum #170, and leave ASIDs disabled if it's present. */
-    if ( !cpu_has_amd_erratum(c, AMD_ERRATUM_170) )
-        nasids = cpuid_ebx(0x8000000aU);
+    for_each_present_cpu(cpu)
+    {
+        /* Check for erratum #170, and leave ASIDs disabled if it's present. */
+        if ( cpu_has_amd_erratum(&cpu_data[cpu], AMD_ERRATUM_170) )
+        {
+            printk(XENLOG_WARNING "Disabling ASIDs due to erratum 170 on CPU%u\n", cpu);
+            nasids = 1;
+        }
+    }
 
-    hvm_asid_init(nasids);
+    BUG_ON(hvm_asid_init(nasids));
 }
 
 /*
- * Called directly before VMRUN.  Checks if the VCPU needs a new ASID,
- * assigns it, and if required, issues required TLB flushes.
+ * Called directly at the first VMRUN/VMENTER of a vcpu to assign the ASID/VPID.
  */
-void svm_asid_handle_vmrun(void)
+void svm_vcpu_assign_asid(struct vcpu *v)
 {
-    struct vcpu *curr = current;
-    struct vmcb_struct *vmcb = curr->arch.hvm.svm.vmcb;
-    struct hvm_vcpu_asid *p_asid =
-        nestedhvm_vcpu_in_guestmode(curr)
-        ? &vcpu_nestedhvm(curr).nv_n2asid : &curr->arch.hvm.n1asid;
-    bool need_flush = hvm_asid_handle_vmenter(p_asid);
+    struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
+    struct hvm_asid *p_asid = &v->domain->arch.hvm.asid;
 
     /* ASID 0 indicates that ASIDs are disabled. */
     if ( p_asid->asid == 0 )
@@ -44,11 +51,31 @@ void svm_asid_handle_vmrun(void)
         return;
     }
 
-    if ( vmcb_get_asid(vmcb) != p_asid->asid )
-        vmcb_set_asid(vmcb, p_asid->asid);
+    /* When ASIDs are disabled, ASID 0 is reserved, so run the guest with ASID 1 instead. */
+    vmcb_set_asid(vmcb, asid_enabled ? p_asid->asid : 1);
+}
+
+/* Call to make a TLB flush at the next VMRUN. */
+void svm_vcpu_set_tlb_control(struct vcpu *v)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
+
+    /*
+     * If the vcpu is already running, the tlb control flag may not be
+     * processed and will be cleared at the next VMEXIT, which will undo
+     * what we are trying to do.
+     */
+    WARN_ON(v != current && v->is_running);
+
+    vmcb->tlb_control =
+        cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
+}
+
+void svm_vcpu_clear_tlb_control(struct vcpu *v)
+{
+    struct vmcb_struct *vmcb = v->arch.hvm.svm.vmcb;
 
-    /* We can't rely on TLB_CTRL_FLUSH_ASID as all ASIDs are stale here. */
-    vmcb->tlb_control = need_flush ? TLB_CTRL_FLUSH_ALL : TLB_CTRL_NO_FLUSH;
+    vmcb->tlb_control = TLB_CTRL_NO_FLUSH;
 }
 
 /*
diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c
index ef6fa5d23b..d9a84f9388 100644
--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -5,6 +5,7 @@
  *
  */
 
+#include <asm/hvm/asid.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/svm.h>
 #include <asm/hvm/nestedhvm.h>
@@ -638,7 +639,6 @@ nsvm_vcpu_vmentry(struct vcpu *v, struct cpu_user_regs *regs,
     if ( svm->ns_asid != vmcb_get_asid(ns_vmcb))
     {
         nv->nv_flushp2m = 1;
-        hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
         svm->ns_asid = vmcb_get_asid(ns_vmcb);
     }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 8714fb18ec..ba22acb8a6 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -27,6 +27,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/svm.h>
+#include <asm/hvm/asid.h>
 #include <asm/i387.h>
 #include <asm/idt.h>
 #include <asm/iocap.h>
@@ -137,16 +138,17 @@ static void cf_check svm_update_guest_cr(
         if ( !nestedhvm_enabled(v->domain) )
         {
             if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
-                hvm_asid_flush_vcpu(v);
-
-            v->needs_tlb_flush = true;
+                v->needs_tlb_flush = true;
         }
         else if ( nestedhvm_vmswitch_in_progress(v) )
             ; /* CR3 switches during VMRUN/VMEXIT do not flush the TLB. */
         else if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
-            hvm_asid_flush_vcpu_asid(
-                nestedhvm_vcpu_in_guestmode(v)
-                ? &vcpu_nestedhvm(v).nv_n2asid : &v->arch.hvm.n1asid);
+        {
+            if ( nestedhvm_vcpu_in_guestmode(v) )
+                vcpu_nestedhvm(v).nv_flushp2m = true;
+            else
+                v->needs_tlb_flush = true;
+        }
         break;
     case 4:
         value = HVM_CR4_HOST_MASK;
@@ -944,8 +947,6 @@ static void noreturn cf_check svm_do_resume(void)
         v->arch.hvm.svm.launch_core = smp_processor_id();
         hvm_migrate_timers(v);
         hvm_migrate_pirqs(v);
-        /* Migrating to another ASID domain.  Request a new ASID. */
-        hvm_asid_flush_vcpu(v);
         v->needs_tlb_flush = true;
     }
 
@@ -974,8 +975,6 @@ void asmlinkage svm_vmenter_helper(void)
 
     ASSERT(hvmemul_cache_disabled(curr));
 
-    svm_asid_handle_vmrun();
-
     TRACE_TIME(TRC_HVM_VMENTRY |
                (nestedhvm_vcpu_in_guestmode(curr) ? TRC_HVM_NESTEDFLAG : 0));
 
@@ -991,6 +990,9 @@ void asmlinkage svm_vmenter_helper(void)
 
     curr->domain->latest_vcpu[cpu] = curr->vcpu_id;
 
+    if ( test_and_clear_bool(curr->needs_tlb_flush) )
+        svm_vcpu_set_tlb_control(curr);
+
     vmcb->rax = regs->rax;
     vmcb->rip = regs->rip;
     vmcb->rsp = regs->rsp;
@@ -1111,6 +1113,8 @@ static int cf_check svm_vcpu_initialise(struct vcpu *v)
         return rc;
     }
 
+    svm_vcpu_assign_asid(v);
+
     return 0;
 }
 
@@ -1536,9 +1540,6 @@ static int _svm_cpu_up(bool bsp)
     /* check for erratum 383 */
     svm_init_erratum_383(c);
 
-    /* Initialize core's ASID handling. */
-    svm_asid_init(c);
-
     /* Initialize OSVW bits to be used by guests */
     svm_host_osvw_init();
 
@@ -2293,7 +2294,7 @@ static void svm_invlpga_intercept(
 {
     svm_invlpga(linear,
                 (asid == 0)
-                ? v->arch.hvm.n1asid.asid
+                ? v->domain->arch.hvm.asid.asid
                 : vcpu_nestedhvm(v).nv_n2asid.asid);
 }
 
@@ -2315,8 +2316,6 @@ static bool cf_check is_invlpg(
 
 static void cf_check svm_invlpg(struct vcpu *v, unsigned long linear)
 {
-    /* Safe fallback. Take a new ASID. */
-    hvm_asid_flush_vcpu(v);
     /* Schedule a tlb flush on the VCPU. */
     v->needs_tlb_flush = true;
 }
@@ -2486,6 +2485,8 @@ const struct hvm_function_table * __init start_svm(void)
     svm_function_table.caps.hap_superpage_2mb = true;
     svm_function_table.caps.hap_superpage_1gb = cpu_has_page1gb;
 
+    svm_asid_init();
+
     return &svm_function_table;
 }
 
@@ -2542,6 +2543,8 @@ void asmlinkage svm_vmexit_handler(void)
                    (vlapic_get_reg(vlapic, APIC_TASKPRI) & 0x0F));
     }
 
+    svm_vcpu_clear_tlb_control(v);
+
     exit_reason = vmcb->exitcode;
 
     if ( hvm_long_mode_active(v) )
diff --git a/xen/arch/x86/hvm/svm/svm.h b/xen/arch/x86/hvm/svm/svm.h
index cfa411ad5a..901354e914 100644
--- a/xen/arch/x86/hvm/svm/svm.h
+++ b/xen/arch/x86/hvm/svm/svm.h
@@ -12,12 +12,8 @@
 #include <xen/types.h>
 
 struct cpu_user_regs;
-struct cpuinfo_x86;
 struct vcpu;
 
-void svm_asid_init(const struct cpuinfo_x86 *c);
-void svm_asid_handle_vmrun(void);
-
 unsigned long *svm_msrbit(unsigned long *msr_bitmap, uint32_t msr);
 void __update_guest_eip(struct cpu_user_regs *regs, unsigned int inst_len);
 
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4efe13e07f..88808e26e7 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -20,6 +20,7 @@
 #include <asm/current.h>
 #include <asm/flushtlb.h>
 #include <asm/hvm/hvm.h>
+#include <asm/hvm/asid.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/vmx/vmcs.h>
@@ -778,8 +779,6 @@ static int _vmx_cpu_up(bool bsp)
 
     this_cpu(vmxon) = 1;
 
-    hvm_asid_init(cpu_has_vmx_vpid ? (1u << VMCS_VPID_WIDTH) : 0);
-
     if ( cpu_has_vmx_ept )
         ept_sync_all();
 
@@ -1903,7 +1902,6 @@ void cf_check vmx_do_resume(void)
          */
         v->arch.hvm.vmx.hostenv_migrated = 1;
 
-        hvm_asid_flush_vcpu(v);
         v->needs_tlb_flush = true;
     }
 
@@ -2117,7 +2115,6 @@ void vmcs_dump_vcpu(struct vcpu *v)
          (SECONDARY_EXEC_ENABLE_VPID | SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
         printk("Virtual processor ID = 0x%04x VMfunc controls = %016lx\n",
                vmr16(VIRTUAL_PROCESSOR_ID), vmr(VM_FUNCTION_CONTROL));
-
     vmx_vmcs_exit(v);
 }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index dceff2f221..15989cf136 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -25,6 +25,7 @@
 #include <asm/fsgsbase.h>
 #include <asm/gdbsx.h>
 #include <asm/guest-msr.h>
+#include <asm/hvm/asid.h>
 #include <asm/hvm/emulate.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/monitor.h>
@@ -834,6 +835,12 @@ static void cf_check vmx_cpuid_policy_changed(struct vcpu *v)
         vmx_update_secondary_exec_control(v);
     }
 
+    if ( asid_enabled )
+        v->arch.hvm.vmx.secondary_exec_control |= SECONDARY_EXEC_ENABLE_VPID;
+    else
+        v->arch.hvm.vmx.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
+    vmx_update_secondary_exec_control(v);
+
     /*
      * We can safely pass MSR_SPEC_CTRL through to the guest, even if STIBP
      * isn't enumerated in hardware, as SPEC_CTRL_STIBP is ignored.
@@ -1510,7 +1523,6 @@ static void cf_check vmx_handle_cd(struct vcpu *v, unsigned long value)
             vmx_set_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
 
             wbinvd();               /* flush possibly polluted cache */
-            hvm_asid_flush_vcpu(v); /* invalidate memory type cached in TLB */
             v->needs_tlb_flush = true; /* invalidate memory type cached in TLB */
             v->arch.hvm.vmx.cache_mode = CACHE_MODE_NO_FILL;
         }
@@ -1520,7 +1532,6 @@ static void cf_check vmx_handle_cd(struct vcpu *v, unsigned long value)
             vmx_set_guest_pat(v, *pat);
             if ( !is_iommu_enabled(v->domain) || iommu_snoop )
                 vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
-            hvm_asid_flush_vcpu(v); /* no need to flush cache */
             v->needs_tlb_flush = true;
         }
     }
@@ -1873,7 +1884,6 @@ static void cf_check vmx_update_guest_cr(
         __vmwrite(GUEST_CR3, v->arch.hvm.hw_cr[3]);
 
         if ( !(flags & HVM_UPDATE_GUEST_CR3_NOFLUSH) )
-            hvm_asid_flush_vcpu(v);
-        v->needs_tlb_flush = true;
+            v->needs_tlb_flush = true;
         break;
 
@@ -3171,6 +3181,8 @@ const struct hvm_function_table * __init start_vmx(void)
     lbr_tsx_fixup_check();
     ler_to_fixup_check();
 
+    BUG_ON(hvm_asid_init(cpu_has_vmx_vpid ? (1u << VMCS_VPID_WIDTH) : 1));
+
     return &vmx_function_table;
 }
 
@@ -4934,9 +4946,7 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
 {
     struct vcpu *curr = current;
     struct domain *currd = curr->domain;
-    u32 new_asid, old_asid;
-    struct hvm_vcpu_asid *p_asid;
-    bool need_flush;
+    struct hvm_asid *p_asid;
     unsigned int cpu = smp_processor_id();
 
     ASSERT(hvmemul_cache_disabled(curr));
@@ -4948,38 +4958,16 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
     if ( curr->domain->arch.hvm.pi_ops.vcpu_block )
         vmx_pi_do_resume(curr);
 
-    if ( !cpu_has_vmx_vpid )
+    if ( !asid_enabled )
         goto out;
     if ( nestedhvm_vcpu_in_guestmode(curr) )
         p_asid = &vcpu_nestedhvm(curr).nv_n2asid;
     else
-        p_asid = &curr->arch.hvm.n1asid;
+        p_asid = &currd->arch.hvm.asid;
 
-    old_asid = p_asid->asid;
-    need_flush = hvm_asid_handle_vmenter(p_asid);
-    new_asid = p_asid->asid;
+    __vmwrite(VIRTUAL_PROCESSOR_ID, p_asid->asid);
 
-    if ( unlikely(new_asid != old_asid) )
-    {
-        __vmwrite(VIRTUAL_PROCESSOR_ID, new_asid);
-        if ( !old_asid && new_asid )
-        {
-            /* VPID was disabled: now enabled. */
-            curr->arch.hvm.vmx.secondary_exec_control |=
-                SECONDARY_EXEC_ENABLE_VPID;
-            vmx_update_secondary_exec_control(curr);
-        }
-        else if ( old_asid && !new_asid )
-        {
-            /* VPID was enabled: now disabled. */
-            curr->arch.hvm.vmx.secondary_exec_control &=
-                ~SECONDARY_EXEC_ENABLE_VPID;
-            vmx_update_secondary_exec_control(curr);
-        }
-    }
 
-    if ( unlikely(need_flush) )
-        vpid_sync_all();
     /*
      * Check if we were the latest vCPU of this domain that ran on this pCPU.
      * If not, flush the TLB, as the TLB entries are the ones of the previous
@@ -4993,16 +4981,21 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
     if ( paging_mode_hap(curr->domain) )
     {
         struct ept_data *ept = &p2m_get_hostp2m(currd)->ept;
-        unsigned int cpu = smp_processor_id();
         unsigned int inv = 0; /* None => Single => All */
         struct ept_data *single = NULL; /* Single eptp, iff inv == 1 */
 
+        if ( test_and_clear_bool(curr->needs_tlb_flush) )
+        {
+            inv = 1;
+            single = ept;
+        }
+
         if ( cpumask_test_cpu(cpu, ept->invalidate) )
         {
             cpumask_clear_cpu(cpu, ept->invalidate);
 
             /* Automatically invalidate all contexts if nested. */
-            inv += 1 + nestedhvm_enabled(currd);
+            inv = 1 + nestedhvm_enabled(currd);
             single = ept;
         }
 
@@ -5031,6 +5024,11 @@ bool asmlinkage vmx_vmenter_helper(const struct cpu_user_regs *regs)
             __invept(inv == 1 ? INVEPT_SINGLE_CONTEXT : INVEPT_ALL_CONTEXT,
                      inv == 1 ? single->eptp          : 0);
     }
+    else /* Shadow paging */
+    {
+        if ( test_and_clear_bool(curr->needs_tlb_flush) )
+            vpid_sync_vcpu_context(curr);
+    }
 
  out:
     if ( unlikely(curr->arch.hvm.vmx.lbr_flags & LBR_FIXUP_MASK) )
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 16d6f1d61b..7314aa5633 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1253,7 +1253,6 @@ static void virtual_vmentry(struct cpu_user_regs *regs)
 
         if ( nvmx->guest_vpid != new_vpid )
         {
-            hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(v).nv_n2asid);
             v->needs_tlb_flush = true;
             nvmx->guest_vpid = new_vpid;
         }
@@ -2053,7 +2052,7 @@ static int nvmx_handle_invvpid(struct cpu_user_regs *regs)
     case INVVPID_INDIVIDUAL_ADDR:
     case INVVPID_SINGLE_CONTEXT:
     case INVVPID_ALL_CONTEXT:
-        hvm_asid_flush_vcpu_asid(&vcpu_nestedhvm(current).nv_n2asid);
+        hvm_flush_tlb(NULL);
         break;
     default:
         vmfail(regs, VMX_INSN_INVEPT_INVVPID_INVALID_OP);
diff --git a/xen/arch/x86/include/asm/flushtlb.h b/xen/arch/x86/include/asm/flushtlb.h
index 7bcbca2b7f..d8167aca18 100644
--- a/xen/arch/x86/include/asm/flushtlb.h
+++ b/xen/arch/x86/include/asm/flushtlb.h
@@ -125,12 +125,6 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4);
 #define FLUSH_VCPU_STATE 0x1000
  /* Flush the per-cpu root page table */
 #define FLUSH_ROOT_PGTBL 0x2000
-#if CONFIG_HVM
- /* Flush all HVM guests linear TLB (using ASID/VPID) */
-#define FLUSH_HVM_ASID_CORE 0x4000
-#else
-#define FLUSH_HVM_ASID_CORE 0
-#endif
 #if defined(CONFIG_PV) || defined(CONFIG_SHADOW_PAGING)
 /*
  * Adding this to the flags passed to flush_area_mask will prevent using the
@@ -190,7 +184,6 @@ void flush_area_mask(const cpumask_t *mask, const void *va,
 
 static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache) {}
 
-unsigned int guest_flush_tlb_flags(const struct domain *d);
 void guest_flush_tlb_mask(const struct domain *d, const cpumask_t *mask);
 
 #endif /* __FLUSHTLB_H__ */
diff --git a/xen/arch/x86/include/asm/hvm/asid.h b/xen/arch/x86/include/asm/hvm/asid.h
index 25ba57e768..13ea357f70 100644
--- a/xen/arch/x86/include/asm/hvm/asid.h
+++ b/xen/arch/x86/include/asm/hvm/asid.h
@@ -8,25 +8,21 @@
 #ifndef __ASM_X86_HVM_ASID_H__
 #define __ASM_X86_HVM_ASID_H__
 
+#include <xen/stdbool.h>
+#include <xen/stdint.h>
 
-struct vcpu;
-struct hvm_vcpu_asid;
+struct hvm_asid {
+    uint32_t asid;
+};
 
-/* Initialise ASID management for the current physical CPU. */
-void hvm_asid_init(unsigned int nasids);
+extern bool asid_enabled;
 
-/* Invalidate a particular ASID allocation: forces re-allocation. */
-void hvm_asid_flush_vcpu_asid(struct hvm_vcpu_asid *asid);
+/* Initialise the Xen-wide ASID management shared across all CPUs. */
+int hvm_asid_init(unsigned long nasids);
 
-/* Invalidate all ASID allocations for specified VCPU: forces re-allocation. */
-void hvm_asid_flush_vcpu(struct vcpu *v);
-
-/* Flush all ASIDs on this processor core. */
-void hvm_asid_flush_core(void);
-
-/* Called before entry to guest context. Checks ASID allocation, returns a
- * boolean indicating whether all ASIDs must be flushed. */
-bool hvm_asid_handle_vmenter(struct hvm_vcpu_asid *asid);
+int hvm_asid_alloc(struct hvm_asid *asid);
+int hvm_asid_alloc_range(struct hvm_asid *asid, unsigned long min, unsigned long max);
+void hvm_asid_free(struct hvm_asid *asid);
 
 #endif /* __ASM_X86_HVM_ASID_H__ */
 
diff --git a/xen/arch/x86/include/asm/hvm/domain.h b/xen/arch/x86/include/asm/hvm/domain.h
index abf9bc448d..03fe6c8f7b 100644
--- a/xen/arch/x86/include/asm/hvm/domain.h
+++ b/xen/arch/x86/include/asm/hvm/domain.h
@@ -139,6 +139,7 @@ struct hvm_domain {
     } write_map;
 
     struct hvm_pi_ops pi_ops;
+    struct hvm_asid asid;
 
     union {
         struct vmx_domain vmx;
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index e7c1364802..935d9e7548 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -274,6 +274,8 @@ int hvm_domain_initialise(struct domain *d,
 void hvm_domain_relinquish_resources(struct domain *d);
 void hvm_domain_destroy(struct domain *d);
 
+int hvm_flush_tlb(const unsigned long *vcpu_bitmap);
+
 int hvm_vcpu_initialise(struct vcpu *v);
 void hvm_vcpu_destroy(struct vcpu *v);
 void hvm_vcpu_down(struct vcpu *v);
@@ -497,17 +499,6 @@ static inline void hvm_set_tsc_offset(struct vcpu *v, uint64_t offset)
     alternative_vcall(hvm_funcs.set_tsc_offset, v, offset);
 }
 
-/*
- * Called to ensure than all guest-specific mappings in a tagged TLB are 
- * flushed; does *not* flush Xen's TLB entries, and on processors without a 
- * tagged TLB it will be a noop.
- */
-static inline void hvm_flush_guest_tlbs(void)
-{
-    if ( hvm_enabled )
-        hvm_asid_flush_core();
-}
-
 static inline unsigned int
 hvm_get_cpl(struct vcpu *v)
 {
@@ -901,8 +892,6 @@ static inline int hvm_cpu_up(void)
 
 static inline void hvm_cpu_down(void) {}
 
-static inline void hvm_flush_guest_tlbs(void) {}
-
 static inline void hvm_invlpg(const struct vcpu *v, unsigned long linear)
 {
     ASSERT_UNREACHABLE();
diff --git a/xen/arch/x86/include/asm/hvm/svm.h b/xen/arch/x86/include/asm/hvm/svm.h
index a35a61273b..1877bb149a 100644
--- a/xen/arch/x86/include/asm/hvm/svm.h
+++ b/xen/arch/x86/include/asm/hvm/svm.h
@@ -9,6 +9,11 @@
 #ifndef __ASM_X86_HVM_SVM_H__
 #define __ASM_X86_HVM_SVM_H__
 
+void svm_asid_init(void);
+void svm_vcpu_assign_asid(struct vcpu *v);
+void svm_vcpu_set_tlb_control(struct vcpu *v);
+void svm_vcpu_clear_tlb_control(struct vcpu *v);
+
 /*
  * PV context switch helpers.  Prefetching the VMCB area itself has been shown
  * to be useful for performance.
diff --git a/xen/arch/x86/include/asm/hvm/vcpu.h b/xen/arch/x86/include/asm/hvm/vcpu.h
index 836138a4a6..7472cdd0f2 100644
--- a/xen/arch/x86/include/asm/hvm/vcpu.h
+++ b/xen/arch/x86/include/asm/hvm/vcpu.h
@@ -9,6 +9,7 @@
 #define __ASM_X86_HVM_VCPU_H__
 
 #include <xen/tasklet.h>
+#include <asm/hvm/asid.h>
 #include <asm/hvm/vlapic.h>
 #include <asm/hvm/vmx/vmcs.h>
 #include <asm/hvm/vmx/vvmx.h>
@@ -16,11 +17,6 @@
 #include <asm/mtrr.h>
 #include <public/hvm/ioreq.h>
 
-struct hvm_vcpu_asid {
-    uint64_t generation;
-    uint32_t asid;
-};
-
 struct hvm_vcpu_io {
     /*
      * HVM emulation:
@@ -78,7 +74,7 @@ struct nestedvcpu {
     bool stale_np2m; /* True when p2m_base in VMCx02 is no longer valid */
     uint64_t np2m_generation;
 
-    struct hvm_vcpu_asid nv_n2asid;
+    struct hvm_asid nv_n2asid;
 
     bool nv_vmentry_pending;
     bool nv_vmexit_pending;
@@ -142,8 +138,6 @@ struct hvm_vcpu {
     /* (MFN) hypervisor page table */
     pagetable_t         monitor_table;
 
-    struct hvm_vcpu_asid n1asid;
-
     u64                 msr_tsc_adjust;
 
     union {
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
index 3524cb3536..4f307ae123 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
@@ -470,7 +470,7 @@ static inline void vpid_sync_vcpu_context(const struct vcpu *v)
     type = INVVPID_ALL_CONTEXT;
 
 execute_invvpid:
-    __invvpid(type, v->arch.hvm.n1asid.asid, 0);
+    __invvpid(type, v->domain->arch.hvm.asid.asid, 0);
 }
 
 static inline void vpid_sync_vcpu_gva(struct vcpu *v, unsigned long gva)
@@ -494,7 +494,7 @@ static inline void vpid_sync_vcpu_gva(struct vcpu *v, unsigned long gva)
         type = INVVPID_ALL_CONTEXT;
 
 execute_invvpid:
-    __invvpid(type, v->arch.hvm.n1asid.asid, (u64)gva);
+    __invvpid(type, v->domain->arch.hvm.asid.asid, (u64)gva);
 }
 
 static inline void vpid_sync_all(void)
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 5ccb80bda5..156734d3e0 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -27,6 +27,7 @@
 #include <asm/p2m.h>
 #include <asm/domain.h>
 #include <xen/numa.h>
+#include <asm/hvm/asid.h>
 #include <asm/hvm/nestedhvm.h>
 #include <public/sched.h>
 
@@ -750,18 +751,16 @@ static bool cf_check flush_tlb(const unsigned long *vcpu_bitmap)
         if ( !flush_vcpu(v, vcpu_bitmap) )
             continue;
 
-        hvm_asid_flush_vcpu(v);
-
         cpu = read_atomic(&v->dirty_cpu);
         if ( cpu != this_cpu && is_vcpu_dirty_cpu(cpu) && v->is_running )
             __cpumask_set_cpu(cpu, mask);
     }
 
+    guest_flush_tlb_mask(d, mask);
+
     /*
      * Trigger a vmexit on all pCPUs with dirty vCPU state in order to force an
-     * ASID/VPID change and hence accomplish a guest TLB flush. Note that vCPUs
-     * not currently running will already be flushed when scheduled because of
-     * the ASID tickle done in the loop above.
+     * ASID/VPID flush and hence accomplish a guest TLB flush.
      */
     on_selected_cpus(mask, NULL, NULL, 0);
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 910623ac93..6eb6035f09 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1439,7 +1439,6 @@ p2m_flush(struct vcpu *v, struct p2m_domain *p2m)
     ASSERT(v->domain == p2m->domain);
     vcpu_nestedhvm(v).nv_p2m = NULL;
     p2m_flush_table(p2m);
-    hvm_asid_flush_vcpu(v);
     v->needs_tlb_flush = true;
 }
 
@@ -1499,7 +1498,6 @@ static void assign_np2m(struct vcpu *v, struct p2m_domain *p2m)
 
 static void nvcpu_flush(struct vcpu *v)
 {
-    hvm_asid_flush_vcpu(v);
     v->needs_tlb_flush = true;
     vcpu_nestedhvm(v).stale_np2m = true;
 }
@@ -1620,7 +1618,6 @@ void np2m_schedule(int dir)
             if ( !np2m_valid )
             {
                 /* This vCPU's np2m was flushed while it was not runnable */
-                hvm_asid_flush_core();
                 curr->needs_tlb_flush = true;
                 vcpu_nestedhvm(curr).nv_p2m = NULL;
             }
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index b0b3bef753..af5584a777 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -938,7 +938,6 @@ void paging_update_nestedmode(struct vcpu *v)
     else
         /* TODO: shadow-on-shadow */
         v->arch.paging.nestedmode = NULL;
-    hvm_asid_flush_vcpu(v);
     v->needs_tlb_flush = true;
 }
 
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 2df2842138..0eef92bc5b 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -81,12 +81,6 @@ const char *const fetch_type_names[] = {
 
 static pagetable_t cf_check sh_update_cr3(struct vcpu *v, bool noflush);
 
-/* Helper to perform a local TLB flush. */
-static void sh_flush_local(const struct domain *d)
-{
-    flush_local(guest_flush_tlb_flags(d));
-}
-
 #if GUEST_PAGING_LEVELS >= 4 && defined(CONFIG_PV32)
 #define ASSERT_VALID_L2(t) \
     ASSERT((t) == SH_type_l2_shadow || (t) == SH_type_l2h_shadow)
@@ -2947,7 +2941,8 @@ static bool cf_check sh_invlpg(struct vcpu *v, unsigned long linear)
     if ( mfn_to_page(sl1mfn)->u.sh.type
          == SH_type_fl1_shadow )
     {
-        sh_flush_local(v->domain);
+        flush_tlb_local();
+        v->needs_tlb_flush = true;
         return false;
     }
 
@@ -3164,7 +3159,7 @@ sh_update_linear_entries(struct vcpu *v)
      * linear pagetable to read a top-level shadow page table entry. But,
      * without this change, it would fetch the wrong value due to a stale TLB.
      */
-    sh_flush_local(d);
+    flush_tlb_local();
     v->needs_tlb_flush = true;
 }
 
-- 
2.52.0
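
The per-domain ASID introduced above is allocated once per domain and
shared by all of its vCPUs; per the cover letter, ASID 1 serves as the
always-flush placeholder when no ASID can be allocated. A minimal sketch
of the intended lifetime (the exact call sites are an assumption; they
are not shown in this excerpt):

    /* Sketch (assumed call sites): allocate the domain-wide ASID at
     * domain construction, fall back to the shared placeholder ASID 1
     * on failure, and release it at teardown. */
    int hvm_domain_initialise(struct domain *d /* , ... */)
    {
        if ( hvm_asid_alloc(&d->arch.hvm.asid) )
            /* Out of ASIDs (or disabled): flush on every context switch. */
            d->arch.hvm.asid.asid = 1;
        /* ... */
        return 0;
    }

    void hvm_domain_destroy(struct domain *d)
    {
        /* Assumes the allocator treats the placeholder ASID 1 as reserved. */
        hvm_asid_free(&d->arch.hvm.asid);
        /* ... */
    }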



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* [RFC PATCH 7/7] hvm: Allow specifying a preferred ASID minimum
  2026-04-15 13:32 [RFC PATCH 0/7] x86: Use a single fixed ASID per domain Teddy Astie
                   ` (5 preceding siblings ...)
  2026-04-15 13:32 ` [RFC PATCH 6/7] x86/hvm: Transition to needs_tlb_flush logic, use per-domain ASID Teddy Astie
@ 2026-04-15 13:32 ` Teddy Astie
  6 siblings, 0 replies; 13+ messages in thread
From: Teddy Astie @ 2026-04-15 13:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Teddy Astie, Jan Beulich, Andrew Cooper, Roger Pau Monné

To avoid clobbering the ASIDs below the SEV-enabled guest maximum, we
want to allocate, whenever possible, ASIDs above a "preferred minimum"
and fall back to the range below it otherwise.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
---
 xen/arch/x86/hvm/asid.c             | 8 ++++++++
 xen/arch/x86/include/asm/hvm/asid.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/xen/arch/x86/hvm/asid.c b/xen/arch/x86/hvm/asid.c
index 1a21125161..4ad3200c96 100644
--- a/xen/arch/x86/hvm/asid.c
+++ b/xen/arch/x86/hvm/asid.c
@@ -22,6 +22,9 @@ boolean_param("asid", opt_asid_enabled);
 bool __read_mostly asid_enabled = false;
 static unsigned long __ro_after_init *asid_bitmap;
 static unsigned long __ro_after_init asid_count;
+
+/* Default minimum ASID to use */
+unsigned long __read_mostly asid_default_min = 0;
 static DEFINE_SPINLOCK(asid_lock);
 
 /*
@@ -67,6 +70,11 @@ int hvm_asid_alloc(struct hvm_asid *asid)
         return 0;
     }
 
+    /* Try to allocate above default minimum */
+    if ( asid_default_min &&
+         !hvm_asid_alloc_range(asid, asid_default_min, asid_count) )
+        return 0;
+
     spin_lock(&asid_lock);
     new_asid = find_first_zero_bit(asid_bitmap, asid_count);
     if ( new_asid >= asid_count )
diff --git a/xen/arch/x86/include/asm/hvm/asid.h b/xen/arch/x86/include/asm/hvm/asid.h
index 13ea357f70..e989ebbe8c 100644
--- a/xen/arch/x86/include/asm/hvm/asid.h
+++ b/xen/arch/x86/include/asm/hvm/asid.h
@@ -16,6 +16,7 @@ struct hvm_asid {
 };
 
 extern bool asid_enabled;
+extern unsigned long asid_default_min;
 
 /* Initialise the Xen-wide ASID management shared across all CPUs. */
 int hvm_asid_init(unsigned long nasids);
-- 
2.52.0
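
To illustrate how this knob is meant to be consumed later: SEV support
is not part of this series, so the function and parameter names below
are hypothetical placeholders, but the idea is to steer ordinary
allocations above the ASID range reserved for SEV(-ES) guests.

    /* Hypothetical SEV bring-up code (names are placeholders): keep
     * the low ASID space free for SEV-ES guests by preferring ASIDs
     * above it for everybody else. */
    static void __init svm_sev_setup_asids(unsigned long max_sev_asid)
    {
        asid_default_min = max_sev_asid + 1;
    }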



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




* Re: [RFC PATCH 1/7] vmx: Introduce vcpu single context VPID invalidation
  2026-04-15 13:32 ` [RFC PATCH 1/7] vmx: Introduce vcpu single context VPID invalidation Teddy Astie
@ 2026-05-04 15:41   ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2026-05-04 15:41 UTC (permalink / raw)
  To: Teddy Astie; +Cc: Andrew Cooper, Roger Pau Monné, xen-devel

On 15.04.2026 15:32, Teddy Astie wrote:
> Introduce vpid_sync_vcpu_context to do a single-context invalidation
> on the vpid attached to the vcpu as an alternative to per-gva and
> all-context invalidations.
> 
> Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
> ---
>  xen/arch/x86/include/asm/hvm/vmx/vmx.h | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmx.h b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
> index da04752e17..3524cb3536 100644
> --- a/xen/arch/x86/include/asm/hvm/vmx/vmx.h
> +++ b/xen/arch/x86/include/asm/hvm/vmx/vmx.h
> @@ -452,6 +452,27 @@ static inline void ept_sync_all(void)
>  
>  void ept_sync_domain(struct p2m_domain *p2m);
>  
> +static inline void vpid_sync_vcpu_context(const struct vcpu *v)
> +{
> +    int type = INVVPID_SINGLE_CONTEXT;
> +
> +    /*
> +     * Use single-context invalidation when the hardware supports it;
> +     * otherwise fall through to all-context invalidation below.
> +     */
> +    if ( likely(cpu_has_vmx_vpid_invvpid_single_context) )
> +        goto execute_invvpid;
> +
> +    /*
> +     * If single context invalidation is not supported, we escalate to
> +     * use all context invalidation.
> +     */
> +    type = INVVPID_ALL_CONTEXT;
> +
> +execute_invvpid:
> +    __invvpid(type, v->arch.hvm.n1asid.asid, 0);
> +}

I think this (such) better would be introduced with a user (else the
description wants to say what it's going to be needed for). I further think
that this (such) would better be done without goto (else at the very least
the label wants to conform to ./CODING_STYLE). And finally I think that the
local variable would better be of an unsigned type.

Jan
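
A goto-free shape with an unsigned type, for the sake of discussion
(a sketch of the suggested style only, not a tested patch):

    static inline void vpid_sync_vcpu_context(const struct vcpu *v)
    {
        /* Escalate to all-context invalidation when single-context
         * invalidation is not supported by the hardware. */
        unsigned int type = likely(cpu_has_vmx_vpid_invvpid_single_context)
                            ? INVVPID_SINGLE_CONTEXT : INVVPID_ALL_CONTEXT;

        __invvpid(type, v->arch.hvm.n1asid.asid, 0);
    }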



* Re: [RFC PATCH 2/7] common: Track latest pCPU that ran the vCPU
  2026-04-15 13:32 ` [RFC PATCH 2/7] common: Track latest pCPU that ran the vCPU Teddy Astie
@ 2026-05-04 15:49   ` Jan Beulich
  2026-05-05 10:17     ` Teddy Astie
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2026-05-04 15:49 UTC (permalink / raw)
  To: Teddy Astie
  Cc: Andrew Cooper, Roger Pau Monné, Jason Andryuk,
	Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
	xen-devel

On 15.04.2026 15:32, Teddy Astie wrote:
> Track on which pCPU each vCPU of a domain ran. This will
> be used to know whether a TLB flush is required or not
> when the vCPU is migrated to another pCPU.

Somewhat related tracking already exists - see the dirty_cpumask field.
But what title and description say doesn't match ...

> @@ -977,6 +978,8 @@ void asmlinkage svm_vmenter_helper(void)
>  
>      svm_sync_vmcb(curr, vmcb_needs_vmsave);
>  
> +    curr->domain->latest_vcpu[cpu] = curr->vcpu_id;

... the implementation anyway: You track which vCPU last ran on a given
pCPU. Since the same pCPU may have run multiple vCPU-s which then weren't
scheduled again, you lose data afaict.

> @@ -992,6 +993,13 @@ struct domain *domain_create(domid_t domid,
>      if ( !zalloc_cpumask_var(&d->dirty_cpumask) )
>          goto fail;
>  
> +    err = -ENOMEM;
> +    d->latest_vcpu = xmalloc_array(int, nr_cpu_ids);

xvmalloc_array() please, as this can be huge. It possibly being huge is
also of concern.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -567,6 +567,10 @@ struct domain
>      /* Bitmask of CPUs which are holding onto this domain's state. */
>      cpumask_var_t    dirty_cpumask;
>  
> +    /* Mapping of the latest vCPU that ran on a specific CPU
> +     * (-1 if the vCPU hasn't ran yet) */
> +    int *latest_vcpu;

Why plain int? You don't really leverage -1 as a sentinel, and any
unsigned value no valid vCPU ID can take (e.g. >= d->max_vcpus) would
do in its stead.

Jan
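
For illustration, the unsigned variant could look like the sketch below,
where the sentinel is simply a value no valid vCPU ID can take (the
initialisation loop is an assumption):

    /* Per-pCPU record of the domain's vCPU that last ran there;
     * initialised to an impossible vCPU ID rather than -1. */
    unsigned int *latest_vcpu;

    /* At domain creation: */
    unsigned int i;

    d->latest_vcpu = xvmalloc_array(unsigned int, nr_cpu_ids);
    if ( !d->latest_vcpu )
        goto fail;
    for ( i = 0; i < nr_cpu_ids; i++ )
        d->latest_vcpu[i] = ~0u; /* no vCPU has run here yet */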



* Re: [RFC PATCH 3/7] common: Introduce needs_tlb_flush vcpu field
  2026-04-15 13:32 ` [RFC PATCH 3/7] common: Introduce needs_tlb_flush vcpu field Teddy Astie
@ 2026-05-04 15:54   ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2026-05-04 15:54 UTC (permalink / raw)
  To: Teddy Astie
  Cc: Dario Faggioli, Juergen Gross, George Dunlap, Andrew Cooper,
	Anthony PERARD, Michal Orzel, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, xen-devel

On 15.04.2026 15:32, Teddy Astie wrote:
> This field is meant to be used to schedule a TLB flush on the vCPU
> before waking it up. This field can be set from another vCPU at any
> time.
> 
> Schedule a TLB flush when the vCPU is migrated to another CPU.
> This is needed as the vCPU-related TLB entries may be out of sync
> with what happened on another core.

While not an issue with ...

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -1188,7 +1188,12 @@ static void sched_unit_migrate_finish(struct sched_unit *unit)
>  
>      /* Wake on new CPU. */
>      for_each_sched_unit_vcpu ( unit, v )
> +    {
> +        if ( old_cpu != new_cpu )
> +            /* Migrating to another CPU needs TLB flush */
> +            v->needs_tlb_flush = true;

... this setting of the flag, for almost any other place (which apparently
the next patch is going to introduce, judging by its title) the immediate
question is: What if the vCPU is presently running? The flag is meant to
only take effect on migration to another pCPU. IOW the overall purpose
doesn't become quite clear.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -224,6 +224,8 @@ struct vcpu
>      bool             defer_shutdown;
>      /* VCPU is paused following shutdown request (d->is_shutting_down)? */
>      bool             paused_for_shutdown;
> +    /* VCPU needs its TLB flushed before waking. */
> +    bool             needs_tlb_flush;

If only x86 is going to use the field, why is it being put here?

Jan
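
For reference, the consumption side as implemented later in this series
(patch 6/7) boils down to a test-and-clear on the VM entry path, e.g.
simplified from the VMX changes quoted earlier in this thread:

    /* Right before entering the guest: honour a flush scheduled while
     * the vCPU was not running. */
    if ( test_and_clear_bool(curr->needs_tlb_flush) )
        vpid_sync_vcpu_context(curr);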



* Re: [RFC PATCH 2/7] common: Track latest pCPU that ran the vCPU
  2026-05-04 15:49   ` Jan Beulich
@ 2026-05-05 10:17     ` Teddy Astie
  2026-05-05 10:30       ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Teddy Astie @ 2026-05-05 10:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Roger Pau Monné, Jason Andryuk,
	Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
	xen-devel



On 04/05/2026 at 17:51, Jan Beulich wrote:
> On 15.04.2026 15:32, Teddy Astie wrote:
>> Track on which pCPU each vCPU of a domain ran. This will
>> be used to know whether a TLB flush is required or not
>> when the vCPU is migrated to another pCPU.
> 
> Somewhat related tracking already exists - see the dirty_cpumask field.

I've seen it, but I'm not sure how it can be leveraged here.

I will try to take a closer look if that could be used instead.

> But what title and description say doesn't match ...
> 
>> @@ -977,6 +978,8 @@ void asmlinkage svm_vmenter_helper(void)
>>   
>>       svm_sync_vmcb(curr, vmcb_needs_vmsave);
>>   
>> +    curr->domain->latest_vcpu[cpu] = curr->vcpu_id;
> 
> ... the implementation anyway: You track which vCPU last ran on a given
> pCPU. Since the same pCPU may have run multiple vCPU-s which then weren't
> scheduled again, you lose data afaict.
> 

I mixed up the wording, but the implementation matches the intent.

It should rather read:

   Track which vCPU of the domain each pCPU last ran.

>> @@ -992,6 +993,13 @@ struct domain *domain_create(domid_t domid,
>>       if ( !zalloc_cpumask_var(&d->dirty_cpumask) )
>>           goto fail;
>>   
>> +    err = -ENOMEM;
>> +    d->latest_vcpu = xmalloc_array(int, nr_cpu_ids);
> 
> xvmalloc_array() please, as this can be huge. It possibly being huge is
> also of concern.
> 
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -567,6 +567,10 @@ struct domain
>>       /* Bitmask of CPUs which are holding onto this domain's state. */
>>       cpumask_var_t    dirty_cpumask;
>>   
>> +    /* Mapping of the latest vCPU that ran on a specific CPU
>> +     * (-1 if the vCPU hasn't ran yet) */
>> +    int *latest_vcpu;
> 
> Why plain int? You don't really leverage -1 as a sentinel, and any
> unsigned value >= nr_cpu_ids would do in its stead.
> 

int is not really required here in practice. It's mostly there to express
an invalid state instead of leaving 0 (which would mean the first vCPU),
even if it would not change the overall behavior. Using ~0 would also
work.

> Jan
> 

Teddy



* Re: [RFC PATCH 2/7] common: Track latest pCPU that ran the vCPU
  2026-05-05 10:17     ` Teddy Astie
@ 2026-05-05 10:30       ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2026-05-05 10:30 UTC (permalink / raw)
  To: Teddy Astie
  Cc: Andrew Cooper, Roger Pau Monné, Jason Andryuk,
	Anthony PERARD, Michal Orzel, Julien Grall, Stefano Stabellini,
	xen-devel

On 05.05.2026 12:17, Teddy Astie wrote:
> Le 04/05/2026 à 17:51, Jan Beulich a écrit :
>> On 15.04.2026 15:32, Teddy Astie wrote:
>>> Track on which pCPU each vCPU of a domain ran. This will
>>> be used to know whether a TLB flush is required or not
>>> when the vCPU is migrated to another pCPU.
>>
>> Somewhat related tracking already exists - see the dirty_cpumask field.
> 
> I've seen it, but I'm not sure how it can be leveraged here.
> 
> I will try to take a closer look if that could be used instead.
> 
>> But what title and description say doesn't match ...
>>
>>> @@ -977,6 +978,8 @@ void asmlinkage svm_vmenter_helper(void)
>>>   
>>>       svm_sync_vmcb(curr, vmcb_needs_vmsave);
>>>   
>>> +    curr->domain->latest_vcpu[cpu] = curr->vcpu_id;
>>
>> ... the implementation anyway: You track which vCPU last ran on a given
>> pCPU. Since the same pCPU may have run multiple vCPU-s which then weren't
>> scheduled again, you lose data afaict.
>>
> 
> I mixed up the wording. But the implementation is the proper intent.
> 
> It's more
> 
>    Track which vCPU of the domain each pCPU ran.

Okay, yet then (as already pointed out) how do you know vCPU0 ran last on
a given pCPU if after its de-scheduling vCPU1 (of the same domain) was
put there? Your track record (after de-scheduling vCPU1) will say only
vCPU1; information on vCPU0 will be lost. Yet then, as also indicated,
it's not quite clear to me how exactly you mean to leverage this tracking.

Jan


