kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: kvm@vger.kernel.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org,
	 Maxim Levitsky <mlevitsk@redhat.com>,
	David Matlack <dmatlack@google.com>,
	 Alejandro Jimenez <alejandro.j.jimenez@oracle.com>,
	 Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Vasant Hegde <vasant.hegde@amd.com>,
	 Joerg Roedel <joro@8bytes.org>,
	David Woodhouse <dwmw2@infradead.org>,
	 Lu Baolu <baolu.lu@linux.intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH 64/67] iommu/amd: KVM: SVM: Allow KVM to control need for GA log interrupts
Date: Thu, 10 Apr 2025 08:45:48 -0700	[thread overview]
Message-ID: <Z_fnrP4e77mKjdX9@google.com> (raw)
In-Reply-To: <9b7ceea3-8c47-4383-ad9c-1a9bbdc9044a@oracle.com>

On Wed, Apr 09, 2025, Joao Martins wrote:
> On 04/04/2025 20:39, Sean Christopherson wrote:
> I would suggest holding off on this and the next one, while progressing with
> the rest of the series.

Agreed, though I think there's a "pure win" alternative that can be safely
implemented (but it definitely should be done separately).

If HLT-exiting is disabled for the VM, and the VM doesn't have access to the
various paravirtual features that can put it into a synthetic HLT state (PV async
#PF and/or Xen support), then I'm pretty sure GALogIntr can be disabled entirely,
i.e. disabled during the initial irq_set_vcpu_affinity() and never enabled.  KVM
doesn't emulate HLT via its full emulator for AMD (just non-unrestricted Intel
guests), so I'm pretty sure there would be no need for KVM to ever wake a vCPU in
response to a device interrupt.

> > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > index 2e016b98fa1b..27b03e718980 100644
> > --- a/drivers/iommu/amd/iommu.c
> > +++ b/drivers/iommu/amd/iommu.c
> > -static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu)
> > +static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu,
> > +				  bool ga_log_intr)
> >  {
> >  	if (cpu >= 0) {
> >  		entry->lo.fields_vapic.destination =
> > @@ -3783,12 +3784,14 @@ static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu)
> >  		entry->hi.fields.destination =
> >  					APICID_TO_IRTE_DEST_HI(cpu);
> >  		entry->lo.fields_vapic.is_run = true;
> > +		entry->lo.fields_vapic.ga_log_intr = false;
> >  	} else {
> >  		entry->lo.fields_vapic.is_run = false;
> > +		entry->lo.fields_vapic.ga_log_intr = ga_log_intr;
> >  	}
> >  }
> >
> 
> isRun, Destination and GATag are not cached. Quoting the update from a few years
> back (page 93 of IOMMU spec dated Feb 2025):
> 
> | When virtual interrupts are enabled by setting MMIO Offset 0018h[GAEn] and
> | IRTE[GuestMode=1], IRTE[IsRun], IRTE[Destination], and if present IRTE[GATag],
> | are not cached by the IOMMU. Modifications to these fields do not require an
> | invalidation of the Interrupt Remapping Table.

Ooh, that's super helpful info.  Any objections to me adding verbose comments to
explain the effective rules for amd_iommu_update_ga()?

> This is the reason we were able to get rid of the IOMMU invalidation in
> amd_iommu_update_ga() ... which sped up vmexit/vmenter flow with iommu avic.
> Besides the lock contention that was observed at the time, we were seeing stalls
> in this path with enough vCPUs IIRC; CCing Alejandro to keep me honest.
> 
> Now this change above is incorrect as is and to make it correct: you will need
> xor with the previous content of the IRTE::ga_log_intr and then if it changes
> then you re-add back an invalidation command via
> iommu_flush_irt_and_complete()). The latter is what I am worried will
> reintroduce these above problem :(

Ya, the need to flush definitely changes things.

> The invalidation command (which has a completion barrier to serialize
> invalidation execution) takes some time in h/w, and will make all your vcpus
> content on the irq table lock (as is). Even assuming you somehow move the
> invalidation outside the lock, you will content on the iommu lock (for the
> command queue) or best case assuming no locks (which I am not sure it is
> possible) you will need to wait for the command to complete until you can
> progress forward with entering/exiting.
> 
> Unless the GALogIntr bit is somehow also not cached too which wouldn't need the
> invalidation command (which would be good news!). Adding Suravee/Vasant here.
> 
> It's a nice trick how you would leverage this in SVM, but do you have
> measurements that corroborate its introduction? How many unnecessary GALog
> entries were you able to avoid with this trick say on a workload that would
> exercise this (like netperf 1byte RR test that sleeps and wakes up a lot) ?

I didn't do any measurements; I assumed the opportunistic toggling of GALogIntr
would be "free".

There might be optimizations that could be done, but I think the better solution
is to simply disable GALogIntr when it's not needed.  That'd limit the benefits
to select setups, but trying to optimize IRQ bypass for VMs that are CPU-overcommited,
i.e. can't do native HLT, seems rather pointless.

> I should also mention that there's a different logic that is alternative to
> GALog (in Genoa or more recent), which is GAPPI support whereby an entry is
> generated but a more rare condition. Quoting the an excerpts below:
> 
> | This mode is enabled by setting MMIO Offset 0018h[GAPPIEn]=1. Under this
> | mode, guest interrupts (IRTE[GuestMode]=1) update the vAPIC backing page IRR
> | status as normal.
> 
> | In GAPPI mode, a GAPPI interrupt is generated if all of the guest IRR bits
> | were previously clear prior to the last IRR update. This indicates the new
> | interrupt is the first pending interrupt to the
> | vCPU. The GAPPI interrupt is used to signal Hypervisor software to schedule
> | one or more vCPUs to execute pending interrupts.
> 
> | Implementations may not be able to perfectly determine if all of the IRR bits
> | were previously clear prior to updating the vAPIC backing page to set IRR.
> | Spurious interrupts may be generated as a
> | result. Software must be designed to handle this possibility

I saw this as well.  I'm curious if enabling GAPPI mode affects IOMMU interrupt
delivery latency/throughput due to having to scrape the entire IRR.

> Page 99, "2.2.5.5 Guest APIC Physical Processor Interrupt",
> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/specifications/48882_IOMMU.pdf

  reply	other threads:[~2025-04-10 15:45 UTC|newest]

Thread overview: 128+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-04 19:38 [PATCH 00/67] KVM: iommu: Overhaul device posted IRQs support Sean Christopherson
2025-04-04 19:38 ` [PATCH 01/67] KVM: SVM: Allocate IR data using atomic allocation Sean Christopherson
2025-04-04 19:38 ` [PATCH 02/67] KVM: x86: Reset IRTE to host control if *new* route isn't postable Sean Christopherson
2025-04-11  8:08   ` Sairaj Kodilkar
2025-04-11 14:16     ` Sean Christopherson
2025-04-15 11:36       ` Paolo Bonzini
2025-04-04 19:38 ` [PATCH 03/67] KVM: x86: Explicitly treat routing entry type changes as changes Sean Christopherson
2025-04-04 19:38 ` [PATCH 04/67] KVM: x86: Take irqfds.lock when adding/deleting IRQ bypass producer Sean Christopherson
2025-04-04 19:38 ` [PATCH 05/67] iommu/amd: Return an error if vCPU affinity is set for non-vCPU IRTE Sean Christopherson
2025-04-11  8:34   ` Sairaj Kodilkar
2025-04-11 14:05     ` Sean Christopherson
2025-04-11 17:02       ` Sairaj Kodilkar
2025-04-11 19:30         ` Sean Christopherson
2025-04-18 12:25   ` Vasant Hegde
2025-04-04 19:38 ` [PATCH 06/67] iommu/amd: WARN if KVM attempts to set vCPU affinity without posted intrrupts Sean Christopherson
2025-04-11  8:28   ` Sairaj Kodilkar
2025-04-11 14:10     ` Sean Christopherson
2025-04-11 17:03       ` Sairaj Kodilkar
2025-04-15 11:42       ` Paolo Bonzini
2025-04-15 17:48       ` Vasant Hegde
2025-04-15 22:04         ` Sean Christopherson
2025-04-16  9:47           ` Sairaj Kodilkar
2025-04-17 17:37             ` Paolo Bonzini
2025-04-04 19:38 ` [PATCH 07/67] KVM: SVM: WARN if an invalid posted interrupt IRTE entry is added Sean Christopherson
2025-04-04 19:38 ` [PATCH 08/67] KVM: x86: Pass new routing entries and irqfd when updating IRTEs Sean Christopherson
2025-04-11 10:57   ` Arun Kodilkar, Sairaj
2025-04-11 14:01     ` Sean Christopherson
2025-04-11 17:22       ` Sairaj Kodilkar
2025-04-04 19:38 ` [PATCH 09/67] KVM: SVM: Track per-vCPU IRTEs using kvm_kernel_irqfd structure Sean Christopherson
2025-04-11  7:47   ` Arun Kodilkar, Sairaj
2025-04-11 14:32     ` Sean Christopherson
2025-04-04 19:38 ` [PATCH 10/67] KVM: SVM: Delete IRTE link from previous vCPU before setting new IRTE Sean Christopherson
2025-04-04 19:38 ` [PATCH 11/67] KVM: SVM: Delete IRTE link from previous vCPU irrespective of new routing Sean Christopherson
2025-04-15 11:06   ` Sairaj Kodilkar
2025-04-15 14:55     ` Sean Christopherson
2025-04-04 19:38 ` [PATCH 12/67] KVM: SVM: Drop pointless masking of default APIC base when setting V_APIC_BAR Sean Christopherson
2025-04-04 19:38 ` [PATCH 13/67] KVM: SVM: Drop pointless masking of kernel page pa's with AVIC HPA masks Sean Christopherson
2025-04-04 19:38 ` [PATCH 14/67] KVM: SVM: Add helper to deduplicate code for getting AVIC backing page Sean Christopherson
2025-04-15 11:11   ` Sairaj Kodilkar
2025-04-15 14:57     ` Sean Christopherson
2025-04-04 19:38 ` [PATCH 15/67] KVM: SVM: Drop vcpu_svm's pointless avic_backing_page field Sean Christopherson
2025-04-04 19:38 ` [PATCH 16/67] KVM: SVM: Inhibit AVIC if ID is too big instead of rejecting vCPU creation Sean Christopherson
2025-04-04 19:38 ` [PATCH 17/67] KVM: SVM: Drop redundant check in AVIC code on ID during " Sean Christopherson
2025-04-15 11:16   ` Sairaj Kodilkar
2025-04-04 19:38 ` [PATCH 18/67] KVM: SVM: Track AVIC tables as natively sized pointers, not "struct pages" Sean Christopherson
2025-04-04 19:38 ` [PATCH 19/67] KVM: SVM: Drop superfluous "cache" of AVIC Physical ID entry pointer Sean Christopherson
2025-04-04 19:38 ` [PATCH 20/67] KVM: VMX: Move enable_ipiv knob to common x86 Sean Christopherson
2025-04-04 19:38 ` [PATCH 21/67] KVM: SVM: Add enable_ipiv param, never set IsRunning if disabled Sean Christopherson
2025-04-04 19:38 ` [PATCH 22/67] KVM: SVM: Disable (x2)AVIC IPI virtualization if CPU has erratum #1235 Sean Christopherson
2025-04-04 19:38 ` [PATCH 23/67] KVM: VMX: Suppress PI notifications whenever the vCPU is put Sean Christopherson
2025-04-04 19:38 ` [PATCH 24/67] KVM: SVM: Add a comment to explain why avic_vcpu_blocking() ignores IRQ blocking Sean Christopherson
2025-04-04 19:38 ` [PATCH 25/67] iommu/amd: KVM: SVM: Use pi_desc_addr to derive ga_root_ptr Sean Christopherson
2025-04-18 12:24   ` Vasant Hegde
2025-04-04 19:38 ` [PATCH 26/67] iommu/amd: KVM: SVM: Delete now-unused cached/previous GA tag fields Sean Christopherson
2025-04-08 16:57   ` Paolo Bonzini
2025-04-08 22:25     ` Sean Christopherson
2025-04-18 12:25   ` Vasant Hegde
2025-04-04 19:38 ` [PATCH 27/67] iommu/amd: KVM: SVM: Pass NULL @vcpu_info to indicate "not guest mode" Sean Christopherson
2025-04-04 19:38 ` [PATCH 28/67] KVM: SVM: Get vCPU info for IRTE using new routing entry Sean Christopherson
2025-04-04 19:38 ` [PATCH 29/67] KVM: SVM: Stop walking list of routing table entries when updating IRTE Sean Christopherson
2025-04-08 16:56   ` Paolo Bonzini
2025-04-04 19:38 ` [PATCH 30/67] KVM: VMX: " Sean Christopherson
2025-04-08 17:00   ` Paolo Bonzini
2025-05-20 20:36     ` Sean Christopherson
2025-04-04 19:38 ` [PATCH 31/67] KVM: SVM: Extract SVM specific code out of get_pi_vcpu_info() Sean Christopherson
2025-04-23 15:21   ` Francesco Lavra
2025-04-23 15:55     ` Sean Christopherson
2025-04-04 19:38 ` [PATCH 32/67] KVM: x86: Nullify irqfd->producer after updating IRTEs Sean Christopherson
2025-04-04 19:38 ` [PATCH 33/67] KVM: x86: Dedup AVIC vs. PI code for identifying target vCPU Sean Christopherson
2025-04-08 17:30   ` Paolo Bonzini
2025-04-08 20:51     ` Sean Christopherson
2025-04-24  4:39   ` Sairaj Kodilkar
2025-04-24 14:13     ` Sean Christopherson
2025-04-04 19:38 ` [PATCH 34/67] KVM: x86: Move posted interrupt tracepoint to common code Sean Christopherson
2025-04-04 19:38 ` [PATCH 35/67] KVM: SVM: Clean up return handling in avic_pi_update_irte() Sean Christopherson
2025-04-04 19:38 ` [PATCH 36/67] iommu: KVM: Split "struct vcpu_data" into separate AMD vs. Intel structs Sean Christopherson
2025-04-04 19:38 ` [PATCH 37/67] KVM: Don't WARN if updating IRQ bypass route fails Sean Christopherson
2025-04-04 19:38 ` [PATCH 38/67] KVM: Fold kvm_arch_irqfd_route_changed() into kvm_arch_update_irqfd_routing() Sean Christopherson
2025-04-04 19:38 ` [PATCH 39/67] KVM: x86: Track irq_bypass_vcpu in common x86 code Sean Christopherson
2025-04-04 19:38 ` [PATCH 40/67] KVM: x86: Skip IOMMU IRTE updates if there's no old or new vCPU being targeted Sean Christopherson
2025-04-04 19:38 ` [PATCH 41/67] KVM: x86: Don't update IRTE entries when old and new routes were !MSI Sean Christopherson
2025-04-04 19:38 ` [PATCH 42/67] KVM: SVM: Revert IRTE to legacy mode if IOMMU doesn't provide IR metadata Sean Christopherson
2025-04-04 19:38 ` [PATCH 43/67] KVM: SVM: Take and hold ir_list_lock across IRTE updates in IOMMU Sean Christopherson
2025-04-04 19:38 ` [PATCH 44/67] iommu/amd: KVM: SVM: Infer IsRun from validity of pCPU destination Sean Christopherson
2025-04-08 12:26   ` Joerg Roedel
2025-04-04 19:39 ` [PATCH 45/67] iommu/amd: Factor out helper for manipulating IRTE GA/CPU info Sean Christopherson
2025-04-04 19:39 ` [PATCH 46/67] iommu/amd: KVM: SVM: Set pCPU info in IRTE when setting vCPU affinity Sean Christopherson
2025-04-04 19:39 ` [PATCH 47/67] iommu/amd: KVM: SVM: Add IRTE metadata to affined vCPU's list if AVIC is inhibited Sean Christopherson
2025-04-04 19:39 ` [PATCH 48/67] KVM: SVM: Don't check for assigned device(s) when updating affinity Sean Christopherson
2025-04-04 19:39 ` [PATCH 49/67] KVM: SVM: Don't check for assigned device(s) when activating AVIC Sean Christopherson
2025-04-04 19:39 ` [PATCH 50/67] KVM: SVM: WARN if (de)activating guest mode in IOMMU fails Sean Christopherson
2025-04-04 19:39 ` [PATCH 51/67] KVM: SVM: Process all IRTEs on affinity change even if one update fails Sean Christopherson
2025-04-04 19:39 ` [PATCH 52/67] KVM: SVM: WARN if updating IRTE GA fields in IOMMU fails Sean Christopherson
2025-04-04 19:39 ` [PATCH 53/67] KVM: x86: Drop superfluous "has assigned device" check in kvm_pi_update_irte() Sean Christopherson
2025-04-04 19:39 ` [PATCH 54/67] KVM: x86: WARN if IRQ bypass isn't supported " Sean Christopherson
2025-04-04 19:39 ` [PATCH 55/67] KVM: x86: WARN if IRQ bypass routing is updated without in-kernel local APIC Sean Christopherson
2025-04-04 19:39 ` [PATCH 56/67] KVM: SVM: WARN if ir_list is non-empty at vCPU free Sean Christopherson
2025-04-04 19:39 ` [PATCH 57/67] KVM: x86: Decouple device assignment from IRQ bypass Sean Christopherson
2025-04-04 19:39 ` [PATCH 58/67] KVM: VMX: WARN if VT-d Posted IRQs aren't possible when starting " Sean Christopherson
2025-04-04 19:39 ` [PATCH 59/67] KVM: SVM: Use vcpu_idx, not vcpu_id, for GA log tag/metadata Sean Christopherson
2025-04-04 19:39 ` [PATCH 60/67] iommu/amd: WARN if KVM calls GA IRTE helpers without virtual APIC support Sean Christopherson
2025-04-04 19:39 ` [PATCH 61/67] KVM: SVM: Fold avic_set_pi_irte_mode() into its sole caller Sean Christopherson
2025-04-04 19:39 ` [PATCH 62/67] KVM: SVM: Don't check vCPU's blocking status when toggling AVIC on/off Sean Christopherson
2025-04-08 17:51   ` Paolo Bonzini
2025-04-04 19:39 ` [PATCH 63/67] KVM: SVM: Consolidate IRTE update " Sean Christopherson
2025-04-04 19:39 ` [PATCH 64/67] iommu/amd: KVM: SVM: Allow KVM to control need for GA log interrupts Sean Christopherson
2025-04-09 11:56   ` Joao Martins
2025-04-10 15:45     ` Sean Christopherson [this message]
2025-04-10 17:13       ` Joao Martins
2025-04-10 17:29         ` Sean Christopherson
2025-04-18 12:17     ` Vasant Hegde
2025-04-18 18:48       ` Sean Christopherson
2025-04-23 10:21         ` Joao Martins
2025-04-04 19:39 ` [PATCH 65/67] KVM: SVM: Generate GA log IRQs only if the associated vCPUs is blocking Sean Christopherson
2025-04-08 17:53   ` Paolo Bonzini
2025-04-08 21:31     ` Sean Christopherson
2025-04-09 10:34       ` Paolo Bonzini
2025-04-04 19:39 ` [PATCH 66/67] *** DO NOT MERGE *** iommu/amd: Hack to fake IRQ posting support Sean Christopherson
2025-04-04 19:39 ` [PATCH 67/67] *** DO NOT MERGE *** KVM: selftests: WIP posted interrupts test Sean Christopherson
2025-04-08 12:44 ` [PATCH 00/67] KVM: iommu: Overhaul device posted IRQs support Joerg Roedel
2025-04-09  8:30   ` Vasant Hegde
2025-04-08 15:36 ` Paolo Bonzini
2025-04-08 17:13 ` David Matlack
2025-05-23 23:52   ` David Matlack
2025-04-18 13:01 ` David Woodhouse
2025-04-18 16:22   ` Sean Christopherson
2025-05-15 12:08 ` Sairaj Kodilkar
2025-05-15 22:05   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_fnrP4e77mKjdX9@google.com \
    --to=seanjc@google.com \
    --cc=alejandro.j.jimenez@oracle.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=dmatlack@google.com \
    --cc=dwmw2@infradead.org \
    --cc=iommu@lists.linux.dev \
    --cc=joao.m.martins@oracle.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=vasant.hegde@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).