All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Zeng Guang <guang.zeng@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>, X86 Kernel <x86@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	Thomas Gleixner <tglx@linutronix.de>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	Joerg Roedel <joro@8bytes.org>, "H. Peter Anvin" <hpa@zytor.com>,
	Borislav Petkov <bp@alien8.de>, Ingo Molnar <mingo@redhat.com>,
	"Luse, Paul E" <paul.e.luse@intel.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	Jens Axboe <axboe@kernel.dk>, "Raj, Ashok" <ashok.raj@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	"maz@kernel.org" <maz@kernel.org>,
	"seanjc@google.com" <seanjc@google.com>,
	Robin Murphy <robin.murphy@arm.com>,
	jacob.jun.pan@linux.intel.com
Subject: Re: [PATCH 09/15] x86/irq: Install posted MSI notification handler
Date: Tue, 2 Apr 2024 19:43:55 -0700	[thread overview]
Message-ID: <20240402194355.72b2ade8@jacob-builder> (raw)
In-Reply-To: <a4f169fa-663d-4a94-878b-d783f67d48c9@intel.com>

Hi Zeng,

On Fri, 29 Mar 2024 15:32:00 +0800, Zeng Guang <guang.zeng@intel.com> wrote:

> On 1/27/2024 7:42 AM, Jacob Pan wrote:
> > @@ -353,6 +360,111 @@ void intel_posted_msi_init(void)
> >   	pid->nv = POSTED_MSI_NOTIFICATION_VECTOR;
> >   	pid->ndst = this_cpu_read(x86_cpu_to_apicid);
> >   }
> > +
> > +/*
> > + * De-multiplexing posted interrupts is on the performance path, the
> > code
> > + * below is written to optimize the cache performance based on the
> > following
> > + * considerations:
> > + * 1.Posted interrupt descriptor (PID) fits in a cache line that is
> > frequently
> > + *   accessed by both CPU and IOMMU.
> > + * 2.During posted MSI processing, the CPU needs to do 64-bit read and
> > xchg
> > + *   for checking and clearing posted interrupt request (PIR), a 256
> > bit field
> > + *   within the PID.
> > + * 3.On the other side, the IOMMU does atomic swaps of the entire PID
> > cache
> > + *   line when posting interrupts and setting control bits.
> > + * 4.The CPU can access the cache line a magnitude faster than the
> > IOMMU.
> > + * 5.Each time the IOMMU does interrupt posting to the PIR will evict
> > the PID
> > + *   cache line. The cache line states after each operation are as
> > follows:
> > + *   CPU		IOMMU			PID Cache line
> > state
> > + *   ---------------------------------------------------------------
> > + *...read64					exclusive
> > + *...lock xchg64				modified
> > + *...			post/atomic swap	invalid
> > + *...-------------------------------------------------------------
> > + *
> > + * To reduce L1 data cache miss, it is important to avoid contention
> > with
> > + * IOMMU's interrupt posting/atomic swap. Therefore, a copy of PIR is
> > used
> > + * to dispatch interrupt handlers.
> > + *
> > + * In addition, the code is trying to keep the cache line state
> > consistent
> > + * as much as possible. e.g. when making a copy and clearing the PIR
> > + * (assuming non-zero PIR bits are present in the entire PIR), it does:
> > + *		read, read, read, read, xchg, xchg, xchg, xchg
> > + * instead of:
> > + *		read, xchg, read, xchg, read, xchg, read, xchg
> > + */
> > +static __always_inline inline bool handle_pending_pir(u64 *pir, struct
> > pt_regs *regs) +{
> > +	int i, vec = FIRST_EXTERNAL_VECTOR;
> > +	unsigned long pir_copy[4];
> > +	bool handled = false;
> > +
> > +	for (i = 0; i < 4; i++)
> > +		pir_copy[i] = pir[i];
> > +
> > +	for (i = 0; i < 4; i++) {
> > +		if (!pir_copy[i])
> > +			continue;
> > +
> > +		pir_copy[i] = arch_xchg(pir, 0);  
> 
> Here is a problem that pir_copy[i] will always be written as pir[0]. 
> This leads to handle spurious posted MSIs later.
Yes, you are right. It should be
pir_copy[i] = arch_xchg(&pir[i], 0);

Will fix in v2, really appreciated.

> > +		handled = true;
> > +	}
> > +
> > +	if (handled) {
> > +		for_each_set_bit_from(vec, pir_copy,
> > FIRST_SYSTEM_VECTOR)
> > +			call_irq_handler(vec, regs);
> > +	}
> > +
> > +	return handled;
> > +}
> > +
> > +/*
> > + * Performance data shows that 3 is good enough to harvest 90+% of the
> > benefit
> > + * on high IRQ rate workload.
> > + */
> > +#define MAX_POSTED_MSI_COALESCING_LOOP 3
> > +
> > +/*
> > + * For MSIs that are delivered as posted interrupts, the CPU
> > notifications
> > + * can be coalesced if the MSIs arrive in high frequency bursts.
> > + */
> > +DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
> > +{
> > +	struct pt_regs *old_regs = set_irq_regs(regs);
> > +	struct pi_desc *pid;
> > +	int i = 0;
> > +
> > +	pid = this_cpu_ptr(&posted_interrupt_desc);
> > +
> > +	inc_irq_stat(posted_msi_notification_count);
> > +	irq_enter();
> > +
> > +	/*
> > +	 * Max coalescing count includes the extra round of
> > handle_pending_pir
> > +	 * after clearing the outstanding notification bit. Hence, at
> > most
> > +	 * MAX_POSTED_MSI_COALESCING_LOOP - 1 loops are executed here.
> > +	 */
> > +	while (++i < MAX_POSTED_MSI_COALESCING_LOOP) {
> > +		if (!handle_pending_pir(pid->pir64, regs))
> > +			break;
> > +	}
> > +
> > +	/*
> > +	 * Clear outstanding notification bit to allow new IRQ
> > notifications,
> > +	 * do this last to maximize the window of interrupt coalescing.
> > +	 */
> > +	pi_clear_on(pid);
> > +
> > +	/*
> > +	 * There could be a race of PI notification and the clearing
> > of ON bit,
> > +	 * process PIR bits one last time such that handling the new
> > interrupts
> > +	 * are not delayed until the next IRQ.
> > +	 */
> > +	handle_pending_pir(pid->pir64, regs);
> > +
> > +	apic_eoi();
> > +	irq_exit();
> > +	set_irq_regs(old_regs);
> >   }
> >   #endif /* X86_POSTED_MSI */
> >     


Thanks,

Jacob

  reply	other threads:[~2024-04-03  2:39 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-26 23:42 [PATCH 00/15] Coalesced Interrupt Delivery with posted MSI Jacob Pan
2024-01-26 23:42 ` [PATCH 01/15] x86/irq: Move posted interrupt descriptor out of vmx code Jacob Pan
2024-01-26 23:42 ` [PATCH 02/15] x86/irq: Unionize PID.PIR for 64bit access w/o casting Jacob Pan
2024-01-26 23:42 ` [PATCH 03/15] x86/irq: Use bitfields exclusively in posted interrupt descriptor Jacob Pan
2024-01-31  1:48   ` Sean Christopherson
2024-02-06  0:40     ` Jacob Pan
2024-01-26 23:42 ` [PATCH 04/15] x86/irq: Add a Kconfig option for posted MSI Jacob Pan
2024-04-05  2:28   ` Robert Hoo
2024-04-05 15:54     ` Jacob Pan
2024-01-26 23:42 ` [PATCH 05/15] x86/irq: Reserve a per CPU IDT vector for posted MSIs Jacob Pan
2024-04-04 13:38   ` Robert Hoo
2024-04-04 17:17     ` Jacob Pan
2024-01-26 23:42 ` [PATCH 06/15] x86/irq: Set up per host CPU posted interrupt descriptors Jacob Pan
2024-02-13 19:44   ` Jacob Pan
2024-01-26 23:42 ` [PATCH 07/15] x86/irq: Add accessors for " Jacob Pan
2024-01-26 23:42 ` [PATCH 08/15] x86/irq: Factor out calling ISR from common_interrupt Jacob Pan
2024-01-26 23:42 ` [PATCH 09/15] x86/irq: Install posted MSI notification handler Jacob Pan
2024-03-29  7:32   ` Zeng Guang
2024-04-03  2:43     ` Jacob Pan [this message]
2024-01-26 23:42 ` [PATCH 10/15] x86/irq: Factor out common code for checking pending interrupts Jacob Pan
2024-01-26 23:42 ` [PATCH 11/15] x86/irq: Extend checks for pending vectors to posted interrupts Jacob Pan
2024-01-26 23:42 ` [PATCH 12/15] iommu/vt-d: Make posted MSI an opt-in cmdline option Jacob Pan
2024-01-26 23:42 ` [PATCH 13/15] iommu/vt-d: Add an irq_chip for posted MSIs Jacob Pan
2024-01-26 23:42 ` [PATCH 14/15] iommu/vt-d: Add a helper to retrieve PID address Jacob Pan
2024-01-26 23:42 ` [PATCH 15/15] iommu/vt-d: Enable posted mode for device MSIs Jacob Pan
2024-02-08 15:34 ` [PATCH 00/15] Coalesced Interrupt Delivery with posted MSI Jens Axboe
2024-02-09 17:43   ` Jacob Pan
2024-02-09 20:31     ` Jens Axboe
2024-02-12 18:27       ` Jacob Pan
2024-02-12 18:36         ` Jens Axboe
2024-02-12 20:13           ` Jacob Pan
2024-02-13  1:10           ` Jacob Pan
2024-04-04 13:45 ` Robert Hoo
2024-04-04 17:37   ` Jacob Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240402194355.72b2ade8@jacob-builder \
    --to=jacob.jun.pan@linux.intel.com \
    --cc=ashok.raj@intel.com \
    --cc=axboe@kernel.dk \
    --cc=baolu.lu@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=guang.zeng@intel.com \
    --cc=hpa@zytor.com \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul.e.luse@intel.com \
    --cc=peterz@infradead.org \
    --cc=robin.murphy@arm.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.