From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Oleksii Kurochko <oleksii.kurochko@gmail.com>
Subject: Re: [PATCH for-4.21 01/10] x86/HPET: limit channel changes
Date: Thu, 16 Oct 2025 17:07:03 +0200
Message-ID: <aPEKFwPe-PiHh-Ay@Mac.lan>
In-Reply-To: <14bb12b2-1a01-49a8-be9a-6a32c3729e9e@suse.com>

On Thu, Oct 16, 2025 at 01:47:38PM +0200, Jan Beulich wrote:
> On 16.10.2025 12:24, Roger Pau Monné wrote:
> > On Thu, Oct 16, 2025 at 09:31:21AM +0200, Jan Beulich wrote:
> >> Despite 1db7829e5657 ("x86/hpet: do local APIC EOI after interrupt
> >> processing") we can still observe nested invocations of
> >> hpet_interrupt_handler(). This is, afaict, a result of previously used
> >> channels retaining their IRQ affinity until some other CPU re-uses them.
> > 
> > But the underlying problem here is not so much the affinity itself,
> > but the fact that the channel is not stopped after firing?
> 
> (when being detached, that is) That's the main problem here, yes. A minor
> benefit is to avoid the MMIO write in hpet_msi_set_affinity(). See also
> below.
> 
> Further, even when masking while detaching, the issue would re-surface
> after unmasking; it's just that the window then is smaller.

Yeah, it could still trigger after unmasking, but the window is smaller
there: once the channel is enabled again, the comparator gets updated to
the new deadline.

> >> @@ -454,9 +456,21 @@ static struct hpet_event_channel *hpet_g
> >>      if ( num_hpets_used >= nr_cpu_ids )
> >>          return &hpet_events[cpu];
> >>  
> >> +    /*
> >> +     * Try the least recently used channel first.  It may still have its IRQ's
> >> +     * affinity set to the desired CPU.  This way we also limit having multiple
> >> +     * of our IRQs raised on the same CPU, in possibly a nested manner.
> >> +     */
> >> +    ch = per_cpu(lru_channel, cpu);
> >> +    if ( ch && !test_and_set_bit(HPET_EVT_USED_BIT, &ch->flags) )
> >> +    {
> >> +        ch->cpu = cpu;
> >> +        return ch;
> >> +    }
> >> +
> >> +    /* Then look for an unused channel. */
> >>      next = arch_fetch_and_add(&next_channel, 1) % num_hpets_used;
> >>  
> >> -    /* try unused channel first */
> >>      for ( i = next; i < next + num_hpets_used; i++ )
> >>      {
> >>          ch = &hpet_events[i % num_hpets_used];
> >> @@ -479,6 +493,8 @@ static void set_channel_irq_affinity(str
> >>  {
> >>      struct irq_desc *desc = irq_to_desc(ch->msi.irq);
> >>  
> >> +    per_cpu(lru_channel, ch->cpu) = ch;
> >> +
> >>      ASSERT(!local_irq_is_enabled());
> >>      spin_lock(&desc->lock);
> >>      hpet_msi_mask(desc);
> > 
> > Maybe I'm missing the point here, but you are resetting the MSI
> > affinity anyway here, so there isn't much point in attempting to
> > re-use the same channel when Xen still unconditionally goes through the
> > process of setting the affinity anyway?
> 
> While still using normal IRQs, there's still a benefit: We can re-use the
> same vector (as staying on the same CPU), and hence we save an IRQ
> migration (being the main source of nested IRQs according to my
> observations).

Hm, I see.  You short-circuit all the logic in _assign_irq_vector().
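
To spell out my understanding, here is a minimal standalone sketch of
that fast path. It is a toy model only, not the actual
_assign_irq_vector() code: struct toy_irq, toy_assign_vector() and the
starting vector value are made up for illustration. It assumes the
existing vector is kept whenever the target CPU does not change.

```c
#include <stdbool.h>

/* Toy IRQ state; NOT Xen's struct irq_desc. */
struct toy_irq {
    int vector;              /* currently assigned vector, 0 if none */
    int cpu;                 /* CPU the vector is installed on, -1 if none */
    unsigned int migrations; /* times an existing vector had to be replaced */
};

static int toy_next_vector = 0x30; /* arbitrary start, illustration only */

/* Returns true when a new vector had to be allocated. */
static bool toy_assign_vector(struct toy_irq *irq, int cpu)
{
    /* Fast path: a vector exists and the target CPU is unchanged. */
    if ( irq->vector > 0 && irq->cpu == cpu )
        return false;

    /* Slow path: replacing an existing vector is what counts as a migration. */
    if ( irq->vector > 0 )
        irq->migrations++;

    irq->vector = toy_next_vector++;
    irq->cpu = cpu;
    return true;
}
```

With the channel staying on the same CPU, repeated calls take the early
return, so no migration (and none of the nested IRQs it tends to cause)
happens.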

> We could actually do even better, by avoiding the mask/unmask pair there,
> which would avoid triggering the "immediate" IRQ that I (for now) see as
> the only explanation of the large amount of "early" IRQs that I observe
> on (at least) Intel hardware. That would require doing the msg.dest32
> check earlier, but otherwise looks feasible. (Actually, the unmask would
> still be necessary, in case we're called with the channel already masked.)

Checking .dest32 seems a bit crude.  I would possibly prefer to slightly
modify hpet_attach_channel() to notice when ch->cpu == cpu and skip the
call to set_channel_irq_affinity() in that case?
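
Roughly along these lines, as a standalone toy model rather than a
patch against hpet.c: struct toy_channel, toy_set_affinity() and
toy_attach_channel() are hypothetical stand-ins, and the cpu field here
is assumed to record the CPU the channel's IRQ was last targeted at.

```c
#include <stdbool.h>

/* Toy stand-in for struct hpet_event_channel; NOT the Xen definition. */
struct toy_channel {
    int cpu;                    /* CPU the IRQ was last targeted at, -1 if none */
    unsigned int affinity_sets; /* counts affinity reprogramming operations */
};

/* Hypothetical stand-in for set_channel_irq_affinity(). */
static void toy_set_affinity(struct toy_channel *ch, int cpu)
{
    ch->cpu = cpu;
    ch->affinity_sets++;
}

/*
 * Attach a channel to a CPU.  Returns true if the IRQ had to be
 * retargeted, false if it already pointed at this CPU and the affinity
 * update was skipped.
 */
static bool toy_attach_channel(struct toy_channel *ch, int cpu)
{
    if ( ch->cpu == cpu )
        return false;

    toy_set_affinity(ch, cpu);
    return true;
}
```

Whether the real code still needs an unmask on the skipped path (as you
note above) is left out of the sketch.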

Thanks, Roger.

