public inbox for linux-kernel@vger.kernel.org
From: Jan Kiszka <jan.kiszka@siemens.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Michael Kelley <mhklinux@outlook.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	Long Li <longli@microsoft.com>, Thomas Gleixner <tglx@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Florian Bezdeka <florian.bezdeka@siemens.com>,
	RT <linux-rt-users@vger.kernel.org>,
	Mitchell Levy <levymitchell0@gmail.com>,
	Saurabh Singh Sengar <ssengar@linux.microsoft.com>,
	Naman Jain <namjain@linux.microsoft.com>
Subject: Re: [PATCH v3] drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT
Date: Wed, 18 Mar 2026 12:03:03 +0100
Message-ID: <7f248f1f-a4ad-442d-bd85-23e57e58eeba@siemens.com>
In-Reply-To: <20260318100138.GimjldpV@linutronix.de>

On 18.03.26 11:01, Sebastian Andrzej Siewior wrote:
> On 2026-03-17 17:25:20 [+0000], Michael Kelley wrote:
>> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Sent: Thursday, March 12, 2026 10:07 AM
>>>
>>
>> Let me try to address the range of questions here and in the follow-up
>> discussion. As background, an overview of VMBus interrupt handling is in:
>>
>> Documentation/virt/hyperv/vmbus.rst
>>
>> in the section entitled "Synthetic Interrupt Controller (synic)". The
>> relevant text is:
>>
>>    The SINT is mapped to a single per-CPU architectural interrupt (i.e,
>>    an 8-bit x86/x64 interrupt vector, or an arm64 PPI INTID). Because
>>    each CPU in the guest has a synic and may receive VMBus interrupts,
>>    they are best modeled in Linux as per-CPU interrupts. This model works
>>    well on arm64 where a single per-CPU Linux IRQ is allocated for
>>    VMBUS_MESSAGE_SINT. This IRQ appears in /proc/interrupts as an IRQ labelled
>>    "Hyper-V VMbus". Since x86/x64 lacks support for per-CPU IRQs, an x86
>>    interrupt vector is statically allocated (HYPERVISOR_CALLBACK_VECTOR)
>>    across all CPUs and explicitly coded to call vmbus_isr(). In this case,
>>    there's no Linux IRQ, and the interrupts are visible in aggregate in
>>    /proc/interrupts on the "HYP" line.
>>
>> The use of a statically allocated sysvec pre-dates my involvement in this
>> code starting in 2017, but I believe it was modelled after what Xen does,
>> and for the same reason -- to effectively create a per-CPU interrupt on
>> x86/x64. Acorn is also using HYPERVISOR_CALLBACK_VECTOR, but I
>> don't know if that is also to create a per-CPU interrupt.
> 
> If you create a vector, it becomes per-CPU. There is simply no mapping
> from HYPERVISOR_CALLBACK_VECTOR to request_percpu_irq(). But if we had
> this…
> 
> …
>>> What clears this? This is wrongly placed. This should go to
>>> sysvec_hyperv_callback() instead with its matching canceling part. The
>>> add_interrupt_randomness() should also be there and not here.
>>> sysvec_hyperv_stimer0() managed to do so.
>>
>> I don't have any knowledge to bring regarding the use of
>> lockdep_hardirq_threaded().
> 
> It is used in IRQ core to mark the execution of an interrupt handler
> which becomes threaded in a forced-threaded scenario. The goal is to let
> lockdep know that this piece of code on !RT will be threaded on RT and
> therefore there is no need to report a possible locking problem that
> will not exist on RT.
> 
>>> Different question: What guarantees that there won't be another
>>> interrupt before this one is done? The handshake appears to be
>>> deprecated. The interrupt itself returns ACKing (or not) but the actual
>>> handler is delayed to this thread. Depending on the userland it could
>>> take some time and I don't know how impatient the host is.
>>
>> In more recent versions of Hyper-V, what's deprecated is Hyper-V implicitly
>> and automatically doing the EOI. So in sysvec_hyperv_callback(), apic_eoi()
>> is usually explicitly called to ack the interrupt.
>>
>> There's no guarantee, in either the existing case or the new PREEMPT_RT
>> case, that another VMBus interrupt won't come in on the same CPU
>> before the tasklets scheduled by vmbus_message_sched() or
>> vmbus_chan_sched() have run. From a functional standpoint, the Linux
>> code and interaction with Hyper-V handles another interrupt correctly.
> 
> So there is no scenario in which the host will trigger interrupts because
> the guest is leaving the ISR without doing anything / making progress?
> 
>> From a delay standpoint, there's not a problem for the normal (i.e., not
>> PREEMPT_RT) case because the tasklets run as the interrupt exits -- they
>> don't end up in ksoftirqd. For the PREEMPT_RT case, I can see your point
>> about delays since the tasklets are scheduled from the new per-CPU thread.
>> But my understanding is that Jan's motivation for these changes is not to
>> achieve true RT behavior, since Hyper-V doesn't provide that anyway.
>> The goal is simply to make PREEMPT_RT builds functional, though Jan may
>> have further comments on the goal.
> 
> I would be worried if the host were storming the guest with interrupts
> because it makes no progress.
> 
>>>> +		__vmbus_isr();
>>> Moving on. This (trying very hard here) even schedules tasklets. Why?
>>> You need to disable BH before doing so. Otherwise it ends in ksoftirqd.
>>> You don't want that.
>>
>> Again, Jan can comment on the impact of delays due to ending up
>> in ksoftirqd.
> 
> My point is that having this with threaded interrupt support would
> eliminate the usage of tasklets.
> 
>>> Couldn't the whole logic be integrated into the IRQ code? Then we could
>>> have mask/ unmask if supported/ provided and threaded interrupts. Then
>>> sysvec_hyperv_reenlightenment() could use a proper threaded interrupt
>>> instead of apic_eoi() + schedule_delayed_work().
>>
>> As I described above, Hyper-V needs a per-CPU interrupt. It's faked up
>> on x86/x64 with the hardcoded HYPERVISOR_CALLBACK_VECTOR sysvec
>> entry, but on arm64 a normal Linux per-CPU IRQ is used. Once the execution
>> path gets to vmbus_isr(), the two architectures share the same code. Same
>> thing is done with the Hyper-V STIMER0 interrupt as a per-CPU interrupt.
> 
> This one has the "random" collecting in the right spot.
> 
>> If there's a better way to fake up a per-CPU interrupt on x86/x64, I'm open
>> to looking at it.
>>
>> As I recently discovered in discussion with Jan, standard Linux IRQ handling
>> will *not* thread per-CPU interrupts. So even on arm64, where a standard
>> Linux per-CPU IRQ is used for VMBus and STIMER0 interrupts, we can't
>> request threading.
> 
> It would require a statement from the x86 & IRQ maintainers on whether it
> is worthwhile on x86 to allow passing HYPERVISOR_CALLBACK_VECTOR to
> request_percpu_irq() and to have an IRQF_ flag saying that this one needs
> to be force-threaded. Otherwise we would need to stay with the workarounds.
> 
> If you say that an interrupt storm can not occur, I would prefer
> |static DEFINE_WAIT_OVERRIDE_MAP(vmbus_map, LD_WAIT_CONFIG);
> |…
> |	lock_map_acquire_try(&vmbus_map);
> |	__vmbus_isr();
> |	lock_map_release(&vmbus_map);
> 
> while it has mostly the same effect.
> 
> Either way, that add_interrupt_randomness() should be moved to
> sysvec_hyperv_callback() like it has been done for
> sysvec_hyperv_stimer0(). It should be invoked twice now if it gets there
> via vmbus_percpu_isr().

No, this would degrade arm64.

Jan

-- 
Siemens AG, Foundational Technologies
Linux Expert Center

