Re: [PATCH 3/4] x86: hyperv: Expose hv_map_msi_interrupt function

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Nuno Das Neves <nunodasneves@linux.microsoft.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Michael Kelley <mhklinux@outlook.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>
Cc: "kys@microsoft.com" <kys@microsoft.com>,
	"haiyangz@microsoft.com" <haiyangz@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"decui@microsoft.com" <decui@microsoft.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"lpieralisi@kernel.org" <lpieralisi@kernel.org>,
	"kw@linux.com" <kw@linux.com>,
	"robh@kernel.org" <robh@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"jinankjain@linux.microsoft.com" <jinankjain@linux.microsoft.com>,
	"skinsburskii@linux.microsoft.com"
	<skinsburskii@linux.microsoft.com>,
	"mrathor@linux.microsoft.com" <mrathor@linux.microsoft.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 3/4] x86: hyperv: Expose hv_map_msi_interrupt function
Date: Mon, 23 Jun 2025 15:13:49 -0700	[thread overview]
Message-ID: <64adb508-8843-4eae-87fb-47797bf32b19@linux.microsoft.com> (raw)
In-Reply-To: <878qlmqtbn.ffs@tglx>

On 6/20/2025 9:19 AM, Thomas Gleixner wrote:
> On Wed, Jun 18 2025 at 14:08, Nuno Das Neves wrote:
>> On 6/11/2025 4:07 PM, Michael Kelley wrote:
>>> From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Tuesday, June 10, 2025 4:52 PM
>>>> +/**
>>>> + * hv_map_msi_interrupt() - "Map" the MSI IRQ in the hypervisor.
>>>> + * @data:      Describes the IRQ
>>>> + * @out_entry: Hypervisor (MSI) interrupt entry (can be NULL)
>>>> + *
>>>> + * Map the IRQ in the hypervisor by issuing a MAP_DEVICE_INTERRUPT hypercall.
>>>> + */
>>>> +int hv_map_msi_interrupt(struct irq_data *data,
>>>> +			 struct hv_interrupt_entry *out_entry)
>>>>  {
>>>> -	union hv_device_id device_id = hv_build_pci_dev_id(dev);
>>>> +	struct msi_desc *msidesc;
>>>> +	struct pci_dev *dev;
>>>> +	union hv_device_id device_id;
>>>> +	struct hv_interrupt_entry dummy;
>>>> +	struct irq_cfg *cfg = irqd_cfg(data);
>>>> +	const cpumask_t *affinity;
>>>> +	int cpu;
>>>> +	u64 res;
> 
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#variable-declarations
> 
>>>>
>>>> -	return hv_map_interrupt(device_id, false, cpu, vector, entry);
>>>> +	msidesc = irq_data_get_msi_desc(data);
>>>> +	dev = msi_desc_to_pci_dev(msidesc);
>>>> +	device_id = hv_build_pci_dev_id(dev);
>>>> +	affinity = irq_data_get_effective_affinity_mask(data);
>>>> +	cpu = cpumask_first_and(affinity, cpu_online_mask);
>>>
>>> Is the cpus_read_lock held at this point? I'm not sure what the
>>> overall calling sequence looks like. If it is not held, the CPU that
>>> is selected could go offline before hv_map_interrupt() is called.
>>> This computation of the target CPU is the same as in the code
>>> before this patch, but that existing code looks like it has the
>>> same problem.
>>>
>>
>> Thanks for pointing it out - It *looks* like the read lock is not held
>> everywhere this could be called, so it could indeed be a problem.
>>
>> I've been thinking about different ways around this but I lack the
>> knowledge to have an informed opinion about it:
>>
>> - We could take the cpu read lock in this function, would that work?
> 
> Obviously not.
> 
>> - I'm not actually sure why the code is getting the first cpu off the effective
>>   affinity mask in the first place. It is possible to get the apic id (and hence
>>   the cpu) already associated with the irq, as per e.g. x86_vector_msi_compose_msg()
>>   Maybe we could get the cpu that way, assuming that doesn't have a
>>   similar issue.
> 
> There is no reason to fiddle in the underlying low level data. The
> effective affinity mask is there for a reason.
> 
>> - We could just let this race happen, maybe the outcome isn't too catastrophic?
> 
> Let's terminate guesswork mode and look at the facts.
> 
> The point is that hv_map_msi_interrupt() is called from:
> 
>     1) hv_irq_compose_msi_msg()
> 
>     2) hv_arch_irq_unmask() (in patch 4/4)
> 
> Both functions are interrupt chip callbacks and invoked with the
> interrupt descriptor lock held.
> 
> At the point where they are called, the effective affinity mask is valid
> and immutable. Nothing can modify it as any modification requires the
> interrupt descriptor lock to be held. This applies to the CPU hotplug
> machinery too. So this AND cpu_online_mask is a complete pointless
> voodoo exercise.
> 

Thanks for explaining.

> Just use:
> 
>      cpu = cpumask_first(irq_data_get_effective_affinity_mask(data));
> 
> and be done with it.
> 
> Please fix that first with a seperate patch before moving this code
> around.

Will do!

Nuno

> 
> Thanks,
> 
>         tglx

next prev parent reply	other threads:[~2025-06-24  1:38 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-10 23:52 [PATCH 0/4] Nested virtualization fixes for root partition Nuno Das Neves
2025-06-10 23:52 ` [PATCH 1/4] PCI: hv: Do not do vmbus initialization on baremetal Nuno Das Neves
2025-06-11 23:06   ` Michael Kelley
2025-06-16 20:06     ` Nuno Das Neves
2025-06-10 23:52 ` [PATCH 2/4] Drivers: hv: Use nested hypercall for post message and signal event Nuno Das Neves
2025-06-11 23:06   ` Michael Kelley
2025-06-16 20:18     ` Nuno Das Neves
2025-06-10 23:52 ` [PATCH 3/4] x86: hyperv: Expose hv_map_msi_interrupt function Nuno Das Neves
2025-06-11 23:07   ` Michael Kelley
2025-06-18 21:08     ` Nuno Das Neves
2025-06-19 22:02       ` Michael Kelley
2025-06-20 16:19       ` Thomas Gleixner
2025-06-23 22:13         ` Nuno Das Neves [this message]
2025-06-10 23:52 ` [PATCH 4/4] PCI: hv: Use the correct hypercall for unmasking interrupts on nested Nuno Das Neves
2025-06-11 23:07   ` Michael Kelley
2025-06-18 21:24     ` Nuno Das Neves
2025-06-13 19:36   ` Bjorn Helgaas
2025-06-18 22:45     ` Nuno Das Neves
2025-06-11 17:55 ` [PATCH 0/4] Nested virtualization fixes for root partition Roman Kisel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=64adb508-8843-4eae-87fb-47797bf32b19@linux.microsoft.com \
    --to=nunodasneves@linux.microsoft.com \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=hpa@zytor.com \
    --cc=jinankjain@linux.microsoft.com \
    --cc=kw@linux.com \
    --cc=kys@microsoft.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=mhklinux@outlook.com \
    --cc=mingo@redhat.com \
    --cc=mrathor@linux.microsoft.com \
    --cc=robh@kernel.org \
    --cc=skinsburskii@linux.microsoft.com \
    --cc=tglx@linutronix.de \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.