public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Nuno Das Neves <nunodasneves@linux.microsoft.com>,
	Michael Kelley <mhklinux@outlook.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>
Cc: "kys@microsoft.com" <kys@microsoft.com>,
	"haiyangz@microsoft.com" <haiyangz@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"decui@microsoft.com" <decui@microsoft.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"will@kernel.org" <will@kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"lpieralisi@kernel.org" <lpieralisi@kernel.org>,
	"kw@linux.com" <kw@linux.com>,
	"robh@kernel.org" <robh@kernel.org>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"jinankjain@linux.microsoft.com" <jinankjain@linux.microsoft.com>,
	"skinsburskii@linux.microsoft.com"
	<skinsburskii@linux.microsoft.com>,
	"mrathor@linux.microsoft.com" <mrathor@linux.microsoft.com>,
	"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 3/4] x86: hyperv: Expose hv_map_msi_interrupt function
Date: Fri, 20 Jun 2025 18:19:24 +0200	[thread overview]
Message-ID: <878qlmqtbn.ffs@tglx> (raw)
In-Reply-To: <8f96db3f-fc3b-44b6-ab28-26bca6e2615b@linux.microsoft.com>

On Wed, Jun 18 2025 at 14:08, Nuno Das Neves wrote:
> On 6/11/2025 4:07 PM, Michael Kelley wrote:
>> From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Tuesday, June 10, 2025 4:52 PM
>>> +/**
>>> + * hv_map_msi_interrupt() - "Map" the MSI IRQ in the hypervisor.
>>> + * @data:      Describes the IRQ
>>> + * @out_entry: Hypervisor (MSI) interrupt entry (can be NULL)
>>> + *
>>> + * Map the IRQ in the hypervisor by issuing a MAP_DEVICE_INTERRUPT hypercall.
>>> + */
>>> +int hv_map_msi_interrupt(struct irq_data *data,
>>> +			 struct hv_interrupt_entry *out_entry)
>>>  {
>>> -	union hv_device_id device_id = hv_build_pci_dev_id(dev);
>>> +	struct msi_desc *msidesc;
>>> +	struct pci_dev *dev;
>>> +	union hv_device_id device_id;
>>> +	struct hv_interrupt_entry dummy;
>>> +	struct irq_cfg *cfg = irqd_cfg(data);
>>> +	const cpumask_t *affinity;
>>> +	int cpu;
>>> +	u64 res;

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#variable-declarations

>>>
>>> -	return hv_map_interrupt(device_id, false, cpu, vector, entry);
>>> +	msidesc = irq_data_get_msi_desc(data);
>>> +	dev = msi_desc_to_pci_dev(msidesc);
>>> +	device_id = hv_build_pci_dev_id(dev);
>>> +	affinity = irq_data_get_effective_affinity_mask(data);
>>> +	cpu = cpumask_first_and(affinity, cpu_online_mask);
>> 
>> Is the cpus_read_lock held at this point? I'm not sure what the
>> overall calling sequence looks like. If it is not held, the CPU that
>> is selected could go offline before hv_map_interrupt() is called.
>> This computation of the target CPU is the same as in the code
>> before this patch, but that existing code looks like it has the
>> same problem.
>> 
>
> Thanks for pointing it out - It *looks* like the read lock is not held
> everywhere this could be called, so it could indeed be a problem.
>
> I've been thinking about different ways around this but I lack the
> knowledge to have an informed opinion about it:
>
> - We could take the cpu read lock in this function, would that work?

Obviously not.

> - I'm not actually sure why the code is getting the first cpu off the effective
>   affinity mask in the first place. It is possible to get the apic id (and hence
>   the cpu) already associated with the irq, as per e.g. x86_vector_msi_compose_msg()
>   Maybe we could get the cpu that way, assuming that doesn't have a
>   similar issue.

There is no reason to fiddle in the underlying low level data. The
effective affinity mask is there for a reason.

> - We could just let this race happen, maybe the outcome isn't too catastrophic?

Let's terminate guesswork mode and look at the facts.

The point is that hv_map_msi_interrupt() is called from:

    1) hv_irq_compose_msi_msg()

    2) hv_arch_irq_unmask() (in patch 4/4)

Both functions are interrupt chip callbacks and invoked with the
interrupt descriptor lock held.

At the point where they are called, the effective affinity mask is valid
and immutable. Nothing can modify it as any modification requires the
interrupt descriptor lock to be held. This applies to the CPU hotplug
machinery too. So this AND cpu_online_mask is a complete pointless
voodoo exercise.

Just use:

     cpu = cpumask_first(irq_data_get_effective_affinity_mask(data));

and be done with it.

Please fix that first with a seperate patch before moving this code
around.

Thanks,

        tglx


  parent reply	other threads:[~2025-06-20 17:00 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-10 23:52 [PATCH 0/4] Nested virtualization fixes for root partition Nuno Das Neves
2025-06-10 23:52 ` [PATCH 1/4] PCI: hv: Do not do vmbus initialization on baremetal Nuno Das Neves
2025-06-11 23:06   ` Michael Kelley
2025-06-16 20:06     ` Nuno Das Neves
2025-06-10 23:52 ` [PATCH 2/4] Drivers: hv: Use nested hypercall for post message and signal event Nuno Das Neves
2025-06-11 23:06   ` Michael Kelley
2025-06-16 20:18     ` Nuno Das Neves
2025-06-10 23:52 ` [PATCH 3/4] x86: hyperv: Expose hv_map_msi_interrupt function Nuno Das Neves
2025-06-11 23:07   ` Michael Kelley
2025-06-18 21:08     ` Nuno Das Neves
2025-06-19 22:02       ` Michael Kelley
2025-06-20 16:19       ` Thomas Gleixner [this message]
2025-06-23 22:13         ` Nuno Das Neves
2025-06-10 23:52 ` [PATCH 4/4] PCI: hv: Use the correct hypercall for unmasking interrupts on nested Nuno Das Neves
2025-06-11 23:07   ` Michael Kelley
2025-06-18 21:24     ` Nuno Das Neves
2025-06-13 19:36   ` Bjorn Helgaas
2025-06-18 22:45     ` Nuno Das Neves
2025-06-11 17:55 ` [PATCH 0/4] Nested virtualization fixes for root partition Roman Kisel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878qlmqtbn.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=hpa@zytor.com \
    --cc=jinankjain@linux.microsoft.com \
    --cc=kw@linux.com \
    --cc=kys@microsoft.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=mhklinux@outlook.com \
    --cc=mingo@redhat.com \
    --cc=mrathor@linux.microsoft.com \
    --cc=nunodasneves@linux.microsoft.com \
    --cc=robh@kernel.org \
    --cc=skinsburskii@linux.microsoft.com \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox