From: Reinette Chatre <reinette.chatre@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: <jgg@nvidia.com>, <yishaih@nvidia.com>,
<shameerali.kolothum.thodi@huawei.com>, <kevin.tian@intel.com>,
<tglx@linutronix.de>, <darwi@linutronix.de>,
<kvm@vger.kernel.org>, <dave.jiang@intel.com>,
<jing2.liu@intel.com>, <ashok.raj@intel.com>,
<fenghua.yu@intel.com>, <tom.zanussi@linux.intel.com>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH V2 7/8] vfio/pci: Support dynamic MSI-x
Date: Tue, 4 Apr 2023 09:54:46 -0700 [thread overview]
Message-ID: <5efa361d-012b-bdb6-b5e5-869887bde98d@intel.com> (raw)
In-Reply-To: <20230403211841.0e206b67.alex.williamson@redhat.com>
Hi Alex,
On 4/3/2023 8:18 PM, Alex Williamson wrote:
> On Mon, 3 Apr 2023 15:50:54 -0700
> Reinette Chatre <reinette.chatre@intel.com> wrote:
>> On 4/3/2023 1:22 PM, Alex Williamson wrote:
>>> On Mon, 3 Apr 2023 10:31:23 -0700
>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>>> On 3/31/2023 3:24 PM, Alex Williamson wrote:
>>>>> On Fri, 31 Mar 2023 10:49:16 -0700
>>>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>>>>> On 3/30/2023 3:42 PM, Alex Williamson wrote:
>>>>>>> On Thu, 30 Mar 2023 16:40:50 -0600
>>>>>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>>>>>
>>>>>>>> On Tue, 28 Mar 2023 14:53:34 -0700
>>>>>>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>>>>>>>
...
>>> If the goal is to allow the user to swap one eventfd for another, where
>>> the result will always be the new eventfd on success or the old eventfd
>>> on error, I don't see that this code does that, or that we've ever
>>> attempted to make such a guarantee. If the ioctl errors, I think the
>>> eventfds are generally deconfigured. We certainly have the unwind code
>>> that we discussed earlier that deconfigures all the vectors previously
>>> touched in the loop (which seems to be another path where we could
>>> de-allocate from the set of initial ctxs).
>>
>> Thank you for your patience in hearing and addressing my concerns. I plan
>> to remove new_ctx in the next version.
>>
>>>>> devices supporting vdev->has_dyn_msix only ever have active contexts
>>>>> allocated? Thanks,
>>>>
>>>> What do you see as an "active context"? A policy that is currently enforced
>>>> is that an allocated context always has an allocated interrupt associated
>>>> with it. I do not see how this could be expanded to also require an
>>>> enabled interrupt because interrupt enabling requires a trigger that
>>>> may not be available.
>>>
>>> A context is essentially meant to track a trigger, ie. an eventfd
>>> provided by the user. In the static case all the irqs are necessarily
>>> pre-allocated, therefore we had no reason to consider a dynamic array
>>> for the contexts. However, a given context is really only "active" if
>>> it has a trigger, otherwise it's just a placeholder. When the
>>> placeholder is filled by an eventfd, the pre-allocated irq is enabled.
>>
>> I see.
>>
>>>
>>> This proposal seems to be a hybrid approach, pre-allocating some
>>> initial set of irqs and contexts and expecting the differentiation to
>>> occur only when new vectors are added, though we have some disagreement
>>> about this per above. Unfortunately I don't see an API to enable MSI-X
>>> without some vectors, so some pre-allocation of irqs seems to be
>>> required regardless.
>>
>> Right. pci_alloc_irq_vectors() or equivalent continues to be needed to
>> enable MSI-X. Even so, it does seem possible (within vfio_msi_enable())
>> to just allocate one vector using pci_alloc_irq_vectors()
>> and then immediately free it using pci_msix_free_irq(). What do you think?
>
> QEMU does something similar but I think it can really only be described
> as a hack. In this case I think we can work with them being allocated
> since that's essentially the static path.
ok. In this case I understand the hybrid approach to be required. Without
something (a hack) like this I am not able to see how an "active context"
policy can be enforced though. Interrupts allocated during MSI-X enabling may
not have eventfd associated and thus cannot adhere to an "active context" policy. I
understand from earlier comments that we do not want to track where contexts
are allocated so I can only see a way to enforce a policy that a context has
an allocated interrupt, but not an enabled interrupt.
>> If I understand correctly this can be done without allocating any context
>> and leave MSI-X enabled without any interrupts allocated. This could be a
>> way to accomplish the "active context" policy for dynamic allocation.
>> This is not a policy that can be applied broadly to interrupt contexts though
>> because MSI and non-dynamic MSI-X could still have contexts with allocated
>> interrupts without eventfd.
>
> I think we could come up with wrappers that handle all cases, for
> example:
>
> int vfio_pci_alloc_irq(struct vfio_pci_core_device *vdev,
> unsigned int vector, int irq_type)
> {
> struct pci_dev *pdev = vdev->pdev;
> struct msi_map map;
> int irq;
>
> if (irq_type == VFIO_PCI_INTX_IRQ_INDEX)
> return pdev->irq ?: -EINVAL;
>
> irq = pci_irq_vector(pdev, vector);
> if (irq > 0 || irq_type == VFIO_PCI_MSI_IRQ_INDEX ||
> !vdev->has_dyn_msix)
> return irq;
>
> map = pci_msix_alloc_irq_at(pdev, vector, NULL);
>
> return map.index;
> }
>
> void vfio_pci_free_irq(struct vfio_pci_core_device *vdev,
> unsigned in vector, int irq_type)
> {
> struct msi_map map;
> int irq;
>
> if (irq_type != VFIO_PCI_INTX_MSIX_INDEX ||
> !vdev->has_dyn_msix)
> return;
>
> irq = pci_irq_vector(pdev, vector);
> map = { .index = vector, .virq = irq };
>
> if (WARN_ON(irq < 0))
> return;
>
> pci_msix_free_irq(pdev, msix_map);
> }
Thank you very much for taking the time to write this out. I am not able to
see where vfio_pci_alloc_irq()/vfio_pci_free_irq() would be called for
an INTx interrupt. Is the INTx handling there for robustness or am I
missing how it should be used for INTx interrupts?
> At that point, maybe we'd check whether it makes sense to embed the irq
> alloc/free within the ctx alloc/free.
I think doing so would be the right thing to do since it helps
to enforce the policy that interrupts and contexts are allocated together.
I think this can be done when switching around the initialization within
vfio_msi_set_vector_signal(). I need to look into this more.
>>> But if non-active contexts were only placeholders in the pre-dynamic
>>> world and we now manage them via a dynamic array, why is there any
>>> pre-allocation of contexts without knowing the nature of the eventfd to
>>> fill it? We could have more commonality between cases if contexts are
>>> always dynamically allocated, which might simplify differentiation of
>>> the has_dyn_msix cases largely to wrappers allocating and freeing irqs.
>>> Thanks,
>>
>> Thank you very much for your guidance. I will digest this some more and
>> see how wrappers could be used. In the mean time while trying to think how
>> to unify this code I do think there is an issue in this patch in that
>> the get_cached_msi_msg()/pci_write_msi_msg()
>> should not be in an else branch.
>>
>> Specifically, I think it needs to be:
>> if (msix) {
>> if (irq == -EINVAL) {
>> /* dynamically allocate interrupt */
>> }
>> get_cached_msi_msg(irq, &msg);
>> pci_write_msi_msg(irq, &msg);
>> }
>
> Yes, that's looked wrong to me all along, I think that resolves it.
Thank you very much.
Reinette
next prev parent reply other threads:[~2023-04-04 16:55 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-28 21:53 [PATCH V2 0/8] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 1/8] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 2/8] vfio/pci: Remove negative check on unsigned vector Reinette Chatre
2023-03-30 20:26 ` Alex Williamson
2023-03-30 22:32 ` Reinette Chatre
2023-03-30 22:54 ` Alex Williamson
2023-03-30 23:54 ` Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 3/8] vfio/pci: Prepare for dynamic interrupt context storage Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 4/8] vfio/pci: Use xarray for " Reinette Chatre
2023-04-07 7:21 ` Liu, Jing2
2023-04-07 16:44 ` Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 5/8] vfio/pci: Remove interrupt context counter Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 6/8] vfio/pci: Move to single error path Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 7/8] vfio/pci: Support dynamic MSI-x Reinette Chatre
2023-03-29 2:48 ` kernel test robot
2023-03-29 14:42 ` Reinette Chatre
2023-03-29 22:10 ` Reinette Chatre
2023-03-29 2:58 ` kernel test robot
2023-03-30 22:40 ` Alex Williamson
2023-03-30 22:42 ` Alex Williamson
2023-03-31 17:49 ` Reinette Chatre
2023-03-31 22:24 ` Alex Williamson
2023-04-03 17:31 ` Reinette Chatre
2023-04-03 20:22 ` Alex Williamson
2023-04-03 22:50 ` Reinette Chatre
2023-04-04 3:18 ` Alex Williamson
2023-04-04 3:51 ` Tian, Kevin
2023-04-04 17:29 ` Reinette Chatre
2023-04-04 18:43 ` Alex Williamson
2023-04-04 20:46 ` Reinette Chatre
2023-04-04 16:54 ` Reinette Chatre [this message]
2023-04-04 18:24 ` Alex Williamson
2023-04-06 20:13 ` Reinette Chatre
2023-03-31 10:02 ` Liu, Jing2
2023-03-31 13:51 ` Alex Williamson
2023-04-04 3:19 ` Liu, Jing2
2023-03-28 21:53 ` [PATCH V2 8/8] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Reinette Chatre
2023-03-29 3:29 ` kernel test robot
2023-03-29 3:29 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5efa361d-012b-bdb6-b5e5-869887bde98d@intel.com \
--to=reinette.chatre@intel.com \
--cc=alex.williamson@redhat.com \
--cc=ashok.raj@intel.com \
--cc=darwi@linutronix.de \
--cc=dave.jiang@intel.com \
--cc=fenghua.yu@intel.com \
--cc=jgg@nvidia.com \
--cc=jing2.liu@intel.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=tglx@linutronix.de \
--cc=tom.zanussi@linux.intel.com \
--cc=yishaih@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.