public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: <jgg@nvidia.com>, <yishaih@nvidia.com>,
	<shameerali.kolothum.thodi@huawei.com>, <kevin.tian@intel.com>,
	<tglx@linutronix.de>, <darwi@linutronix.de>,
	<kvm@vger.kernel.org>, <dave.jiang@intel.com>,
	<jing2.liu@intel.com>, <ashok.raj@intel.com>,
	<fenghua.yu@intel.com>, <tom.zanussi@linux.intel.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH V2 7/8] vfio/pci: Support dynamic MSI-x
Date: Tue, 4 Apr 2023 09:54:46 -0700	[thread overview]
Message-ID: <5efa361d-012b-bdb6-b5e5-869887bde98d@intel.com> (raw)
In-Reply-To: <20230403211841.0e206b67.alex.williamson@redhat.com>

Hi Alex,

On 4/3/2023 8:18 PM, Alex Williamson wrote:
> On Mon, 3 Apr 2023 15:50:54 -0700
> Reinette Chatre <reinette.chatre@intel.com> wrote:
>> On 4/3/2023 1:22 PM, Alex Williamson wrote:
>>> On Mon, 3 Apr 2023 10:31:23 -0700
>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>>> On 3/31/2023 3:24 PM, Alex Williamson wrote:  
>>>>> On Fri, 31 Mar 2023 10:49:16 -0700
>>>>> Reinette Chatre <reinette.chatre@intel.com> wrote:    
>>>>>> On 3/30/2023 3:42 PM, Alex Williamson wrote:    
>>>>>>> On Thu, 30 Mar 2023 16:40:50 -0600
>>>>>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>>>>>       
>>>>>>>> On Tue, 28 Mar 2023 14:53:34 -0700
>>>>>>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>>>>>>>      


...

>>> If the goal is to allow the user to swap one eventfd for another, where
>>> the result will always be the new eventfd on success or the old eventfd
>>> on error, I don't see that this code does that, or that we've ever
>>> attempted to make such a guarantee.  If the ioctl errors, I think the
>>> eventfds are generally deconfigured.   We certainly have the unwind code
>>> that we discussed earlier that deconfigures all the vectors previously
>>> touched in the loop (which seems to be another path where we could
>>> de-allocate from the set of initial ctxs).  
>>
>> Thank you for your patience in hearing and addressing my concerns. I plan
>> to remove new_ctx in the next version.
>>
>>>>> devices supporting vdev->has_dyn_msix only ever have active contexts
>>>>> allocated?  Thanks,    
>>>>
>>>> What do you see as an "active context"? A policy that is currently enforced
>>>> is that an allocated context always has an allocated interrupt associated
>>>> with it. I do not see how this could be expanded to also require an
>>>> enabled interrupt because interrupt enabling requires a trigger that
>>>> may not be available.  
>>>
>>> A context is essentially meant to track a trigger, ie. an eventfd
>>> provided by the user.  In the static case all the irqs are necessarily
>>> pre-allocated, therefore we had no reason to consider a dynamic array
>>> for the contexts.  However, a given context is really only "active" if
>>> it has a trigger, otherwise it's just a placeholder.  When the
>>> placeholder is filled by an eventfd, the pre-allocated irq is enabled.  
>>
>> I see.
>>
>>>
>>> This proposal seems to be a hybrid approach, pre-allocating some
>>> initial set of irqs and contexts and expecting the differentiation to
>>> occur only when new vectors are added, though we have some disagreement
>>> about this per above.  Unfortunately I don't see an API to enable MSI-X
>>> without some vectors, so some pre-allocation of irqs seems to be
>>> required regardless.  
>>
>> Right. pci_alloc_irq_vectors() or equivalent continues to be needed to
>> enable MSI-X. Even so, it does seem possible (within vfio_msi_enable())
>> to just allocate one vector using pci_alloc_irq_vectors()
>> and then immediately free it using pci_msix_free_irq(). What do you think?
> 
> QEMU does something similar but I think it can really only be described
> as a hack.  In this case I think we can work with them being allocated
> since that's essentially the static path.

ok. In this case I understand the hybrid approach to be required. Without
something (a hack) like this I am not able to see how an "active context"
policy can be enforced though. Interrupts allocated during MSI-X enabling may
not have eventfd associated and thus cannot adhere to an "active context" policy. I
understand from  earlier comments that we do not want to track where contexts
are allocated so I can only see a way to enforce a policy that a context has
an allocated interrupt, but not an enabled interrupt.

>> If I understand correctly this can be done without allocating any context
>> and leave MSI-X enabled without any interrupts allocated. This could be a
>> way to accomplish the "active context" policy for dynamic allocation.
>> This is not a policy that can be applied broadly to interrupt contexts though
>> because MSI and non-dynamic MSI-X could still have contexts with allocated
>> interrupts without eventfd.
> 
> I think we could come up with wrappers that handle all cases, for
> example:
> 
> int vfio_pci_alloc_irq(struct vfio_pci_core_device *vdev,
> 		       unsigned int vector, int irq_type)
> {
> 	struct pci_dev *pdev = vdev->pdev;
> 	struct msi_map map;
> 	int irq;
> 
> 	if (irq_type == VFIO_PCI_INTX_IRQ_INDEX)
> 		return pdev->irq ?: -EINVAL;
> 
> 	irq = pci_irq_vector(pdev, vector);
> 	if (irq > 0 || irq_type == VFIO_PCI_MSI_IRQ_INDEX ||
> 	    !vdev->has_dyn_msix)
> 		return irq;
> 
> 	map = pci_msix_alloc_irq_at(pdev, vector, NULL);
> 
> 	return map.index;
> }
> 
> void vfio_pci_free_irq(struct vfio_pci_core_device *vdev,
> 		       unsigned in vector, int irq_type)
> {
> 	struct msi_map map;
> 	int irq;
> 
> 	if (irq_type != VFIO_PCI_INTX_MSIX_INDEX ||
> 	    !vdev->has_dyn_msix)
> 		return;
> 
> 	irq = pci_irq_vector(pdev, vector);
> 	map = { .index = vector, .virq = irq };
> 
> 	if (WARN_ON(irq < 0))
> 		return;
> 
> 	pci_msix_free_irq(pdev, msix_map);
> }

Thank you very much for taking the time to write this out. I am not able to
see where vfio_pci_alloc_irq()/vfio_pci_free_irq() would be called for
an INTx interrupt. Is the INTx handling there for robustness or am I
missing how it should be used for INTx interrupts?

> At that point, maybe we'd check whether it makes sense to embed the irq
> alloc/free within the ctx alloc/free.

I think doing so would be the right thing to do since it helps
to enforce the policy that interrupts and contexts are allocated together.
I think this can be done when switching around the initialization within 
vfio_msi_set_vector_signal(). I need to look into this more.

>>> But if non-active contexts were only placeholders in the pre-dynamic
>>> world and we now manage them via a dynamic array, why is there any
>>> pre-allocation of contexts without knowing the nature of the eventfd to
>>> fill it?  We could have more commonality between cases if contexts are
>>> always dynamically allocated, which might simplify differentiation of
>>> the has_dyn_msix cases largely to wrappers allocating and freeing irqs.
>>> Thanks,  
>>
>> Thank you very much for your guidance. I will digest this some more and 
>> see how wrappers could be used. In the mean time while trying to think how
>> to unify this code I do think there is an issue in this patch in that
>> the get_cached_msi_msg()/pci_write_msi_msg()
>> should not be in an else branch.
>>
>> Specifically, I think it needs to be:
>> 	if (msix) {
>> 		if (irq == -EINVAL) {
>> 			/* dynamically allocate interrupt */
>> 		}
>> 		get_cached_msi_msg(irq, &msg);
>> 		pci_write_msi_msg(irq, &msg);
>> 	}
> 
> Yes, that's looked wrong to me all along, I think that resolves it.

Thank you very much.

Reinette


  parent reply	other threads:[~2023-04-04 16:55 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-28 21:53 [PATCH V2 0/8] vfio/pci: Support dynamic allocation of MSI-X interrupts Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 1/8] vfio/pci: Consolidate irq cleanup on MSI/MSI-X disable Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 2/8] vfio/pci: Remove negative check on unsigned vector Reinette Chatre
2023-03-30 20:26   ` Alex Williamson
2023-03-30 22:32     ` Reinette Chatre
2023-03-30 22:54       ` Alex Williamson
2023-03-30 23:54         ` Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 3/8] vfio/pci: Prepare for dynamic interrupt context storage Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 4/8] vfio/pci: Use xarray for " Reinette Chatre
2023-04-07  7:21   ` Liu, Jing2
2023-04-07 16:44     ` Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 5/8] vfio/pci: Remove interrupt context counter Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 6/8] vfio/pci: Move to single error path Reinette Chatre
2023-03-28 21:53 ` [PATCH V2 7/8] vfio/pci: Support dynamic MSI-x Reinette Chatre
2023-03-29  2:48   ` kernel test robot
2023-03-29 14:42     ` Reinette Chatre
2023-03-29 22:10       ` Reinette Chatre
2023-03-29  2:58   ` kernel test robot
2023-03-30 22:40   ` Alex Williamson
2023-03-30 22:42     ` Alex Williamson
2023-03-31 17:49       ` Reinette Chatre
2023-03-31 22:24         ` Alex Williamson
2023-04-03 17:31           ` Reinette Chatre
2023-04-03 20:22             ` Alex Williamson
2023-04-03 22:50               ` Reinette Chatre
2023-04-04  3:18                 ` Alex Williamson
2023-04-04  3:51                   ` Tian, Kevin
2023-04-04 17:29                     ` Reinette Chatre
2023-04-04 18:43                       ` Alex Williamson
2023-04-04 20:46                         ` Reinette Chatre
2023-04-04 16:54                   ` Reinette Chatre [this message]
2023-04-04 18:24                     ` Alex Williamson
2023-04-06 20:13                       ` Reinette Chatre
2023-03-31 10:02   ` Liu, Jing2
2023-03-31 13:51     ` Alex Williamson
2023-04-04  3:19       ` Liu, Jing2
2023-03-28 21:53 ` [PATCH V2 8/8] vfio/pci: Clear VFIO_IRQ_INFO_NORESIZE for MSI-X Reinette Chatre
2023-03-29  3:29   ` kernel test robot
2023-03-29  3:29   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5efa361d-012b-bdb6-b5e5-869887bde98d@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=darwi@linutronix.de \
    --cc=dave.jiang@intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=jgg@nvidia.com \
    --cc=jing2.liu@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=tglx@linutronix.de \
    --cc=tom.zanussi@linux.intel.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox