From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [PVH]: Help: msi.c
Date: Fri, 21 Dec 2012 16:01:53 -0500
Message-ID: <20121221210153.GA32115@phenom.dumpdata.com>
References: <20121212171523.332a0a89@mantra.us.oracle.com>
 <20121212174312.68146c02@mantra.us.oracle.com>
 <50C9BF1E02000078000B00FC@nat28.tlf.novell.com>
 <50C9E88F02000078000B02A1@nat28.tlf.novell.com>
 <20121218211613.GA5697@phenom.dumpdata.com>
 <20121219153728.GG10062@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Stefano Stabellini
Cc: Jan Beulich, xen-devel
List-Id: xen-devel@lists.xenproject.org

> > > > > pci_enable_msix -> msix_capability_init -> msix_program_entries
> > > > >
> > > > > Unfortunately msix_program_entries is called few lines after
> > > > > arch_setup_msi_irqs, where we call PHYSDEVOP_map_pirq to map the MSI as
> > > > > a pirq.
> > > > > However after that is done, all the masking/unmask is done via irq_mask
> > > > > that we handle properly masking/unmasking the corresponding event
> > > > > channels.
> > > > >
> > > > >
> > > > > Possible solutions on top of my head:
> > > >
> > > > There is also the potential to piggyback on Joerg's patches
> > > > that introduce a new x86_msi_ops: compose_msi_msg.
> > > >
> > > > See here: https://lkml.org/lkml/2012/8/20/432
> > > > (I think there was also a more recent one posted at some point).
> > >
> > > Given that dom0 should never write to the MSI-X table, introducing a new
> >
> > How does this work with QEMU setting up MSI and MSI-X on behalf of
> > guests? Or is that actually handled by Xen hypervisor?
>
> In the case of HVM guests, QEMU emulates the PCI config space and the
> table, so it is OK for the guest to write to it.
>
>
> > > msi_ops that replaces msix_program_entries (or at least the part of
> > > msix_program_entries that masks all the entries) is the only solution
> > > left.
> >
> > so this one (__msix_mask_irq):
> >
> >         mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
> >         if (flag)
> >                 mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
> >         writel(mask_bits, desc->mask_base + offset);
> >
>
> Yes, that's the one. One could argue that __msix_mask_irq should call
> mask_irq rather than writing to the table directly.

You mean 'irq_mask'? Not really - that is within the IOAPIC domain.
To be more generic it would then also have to cover the other usages,
that is the 'readl' and 'writel' users.

My understanding of why we have been fortunate enough to have this
working so far is that the hypercall we do beforehand writes the 'pirq'
into the MSI-X BAR, the Linux kernel later reads that value back (with
readl), and the kernel then ends up re-writing that same value.

The other thing we can do, to bypass the msi.c writes entirely, is to
have xen_initdom_setup_msi_irqs make desc->mask_base point somewhere
safe. Meaning: point it at a page we allocate when we set up the IRQs
and fill with whatever we want (which I guess would be the pirq values
we just got).
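
A rough user-space sketch of that last idea follows. It is not kernel
code and not a patch. Every name in it (shadow_msix, shadow_msix_init,
shadow_msix_mask and so on) is hypothetical, and keeping the pirq in the
message-data field of each entry is purely an assumption made for
illustration. The constants mirror the MSI-X table layout (16-byte
entries, vector control dword at offset 12, mask bit is bit 0). The
point is only that the same read-modify-write as in __msix_mask_irq can
land in ordinary memory instead of the device's MSI-X table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout constants mirroring the MSI-X table format (hypothetical names). */
#define SHADOW_ENTRY_SIZE        16u
#define SHADOW_ENTRY_DATA         8u
#define SHADOW_ENTRY_VECTOR_CTRL 12u
#define SHADOW_CTRL_MASKBIT     0x1u

/* Stand-in for the part of the MSI descriptor that msi.c writes through. */
struct shadow_msix {
	uint8_t *mask_base;      /* plays the role of desc->mask_base */
	unsigned int nr_entries;
};

/* Plain-memory stand-ins for readl()/writel(). */
static inline uint32_t shadow_readl(const uint8_t *addr)
{
	uint32_t v;

	memcpy(&v, addr, sizeof(v));
	return v;
}

static inline void shadow_writel(uint32_t v, uint8_t *addr)
{
	memcpy(addr, &v, sizeof(v));
}

/*
 * Allocate a zeroed shadow table and seed each entry with its pirq.
 * Assumption: the pirq is stored in the data field of the entry.
 */
int shadow_msix_init(struct shadow_msix *s, const uint32_t *pirqs,
		     unsigned int n)
{
	unsigned int i;

	if (n == 0)
		return -1;
	s->mask_base = calloc(n, SHADOW_ENTRY_SIZE);
	if (!s->mask_base)
		return -1;
	s->nr_entries = n;
	for (i = 0; i < n; i++)
		shadow_writel(pirqs[i], s->mask_base +
			      i * SHADOW_ENTRY_SIZE + SHADOW_ENTRY_DATA);
	return 0;
}

/* Same read-modify-write as the quoted snippet, aimed at the shadow page. */
uint32_t shadow_msix_mask(struct shadow_msix *s, unsigned int entry, int flag)
{
	uint8_t *ctrl;
	uint32_t mask_bits;

	assert(entry < s->nr_entries);
	ctrl = s->mask_base + entry * SHADOW_ENTRY_SIZE +
	       SHADOW_ENTRY_VECTOR_CTRL;
	mask_bits = shadow_readl(ctrl);
	mask_bits &= ~SHADOW_CTRL_MASKBIT;
	if (flag)
		mask_bits |= SHADOW_CTRL_MASKBIT;
	shadow_writel(mask_bits, ctrl);
	return mask_bits;
}

/* Read back the pirq that was seeded for an entry. */
uint32_t shadow_msix_pirq(const struct shadow_msix *s, unsigned int entry)
{
	assert(entry < s->nr_entries);
	return shadow_readl(s->mask_base + entry * SHADOW_ENTRY_SIZE +
			    SHADOW_ENTRY_DATA);
}

void shadow_msix_free(struct shadow_msix *s)
{
	free(s->mask_base);
	s->mask_base = NULL;
	s->nr_entries = 0;
}
```

With something along these lines, masking and unmasking an entry only
toggles a bit in the allocated buffer and leaves the seeded pirq values
alone. Whether the real table can be left entirely to the hypervisor
that way is exactly the open question above.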