From mboxrd@z Thu Jan 1 00:00:00 1970 From: Zhenzhong Duan Subject: Re: [PATCH] Bypass mask bit of msix entry in xen Date: Thu, 21 Mar 2013 18:50:22 +0800 Message-ID: <514AE5EE.30402@oracle.com> References: <514A80CF.5040705@oracle.com> <514AEB0602000078000C76F6@nat28.tlf.novell.com> Reply-To: zhenzhong.duan@oracle.com Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <514AEB0602000078000C76F6@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Konrad Rzeszutek Wilk , Feng Jin , xen-devel List-Id: xen-devel@lists.xenproject.org On 2013-03-21 18:12, Jan Beulich wrote: >>>> On 21.03.13 at 04:38, Zhenzhong Duan wrote: >> Currently xen doesn't support passthrough mask bit in msix entry, >> this will conflict with qemu's control to that bit. >> Pass it in xen and let device module simulate it. >> >> Signed-off-by: Zhenzhong Duan >> --- >> xen/arch/x86/hvm/vmsi.c | 67 +++++++++++----------------------------------- >> 1 files changed, 16 insertions(+), 51 deletions(-) >> >> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c >> index cfc7c80..845cf4c 100644 >> --- a/xen/arch/x86/hvm/vmsi.c >> +++ b/xen/arch/x86/hvm/vmsi.c >> @@ -167,7 +167,7 @@ struct msixtbl_entry >> unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)]; >> #define MAX_MSIX_ACC_ENTRIES 3 >> struct { >> - uint32_t msi_ad[3]; /* Shadow of address low, high and data */ >> + uint32_t msi_ad[4]; /* Shadow of address low, high, data and control */ >> } gentries[MAX_MSIX_ACC_ENTRIES]; >> struct rcu_head rcu; >> }; >> @@ -213,7 +213,6 @@ static int msixtbl_read( >> { >> unsigned long offset; >> struct msixtbl_entry *entry; >> - void *virt; >> unsigned int nr_entry, index; >> int r = X86EMUL_UNHANDLEABLE; >> >> @@ -224,23 +223,14 @@ static int msixtbl_read( >> >> entry = msixtbl_find_entry(v, address); >> offset = address & (PCI_MSIX_ENTRY_SIZE - 1); >> + nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE; >> + >> + if ( nr_entry >= MAX_MSIX_ACC_ENTRIES ) >> + goto out; >> + >> + index = offset / sizeof(uint32_t); >> + *pval = entry->gentries[nr_entry].msi_ad[index]; >> >> - if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET ) >> - { >> - nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE; >> - if ( nr_entry >= MAX_MSIX_ACC_ENTRIES ) >> - goto out; >> - index = offset / sizeof(uint32_t); >> - *pval = entry->gentries[nr_entry].msi_ad[index]; >> - } >> - else >> - { >> - virt = msixtbl_addr_to_virt(entry, address); >> - if ( !virt ) >> - goto out; >> - *pval = readl(virt); >> - } >> - >> r = X86EMUL_OKAY; >> out: >> rcu_read_unlock(&msixtbl_rcu_lock); >> @@ -252,7 +242,6 @@ static int msixtbl_write(struct vcpu *v, unsigned long address, >> { >> unsigned long offset; >> struct msixtbl_entry *entry; >> - void *virt; >> unsigned int nr_entry, index; >> int r = X86EMUL_UNHANDLEABLE; >> >> @@ -264,42 +253,18 @@ static int msixtbl_write(struct vcpu *v, unsigned long address, >> entry = msixtbl_find_entry(v, address); >> nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE; >> >> - offset = address & (PCI_MSIX_ENTRY_SIZE - 1); >> - if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) >> - { >> - if ( nr_entry < MAX_MSIX_ACC_ENTRIES ) >> - { >> - index = offset / sizeof(uint32_t); >> - entry->gentries[nr_entry].msi_ad[index] = val; >> - } >> - set_bit(nr_entry, &entry->table_flags); >> - goto out; >> - } >> - >> - /* exit to device model if address/data has been modified */ >> - if ( test_and_clear_bit(nr_entry, &entry->table_flags) ) >> + if ( nr_entry >= MAX_MSIX_ACC_ENTRIES ) >> goto out; >> >> - virt = msixtbl_addr_to_virt(entry, address); >> - if ( !virt ) >> - goto out; >> + offset = address & (PCI_MSIX_ENTRY_SIZE - 1); >> + index = offset / sizeof(uint32_t); >> + entry->gentries[nr_entry].msi_ad[index] = val; >> >> - /* Do not allow the mask bit to be changed. */ >> -#if 0 /* XXX >> - * As the mask bit is the only defined bit in the word, and as the >> - * host MSI-X code doesn't preserve the other bits anyway, doing >> - * this is pointless. So for now just discard the write (also >> - * saving us from having to determine the matching irq_desc). >> - */ >> - spin_lock_irqsave(&desc->lock, flags); >> - orig = readl(virt); >> - val &= ~PCI_MSIX_VECTOR_BITMASK; >> - val |= orig & PCI_MSIX_VECTOR_BITMASK; >> - writel(val, virt); >> - spin_unlock_irqrestore(&desc->lock, flags); >> -#endif >> + if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) >> + set_bit(nr_entry, &entry->table_flags); >> + else >> + clear_bit(nr_entry, &entry->table_flags); >> >> - r = X86EMUL_OKAY; > With that you pass all writes to the device model. Which means the > acceleration code could as well be dropped altogether, rather than > modifying it in ways that look questionable to me without further > explanation. It only accelerate msix msixtbl_read with this change, and the acceleration need those code in msixtbl_write. > > Furthermore, without explanation I also don't see how the mask > bit is now being dealt with properly: From an abstract pov you'd > need to merge ("or") the guest-requested mask bit state with what > Xen needs for its own purposes. I don't see anything like that here > or in the qemu side patch. Right, I didn't consider combine xen's masking with guest's. My patch just target making irq affinity in old hvm guest work and sometimes no irq handler panic. But I would like to know what's the result if guest's mask setting doesn't pass to device, interrupt loss or something else? I see current implemention didn't passthrough mask bit too. thanks zduan