From: Zhenzhong Duan <zhenzhong.duan@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Feng Jin <joe.jin@oracle.com>,
xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH] Bypass mask bit of msix entry in xen
Date: Thu, 21 Mar 2013 18:50:22 +0800
Message-ID: <514AE5EE.30402@oracle.com>
In-Reply-To: <514AEB0602000078000C76F6@nat28.tlf.novell.com>
On 2013-03-21 18:12, Jan Beulich wrote:
>>>> On 21.03.13 at 04:38, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
>> Currently xen doesn't support passthrough mask bit in msix entry,
>> this will conflict with qemu's control to that bit.
>> Pass it in xen and let device module simulate it.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
>> ---
>> xen/arch/x86/hvm/vmsi.c | 67 +++++++++++-----------------------------------
>> 1 files changed, 16 insertions(+), 51 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
>> index cfc7c80..845cf4c 100644
>> --- a/xen/arch/x86/hvm/vmsi.c
>> +++ b/xen/arch/x86/hvm/vmsi.c
>> @@ -167,7 +167,7 @@ struct msixtbl_entry
>> unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)];
>> #define MAX_MSIX_ACC_ENTRIES 3
>> struct {
>> - uint32_t msi_ad[3]; /* Shadow of address low, high and data */
>> + uint32_t msi_ad[4]; /* Shadow of address low, high, data and control */
>> } gentries[MAX_MSIX_ACC_ENTRIES];
>> struct rcu_head rcu;
>> };
>> @@ -213,7 +213,6 @@ static int msixtbl_read(
>> {
>> unsigned long offset;
>> struct msixtbl_entry *entry;
>> - void *virt;
>> unsigned int nr_entry, index;
>> int r = X86EMUL_UNHANDLEABLE;
>>
>> @@ -224,23 +223,14 @@ static int msixtbl_read(
>>
>> entry = msixtbl_find_entry(v, address);
>> offset = address & (PCI_MSIX_ENTRY_SIZE - 1);
>> + nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
>> +
>> + if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
>> + goto out;
>> +
>> + index = offset / sizeof(uint32_t);
>> + *pval = entry->gentries[nr_entry].msi_ad[index];
>>
>> - if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET )
>> - {
>> - nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
>> - if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
>> - goto out;
>> - index = offset / sizeof(uint32_t);
>> - *pval = entry->gentries[nr_entry].msi_ad[index];
>> - }
>> - else
>> - {
>> - virt = msixtbl_addr_to_virt(entry, address);
>> - if ( !virt )
>> - goto out;
>> - *pval = readl(virt);
>> - }
>> -
>> r = X86EMUL_OKAY;
>> out:
>> rcu_read_unlock(&msixtbl_rcu_lock);
>> @@ -252,7 +242,6 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>> {
>> unsigned long offset;
>> struct msixtbl_entry *entry;
>> - void *virt;
>> unsigned int nr_entry, index;
>> int r = X86EMUL_UNHANDLEABLE;
>>
>> @@ -264,42 +253,18 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>> entry = msixtbl_find_entry(v, address);
>> nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
>>
>> - offset = address & (PCI_MSIX_ENTRY_SIZE - 1);
>> - if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET)
>> - {
>> - if ( nr_entry < MAX_MSIX_ACC_ENTRIES )
>> - {
>> - index = offset / sizeof(uint32_t);
>> - entry->gentries[nr_entry].msi_ad[index] = val;
>> - }
>> - set_bit(nr_entry, &entry->table_flags);
>> - goto out;
>> - }
>> -
>> - /* exit to device model if address/data has been modified */
>> - if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
>> + if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
>> goto out;
>>
>> - virt = msixtbl_addr_to_virt(entry, address);
>> - if ( !virt )
>> - goto out;
>> + offset = address & (PCI_MSIX_ENTRY_SIZE - 1);
>> + index = offset / sizeof(uint32_t);
>> + entry->gentries[nr_entry].msi_ad[index] = val;
>>
>> - /* Do not allow the mask bit to be changed. */
>> -#if 0 /* XXX
>> - * As the mask bit is the only defined bit in the word, and as the
>> - * host MSI-X code doesn't preserve the other bits anyway, doing
>> - * this is pointless. So for now just discard the write (also
>> - * saving us from having to determine the matching irq_desc).
>> - */
>> - spin_lock_irqsave(&desc->lock, flags);
>> - orig = readl(virt);
>> - val &= ~PCI_MSIX_VECTOR_BITMASK;
>> - val |= orig & PCI_MSIX_VECTOR_BITMASK;
>> - writel(val, virt);
>> - spin_unlock_irqrestore(&desc->lock, flags);
>> -#endif
>> + if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET)
>> + set_bit(nr_entry, &entry->table_flags);
>> + else
>> + clear_bit(nr_entry, &entry->table_flags);
>>
>> - r = X86EMUL_OKAY;
> With that you pass all writes to the device model. Which means the
> acceleration code could as well be dropped altogether, rather than
> modifying it in ways that look questionable to me without further
> explanation.
With this change only msixtbl_read is accelerated, and that acceleration
still needs the code in msixtbl_write, because the write path is what
fills the shadow entries that the read path returns.
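To make the dependency concrete, here is a minimal standalone sketch of the shadow logic in the patch. It is not the Xen code: struct shadow_table, shadow_write and shadow_read are hypothetical names, the dirty word stands in for table_flags, and the constants are assumed to match the 16-byte MSI-X entry layout. In the patched msixtbl_write the write is still forwarded to the device model in every case; the sketch only models the bookkeeping.

```c
#include <stdint.h>

/* Simplified stand-ins for the Xen definitions (assumed values). */
#define PCI_MSIX_ENTRY_SIZE               16
#define PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET 12
#define MAX_MSIX_ACC_ENTRIES              3

/* Hypothetical, reduced version of struct msixtbl_entry. */
struct shadow_table {
    uint64_t gtable;                           /* guest address of the MSI-X table */
    uint32_t msi_ad[MAX_MSIX_ACC_ENTRIES][4];  /* addr low, addr high, data, control */
    unsigned long dirty;                       /* bit n set: addr/data of entry n changed */
};

/* Record a guest write in the shadow. Returns 0 if shadowed, -1 if not. */
static int shadow_write(struct shadow_table *t, uint64_t address, uint32_t val)
{
    uint64_t nr_entry = (address - t->gtable) / PCI_MSIX_ENTRY_SIZE;
    unsigned int offset = address & (PCI_MSIX_ENTRY_SIZE - 1);

    if (nr_entry >= MAX_MSIX_ACC_ENTRIES)
        return -1;

    t->msi_ad[nr_entry][offset / sizeof(uint32_t)] = val;

    /* Address/data writes mark the entry dirty; a control write clears it. */
    if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET)
        t->dirty |= 1UL << nr_entry;
    else
        t->dirty &= ~(1UL << nr_entry);

    return 0;
}

/* Serve a guest read from the shadow. Returns 0 on success, -1 if not shadowed. */
static int shadow_read(const struct shadow_table *t, uint64_t address, uint32_t *pval)
{
    uint64_t nr_entry = (address - t->gtable) / PCI_MSIX_ENTRY_SIZE;
    unsigned int offset = address & (PCI_MSIX_ENTRY_SIZE - 1);

    if (nr_entry >= MAX_MSIX_ACC_ENTRIES)
        return -1;

    *pval = t->msi_ad[nr_entry][offset / sizeof(uint32_t)];
    return 0;
}
```

A read can only be answered without the device model if an earlier write has stored the value, which is why the write path stays.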
>
> Furthermore, without explanation I also don't see how the mask
> bit is now being dealt with properly: From an abstract pov you'd
> need to merge ("or") the guest-requested mask bit state with what
> Xen needs for its own purposes. I don't see anything like that here
> or in the qemu side patch.
Right, I didn't consider combining Xen's masking with the guest's.
My patch only aims to make irq affinity work in old HVM guests and to
avoid the occasional "no irq handler" panic.
But I would like to know what happens if the guest's mask setting is not
passed to the device: interrupt loss, or something else? As far as I can
see, the current implementation doesn't pass the mask bit through either.
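For reference, my understanding of the "or" merge you describe is roughly the following. This is only a sketch under assumptions: effective_vector_ctrl and host_masked are hypothetical names, not existing Xen code, and the mask is taken to be bit 0 of the vector control word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask bit is bit 0 of the MSI-X vector control word. */
#define PCI_MSIX_VECTOR_BITMASK 1u

/*
 * Hypothetical helper: compute the control word to write to the device.
 * The entry stays masked if either the guest or the host wants it masked.
 */
static uint32_t effective_vector_ctrl(uint32_t guest_ctrl, bool host_masked)
{
    uint32_t ctrl = guest_ctrl & ~PCI_MSIX_VECTOR_BITMASK;

    if ((guest_ctrl & PCI_MSIX_VECTOR_BITMASK) || host_masked)
        ctrl |= PCI_MSIX_VECTOR_BITMASK;

    return ctrl;
}
```

With this, a guest unmask would take effect on the device only once Xen no longer needs the entry masked for its own purposes.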
thanks
zduan