Re: [PATCH] Bypass mask bit of msix entry in xen

From: Zhenzhong Duan <zhenzhong.duan@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Feng Jin <joe.jin@oracle.com>,
	xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH] Bypass mask bit of msix entry in xen
Date: Thu, 21 Mar 2013 18:50:22 +0800
Message-ID: <514AE5EE.30402@oracle.com>
In-Reply-To: <514AEB0602000078000C76F6@nat28.tlf.novell.com>


On 2013-03-21 18:12, Jan Beulich wrote:
>>>> On 21.03.13 at 04:38, Zhenzhong Duan <zhenzhong.duan@oracle.com> wrote:
>> Currently Xen doesn't pass through the mask bit of an MSI-X entry,
>> which conflicts with qemu's control of that bit.
>> Pass it through in Xen and let the device model emulate it.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
>> ---
>>   xen/arch/x86/hvm/vmsi.c |   67 +++++++++++-----------------------------------
>>   1 files changed, 16 insertions(+), 51 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
>> index cfc7c80..845cf4c 100644
>> --- a/xen/arch/x86/hvm/vmsi.c
>> +++ b/xen/arch/x86/hvm/vmsi.c
>> @@ -167,7 +167,7 @@ struct msixtbl_entry
>>       unsigned long table_flags[BITS_TO_LONGS(MAX_MSIX_TABLE_ENTRIES)];
>>   #define MAX_MSIX_ACC_ENTRIES 3
>>       struct {
>> -        uint32_t msi_ad[3];	/* Shadow of address low, high and data */
>> +        uint32_t msi_ad[4];	/* Shadow of address low, high, data and control */
>>       } gentries[MAX_MSIX_ACC_ENTRIES];
>>       struct rcu_head rcu;
>>   };
>> @@ -213,7 +213,6 @@ static int msixtbl_read(
>>   {
>>       unsigned long offset;
>>       struct msixtbl_entry *entry;
>> -    void *virt;
>>       unsigned int nr_entry, index;
>>       int r = X86EMUL_UNHANDLEABLE;
>>   
>> @@ -224,23 +223,14 @@ static int msixtbl_read(
>>   
>>       entry = msixtbl_find_entry(v, address);
>>       offset = address & (PCI_MSIX_ENTRY_SIZE - 1);
>> +    nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
>> +
>> +    if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
>> +        goto out;
>> +
>> +    index = offset / sizeof(uint32_t);
>> +    *pval = entry->gentries[nr_entry].msi_ad[index];
>>   
>> -    if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET )
>> -    {
>> -        nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
>> -        if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
>> -            goto out;
>> -        index = offset / sizeof(uint32_t);
>> -        *pval = entry->gentries[nr_entry].msi_ad[index];
>> -    }
>> -    else
>> -    {
>> -        virt = msixtbl_addr_to_virt(entry, address);
>> -        if ( !virt )
>> -            goto out;
>> -        *pval = readl(virt);
>> -    }
>> -
>>       r = X86EMUL_OKAY;
>>   out:
>>       rcu_read_unlock(&msixtbl_rcu_lock);
>> @@ -252,7 +242,6 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>>   {
>>       unsigned long offset;
>>       struct msixtbl_entry *entry;
>> -    void *virt;
>>       unsigned int nr_entry, index;
>>       int r = X86EMUL_UNHANDLEABLE;
>>   
>> @@ -264,42 +253,18 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>>       entry = msixtbl_find_entry(v, address);
>>       nr_entry = (address - entry->gtable) / PCI_MSIX_ENTRY_SIZE;
>>   
>> -    offset = address & (PCI_MSIX_ENTRY_SIZE - 1);
>> -    if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET)
>> -    {
>> -        if ( nr_entry < MAX_MSIX_ACC_ENTRIES )
>> -        {
>> -            index = offset / sizeof(uint32_t);
>> -            entry->gentries[nr_entry].msi_ad[index] = val;
>> -        }
>> -        set_bit(nr_entry, &entry->table_flags);
>> -        goto out;
>> -    }
>> -
>> -    /* exit to device model if address/data has been modified */
>> -    if ( test_and_clear_bit(nr_entry, &entry->table_flags) )
>> +    if ( nr_entry >= MAX_MSIX_ACC_ENTRIES )
>>           goto out;
>>   
>> -    virt = msixtbl_addr_to_virt(entry, address);
>> -    if ( !virt )
>> -        goto out;
>> +    offset = address & (PCI_MSIX_ENTRY_SIZE - 1);
>> +    index = offset / sizeof(uint32_t);
>> +    entry->gentries[nr_entry].msi_ad[index] = val;
>>   
>> -    /* Do not allow the mask bit to be changed. */
>> -#if 0 /* XXX
>> -       * As the mask bit is the only defined bit in the word, and as the
>> -       * host MSI-X code doesn't preserve the other bits anyway, doing
>> -       * this is pointless. So for now just discard the write (also
>> -       * saving us from having to determine the matching irq_desc).
>> -       */
>> -    spin_lock_irqsave(&desc->lock, flags);
>> -    orig = readl(virt);
>> -    val &= ~PCI_MSIX_VECTOR_BITMASK;
>> -    val |= orig & PCI_MSIX_VECTOR_BITMASK;
>> -    writel(val, virt);
>> -    spin_unlock_irqrestore(&desc->lock, flags);
>> -#endif
>> +    if ( offset != PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET)
>> +        set_bit(nr_entry, &entry->table_flags);
>> +    else
>> +        clear_bit(nr_entry, &entry->table_flags);
>>   
>> -    r = X86EMUL_OKAY;
> With that you pass all writes to the device model. Which means the
> acceleration code could as well be dropped altogether, rather than
> modifying it in ways that look questionable to me without further
> explanation.
With this change only msixtbl_read is accelerated, and that
acceleration still needs the remaining code in msixtbl_write.
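
A minimal sketch of how the patched read path locates a shadow slot,
written as standalone C rather than Xen code. The helper msix_decode()
and struct msix_slot are hypothetical names used only for illustration.
The constants are assumed to match Xen's values (16-byte entries,
vector control at offset 12), and the guest table base is assumed to be
16-byte aligned so that masking the address yields the offset within
the entry.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, matching the usual MSI-X table layout. */
#define PCI_MSIX_ENTRY_SIZE               16u
#define PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET 12u
#define MAX_MSIX_ACC_ENTRIES              3u

/* Hypothetical result type: which entry and which dword was accessed. */
struct msix_slot {
    unsigned int nr_entry;  /* MSI-X table entry number */
    unsigned int index;     /* 0 = addr low, 1 = addr high, 2 = data, 3 = control */
    int shadowed;           /* non-zero if the entry has a shadow in gentries[] */
};

/* Split a guest-physical access into entry number and dword index. */
static struct msix_slot msix_decode(uint64_t gtable, uint64_t address)
{
    struct msix_slot s;
    uint64_t offset = address & (PCI_MSIX_ENTRY_SIZE - 1);

    assert(address >= gtable);
    s.nr_entry = (unsigned int)((address - gtable) / PCI_MSIX_ENTRY_SIZE);
    s.index = (unsigned int)(offset / sizeof(uint32_t));
    /* Only the first MAX_MSIX_ACC_ENTRIES entries are shadowed. */
    s.shadowed = s.nr_entry < MAX_MSIX_ACC_ENTRIES;
    return s;
}
```

With msi_ad[] widened to four dwords, index 3 (the vector control word)
stays inside the array. An access to entry 3 or above is not shadowed
and would take the "goto out" path to the device model.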
>
> Furthermore, without explanation I also don't see how the mask
> bit is now being dealt with properly: From an abstract pov you'd
> need to merge ("or") the guest-requested mask bit state with what
> Xen needs for its own purposes. I don't see anything like that here
> or in the qemu side patch.
Right, I didn't consider combining Xen's masking with the guest's.
My patch only aims to make irq affinity work in old HVM guests and
to avoid the occasional "no irq handler" panic.
But I would like to know what happens if the guest's mask setting
isn't passed to the device: interrupt loss, or something else?
As far as I can see, the current implementation doesn't pass
through the mask bit either.
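
A minimal sketch of the merge ("or") that Jan describes, as standalone
C rather than Xen or qemu code. struct msix_mask_state and the three
helpers are hypothetical names. The sketch assumes the hypervisor tracks
the guest-requested mask and its own mask separately, and that bit 0 of
the vector control word is the mask bit (PCI_MSIX_VECTOR_BITMASK).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_MSIX_VECTOR_BITMASK 0x1u  /* mask bit in the vector control word */

/* Hypothetical per-entry state: two independent reasons to mask. */
struct msix_mask_state {
    int guest_masked;  /* what the guest last wrote */
    int host_masked;   /* what the hypervisor needs for its own purposes */
};

/* Value for the physical entry: masked if either side wants it masked. */
static uint32_t effective_vector_ctrl(const struct msix_mask_state *s)
{
    assert(s != NULL);
    return (s->guest_masked || s->host_masked) ? PCI_MSIX_VECTOR_BITMASK : 0u;
}

/* Guest write: record the guest's wish, return what hardware should see. */
static uint32_t guest_write_ctrl(struct msix_mask_state *s, uint32_t val)
{
    s->guest_masked = (val & PCI_MSIX_VECTOR_BITMASK) != 0;
    return effective_vector_ctrl(s);
}

/* Guest read: report only the guest's own setting, not the host's. */
static uint32_t guest_read_ctrl(const struct msix_mask_state *s)
{
    return s->guest_masked ? PCI_MSIX_VECTOR_BITMASK : 0u;
}
```

Keeping the two flags separate means a guest unmask cannot clear a mask
the hypervisor still needs, and the guest reads back exactly what it
wrote.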

thanks
zduan

Thread overview: 5+ messages
2013-03-21  3:38 [PATCH] Bypass mask bit of msix entry in xen Zhenzhong Duan
2013-03-21 10:12 ` Jan Beulich
2013-03-21 10:50   ` Zhenzhong Duan [this message]
2013-03-21 13:10     ` Jan Beulich
2013-03-26  3:15       ` DuanZhenzhong
