All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhenzhong Duan <zhenzhong.duan@oracle.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	David Vrabel <david.vrabel@citrix.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Feng Jin <joe.jin@oracle.com>,
	Yuval Shaia <yuval.shaia@oracle.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Chien Yen <chien.yen@oracle.com>
Subject: Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
Date: Fri, 28 Jun 2013 10:33:04 +0800	[thread overview]
Message-ID: <51CCF5E0.908@oracle.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1306271247350.4782@kaball.uk.xensource.com>


On 2013-06-27 19:52, Stefano Stabellini wrote:
> On Thu, 27 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-27 02:08, Stefano Stabellini wrote:
>>> On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
>>>> On 2013-06-26 01:51, Stefano Stabellini wrote:
>>>>> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>>>>>> Stefano Stabellini wrote:
>>>>>>> Trimming some of the people in CC
>>>>>>>
>>>>>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>      
>>>>>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>>>>>        
>>>>>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>          
>>>>>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>>>>>            
>>>>>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>>>              
>>>>>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>>>>>        On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>>>>          Looking at the hypervisor code I couldn't see
>>>>>>>>>>>> anything
>>>>>>>>>>>> obviously
>>>>>>>>>>>> wrong.
>>>>>>>>>>>>              I think the culprit is "physdev_unmap_pirq":
>>>>>>>>>>>>
>>>>>>>>>>>>         if ( is_hvm_domain(d) )
>>>>>>>>>>>>          {
>>>>>>>>>>>>              spin_lock(&d->event_lock);
>>>>>>>>>>>>              gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x
>>>>>>>>>>>> %s,
>>>>>>>>>>>> irq:
>>>>>>>>>>>> %d\n",
>>>>>>>>>>>>                  d->domain_id, pirq,
>>>>>>>>>>>> domain_pirq_to_emuirq(d,
>>>>>>>>>>>> pirq),
>>>>>>>>>>>>                  domain_pirq_to_emuirq(d, pirq) ==
>>>>>>>>>>>> IRQ_UNBOUND ?
>>>>>>>>>>>> "unbound" :
>>>>>>>>>>>> "",
>>>>>>>>>>>>                  domain_pirq_to_irq(d, pirq));
>>>>>>>>>>>>                                                                                               if
>>>>>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>>>>>                  ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>>>>>              spin_unlock(&d->event_lock);
>>>>>>>>>>>>              if ( domid == DOMID_SELF || ret )
>>>>>>>>>>>>                  goto free_domain;
>>>>>>>>>>>>
>>>>>>>>>>>> It always tells me unbound:
>>>>>>>>>>>>
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (a bit older debug code, so the 'unbound' does not show up
>>>>>>>>>>>> here).
>>>>>>>>>>>>
>>>>>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does
>>>>>>>>>>>> not
>>>>>>>>>>>> happen.
>>>>>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be
>>>>>>>>>>>> depend
>>>>>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> In other words, all of that code looks to only clear
>>>>>>>>>>>> things
>>>>>>>>>>>> when
>>>>>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a
>>>>>>>>>>>> removal
>>>>>>>>>>>> in the radix tree:
>>>>>>>>>>>>
>>>>>>>>>>>>        if ( emuirq != IRQ_PT )
>>>>>>>>>>>>              radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>>>>>> emuirq);
>>>>>>>>>>>>                                                                              And
>>>>>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should
>>>>>>>>>>>> check
>>>>>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>>>>>            I think that you are looking in the wrong place.
>>>>>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>>>>>
>>>>>>>>>>>> The code above is correct as is because it is trying to
>>>>>>>>>>>> handle
>>>>>>>>>>>> emulated
>>>>>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are
>>>>>>>>>>>> not
>>>>>>>>>>>> added to
>>>>>>>>>>>> that radix tree, see physdev_hvm_map_pirq and
>>>>>>>>>>>> physdev_map_pirq.
>>>>>>>>>>>>          This patch fixes the issue, I have only tested MSI
>>>>>>>>>>>> (MSI-X
>>>>>>>>>>>> completely
>>>>>>>>>>>> untested).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>>>>>> index 304c438..079e465 100644
>>>>>>>>>>>> --- a/hw/pass-through.c
>>>>>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>>>>>> @@ -3866,7 +3866,11 @@ static int
>>>>>>>>>>>> pt_msgctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>               ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>>           }
>>>>>>>>>>>>           else
>>>>>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>> +    {
>>>>>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>>>>>> +        }
>>>>>>>>>>>> +    }
>>>>>>>>>>>>             /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>>>>>> translation
>>>>>>>>>>>> */
>>>>>>>>>>>>           if (!ptdev->msi_trans_en) {
>>>>>>>>>>>> @@ -4013,6 +4017,8 @@ static int
>>>>>>>>>>>> pt_msixctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>                   pt_disable_msi_translate(ptdev);
>>>>>>>>>>>>               }
>>>>>>>>>>>>               pt_msix_update(ptdev);
>>>>>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>>>>>> ptdev->msix->enabled) {
>>>>>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>>>>>        Hi Stefano,
>>>>>>>>>>>> I made a test with this patch, os reboot when driver
>>>>>>>>>>>> reload.
>>>>>>>>>>>> If
>>>>>>>>>>>> use
>>>>>>>>>>>> pt_msix_disable
>>>>>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>>>>>> But I still see some error in qemu.log and xen console.
>>>>>>>>>>>> Seems
>>>>>>>>>>>> four
>>>>>>>>>>>> IRQs
>>>>>>>>>>>> are not freed
>>>>>>>>>>>> when unmap.
>>>>>>>>>>>> --------------first load---------------------------
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 103
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 102
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 101
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 100
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> ------------- first unload---------------------------
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>>                
>>>>>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are
>>>>>>>>>>> failing?
>>>>>>>>>>>              
>>>>>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) =
>>>>>>>>>> IRQ_UNBOUND
>>>>>>>>>> in
>>>>>>>>>> physdev_unmap_pirq.
>>>>>>>>>>            
>>>>>>>>> That means that Linux didn't call irq_enable on the MSI-X in
>>>>>>>>> question:
>>>>>>>>>
>>>>>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>>>>>
>>>>>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
>>>>>>>>> calls
>>>>>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
>>>>>>>>> should
>>>>>>>>> be IRQ_PT.
>>>>>>>>>
>>>>>>>>> I don't know if that's a normal condition, but in any case it
>>>>>>>>> should
>>>>>>>>> not create any problems to physdev_unmap_pirq, in fact the
>>>>>>>>> folloing
>>>>>>>>> check:
>>>>>>>>>
>>>>>>>>>         if ( domid == DOMID_SELF || ret )
>>>>>>>>>                 goto free_domain;
>>>>>>>>>
>>>>>>>>> should fail so Xen should continue and execute
>>>>>>>>> unmap_domain_pirq.
>>>>>>>>> That's
>>>>>>>>> what we want.
>>>>>>>>>          
>>>>>>>>    From linux side, request_irq->  request_threaded_irq->
>>>>>>>> __setup_irq->
>>>>>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>>>>>> If irq_enable isn't called, how does the driver receive interrupt,
>>>>>>>> I
>>>>>>>> did
>>>>>>>> see
>>>>>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>>>>>        
>>>>>>> Good to know
>>>>>>>
>>>>>>>      
>>>>>>>> Could you have a look if there is something wrong in xen side of
>>>>>>>> clearing
>>>>>>>> the
>>>>>>>> mapping?
>>>>>>>>        
>>>>>>> What I am saying is that the error you are getting:
>>>>>>>
>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>
>>>>>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>>>>>> IRQ_UNBOUND.
>>>>>>> So, why are you getting this error? What is failing?
>>>>>>> I am ready to believe the problem is in Xen but Without
>>>>>>> understanding
>>>>>>> why you are getting the error it's hard to find a solution.
>>>>>>>      
>>>>>> I found the reason, you are looking at xen-unstable, I was working
>>>>>> with
>>>>>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>>>>>> That patch set ret to -EINVAL initially. After remove that line, unmap
>>>>>> succeed.
>>>>>> But we still need below patch to let driver reload succeed everytime.
>>>>>> Without
>>>>>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>>>>>
>>>>>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>>>>>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>>>>>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>>>>>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>>>>>            return;
>>>>>>        }
>>>>>>
>>>>>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>>>>>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq
>>>>>> ==
>>>>>> -1))
>>>>>>            entry->flags = 1;
>>>>>>        entry->io_mem[offset] = val;
>>>>> Interesting. I don't think this is the proper fix though.
>>>>> Does the appended patch change anything?
>>>>>
>>>>> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
>>>>> index 71fa6f0..cd5d9c7 100644
>>>>> --- a/hw/pt-msi.c
>>>>> +++ b/hw/pt-msi.c
>>>>> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev,
>>>>> int
>>>>> entry_nr)
>>>>>         uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>>>>         int ret;
>>>>>     -    if ( !entry->flags )
>>>>> +    if ( !entry->flags && ptdev->msix->enabled )
>>>>>             return 0;
>>>>>           if (!gvec) {
>>>> Tested, not work.
>>>> If you look at msix_capability_init in kernel, line 707,722,
>>>> dev->msix->enabled is already set when pt_msix_update is called.
>>> Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
>>> dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
>>> pt_msix_update.
>> It does.
>> line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
>>                   this will set dev->msix->enabled first time
>> line 722, PCI_MSIX_FLAGS_ENABLE set
>>                   this trigger call of pt_msix_update
>>                   then dev->msix->enabled was set a second time
>>> I was assuming that you needed to add "|| entry->pirq == -1" because you
>>> needed to pass the check:
>>>
>>>       if ( !entry->flags )
>>>           return 0;
>>>
>>> at the beginning of pt_msix_update_one. Am I getting it right?
>> Right, as entry->pirq is set to -1 when driver unload.
>>> If that is case that I thought that we just needed to make sure that
>>> when ptdev->msix->enabled is still zero then we go through the test in
>>> pt_msix_update_one. Where is the mistake?
>> If you want to use dev->msix->enabled for checking, below patch could work.
>> But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE any
>> more,
>> but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK
>>
>> zduan
>> ***********************************************************
>> diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
>> --- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
>> +++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
>> @@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
>>           pt_msix_disable(ptdev);
>>       }
>>
>> -    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
>> +    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value &
>> PCI_MSIX_MASK);
>>
>>       return 0;
>>   }
>> diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
>> --- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
>> +++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
>> @@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
>>       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>       int ret;
>>
>> -    if ( !entry->flags )
>> +    if ( !entry->flags && dev->msix->enabled )
>>           return 0;
>>
>>       if (!gvec) {
> I understand now, thanks for the explanation.
> Among the two alternatives, I think that your first change is actually
> better.
>
> However we can probably still improve it a little bit by setting
> entry->flags to 1 directly in pt_msix_disable? So that we don't confuse
> the driver reload case from the first driver initialization.
>
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index 71fa6f0..cc4e280 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -408,7 +408,7 @@ void pt_msix_disable(struct pt_dev *dev)
>           }
>           /* clear msi-x info */
>           entry->pirq = -1;
> -        entry->flags = 0;
> +        entry->flags = 1;
>       }
>   }
Test passed.
But this change set entry->flags in all entrys, dev->msix->total_entries 
count of pirqs are
mapped when driver reload, no matter how many msix entrys driver is 
initializing.

zduan

WARNING: multiple messages have this Message-ID (diff)
From: Zhenzhong Duan <zhenzhong.duan@oracle.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	David Vrabel <david.vrabel@citrix.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Feng Jin <joe.jin@oracle.com>,
	Yuval Shaia <yuval.shaia@oracle.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Chien Yen <chien.yen@oracle.com>
Subject: Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time
Date: Fri, 28 Jun 2013 10:33:04 +0800	[thread overview]
Message-ID: <51CCF5E0.908@oracle.com> (raw)
In-Reply-To: <alpine.DEB.2.02.1306271247350.4782@kaball.uk.xensource.com>


On 2013-06-27 19:52, Stefano Stabellini wrote:
> On Thu, 27 Jun 2013, Zhenzhong Duan wrote:
>> On 2013-06-27 02:08, Stefano Stabellini wrote:
>>> On Wed, 26 Jun 2013, Zhenzhong Duan wrote:
>>>> On 2013-06-26 01:51, Stefano Stabellini wrote:
>>>>> On Tue, 25 Jun 2013, DuanZhenzhong wrote:
>>>>>> Stefano Stabellini wrote:
>>>>>>> Trimming some of the people in CC
>>>>>>>
>>>>>>> On Mon, 24 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>      
>>>>>>>> On 2013-06-20 22:21, Stefano Stabellini wrote:
>>>>>>>>        
>>>>>>>>> On Thu, 20 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>          
>>>>>>>>>> On 2013-06-05 20:50, Stefano Stabellini wrote:
>>>>>>>>>>            
>>>>>>>>>>> On Wed, 5 Jun 2013, Zhenzhong Duan wrote:
>>>>>>>>>>>              
>>>>>>>>>>>> Stefano Stabellini wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 21 May 2013, Stefano Stabellini wrote:
>>>>>>>>>>>>        On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
>>>>>>>>>>>>          Looking at the hypervisor code I couldn't see
>>>>>>>>>>>> anything
>>>>>>>>>>>> obviously
>>>>>>>>>>>> wrong.
>>>>>>>>>>>>              I think the culprit is "physdev_unmap_pirq":
>>>>>>>>>>>>
>>>>>>>>>>>>         if ( is_hvm_domain(d) )
>>>>>>>>>>>>          {
>>>>>>>>>>>>              spin_lock(&d->event_lock);
>>>>>>>>>>>>              gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x
>>>>>>>>>>>> %s,
>>>>>>>>>>>> irq:
>>>>>>>>>>>> %d\n",
>>>>>>>>>>>>                  d->domain_id, pirq,
>>>>>>>>>>>> domain_pirq_to_emuirq(d,
>>>>>>>>>>>> pirq),
>>>>>>>>>>>>                  domain_pirq_to_emuirq(d, pirq) ==
>>>>>>>>>>>> IRQ_UNBOUND ?
>>>>>>>>>>>> "unbound" :
>>>>>>>>>>>> "",
>>>>>>>>>>>>                  domain_pirq_to_irq(d, pirq));
>>>>>>>>>>>>                                                                                               if
>>>>>>>>>>>> ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
>>>>>>>>>>>>                  ret = unmap_domain_pirq_emuirq(d, pirq);
>>>>>>>>>>>>              spin_unlock(&d->event_lock);
>>>>>>>>>>>>              if ( domid == DOMID_SELF || ret )
>>>>>>>>>>>>                  goto free_domain;
>>>>>>>>>>>>
>>>>>>>>>>>> It always tells me unbound:
>>>>>>>>>>>>
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
>>>>>>>>>>>> (XEN) irq.c:1873:d14 14, nr_pirqs: 56
>>>>>>>>>>>> (a bit older debug code, so the 'unbound' does not show up
>>>>>>>>>>>> here).
>>>>>>>>>>>>
>>>>>>>>>>>> Which means that the call to unmap_domain_pirq_emuirq does
>>>>>>>>>>>> not
>>>>>>>>>>>> happen.
>>>>>>>>>>>> The checks in unmap_domain_pirq_emuirq also look to be
>>>>>>>>>>>> depend
>>>>>>>>>>>> on the code being IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> In other words, all of that code looks to only clear
>>>>>>>>>>>> things
>>>>>>>>>>>> when
>>>>>>>>>>>> they are !IRQ_UNBOUND.
>>>>>>>>>>>>
>>>>>>>>>>>> But the other logic (IRQ_UNBOUND) looks to be missing a
>>>>>>>>>>>> removal
>>>>>>>>>>>> in the radix tree:
>>>>>>>>>>>>
>>>>>>>>>>>>        if ( emuirq != IRQ_PT )
>>>>>>>>>>>>              radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq,
>>>>>>>>>>>> emuirq);
>>>>>>>>>>>>                                                                              And
>>>>>>>>>>>> I think that is what is causing the leak - the radix tree
>>>>>>>>>>>> needs to be pruned? Or perhaps the allocate_pirq should
>>>>>>>>>>>> check
>>>>>>>>>>>> the radix tree for IRQ_UNBOUND ones and re-use them?
>>>>>>>>>>>>            I think that you are looking in the wrong place.
>>>>>>>>>>>> The issue is that QEMU doesn't call pt_msi_disable in
>>>>>>>>>>>> pt_msgctrl_reg_write if (!val & PCI_MSI_FLAGS_ENABLE).
>>>>>>>>>>>>
>>>>>>>>>>>> The code above is correct as is because it is trying to
>>>>>>>>>>>> handle
>>>>>>>>>>>> emulated
>>>>>>>>>>>> IRQs and MSIs, not real passthrough MSIs. They latter are
>>>>>>>>>>>> not
>>>>>>>>>>>> added to
>>>>>>>>>>>> that radix tree, see physdev_hvm_map_pirq and
>>>>>>>>>>>> physdev_map_pirq.
>>>>>>>>>>>>          This patch fixes the issue, I have only tested MSI
>>>>>>>>>>>> (MSI-X
>>>>>>>>>>>> completely
>>>>>>>>>>>> untested).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/hw/pass-through.c b/hw/pass-through.c
>>>>>>>>>>>> index 304c438..079e465 100644
>>>>>>>>>>>> --- a/hw/pass-through.c
>>>>>>>>>>>> +++ b/hw/pass-through.c
>>>>>>>>>>>> @@ -3866,7 +3866,11 @@ static int
>>>>>>>>>>>> pt_msgctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>               ptdev->msi->flags |= PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>>           }
>>>>>>>>>>>>           else
>>>>>>>>>>>> -        ptdev->msi->flags &= ~PCI_MSI_FLAGS_ENABLE;
>>>>>>>>>>>> +    {
>>>>>>>>>>>> +        if (ptdev->msi->flags & PT_MSI_MAPPED) {
>>>>>>>>>>>> +            pt_msi_disable(ptdev);
>>>>>>>>>>>> +        }
>>>>>>>>>>>> +    }
>>>>>>>>>>>>             /* pass through MSI_ENABLE bit when no MSI-INTx
>>>>>>>>>>>> translation
>>>>>>>>>>>> */
>>>>>>>>>>>>           if (!ptdev->msi_trans_en) {
>>>>>>>>>>>> @@ -4013,6 +4017,8 @@ static int
>>>>>>>>>>>> pt_msixctrl_reg_write(struct
>>>>>>>>>>>> pt_dev
>>>>>>>>>>>> *ptdev,
>>>>>>>>>>>>                   pt_disable_msi_translate(ptdev);
>>>>>>>>>>>>               }
>>>>>>>>>>>>               pt_msix_update(ptdev);
>>>>>>>>>>>> +    } else if (!(*value & PCI_MSIX_ENABLE) &&
>>>>>>>>>>>> ptdev->msix->enabled) {
>>>>>>>>>>>> +        pt_msix_delete(ptdev);
>>>>>>>>>>>>        Hi Stefano,
>>>>>>>>>>>> I made a test with this patch, os reboot when driver
>>>>>>>>>>>> reload.
>>>>>>>>>>>> If
>>>>>>>>>>>> use
>>>>>>>>>>>> pt_msix_disable
>>>>>>>>>>>> instead of pt_msix_delete, driver could be reloaded.
>>>>>>>>>>>> But I still see some error in qemu.log and xen console.
>>>>>>>>>>>> Seems
>>>>>>>>>>>> four
>>>>>>>>>>>> IRQs
>>>>>>>>>>>> are not freed
>>>>>>>>>>>> when unmap.
>>>>>>>>>>>> --------------first load---------------------------
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 103
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 0 with pirq 67 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 102
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 1 with pirq 66 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 101
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 2 with pirq 65 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> pt_msix_update_one: pt_msix_update_one requested pirq =
>>>>>>>>>>>> 100
>>>>>>>>>>>> pt_msix_update_one: Update msix entry 3 with pirq 64 gvec
>>>>>>>>>>>> 0
>>>>>>>>>>>> ------------- first unload---------------------------
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 66, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 66
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 65, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 65
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>> pt_msix_disable: Unbind msix with pirq 64, gvec 0
>>>>>>>>>>>> pt_msix_disable: Unmap msix with pirq 64
>>>>>>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed.
>>>>>>>>>>>> [00:04.0]
>>>>>>>>>>>>                
>>>>>>>>>>> Can you add some printks in Xen (the hypercall name is
>>>>>>>>>>> PHYSDEVOP_unmap_pirq) to figure out exactly why they are
>>>>>>>>>>> failing?
>>>>>>>>>>>              
>>>>>>>>>> Did some test, domain_pirq_to_emuirq(d, unmap->pirq) =
>>>>>>>>>> IRQ_UNBOUND
>>>>>>>>>> in
>>>>>>>>>> physdev_unmap_pirq.
>>>>>>>>>>            
>>>>>>>>> That means that Linux didn't call irq_enable on the MSI-X in
>>>>>>>>> question:
>>>>>>>>>
>>>>>>>>> irq_enable -> __startup_pirq -> EVTCHNOP_bind_pirq
>>>>>>>>>
>>>>>>>>> EVTCHNOP_bind_pirq is implemented by evtchn_bind_pirq in Xen and
>>>>>>>>> calls
>>>>>>>>> map_domain_emuirq_pirq, so domain_pirq_to_emuirq(d, unmap->pirq)
>>>>>>>>> should
>>>>>>>>> be IRQ_PT.
>>>>>>>>>
>>>>>>>>> I don't know if that's a normal condition, but in any case it
>>>>>>>>> should
>>>>>>>>> not create any problems to physdev_unmap_pirq, in fact the
>>>>>>>>> folloing
>>>>>>>>> check:
>>>>>>>>>
>>>>>>>>>         if ( domid == DOMID_SELF || ret )
>>>>>>>>>                 goto free_domain;
>>>>>>>>>
>>>>>>>>> should fail so Xen should continue and execute
>>>>>>>>> unmap_domain_pirq.
>>>>>>>>> That's
>>>>>>>>> what we want.
>>>>>>>>>          
>>>>>>>>    From linux side, request_irq->  request_threaded_irq->
>>>>>>>> __setup_irq->
>>>>>>>> irq_startup->  startup_pirq-> EVTCHNOP_bind_pirq
>>>>>>>> If irq_enable isn't called, how does the driver receive interrupt,
>>>>>>>> I
>>>>>>>> did
>>>>>>>> see
>>>>>>>> four interrupts in /proc/interrupt and driver works ok.
>>>>>>>>        
>>>>>>> Good to know
>>>>>>>
>>>>>>>      
>>>>>>>> Could you have a look if there is something wrong in xen side of
>>>>>>>> clearing
>>>>>>>> the
>>>>>>>> mapping?
>>>>>>>>        
>>>>>>> What I am saying is that the error you are getting:
>>>>>>>
>>>>>>> pt_msix_disable: Unbind msix with pirq 67, gvec 0
>>>>>>> pt_msix_disable: Unmap msix with pirq 67
>>>>>>> pt_msix_disable: Error: Unmapping of MSI-X failed. [00:04.0]
>>>>>>>
>>>>>>> cannot be caused by domain_pirq_to_emuirq(d, pirq) returning
>>>>>>> IRQ_UNBOUND.
>>>>>>> So, why are you getting this error? What is failing?
>>>>>>> I am ready to believe the problem is in Xen but Without
>>>>>>> understanding
>>>>>>> why you are getting the error it's hard to find a solution.
>>>>>>>      
>>>>>> I found the reason, you are looking at xen-unstable, I was working
>>>>>> with
>>>>>> 4.1.30-OVM, it has patch of CVE-2012-4536 / XSA-21.
>>>>>> That patch set ret to -EINVAL initially. After remove that line, unmap
>>>>>> succeed.
>>>>>> But we still need below patch to let driver reload succeed everytime.
>>>>>> Without
>>>>>> that, 1st reload failed, 2nd succeed, 3 failed, ...
>>>>>>
>>>>>> diff -up --new-file ./hw/pt-msi.c.old1 ./hw/pt-msi.c
>>>>>> --- ./hw/pt-msi.c.old1  2013-06-26 01:36:08.000000000 +0800
>>>>>> +++ ./hw/pt-msi.c       2013-06-26 01:37:41.000000000 +0800
>>>>>> @@ -469,7 +469,7 @@ static void pci_msix_writel(void *opaque
>>>>>>            return;
>>>>>>        }
>>>>>>
>>>>>> -    if ( offset != 3 && entry->io_mem[offset] != val )
>>>>>> +    if ( offset != 3 && (entry->io_mem[offset] != val || entry->pirq
>>>>>> ==
>>>>>> -1))
>>>>>>            entry->flags = 1;
>>>>>>        entry->io_mem[offset] = val;
>>>>> Interesting. I don't think this is the proper fix though.
>>>>> Does the appended patch change anything?
>>>>>
>>>>> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
>>>>> index 71fa6f0..cd5d9c7 100644
>>>>> --- a/hw/pt-msi.c
>>>>> +++ b/hw/pt-msi.c
>>>>> @@ -302,7 +302,7 @@ static int pt_msix_update_one(struct pt_dev *dev,
>>>>> int
>>>>> entry_nr)
>>>>>         uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>>>>         int ret;
>>>>>     -    if ( !entry->flags )
>>>>> +    if ( !entry->flags && ptdev->msix->enabled )
>>>>>             return 0;
>>>>>           if (!gvec) {
>>>> Tested, not work.
>>>> If you look at msix_capability_init in kernel, line 707,722,
>>>> dev->msix->enabled is already set when pt_msix_update is called.
>>> Yeah, but it shouldn't be already set in QEMU. In fact in QEMU
>>> dev->msix->enabled is modified in pt_msixctrl_reg_write after calling to
>>> pt_msix_update.
>> It does.
>> line 707,  PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE is set
>>                   this will set dev->msix->enabled first time
>> line 722, PCI_MSIX_FLAGS_ENABLE set
>>                   this trigger call of pt_msix_update
>>                   then dev->msix->enabled was set a second time
>>> I was assuming that you needed to add "|| entry->pirq == -1" because you
>>> needed to pass the check:
>>>
>>>       if ( !entry->flags )
>>>           return 0;
>>>
>>> at the beginning of pt_msix_update_one. Am I getting it right?
>> Right, as entry->pirq is set to -1 when driver unload.
>>> If that is case that I thought that we just needed to make sure that
>>> when ptdev->msix->enabled is still zero then we go through the test in
>>> pt_msix_update_one. Where is the mistake?
>> If you want to use dev->msix->enabled for checking, below patch could work.
>> But in this case, dev->msix->enabled doesn't represent PCI_MSIX_ENABLE any
>> more,
>> but a combination of PCI_MSIX_ENABLE and ~PCI_MSIX_MASK
>>
>> zduan
>> ***********************************************************
>> diff -up ./hw/pass-through.c.old2 ./hw/pass-through.c
>> --- ./hw/pass-through.c.old2    2013-06-27 11:05:30.000000000 +0800
>> +++ ./hw/pass-through.c 2013-06-27 11:07:35.000000000 +0800
>> @@ -4027,7 +4027,7 @@ static int pt_msixctrl_reg_write(struct
>>           pt_msix_disable(ptdev);
>>       }
>>
>> -    ptdev->msix->enabled = !!(*value & PCI_MSIX_ENABLE);
>> +    ptdev->msix->enabled = (*value & PCI_MSIX_ENABLE) && !(*value &
>> PCI_MSIX_MASK);
>>
>>       return 0;
>>   }
>> diff -up ./hw/pt-msi.c.old2 ./hw/pt-msi.c
>> --- ./hw/pt-msi.c.old2  2013-06-27 11:26:12.000000000 +0800
>> +++ ./hw/pt-msi.c       2013-06-27 11:27:13.000000000 +0800
>> @@ -294,7 +294,7 @@ static int pt_msix_update_one(struct pt_
>>       uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr);
>>       int ret;
>>
>> -    if ( !entry->flags )
>> +    if ( !entry->flags && dev->msix->enabled )
>>           return 0;
>>
>>       if (!gvec) {
> I understand now, thanks for the explanation.
> Among the two alternatives, I think that your first change is actually
> better.
>
> However we can probably still improve it a little bit by setting
> entry->flags to 1 directly in pt_msix_disable? So that we don't confuse
> the driver reload case from the first driver initialization.
>
> diff --git a/hw/pt-msi.c b/hw/pt-msi.c
> index 71fa6f0..cc4e280 100644
> --- a/hw/pt-msi.c
> +++ b/hw/pt-msi.c
> @@ -408,7 +408,7 @@ void pt_msix_disable(struct pt_dev *dev)
>           }
>           /* clear msi-x info */
>           entry->pirq = -1;
> -        entry->flags = 0;
> +        entry->flags = 1;
>       }
>   }
Test passed.
But this change set entry->flags in all entrys, dev->msix->total_entries 
count of pirqs are
mapped when driver reload, no matter how many msix entrys driver is 
initializing.

zduan

  reply	other threads:[~2013-06-28  2:33 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-08  8:18 [PATCH] xen: reuse the same pirq allocated when driver load first time Zhenzhong Duan
2013-05-10 18:53 ` Konrad Rzeszutek Wilk
2013-05-13  7:44   ` Zhenzhong Duan
2013-05-13 11:06   ` Stefano Stabellini
2013-05-13 14:07     ` Konrad Rzeszutek Wilk
2013-05-13 14:50       ` Stefano Stabellini
2013-05-13 16:17         ` Konrad Rzeszutek Wilk
2013-05-13 17:24           ` Stefano Stabellini
2013-05-13 18:20             ` Konrad Rzeszutek Wilk
2013-05-14 13:49               ` Stefano Stabellini
2013-05-14 14:20                 ` Konrad Rzeszutek Wilk
2013-05-15  9:41                   ` Stefano Stabellini
2013-05-15 14:18                     ` Zhenzhong Duan
2013-05-17  2:22                     ` Zhenzhong Duan
2013-05-20 10:24                       ` Stefano Stabellini
2013-05-20 15:24                         ` Konrad Rzeszutek Wilk
2013-05-20 17:57                         ` Konrad Rzeszutek Wilk
2013-05-20 20:38                           ` Konrad Rzeszutek Wilk
2013-05-21 10:07                             ` [Xen-devel] " David Vrabel
2013-05-21 13:40                               ` Konrad Rzeszutek Wilk
2013-05-21 16:51                                 ` Stefano Stabellini
2013-05-21 20:42                                   ` Konrad Rzeszutek Wilk
2013-05-21 21:50                                     ` Stefano Stabellini
2013-05-21 22:41                                       ` Konrad Rzeszutek Wilk
2013-05-22  9:37                                         ` Jan Beulich
2013-05-22 15:14                                           ` Konrad Rzeszutek Wilk
2013-05-22 15:25                                             ` Jan Beulich
2013-05-22 16:41                                               ` Konrad Rzeszutek Wilk
2013-05-23  6:31                                                 ` Jan Beulich
2013-05-29 17:50                                   ` Stefano Stabellini
2013-05-30 17:48                                     ` Konrad Rzeszutek Wilk
2013-06-05  5:27                                     ` Zhenzhong Duan
2013-06-05 12:50                                       ` [Xen-devel] " Stefano Stabellini
2013-06-20  2:57                                         ` Zhenzhong Duan
2013-06-20 14:21                                           ` Stefano Stabellini
2013-06-24  7:19                                             ` Zhenzhong Duan
2013-06-24  7:19                                               ` Zhenzhong Duan
2013-06-24 17:18                                               ` Stefano Stabellini
2013-06-25  5:33                                                 ` DuanZhenzhong
2013-06-25  5:33                                                   ` DuanZhenzhong
2013-06-25  7:21                                                   ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Jan Beulich
2013-06-25  7:44                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix DuanZhenzhong
2013-06-25  8:36                                                       ` Jan Beulich
2013-06-25  8:43                                                         ` DuanZhenzhong
2013-06-25 10:56                                                     ` [PATCH 4.1] x86: fix emuirq regression from XSA-21 fix (was: Re: [PATCH] xen: reuse the same pirq allocated when driver load first time) Stefano Stabellini
2013-06-25 11:03                                                       ` Stefano Stabellini
2013-06-27  8:34                                                         ` Jan Beulich
2013-06-27 10:46                                                           ` Stefano Stabellini
2013-06-25 17:51                                                   ` [Xen-devel] [PATCH] xen: reuse the same pirq allocated when driver load first time Stefano Stabellini
2013-06-26  4:00                                                     ` Zhenzhong Duan
2013-06-26  4:00                                                       ` Zhenzhong Duan
2013-06-26 18:08                                                       ` Stefano Stabellini
2013-06-27  4:01                                                         ` Zhenzhong Duan
2013-06-27  4:01                                                           ` Zhenzhong Duan
2013-06-27 11:52                                                           ` Stefano Stabellini
2013-06-28  2:33                                                             ` Zhenzhong Duan [this message]
2013-06-28  2:33                                                               ` Zhenzhong Duan
2013-06-28 11:12                                                               ` Stefano Stabellini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51CCF5E0.908@oracle.com \
    --to=zhenzhong.duan@oracle.com \
    --cc=chien.yen@oracle.com \
    --cc=david.vrabel@citrix.com \
    --cc=joe.jin@oracle.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stefano.stabellini@eu.citrix.com \
    --cc=xen-devel@lists.xensource.com \
    --cc=yuval.shaia@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.