public inbox for kvm@vger.kernel.org
* Performance test result between virtio_pci MSI-X disable and enable
@ 2010-11-23  2:53 lidong chen
  2010-11-23  6:20 ` Avi Kivity
  2010-12-22 13:52 ` Michael S. Tsirkin
  0 siblings, 2 replies; 22+ messages in thread
From: lidong chen @ 2010-11-23  2:53 UTC (permalink / raw)
  To: Gleb Natapov, Avi Kivity, mst, aliguori, rusty, kvm

Test method:
Send the same traffic load with virtio_pci MSI-X disabled and with it
enabled, and compare the CPU usage of the host OS.
I used the same version of the virtio driver, modifying only the MSI-X option.
The host OS kernel version is 2.6.32.
The virtio driver is from RHEL 6.
The guest OS kernel version is 2.6.16.

Test result:
With MSI-X disabled, the host OS CPU usage is 110%.
With MSI-X enabled, the host OS CPU usage is 140%.

The /proc/interrupts output with MSI-X disabled is below:
           CPU0       CPU1
  0:   12326706          0    IO-APIC-edge  timer
  1:          8          0    IO-APIC-edge  i8042
  8:          0          0    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 10:    4783008          0   IO-APIC-level  virtio2, virtio3
 11:    5363828          0   IO-APIC-level  virtio1, virtio4, virtio5
 12:        104          0    IO-APIC-edge  i8042
NMI:    2857871    2650796
LOC:   12324952   12325609
ERR:          0
MIS:          0

The /proc/interrupts output with MSI-X enabled is below:
          CPU0       CPU1
 0:    1896802          0    IO-APIC-edge  timer
 1:          8          0    IO-APIC-edge  i8042
 4:         14          0    IO-APIC-edge  serial
 8:          0          0    IO-APIC-edge  rtc
 9:          0          0   IO-APIC-level  acpi
 10:          0          0   IO-APIC-level  virtio1, virtio2, virtio5
 11:          1          0   IO-APIC-level  virtio0, virtio3, virtio4
 12:        104          0    IO-APIC-edge  i8042
 50:          1          0       PCI-MSI-X  virtio2-output
 58:          0          0       PCI-MSI-X  virtio3-config
 66:    2046985          0       PCI-MSI-X  virtio3-input
 74:          2          0       PCI-MSI-X  virtio3-output
 82:          0          0       PCI-MSI-X  virtio4-config
 90:        217          0       PCI-MSI-X  virtio4-input
 98:          0          0       PCI-MSI-X  virtio4-output
177:          0          0       PCI-MSI-X  virtio0-config
185:     341831          0       PCI-MSI-X  virtio0-input
193:          1          0       PCI-MSI-X  virtio0-output
201:          0          0       PCI-MSI-X  virtio1-config
209:     188747          0       PCI-MSI-X  virtio1-input
217:          1          0       PCI-MSI-X  virtio1-output
225:          0          0       PCI-MSI-X  virtio2-config
233:    2204149          0       PCI-MSI-X  virtio2-input
NMI:    1455767    1426226
LOC:    1896099    1896637
ERR:          0
MIS:          0

Code difference:
I disabled MSI-X by modifying the function vp_find_vqs like this:

static int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
                      struct virtqueue *vqs[],
                      vq_callback_t *callbacks[],
                      const char *names[])
{

#if 0
       int err;

       /* Try MSI-X with one vector per queue. */
       err = vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names, true, true);
       if (!err)
               return 0;
       /* Fallback: MSI-X with one vector for config, one shared for queues. */
       err = vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
                                true, false);
       if (!err)
               return 0;
       /* Finally fall back to regular interrupts. */
#endif

       return vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
                                 false, false);
}

Conclusion:
The extra CPU usage with MSI-X enabled is caused by the MSI-X mask bit.
Older guest kernels program this bit twice on every interrupt (mask
before handling, unmask afterwards), and each write causes an EPT
violation.

So I think we should add a parameter to control this: with older guest
kernels, we should disable MSI-X.
And I think this should be handled by qemu.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-23  2:53 Performance test result between virtio_pci MSI-X disable and enable lidong chen
@ 2010-11-23  6:20 ` Avi Kivity
  2010-11-23  6:42   ` Gleb Natapov
  2010-11-23  7:27   ` lidong chen
  2010-12-22 13:52 ` Michael S. Tsirkin
  1 sibling, 2 replies; 22+ messages in thread
From: Avi Kivity @ 2010-11-23  6:20 UTC (permalink / raw)
  To: lidong chen; +Cc: Gleb Natapov, mst, aliguori, rusty, kvm

On 11/23/2010 04:53 AM, lidong chen wrote:
> Test method:
> Send the same traffic load between virtio_pci MSI-X disable and
> enable,and compare the cpu rate of host os.
> I used the same version of virtio driver, only modify the msi-x option.
> the host os version is 2.6.32.
> the virtio dirver is from rhel6.
> the guest version  os is 2.6.16.
>
> Test result:
> with msi-x disable, the cpu rate of host os is 110%.
> with msi-x enable, the cpu rate of host os is 140%.
>
...

> Conclusion:
> msi-x enable waste more cpu resource is caused by MSIX mask bit. In
> older kernels program this bit twice
> on every interrupt. and caused ept violation.
>
> So I think we should add a param to control this.with older kernels,
> we should disable MSIX.
> And I think this should deal by qemu.

There is now work in progress (by Sheng Yang) to speed up mask bit 
emulation, which should improve things.  Also, newer kernels don't hit 
the mask bit so hard.  You might try to backport the mask bit patches to 
your 2.6.16 guest.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-23  6:20 ` Avi Kivity
@ 2010-11-23  6:42   ` Gleb Natapov
  2010-11-23  7:27   ` lidong chen
  1 sibling, 0 replies; 22+ messages in thread
From: Gleb Natapov @ 2010-11-23  6:42 UTC (permalink / raw)
  To: Avi Kivity; +Cc: lidong chen, mst, aliguori, rusty, kvm

On Tue, Nov 23, 2010 at 08:20:19AM +0200, Avi Kivity wrote:
> On 11/23/2010 04:53 AM, lidong chen wrote:
> >Test method:
> >Send the same traffic load between virtio_pci MSI-X disable and
> >enable,and compare the cpu rate of host os.
> >I used the same version of virtio driver, only modify the msi-x option.
> >the host os version is 2.6.32.
> >the virtio dirver is from rhel6.
> >the guest version  os is 2.6.16.
> >
> >Test result:
> >with msi-x disable, the cpu rate of host os is 110%.
> >with msi-x enable, the cpu rate of host os is 140%.
> >
> ...
> 
> >Conclusion:
> >msi-x enable waste more cpu resource is caused by MSIX mask bit. In
> >older kernels program this bit twice
> >on every interrupt. and caused ept violation.
> >
> >So I think we should add a param to control this.with older kernels,
> >we should disable MSIX.
> >And I think this should deal by qemu.
> 
> There is now work in progress (by Sheng Yang) to speed up mask bit
> emulation, which should improve things.  Also, newer kernels don't
> hit the mask bit so hard.  You might try to backport the mask bit
> patches to your 2.6.16 guest.
> 
And IIRC we do have an option to disable MSI-X in qemu. I just don't
remember what it looks like. Michael?

--
			Gleb.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-23  6:20 ` Avi Kivity
  2010-11-23  6:42   ` Gleb Natapov
@ 2010-11-23  7:27   ` lidong chen
  2010-11-23  7:39     ` Avi Kivity
  1 sibling, 1 reply; 22+ messages in thread
From: lidong chen @ 2010-11-23  7:27 UTC (permalink / raw)
  To: sheng.yang; +Cc: Gleb Natapov, mst, aliguori, rusty, kvm, Avi Kivity

Can you tell me something about this problem?
Thanks.

2010/11/23 Avi Kivity <avi@redhat.com>:
> On 11/23/2010 04:53 AM, lidong chen wrote:
>>
>> Test method:
>> Send the same traffic load between virtio_pci MSI-X disable and
>> enable,and compare the cpu rate of host os.
>> I used the same version of virtio driver, only modify the msi-x option.
>> the host os version is 2.6.32.
>> the virtio dirver is from rhel6.
>> the guest version  os is 2.6.16.
>>
>> Test result:
>> with msi-x disable, the cpu rate of host os is 110%.
>> with msi-x enable, the cpu rate of host os is 140%.
>>
> ...
>
>> Conclusion:
>> msi-x enable waste more cpu resource is caused by MSIX mask bit. In
>> older kernels program this bit twice
>> on every interrupt. and caused ept violation.
>>
>> So I think we should add a param to control this.with older kernels,
>> we should disable MSIX.
>> And I think this should deal by qemu.
>
> There is now work in progress (by Sheng Yang) to speed up mask bit
> emulation, which should improve things.  Also, newer kernels don't hit the
> mask bit so hard.  You might try to backport the mask bit patches to your
> 2.6.16 guest.
>
> --
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-23  7:27   ` lidong chen
@ 2010-11-23  7:39     ` Avi Kivity
  2010-11-30  9:10       ` lidong chen
  0 siblings, 1 reply; 22+ messages in thread
From: Avi Kivity @ 2010-11-23  7:39 UTC (permalink / raw)
  To: lidong chen; +Cc: sheng.yang, Gleb Natapov, mst, aliguori, rusty, kvm

On 11/23/2010 09:27 AM, lidong chen wrote:
> can you tell me something about this problem.
> thanks.

Which problem?

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-23  7:39     ` Avi Kivity
@ 2010-11-30  9:10       ` lidong chen
  2010-11-30  9:24         ` Yang, Sheng
  0 siblings, 1 reply; 22+ messages in thread
From: lidong chen @ 2010-11-30  9:10 UTC (permalink / raw)
  To: Avi Kivity, sheng.yang; +Cc: Gleb Natapov, mst, aliguori, rusty, kvm

SR-IOV also hits this problem: MSI-X masking wastes a lot of CPU.

I tested KVM with SR-IOV, where the VF driver could not disable MSI-X,
so the host OS wastes a lot of CPU: host OS CPU usage is 90%.

Then I tested Xen with SR-IOV. There are also a lot of VM exits caused
by MSI-X masking, but the combined CPU usage of Xen and domain0 is
lower than KVM's: 60%.

Without SR-IOV, the CPU usage of Xen and domain0 is higher than KVM's.

So I think the problem is that KVM wastes more CPU handling the MSI-X
mask, and we can look at how Xen handles it.

If this problem is solved, performance with MSI-X enabled may be better.

2010/11/23 Avi Kivity <avi@redhat.com>:
> On 11/23/2010 09:27 AM, lidong chen wrote:
>>
>> can you tell me something about this problem.
>> thanks.
>
> Which problem?
>
> --
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-30  9:10       ` lidong chen
@ 2010-11-30  9:24         ` Yang, Sheng
  2010-12-01  8:41           ` lidong chen
  0 siblings, 1 reply; 22+ messages in thread
From: Yang, Sheng @ 2010-11-30  9:24 UTC (permalink / raw)
  To: lidong chen
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> 
> I test kvm with sriov, which the vf driver could not disable msix.
> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> 
> then I test xen with sriov, there ara also a lot of vm exits caused by
> MSIX mask.
> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> and domain0 is 60%.
> 
> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> 
> so i think the problem is kvm waste more cpu resource to deal with MSIX
> mask. and we can see how xen deal with MSIX mask.
> 
> if this problem sloved, maybe with MSIX enabled, the performace is better.

Please refer to my posted patches for this issue. 

http://www.spinics.net/lists/kvm/msg44992.html

--
regards
Yang, Sheng

> 
> 2010/11/23 Avi Kivity <avi@redhat.com>:
> > On 11/23/2010 09:27 AM, lidong chen wrote:
> >> can you tell me something about this problem.
> >> thanks.
> > 
> > Which problem?
> > 
> > --
> > I have a truly marvellous patch that fixes the bug which this
> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-30  9:24         ` Yang, Sheng
@ 2010-12-01  8:41           ` lidong chen
  2010-12-01  8:49             ` Yang, Sheng
                               ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: lidong chen @ 2010-12-01  8:41 UTC (permalink / raw)
  To: Yang, Sheng
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

I used SR-IOV, giving each VM 2 VFs.
After applying the patch, I found performance is the same.

The reason is that in the function msix_mmio_write, the address is
mostly not in the MMIO range.

static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
			   const void *val)
{
	struct kvm_assigned_dev_kernel *adev =
			container_of(this, struct kvm_assigned_dev_kernel,
				     msix_mmio_dev);
	int idx, r = 0;
	unsigned long new_val = *(unsigned long *)val;

	mutex_lock(&adev->kvm->lock);
	if (!msix_mmio_in_range(adev, addr, len)) {
		// return here.
                 r = -EOPNOTSUPP;
		goto out;
	}

I printed the values with printk:
addr             start           end           len
F004C00C   F0044000  F0044030     4

00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
	Subsystem: Intel Corporation Unknown device 000c
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
	Latency: 0
	Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
	Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
		Vector table: BAR=3 offset=00000000
		PBA: BAR=3 offset=00002000

00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
	Subsystem: Intel Corporation Unknown device 000c
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
	Latency: 0
	Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
	Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
		Vector table: BAR=3 offset=00000000
		PBA: BAR=3 offset=00002000



+static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
+			      gpa_t addr, int len)
+{
+	gpa_t start, end;
+
+	BUG_ON(adev->msix_mmio_base == 0);
+	start = adev->msix_mmio_base;
+	end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
+		adev->msix_max_entries_nr;
+	if (addr >= start && addr + len <= end)
+		return true;
+
+	return false;
+}





2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
>> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
>>
>> I test kvm with sriov, which the vf driver could not disable msix.
>> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
>>
>> then I test xen with sriov, there ara also a lot of vm exits caused by
>> MSIX mask.
>> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
>> and domain0 is 60%.
>>
>> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
>>
>> so i think the problem is kvm waste more cpu resource to deal with MSIX
>> mask. and we can see how xen deal with MSIX mask.
>>
>> if this problem sloved, maybe with MSIX enabled, the performace is better.
>
> Please refer to my posted patches for this issue.
>
> http://www.spinics.net/lists/kvm/msg44992.html
>
> --
> regards
> Yang, Sheng
>
>>
>> 2010/11/23 Avi Kivity <avi@redhat.com>:
>> > On 11/23/2010 09:27 AM, lidong chen wrote:
>> >> can you tell me something about this problem.
>> >> thanks.
>> >
>> > Which problem?
>> >
>> > --
>> > I have a truly marvellous patch that fixes the bug which this
>> > signature is too narrow to contain.
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  8:41           ` lidong chen
@ 2010-12-01  8:49             ` Yang, Sheng
  2010-12-01  8:54               ` lidong chen
  2010-12-01  8:56             ` Yang, Sheng
  2010-12-01 14:03             ` Michael S. Tsirkin
  2 siblings, 1 reply; 22+ messages in thread
From: Yang, Sheng @ 2010-12-01  8:49 UTC (permalink / raw)
  To: lidong chen
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Wednesday 01 December 2010 16:41:38 lidong chen wrote:
> I used sr-iov, give each vm 2 vf.
> after apply the patch, and i found performence is the same.
> 
> the reason is in function msix_mmio_write, mostly addr is not in mmio
> range.

Did you patch qemu as well? You can see it's impossible for the kernel
part to work alone...

http://www.mail-archive.com/kvm@vger.kernel.org/msg44368.html

--
regards
Yang, Sheng


> 
> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
> 			   const void *val)
> {
> 	struct kvm_assigned_dev_kernel *adev =
> 			container_of(this, struct kvm_assigned_dev_kernel,
> 				     msix_mmio_dev);
> 	int idx, r = 0;
> 	unsigned long new_val = *(unsigned long *)val;
> 
> 	mutex_lock(&adev->kvm->lock);
> 	if (!msix_mmio_in_range(adev, addr, len)) {
> 		// return here.
>                  r = -EOPNOTSUPP;
> 		goto out;
> 	}
> 
> i printk the value:
> addr             start           end           len
> F004C00C   F0044000  F0044030     4
> 
> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
> 	Subsystem: Intel Corporation Unknown device 000c
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
> 	Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
> 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> 		Vector table: BAR=3 offset=00000000
> 		PBA: BAR=3 offset=00002000
> 
> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
> 	Subsystem: Intel Corporation Unknown device 000c
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
> 	Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
> 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> 		Vector table: BAR=3 offset=00000000
> 		PBA: BAR=3 offset=00002000
> 
> 
> 
> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> +			      gpa_t addr, int len)
> +{
> +	gpa_t start, end;
> +
> +	BUG_ON(adev->msix_mmio_base == 0);
> +	start = adev->msix_mmio_base;
> +	end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> +		adev->msix_max_entries_nr;
> +	if (addr >= start && addr + len <= end)
> +		return true;
> +
> +	return false;
> +}
> 
> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> >> 
> >> I test kvm with sriov, which the vf driver could not disable msix.
> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> >> 
> >> then I test xen with sriov, there ara also a lot of vm exits caused by
> >> MSIX mask.
> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> >> and domain0 is 60%.
> >> 
> >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> >> 
> >> so i think the problem is kvm waste more cpu resource to deal with MSIX
> >> mask. and we can see how xen deal with MSIX mask.
> >> 
> >> if this problem sloved, maybe with MSIX enabled, the performace is
> >> better.
> > 
> > Please refer to my posted patches for this issue.
> > 
> > http://www.spinics.net/lists/kvm/msg44992.html
> > 
> > --
> > regards
> > Yang, Sheng
> > 
> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> >> >> can you tell me something about this problem.
> >> >> thanks.
> >> > 
> >> > Which problem?
> >> > 
> >> > --
> >> > I have a truly marvellous patch that fixes the bug which this
> >> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  8:49             ` Yang, Sheng
@ 2010-12-01  8:54               ` lidong chen
  2010-12-01  9:02                 ` Yang, Sheng
  0 siblings, 1 reply; 22+ messages in thread
From: lidong chen @ 2010-12-01  8:54 UTC (permalink / raw)
  To: Yang, Sheng
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

Yes, I patched qemu as well.

And I found the address of the second VF is not in the MMIO range; the
first one is fine.

2010/12/1 Yang, Sheng <sheng.yang@intel.com>:
> On Wednesday 01 December 2010 16:41:38 lidong chen wrote:
>> I used sr-iov, give each vm 2 vf.
>> after apply the patch, and i found performence is the same.
>>
>> the reason is in function msix_mmio_write, mostly addr is not in mmio
>> range.
>
> Did you patch qemu as well? You can see it's impossible for kernel part to work
> alone...
>
> http://www.mail-archive.com/kvm@vger.kernel.org/msg44368.html
>
> --
> regards
> Yang, Sheng
>
>
>>
>> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
>>                          const void *val)
>> {
>>       struct kvm_assigned_dev_kernel *adev =
>>                       container_of(this, struct kvm_assigned_dev_kernel,
>>                                    msix_mmio_dev);
>>       int idx, r = 0;
>>       unsigned long new_val = *(unsigned long *)val;
>>
>>       mutex_lock(&adev->kvm->lock);
>>       if (!msix_mmio_in_range(adev, addr, len)) {
>>               // return here.
>>                  r = -EOPNOTSUPP;
>>               goto out;
>>       }
>>
>> i printk the value:
>> addr             start           end           len
>> F004C00C   F0044000  F0044030     4
>>
>> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
>>       Subsystem: Intel Corporation Unknown device 000c
>>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B-
>>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR-
>>       Latency: 0
>>       Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
>>       Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
>>       Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>>               Vector table: BAR=3 offset=00000000
>>               PBA: BAR=3 offset=00002000
>>
>> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
>>       Subsystem: Intel Corporation Unknown device 000c
>>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B-
>>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR-
>>       Latency: 0
>>       Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
>>       Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
>>       Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>>               Vector table: BAR=3 offset=00000000
>>               PBA: BAR=3 offset=00002000
>>
>>
>>
>> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
>> +                           gpa_t addr, int len)
>> +{
>> +     gpa_t start, end;
>> +
>> +     BUG_ON(adev->msix_mmio_base == 0);
>> +     start = adev->msix_mmio_base;
>> +     end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
>> +             adev->msix_max_entries_nr;
>> +     if (addr >= start && addr + len <= end)
>> +             return true;
>> +
>> +     return false;
>> +}
>>
>> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
>> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
>> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
>> >>
>> >> I test kvm with sriov, which the vf driver could not disable msix.
>> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
>> >>
>> >> then I test xen with sriov, there ara also a lot of vm exits caused by
>> >> MSIX mask.
>> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
>> >> and domain0 is 60%.
>> >>
>> >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
>> >>
>> >> so i think the problem is kvm waste more cpu resource to deal with MSIX
>> >> mask. and we can see how xen deal with MSIX mask.
>> >>
>> >> if this problem sloved, maybe with MSIX enabled, the performace is
>> >> better.
>> >
>> > Please refer to my posted patches for this issue.
>> >
>> > http://www.spinics.net/lists/kvm/msg44992.html
>> >
>> > --
>> > regards
>> > Yang, Sheng
>> >
>> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
>> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
>> >> >> can you tell me something about this problem.
>> >> >> thanks.
>> >> >
>> >> > Which problem?
>> >> >
>> >> > --
>> >> > I have a truly marvellous patch that fixes the bug which this
>> >> > signature is too narrow to contain.
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  8:41           ` lidong chen
  2010-12-01  8:49             ` Yang, Sheng
@ 2010-12-01  8:56             ` Yang, Sheng
  2010-12-01 14:03             ` Michael S. Tsirkin
  2 siblings, 0 replies; 22+ messages in thread
From: Yang, Sheng @ 2010-12-01  8:56 UTC (permalink / raw)
  To: lidong chen
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Wednesday 01 December 2010 16:41:38 lidong chen wrote:
> I used sr-iov, give each vm 2 vf.
> after apply the patch, and i found performence is the same.
> 
> the reason is in function msix_mmio_write, mostly addr is not in mmio
> range.

This URL may be more convenient.

http://www.spinics.net/lists/kvm/msg44795.html

--
regards
Yang, Sheng

> 
> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
> 			   const void *val)
> {
> 	struct kvm_assigned_dev_kernel *adev =
> 			container_of(this, struct kvm_assigned_dev_kernel,
> 				     msix_mmio_dev);
> 	int idx, r = 0;
> 	unsigned long new_val = *(unsigned long *)val;
> 
> 	mutex_lock(&adev->kvm->lock);
> 	if (!msix_mmio_in_range(adev, addr, len)) {
> 		// return here.
>                  r = -EOPNOTSUPP;
> 		goto out;
> 	}
> 
> i printk the value:
> addr             start           end           len
> F004C00C   F0044000  F0044030     4
> 
> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
> 	Subsystem: Intel Corporation Unknown device 000c
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
> 	Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
> 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> 		Vector table: BAR=3 offset=00000000
> 		PBA: BAR=3 offset=00002000
> 
> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
> 	Subsystem: Intel Corporation Unknown device 000c
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
> 	Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
> 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> 		Vector table: BAR=3 offset=00000000
> 		PBA: BAR=3 offset=00002000
> 
> 
> 
> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> +			      gpa_t addr, int len)
> +{
> +	gpa_t start, end;
> +
> +	BUG_ON(adev->msix_mmio_base == 0);
> +	start = adev->msix_mmio_base;
> +	end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> +		adev->msix_max_entries_nr;
> +	if (addr >= start && addr + len <= end)
> +		return true;
> +
> +	return false;
> +}
> 
> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> >> 
> >> I test kvm with sriov, which the vf driver could not disable msix.
> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> >> 
> >> then I test xen with sriov, there ara also a lot of vm exits caused by
> >> MSIX mask.
> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> >> and domain0 is 60%.
> >> 
> >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> >> 
> >> so i think the problem is kvm waste more cpu resource to deal with MSIX
> >> mask. and we can see how xen deal with MSIX mask.
> >> 
> >> if this problem sloved, maybe with MSIX enabled, the performace is
> >> better.
> > 
> > Please refer to my posted patches for this issue.
> > 
> > http://www.spinics.net/lists/kvm/msg44992.html
> > 
> > --
> > regards
> > Yang, Sheng
> > 
> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> >> >> can you tell me something about this problem.
> >> >> thanks.
> >> > 
> >> > Which problem?
> >> > 
> >> > --
> >> > I have a truly marvellous patch that fixes the bug which this
> >> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  8:54               ` lidong chen
@ 2010-12-01  9:02                 ` Yang, Sheng
  2010-12-01  9:29                   ` lidong chen
  2010-12-01  9:34                   ` Yang, Sheng
  0 siblings, 2 replies; 22+ messages in thread
From: Yang, Sheng @ 2010-12-01  9:02 UTC (permalink / raw)
  To: lidong chen
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Wednesday 01 December 2010 16:54:16 lidong chen wrote:
> yes, i patch qemu as well.
> 
> and i found the address of second vf is not in mmio range. the first
> one is fine.

So it looks like something is wrong with the MMIO registration part.
Could you check the registration in assigned_dev_iomem_map() in the
4th patch for QEmu? I suspect something is wrong with it. I will try
to reproduce it here.

And if you use only one VF, how much is the gain?

--
regards
Yang, Sheng

> 
> 2010/12/1 Yang, Sheng <sheng.yang@intel.com>:
> > On Wednesday 01 December 2010 16:41:38 lidong chen wrote:
> >> I used sr-iov, give each vm 2 vf.
> >> after apply the patch, and i found performence is the same.
> >> 
> >> the reason is in function msix_mmio_write, mostly addr is not in mmio
> >> range.
> > 
> > Did you patch qemu as well? You can see it's impossible for kernel part
> > to work alone...
> > 
> > http://www.mail-archive.com/kvm@vger.kernel.org/msg44368.html
> > 
> > --
> > regards
> > Yang, Sheng
> > 
> >> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
> >> len, const void *val)
> >> {
> >>       struct kvm_assigned_dev_kernel *adev =
> >>                       container_of(this, struct kvm_assigned_dev_kernel,
> >>                                    msix_mmio_dev);
> >>       int idx, r = 0;
> >>       unsigned long new_val = *(unsigned long *)val;
> >> 
> >>       mutex_lock(&adev->kvm->lock);
> >>       if (!msix_mmio_in_range(adev, addr, len)) {
> >>               // return here.
> >>                  r = -EOPNOTSUPP;
> >>               goto out;
> >>       }
> >> 
> >> i printk the value:
> >> addr             start           end           len
> >> F004C00C   F0044000  F0044030     4
> >> 
> >> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> >> 01) Subsystem: Intel Corporation Unknown device 000c
> >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> >> ParErr- Stepping- SERR- FastB2B-
> >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> >> <TAbort- <MAbort- >SERR- <PERR-
> >>       Latency: 0
> >>       Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
> >>       Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
> >>       Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> >>               Vector table: BAR=3 offset=00000000
> >>               PBA: BAR=3 offset=00002000
> >> 
> >> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> >> 01) Subsystem: Intel Corporation Unknown device 000c
> >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> >> ParErr- Stepping- SERR- FastB2B-
> >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> >> <TAbort- <MAbort- >SERR- <PERR-
> >>       Latency: 0
> >>       Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
> >>       Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
> >>       Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> >>               Vector table: BAR=3 offset=00000000
> >>               PBA: BAR=3 offset=00002000
> >> 
> >> 
> >> 
> >> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> >> +                           gpa_t addr, int len)
> >> +{
> >> +     gpa_t start, end;
> >> +
> >> +     BUG_ON(adev->msix_mmio_base == 0);
> >> +     start = adev->msix_mmio_base;
> >> +     end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> >> +             adev->msix_max_entries_nr;
> >> +     if (addr >= start && addr + len <= end)
> >> +             return true;
> >> +
> >> +     return false;
> >> +}
> >> 
> >> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> >> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> >> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> >> >> 
> >> >> I test kvm with sriov, which the vf driver could not disable msix.
> >> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> >> >> 
> >> >> then I test xen with sriov, there ara also a lot of vm exits caused
> >> >> by MSIX mask.
> >> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> >> >> and domain0 is 60%.
> >> >> 
> >> >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> >> >> 
> >> >> so i think the problem is kvm waste more cpu resource to deal with
> >> >> MSIX mask. and we can see how xen deal with MSIX mask.
> >> >> 
> >> >> if this problem sloved, maybe with MSIX enabled, the performace is
> >> >> better.
> >> > 
> >> > Please refer to my posted patches for this issue.
> >> > 
> >> > http://www.spinics.net/lists/kvm/msg44992.html
> >> > 
> >> > --
> >> > regards
> >> > Yang, Sheng
> >> > 
> >> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> >> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> >> >> >> can you tell me something about this problem.
> >> >> >> thanks.
> >> >> > 
> >> >> > Which problem?
> >> >> > 
> >> >> > --
> >> >> > I have a truly marvellous patch that fixes the bug which this
> >> >> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  9:02                 ` Yang, Sheng
@ 2010-12-01  9:29                   ` lidong chen
  2010-12-01  9:37                     ` Yang, Sheng
  2010-12-01  9:34                   ` Yang, Sheng
  1 sibling, 1 reply; 22+ messages in thread
From: lidong chen @ 2010-12-01  9:29 UTC (permalink / raw)
  To: Yang, Sheng
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

Maybe it is because I modified the code in assigned_dev_iomem_map().

I used RHEL6, where calc_assigned_dev_id() takes only two parameters:

static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t devfn)
{
    return (uint32_t)bus << 8 | (uint32_t)devfn;
}

but in the patch the call takes three parameters:
+                msix_mmio.id = calc_assigned_dev_id(r_dev->h_segnr,
+                        r_dev->h_busnr, r_dev->h_devfn);

so I dropped h_segnr in my version:
#ifdef KVM_CAP_MSIX_MASK
            if (cap_mask) {
                memset(&msix_mmio, 0, sizeof msix_mmio);
                msix_mmio.id = calc_assigned_dev_id(r_dev->h_busnr,
r_dev->h_devfn);
                msix_mmio.type = KVM_MSIX_TYPE_ASSIGNED_DEV;
                msix_mmio.base_addr = e_phys + offset;
                msix_mmio.max_entries_nr = r_dev->max_msix_entries_nr;
                msix_mmio.flags = KVM_MSIX_MMIO_FLAG_REGISTER;
                ret = kvm_update_msix_mmio(kvm_context, &msix_mmio);
                if (ret)
                    fprintf(stderr, "fail to register in-kernel msix_mmio!\n");
            }
#endif



2010/12/1 Yang, Sheng <sheng.yang@intel.com>:
> On Wednesday 01 December 2010 16:54:16 lidong chen wrote:
>> yes, i patch qemu as well.
>>
>> and i found the address of second vf is not in mmio range. the first
>> one is fine.
>
> So looks like something wrong with MMIO register part. Could you check the
> registeration in assigned_dev_iomem_map() of the 4th patch for QEmu? I suppose
> something wrong with it. I would try to reproduce it here.
>
> And if you only use one vf, how about the gain?
>
> --
> regards
> Yang, Sheng
>
>>
>> 2010/12/1 Yang, Sheng <sheng.yang@intel.com>:
>> > On Wednesday 01 December 2010 16:41:38 lidong chen wrote:
>> >> I used sr-iov, give each vm 2 vf.
>> >> after apply the patch, and i found performence is the same.
>> >>
>> >> the reason is in function msix_mmio_write, mostly addr is not in mmio
>> >> range.
>> >
>> > Did you patch qemu as well? You can see it's impossible for kernel part
>> > to work alone...
>> >
>> > http://www.mail-archive.com/kvm@vger.kernel.org/msg44368.html
>> >
>> > --
>> > regards
>> > Yang, Sheng
>> >
>> >> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
>> >> len, const void *val)
>> >> {
>> >>       struct kvm_assigned_dev_kernel *adev =
>> >>                       container_of(this, struct kvm_assigned_dev_kernel,
>> >>                                    msix_mmio_dev);
>> >>       int idx, r = 0;
>> >>       unsigned long new_val = *(unsigned long *)val;
>> >>
>> >>       mutex_lock(&adev->kvm->lock);
>> >>       if (!msix_mmio_in_range(adev, addr, len)) {
>> >>               // return here.
>> >>                  r = -EOPNOTSUPP;
>> >>               goto out;
>> >>       }
>> >>
>> >> i printk the value:
>> >> addr             start           end           len
>> >> F004C00C   F0044000  F0044030     4
>> >>
>> >> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
>> >> 01) Subsystem: Intel Corporation Unknown device 000c
>> >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> >> ParErr- Stepping- SERR- FastB2B-
>> >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> >> <TAbort- <MAbort- >SERR- <PERR-
>> >>       Latency: 0
>> >>       Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
>> >>       Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
>> >>       Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>> >>               Vector table: BAR=3 offset=00000000
>> >>               PBA: BAR=3 offset=00002000
>> >>
>> >> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
>> >> 01) Subsystem: Intel Corporation Unknown device 000c
>> >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> >> ParErr- Stepping- SERR- FastB2B-
>> >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> >> <TAbort- <MAbort- >SERR- <PERR-
>> >>       Latency: 0
>> >>       Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
>> >>       Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
>> >>       Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>> >>               Vector table: BAR=3 offset=00000000
>> >>               PBA: BAR=3 offset=00002000
>> >>
>> >>
>> >>
>> >> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
>> >> +                           gpa_t addr, int len)
>> >> +{
>> >> +     gpa_t start, end;
>> >> +
>> >> +     BUG_ON(adev->msix_mmio_base == 0);
>> >> +     start = adev->msix_mmio_base;
>> >> +     end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
>> >> +             adev->msix_max_entries_nr;
>> >> +     if (addr >= start && addr + len <= end)
>> >> +             return true;
>> >> +
>> >> +     return false;
>> >> +}
>> >>
>> >> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
>> >> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
>> >> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
>> >> >>
>> >> >> I test kvm with sriov, which the vf driver could not disable msix.
>> >> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
>> >> >>
>> >> >> then I test xen with sriov, there ara also a lot of vm exits caused
>> >> >> by MSIX mask.
>> >> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
>> >> >> and domain0 is 60%.
>> >> >>
>> >> >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
>> >> >>
>> >> >> so i think the problem is kvm waste more cpu resource to deal with
>> >> >> MSIX mask. and we can see how xen deal with MSIX mask.
>> >> >>
>> >> >> if this problem sloved, maybe with MSIX enabled, the performace is
>> >> >> better.
>> >> >
>> >> > Please refer to my posted patches for this issue.
>> >> >
>> >> > http://www.spinics.net/lists/kvm/msg44992.html
>> >> >
>> >> > --
>> >> > regards
>> >> > Yang, Sheng
>> >> >
>> >> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
>> >> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
>> >> >> >> can you tell me something about this problem.
>> >> >> >> thanks.
>> >> >> >
>> >> >> > Which problem?
>> >> >> >
>> >> >> > --
>> >> >> > I have a truly marvellous patch that fixes the bug which this
>> >> >> > signature is too narrow to contain.
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  9:02                 ` Yang, Sheng
  2010-12-01  9:29                   ` lidong chen
@ 2010-12-01  9:34                   ` Yang, Sheng
  1 sibling, 0 replies; 22+ messages in thread
From: Yang, Sheng @ 2010-12-01  9:34 UTC (permalink / raw)
  To: lidong chen
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Wednesday 01 December 2010 17:02:57 Yang, Sheng wrote:
> On Wednesday 01 December 2010 16:54:16 lidong chen wrote:
> > yes, i patch qemu as well.
> > 
> > and i found the address of second vf is not in mmio range. the first
> > one is fine.
> 
> So looks like something wrong with MMIO register part. Could you check the
> registeration in assigned_dev_iomem_map() of the 4th patch for QEmu? I
> suppose something wrong with it. I would try to reproduce it here.
> 
> And if you only use one vf, how about the gain?

It's weird... I tried assigning two VFs to the guest, and both devices' MSI-X mask 
bit accesses were intercepted by the kernel as expected...

So in msix_mmio_write/read, you can't see any mask bit accesses from the second 
device?

Also, the benefit of this patch only shows when the mask bit operation intensity is 
high. What is your interrupt intensity?

--
regards
Yang, Sheng

> 
> --
> regards
> Yang, Sheng
> 
> > 2010/12/1 Yang, Sheng <sheng.yang@intel.com>:
> > > On Wednesday 01 December 2010 16:41:38 lidong chen wrote:
> > >> I used sr-iov, give each vm 2 vf.
> > >> after apply the patch, and i found performence is the same.
> > >> 
> > >> the reason is in function msix_mmio_write, mostly addr is not in mmio
> > >> range.
> > > 
> > > Did you patch qemu as well? You can see it's impossible for kernel part
> > > to work alone...
> > > 
> > > http://www.mail-archive.com/kvm@vger.kernel.org/msg44368.html
> > > 
> > > --
> > > regards
> > > Yang, Sheng
> > > 
> > >> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
> > >> len, const void *val)
> > >> {
> > >> 
> > >>       struct kvm_assigned_dev_kernel *adev =
> > >>       
> > >>                       container_of(this, struct
> > >>                       kvm_assigned_dev_kernel,
> > >>                       
> > >>                                    msix_mmio_dev);
> > >>       
> > >>       int idx, r = 0;
> > >>       unsigned long new_val = *(unsigned long *)val;
> > >>       
> > >>       mutex_lock(&adev->kvm->lock);
> > >>       if (!msix_mmio_in_range(adev, addr, len)) {
> > >>       
> > >>               // return here.
> > >>               
> > >>                  r = -EOPNOTSUPP;
> > >>               
> > >>               goto out;
> > >>       
> > >>       }
> > >> 
> > >> i printk the value:
> > >> addr             start           end           len
> > >> F004C00C   F0044000  F0044030     4
> > >> 
> > >> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed
> > >> (rev 01) Subsystem: Intel Corporation Unknown device 000c
> > >> 
> > >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> > >> 
> > >> ParErr- Stepping- SERR- FastB2B-
> > >> 
> > >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > >> 
> > >> <TAbort- <MAbort- >SERR- <PERR-
> > >> 
> > >>       Latency: 0
> > >>       Region 0: Memory at f0040000 (32-bit, non-prefetchable)
> > >>       [size=16K] Region 3: Memory at f0044000 (32-bit,
> > >>       non-prefetchable) [size=16K] Capabilities: [40] MSI-X: Enable+
> > >>       Mask- TabSize=3
> > >>       
> > >>               Vector table: BAR=3 offset=00000000
> > >>               PBA: BAR=3 offset=00002000
> > >> 
> > >> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed
> > >> (rev 01) Subsystem: Intel Corporation Unknown device 000c
> > >> 
> > >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> > >> 
> > >> ParErr- Stepping- SERR- FastB2B-
> > >> 
> > >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > >> 
> > >> <TAbort- <MAbort- >SERR- <PERR-
> > >> 
> > >>       Latency: 0
> > >>       Region 0: Memory at f0048000 (32-bit, non-prefetchable)
> > >>       [size=16K] Region 3: Memory at f004c000 (32-bit,
> > >>       non-prefetchable) [size=16K] Capabilities: [40] MSI-X: Enable+
> > >>       Mask- TabSize=3
> > >>       
> > >>               Vector table: BAR=3 offset=00000000
> > >>               PBA: BAR=3 offset=00002000
> > >> 
> > >> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> > >> +                           gpa_t addr, int len)
> > >> +{
> > >> +     gpa_t start, end;
> > >> +
> > >> +     BUG_ON(adev->msix_mmio_base == 0);
> > >> +     start = adev->msix_mmio_base;
> > >> +     end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> > >> +             adev->msix_max_entries_nr;
> > >> +     if (addr >= start && addr + len <= end)
> > >> +             return true;
> > >> +
> > >> +     return false;
> > >> +}
> > >> 
> > >> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> > >> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> > >> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu
> > >> >> resource.
> > >> >> 
> > >> >> I test kvm with sriov, which the vf driver could not disable msix.
> > >> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> > >> >> 
> > >> >> then I test xen with sriov, there ara also a lot of vm exits caused
> > >> >> by MSIX mask.
> > >> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of
> > >> >> xen and domain0 is 60%.
> > >> >> 
> > >> >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> > >> >> 
> > >> >> so i think the problem is kvm waste more cpu resource to deal with
> > >> >> MSIX mask. and we can see how xen deal with MSIX mask.
> > >> >> 
> > >> >> if this problem sloved, maybe with MSIX enabled, the performace is
> > >> >> better.
> > >> > 
> > >> > Please refer to my posted patches for this issue.
> > >> > 
> > >> > http://www.spinics.net/lists/kvm/msg44992.html
> > >> > 
> > >> > --
> > >> > regards
> > >> > Yang, Sheng
> > >> > 
> > >> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> > >> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> > >> >> >> can you tell me something about this problem.
> > >> >> >> thanks.
> > >> >> > 
> > >> >> > Which problem?
> > >> >> > 
> > >> >> > --
> > >> >> > I have a truly marvellous patch that fixes the bug which this
> > >> >> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  9:29                   ` lidong chen
@ 2010-12-01  9:37                     ` Yang, Sheng
  0 siblings, 0 replies; 22+ messages in thread
From: Yang, Sheng @ 2010-12-01  9:37 UTC (permalink / raw)
  To: lidong chen
  Cc: Avi Kivity, Gleb Natapov, mst@redhat.com, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Wednesday 01 December 2010 17:29:44 lidong chen wrote:
> maybe because i modify the code in assigned_dev_iomem_map().
> 
> i used RHEL6, and calc_assigned_dev_id is below:
> 
> static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t devfn)
> {
>     return (uint32_t)bus << 8 | (uint32_t)devfn;
> }
> 
> and in patch there are there param.
> +                msix_mmio.id = calc_assigned_dev_id(r_dev->h_segnr,
> +                        r_dev->h_busnr, r_dev->h_devfn);

This one should be fine, because h_segnr should be 0 here.

But I strongly recommend you use the latest KVM and the latest QEMU; we can't know 
what would happen during the rebase... (my patch may be a little old for the latest 
tree: my kvm base is 365bb670a44b217870c2ee1065f57bb43b57e166, and my qemu base is 
420fe74769cc67baec6f3d962dc054e2972ca3ae).

Things to check:
1. Whether both devices' MMIO regions have been registered successfully.
2. Whether you can see the mask bit accesses in the kernel from both devices.

--
regards
Yang, Sheng

> 
> 
> #ifdef KVM_CAP_MSIX_MASK
>             if (cap_mask) {
>                 memset(&msix_mmio, 0, sizeof msix_mmio);
>                 msix_mmio.id = calc_assigned_dev_id(r_dev->h_busnr,
> r_dev->h_devfn);
>                 msix_mmio.type = KVM_MSIX_TYPE_ASSIGNED_DEV;
>                 msix_mmio.base_addr = e_phys + offset;
>                 msix_mmio.max_entries_nr = r_dev->max_msix_entries_nr;
>                 msix_mmio.flags = KVM_MSIX_MMIO_FLAG_REGISTER;
>                 ret = kvm_update_msix_mmio(kvm_context, &msix_mmio);
>                 if (ret)
>                     fprintf(stderr, "fail to register in-kernel
> msix_mmio!\n"); }
> #endif
> 
> 2010/12/1 Yang, Sheng <sheng.yang@intel.com>:
> > On Wednesday 01 December 2010 16:54:16 lidong chen wrote:
> >> yes, i patch qemu as well.
> >> 
> >> and i found the address of second vf is not in mmio range. the first
> >> one is fine.
> > 
> > So looks like something wrong with MMIO register part. Could you check
> > the registeration in assigned_dev_iomem_map() of the 4th patch for QEmu?
> > I suppose something wrong with it. I would try to reproduce it here.
> > 
> > And if you only use one vf, how about the gain?
> > 
> > --
> > regards
> > Yang, Sheng
> > 
> >> 2010/12/1 Yang, Sheng <sheng.yang@intel.com>:
> >> > On Wednesday 01 December 2010 16:41:38 lidong chen wrote:
> >> >> I used sr-iov, give each vm 2 vf.
> >> >> after apply the patch, and i found performence is the same.
> >> >> 
> >> >> the reason is in function msix_mmio_write, mostly addr is not in mmio
> >> >> range.
> >> > 
> >> > Did you patch qemu as well? You can see it's impossible for kernel
> >> > part to work alone...
> >> > 
> >> > http://www.mail-archive.com/kvm@vger.kernel.org/msg44368.html
> >> > 
> >> > --
> >> > regards
> >> > Yang, Sheng
> >> > 
> >> >> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr,
> >> >> int len, const void *val)
> >> >> {
> >> >> 
> >> >>       struct kvm_assigned_dev_kernel *adev =
> >> >>       
> >> >>                       container_of(this, struct
> >> >>                       kvm_assigned_dev_kernel,
> >> >>                       
> >> >>                                    msix_mmio_dev);
> >> >>       
> >> >>       int idx, r = 0;
> >> >>       unsigned long new_val = *(unsigned long *)val;
> >> >>       
> >> >>       mutex_lock(&adev->kvm->lock);
> >> >>       if (!msix_mmio_in_range(adev, addr, len)) {
> >> >>       
> >> >>               // return here.
> >> >>               
> >> >>                  r = -EOPNOTSUPP;
> >> >>               
> >> >>               goto out;
> >> >>       
> >> >>       }
> >> >> 
> >> >> i printk the value:
> >> >> addr             start           end           len
> >> >> F004C00C   F0044000  F0044030     4
> >> >> 
> >> >> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed
> >> >> (rev 01) Subsystem: Intel Corporation Unknown device 000c
> >> >> 
> >> >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> >> >> 
> >> >> ParErr- Stepping- SERR- FastB2B-
> >> >> 
> >> >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> >> >> 
> >> >> <TAbort- <MAbort- >SERR- <PERR-
> >> >> 
> >> >>       Latency: 0
> >> >>       Region 0: Memory at f0040000 (32-bit, non-prefetchable)
> >> >>       [size=16K] Region 3: Memory at f0044000 (32-bit,
> >> >>       non-prefetchable) [size=16K] Capabilities: [40] MSI-X: Enable+
> >> >>       Mask- TabSize=3
> >> >>       
> >> >>               Vector table: BAR=3 offset=00000000
> >> >>               PBA: BAR=3 offset=00002000
> >> >> 
> >> >> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed
> >> >> (rev 01) Subsystem: Intel Corporation Unknown device 000c
> >> >> 
> >> >>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> >> >> 
> >> >> ParErr- Stepping- SERR- FastB2B-
> >> >> 
> >> >>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> >> >> 
> >> >> <TAbort- <MAbort- >SERR- <PERR-
> >> >> 
> >> >>       Latency: 0
> >> >>       Region 0: Memory at f0048000 (32-bit, non-prefetchable)
> >> >>       [size=16K] Region 3: Memory at f004c000 (32-bit,
> >> >>       non-prefetchable) [size=16K] Capabilities: [40] MSI-X: Enable+
> >> >>       Mask- TabSize=3
> >> >>       
> >> >>               Vector table: BAR=3 offset=00000000
> >> >>               PBA: BAR=3 offset=00002000
> >> >> 
> >> >> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> >> >> +                           gpa_t addr, int len)
> >> >> +{
> >> >> +     gpa_t start, end;
> >> >> +
> >> >> +     BUG_ON(adev->msix_mmio_base == 0);
> >> >> +     start = adev->msix_mmio_base;
> >> >> +     end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> >> >> +             adev->msix_max_entries_nr;
> >> >> +     if (addr >= start && addr + len <= end)
> >> >> +             return true;
> >> >> +
> >> >> +     return false;
> >> >> +}
> >> >> 
> >> >> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> >> >> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> >> >> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu
> >> >> >> resource.
> >> >> >> 
> >> >> >> I test kvm with sriov, which the vf driver could not disable msix.
> >> >> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> >> >> >> 
> >> >> >> then I test xen with sriov, there ara also a lot of vm exits
> >> >> >> caused by MSIX mask.
> >> >> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of
> >> >> >> xen and domain0 is 60%.
> >> >> >> 
> >> >> >> without sr-iov, the cpu rate of xen and domain0 is higher than
> >> >> >> kvm.
> >> >> >> 
> >> >> >> so i think the problem is kvm waste more cpu resource to deal with
> >> >> >> MSIX mask. and we can see how xen deal with MSIX mask.
> >> >> >> 
> >> >> >> if this problem sloved, maybe with MSIX enabled, the performace is
> >> >> >> better.
> >> >> > 
> >> >> > Please refer to my posted patches for this issue.
> >> >> > 
> >> >> > http://www.spinics.net/lists/kvm/msg44992.html
> >> >> > 
> >> >> > --
> >> >> > regards
> >> >> > Yang, Sheng
> >> >> > 
> >> >> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> >> >> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> >> >> >> >> can you tell me something about this problem.
> >> >> >> >> thanks.
> >> >> >> > 
> >> >> >> > Which problem?
> >> >> >> > 
> >> >> >> > --
> >> >> >> > I have a truly marvellous patch that fixes the bug which this
> >> >> >> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01  8:41           ` lidong chen
  2010-12-01  8:49             ` Yang, Sheng
  2010-12-01  8:56             ` Yang, Sheng
@ 2010-12-01 14:03             ` Michael S. Tsirkin
  2010-12-02  1:13               ` Yang, Sheng
  2 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2010-12-01 14:03 UTC (permalink / raw)
  To: lidong chen
  Cc: Yang, Sheng, Avi Kivity, Gleb Natapov, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Wed, Dec 01, 2010 at 04:41:38PM +0800, lidong chen wrote:
> I used sr-iov, give each vm 2 vf.
> after apply the patch, and i found performence is the same.
> 
> the reason is in function msix_mmio_write, mostly addr is not in mmio range.
> 
> static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
> 			   const void *val)
> {
> 	struct kvm_assigned_dev_kernel *adev =
> 			container_of(this, struct kvm_assigned_dev_kernel,
> 				     msix_mmio_dev);
> 	int idx, r = 0;
> 	unsigned long new_val = *(unsigned long *)val;
> 
> 	mutex_lock(&adev->kvm->lock);
> 	if (!msix_mmio_in_range(adev, addr, len)) {
> 		// return here.
>                  r = -EOPNOTSUPP;
> 		goto out;
> 	}
> 
> i printk the value:
> addr             start           end           len
> F004C00C   F0044000  F0044030     4
> 
> 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
> 	Subsystem: Intel Corporation Unknown device 000c
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
> 	Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
> 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> 		Vector table: BAR=3 offset=00000000
> 		PBA: BAR=3 offset=00002000
> 
> 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev 01)
> 	Subsystem: Intel Corporation Unknown device 000c
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
> 	Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
> 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> 		Vector table: BAR=3 offset=00000000
> 		PBA: BAR=3 offset=00002000
> 
> 
> 
> +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> +			      gpa_t addr, int len)
> +{
> +	gpa_t start, end;
> +
> +	BUG_ON(adev->msix_mmio_base == 0);
> +	start = adev->msix_mmio_base;
> +	end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> +		adev->msix_max_entries_nr;
> +	if (addr >= start && addr + len <= end)
> +		return true;
> +
> +	return false;
> +}


Hmm, this check looks wrong to me: there's no guarantee
that the guest uses the first N entries in the table.
E.g. it could use only a single entry, and that one could be the last one.

> 
> 
> 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> >>
> >> I test kvm with sriov, which the vf driver could not disable msix.
> >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> >>
> >> then I test xen with sriov, there ara also a lot of vm exits caused by
> >> MSIX mask.
> >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> >> and domain0 is 60%.
> >>
> >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> >>
> >> so i think the problem is kvm waste more cpu resource to deal with MSIX
> >> mask. and we can see how xen deal with MSIX mask.
> >>
> >> if this problem sloved, maybe with MSIX enabled, the performace is better.
> >
> > Please refer to my posted patches for this issue.
> >
> > http://www.spinics.net/lists/kvm/msg44992.html
> >
> > --
> > regards
> > Yang, Sheng
> >
> >>
> >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> >> >> can you tell me something about this problem.
> >> >> thanks.
> >> >
> >> > Which problem?
> >> >
> >> > --
> >> > I have a truly marvellous patch that fixes the bug which this
> >> > signature is too narrow to contain.
> >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-01 14:03             ` Michael S. Tsirkin
@ 2010-12-02  1:13               ` Yang, Sheng
  2010-12-02  9:49                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Yang, Sheng @ 2010-12-02  1:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: lidong chen, Avi Kivity, Gleb Natapov, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Wednesday 01 December 2010 22:03:58 Michael S. Tsirkin wrote:
> On Wed, Dec 01, 2010 at 04:41:38PM +0800, lidong chen wrote:
> > I used sr-iov, give each vm 2 vf.
> > after apply the patch, and i found performence is the same.
> > 
> > the reason is in function msix_mmio_write, mostly addr is not in mmio
> > range.
> > 
> > static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
> > len,
> > 
> > 			   const void *val)
> > 
> > {
> > 
> > 	struct kvm_assigned_dev_kernel *adev =
> > 	
> > 			container_of(this, struct kvm_assigned_dev_kernel,
> > 			
> > 				     msix_mmio_dev);
> > 	
> > 	int idx, r = 0;
> > 	unsigned long new_val = *(unsigned long *)val;
> > 	
> > 	mutex_lock(&adev->kvm->lock);
> > 	if (!msix_mmio_in_range(adev, addr, len)) {
> > 	
> > 		// return here.
> > 		
> >                  r = -EOPNOTSUPP;
> > 		
> > 		goto out;
> > 	
> > 	}
> > 
> > i printk the value:
> > addr             start           end           len
> > F004C00C   F0044000  F0044030     4
> > 
> > 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> > 01)
> > 
> > 	Subsystem: Intel Corporation Unknown device 000c
> > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > 
> > Stepping- SERR- FastB2B-
> > 
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > 
> > <TAbort- <MAbort- >SERR- <PERR-
> > 
> > 	Latency: 0
> > 	Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
> > 	Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
> > 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> > 	
> > 		Vector table: BAR=3 offset=00000000
> > 		PBA: BAR=3 offset=00002000
> > 
> > 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> > 01)
> > 
> > 	Subsystem: Intel Corporation Unknown device 000c
> > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > 
> > Stepping- SERR- FastB2B-
> > 
> > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > 
> > <TAbort- <MAbort- >SERR- <PERR-
> > 
> > 	Latency: 0
> > 	Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
> > 	Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
> > 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> > 	
> > 		Vector table: BAR=3 offset=00000000
> > 		PBA: BAR=3 offset=00002000
> > 
> > +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> > +			      gpa_t addr, int len)
> > +{
> > +	gpa_t start, end;
> > +
> > +	BUG_ON(adev->msix_mmio_base == 0);
> > +	start = adev->msix_mmio_base;
> > +	end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> > +		adev->msix_max_entries_nr;
> > +	if (addr >= start && addr + len <= end)
> > +		return true;
> > +
> > +	return false;
> > +}
> 
> Hmm, this check looks wrong to me: there's no guarantee
> that guest uses the first N entries in the table.
> E.g. it could use a single entry, but only the last one.

Please check the PCI spec.

--
regards
Yang, Sheng

 
> > 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> > > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> > >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> > >> 
> > >> I test kvm with sriov, which the vf driver could not disable msix.
> > >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> > >> 
> > >> then I test xen with sriov, there ara also a lot of vm exits caused by
> > >> MSIX mask.
> > >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> > >> and domain0 is 60%.
> > >> 
> > >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> > >> 
> > >> so i think the problem is kvm waste more cpu resource to deal with
> > >> MSIX mask. and we can see how xen deal with MSIX mask.
> > >> 
> > >> if this problem sloved, maybe with MSIX enabled, the performace is
> > >> better.
> > > 
> > > Please refer to my posted patches for this issue.
> > > 
> > > http://www.spinics.net/lists/kvm/msg44992.html
> > > 
> > > --
> > > regards
> > > Yang, Sheng
> > > 
> > >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> > >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> > >> >> can you tell me something about this problem.
> > >> >> thanks.
> > >> > 
> > >> > Which problem?
> > >> > 
> > >> > --
> > >> > I have a truly marvellous patch that fixes the bug which this
> > >> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-02  1:13               ` Yang, Sheng
@ 2010-12-02  9:49                 ` Michael S. Tsirkin
  2010-12-02 11:52                   ` Sheng Yang
  0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2010-12-02  9:49 UTC (permalink / raw)
  To: Yang, Sheng
  Cc: lidong chen, Avi Kivity, Gleb Natapov, aliguori@us.ibm.com,
	rusty@rustcorp.com.au, kvm@vger.kernel.org

On Thu, Dec 02, 2010 at 09:13:28AM +0800, Yang, Sheng wrote:
> On Wednesday 01 December 2010 22:03:58 Michael S. Tsirkin wrote:
> > On Wed, Dec 01, 2010 at 04:41:38PM +0800, lidong chen wrote:
> > > I used sr-iov, give each vm 2 vf.
> > > after apply the patch, and i found performence is the same.
> > > 
> > > the reason is in function msix_mmio_write, mostly addr is not in mmio
> > > range.
> > > 
> > > static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
> > > len,
> > > 
> > > 			   const void *val)
> > > 
> > > {
> > > 
> > > 	struct kvm_assigned_dev_kernel *adev =
> > > 	
> > > 			container_of(this, struct kvm_assigned_dev_kernel,
> > > 			
> > > 				     msix_mmio_dev);
> > > 	
> > > 	int idx, r = 0;
> > > 	unsigned long new_val = *(unsigned long *)val;
> > > 	
> > > 	mutex_lock(&adev->kvm->lock);
> > > 	if (!msix_mmio_in_range(adev, addr, len)) {
> > > 	
> > > 		// return here.
> > > 		
> > >                  r = -EOPNOTSUPP;
> > > 		
> > > 		goto out;
> > > 	
> > > 	}
> > > 
> > > i printk the value:
> > > addr             start           end           len
> > > F004C00C   F0044000  F0044030     4
> > > 
> > > 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> > > 01)
> > > 
> > > 	Subsystem: Intel Corporation Unknown device 000c
> > > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > > 
> > > Stepping- SERR- FastB2B-
> > > 
> > > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > > 
> > > <TAbort- <MAbort- >SERR- <PERR-
> > > 
> > > 	Latency: 0
> > > 	Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
> > > 	Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
> > > 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> > > 	
> > > 		Vector table: BAR=3 offset=00000000
> > > 		PBA: BAR=3 offset=00002000
> > > 
> > > 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> > > 01)
> > > 
> > > 	Subsystem: Intel Corporation Unknown device 000c
> > > 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > > 
> > > Stepping- SERR- FastB2B-
> > > 
> > > 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> > > 
> > > <TAbort- <MAbort- >SERR- <PERR-
> > > 
> > > 	Latency: 0
> > > 	Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
> > > 	Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
> > > 	Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> > > 	
> > > 		Vector table: BAR=3 offset=00000000
> > > 		PBA: BAR=3 offset=00002000
> > > 
> > > +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> > > +			      gpa_t addr, int len)
> > > +{
> > > +	gpa_t start, end;
> > > +
> > > +	BUG_ON(adev->msix_mmio_base == 0);
> > > +	start = adev->msix_mmio_base;
> > > +	end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> > > +		adev->msix_max_entries_nr;
> > > +	if (addr >= start && addr + len <= end)
> > > +		return true;
> > > +
> > > +	return false;
> > > +}
> > 
> > Hmm, this check looks wrong to me: there's no guarantee
> > that guest uses the first N entries in the table.
> > E.g. it could use a single entry, but only the last one.
> 
> Please check the PCI spec.


This is pretty explicit in the spec: see the last paragraph below:

IMPLEMENTATION NOTE
Handling MSI-X Vector Shortages
For the case where fewer vectors are allocated to a function than desired, software-
controlled aliasing as enabled by MSI-X is one approach for handling the situation. For
example, if a function supports five queues, each with an associated MSI-X table entry, but
only three vectors are allocated, the function could be designed for software still to configure
all five table entries, assigning one or more vectors to multiple table entries. Software could
assign the three vectors {A,B,C} to the five entries as ABCCC, ABBCC, ABCBA, or other
similar combinations.


Alternatively, the function could be designed for software to configure it (using a device-
specific mechanism) to use only three queues and three MSI-X table entries. Software could
assign the three vectors {A,B,C} to the five entries as ABC--, A-B-C, A--CB, or other similar
combinations.



> --
> regards
> Yang, Sheng
> 
>  
> > > 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> > > > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> > > >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> > > >> 
> > > >> I test kvm with sriov, which the vf driver could not disable msix.
> > > >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> > > >> 
> > > >> then I test xen with sriov, there ara also a lot of vm exits caused by
> > > >> MSIX mask.
> > > >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> > > >> and domain0 is 60%.
> > > >> 
> > > >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> > > >> 
> > > >> so i think the problem is kvm waste more cpu resource to deal with
> > > >> MSIX mask. and we can see how xen deal with MSIX mask.
> > > >> 
> > > >> if this problem sloved, maybe with MSIX enabled, the performace is
> > > >> better.
> > > > 
> > > > Please refer to my posted patches for this issue.
> > > > 
> > > > http://www.spinics.net/lists/kvm/msg44992.html
> > > > 
> > > > --
> > > > regards
> > > > Yang, Sheng
> > > > 
> > > >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> > > >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> > > >> >> can you tell me something about this problem.
> > > >> >> thanks.
> > > >> > 
> > > >> > Which problem?
> > > >> > 
> > > >> > --
> > > >> > I have a truly marvellous patch that fixes the bug which this
> > > >> > signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-02  9:49                 ` Michael S. Tsirkin
@ 2010-12-02 11:52                   ` Sheng Yang
  2010-12-02 12:23                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Sheng Yang @ 2010-12-02 11:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Yang, Sheng, lidong chen, Avi Kivity, Gleb Natapov,
	aliguori@us.ibm.com, rusty@rustcorp.com.au, kvm@vger.kernel.org

On Thu, Dec 2, 2010 at 5:49 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Dec 02, 2010 at 09:13:28AM +0800, Yang, Sheng wrote:
>> On Wednesday 01 December 2010 22:03:58 Michael S. Tsirkin wrote:
>> > On Wed, Dec 01, 2010 at 04:41:38PM +0800, lidong chen wrote:
>> > > I used sr-iov, give each vm 2 vf.
>> > > after apply the patch, and i found performence is the same.
>> > >
>> > > the reason is in function msix_mmio_write, mostly addr is not in mmio
>> > > range.
>> > >
>> > > static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
>> > > len,
>> > >
>> > >                      const void *val)
>> > >
>> > > {
>> > >
>> > >   struct kvm_assigned_dev_kernel *adev =
>> > >
>> > >                   container_of(this, struct kvm_assigned_dev_kernel,
>> > >
>> > >                                msix_mmio_dev);
>> > >
>> > >   int idx, r = 0;
>> > >   unsigned long new_val = *(unsigned long *)val;
>> > >
>> > >   mutex_lock(&adev->kvm->lock);
>> > >   if (!msix_mmio_in_range(adev, addr, len)) {
>> > >
>> > >           // return here.
>> > >
>> > >                  r = -EOPNOTSUPP;
>> > >
>> > >           goto out;
>> > >
>> > >   }
>> > >
>> > > i printk the value:
>> > > addr             start           end           len
>> > > F004C00C   F0044000  F0044030     4
>> > >
>> > > 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
>> > > 01)
>> > >
>> > >   Subsystem: Intel Corporation Unknown device 000c
>> > >   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> > >
>> > > Stepping- SERR- FastB2B-
>> > >
>> > >   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> > >
>> > > <TAbort- <MAbort- >SERR- <PERR-
>> > >
>> > >   Latency: 0
>> > >   Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
>> > >   Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
>> > >   Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>> > >
>> > >           Vector table: BAR=3 offset=00000000
>> > >           PBA: BAR=3 offset=00002000
>> > >
>> > > 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
>> > > 01)
>> > >
>> > >   Subsystem: Intel Corporation Unknown device 000c
>> > >   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> > >
>> > > Stepping- SERR- FastB2B-
>> > >
>> > >   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> > >
>> > > <TAbort- <MAbort- >SERR- <PERR-
>> > >
>> > >   Latency: 0
>> > >   Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
>> > >   Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
>> > >   Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>> > >
>> > >           Vector table: BAR=3 offset=00000000
>> > >           PBA: BAR=3 offset=00002000
>> > >
>> > > +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
>> > > +                       gpa_t addr, int len)
>> > > +{
>> > > + gpa_t start, end;
>> > > +
>> > > + BUG_ON(adev->msix_mmio_base == 0);
>> > > + start = adev->msix_mmio_base;
>> > > + end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
>> > > +         adev->msix_max_entries_nr;
>> > > + if (addr >= start && addr + len <= end)
>> > > +         return true;
>> > > +
>> > > + return false;
>> > > +}
>> >
>> > Hmm, this check looks wrong to me: there's no guarantee
>> > that guest uses the first N entries in the table.
>> > E.g. it could use a single entry, but only the last one.
>>
>> Please check the PCI spec.
>
>
> This is pretty explicit in the spec: the the last paragraph in the below:
>
> IMPLEMENTATION NOTE
> Handling MSI-X Vector Shortages
>
> Handling MSI-X Vector Shortages
> For the case where fewer vectors are allocated to a function than desired,

You may not have noticed the premise here. Also, checking against "Table
Size" would help, I think.

-- 
regards,
Yang, Sheng

software-
> controlled aliasing as enabled by MSI-X is one approach for handling the situation. For
> example, if a function supports five queues, each with an associated MSI-X table entry, but
> only three vectors are allocated, the function could be designed for software still to configure
> all five table entries, assigning one or more vectors to multiple table entries. Software could
> assign the three vectors {A,B,C} to the five entries as ABCCC, ABBCC, ABCBA, or other
> similar combinations.
>
>
> Alternatively, the function could be designed for software to configure it (using a device-
> specific mechanism) to use only three queues and three MSI-X table entries. Software could
> assign the three vectors {A,B,C} to the five entries as ABC--, A-B-C, A--CB, or other similar
> combinations.
>
>
>
>> --
>> regards
>> Yang, Sheng
>>
>>
>> > > 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
>> > > > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
>> > > >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
>> > > >>
>> > > >> I test kvm with sriov, which the vf driver could not disable msix.
>> > > >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
>> > > >>
>> > > >> then I test xen with sriov, there ara also a lot of vm exits caused by
>> > > >> MSIX mask.
>> > > >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
>> > > >> and domain0 is 60%.
>> > > >>
>> > > >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
>> > > >>
>> > > >> so i think the problem is kvm waste more cpu resource to deal with
>> > > >> MSIX mask. and we can see how xen deal with MSIX mask.
>> > > >>
>> > > >> if this problem sloved, maybe with MSIX enabled, the performace is
>> > > >> better.
>> > > >
>> > > > Please refer to my posted patches for this issue.
>> > > >
>> > > > http://www.spinics.net/lists/kvm/msg44992.html
>> > > >
>> > > > --
>> > > > regards
>> > > > Yang, Sheng
>> > > >
>> > > >> 2010/11/23 Avi Kivity <avi@redhat.com>:
>> > > >> > On 11/23/2010 09:27 AM, lidong chen wrote:
>> > > >> >> can you tell me something about this problem.
>> > > >> >> thanks.
>> > > >> >
>> > > >> > Which problem?
>> > > >> >
>> > > >> > --
>> > > >> > I have a truly marvellous patch that fixes the bug which this
>> > > >> > signature is too narrow to contain.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-02 11:52                   ` Sheng Yang
@ 2010-12-02 12:23                     ` Michael S. Tsirkin
  2010-12-02 14:01                       ` lidong chen
  0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2010-12-02 12:23 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Yang, Sheng, lidong chen, Avi Kivity, Gleb Natapov,
	aliguori@us.ibm.com, rusty@rustcorp.com.au, kvm@vger.kernel.org

On Thu, Dec 02, 2010 at 07:52:00PM +0800, Sheng Yang wrote:
> On Thu, Dec 2, 2010 at 5:49 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Thu, Dec 02, 2010 at 09:13:28AM +0800, Yang, Sheng wrote:
> >> On Wednesday 01 December 2010 22:03:58 Michael S. Tsirkin wrote:
> >> > On Wed, Dec 01, 2010 at 04:41:38PM +0800, lidong chen wrote:
> >> > > I used sr-iov, give each vm 2 vf.
> >> > > after apply the patch, and i found performence is the same.
> >> > >
> >> > > the reason is in function msix_mmio_write, mostly addr is not in mmio
> >> > > range.
> >> > >
> >> > > static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
> >> > > len,
> >> > >
> >> > >                      const void *val)
> >> > >
> >> > > {
> >> > >
> >> > >   struct kvm_assigned_dev_kernel *adev =
> >> > >
> >> > >                   container_of(this, struct kvm_assigned_dev_kernel,
> >> > >
> >> > >                                msix_mmio_dev);
> >> > >
> >> > >   int idx, r = 0;
> >> > >   unsigned long new_val = *(unsigned long *)val;
> >> > >
> >> > >   mutex_lock(&adev->kvm->lock);
> >> > >   if (!msix_mmio_in_range(adev, addr, len)) {
> >> > >
> >> > >           // return here.
> >> > >
> >> > >                  r = -EOPNOTSUPP;
> >> > >
> >> > >           goto out;
> >> > >
> >> > >   }
> >> > >
> >> > > i printk the value:
> >> > > addr             start           end           len
> >> > > F004C00C   F0044000  F0044030     4
> >> > >
> >> > > 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> >> > > 01)
> >> > >
> >> > >   Subsystem: Intel Corporation Unknown device 000c
> >> > >   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> >> > >
> >> > > Stepping- SERR- FastB2B-
> >> > >
> >> > >   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> >> > >
> >> > > <TAbort- <MAbort- >SERR- <PERR-
> >> > >
> >> > >   Latency: 0
> >> > >   Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
> >> > >   Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
> >> > >   Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> >> > >
> >> > >           Vector table: BAR=3 offset=00000000
> >> > >           PBA: BAR=3 offset=00002000
> >> > >
> >> > > 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
> >> > > 01)
> >> > >
> >> > >   Subsystem: Intel Corporation Unknown device 000c
> >> > >   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> >> > >
> >> > > Stepping- SERR- FastB2B-
> >> > >
> >> > >   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> >> > >
> >> > > <TAbort- <MAbort- >SERR- <PERR-
> >> > >
> >> > >   Latency: 0
> >> > >   Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
> >> > >   Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
> >> > >   Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
> >> > >
> >> > >           Vector table: BAR=3 offset=00000000
> >> > >           PBA: BAR=3 offset=00002000
> >> > >
> >> > > +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
> >> > > +                       gpa_t addr, int len)
> >> > > +{
> >> > > + gpa_t start, end;
> >> > > +
> >> > > + BUG_ON(adev->msix_mmio_base == 0);
> >> > > + start = adev->msix_mmio_base;
> >> > > + end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
> >> > > +         adev->msix_max_entries_nr;
> >> > > + if (addr >= start && addr + len <= end)
> >> > > +         return true;
> >> > > +
> >> > > + return false;
> >> > > +}
> >> >
> >> > Hmm, this check looks wrong to me: there's no guarantee
> >> > that guest uses the first N entries in the table.
> >> > E.g. it could use a single entry, but only the last one.
> >>
> >> Please check the PCI spec.
> >
> >
> > This is pretty explicit in the spec: the the last paragraph in the below:
> >
> > IMPLEMENTATION NOTE
> > Handling MSI-X Vector Shortages
> >
> > Handling MSI-X Vector Shortages
> > For the case where fewer vectors are allocated to a function than desired,
> 
> You may not notice the premise here.

I noticed it.

> Also check for "Table Size" would
> help I think.

It would help if msix_max_entries_nr is the MSI-X Table Size N
(encoded in PCI config space as N - 1).
Is that what it is? Maybe add this in a comment.

> -- 
> regards,
> Yang, Sheng
> 
> software-
> > controlled aliasing as enabled by MSI-X is one approach for handling the situation. For
> > example, if a function supports five queues, each with an associated MSI-X table entry, but
> > only three vectors are allocated, the function could be designed for software still to configure
> > all five table entries, assigning one or more vectors to multiple table entries. Software could
> > assign the three vectors {A,B,C} to the five entries as ABCCC, ABBCC, ABCBA, or other
> > similar combinations.
> >
> >
> > Alternatively, the function could be designed for software to configure it (using a device-
> > specific mechanism) to use only three queues and three MSI-X table entries. Software could
> > assign the three vectors {A,B,C} to the five entries as ABC--, A-B-C, A--CB, or other similar
> > combinations.
> >
> >
> >
> >> --
> >> regards
> >> Yang, Sheng
> >>
> >>
> >> > > 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
> >> > > > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
> >> > > >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
> >> > > >>
> >> > > >> I test kvm with sriov, which the vf driver could not disable msix.
> >> > > >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
> >> > > >>
> >> > > >> then I test xen with sriov, there ara also a lot of vm exits caused by
> >> > > >> MSIX mask.
> >> > > >> but the cpu rate of xen and domain0 is less than kvm. cpu rate of xen
> >> > > >> and domain0 is 60%.
> >> > > >>
> >> > > >> without sr-iov, the cpu rate of xen and domain0 is higher than kvm.
> >> > > >>
> >> > > >> so i think the problem is kvm waste more cpu resource to deal with
> >> > > >> MSIX mask. and we can see how xen deal with MSIX mask.
> >> > > >>
> >> > > >> if this problem sloved, maybe with MSIX enabled, the performace is
> >> > > >> better.
> >> > > >
> >> > > > Please refer to my posted patches for this issue.
> >> > > >
> >> > > > http://www.spinics.net/lists/kvm/msg44992.html
> >> > > >
> >> > > > --
> >> > > > regards
> >> > > > Yang, Sheng
> >> > > >
> >> > > >> 2010/11/23 Avi Kivity <avi@redhat.com>:
> >> > > >> > On 11/23/2010 09:27 AM, lidong chen wrote:
> >> > > >> >> can you tell me something about this problem.
> >> > > >> >> thanks.
> >> > > >> >
> >> > > >> > Which problem?
> >> > > >> >
> >> > > >> > --
> >> > > >> > I have a truly marvellous patch that fixes the bug which this
> >> > > >> > signature is too narrow to contain.
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-12-02 12:23                     ` Michael S. Tsirkin
@ 2010-12-02 14:01                       ` lidong chen
  0 siblings, 0 replies; 22+ messages in thread
From: lidong chen @ 2010-12-02 14:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sheng Yang, Yang, Sheng, Avi Kivity, Gleb Natapov,
	aliguori@us.ibm.com, rusty@rustcorp.com.au, kvm@vger.kernel.org

I applied the patch correctly.

The addr is often not in the MMIO range because kvm_io_bus_write tries
the addr against each registered device in turn.
/* kvm_io_bus_write - called under kvm->slots_lock */
int kvm_io_bus_write(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
		     int len, const void *val)
{
	int i;
	struct kvm_io_bus *bus = rcu_dereference(kvm->buses[bus_idx]);
	for (i = 0; i < bus->dev_count; i++)
		if (!kvm_iodevice_write(bus->devs[i], addr, len, val))
			return 0;
	return -EOPNOTSUPP;
}

The test result (cpu rate) before applying the patch:
        hostos    guestos
CPU0     15.85      58.05
CPU1      2.97      70.56
CPU2      3.25      69.00
CPU3      3.31      68.59
CPU4      5.11      47.46
CPU5      4.70      29.28
CPU6      6.00       5.96
CPU7      4.75       2.58

The test result (cpu rate) after applying the patch:
        hostos    guestos
CPU0     11.89      60.92
CPU1      2.18      68.07
CPU2      2.61      66.18
CPU3      2.44      66.46
CPU4      2.67      44.98
CPU5      2.31      26.81
CPU6      1.93       5.65
CPU7      1.79       2.14

The total cpu rate is reduced from 397% to 369%.

I pinned the vcpus as below, and only vcpu0 handles the msix interrupts.
virsh vcpupin brd2vm1sriov 0 0
virsh vcpupin brd2vm1sriov 1 1
virsh vcpupin brd2vm1sriov 2 2
virsh vcpupin brd2vm1sriov 3 3
virsh vcpupin brd2vm1sriov 4 4
virsh vcpupin brd2vm1sriov 5 5
virsh vcpupin brd2vm1sriov 6 6
virsh vcpupin brd2vm1sriov 7 7

I will test virtio_net with msix enabled later, and compare it with
msix disabled, to see which is better.


2010/12/2 Michael S. Tsirkin <mst@redhat.com>:
> On Thu, Dec 02, 2010 at 07:52:00PM +0800, Sheng Yang wrote:
>> On Thu, Dec 2, 2010 at 5:49 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Thu, Dec 02, 2010 at 09:13:28AM +0800, Yang, Sheng wrote:
>> >> On Wednesday 01 December 2010 22:03:58 Michael S. Tsirkin wrote:
>> >> > On Wed, Dec 01, 2010 at 04:41:38PM +0800, lidong chen wrote:
>> >> > > I used sr-iov, give each vm 2 vf.
>> >> > > after apply the patch, and i found performence is the same.
>> >> > >
>> >> > > the reason is in function msix_mmio_write, mostly addr is not in mmio
>> >> > > range.
>> >> > >
>> >> > > static int msix_mmio_write(struct kvm_io_device *this, gpa_t addr, int
>> >> > > len,
>> >> > >
>> >> > >                      const void *val)
>> >> > >
>> >> > > {
>> >> > >
>> >> > >   struct kvm_assigned_dev_kernel *adev =
>> >> > >
>> >> > >                   container_of(this, struct kvm_assigned_dev_kernel,
>> >> > >
>> >> > >                                msix_mmio_dev);
>> >> > >
>> >> > >   int idx, r = 0;
>> >> > >   unsigned long new_val = *(unsigned long *)val;
>> >> > >
>> >> > >   mutex_lock(&adev->kvm->lock);
>> >> > >   if (!msix_mmio_in_range(adev, addr, len)) {
>> >> > >
>> >> > >           // return here.
>> >> > >
>> >> > >                  r = -EOPNOTSUPP;
>> >> > >
>> >> > >           goto out;
>> >> > >
>> >> > >   }
>> >> > >
>> >> > > i printk the value:
>> >> > > addr             start           end           len
>> >> > > F004C00C   F0044000  F0044030     4
>> >> > >
>> >> > > 00:06.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
>> >> > > 01)
>> >> > >
>> >> > >   Subsystem: Intel Corporation Unknown device 000c
>> >> > >   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> >> > >
>> >> > > Stepping- SERR- FastB2B-
>> >> > >
>> >> > >   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> >> > >
>> >> > > <TAbort- <MAbort- >SERR- <PERR-
>> >> > >
>> >> > >   Latency: 0
>> >> > >   Region 0: Memory at f0040000 (32-bit, non-prefetchable) [size=16K]
>> >> > >   Region 3: Memory at f0044000 (32-bit, non-prefetchable) [size=16K]
>> >> > >   Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>> >> > >
>> >> > >           Vector table: BAR=3 offset=00000000
>> >> > >           PBA: BAR=3 offset=00002000
>> >> > >
>> >> > > 00:07.0 Ethernet controller: Intel Corporation Unknown device 10ed (rev
>> >> > > 01)
>> >> > >
>> >> > >   Subsystem: Intel Corporation Unknown device 000c
>> >> > >   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> >> > >
>> >> > > Stepping- SERR- FastB2B-
>> >> > >
>> >> > >   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> >> > >
>> >> > > <TAbort- <MAbort- >SERR- <PERR-
>> >> > >
>> >> > >   Latency: 0
>> >> > >   Region 0: Memory at f0048000 (32-bit, non-prefetchable) [size=16K]
>> >> > >   Region 3: Memory at f004c000 (32-bit, non-prefetchable) [size=16K]
>> >> > >   Capabilities: [40] MSI-X: Enable+ Mask- TabSize=3
>> >> > >
>> >> > >           Vector table: BAR=3 offset=00000000
>> >> > >           PBA: BAR=3 offset=00002000
>> >> > >
>> >> > > +static bool msix_mmio_in_range(struct kvm_assigned_dev_kernel *adev,
>> >> > > +                       gpa_t addr, int len)
>> >> > > +{
>> >> > > + gpa_t start, end;
>> >> > > +
>> >> > > + BUG_ON(adev->msix_mmio_base == 0);
>> >> > > + start = adev->msix_mmio_base;
>> >> > > + end = adev->msix_mmio_base + PCI_MSIX_ENTRY_SIZE *
>> >> > > +         adev->msix_max_entries_nr;
>> >> > > + if (addr >= start && addr + len <= end)
>> >> > > +         return true;
>> >> > > +
>> >> > > + return false;
>> >> > > +}
>> >> >
>> >> > Hmm, this check looks wrong to me: there's no guarantee
>> >> > that guest uses the first N entries in the table.
>> >> > E.g. it could use a single entry, but only the last one.
>> >>
>> >> Please check the PCI spec.
>> >
>> >
>> > This is pretty explicit in the spec: the the last paragraph in the below:
>> >
>> > IMPLEMENTATION NOTE
>> > Handling MSI-X Vector Shortages
>> >
>> > Handling MSI-X Vector Shortages
>> > For the case where fewer vectors are allocated to a function than desired,
>>
>> You may not notice the premise here.
>
> I noticed it.
>
>> Also check for "Table Size" would
>> help I think.
>
> It would help if msix_max_entries_nr was MSIX Table Size N
> (encoded in PCI config space as  N - 1).
> Is this what it is? Maybe add this in some comment.
>
>> --
>> regards,
>> Yang, Sheng
>>
>> software-
>> > controlled aliasing as enabled by MSI-X is one approach for handling the situation. For
>> > example, if a function supports five queues, each with an associated MSI-X table entry, but
>> > only three vectors are allocated, the function could be designed for software still to configure
>> > all five table entries, assigning one or more vectors to multiple table entries. Software could
>> > assign the three vectors {A,B,C} to the five entries as ABCCC, ABBCC, ABCBA, or other
>> > similar combinations.
>> >
>> >
>> > Alternatively, the function could be designed for software to configure it (using a device-
>> > specific mechanism) to use only three queues and three MSI-X table entries. Software could
>> > assign the three vectors {A,B,C} to the five entries as ABC--, A-B-C, A--CB, or other similar
>> > combinations.
>> >
>> >
>> >
>> >> --
>> >> regards
>> >> Yang, Sheng
>> >>
>> >>
>> >> > > 2010/11/30 Yang, Sheng <sheng.yang@intel.com>:
>> >> > > > On Tuesday 30 November 2010 17:10:11 lidong chen wrote:
>> >> > > >> sr-iov also meet this problem, MSIX mask waste a lot of cpu resource.
>> >> > > >>
>> >> > > >> I test kvm with sriov, which the vf driver could not disable msix.
>> >> > > >> so the host os waste a lot of cpu.  cpu rate of host os is 90%.
>> >> > > >>
>> >> > > >> Then I tested xen with sr-iov; there are also a lot of vm exits caused by
>> >> > > >> MSIX mask.
>> >> > > >> But the cpu rate of xen and domain0 is less than kvm's: the cpu rate of xen
>> >> > > >> and domain0 is 60%.
>> >> > > >>
>> >> > > >> Without sr-iov, the cpu rate of xen and domain0 is higher than kvm's.
>> >> > > >>
>> >> > > >> So I think the problem is that kvm wastes more cpu resource dealing with
>> >> > > >> MSIX mask, and we can look at how xen deals with MSIX mask.
>> >> > > >>
>> >> > > >> If this problem is solved, the performance with MSIX enabled may be
>> >> > > >> better.
>> >> > > >
>> >> > > > Please refer to my posted patches for this issue.
>> >> > > >
>> >> > > > http://www.spinics.net/lists/kvm/msg44992.html
>> >> > > >
>> >> > > > --
>> >> > > > regards
>> >> > > > Yang, Sheng
>> >> > > >
>> >> > > >> 2010/11/23 Avi Kivity <avi@redhat.com>:
>> >> > > >> > On 11/23/2010 09:27 AM, lidong chen wrote:
>> >> > > >> >> can you tell me something about this problem.
>> >> > > >> >> thanks.
>> >> > > >> >
>> >> > > >> > Which problem?
>> >> > > >> >
>> >> > > >> > --
>> >> > > >> > I have a truly marvellous patch that fixes the bug which this
>> >> > > >> > signature is too narrow to contain.
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe kvm" in
>> > the body of a message to majordomo@vger.kernel.org
>> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>> >
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Performance test result between virtio_pci MSI-X disable and enable
  2010-11-23  2:53 Performance test result between virtio_pci MSI-X disable and enable lidong chen
  2010-11-23  6:20 ` Avi Kivity
@ 2010-12-22 13:52 ` Michael S. Tsirkin
  1 sibling, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2010-12-22 13:52 UTC (permalink / raw)
  To: lidong chen; +Cc: Gleb Natapov, Avi Kivity, aliguori, rusty, kvm

On Tue, Nov 23, 2010 at 10:53:10AM +0800, lidong chen wrote:
> Test method:
> Send the same traffic load with virtio_pci MSI-X disabled and
> enabled, and compare the cpu rate of the host os.
> I used the same version of the virtio driver, only modifying the msi-x option.
> the host os version is 2.6.32.
> the virtio driver is from rhel6.
> the guest os version is 2.6.16.
> 
> Test result:
> with msi-x disable, the cpu rate of host os is 110%.
> with msi-x enable, the cpu rate of host os is 140%.
> 
> the /proc/interrupt with msi-x disable is below:
>            CPU0       CPU1
>   0:   12326706          0    IO-APIC-edge  timer
>   1:          8          0    IO-APIC-edge  i8042
>   8:          0          0    IO-APIC-edge  rtc
>   9:          0          0   IO-APIC-level  acpi
>  10:    4783008          0   IO-APIC-level  virtio2, virtio3
>  11:    5363828          0   IO-APIC-level  virtio1, virtio4, virtio5
>  12:        104          0    IO-APIC-edge  i8042
> NMI:    2857871    2650796
> LOC:   12324952   12325609
> ERR:          0
> MIS:          0
> 
> the /proc/interrupt with msi-x enable is below:
>           CPU0       CPU1
>  0:    1896802          0    IO-APIC-edge  timer
>  1:          8          0    IO-APIC-edge  i8042
>  4:         14          0    IO-APIC-edge  serial
>  8:          0          0    IO-APIC-edge  rtc
>  9:          0          0   IO-APIC-level  acpi
>  10:          0          0   IO-APIC-level  virtio1, virtio2, virtio5
>  11:          1          0   IO-APIC-level  virtio0, virtio3, virtio4

This one probably means there's a bug: when msix
is enabled there should not be any level interrupts.

>  12:        104          0    IO-APIC-edge  i8042
>  50:          1          0       PCI-MSI-X  virtio2-output
>  58:          0          0       PCI-MSI-X  virtio3-config
>  66:    2046985          0       PCI-MSI-X  virtio3-input
>  74:          2          0       PCI-MSI-X  virtio3-output
>  82:          0          0       PCI-MSI-X  virtio4-config
>  90:        217          0       PCI-MSI-X  virtio4-input
>  98:          0          0       PCI-MSI-X  virtio4-output
> 177:          0          0       PCI-MSI-X  virtio0-config
> 185:     341831          0       PCI-MSI-X  virtio0-input
> 193:          1          0       PCI-MSI-X  virtio0-output
> 201:          0          0       PCI-MSI-X  virtio1-config
> 209:     188747          0       PCI-MSI-X  virtio1-input
> 217:          1          0       PCI-MSI-X  virtio1-output
> 225:          0          0       PCI-MSI-X  virtio2-config
> 233:    2204149          0       PCI-MSI-X  virtio2-input
> NMI:    1455767    1426226
> LOC:    1896099    1896637
> ERR:          0
> MIS:          0

I just noticed that the msi-x case above shows about 4M interrupts
and 1.5M NMIs, but the non-MSI case shows about 10M and 3M.

> Code difference:
> I disable msi-x by modifying the function vp_find_vqs like this:

You can simply supply nvectors=0 in qemu.
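For example (a sketch only; the property on virtio-net-pci is named "vectors", and the rest of the command line is illustrative):

```shell
# Disable MSI-X for a virtio NIC from the qemu side instead of
# patching the guest driver:
qemu-kvm ... \
    -netdev tap,id=net0 \
    -device virtio-net-pci,netdev=net0,vectors=0
```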

> static int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>                       struct virtqueue *vqs[],
>                       vq_callback_t *callbacks[],
>                       const char *names[])
> {
> 
> #if 0
>        int err;
> 
>        /* Try MSI-X with one vector per queue. */
>        err = vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names, true, true);
>        if (!err)
>                return 0;
>        /* Fallback: MSI-X with one vector for config, one shared for queues. */
>        err = vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
>                                 true, false);
>        if (!err)
>                return 0;
>        /* Finally fall back to regular interrupts. */
> #endif
> 
>        return vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
>                                  false, false);
> }
> 
> Conclusion:
> The extra cpu cost with msi-x enabled is caused by the MSIX mask bit:
> older kernels program this bit twice on every interrupt, and each
> write causes an ept violation (a vm exit).
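The mask-bit writes described in the conclusion above target the MSI-X table in the device's MMIO space. A sketch of the layout involved, per the PCI spec (the names here are illustrative, not kernel identifiers): each table entry is 16 bytes, and bit 0 of its Vector Control dword at offset 12 is the per-vector mask bit, so every mask/unmask is an MMIO write that traps out of the guest.

```c
#include <stdint.h>

/* Per the PCI spec, an MSI-X table entry is 16 bytes; the Vector
 * Control dword sits at offset 12 within the entry, and its bit 0 is
 * the per-vector Mask Bit. In a guest, each write to this dword is an
 * MMIO access that traps to the hypervisor (an EPT violation on kvm). */
#define MSIX_ENTRY_SIZE      16u
#define MSIX_VECTOR_CTRL_OFF 12u
#define MSIX_CTRL_MASKBIT    0x1u

/* Byte offset of an entry's Vector Control dword within the table. */
static inline uint32_t msix_mask_offset(uint32_t entry)
{
        return entry * MSIX_ENTRY_SIZE + MSIX_VECTOR_CTRL_OFF;
}
```

Doing this write twice per interrupt means two traps per interrupt, which is consistent with the extra host cpu usage reported above.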

Wait a second, older kernels don't have msix support in virtio,
do they?

> So I think we should add a param to control this: with older guest
> kernels, we should disable MSIX.
> And I think this should be dealt with by qemu.

I would like to see a comparison of msix enabled and disabled
with a guest that supports msix natively.

-- 
MST

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2010-12-22 13:53 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-23  2:53 Performance test result between virtio_pci MSI-X disable and enable lidong chen
2010-11-23  6:20 ` Avi Kivity
2010-11-23  6:42   ` Gleb Natapov
2010-11-23  7:27   ` lidong chen
2010-11-23  7:39     ` Avi Kivity
2010-11-30  9:10       ` lidong chen
2010-11-30  9:24         ` Yang, Sheng
2010-12-01  8:41           ` lidong chen
2010-12-01  8:49             ` Yang, Sheng
2010-12-01  8:54               ` lidong chen
2010-12-01  9:02                 ` Yang, Sheng
2010-12-01  9:29                   ` lidong chen
2010-12-01  9:37                     ` Yang, Sheng
2010-12-01  9:34                   ` Yang, Sheng
2010-12-01  8:56             ` Yang, Sheng
2010-12-01 14:03             ` Michael S. Tsirkin
2010-12-02  1:13               ` Yang, Sheng
2010-12-02  9:49                 ` Michael S. Tsirkin
2010-12-02 11:52                   ` Sheng Yang
2010-12-02 12:23                     ` Michael S. Tsirkin
2010-12-02 14:01                       ` lidong chen
2010-12-22 13:52 ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox