* [PATCH v4 0/3] KVM: Dynamic Halt-Polling
@ 2015-08-27 9:47 Wanpeng Li
[not found] ` <55E221B2.3000702@kieser.ca>
2015-09-01 21:45 ` David Matlack
0 siblings, 2 replies; 17+ messages in thread
From: Wanpeng Li @ 2015-08-27 9:47 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: David Matlack, kvm, linux-kernel, Wanpeng Li
v3 -> v4:
* bring back growing vcpu->halt_poll_ns when an interrupt arrives and shrinking
it when an idle VCPU is detected
v2 -> v3:
* grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or /halt_poll_ns_shrink
* drop the macros and hard-code the numbers in the param definitions
* update the comments "5-7 us"
* remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns time;
vcpu->halt_poll_ns starts at zero
* drop the wrappers
* move the grow/shrink logic before "out:" w/ "if (waited)"
v1 -> v2:
* change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
the module parameter
* use the shrink/grow matrix which is suggested by David
* set halt_poll_ns_max to 2ms
There is a downside to halt_poll_ns: polling still happens for idle VCPUs,
which can waste CPU time. This patchset adds the ability to adjust
halt_poll_ns dynamically, growing halt_poll_ns when an interrupt arrives and
shrinking it when an idle VCPU is detected.
Two new module parameters control the adjustment: halt_poll_ns_grow and
halt_poll_ns_shrink.
Tested w/ a high CPU overcommit ratio and pinned vCPUs; the halt_poll_ns of
always-on halt-poll is the default 500000ns, and the max halt_poll_ns of
dynamic halt-poll is 2ms. Then watch %C0 in the output of the PowerTOP tool.
The test method largely follows David's.
+-----------------+----------------+-------------------+
| | | |
| w/o halt-poll | w/ halt-poll | dynamic halt-poll |
+-----------------+----------------+-------------------+
| | | |
| ~0.9% | ~1.8% | ~1.2% |
+-----------------+----------------+-------------------+
Always-on halt-poll increases CPU usage for idle vCPUs by ~0.9% (1.8% - 0.9%),
while dynamic halt-poll drops that to ~0.3% (1.2% - 0.9%), i.e. it removes
about 67% of the overhead introduced by always-on halt-poll.
Wanpeng Li (3):
KVM: make halt_poll_ns per-VCPU
KVM: dynamic halt_poll_ns adjustment
KVM: trace kvm_halt_poll_ns grow/shrink
include/linux/kvm_host.h | 1 +
include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 50 +++++++++++++++++++++++++++++++++++++++++++---
3 files changed, 78 insertions(+), 3 deletions(-)
--
1.9.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
[not found] ` <55E221B2.3000702@kieser.ca>
@ 2015-08-29 22:21 ` Wanpeng Li
2015-08-29 22:26 ` Peter Kieser
0 siblings, 1 reply; 17+ messages in thread
From: Wanpeng Li @ 2015-08-29 22:21 UTC (permalink / raw)
To: Peter Kieser
Cc: Paolo Bonzini, David Matlack, kvm, linux-kernel@vger.kernel.org
Hi Peter,
On 8/30/15 5:18 AM, Peter Kieser wrote:
> Hi Wanpeng,
>
> Do I need to set any module parameters to use your patch, or should
> halt_poll_ns automatically tune with just your patch series applied?
>
You don't need any module parameters.
Regards,
Wanpeng Li
> Thanks.
>
> On 2015-08-27 2:47 AM, Wanpeng Li wrote:
>> v3 -> v4:
>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>> shrinks
>> when idle VCPU is detected
>>
>> v2 -> v3:
>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>> /halt_poll_ns_shrink
>> * drop the macros and hard coding the numbers in the param definitions
>> * update the comments "5-7 us"
>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>> halt_poll_ns time,
>> vcpu->halt_poll_ns start at zero
>> * drop the wrappers
>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>
>> v1 -> v2:
>> * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
>> the module parameter
>> * use the shrink/grow matrix which is suggested by David
>> * set halt_poll_ns_max to 2ms
>>
>> There is a downside of halt_poll_ns since poll is still happen for idle
>> VCPU which can waste cpu usage. This patchset add the ability to adjust
>> halt_poll_ns dynamically, grows halt_poll_ns if an interrupt arrives and
>> shrinks halt_poll_ns when idle VCPU is detected.
>>
>> There are two new kernel parameters for changing the halt_poll_ns:
>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>
>>
>> Test w/ high cpu overcommit ratio, pin vCPUs, and the halt_poll_ns of
>> halt-poll is the default 500000ns, the max halt_poll_ns of dynamic
>> halt-poll is 2ms. Then watch the %C0 in the dump of Powertop tool.
>> The test method is almost from David.
>>
>> +-----------------+----------------+-------------------+
>> | | | |
>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>> +-----------------+----------------+-------------------+
>> | | | |
>> | ~0.9% | ~1.8% | ~1.2% |
>> +-----------------+----------------+-------------------+
>> The always halt-poll
>> will increase ~0.9% cpu usage for idle vCPUs and the
>> dynamic halt-poll drop it to ~0.3% which means that reduce the 67%
>> overhead
>> introduced by always halt-poll.
>>
>> Wanpeng Li (3):
>> KVM: make halt_poll_ns per-VCPU
>> KVM: dynamic halt_poll_ns adjustment
>> KVM: trace kvm_halt_poll_ns grow/shrink
>>
>> include/linux/kvm_host.h | 1 +
>> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>> virt/kvm/kvm_main.c | 50
>> +++++++++++++++++++++++++++++++++++++++++++---
>> 3 files changed, 78 insertions(+), 3 deletions(-)
>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-08-29 22:21 ` Wanpeng Li
@ 2015-08-29 22:26 ` Peter Kieser
2015-08-29 23:55 ` Wanpeng Li
2015-08-31 7:44 ` Wanpeng Li
0 siblings, 2 replies; 17+ messages in thread
From: Peter Kieser @ 2015-08-29 22:26 UTC (permalink / raw)
To: Wanpeng Li
Cc: Paolo Bonzini, David Matlack, kvm, linux-kernel@vger.kernel.org
Thanks, Wanpeng. I applied this to Linux 3.18 and am seeing much higher CPU
usage (200%) for the qemu 2.4.0 process with a Windows 10 x64 guest. qemu
parameters:
qemu-system-x86_64 -enable-kvm -name arwan-20150704 -S -machine
pc-q35-2.2,accel=kvm,usb=off -cpu
Haswell,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 -m 8192
-realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
7c2fc02d-2798-4fc9-ad04-db5f1af92723 -no-user-config -nodefaults
-chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/arwan-20150704.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime
-no-shutdown -boot strict=on -device
i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device
pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device
nec-usb-xhci,id=usb1,bus=pci.2,addr=0x4 -device
virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x5 -drive
file=/dev/mapper/crypt-arwan-20150704,if=none,id=drive-virtio-disk0,format=raw,cache=none,discard=unmap,aio=native
-device
virtio-blk-pci,scsi=off,bus=pci.2,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
-drive
file=/usr/share/virtio-win/virtio-win.iso,if=none,media=cdrom,id=drive-sata0-0-2,readonly=on,format=raw
-device ide-cd,bus=ide.2,drive=drive-sata0-0-2,id=sata0-0-2,bootindex=1
-netdev tap,fds=31:32:33:34,id=hostnet0,vhost=on,vhostfds=35:36:37:38
-device
virtio-net-pci,guest_csum=off,guest_tso4=off,guest_tso6=off,mq=on,vectors=10,netdev=hostnet0,id=net0,mac=52:54:00:f3:6b:c4,bus=pci.2,addr=0x2
-chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/arwan-20150704.org.qemu.guest_agent.0,server,nowait
-device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
-chardev spicevmc,id=charchannel1,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
-vnc 127.0.0.1:4 -device
qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pcie.0,addr=0x1
-device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x1 -msg timestamp=on
After reverting the patch, qemu shows 17% CPU usage on the host. Thoughts?
-Peter
On 2015-08-29 3:21 PM, Wanpeng Li wrote:
> Hi Peter,
> On 8/30/15 5:18 AM, Peter Kieser wrote:
>> Hi Wanpeng,
>>
>> Do I need to set any module parameters to use your patch, or should
>> halt_poll_ns automatically tune with just your patch series applied?
>>
>
> You don't need any module parameters.
>
> Regards,
> Wanpeng Li
>
>> Thanks.
>>
>> On 2015-08-27 2:47 AM, Wanpeng Li wrote:
>>> v3 -> v4:
>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>>> shrinks
>>> when idle VCPU is detected
>>>
>>> v2 -> v3:
>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>> /halt_poll_ns_shrink
>>> * drop the macros and hard coding the numbers in the param
>>> definitions
>>> * update the comments "5-7 us"
>>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>>> halt_poll_ns time,
>>> vcpu->halt_poll_ns start at zero
>>> * drop the wrappers
>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>
>>> v1 -> v2:
>>> * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
>>> the module parameter
>>> * use the shrink/grow matrix which is suggested by David
>>> * set halt_poll_ns_max to 2ms
>>>
>>> There is a downside of halt_poll_ns since poll is still happen for idle
>>> VCPU which can waste cpu usage. This patchset add the ability to adjust
>>> halt_poll_ns dynamically, grows halt_poll_ns if an interrupt arrives
>>> and
>>> shrinks halt_poll_ns when idle VCPU is detected.
>>>
>>> There are two new kernel parameters for changing the halt_poll_ns:
>>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>>
>>>
>>> Test w/ high cpu overcommit ratio, pin vCPUs, and the halt_poll_ns of
>>> halt-poll is the default 500000ns, the max halt_poll_ns of dynamic
>>> halt-poll is 2ms. Then watch the %C0 in the dump of Powertop tool.
>>> The test method is almost from David.
>>>
>>> +-----------------+----------------+-------------------+
>>> | | | |
>>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>>> +-----------------+----------------+-------------------+
>>> | | | |
>>> | ~0.9% | ~1.8% | ~1.2% |
>>> +-----------------+----------------+-------------------+
>>> The always halt-poll
>>> will increase ~0.9% cpu usage for idle vCPUs and the
>>> dynamic halt-poll drop it to ~0.3% which means that reduce the 67%
>>> overhead
>>> introduced by always halt-poll.
>>>
>>> Wanpeng Li (3):
>>> KVM: make halt_poll_ns per-VCPU
>>> KVM: dynamic halt_poll_ns adjustment
>>> KVM: trace kvm_halt_poll_ns grow/shrink
>>>
>>> include/linux/kvm_host.h | 1 +
>>> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>>> virt/kvm/kvm_main.c | 50
>>> +++++++++++++++++++++++++++++++++++++++++++---
>>> 3 files changed, 78 insertions(+), 3 deletions(-)
>>
>
--
Peter Kieser
604.338.9294 / peter@kieser.ca
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-08-29 22:26 ` Peter Kieser
@ 2015-08-29 23:55 ` Wanpeng Li
2015-08-30 0:13 ` Peter Kieser
2015-08-31 7:44 ` Wanpeng Li
1 sibling, 1 reply; 17+ messages in thread
From: Wanpeng Li @ 2015-08-29 23:55 UTC (permalink / raw)
To: Peter Kieser
Cc: Paolo Bonzini, David Matlack, kvm, linux-kernel@vger.kernel.org
On 8/30/15 6:26 AM, Peter Kieser wrote:
> Thanks, Wanpeng. Applied this to Linux 3.18 and seeing much higher CPU
> usage (200%) for qemu 2.4.0 process on a Windows 10 x64 guest. qemu
> parameters:
Thanks for the report. Is Paolo's patch "kvm: add halt_poll_ns module
parameter" applied on your 3.18? Btw, did you test a Linux guest?
Regards,
Wanpeng Li
>
> qemu-system-x86_64 -enable-kvm -name arwan-20150704 -S -machine
> pc-q35-2.2,accel=kvm,usb=off -cpu
> Haswell,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 -m 8192
> -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
> 7c2fc02d-2798-4fc9-ad04-db5f1af92723 -no-user-config -nodefaults
> -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/arwan-20150704.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime
> -no-shutdown -boot strict=on -device
> i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device
> pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device
> nec-usb-xhci,id=usb1,bus=pci.2,addr=0x4 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x5 -drive
> file=/dev/mapper/crypt-arwan-20150704,if=none,id=drive-virtio-disk0,format=raw,cache=none,discard=unmap,aio=native
> -device
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
> -drive
> file=/usr/share/virtio-win/virtio-win.iso,if=none,media=cdrom,id=drive-sata0-0-2,readonly=on,format=raw
> -device
> ide-cd,bus=ide.2,drive=drive-sata0-0-2,id=sata0-0-2,bootindex=1
> -netdev tap,fds=31:32:33:34,id=hostnet0,vhost=on,vhostfds=35:36:37:38
> -device
> virtio-net-pci,guest_csum=off,guest_tso4=off,guest_tso6=off,mq=on,vectors=10,netdev=hostnet0,id=net0,mac=52:54:00:f3:6b:c4,bus=pci.2,addr=0x2
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/arwan-20150704.org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel1,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
> -vnc 127.0.0.1:4 -device
> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pcie.0,addr=0x1
> -device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x1 -msg
> timestamp=on
>
> I revert patch, qemu shows 17% CPU usage on host. Thoughts?
>
> -Peter
>
> On 2015-08-29 3:21 PM, Wanpeng Li wrote:
>> Hi Peter,
>> On 8/30/15 5:18 AM, Peter Kieser wrote:
>>> Hi Wanpeng,
>>>
>>> Do I need to set any module parameters to use your patch, or should
>>> halt_poll_ns automatically tune with just your patch series applied?
>>>
>>
>> You don't need any module parameters.
>>
>> Regards,
>> Wanpeng Li
>>
>>> Thanks.
>>>
>>> On 2015-08-27 2:47 AM, Wanpeng Li wrote:
>>>> v3 -> v4:
>>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>>>> shrinks
>>>> when idle VCPU is detected
>>>>
>>>> v2 -> v3:
>>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>> /halt_poll_ns_shrink
>>>> * drop the macros and hard coding the numbers in the param
>>>> definitions
>>>> * update the comments "5-7 us"
>>>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>>>> halt_poll_ns time,
>>>> vcpu->halt_poll_ns start at zero
>>>> * drop the wrappers
>>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>>
>>>> v1 -> v2:
>>>> * change kvm_vcpu_block to read halt_poll_ns from the vcpu
>>>> instead of
>>>> the module parameter
>>>> * use the shrink/grow matrix which is suggested by David
>>>> * set halt_poll_ns_max to 2ms
>>>>
>>>> There is a downside of halt_poll_ns since poll is still happen for
>>>> idle
>>>> VCPU which can waste cpu usage. This patchset add the ability to
>>>> adjust
>>>> halt_poll_ns dynamically, grows halt_poll_ns if an interrupt
>>>> arrives and
>>>> shrinks halt_poll_ns when idle VCPU is detected.
>>>>
>>>> There are two new kernel parameters for changing the halt_poll_ns:
>>>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>>>
>>>>
>>>> Test w/ high cpu overcommit ratio, pin vCPUs, and the halt_poll_ns of
>>>> halt-poll is the default 500000ns, the max halt_poll_ns of dynamic
>>>> halt-poll is 2ms. Then watch the %C0 in the dump of Powertop tool.
>>>> The test method is almost from David.
>>>>
>>>> +-----------------+----------------+-------------------+
>>>> | | | |
>>>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>>>> +-----------------+----------------+-------------------+
>>>> | | | |
>>>> | ~0.9% | ~1.8% | ~1.2% |
>>>> +-----------------+----------------+-------------------+
>>>> The always halt-poll
>>>> will increase ~0.9% cpu usage for idle vCPUs and the
>>>> dynamic halt-poll drop it to ~0.3% which means that reduce the 67%
>>>> overhead
>>>> introduced by always halt-poll.
>>>>
>>>> Wanpeng Li (3):
>>>> KVM: make halt_poll_ns per-VCPU
>>>> KVM: dynamic halt_poll_ns adjustment
>>>> KVM: trace kvm_halt_poll_ns grow/shrink
>>>>
>>>> include/linux/kvm_host.h | 1 +
>>>> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>>>> virt/kvm/kvm_main.c | 50
>>>> +++++++++++++++++++++++++++++++++++++++++++---
>>>> 3 files changed, 78 insertions(+), 3 deletions(-)
>>>
>>
>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-08-29 23:55 ` Wanpeng Li
@ 2015-08-30 0:13 ` Peter Kieser
2015-08-30 0:21 ` Wanpeng Li
0 siblings, 1 reply; 17+ messages in thread
From: Peter Kieser @ 2015-08-30 0:13 UTC (permalink / raw)
To: Wanpeng Li
Cc: Paolo Bonzini, David Matlack, kvm, linux-kernel@vger.kernel.org
On 2015-08-29 4:55 PM, Wanpeng Li wrote:
> On 8/30/15 6:26 AM, Peter Kieser wrote:
>> Thanks, Wanpeng. Applied this to Linux 3.18 and seeing much higher
>> CPU usage (200%) for qemu 2.4.0 process on a Windows 10 x64 guest.
>> qemu parameters:
>
> Thanks for the report. If Paolo's patch "kvm: add halt_poll_ns module
> parameter" is applied on your 3.18? Btw, do you test the linux guest?
No high CPU usage on Linux guests. The following patches are applied (in
order):
* kvm: add halt_poll_ns module parameter
* KVM: make halt_poll_ns static
* KVM: Dynamic Halt-Polling v4
-Peter
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-08-30 0:13 ` Peter Kieser
@ 2015-08-30 0:21 ` Wanpeng Li
0 siblings, 0 replies; 17+ messages in thread
From: Wanpeng Li @ 2015-08-30 0:21 UTC (permalink / raw)
To: Peter Kieser
Cc: Paolo Bonzini, David Matlack, kvm, linux-kernel@vger.kernel.org
On 8/30/15 8:13 AM, Peter Kieser wrote:
> On 2015-08-29 4:55 PM, Wanpeng Li wrote:
>> On 8/30/15 6:26 AM, Peter Kieser wrote:
>>> Thanks, Wanpeng. Applied this to Linux 3.18 and seeing much higher
>>> CPU usage (200%) for qemu 2.4.0 process on a Windows 10 x64 guest.
>>> qemu parameters:
>>
>> Thanks for the report. If Paolo's patch "kvm: add halt_poll_ns module
>> parameter" is applied on your 3.18? Btw, do you test the linux guest?
>
> No high CPU usage on Linux guests.
What's the difference w/ and w/o the patchset?
> Following patch series are applied (in order):
>
> * kvm: add halt_poll_ns module parameter
> * KVM: make halt_poll_ns static
> * KVM: Dynamic Halt-Polling v4
>
I will find a Windows 10 x64 guest tomorrow to figure out what happens. Did
you test other Windows guests (like Win7)? Btw, could you test v3 dynamic
halt-polling (against the Windows 10 guest), where David suggested grow/shrink
logic different from v4's? Many thanks for your time, Peter! ;-)
Regards,
Wanpeng Li
> -Peter
>
>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-08-29 22:26 ` Peter Kieser
2015-08-29 23:55 ` Wanpeng Li
@ 2015-08-31 7:44 ` Wanpeng Li
2015-08-31 7:47 ` Wanpeng Li
1 sibling, 1 reply; 17+ messages in thread
From: Wanpeng Li @ 2015-08-31 7:44 UTC (permalink / raw)
To: Peter Kieser
Cc: Paolo Bonzini, David Matlack, kvm, linux-kernel@vger.kernel.org
On 8/30/15 6:26 AM, Peter Kieser wrote:
> Thanks, Wanpeng. Applied this to Linux 3.18 and seeing much higher CPU
> usage (200%) for qemu 2.4.0 process on a Windows 10 x64 guest. qemu
> parameters:
Interesting. I tested this against the latest kvm tree and stable qemu 2.0.0,
with 4 vCPUs pinned to pCPU0 (the other pCPUs are offline, to make %C0 easy to
observe and to avoid vCPU scheduling overhead). Ignoring the fluctuation, here
is the most common %C0 result against the Windows 10 x86 guest:
+-----------------+----------------+-----------------------+
| | | |
| w/o halt-poll | w/ halt-poll | dynamic(v4) halt-poll |
+-----------------+----------------+-----------------------+
| | | |
| ~2.1% | ~3.0% | ~2.4% |
+-----------------+----------------+-----------------------+
Regards,
Wanpeng Li
>
> qemu-system-x86_64 -enable-kvm -name arwan-20150704 -S -machine
> pc-q35-2.2,accel=kvm,usb=off -cpu
> Haswell,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 -m 8192
> -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
> 7c2fc02d-2798-4fc9-ad04-db5f1af92723 -no-user-config -nodefaults
> -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/arwan-20150704.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime
> -no-shutdown -boot strict=on -device
> i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device
> pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device
> nec-usb-xhci,id=usb1,bus=pci.2,addr=0x4 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x5 -drive
> file=/dev/mapper/crypt-arwan-20150704,if=none,id=drive-virtio-disk0,format=raw,cache=none,discard=unmap,aio=native
> -device
> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
> -drive
> file=/usr/share/virtio-win/virtio-win.iso,if=none,media=cdrom,id=drive-sata0-0-2,readonly=on,format=raw
> -device
> ide-cd,bus=ide.2,drive=drive-sata0-0-2,id=sata0-0-2,bootindex=1
> -netdev tap,fds=31:32:33:34,id=hostnet0,vhost=on,vhostfds=35:36:37:38
> -device
> virtio-net-pci,guest_csum=off,guest_tso4=off,guest_tso6=off,mq=on,vectors=10,netdev=hostnet0,id=net0,mac=52:54:00:f3:6b:c4,bus=pci.2,addr=0x2
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/arwan-20150704.org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel1,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
> -vnc 127.0.0.1:4 -device
> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pcie.0,addr=0x1
> -device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x1 -msg
> timestamp=on
>
> I revert patch, qemu shows 17% CPU usage on host. Thoughts?
>
> -Peter
>
> On 2015-08-29 3:21 PM, Wanpeng Li wrote:
>> Hi Peter,
>> On 8/30/15 5:18 AM, Peter Kieser wrote:
>>> Hi Wanpeng,
>>>
>>> Do I need to set any module parameters to use your patch, or should
>>> halt_poll_ns automatically tune with just your patch series applied?
>>>
>>
>> You don't need any module parameters.
>>
>> Regards,
>> Wanpeng Li
>>
>>> Thanks.
>>>
>>> On 2015-08-27 2:47 AM, Wanpeng Li wrote:
>>>> v3 -> v4:
>>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>>>> shrinks
>>>> when idle VCPU is detected
>>>>
>>>> v2 -> v3:
>>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>> /halt_poll_ns_shrink
>>>> * drop the macros and hard coding the numbers in the param
>>>> definitions
>>>> * update the comments "5-7 us"
>>>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>>>> halt_poll_ns time,
>>>> vcpu->halt_poll_ns start at zero
>>>> * drop the wrappers
>>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>>
>>>> v1 -> v2:
>>>> * change kvm_vcpu_block to read halt_poll_ns from the vcpu
>>>> instead of
>>>> the module parameter
>>>> * use the shrink/grow matrix which is suggested by David
>>>> * set halt_poll_ns_max to 2ms
>>>>
>>>> There is a downside of halt_poll_ns since poll is still happen for
>>>> idle
>>>> VCPU which can waste cpu usage. This patchset add the ability to
>>>> adjust
>>>> halt_poll_ns dynamically, grows halt_poll_ns if an interrupt
>>>> arrives and
>>>> shrinks halt_poll_ns when idle VCPU is detected.
>>>>
>>>> There are two new kernel parameters for changing the halt_poll_ns:
>>>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>>>
>>>>
>>>> Test w/ high cpu overcommit ratio, pin vCPUs, and the halt_poll_ns of
>>>> halt-poll is the default 500000ns, the max halt_poll_ns of dynamic
>>>> halt-poll is 2ms. Then watch the %C0 in the dump of Powertop tool.
>>>> The test method is almost from David.
>>>>
>>>> +-----------------+----------------+-------------------+
>>>> | | | |
>>>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>>>> +-----------------+----------------+-------------------+
>>>> | | | |
>>>> | ~0.9% | ~1.8% | ~1.2% |
>>>> +-----------------+----------------+-------------------+
>>>> The always halt-poll
>>>> will increase ~0.9% cpu usage for idle vCPUs and the
>>>> dynamic halt-poll drop it to ~0.3% which means that reduce the 67%
>>>> overhead
>>>> introduced by always halt-poll.
>>>>
>>>> Wanpeng Li (3):
>>>> KVM: make halt_poll_ns per-VCPU
>>>> KVM: dynamic halt_poll_ns adjustment
>>>> KVM: trace kvm_halt_poll_ns grow/shrink
>>>>
>>>> include/linux/kvm_host.h | 1 +
>>>> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>>>> virt/kvm/kvm_main.c | 50
>>>> +++++++++++++++++++++++++++++++++++++++++++---
>>>> 3 files changed, 78 insertions(+), 3 deletions(-)
>>>
>>
>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-08-31 7:44 ` Wanpeng Li
@ 2015-08-31 7:47 ` Wanpeng Li
0 siblings, 0 replies; 17+ messages in thread
From: Wanpeng Li @ 2015-08-31 7:47 UTC (permalink / raw)
To: Peter Kieser
Cc: Paolo Bonzini, David Matlack, kvm, linux-kernel@vger.kernel.org
On 8/31/15 3:44 PM, Wanpeng Li wrote:
> On 8/30/15 6:26 AM, Peter Kieser wrote:
>> Thanks, Wanpeng. Applied this to Linux 3.18 and seeing much higher CPU
>> usage (200%) for qemu 2.4.0 process on a Windows 10 x64 guest. qemu
>> parameters:
>
> Interesting. I test this against latest kvm tree and stable qemu
> 2.0.0, 4 vCPUs on pCPU0(other pCPUs are offline to easy observe %C0
> and to avoid vCPUs schedule overhead influence). I just ignore the
> fluctuation and post the most common result of %C0 against the Windows
> 10 x86 guest.
s/x86/x64
>
> +-----------------+----------------+-----------------------+
> | | | |
> | w/o halt-poll | w/ halt-poll | dynamic(v4) halt-poll |
> +-----------------+----------------+-----------------------+
> | | | |
> | ~2.1% | ~3.0% | ~2.4% |
> +-----------------+----------------+-----------------------+
>
> Regards,
> Wanpeng Li
>
>>
>> qemu-system-x86_64 -enable-kvm -name arwan-20150704 -S -machine
>> pc-q35-2.2,accel=kvm,usb=off -cpu
>> Haswell,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 -m 8192
>> -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid
>> 7c2fc02d-2798-4fc9-ad04-db5f1af92723 -no-user-config -nodefaults
>> -chardev
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/arwan-20150704.monitor,server,nowait
>>
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime
>> -no-shutdown -boot strict=on -device
>> i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device
>> pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device
>> nec-usb-xhci,id=usb1,bus=pci.2,addr=0x4 -device
>> virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x5 -drive
>> file=/dev/mapper/crypt-arwan-20150704,if=none,id=drive-virtio-disk0,format=raw,cache=none,discard=unmap,aio=native
>>
>> -device
>> virtio-blk-pci,scsi=off,bus=pci.2,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
>>
>> -drive
>> file=/usr/share/virtio-win/virtio-win.iso,if=none,media=cdrom,id=drive-sata0-0-2,readonly=on,format=raw
>>
>> -device
>> ide-cd,bus=ide.2,drive=drive-sata0-0-2,id=sata0-0-2,bootindex=1
>> -netdev tap,fds=31:32:33:34,id=hostnet0,vhost=on,vhostfds=35:36:37:38
>> -device
>> virtio-net-pci,guest_csum=off,guest_tso4=off,guest_tso6=off,mq=on,vectors=10,netdev=hostnet0,id=net0,mac=52:54:00:f3:6b:c4,bus=pci.2,addr=0x2
>>
>> -chardev
>> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/arwan-20150704.org.qemu.guest_agent.0,server,nowait
>>
>> -device
>> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
>>
>> -chardev spicevmc,id=charchannel1,name=vdagent -device
>> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
>>
>> -vnc 127.0.0.1:4 -device
>> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pcie.0,addr=0x1
>>
>> -device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x1 -msg
>> timestamp=on
>>
>> I revert patch, qemu shows 17% CPU usage on host. Thoughts?
>>
>> -Peter
>>
>> On 2015-08-29 3:21 PM, Wanpeng Li wrote:
>>> Hi Peter,
>>> On 8/30/15 5:18 AM, Peter Kieser wrote:
>>>> Hi Wanpeng,
>>>>
>>>> Do I need to set any module parameters to use your patch, or should
>>>> halt_poll_ns automatically tune with just your patch series applied?
>>>>
>>>
>>> You don't need any module parameters.
>>>
>>> Regards,
>>> Wanpeng Li
>>>
>>>> Thanks.
>>>>
>>>> On 2015-08-27 2:47 AM, Wanpeng Li wrote:
>>>>> v3 -> v4:
>>>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>>>>> shrinks
>>>>> when idle VCPU is detected
>>>>>
>>>>> v2 -> v3:
>>>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>>> /halt_poll_ns_shrink
>>>>> * drop the macros and hard coding the numbers in the param
>>>>> definitions
>>>>> * update the comments "5-7 us"
>>>>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>>>>> halt_poll_ns time,
>>>>> vcpu->halt_poll_ns start at zero
>>>>> * drop the wrappers
>>>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>>>
>>>>> v1 -> v2:
>>>>> * change kvm_vcpu_block to read halt_poll_ns from the vcpu
>>>>> instead of
>>>>> the module parameter
>>>>> * use the shrink/grow matrix which is suggested by David
>>>>> * set halt_poll_ns_max to 2ms
>>>>>
>>>>> There is a downside to halt_poll_ns: polling still happens for an
>>>>> idle VCPU, which can waste cpu cycles. This patchset adds the
>>>>> ability to adjust halt_poll_ns dynamically: it grows halt_poll_ns
>>>>> when an interrupt arrives and shrinks it when an idle VCPU is
>>>>> detected.
>>>>>
>>>>> There are two new kernel parameters for changing the halt_poll_ns:
>>>>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>>>>
>>>>>
>>>>> Tested w/ a high cpu overcommit ratio and pinned vCPUs; the
>>>>> halt_poll_ns of always halt-poll is the default 500000 ns and the
>>>>> max halt_poll_ns of dynamic halt-poll is 2 ms. Then watch %C0 in
>>>>> the output of the PowerTop tool. The test method largely follows
>>>>> David's.
>>>>>
>>>>> +-----------------+----------------+-------------------+
>>>>> | | | |
>>>>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>>>>> +-----------------+----------------+-------------------+
>>>>> | | | |
>>>>> | ~0.9% | ~1.8% | ~1.2% |
>>>>> +-----------------+----------------+-------------------+
>>>>> Always halt-poll increases cpu usage by ~0.9% for idle vCPUs,
>>>>> while dynamic halt-poll drops that to ~0.3%, removing 67% of the
>>>>> overhead introduced by always halt-poll.
>>>>>
>>>>> Wanpeng Li (3):
>>>>> KVM: make halt_poll_ns per-VCPU
>>>>> KVM: dynamic halt_poll_ns adjustment
>>>>> KVM: trace kvm_halt_poll_ns grow/shrink
>>>>>
>>>>> include/linux/kvm_host.h | 1 +
>>>>> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>>>>> virt/kvm/kvm_main.c | 50
>>>>> +++++++++++++++++++++++++++++++++++++++++++---
>>>>> 3 files changed, 78 insertions(+), 3 deletions(-)
>>>>
>>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-08-27 9:47 [PATCH v4 0/3] KVM: Dynamic Halt-Polling Wanpeng Li
[not found] ` <55E221B2.3000702@kieser.ca>
@ 2015-09-01 21:45 ` David Matlack
2015-09-01 22:30 ` Wanpeng Li
1 sibling, 1 reply; 17+ messages in thread
From: David Matlack @ 2015-09-01 21:45 UTC (permalink / raw)
To: Wanpeng Li; +Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org
On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
> v3 -> v4:
> * bring back grow vcpu->halt_poll_ns when interrupt arrives and shrinks
> when idle VCPU is detected
>
> v2 -> v3:
> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or /halt_poll_ns_shrink
> * drop the macros and hard coding the numbers in the param definitions
> * update the comments "5-7 us"
> * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns time,
> vcpu->halt_poll_ns start at zero
> * drop the wrappers
> * move the grow/shrink logic before "out:" w/ "if (waited)"
I posted a patchset which adds dynamic poll toggling (on/off switch). I think
this gives you a good place to build your dynamic growth patch on top. The
toggling patch has close to zero overhead for idle VMs and, for VMs doing
message passing, performance equivalent to always-poll. It's a patch that has
been in my queue for a few weeks, but I just haven't had the time to send it
out. We can
win even more with your patchset by only polling as much as we need (via
dynamic growth/shrink). It also gives us a better place to stand for choosing
a default for halt_poll_ns. (We can run experiments and see how high
vcpu->halt_poll_ns tends to grow.)
The reason I posted a separate patch for toggling is because it adds timers
to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can get
called multiple times for one halt). To do dynamic poll adjustment correctly,
we have to time the length of each halt. Otherwise we hit some bad edge cases:
v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns grew every
time we had a long halt. So idle VMs looked like: 0 us -> 500 us -> 1 ms ->
2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay at 0 when
the halts are long.
v4: v4 fixed the idle overhead problem but broke dynamic growth for message
passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns would grow.
That means vcpu->halt_poll_ns will always be maxed out, even when the halt
time is much less than the max.
I think we can fix both edge cases if we make grow/shrink decisions based on
the length of kvm_vcpu_block rather than the arrival of a guest interrupt
during polling.
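In userspace C, a length-based decision of that shape might look roughly like the sketch below. The constant names and values here are illustrative assumptions for discussion, not taken from any posted patch:

```c
/* Illustrative constants -- assumptions for this sketch only. */
#define POLL_NS_CEILING 500000ULL   /* 500 us cap, as suggested below */
#define POLL_NS_BASE     10000ULL   /* 10 us first step up from zero  */
#define POLL_NS_GROW         2ULL   /* grow multiplier                */
#define POLL_NS_SHRINK       2ULL   /* shrink divisor                 */

/*
 * Pick the next poll window from the measured length of the whole
 * kvm_vcpu_block() call, not from whether an interrupt happened to
 * arrive while polling.  Long halts (idle VM) shrink the window toward
 * zero; halts that are short but longer than the current window
 * (message passing) grow it, so the window tracks real halt lengths.
 */
unsigned long long next_poll_ns(unsigned long long cur_ns,
                                unsigned long long block_ns)
{
    if (block_ns > POLL_NS_CEILING)      /* long halt: back off */
        return cur_ns / POLL_NS_SHRINK;

    if (block_ns > cur_ns) {             /* short halt the window missed */
        unsigned long long next = cur_ns ? cur_ns * POLL_NS_GROW
                                         : POLL_NS_BASE;
        return next < POLL_NS_CEILING ? next : POLL_NS_CEILING;
    }

    return cur_ns;                       /* window already covers it */
}
```

With this shape, an idle VM walks its window down (500 us -> 250 us -> ...) instead of pinning at the max, and a message-passing VM settles near its actual halt length instead of always saturating.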
Some thoughts for dynamic growth:
* Given Windows 10 timer tick (1 ms), let's set the maximum poll time to
less than 1ms. 200 us has been a good value for always-poll. We can
probably go a bit higher once we have your patch. Maybe 500 us?
* The base case of dynamic growth (the first grow() after being at 0) should
be small. 500 us is too big. When I run TCP_RR in my guest I see poll times
of < 10 us. TCP_RR is on the lower-end of message passing workload latency,
so 10 us would be a good base case.
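Under those two suggested numbers (10 us base case, 500 us ceiling) and a 2x grow multiplier, the ramp from zero is short. A hypothetical grow(), just to show the arithmetic (values are nanoseconds and are assumptions, not from a merged patch):

```c
/* Hypothetical grow() combining the numbers suggested above: 10 us base
 * case, 2x multiplier, 500 us ceiling. */
unsigned long grow_poll_ns(unsigned long ns)
{
    unsigned long next = ns ? ns * 2 : 10000;
    return next > 500000 ? 500000 : next;
}

/* Consecutive short halts needed to ramp from 0 to the ceiling:
 * 0 -> 10us -> 20 -> 40 -> 80 -> 160 -> 320 -> 500us (capped). */
int grows_to_ceiling(void)
{
    unsigned long ns = 0;
    int steps = 0;

    while (ns < 500000) {
        ns = grow_poll_ns(ns);
        steps++;
    }
    return steps;
}
```

So a TCP_RR-style workload pays only a handful of partially-polled halts before the window saturates, while still staying well under a 1 ms Windows timer tick.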
>
> v1 -> v2:
> * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
> the module parameter
> * use the shrink/grow matrix which is suggested by David
> * set halt_poll_ns_max to 2ms
>
> There is a downside to halt_poll_ns: polling still happens for an idle
> VCPU, which can waste cpu cycles. This patchset adds the ability to
> adjust halt_poll_ns dynamically: it grows halt_poll_ns when an interrupt
> arrives and shrinks it when an idle VCPU is detected.
>
> There are two new kernel parameters for changing the halt_poll_ns:
> halt_poll_ns_grow and halt_poll_ns_shrink.
>
>
> Tested w/ a high cpu overcommit ratio and pinned vCPUs; the halt_poll_ns
> of always halt-poll is the default 500000 ns and the max halt_poll_ns of
> dynamic halt-poll is 2 ms. Then watch %C0 in the output of the PowerTop
> tool. The test method largely follows David's.
>
> +-----------------+----------------+-------------------+
> | | | |
> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
> +-----------------+----------------+-------------------+
> | | | |
> | ~0.9% | ~1.8% | ~1.2% |
> +-----------------+----------------+-------------------+
>
> Always halt-poll increases cpu usage by ~0.9% for idle vCPUs, while
> dynamic halt-poll drops that to ~0.3%, removing 67% of the overhead
> introduced by always halt-poll.
>
> Wanpeng Li (3):
> KVM: make halt_poll_ns per-VCPU
> KVM: dynamic halt_poll_ns adjustment
> KVM: trace kvm_halt_poll_ns grow/shrink
>
> include/linux/kvm_host.h | 1 +
> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
> virt/kvm/kvm_main.c | 50 +++++++++++++++++++++++++++++++++++++++++++---
> 3 files changed, 78 insertions(+), 3 deletions(-)
> --
> 1.9.1
>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-09-01 21:45 ` David Matlack
@ 2015-09-01 22:30 ` Wanpeng Li
2015-09-01 22:34 ` David Matlack
0 siblings, 1 reply; 17+ messages in thread
From: Wanpeng Li @ 2015-09-01 22:30 UTC (permalink / raw)
To: David Matlack; +Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org
On 9/2/15 5:45 AM, David Matlack wrote:
> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
>> v3 -> v4:
>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and shrinks
>> when idle VCPU is detected
>>
>> v2 -> v3:
>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or /halt_poll_ns_shrink
>> * drop the macros and hard coding the numbers in the param definitions
>> * update the comments "5-7 us"
>> * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns time,
>> vcpu->halt_poll_ns start at zero
>> * drop the wrappers
>> * move the grow/shrink logic before "out:" w/ "if (waited)"
> I posted a patchset which adds dynamic poll toggling (on/off switch). I think
> this gives you a good place to build your dynamic growth patch on top. The
> toggling patch has close to zero overhead for idle VMs and equivalent
> performance VMs doing message passing as always-poll. It's a patch that's been
> in my queue for a few weeks but just haven't had the time to send out. We can
> win even more with your patchset by only polling as much as we need (via
> dynamic growth/shrink). It also gives us a better place to stand for choosing
> a default for halt_poll_ns. (We can run experiments and see how high
> vcpu->halt_poll_ns tends to grow.)
>
> The reason I posted a separate patch for toggling is because it adds timers
> to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can get
> called multiple times for one halt). To do dynamic poll adjustment correctly,
> we have to time the length of each halt. Otherwise we hit some bad edge cases:
>
> v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns grew every
> time we had a long halt. So idle VMs looked like: 0 us -> 500 us -> 1 ms ->
> 2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay at 0 when
> the halts are long.
>
> v4: v4 fixed the idle overhead problem but broke dynamic growth for message
> passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns would grow.
> That means vcpu->halt_poll_ns will always be maxed out, even when the halt
> time is much less than the max.
>
> I think we can fix both edge cases if we make grow/shrink decisions based on
> the length of kvm_vcpu_block rather than the arrival of a guest interrupt
> during polling.
>
> Some thoughts for dynamic growth:
> * Given Windows 10 timer tick (1 ms), let's set the maximum poll time to
> less than 1ms. 200 us has been a good value for always-poll. We can
> probably go a bit higher once we have your patch. Maybe 500 us?
>
> * The base case of dynamic growth (the first grow() after being at 0) should
> be small. 500 us is too big. When I run TCP_RR in my guest I see poll times
> of < 10 us. TCP_RR is on the lower-end of message passing workload latency,
> so 10 us would be a good base case.
How to get your TCP_RR benchmark?
Regards,
Wanpeng Li
>> v1 -> v2:
>> * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
>> the module parameter
>> * use the shrink/grow matrix which is suggested by David
>> * set halt_poll_ns_max to 2ms
>>
>> There is a downside to halt_poll_ns: polling still happens for an idle
>> VCPU, which can waste cpu cycles. This patchset adds the ability to
>> adjust halt_poll_ns dynamically: it grows halt_poll_ns when an interrupt
>> arrives and shrinks it when an idle VCPU is detected.
>>
>> There are two new kernel parameters for changing the halt_poll_ns:
>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>
>>
>> Tested w/ a high cpu overcommit ratio and pinned vCPUs; the halt_poll_ns
>> of always halt-poll is the default 500000 ns and the max halt_poll_ns of
>> dynamic halt-poll is 2 ms. Then watch %C0 in the output of the PowerTop
>> tool. The test method largely follows David's.
>>
>> +-----------------+----------------+-------------------+
>> | | | |
>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>> +-----------------+----------------+-------------------+
>> | | | |
>> | ~0.9% | ~1.8% | ~1.2% |
>> +-----------------+----------------+-------------------+
>>
>> Always halt-poll increases cpu usage by ~0.9% for idle vCPUs, while
>> dynamic halt-poll drops that to ~0.3%, removing 67% of the overhead
>> introduced by always halt-poll.
>>
>> Wanpeng Li (3):
>> KVM: make halt_poll_ns per-VCPU
>> KVM: dynamic halt_poll_ns adjustment
>> KVM: trace kvm_halt_poll_ns grow/shrink
>>
>> include/linux/kvm_host.h | 1 +
>> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>> virt/kvm/kvm_main.c | 50 +++++++++++++++++++++++++++++++++++++++++++---
>> 3 files changed, 78 insertions(+), 3 deletions(-)
>> --
>> 1.9.1
>>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-09-01 22:30 ` Wanpeng Li
@ 2015-09-01 22:34 ` David Matlack
2015-09-01 22:58 ` Wanpeng Li
0 siblings, 1 reply; 17+ messages in thread
From: David Matlack @ 2015-09-01 22:34 UTC (permalink / raw)
To: Wanpeng Li; +Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org
On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
> On 9/2/15 5:45 AM, David Matlack wrote:
>>
>> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@hotmail.com>
>> wrote:
>>>
>>> v3 -> v4:
>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and shrinks
>>> when idle VCPU is detected
>>>
>>> v2 -> v3:
>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>> /halt_poll_ns_shrink
>>> * drop the macros and hard coding the numbers in the param definitions
>>> * update the comments "5-7 us"
>>> * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns
>>> time,
>>> vcpu->halt_poll_ns start at zero
>>> * drop the wrappers
>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>
>> I posted a patchset which adds dynamic poll toggling (on/off switch). I
>> think
>> this gives you a good place to build your dynamic growth patch on top. The
>> toggling patch has close to zero overhead for idle VMs and equivalent
>> performance VMs doing message passing as always-poll. It's a patch that's
>> been
>> in my queue for a few weeks but just haven't had the time to send out. We
>> can
>> win even more with your patchset by only polling as much as we need (via
>> dynamic growth/shrink). It also gives us a better place to stand for
>> choosing
>> a default for halt_poll_ns. (We can run experiments and see how high
>> vcpu->halt_poll_ns tends to grow.)
>>
>> The reason I posted a separate patch for toggling is because it adds
>> timers
>> to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can get
>> called multiple times for one halt). To do dynamic poll adjustment
>> correctly,
>> we have to time the length of each halt. Otherwise we hit some bad edge
>> cases:
>>
>> v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns grew
>> every
>> time we had a long halt. So idle VMs looked like: 0 us -> 500 us -> 1
>> ms ->
>> 2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay at 0
>> when
>> the halts are long.
>>
>> v4: v4 fixed the idle overhead problem but broke dynamic growth for
>> message
>> passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns would
>> grow.
>> That means vcpu->halt_poll_ns will always be maxed out, even when the
>> halt
>> time is much less than the max.
>>
>> I think we can fix both edge cases if we make grow/shrink decisions based
>> on
>> the length of kvm_vcpu_block rather than the arrival of a guest interrupt
>> during polling.
>>
>> Some thoughts for dynamic growth:
>> * Given Windows 10 timer tick (1 ms), let's set the maximum poll time
>> to
>> less than 1ms. 200 us has been a good value for always-poll. We can
>> probably go a bit higher once we have your patch. Maybe 500 us?
>>
>> * The base case of dynamic growth (the first grow() after being at 0)
>> should
>> be small. 500 us is too big. When I run TCP_RR in my guest I see poll
>> times
>> of < 10 us. TCP_RR is on the lower-end of message passing workload
>> latency,
>> so 10 us would be a good base case.
>
>
> How to get your TCP_RR benchmark?
>
> Regards,
> Wanpeng Li
Install the netperf package, or build from here:
http://www.netperf.org/netperf/DownloadNetperf.html
In the vm:
# ./netserver
# ./netperf -t TCP_RR
Be sure to use an SMP guest (we want TCP_RR to be a cross-core message
passing workload in order to test halt-polling).
>
>
>>> v1 -> v2:
>>> * change kvm_vcpu_block to read halt_poll_ns from the vcpu instead of
>>> the module parameter
>>> * use the shrink/grow matrix which is suggested by David
>>> * set halt_poll_ns_max to 2ms
>>>
>>> There is a downside to halt_poll_ns: polling still happens for an idle
>>> VCPU, which can waste cpu cycles. This patchset adds the ability to
>>> adjust halt_poll_ns dynamically: it grows halt_poll_ns when an interrupt
>>> arrives and shrinks it when an idle VCPU is detected.
>>>
>>> There are two new kernel parameters for changing the halt_poll_ns:
>>> halt_poll_ns_grow and halt_poll_ns_shrink.
>>>
>>>
>>> Tested w/ a high cpu overcommit ratio and pinned vCPUs; the halt_poll_ns
>>> of always halt-poll is the default 500000 ns and the max halt_poll_ns of
>>> dynamic halt-poll is 2 ms. Then watch %C0 in the output of the PowerTop
>>> tool. The test method largely follows David's.
>>>
>>> +-----------------+----------------+-------------------+
>>> | | | |
>>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>>> +-----------------+----------------+-------------------+
>>> | | | |
>>> | ~0.9% | ~1.8% | ~1.2% |
>>> +-----------------+----------------+-------------------+
>>>
>>> Always halt-poll increases cpu usage by ~0.9% for idle vCPUs, while
>>> dynamic halt-poll drops that to ~0.3%, removing 67% of the overhead
>>> introduced by always halt-poll.
>>>
>>> Wanpeng Li (3):
>>> KVM: make halt_poll_ns per-VCPU
>>> KVM: dynamic halt_poll_ns adjustment
>>> KVM: trace kvm_halt_poll_ns grow/shrink
>>>
>>> include/linux/kvm_host.h | 1 +
>>> include/trace/events/kvm.h | 30 ++++++++++++++++++++++++++++
>>> virt/kvm/kvm_main.c | 50
>>> +++++++++++++++++++++++++++++++++++++++++++---
>>> 3 files changed, 78 insertions(+), 3 deletions(-)
>>> --
>>> 1.9.1
>>>
>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-09-01 22:34 ` David Matlack
@ 2015-09-01 22:58 ` Wanpeng Li
2015-09-01 23:24 ` David Matlack
0 siblings, 1 reply; 17+ messages in thread
From: Wanpeng Li @ 2015-09-01 22:58 UTC (permalink / raw)
To: David Matlack; +Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org
On 9/2/15 6:34 AM, David Matlack wrote:
> On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
>> On 9/2/15 5:45 AM, David Matlack wrote:
>>> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@hotmail.com>
>>> wrote:
>>>> v3 -> v4:
>>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and shrinks
>>>> when idle VCPU is detected
>>>>
>>>> v2 -> v3:
>>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>> /halt_poll_ns_shrink
>>>> * drop the macros and hard coding the numbers in the param definitions
>>>> * update the comments "5-7 us"
>>>> * remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns
>>>> time,
>>>> vcpu->halt_poll_ns start at zero
>>>> * drop the wrappers
>>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>> I posted a patchset which adds dynamic poll toggling (on/off switch). I
>>> think
>>> this gives you a good place to build your dynamic growth patch on top. The
>>> toggling patch has close to zero overhead for idle VMs and equivalent
>>> performance VMs doing message passing as always-poll. It's a patch that's
>>> been
>>> in my queue for a few weeks but just haven't had the time to send out. We
>>> can
>>> win even more with your patchset by only polling as much as we need (via
>>> dynamic growth/shrink). It also gives us a better place to stand for
>>> choosing
>>> a default for halt_poll_ns. (We can run experiments and see how high
>>> vcpu->halt_poll_ns tends to grow.)
>>>
>>> The reason I posted a separate patch for toggling is because it adds
>>> timers
>>> to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can get
>>> called multiple times for one halt). To do dynamic poll adjustment
Why can this happen?
>>> correctly,
>>> we have to time the length of each halt. Otherwise we hit some bad edge
>>> cases:
>>>
>>> v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns grew
>>> every
>>> time we had a long halt. So idle VMs looked like: 0 us -> 500 us -> 1
>>> ms ->
>>> 2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay at 0
>>> when
>>> the halts are long.
>>>
>>> v4: v4 fixed the idle overhead problem but broke dynamic growth for
>>> message
>>> passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns would
>>> grow.
>>> That means vcpu->halt_poll_ns will always be maxed out, even when the
>>> halt
>>> time is much less than the max.
>>>
>>> I think we can fix both edge cases if we make grow/shrink decisions based
>>> on
>>> the length of kvm_vcpu_block rather than the arrival of a guest interrupt
>>> during polling.
>>>
>>> Some thoughts for dynamic growth:
>>> * Given Windows 10 timer tick (1 ms), let's set the maximum poll time
>>> to
>>> less than 1ms. 200 us has been a good value for always-poll. We can
>>> probably go a bit higher once we have your patch. Maybe 500 us?
Did you test your patch against a windows guest?
>>>
>>> * The base case of dynamic growth (the first grow() after being at 0)
>>> should
>>> be small. 500 us is too big. When I run TCP_RR in my guest I see poll
>>> times
>>> of < 10 us. TCP_RR is on the lower-end of message passing workload
>>> latency,
>>> so 10 us would be a good base case.
>>
>> How to get your TCP_RR benchmark?
>>
>> Regards,
>> Wanpeng Li
> Install the netperf package, or build from here:
> http://www.netperf.org/netperf/DownloadNetperf.html
>
> In the vm:
>
> # ./netserver
> # ./netperf -t TCP_RR
>
> Be sure to use an SMP guest (we want TCP_RR to be a cross-core message
> passing workload in order to test halt-polling).
Ah, OK, I use the same benchmark as you do.
Regards,
Wanpeng Li
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-09-01 22:58 ` Wanpeng Li
@ 2015-09-01 23:24 ` David Matlack
2015-09-02 0:29 ` Wanpeng Li
0 siblings, 1 reply; 17+ messages in thread
From: David Matlack @ 2015-09-01 23:24 UTC (permalink / raw)
To: Wanpeng Li; +Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org
On Tue, Sep 1, 2015 at 3:58 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
> On 9/2/15 6:34 AM, David Matlack wrote:
>>
>> On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
>>>
>>> On 9/2/15 5:45 AM, David Matlack wrote:
>>>>
>>>> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@hotmail.com>
>>>> wrote:
>>>>>
>>>>> v3 -> v4:
>>>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>>>>> shrinks
>>>>> when idle VCPU is detected
>>>>>
>>>>> v2 -> v3:
>>>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>>> /halt_poll_ns_shrink
>>>>> * drop the macros and hard coding the numbers in the param
>>>>> definitions
>>>>> * update the comments "5-7 us"
>>>>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>>>>> halt_poll_ns
>>>>> time,
>>>>> vcpu->halt_poll_ns start at zero
>>>>> * drop the wrappers
>>>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>>
>>>> I posted a patchset which adds dynamic poll toggling (on/off switch). I
>>>> think
>>>> this gives you a good place to build your dynamic growth patch on top.
>>>> The
>>>> toggling patch has close to zero overhead for idle VMs and equivalent
>>>> performance VMs doing message passing as always-poll. It's a patch
>>>> that's
>>>> been
>>>> in my queue for a few weeks but just haven't had the time to send out.
>>>> We
>>>> can
>>>> win even more with your patchset by only polling as much as we need (via
>>>> dynamic growth/shrink). It also gives us a better place to stand for
>>>> choosing
>>>> a default for halt_poll_ns. (We can run experiments and see how high
>>>> vcpu->halt_poll_ns tends to grow.)
>>>>
>>>> The reason I posted a separate patch for toggling is because it adds
>>>> timers
>>>> to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can
>>>> get
>>>> called multiple times for one halt). To do dynamic poll adjustment
>
>
> Why can this happen?
Ah, probably because I'm missing 9c8fd1ba220 (KVM: x86: optimize delivery
of TSC deadline timer interrupt). I don't think the edge case exists in
the latest kernel.
>
>
>>>> correctly,
>>>> we have to time the length of each halt. Otherwise we hit some bad edge
>>>> cases:
>>>>
>>>> v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns
>>>> grew
>>>> every
>>>> time we had a long halt. So idle VMs looked like: 0 us -> 500 us ->
>>>> 1
>>>> ms ->
>>>> 2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay at
>>>> 0
>>>> when
>>>> the halts are long.
>>>>
>>>> v4: v4 fixed the idle overhead problem but broke dynamic growth for
>>>> message
>>>> passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns
>>>> would
>>>> grow.
>>>> That means vcpu->halt_poll_ns will always be maxed out, even when
>>>> the
>>>> halt
>>>> time is much less than the max.
>>>>
>>>> I think we can fix both edge cases if we make grow/shrink decisions
>>>> based
>>>> on
>>>> the length of kvm_vcpu_block rather than the arrival of a guest
>>>> interrupt
>>>> during polling.
>>>>
>>>> Some thoughts for dynamic growth:
>>>> * Given Windows 10 timer tick (1 ms), let's set the maximum poll
>>>> time
>>>> to
>>>> less than 1ms. 200 us has been a good value for always-poll. We
>>>> can
>>>> probably go a bit higher once we have your patch. Maybe 500 us?
>
>
> Did you test your patch against a windows guest?
I have not. I tested against a 250HZ linux guest to check how it performs
against a ticking guest. Presumably, windows should be the same, but at a
higher tick rate. Do you have a test for Windows?
>
>>>>
>>>> * The base case of dynamic growth (the first grow() after being at
>>>> 0)
>>>> should
>>>> be small. 500 us is too big. When I run TCP_RR in my guest I see
>>>> poll
>>>> times
>>>> of < 10 us. TCP_RR is on the lower-end of message passing workload
>>>> latency,
>>>> so 10 us would be a good base case.
>>>
>>>
>>> How to get your TCP_RR benchmark?
>>>
>>> Regards,
>>> Wanpeng Li
>>
>> Install the netperf package, or build from here:
>> http://www.netperf.org/netperf/DownloadNetperf.html
>>
>> In the vm:
>>
>> # ./netserver
>> # ./netperf -t TCP_RR
>>
>> Be sure to use an SMP guest (we want TCP_RR to be a cross-core message
>> passing workload in order to test halt-polling).
>
>
> Ah, ok, I use the same benchmark as yours.
>
> Regards,
> Wanpeng Li
>
>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-09-01 23:24 ` David Matlack
@ 2015-09-02 0:29 ` Wanpeng Li
2015-09-02 1:49 ` David Matlack
0 siblings, 1 reply; 17+ messages in thread
From: Wanpeng Li @ 2015-09-02 0:29 UTC (permalink / raw)
To: David Matlack
Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org,
Peter Kieser
On 9/2/15 7:24 AM, David Matlack wrote:
> On Tue, Sep 1, 2015 at 3:58 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
>> On 9/2/15 6:34 AM, David Matlack wrote:
>>> On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
>>>> On 9/2/15 5:45 AM, David Matlack wrote:
>>>>> On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li <wanpeng.li@hotmail.com>
>>>>> wrote:
>>>>>> v3 -> v4:
>>>>>> * bring back grow vcpu->halt_poll_ns when interrupt arrives and
>>>>>> shrinks
>>>>>> when idle VCPU is detected
>>>>>>
>>>>>> v2 -> v3:
>>>>>> * grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
>>>>>> /halt_poll_ns_shrink
>>>>>> * drop the macros and hard coding the numbers in the param
>>>>>> definitions
>>>>>> * update the comments "5-7 us"
>>>>>> * remove halt_poll_ns_max and use halt_poll_ns as the max
>>>>>> halt_poll_ns
>>>>>> time,
>>>>>> vcpu->halt_poll_ns start at zero
>>>>>> * drop the wrappers
>>>>>> * move the grow/shrink logic before "out:" w/ "if (waited)"
>>>>> I posted a patchset which adds dynamic poll toggling (on/off switch). I
>>>>> think
>>>>> this gives you a good place to build your dynamic growth patch on top.
>>>>> The
>>>>> toggling patch has close to zero overhead for idle VMs and equivalent
>>>>> performance VMs doing message passing as always-poll. It's a patch
>>>>> that's
>>>>> been
>>>>> in my queue for a few weeks but just haven't had the time to send out.
>>>>> We
>>>>> can
>>>>> win even more with your patchset by only polling as much as we need (via
>>>>> dynamic growth/shrink). It also gives us a better place to stand for
>>>>> choosing
>>>>> a default for halt_poll_ns. (We can run experiments and see how high
>>>>> vcpu->halt_poll_ns tends to grow.)
>>>>>
>>>>> The reason I posted a separate patch for toggling is because it adds
>>>>> timers
>>>>> to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can
>>>>> get
>>>>> called multiple times for one halt). To do dynamic poll adjustment
>>
>> Why can this happen?
> Ah, probably because I'm missing 9c8fd1ba220 (KVM: x86: optimize delivery
> of TSC deadline timer interrupt). I don't think the edge case exists in
> the latest kernel.
Yeah, I hope we both (including Peter Kieser) can test against the latest
kvm tree to avoid confusion. The reason to introduce the adaptive
halt-polling toggle was to handle the "edge case" you mentioned above. So
I think we can put more effort into improving v4 instead. I will improve
v4 to handle short halts today. ;-)
>
>>
>>>>> correctly,
>>>>> we have to time the length of each halt. Otherwise we hit some bad edge
>>>>> cases:
>>>>>
>>>>> v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns
>>>>> grew
>>>>> every
>>>>> time we had a long halt. So idle VMs looked like: 0 us -> 500 us ->
>>>>> 1
>>>>> ms ->
>>>>> 2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay at
>>>>> 0
>>>>> when
>>>>> the halts are long.
>>>>>
>>>>> v4: v4 fixed the idle overhead problem but broke dynamic growth for
>>>>> message
>>>>> passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns
>>>>> would
>>>>> grow.
>>>>> That means vcpu->halt_poll_ns will always be maxed out, even when
>>>>> the
>>>>> halt
>>>>> time is much less than the max.
>>>>>
>>>>> I think we can fix both edge cases if we make grow/shrink decisions
>>>>> based
>>>>> on
>>>>> the length of kvm_vcpu_block rather than the arrival of a guest
>>>>> interrupt
>>>>> during polling.
>>>>>
>>>>> Some thoughts for dynamic growth:
>>>>> * Given Windows 10 timer tick (1 ms), let's set the maximum poll
>>>>> time
>>>>> to
>>>>> less than 1ms. 200 us has been a good value for always-poll. We
>>>>> can
>>>>> probably go a bit higher once we have your patch. Maybe 500 us?
>>
>> Did you test your patch against a windows guest?
> I have not. I tested against a 250HZ linux guest to check how it performs
> against a ticking guest. Presumably, windows should be the same, but at a
> higher tick rate. Do you have a test for Windows?
I just tested the idle vCPU usage.
V4 for windows 10:
+-----------------+----------------+-----------------------+
|                 |                |                       |
|  w/o halt-poll  |  w/ halt-poll  | dynamic(v4) halt-poll |
+-----------------+----------------+-----------------------+
|                 |                |                       |
|      ~2.1%      |      ~3.0%     |         ~2.4%         |
+-----------------+----------------+-----------------------+
V4 for linux guest:
+-----------------+----------------+-------------------+
| | | |
| w/o halt-poll | w/ halt-poll | dynamic halt-poll |
+-----------------+----------------+-------------------+
| | | |
| ~0.9% | ~1.8% | ~1.2% |
+-----------------+----------------+-------------------+
Regards,
Wanpeng Li
>
>>>>> * The base case of dynamic growth (the first grow() after being at
>>>>> 0)
>>>>> should
>>>>> be small. 500 us is too big. When I run TCP_RR in my guest I see
>>>>> poll
>>>>> times
>>>>> of < 10 us. TCP_RR is on the lower-end of message passing workload
>>>>> latency,
>>>>> so 10 us would be a good base case.
>>>>
>>>> How to get your TCP_RR benchmark?
>>>>
>>>> Regards,
>>>> Wanpeng Li
>>> Install the netperf package, or build from here:
>>> http://www.netperf.org/netperf/DownloadNetperf.html
>>>
>>> In the vm:
>>>
>>> # ./netserver
>>> # ./netperf -t TCP_RR
>>>
>>> Be sure to use an SMP guest (we want TCP_RR to be a cross-core message
>>> passing workload in order to test halt-polling).
>>
>> Ah, ok, I use the same benchmark as yours.
>>
>> Regards,
>> Wanpeng Li
>>
>>
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-09-02 0:29 ` Wanpeng Li
@ 2015-09-02 1:49 ` David Matlack
2015-09-02 6:01 ` Wanpeng Li
0 siblings, 1 reply; 17+ messages in thread
From: David Matlack @ 2015-09-02 1:49 UTC (permalink / raw)
To: Wanpeng Li
Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org,
Peter Kieser
On Tue, Sep 1, 2015 at 5:29 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
> On 9/2/15 7:24 AM, David Matlack wrote:
>>
>> On Tue, Sep 1, 2015 at 3:58 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
<snip>
>>>
>>> Why can this happen?
>>
>> Ah, probably because I'm missing 9c8fd1ba220 (KVM: x86: optimize delivery
>> of TSC deadline timer interrupt). I don't think the edge case exists in
>> the latest kernel.
>
>
> Yeah, I hope we both (including Peter Kieser) can test against the latest
> kvm tree to avoid confusion. The reason for introducing the adaptive
> halt-polling toggle was to handle the "edge case" you mentioned above. So
> I think we can put more effort into improving v4 instead. I will improve
> v4 to handle short halts today. ;-)
That's fine. It's just easier to convey my ideas with a patch. FYI the
other reason for the toggle patch was to add the timer for kvm_vcpu_block,
which I think is the only way to get dynamic halt-polling right. Feel free
to work on top of v4!
>
<snip>
>>>
>>> Did you test your patch against a windows guest?
>>
>> I have not. I tested against a 250HZ linux guest to check how it performs
>> against a ticking guest. Presumably, windows should be the same, but at a
>> higher tick rate. Do you have a test for Windows?
>
>
> I just tested the idle vCPU usage.
>
>
> V4 for windows 10:
>
> +-----------------+----------------+-----------------------+
> |                 |                |                       |
> | w/o halt-poll   | w/ halt-poll   | dynamic(v4) halt-poll |
> +-----------------+----------------+-----------------------+
> |                 |                |                       |
> | ~2.1%           | ~3.0%          | ~2.4%                 |
> +-----------------+----------------+-----------------------+
I'm not seeing the same results with v4. With a 250HZ ticking guest
I see 15% c0 with halt_poll_ns=2000000 and 1.27% with halt_poll_ns=0.
Are you running one vcpu per pcpu?
(The reason for the overhead: the new tracepoint shows each vcpu is
alternating between 0 and 500 us.)
>
> V4 for linux guest:
>
> +-----------------+----------------+-------------------+
> | | | |
> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
> +-----------------+----------------+-------------------+
> | | | |
> | ~0.9% | ~1.8% | ~1.2% |
> +-----------------+----------------+-------------------+
>
>
> Regards,
> Wanpeng Li
* Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling
2015-09-02 1:49 ` David Matlack
@ 2015-09-02 6:01 ` Wanpeng Li
0 siblings, 0 replies; 17+ messages in thread
From: Wanpeng Li @ 2015-09-02 6:01 UTC (permalink / raw)
To: David Matlack
Cc: Paolo Bonzini, kvm list, linux-kernel@vger.kernel.org,
Peter Kieser
On 9/2/15 9:49 AM, David Matlack wrote:
> On Tue, Sep 1, 2015 at 5:29 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
>> On 9/2/15 7:24 AM, David Matlack wrote:
>>> On Tue, Sep 1, 2015 at 3:58 PM, Wanpeng Li <wanpeng.li@hotmail.com> wrote:
> <snip>
>>>> Why can this happen?
>>> Ah, probably because I'm missing 9c8fd1ba220 (KVM: x86: optimize delivery
>>> of TSC deadline timer interrupt). I don't think the edge case exists in
>>> the latest kernel.
>>
>> Yeah, I hope we both (including Peter Kieser) can test against the latest
>> kvm tree to avoid confusion. The reason for introducing the adaptive
>> halt-polling toggle was to handle the "edge case" you mentioned above. So
>> I think we can put more effort into improving v4 instead. I will improve
>> v4 to handle short halts today. ;-)
> That's fine. It's just easier to convey my ideas with a patch. FYI the
> other reason for the toggle patch was to add the timer for kvm_vcpu_block,
> which I think is the only way to get dynamic halt-polling right. Feel free
> to work on top of v4!
I incorporated your idea of shrinking/growing the poll time in v5 by
detecting long/short halts, and the performance looks good. Many thanks
for your help, David! ;-)
Regards,
Wanpeng Li
>
> <snip>
>>>> Did you test your patch against a windows guest?
>>> I have not. I tested against a 250HZ linux guest to check how it performs
>>> against a ticking guest. Presumably, windows should be the same, but at a
>>> higher tick rate. Do you have a test for Windows?
>>
>> I just tested the idle vCPU usage.
>>
>>
>> V4 for windows 10:
>>
>> +-----------------+----------------+-----------------------+
>> |                 |                |                       |
>> | w/o halt-poll   | w/ halt-poll   | dynamic(v4) halt-poll |
>> +-----------------+----------------+-----------------------+
>> |                 |                |                       |
>> | ~2.1%           | ~3.0%          | ~2.4%                 |
>> +-----------------+----------------+-----------------------+
> I'm not seeing the same results with v4. With a 250HZ ticking guest
> I see 15% c0 with halt_poll_ns=2000000 and 1.27% with halt_poll_ns=0.
> Are you running one vcpu per pcpu?
>
> (The reason for the overhead: the new tracepoint shows each vcpu is
> alternating between 0 and 500 us.)
>
>> V4 for linux guest:
>>
>> +-----------------+----------------+-------------------+
>> | | | |
>> | w/o halt-poll | w/ halt-poll | dynamic halt-poll |
>> +-----------------+----------------+-------------------+
>> | | | |
>> | ~0.9% | ~1.8% | ~1.2% |
>> +-----------------+----------------+-------------------+
>>
>>
>> Regards,
>> Wanpeng Li
end of thread, other threads:[~2015-09-02 6:01 UTC | newest]
Thread overview: 17+ messages
2015-08-27 9:47 [PATCH v4 0/3] KVM: Dynamic Halt-Polling Wanpeng Li
[not found] ` <55E221B2.3000702@kieser.ca>
2015-08-29 22:21 ` Wanpeng Li
2015-08-29 22:26 ` Peter Kieser
2015-08-29 23:55 ` Wanpeng Li
2015-08-30 0:13 ` Peter Kieser
2015-08-30 0:21 ` Wanpeng Li
2015-08-31 7:44 ` Wanpeng Li
2015-08-31 7:47 ` Wanpeng Li
2015-09-01 21:45 ` David Matlack
2015-09-01 22:30 ` Wanpeng Li
2015-09-01 22:34 ` David Matlack
2015-09-01 22:58 ` Wanpeng Li
2015-09-01 23:24 ` David Matlack
2015-09-02 0:29 ` Wanpeng Li
2015-09-02 1:49 ` David Matlack
2015-09-02 6:01 ` Wanpeng Li
-- strict thread matches above, loose matches on Subject: below --
2015-08-27 9:52 Wanpeng Li