Re: [PATCH] vhost: Add polling mode - Christian Borntraeger

From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Razya Ladelsky <razya@il.ibm.com>, mst@redhat.com, kvm@vger.kernel.org
Cc: ERANRA@il.ibm.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, GLIKSON@il.ibm.com,
	YOSSIKU@il.ibm.com, abel.gordon@gmail.com, JOELN@il.ibm.com,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] vhost: Add polling mode
Date: Wed, 20 Aug 2014 10:41:32 +0200
Message-ID: <53F45F3C.5000207@de.ibm.com>
In-Reply-To: <20140810083035.0CF58380729@moren.haifa.ibm.com>

On 10/08/14 10:30, Razya Ladelsky wrote:
> From: Razya Ladelsky <razya@il.ibm.com>
> Date: Thu, 31 Jul 2014 09:47:20 +0300
> Subject: [PATCH] vhost: Add polling mode
> 
> When vhost is waiting for buffers from the guest driver (e.g., more packets to
> send in vhost-net's transmit queue), it normally goes to sleep and waits for the
> guest to "kick" it. This kick involves a PIO in the guest, and therefore an exit
> (and possibly userspace involvement in translating this PIO exit into a file
> descriptor event), all of which hurts performance.
> 
> If the system is under-utilized (has cpu time to spare), vhost can continuously
> poll the virtqueues for new buffers, and avoid asking the guest to kick us.
> This patch adds an optional polling mode to vhost, that can be enabled via a
> kernel module parameter, "poll_start_rate".
> 
> When polling is active for a virtqueue, the guest is asked to disable
> notification (kicks), and the worker thread continuously checks for new buffers.
> When it does discover new buffers, it simulates a "kick" by invoking the
> underlying backend driver (such as vhost-net), which thinks it got a real kick
> from the guest, and acts accordingly. If the underlying driver asks not to be
> kicked, we disable polling on this virtqueue.
> 
> We start polling on a virtqueue when we notice it has work to do. Polling on
> this virtqueue is later disabled after 3 seconds of polling turning up no new
> work, as in this case we are better off returning to the exit-based notification
> mechanism. The default timeout of 3 seconds can be changed with the
> "poll_stop_idle" kernel module parameter.
> 
> This polling approach makes lot of sense for new HW with posted-interrupts for
> which we have exitless host-to-guest notifications. But even with support for
> posted interrupts, guest-to-host communication still causes exits. Polling adds
> the missing part.
> 
> When systems are overloaded, there won't be enough cpu time for the various
> vhost threads to poll their guests' devices. For these scenarios, we plan to add
> support for vhost threads that can be shared by multiple devices, even of
> multiple vms.
> Our ultimate goal is to implement the I/O acceleration features described in:
> KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
> https://www.youtube.com/watch?v=9EyweibHfEs
> and
> https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
> 
> I ran some experiments with TCP stream netperf and filebench (having 2 threads
> performing random reads) benchmarks on an IBM System x3650 M4.
> I have two machines, A and B. A hosts the vms, B runs the netserver.
> The vms (on A) run netperf, its destination server is running on B.
> All runs loaded the guests in a way that they were (cpu) saturated. For example,
> I ran netperf with 64B messages, which is heavily loading the vm (which is why
> its throughput is low).
> The idea was to get it 100% loaded, so we can see that the polling is getting it
> to produce higher throughput.
> 
> The system had two cores per guest, as to allow for both the vcpu and the vhost
> thread to run concurrently for maximum throughput (but I didn't pin the threads
> to specific cores).
> My experiments were fair in a sense that for both cases, with or without
> polling, I run both threads, vcpu and vhost, on 2 cores (set their affinity that
> way). The only difference was whether polling was enabled/disabled.
> 
> Results:
> 
> Netperf, 1 vm:
> The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
> Number of exits/sec decreased 6x.
> The same improvement was shown when I tested with 3 vms running netperf
> (4086 MB/sec -> 5545 MB/sec).
> 
> filebench, 1 vm:
> ops/sec improved by 13% with the polling patch. Number of exits was reduced by
> 31%.
> The same experiment with 3 vms running filebench showed similar numbers.
> 
> Signed-off-by: Razya Ladelsky <razya@il.ibm.com>

Gave it a quick try on s390/kvm. As expected, it makes no difference for a big streaming workload like iperf.
uperf with a 1-1 round robin did indeed get faster, by about 30%.
The high CPU consumption is something that bothers me though, as virtualized systems tend to be full.
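
To make the CPU trade-off concrete, the start/stop rule from the patch description quoted above can be sketched in user-space C. This is only an illustration under stated assumptions: the struct and the function names (vq_poll_state, vq_on_kick, vq_on_poll) are hypothetical and are not taken from the patch; only poll_start_rate and poll_stop_idle are the patch's own parameter names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-virtqueue state; not the patch's real data structure. */
struct vq_poll_state {
    bool polling;             /* true while the worker polls this queue */
    unsigned long last_work;  /* jiffy at which work was last found */
};

/*
 * Notification mode: a kick arrived. Switch to polling if the observed
 * event rate reaches poll_start_rate (0 means "never start polling").
 */
static void vq_on_kick(struct vq_poll_state *s, unsigned long now,
                       unsigned int events_per_jiffy,
                       unsigned int poll_start_rate)
{
    assert(s != NULL);
    if (!s->polling && poll_start_rate != 0 &&
        events_per_jiffy >= poll_start_rate) {
        s->polling = true;
        s->last_work = now;
    }
}

/*
 * Polling mode: one pass over the queue. Finding work resets the idle
 * clock; poll_stop_idle jiffies without work falls back to kicks.
 */
static void vq_on_poll(struct vq_poll_state *s, unsigned long now,
                       bool found_work, unsigned long poll_stop_idle)
{
    assert(s != NULL);
    if (!s->polling)
        return;
    if (found_work) {
        s->last_work = now;
        return;
    }
    if (now - s->last_work >= poll_stop_idle)
        s->polling = false;
}
```

In this sketch the worker burns CPU for up to poll_stop_idle jiffies after the last buffer was seen, which is why the default of that parameter matters so much for the overhead.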


> +static int poll_start_rate = 0;
> +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
> +MODULE_PARM_DESC(poll_start_rate, "Start continuous polling of virtqueue when rate of events is at least this number per jiffy. If 0, never start polling.");
> +
> +static int poll_stop_idle = 3*HZ; /* 3 seconds */
> +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
> +MODULE_PARM_DESC(poll_stop_idle, "Stop continuous polling of virtqueue after this many jiffies of no work.");

This seems ridiculously high. Even one jiffy is an eternity, so setting it to 1 as a default would reduce the CPU overhead for most cases.
If we don't have a packet in one millisecond, we can surely go back to the kick approach, I think.
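
For reference, a jiffy lasts 1/HZ seconds, so "one jiffy equals one millisecond" holds only when the kernel is built with HZ=1000. A minimal sketch of the arithmetic follows; the helper name timeout_jiffies_to_ms is made up for illustration and is not part of the patch (the kernel has its own jiffies_to_msecs() for this purpose).

```c
#include <assert.h>

/*
 * Convert a timeout expressed in jiffies to milliseconds.
 * hz is the tick rate the kernel was built with (CONFIG_HZ).
 * Hypothetical user-space helper, for illustration only.
 */
static unsigned long timeout_jiffies_to_ms(unsigned long timeout_jiffies,
                                           unsigned int hz)
{
    assert(hz > 0);
    return timeout_jiffies * 1000UL / hz;
}
```

With these assumptions, the patch default of 3*HZ jiffies is 3000 ms for any HZ, while a default of 1 jiffy would be 1 ms at HZ=1000, 4 ms at HZ=250, and 10 ms at HZ=100.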

Christian
