From mboxrd@z Thu Jan  1 00:00:00 1970
From: Peter Lieven <pl@dlh.net>
Subject: Re: [RFC 0/4] KVM in-kernel PM Timer implementation
Date: Tue, 21 Feb 2012 19:10:58 +0100
Message-ID: <4F43DE32.9090707@dlh.net>
References: <1367781905.795471292413997183.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Anthony Liguori <anthony@codemonkey.ws>, kvm@vger.kernel.org,
	glommer@redhat.com, zamsden@redhat.com, avi@redhat.com,
	mtosatti@redhat.com
To: Ulrich Obergfell <uobergfe@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from ssl.dlh.net ([91.198.192.8]:47737 "EHLO ssl.dlh.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754839Ab2BUSLE (ORCPT <rfc822;kvm@vger.kernel.org>);
	Tue, 21 Feb 2012 13:11:04 -0500
In-Reply-To: <1367781905.795471292413997183.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 15.12.2010 12:53, Ulrich Obergfell wrote:
> ----- "Anthony Liguori" <anthony@codemonkey.ws> wrote:
>
>> On 12/14/2010 06:09 AM, Ulrich Obergfell wrote:
> [...]
>
>>> Parts 1 thru 4 of this RFC contain experimental source code which
>>> I recently used to investigate the performance benefit. In a Linux
>>> guest, I was running a program that calls gettimeofday() 'n' times
>>> in a loop (the PM Timer register is read during each call). With
>>> in-kernel PM Timer, I observed a significant reduction of program
>>> execution time.
>>>
>> I've played with this in the past.  Can you post real numbers,
>> preferably, with a real work load?
>
> Anthony,
>
> I only experimented with a gettimeofday() loop. With this test scenario
> I observed that in-kernel PM Timer reduced the program execution time to
> roughly half of the execution time that it takes with userspace PM Timer.
> Please find some example results below (these results were obtained while
> the host was not busy). The relative difference of in-kernel PM Timer
> versus userspace PM Timer is high, whereas the absolute difference per
> call appears to be low. So, the benefit much depends on how frequently
> gettimeofday() is called in a real work load. I don't have any numbers
> from a real work load. When I began working on this, I was motivated by
> the fact that the Linux kernel itself provides an optimization for the
> gettimeofday() call ('vxtime'). So, from this I presumed that there
> would be real work loads which would benefit from the optimization of
> the gettimeofday() call (otherwise, why would we have 'vxtime' ?).
> Of course, 'vxtime' is not related to PM based time keeping. However,
> the experimental code shows an approach to optimize gettimeofday() in
> KVM virtual machines.
>
>
> Regards,
>
> Uli
>
>
> - host:
>
> # grep "model name" /proc/cpuinfo | sort | uniq -c
>        8 model name : Intel(R) Core(TM) i7 CPU       Q 740  @ 1.73GHz
>
> # uname -r
> 2.6.37-rc4
>
>
> - guest:
>
> # grep "model name" /proc/cpuinfo | sort | uniq -c
>        4 model name : QEMU Virtual CPU version 0.13.50
>
>
> - test program ('gtod.c'):
>
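> #include <stdlib.h>
> #include <sys/time.h>
>
> struct timeval tv;
>
> int main(int argc, char *argv[])
> {
> 	int i;
>
> 	if (argc < 2)
> 		return 1;
> 	i = atoi(argv[1]);
> 	/* the PM Timer register is read during each call */
> 	while (i-- > 0)
> 		gettimeofday(&tv, NULL);
> 	return 0;
> }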
>
>
> - example results with in-kernel PM Timer:
>
> # for i in 1 2 3
>> do
>> time ./gtod 25000000
>> done
> real	0m44.302s
> user	0m1.090s
> sys	0m43.163s
>
> real	0m44.509s
> user	0m1.100s
> sys	0m43.393s
>
> real	0m45.290s
> user	0m1.160s
> sys	0m44.123s
>
> # for i in 10000000 50000000 100000000
>> do
>> time ./gtod $i
>> done
> real	0m17.981s
> user	0m0.810s
> sys	0m17.157s
>
> real	1m27.253s
> user	0m1.930s
> sys	1m25.307s
>
> real	2m51.801s
> user	0m3.359s
> sys	2m48.384s
>
>
> - example results with userspace PM Timer:
>
> # for i in 1 2 3
>> do
>> time ./gtod 25000000
>> done
> real	1m24.185s
> user	0m2.000s
> sys	1m22.168s
>
> real	1m23.508s
> user	0m1.750s
> sys	1m21.738s
>
> real	1m24.437s
> user	0m1.900s
> sys	1m22.517s
>
> # for i in 10000000 50000000 100000000
>> do
>> time ./gtod $i
>> done
> real	0m33.479s
> user	0m0.680s
> sys	0m32.785s
>
> real	2m50.831s
> user	0m3.389s
> sys	2m47.405s
>
> real	5m42.304s
> user	0m7.319s
> sys	5m34.919s

i am currently analyzing a performance regression together with Gleb where a
Windows 7 / Win2008R2 VM hammers the pmtimer approx. 15000 times/s during
I/O. as a result, performance is very bad and the cpu is at 100%.

has anyone done any further work on the in-kernel pm timer or on a full
implementation?

would it be possible to rebase this old experimental patch to see if it
helps with the performance regression we came across?

thank you,
peter