From mboxrd@z Thu Jan 1 00:00:00 1970
From: "David S. Ahern"
Subject: Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)
Date: Thu, 22 May 2008 16:08:53 -0600
Message-ID: <4835EEF5.9010600@cisco.com>
References: <48054518.3000104@cisco.com> <4805BCF1.6040605@qumranet.com>
 <4807BD53.6020304@cisco.com> <48085485.3090205@qumranet.com>
 <480C188F.3020101@cisco.com> <480C5C39.4040300@qumranet.com>
 <480E492B.3060500@cisco.com> <480EEDA0.3080209@qumranet.com>
 <480F546C.2030608@cisco.com> <481215DE.3000302@cisco.com>
 <20080428181550.GA3965@dmt> <4816617F.3080403@cisco.com>
 <4817F30C.6050308@cisco.com> <48184228.2020701@qumranet.com>
 <481876A9.1010806@cisco.com> <48187903.2070409@qumranet.com>
 <4826E744.1080107@qumranet.com> <4826F668.6030305@qumranet.com>
 <48290FC2.4070505@cisco.com> <48294272.5020801@qumranet.com>
 <482B4D29.7010202@cisco.com> <482C1633.5070302@qumranet.com>
 <482E5F9C.6000207@cisco.com> <482FCEE1.5040306@qumranet.com>
 <4830F90A.1020809@cisco.com> <4830FE8D.6010006@cisco.com>
 <48318E64.8090706@qumranet.com> <4832DDEB.4000100@qumranet.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------090400040701000009020707"
Cc: kvm@vger.kernel.org
To: Avi Kivity
Return-path:
Received: from sj-iport-2.cisco.com ([171.71.176.71]:23675 "EHLO
 sj-iport-2.cisco.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1758837AbYEVWJo (ORCPT );
 Thu, 22 May 2008 18:09:44 -0400
In-Reply-To: <4832DDEB.4000100@qumranet.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

This is a multi-part message in MIME format.
--------------090400040701000009020707
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

The short answer is that I am still seeing large system time hiccups in
the guests due to kscand in the guest scanning its active lists. I do
see better response with a KVM_MAX_PTE_HISTORY of 3 than with 4.
(For completeness, I also tried a history of 2, but it performed worse
than 3, which is no surprise given what the parameter means.)

I have been able to scratch out a simplistic program that stimulates
kscand activity similar to what is going on in my real guest (see
attached). The program requests a memory allocation, initializes it (to
get it backed), and then in a loop sweeps through the memory in chunks,
similar to a program using parts of its memory here and there but
eventually accessing all of it.

Start the RHEL3/CentOS 3 guest with *2GB* of RAM (or more). The key is
using a fair amount of highmem. Start a couple of instances of the
attached. For example, I've been using these 2:

memuser 768M 120 5 300
memuser 384M 300 10 600

Together these instances take up 1GB of RAM and once initialized consume
very little CPU. On kvm they make kscand and kswapd go nuts every 5-15
minutes. For comparison, I do not see the same behavior for an identical
setup running on esx 3.5.

david

Avi Kivity wrote:
> Avi Kivity wrote:
>>
>> There are (at least) three options available:
>> - detect and special-case this scenario
>> - change the flood detector to be per page table instead of per vcpu
>> - change the flood detector to look at a list of recently used page
>> tables instead of the last page table
>>
>> I'm having a hard time trying to pick between the second and third
>> options.
>>
>
> The answer turns out to be "yes", so here's a patch that adds a pte
> access history table for each shadowed guest page-table. Let me know if
> it helps. Benchmarking a variety of workloads on all guests supported
> by kvm is left as an exercise for the reader, but I suspect the patch
> will either improve things all around, or can be modified to do so.
--------------090400040701000009020707
Content-Type: text/x-csrc;
 name="memuser.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="memuser.c"

/* simple program to malloc memory, initialize it, and
 * then repetitively use it to keep it active.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <libgen.h>
#include <sys/time.h>
#include <time.h>

/* goal is to sweep memory every T1 sec by accessing a
 * percentage at a time and sleeping T2 sec in between accesses.
 * Once all the memory has been accessed, sleep for T3 sec
 * before starting the cycle over.
 */
#define T1 180
#define T2 5
#define T3 300

const char *timestamp(void);

void usage(const char *prog)
{
	fprintf(stderr, "\nusage: %s memlen{M|K} [t1 t2 t3]\n", prog);
}

int main(int argc, char *argv[])
{
	int len;
	char *endp;
	int factor, endp_len;
	int start, incr;
	int t1 = T1, t2 = T2, t3 = T3;
	char *mem;
	char c = 0;

	if (argc < 2) {
		usage(basename(argv[0]));
		return 1;
	}

	/*
	 * determine memory to request
	 */
	len = (int) strtol(argv[1], &endp, 0);
	factor = 1;
	endp_len = strlen(endp);
	if ((endp_len == 1) && ((*endp == 'M') || (*endp == 'm')))
		factor = 1024 * 1024;
	else if ((endp_len == 1) && ((*endp == 'K') || (*endp == 'k')))
		factor = 1024;
	else if (endp_len) {
		fprintf(stderr, "invalid memory len.\n");
		return 1;
	}
	len *= factor;
	if (len == 0) {
		fprintf(stdout, "memory len is 0.\n");
		return 1;
	}

	/*
	 * convert times if given
	 */
	if (argc > 2) {
		if (argc < 5) {
			usage(basename(argv[0]));
			return 1;
		}
		t1 = atoi(argv[2]);
		t2 = atoi(argv[3]);
		t3 = atoi(argv[4]);
	}

	/*
	 * amount of memory to sweep at one time
	 */
	if (t1 && t2)
		incr = len / t1 * t2;
	else
		incr = len;

	mem = (char *) malloc(len);
	if (mem == NULL) {
		fprintf(stderr, "malloc failed\n");
		return 1;
	}
	printf("memory allocated. initializing to 0\n");
	memset(mem, 0, len);

	start = 0;
	printf("%s starting memory update.\n", timestamp());
	while (1) {
		c++;
		if (c == 0x7f)
			c = 0;
		memset(mem + start, c, incr);
		start += incr;
		if ((start >= len) || ((start + incr) >= len)) {
			printf("%s scan complete. sleeping %d\n",
			       timestamp(), t3);
			start = 0;
			sleep(t3);
			printf("%s starting memory update.\n", timestamp());
		} else if (t2)
			sleep(t2);
	}

	return 0;
}

const char *timestamp(void)
{
	static char date[64];
	struct timeval now;
	struct tm ltime;

	memset(date, 0, sizeof(date));
	if (gettimeofday(&now, NULL) == 0) {
		if (localtime_r(&now.tv_sec, &ltime))
			strftime(date, sizeof(date), "%m/%d %H:%M:%S", &ltime);
	}

	if (strlen(date) == 0)
		strcpy(date, "unknown");

	return date;
}
--------------090400040701000009020707--