Re: [KVM-AUTOTEST][PATCH] timedrift support

From: Bear Yang <byang@redhat.com>
To: Lucas Meneghel Rodrigues <mrodrigu@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	uril@redhat.com, kvm@vger.kernel.org
Subject: Re: [KVM-AUTOTEST][PATCH] timedrift support
Date: Tue, 12 May 2009 21:07:24 +0800
Message-ID: <4A09748C.6040909@redhat.com>
In-Reply-To: <4A096C28.1060900@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 17786 bytes --]

Sorry, I forgot to attach my new patch.
Bear Yang wrote:
> Hi Lucas,
> First, thank you very much for your kind and careful comments and 
> suggestions. I have modified my scripts following your advice:
> 1. Added genload to timedrift, but I am not sure whether it is right 
> to include the CVS-related information. If it is not necessary, I 
> will remove it next time.
> 2. Replaced os.system with utils.system.
> 3. Replaced os.environ.get('HOSTNAME') with socket.gethostname().
> 4. For the snippet of code below:
> +    if utils.system(ntp_cmd, ignore_status=True) != 0:
> +        raise error.TestFail, "NTP server has not starting correctly..."
>
> Your suggestion is "Instead of the if clause we'd put a try/except 
> block", but I am not clear on how to do it. Could you please give me 
> some guidance on this? Sorry.
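
On item 4, here is a minimal sketch of the try/except form. To keep it runnable outside autotest it uses the standard library's subprocess.check_call and CalledProcessError as stand-ins. Inside the test the call would be utils.system(ntp_cmd) with ignore_status left at its default, and the exception to catch would be autotest's command error class (error.CmdError is an assumption here and should be checked against the tree). TestFail and start_ntp_server are hypothetical names used only for illustration.

```python
import subprocess


class TestFail(Exception):
    """Hypothetical stand-in for autotest's error.TestFail."""


def start_ntp_server(ntp_cmd):
    """Run ntp_cmd in a shell; turn a non-zero exit status into TestFail."""
    try:
        # check_call raises CalledProcessError on a non-zero exit status,
        # so no explicit "if status != 0" check is needed.
        subprocess.check_call(ntp_cmd, shell=True)
    except subprocess.CalledProcessError as e:
        raise TestFail("NTP server has not started correctly "
                       "(exit status %d)" % e.returncode)
```

The point of the change is that the failure path is driven by the exception the command runner already raises, so the test no longer has to pass ignore_status=True and compare the return value by hand.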
>
> Another thing, about the clauses that get the VM handle below:
>
> +    # get vm handle
> +    vm = kvm_utils.env_get_vm(env,params.get("main_vm"))
> +    if not vm:
> +        raise error.TestError, "VM object not found in environment"
> +    if not vm.is_alive():
> +        raise error.TestError, "VM seems to be dead; Test requires a 
> living VM"
>
> I agree with you on this point. I remember that somebody did this 
> before, but it seems upstream did not accept the modification.
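
For reference, a rough sketch of what such a shared helper might look like. The name get_living_vm is hypothetical, TestError stands in for autotest's error.TestError, and env is assumed to behave like a dict of VM objects. In the real tree the lookup would go through kvm_utils.env_get_vm(env, name) as in the snippet above.

```python
class TestError(Exception):
    """Hypothetical stand-in for autotest's error.TestError."""


def get_living_vm(env, vm_name):
    """Return the VM registered under vm_name, or raise TestError.

    env is assumed to be a dict-like mapping of names to VM objects;
    the patch itself uses kvm_utils.env_get_vm(env, vm_name) instead.
    """
    vm = env.get(vm_name)
    if not vm:
        raise TestError("VM object not found in environment")
    if not vm.is_alive():
        raise TestError("VM seems to be dead; Test requires a living VM")
    return vm
```

Each test could then begin with a single call such as vm = get_living_vm(env, params.get("main_vm")) instead of repeating the two checks.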
>
> Have a good day
>
> thanks.
>
>
> Lucas Meneghel Rodrigues wrote:
>> On Mon, 2009-05-11 at 18:40 +0800, Bear Yang wrote:
>>  
>>> Hello.
>>> I have modified my script according to Marcelo's suggestion and am 
>>> resubmitting it to you all. :)
>>>
>>> Marcelo, it seems that nobody except you cares about my script. I still 
>>> want to say that any suggestions on it would be greatly appreciated.
>>>
>>> Thanks.
>>>
>>> Bear
>>>
>>>     
>>
>> Hi Bear, sorry, I had some hectic days here so I still haven't reviewed
>> your patch.
>> As a general comment, I realize that on several occasions we are using
>> os.system() to execute commands on the host, when we would usually
>> prefer the utils.system() or utils.run() API, since it already
>> throws an exception when the exit code != 0 (you can always set
>> ignore_status=True to avoid this behavior if needed) and we are working
>> on better handling of stdout and stderr upstream.
>>
>> My comments follow:
>>
>> diff -urN kvm_runtest_2.bak/cpu_stress.c kvm_runtest_2/cpu_stress.c
>> --- kvm_runtest_2.bak/cpu_stress.c    1969-12-31 19:00:00.000000000 
>> -0500
>> +++ kvm_runtest_2/cpu_stress.c    2009-05-05 22:35:34.000000000 -0400
>> @@ -0,0 +1,61 @@
>> +#define _GNU_SOURCE
>> +#include <stdio.h>
>> +#include <pthread.h>
>> +#include <sched.h>
>> +#include <stdlib.h>
>> +#include <fcntl.h>
>> +#include <math.h>
>> +#include <unistd.h>
>> +
>> +#define MAX_CPUS 256
>> +#define BUFFSIZE 1024
>> +
>> +
>> +void worker_child(int cpu)
>> +{
>> +    int cur_freq;
>> +    int min_freq;
>> +    int max_freq;
>> +    int last_freq;
>> +    cpu_set_t mask;
>> +    int i;
>> +    double x;
>> +        int d = 0;
>> +    /*
>> +     * bind this thread to the specified cpu
>> +     */
>> +    CPU_ZERO(&mask);
>> +    CPU_SET(cpu, &mask);
>> +    sched_setaffinity(0, CPU_SETSIZE, &mask);
>> +
>> +    while (d++ != 500000) {
>> +            for (i=0; i<100000; i++)
>> +                x = sqrt(x);
>> +    }
>> +
>> +    _exit(0);
>> +
>> +}
>> +
>> +
>> +main() {
>> +    cpu_set_t mask;
>> +    int i;
>> +    int code;
>> +
>> +    if (sched_getaffinity(0, CPU_SETSIZE, &mask) < 0){
>> +        perror ("sched_getaffinity");
>> +        exit(1);
>> +    }
>> +
>> +    for (i=0; i<CPU_SETSIZE; i++)
>> +        if (CPU_ISSET(i, &mask)){
>> +            printf ("CPU%d\n",i);
>> +            if (fork() == 0)
>> +                worker_child(i);
>> +        }
>> +
>> +
>> +    wait(&code);
>> +    exit (WEXITSTATUS(code));
>> +}
>>
>> I believe we might want to use a more complete stress system that 
>> can do IO stress and put 'memory pressure' on the host system. When I 
>> need to cause stress on a host, what I end up doing is hacking the 
>> stress.c program from LTP, because it can do memory and IO stress as 
>> well.
>> I will send you the stress.c program in a separate e-mail.
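
To illustrate the combined CPU, IO and memory load, here is a small sketch that builds a stress command line from Python. The option names (--cpu, --io, --vm, --vm-bytes, --timeout) are the ones parsed by main() in the genload/stress.c attached to this message. The helper name build_stress_cmd, the ./stress binary path and the suffixed values in the usage note are assumptions for illustration only.

```python
def build_stress_cmd(cpu=0, io=0, vm=0, vm_bytes=None, timeout=None,
                     binary="./stress"):
    """Return an argv list for stress; worker kinds set to 0 are left out."""
    argv = [binary]
    if cpu:
        argv += ["--cpu", str(cpu)]
    if io:
        argv += ["--io", str(io)]
    if vm:
        argv += ["--vm", str(vm)]
        # --vm-bytes only makes sense when memory workers are requested.
        if vm_bytes:
            argv += ["--vm-bytes", str(vm_bytes)]
    if timeout:
        argv += ["--timeout", str(timeout)]
    return argv
```

For example, build_stress_cmd(cpu=2, io=1, vm=1, vm_bytes="128M", timeout="20m") yields a single command that exercises all three kinds of load. The suffixed values follow the style of the README examples and should be checked against the size and time parsers in stress.c.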
>>
>> diff -urN kvm_runtest_2.bak/kvm_runtest_2.py 
>> kvm_runtest_2/kvm_runtest_2.py
>> --- kvm_runtest_2.bak/kvm_runtest_2.py    2009-04-29 
>> 06:17:29.000000000 -0400
>> +++ kvm_runtest_2/kvm_runtest_2.py    2009-04-29 08:06:32.000000000 
>> -0400
>> @@ -36,6 +36,8 @@
>>                  "autotest":     test_routine("kvm_tests",           
>> "run_autotest"),
>>                  "kvm_install":  test_routine("kvm_install",         
>> "run_kvm_install"),
>>                  "linux_s3":     test_routine("kvm_tests",           
>> "run_linux_s3"),
>> +                "ntp_server_setup": test_routine("kvm_tests",       
>> "run_ntp_server_setup"),
>> +                "timedrift":    test_routine("kvm_tests",           
>> "run_timedrift"),
>>                  }
>>  
>>          # Make it possible to import modules from the test's bindir
>> diff -urN kvm_runtest_2.bak/kvm_tests.cfg.sample 
>> kvm_runtest_2/kvm_tests.cfg.sample
>> --- kvm_runtest_2.bak/kvm_tests.cfg.sample    2009-04-29 
>> 06:17:29.000000000 -0400
>> +++ kvm_runtest_2/kvm_tests.cfg.sample    2009-04-29 
>> 08:09:36.000000000 -0400
>> @@ -81,6 +81,10 @@
>>      - linux_s3:      install setup
>>          type = linux_s3
>>  
>> +    - ntp_server_setup:
>> +        type = ntp_server_setup
>> +    - timedrift:      ntp_server_setup
>> +        type = timedrift
>>  # NICs
>>  variants:
>>      - @rtl8139:
>> diff -urN kvm_runtest_2.bak/kvm_tests.py kvm_runtest_2/kvm_tests.py
>> --- kvm_runtest_2.bak/kvm_tests.py    2009-04-29 06:17:29.000000000 
>> -0400
>> +++ kvm_runtest_2/kvm_tests.py    2009-05-11 06:00:32.000000000 -0400
>> @@ -394,3 +394,247 @@
>>      kvm_log.info("VM resumed after S3")
>>  
>>      session.close()
>> +
>> +def run_ntp_server_setup(test, params, env):
>> +
>> +    """NTP server configuration and related network file modification
>> +    """
>> +
>> +    kvm_log.info("stop the iptables service if it is running for 
>> timedrift testing")
>> +
>> +    if not os.system("/etc/init.d/iptables status"):
>> +        os.system("/etc/init.d/iptables stop")
>> +
>> +    # prevent dhcp client modify the ntp.conf
>> +    kvm_log.info("prevent dhcp client modify the ntp.conf")
>> +
>> +    config_file = "/etc/sysconfig/network"
>> +    network_file = open("/etc/sysconfig/network", "a")
>> +    string = "PEERNTP=no"
>> +
>> +    if os.system("grep %s %s" % (string, config_file)):
>> +        network_file.writelines(str(string)+'\n')
>> +
>> +    network_file.close()
>> +
>> +    # stop the ntp service if it is running
>> +    kvm_log.info("stop ntp service if it is running")
>> +
>> +    if not os.system("/etc/init.d/ntpd status"):
>> +        os.system("/etc/init.d/ntpd stop")
>> +        ntp_running = True
>> +
>> +    kvm_log.info("start ntp server on host with the custom config 
>> file.")
>> +
>> +    ntp_cmd = '''
>> +        echo "restrict default kod nomodify notrap nopeer noquery" 
>> >> /etc/timedrift.ntp.conf;\
>> +        echo "restrict 127.0.0.1" >> /etc/timedrift.ntp.conf;\
>> +        echo "driftfile /var/lib/ntp/drift" >> 
>> /etc/timedrift.ntp.conf;\
>> +        echo "keys /etc/ntp/keys" >> /etc/timedrift.ntp.conf;\
>> +        echo "server 127.127.1.0" >> /etc/timedrift.ntp.conf;\
>> +        echo "fudge 127.127.1.0 stratum 1" >> /etc/timedrift.ntp.conf;\
>> +        ntpd -c /etc/timedrift.ntp.conf;
>> +        '''
>> +    if os.system(ntp_cmd):
>> +        raise error.TestFail, "NTP server has not starting correct..."
>>
>> Here you could have used the regular utils.system API instead of 
>> os.system, since it integrates better with the autotest infrastructure.
>> Instead of the if clause we'd put a try/except block. Minor 
>> nitpicking: "NTP server has not started correctly..."
>>
>> +    #kvm_log.info("sync system clock to BIOS")
>> +    #os.system("/sbin/hwclock --systohc")
>> +
>> +def run_timedrift(test, params, env):
>> +    """judge wether the guest clock will encounter timedrift prblem 
>> or not. including three stages:
>>
>> Typo, "whether"
>>
>> +       1: try to sync the clock with host, if the offset value of 
>> guest clock is large than 1 sec.
>> +       2: running the cpu stress testing program<cpu_stress.c> on guest
>> +       3: then run analyze loop totally 20 times to determine if the 
>> clock on guest has time drift.
>> +    """
>> +    # variables using in timedrift testcase
>> +    cpu_stress_program = "cpu_stress.c"
>> +    remote_dir = '/root'
>> +
>> +    clock_resource_cmd = "cat 
>> /sys/devices/system/clocksource/clocksource0/current_clocksource"
>> +
>> +    pwd = os.path.join(os.environ['AUTODIR'],'tests/kvm_runtest_2')
>> +    cpu_stress_test = os.path.join(pwd, cpu_stress_program)
>> +    cpu_stress_cmdline = 'cd %s;gcc %s -lm;./a.out &' % (remote_dir, 
>> os.path.basename(cpu_stress_test))
>> +
>> +    cpu_stress_search_cmdline = "ps -ef|grep 'a.out'|grep -v grep"
>> +
>> +    hostname = os.environ.get("HOSTNAME")
>>
>> Can't we use socket.gethostname() here instead of relying on 
>> environment variable values?
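
A minimal sketch of that idea, using only the standard library. The helper name get_ntp_hostname is hypothetical, and falling back to socket.getfqdn() when the name is a loopback alias is an assumption, not something the patch does.

```python
import socket


def get_ntp_hostname():
    """Return the host name to hand to ntpdate on the guest."""
    hostname = socket.gethostname()
    if hostname in ("localhost", "localhost.localdomain"):
        # A loopback name is useless to the guest; try the FQDN instead.
        hostname = socket.getfqdn()
    return hostname
```

This asks the system directly, so the result does not depend on whether HOSTNAME happens to be exported in the environment the test runs in.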
>> +    if "localhost.localdomain" == hostname:
>> +        hostname = os.popen('hostname').read().split('\n')[0]
>> +        kvm_log.info("since get wrong hostname from python 
>> evnironment, then use the hostname get from system call(hostname).")
>> +
>> +    kvm_log.info("get host name :%s" % hostname)
>> +
>> +    # ntpdate info command and ntpdate sync command
>> +    ntpdate_info_cmd = "ntpdate -q %s" % hostname
>> +    ntpdate_sync_cmd = "ntpdate %s" % hostname
>> +
>> +    # get vm handle
>> +    vm = kvm_utils.env_get_vm(env,params.get("main_vm"))
>> +    if not vm:
>> +        raise error.TestError, "VM object not found in environment"
>> +    if not vm.is_alive():
>> +        raise error.TestError, "VM seems to be dead; Test requires a 
>> living VM"
>>
>> I am seeing this piece of code to get the VM handle in several tests. 
>> I am starting to think we should factor it out into a utility function...
>>
>> +    kvm_log.info("Waiting for guest to be up...")
>> +
>> +    pxssh = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
>> +    if not pxssh:
>> +        raise error.TestFail, "Could not log into guest"
>> +
>> +    kvm_log.info("Logged into guest IN run_timedrift function.")
>> +
>> +    # clock resource get from host and guest
>> +    host_clock_resource = 
>> os.popen(clock_resource_cmd).read().split('\n')[0]
>> +    kvm_log.info("the clock resource on host is :%s" % 
>> host_clock_resource)
>> +
>> +    pxssh.sendline(clock_resource_cmd)
>> +    s, o = pxssh.read_up_to_prompt()
>> +    guest_clock_resource = o.splitlines()[-2]
>> +    kvm_log.info("the clock resource on guest is :%s" % 
>> guest_clock_resource)
>> +
>> +    if host_clock_resource != guest_clock_resource:
>> +        #raise error.TestFail, "Host and Guest using different clock 
>> resource"
>> +        kvm_log.info("Host and Guest using different clock 
>> resource,Let's moving on.")
>> +    else:
>> +        kvm_log.info("Host and Guest using same clock resource,Let's 
>> moving on.")
>>
>> Little mistake here, "Let's move on."
>>
>> +    # helper function:
>> +    # ntpdate_op: a entire process to get ntpdate command line result from guest.
>> +    # time_drift_or_not: get the numeric handing by regular 
>> expression and make timedrift calulation.
>> +    def ntpdate_op(command):
>> +        output = []
>> +        try:
>> +            pxssh = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
>> +            if not pxssh:
>> +                raise error.TestFail, "Could not log into guest"
>> +
>> +            kvm_log.info("Logged in:(ntpdate_op)")
>> +
>> +            while True:
>> +                pxssh.sendline(command)
>> +                s, output = pxssh.read_up_to_prompt()
>> +                if "time server" in output:
>> +                    # output is a string contain the (ntpdate -q) 
>> infor on guest
>> +                    return True, output
>> +                else:
>> +                    continue
>> +        except:
>> +            pxssh.close()
>> +            return False, output
>> +        return False, output
>> +
>> +    def time_drift_or_not(output):
>> +        date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
>> +        num = float(date_string[0])
>> +        if num >= 1:
>> +            kvm_log.info("guest clock has drifted in this scenario 
>> :%s %s" % (date_string, num))
>> +            return False
>> +        else:
>> +            kvm_log.info("guest clock running veracious in now stage 
>> :%s %s" % (date_string, num))
>> +            return True
>> +
>> +    # send the command and get the ouput from guest
>> +    # this loop will pick out several conditions need to be process
>> +    # Actually, we want to get the info match "time server", then 
>> script can analyzing it to
>> +    # determine if guest's clock need sync with host or not.
>> +    while True:
>> +        pxssh.sendline(ntpdate_info_cmd)
>> +        s, output = pxssh.read_up_to_prompt()
>> +        kvm_log.info("the ntpdate query info get from guest is 
>> below: \n%s" %output)
>> +        if ("no server suitable" not in output) and ("time server" 
>> not in output):
>> +            kvm_log.info("very creazying output got. let's try again")
>> +            continue
>> +        elif "no server suitable" in output:
>> +            kvm_log.info("seems NTP server is not ready for servicing")
>> +            time.sleep(30)
>> +            continue
>> +        elif "time server" in output:
>> +            # get the ntpdate info from guest
>> +            # kvm_log.info("Got the correct output for analyze. The output is below: \n%s" %output)
>> +            break
>> +
>> +    kvm_log.info("get the ntpdate infomation from guest successfully 
>> :%s" % os.popen('date').read())
>> +
>> +    # judge the clock need to sync with host or not
>> +    while True:
>> +        date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
>> +        num = float(date_string[0])
>> +        if num >= 1:
>> +            kvm_log.info("guest need sync with the server: %s" % 
>> hostname)
>> +            s, output = ntpdate_op(ntpdate_sync_cmd)
>> +            if s:
>> +                continue
>> +        else:
>> +            #pxssh.sendline("hwclock --systohc")
>> +            #kvm_log.info("guest clock sync prcdure is finished. 
>> then sync the guest clock to guest bios.")
>> +
>> +            #pxssh.sendline("hwclock --show")
>> +            #s, o = pxssh.read_up_to_prompt()
>> +            #kvm_log.info("the date infomation get from guest bios 
>> is :\n%s" % o)
>> +
>> +            pxssh.sendline(ntpdate_info_cmd)
>> +            s, o = pxssh.read_up_to_prompt()
>> +            kvm_log.info("guest clock after sync with host is :\n%s" 
>> % o)
>> +
>> +            break
>> +
>> +    kvm_log.info("Timedrift Preparation *Finished* at last :%s" % 
>> os.popen('date').read())
>> +
>> +    if not vm.scp_to_remote(cpu_stress_test, remote_dir):
>> +        raise error.TestError, "Could not copy program to guest."
>> +
>> +    pxssh.sendline(ntpdate_info_cmd)
>> +    s, o = pxssh.read_up_to_prompt()
>> +    kvm_log.info("the ntpdate query from host *BEFORE* running the 
>> cpu stress program.\n%s" % o)
>> +    pxssh.sendline(cpu_stress_cmdline)
>> +    s, o = pxssh.read_up_to_prompt()
>> +    kvm_log.info("running command line on guest and sleeping for 
>> 1200 secs.\n%s" % o)
>> +
>> +    time.sleep(1200)
>> +
>> +    while True:
>> +        if pxssh.get_command_status(cpu_stress_search_cmdline):
>> +            #(s, o) = 
>> pxssh.get_command_status_output(cpu_stress_search_cmdline)
>> +            #print "s is :%s" % s
>> +            #print "o is :%s" % o
>> +            #print "--------------------------------------------"
>> +            #aaa = pxssh.get_command_status(cpu_stress_search_cmdline)
>> +            #print "aaa is :%s" % aaa
>> +            #print "--------------------------------------------"
>> +
>> +            print "stress testing process has been completed and quit."
>> +            break
>> +        else:
>> +            print "stress testing on CPU has not finished 
>> yet.waiting for next detect after sleep 60 secs."
>> +            time.sleep(60)
>> +            continue
>> +
>> +    pxssh.sendline(ntpdate_info_cmd)
>> +    s, o = pxssh.read_up_to_prompt()
>> +    kvm_log.info("the ntpdate query from host *AFTER* running the 
>> cpu stress program.\n%s" % o)
>> +
>> +    pxssh.close()
>> +
>> +    # Sleep for analyze...
>> +    kvm_log.info("sleeping(180 secs) Starting... :%s" % 
>> os.popen('date').read())
>> +    time.sleep(180)
>> +    kvm_log.info("wakeup to get the analyzing... :%s" % 
>> os.popen('date').read())
>> +    count = 0
>> +    for i in range(1, 21):
>> +        kvm_log.info("this is %s time to get clock info from guest." 
>> % i)
>> +        s, o = ntpdate_op(ntpdate_info_cmd)
>> +
>> +        if not s:
>> +            raise error.TestFail, "Guest seems hang or ssh service based on guest has been crash down"
>> +
>> +        if not time_drift_or_not(o):
>> +            count += 1
>> +
>> +        if count == 5:
>> +            raise error.TestFail, "TimeDrift testing Abort because 
>> guest's clock has drift too much"
>> +
>> +        kvm_log.info("*********************** Sleep 30 seconds for 
>> next loop *************************")
>> +        time.sleep(60)
>> +
>>
>>   
>


[-- Attachment #2: timedrift-3.patch --]
[-- Type: text/x-patch, Size: 45377 bytes --]

diff -urN kvm_runtest_2.bak/genload/CVS/Entries kvm_runtest_2/genload/CVS/Entries
--- kvm_runtest_2.bak/genload/CVS/Entries	1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/CVS/Entries	2009-05-11 16:47:49.000000000 -0400
@@ -0,0 +1,3 @@
+/Makefile/1.4/Mon Sep 29 16:22:14 2008//
+/README/1.1/Fri Dec 13 21:34:13 2002//
+/stress.c/1.3/Thu Jul 26 12:40:17 2007//
diff -urN kvm_runtest_2.bak/genload/CVS/Repository kvm_runtest_2/genload/CVS/Repository
--- kvm_runtest_2.bak/genload/CVS/Repository	1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/CVS/Repository	2009-05-11 16:47:44.000000000 -0400
@@ -0,0 +1 @@
+ltp/tools/genload
diff -urN kvm_runtest_2.bak/genload/CVS/Root kvm_runtest_2/genload/CVS/Root
--- kvm_runtest_2.bak/genload/CVS/Root	1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/CVS/Root	2009-05-11 16:47:44.000000000 -0400
@@ -0,0 +1 @@
+:pserver:anonymous@ltp.cvs.sourceforge.net:/cvsroot/ltp
diff -urN kvm_runtest_2.bak/genload/Makefile kvm_runtest_2/genload/Makefile
--- kvm_runtest_2.bak/genload/Makefile	1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/Makefile	2008-09-29 12:22:14.000000000 -0400
@@ -0,0 +1,14 @@
+CFLAGS+= -DPACKAGE=\"stress\" -DVERSION=\"0.17pre11\" 
+
+LDLIBS+= -lm
+
+SRCS=$(wildcard *.c)
+TARGETS=$(patsubst %.c,%,$(SRCS))
+
+all: $(TARGETS)
+
+install:
+	@ln -f $(TARGETS) ../../testcases/bin/genload
+
+clean:
+	rm -fr $(TARGETS)
diff -urN kvm_runtest_2.bak/genload/README kvm_runtest_2/genload/README
--- kvm_runtest_2.bak/genload/README	1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/README	2002-12-13 16:34:13.000000000 -0500
@@ -0,0 +1,72 @@
+USAGE
+
+See the program's usage statement by invoking with --help.
+
+NOTES
+
+This program works really well for me, but it might not have some of the
+features that you want.  If you would like, please extend the code and send
+me the patch[1].  Enjoy the program :-)
+
+Please use the context diff format.  That is: save the original program
+as stress.c.orig, then make and test your desired changes to stress.c, then
+run 'diff -u stress.c.orig stress.c' to produce a context patch.  Thanks.
+
+Amos Waterland <apw@rossby.metr.ou.edu>
+Norman, Oklahoma
+27 Nov 2001
+
+EXAMPLES
+[examples]
+
+The simple case is that you just want to bring the system load average up to
+an arbitrary value.  The following forks 13 processes, each of which spins
+in a tight loop calculating the sqrt() of a random number acquired with
+rand().
+
+  % stress -c 13
+
+Long options are supported, as well as is making the output less verbose.
+The following forks 1024 processes, and only reports error messages if any.
+
+  % stress --quiet --hogcpu 1k
+
+To see how your system performs when it is I/O bound, use the -i switch.
+The following forks 4 processes, each of which spins in a tight loop calling
+sync(), which is a system call that flushes memory buffers to disk.
+
+  % stress -i 4
+
+Multiple hogs may be combined on the same command line.  The following does
+everything the preceding examples did in one command, but also turns up the
+verbosity level as well as showing how to cause the command to
+self-terminate after 1 minute.
+
+  % stress -c 13 -i 4 --verbose --timeout 1m
+
+An value of 0 normally denotes infinity.  The following is how to do a fork
+bomb (be careful with this).
+
+  % stress -c 0
+
+For the -m and -d options, a value of 0 means to redo their operation an
+infinite number of times.  To allocate and free 128MB in a redo loop use the
+following command.  This can be useful for "bouncing" against the system RAM
+ceiling.
+
+  % stress -m 0 --hogvm-bytes 128M
+
+For the -m and -d options, a negative value of n means to redo the operation
+abs(n) times.  Here is now to allocate and free 5MB three times in a row.
+
+  % stress -m -3 --hogvm-bytes 5m
+
+You can write a file of arbitrary length to disk.  The file is created with
+mkstemp() in the current directory, the default is to unlink it, but
+unlinking can be overridden with the --hoghdd-noclean flag.
+
+  % stress -d 1 --hoghdd-noclean --hoghdd-bytes 13
+
+Large file support is enabled.
+
+  % stress -d 1 --hoghdd-noclean --hoghdd-bytes 3G
diff -urN kvm_runtest_2.bak/genload/stress.c kvm_runtest_2/genload/stress.c
--- kvm_runtest_2.bak/genload/stress.c	1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/stress.c	2007-07-26 08:40:17.000000000 -0400
@@ -0,0 +1,983 @@
+/* A program to put stress on a POSIX system (stress).
+ *
+ * Copyright (C) 2001, 2002 Amos Waterland <awaterl@yahoo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc., 59
+ * Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <ctype.h>
+#include <errno.h>
+#include <libgen.h>
+#include <math.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <signal.h>
+#include <time.h>
+#include <unistd.h>
+#include <sys/wait.h>
+
+/* By default, print all messages of severity info and above.  */
+static int global_debug = 2;
+
+/* By default, just print warning for non-critical errors.  */
+static int global_ignore = 1;
+
+/* By default, retry on non-critical errors every 50ms.  */
+static int global_retry = 50000;
+
+/* By default, use this as backoff coefficient for good fork throughput.  */
+static int global_backoff = 3000;
+
+/* By default, do not timeout.  */
+static int global_timeout = 0;
+
+/* Name of this program */
+static char *global_progname = PACKAGE;
+
+/* By default, do not hang after allocating memory.  */
+static int global_vmhang = 0;
+
+/* Implemention of runtime-selectable severity message printing.  */
+#define dbg if (global_debug >= 3) \
+            fprintf (stdout, "%s: debug: (%d) ", global_progname, __LINE__), \
+            fprintf
+#define out if (global_debug >= 2) \
+            fprintf (stdout, "%s: info: ", global_progname), \
+            fprintf
+#define wrn if (global_debug >= 1) \
+            fprintf (stderr, "%s: warn: (%d) ", global_progname, __LINE__), \
+            fprintf
+#define err if (global_debug >= 0) \
+            fprintf (stderr, "%s: error: (%d) ", global_progname, __LINE__), \
+            fprintf
+
+/* Implementation of check for option argument correctness.  */
+#define assert_arg(A) \
+          if (++i == argc || ((arg = argv[i])[0] == '-' && \
+              !isdigit ((int)arg[1]) )) \
+            { \
+              err (stderr, "missing argument to option '%s'\n", A); \
+              exit (1); \
+            }
+
+/* Prototypes for utility functions.  */
+int usage (int status);
+int version (int status);
+long long atoll_s (const char *nptr);
+long long atoll_b (const char *nptr);
+
+/* Prototypes for the worker functions.  */
+int hogcpu (long long forks);
+int hogio (long long forks);
+int hogvm (long long forks, long long chunks, long long bytes);
+int hoghdd (long long forks, int clean, long long files, long long bytes);
+
+int
+main (int argc, char **argv)
+{
+  int i, pid, children = 0, retval = 0;
+  long starttime, stoptime, runtime;
+
+  /* Variables that indicate which options have been selected.  */
+  int do_dryrun = 0;
+  int do_timeout = 0;
+  int do_cpu = 0;               /* Default to 1 fork. */
+  long long do_cpu_forks = 1;
+  int do_io = 0;                /* Default to 1 fork. */
+  long long do_io_forks = 1;
+  int do_vm = 0;                /* Default to 1 fork, 1 chunk of 256MB.  */
+  long long do_vm_forks = 1;
+  long long do_vm_chunks = 1;
+  long long do_vm_bytes = 256 * 1024 * 1024;
+  int do_hdd = 0;               /* Default to 1 fork, clean, 1 file of 1GB.  */
+  long long do_hdd_forks = 1;
+  int do_hdd_clean = 0;
+  long long do_hdd_files = 1;
+  long long do_hdd_bytes = 1024 * 1024 * 1024;
+
+  /* Record our start time.  */
+  if ((starttime = time (NULL)) == -1)
+    {
+      err (stderr, "failed to acquire current time\n");
+      exit (1);
+    }
+
+  /* SuSv3 does not define any error conditions for this function.  */
+  global_progname = basename (argv[0]);
+
+  /* For portability, parse command line options without getopt_long.  */
+  for (i = 1; i < argc; i++)
+    {
+      char *arg = argv[i];
+
+      if (strcmp (arg, "--help") == 0 || strcmp (arg, "-?") == 0)
+        {
+          usage (0);
+        }
+      else if (strcmp (arg, "--version") == 0)
+        {
+          version (0);
+        }
+      else if (strcmp (arg, "--verbose") == 0 || strcmp (arg, "-v") == 0)
+        {
+          global_debug = 3;
+        }
+      else if (strcmp (arg, "--quiet") == 0 || strcmp (arg, "-q") == 0)
+        {
+          global_debug = 0;
+        }
+      else if (strcmp (arg, "--dry-run") == 0 || strcmp (arg, "-n") == 0)
+        {
+          do_dryrun = 1;
+        }
+      else if (strcmp (arg, "--no-retry") == 0)
+        {
+          global_ignore = 0;
+          dbg (stdout, "turning off ignore of non-critical errors");
+        }
+      else if (strcmp (arg, "--retry-delay") == 0)
+        {
+          assert_arg ("--retry-delay");
+          global_retry = atoll (arg);
+          dbg (stdout, "setting retry delay to %dus\n", global_retry);
+        }
+      else if (strcmp (arg, "--backoff") == 0)
+        {
+          assert_arg ("--backoff");
+          global_backoff = atoll (arg);
+          if (global_backoff < 0)
+            {
+              err (stderr, "invalid backoff factor: %i\n", global_backoff);
+              exit (1);
+            }
+          dbg (stdout, "setting backoff coeffient to %dus\n", global_backoff);
+        }
+      else if (strcmp (arg, "--timeout") == 0 || strcmp (arg, "-t") == 0)
+        {
+          do_timeout = 1;
+          assert_arg ("--timeout");
+          global_timeout = atoll_s (arg);
+          dbg (stdout, "setting timeout to %ds\n", global_timeout);
+        }
+      else if (strcmp (arg, "--cpu") == 0 || strcmp (arg, "-c") == 0)
+        {
+          do_cpu = 1;
+          assert_arg ("--cpu");
+          do_cpu_forks = atoll_b (arg);
+        }
+      else if (strcmp (arg, "--io") == 0 || strcmp (arg, "-i") == 0)
+        {
+          do_io = 1;
+          assert_arg ("--io");
+          do_io_forks = atoll_b (arg);
+        }
+      else if (strcmp (arg, "--vm") == 0 || strcmp (arg, "-m") == 0)
+        {
+          do_vm = 1;
+          assert_arg ("--vm");
+          do_vm_forks = atoll_b (arg);
+        }
+      else if (strcmp (arg, "--vm-chunks") == 0)
+        {
+          assert_arg ("--vm-chunks");
+          do_vm_chunks = atoll_b (arg);
+        }
+      else if (strcmp (arg, "--vm-bytes") == 0)
+        {
+          assert_arg ("--vm-bytes");
+          do_vm_bytes = atoll_b (arg);
+        }
+      else if (strcmp (arg, "--vm-hang") == 0)
+        {
+          global_vmhang = 1;
+        }
+      else if (strcmp (arg, "--hdd") == 0 || strcmp (arg, "-d") == 0)
+        {
+          do_hdd = 1;
+          assert_arg ("--hdd");
+          do_hdd_forks = atoll_b (arg);
+        }
+      else if (strcmp (arg, "--hdd-noclean") == 0)
+        {
+          do_hdd_clean = 2;
+        }
+      else if (strcmp (arg, "--hdd-files") == 0)
+        {
+          assert_arg ("--hdd-files");
+          do_hdd_files = atoll_b (arg);
+        }
+      else if (strcmp (arg, "--hdd-bytes") == 0)
+        {
+          assert_arg ("--hdd-bytes");
+          do_hdd_bytes = atoll_b (arg);
+        }
+      else
+        {
+          err (stderr, "unrecognized option: %s\n", arg);
+          exit (1);
+        }
+    }
+
+  /* Hog CPU option.  */
+  if (do_cpu)
+    {
+      out (stdout, "dispatching %lli hogcpu forks\n", do_cpu_forks);
+
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          if (do_dryrun)
+            exit (0);
+          exit (hogcpu (do_cpu_forks));
+        case -1:               /* error */
+          err (stderr, "hogcpu dispatcher fork failed\n");
+          exit (1);
+        default:               /* parent */
+          children++;
+          dbg (stdout, "--> hogcpu dispatcher forked (%i)\n", pid);
+        }
+    }
+
+  /* Hog I/O option.  */
+  if (do_io)
+    {
+      out (stdout, "dispatching %lli hogio forks\n", do_io_forks);
+
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          if (do_dryrun)
+            exit (0);
+          exit (hogio (do_io_forks));
+        case -1:               /* error */
+          err (stderr, "hogio dispatcher fork failed\n");
+          exit (1);
+        default:               /* parent */
+          children++;
+          dbg (stdout, "--> hogio dispatcher forked (%i)\n", pid);
+        }
+    }
+
+  /* Hog VM option.  */
+  if (do_vm)
+    {
+      out (stdout,
+           "dispatching %lli hogvm forks, each %lli chunks of %lli bytes\n",
+           do_vm_forks, do_vm_chunks, do_vm_bytes);
+
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          if (do_dryrun)
+            exit (0);
+          exit (hogvm (do_vm_forks, do_vm_chunks, do_vm_bytes));
+        case -1:               /* error */
+          err (stderr, "hogvm dispatcher fork failed\n");
+          exit (1);
+        default:               /* parent */
+          children++;
+          dbg (stdout, "--> hogvm dispatcher forked (%i)\n", pid);
+        }
+    }
+
+  /* Hog HDD option.  */
+  if (do_hdd)
+    {
+      out (stdout, "dispatching %lli hoghdd forks, each %lli files of "
+           "%lli bytes\n", do_hdd_forks, do_hdd_files, do_hdd_bytes);
+
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          if (do_dryrun)
+            exit (0);
+          exit (hoghdd
+                (do_hdd_forks, do_hdd_clean, do_hdd_files, do_hdd_bytes));
+        case -1:               /* error */
+          err (stderr, "hoghdd dispatcher fork failed\n");
+          exit (1);
+        default:               /* parent */
+          children++;
+          dbg (stdout, "--> hoghdd dispatcher forked (%i)\n", pid);
+        }
+    }
+
+  /* We have no work to do, so bail out.  */
+  if (children == 0)
+    usage (0);
+
+  /* Wait for our children to exit.  */
+  while (children)
+    {
+      int status, ret;
+
+      if ((pid = wait (&status)) > 0)
+        {
+          if ((WIFEXITED (status)) != 0)
+            {
+              if ((ret = WEXITSTATUS (status)) != 0)
+                {
+                  err (stderr, "dispatcher %i returned error %i\n", pid, ret);
+                  retval += ret;
+                }
+              else
+                {
+                  dbg (stdout, "<-- dispatcher return (%i)\n", pid);
+                }
+            }
+          else
+            {
+              err (stderr, "dispatcher did not exit normally\n");
+              ++retval;
+            }
+
+          --children;
+        }
+      else
+        {
+          dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+          err (stderr, "detected missing dispatcher children\n");
+          ++retval;
+          break;
+        }
+    }
+
+  /* Record our stop time.  */
+  if ((stoptime = time (NULL)) == -1)
+    {
+      err (stderr, "failed to acquire current time\n");
+      exit (1);
+    }
+
+  /* Calculate our runtime.  */
+  runtime = stoptime - starttime;
+
+  /* Print final status message.  */
+  if (retval)
+    {
+      err (stderr, "failed run completed in %lis\n", runtime);
+    }
+  else
+    {
+      out (stdout, "successful run completed in %lis\n", runtime);
+    }
+
+  exit (retval);
+}
+
+int
+usage (int status)
+{
+  char *mesg =
+    "`%s' imposes certain types of compute stress on your system\n\n"
+    "Usage: %s [OPTION [ARG]] ...\n\n"
+    " -?, --help            show this help statement\n"
+    "     --version         show version statement\n"
+    " -v, --verbose         be verbose\n"
+    " -q, --quiet           be quiet\n"
+    " -n, --dry-run         show what would have been done\n"
+    "     --no-retry        exit rather than retry non-critical errors\n"
+    "     --retry-delay n   wait n us before continuing past error\n"
+    " -t, --timeout n       timeout after n seconds\n"
+    "     --backoff n       wait for factor of n us before starting work\n"
+    " -c, --cpu n           spawn n procs spinning on sqrt()\n"
+    " -i, --io n            spawn n procs spinning on sync()\n"
+    " -m, --vm n            spawn n procs spinning on malloc()\n"
+    "     --vm-chunks c     malloc c chunks (default is 1)\n"
+    "     --vm-bytes b      malloc chunks of b bytes (default is 256MB)\n"
+    "     --vm-hang         hang in a sleep loop after memory allocated\n"
+    " -d, --hdd n           spawn n procs spinning on write()\n"
+    "     --hdd-noclean     do not unlink file to which random data written\n"
+    "     --hdd-files f     write to f files (default is 1)\n"
+    "     --hdd-bytes b     write b bytes (default is 1GB)\n\n"
+    "Infinity is denoted with 0.  For -m, -d: n=0 means infinite redo,\n"
+    "n<0 means redo abs(n) times. Valid suffixes are s,m,h,d,y for time;\n"
+    "k,m,g for size.\n\n";
+
+  fprintf (stdout, mesg, global_progname, global_progname);
+
+  if (status <= 0)
+    exit (-1 * status);
+
+  return 0;
+}
+
+int
+version (int status)
+{
+  char *mesg = "%s %s\n";
+
+  fprintf (stdout, mesg, global_progname, VERSION);
+
+  if (status <= 0)
+    exit (-1 * status);
+
+  return 0;
+}
+
+/* Convert a string representation of a number with an optional size suffix
+ * to a long long.
+ */
+long long
+atoll_b (const char *nptr)
+{
+  int pos;
+  char suffix;
+  long long factor = 1;
+
+  if ((pos = strlen (nptr) - 1) < 0)
+    {
+      err (stderr, "invalid string\n");
+      exit (1);
+    }
+
+  switch (suffix = nptr[pos])
+    {
+    case 'k':
+    case 'K':
+      factor = 1024;
+      break;
+    case 'm':
+    case 'M':
+      factor = 1024 * 1024;
+      break;
+    case 'g':
+    case 'G':
+      factor = 1024 * 1024 * 1024;
+      break;
+    default:
+      if (suffix < '0' || suffix > '9')
+        {
+          err (stderr, "unrecognized suffix: %c\n", suffix);
+          exit (1);
+        }
+    }
+
+  factor = atoll (nptr) * factor;
+
+  return factor;
+}
+
+/* Convert a string representation of a number with an optional time suffix
+ * to a long long.
+ */
+long long
+atoll_s (const char *nptr)
+{
+  int pos;
+  char suffix;
+  long long factor = 1;
+
+  if ((pos = strlen (nptr) - 1) < 0)
+    {
+      err (stderr, "invalid string\n");
+      exit (1);
+    }
+
+  switch (suffix = nptr[pos])
+    {
+    case 's':
+    case 'S':
+      factor = 1;
+      break;
+    case 'm':
+    case 'M':
+      factor = 60;
+      break;
+    case 'h':
+    case 'H':
+      factor = 60 * 60;
+      break;
+    case 'd':
+    case 'D':
+      factor = 60 * 60 * 24;
+      break;
+    case 'y':
+    case 'Y':
+      factor = 60 * 60 * 24 * 365;
+      break;
+    default:
+      if (suffix < '0' || suffix > '9')
+        {
+          err (stderr, "unrecognized suffix: %c\n", suffix);
+          exit (1);
+        }
+    }
+
+  factor = atoll (nptr) * factor;
+
+  return factor;
+}
+
+int
+hogcpu (long long forks)
+{
+  long long i;
+  double d;
+  int pid, retval = 0;
+
+  /* Make local copies of global variables.  */
+  int ignore = global_ignore;
+  int retry = global_retry;
+  int timeout = global_timeout;
+  long backoff = global_backoff * forks;
+
+  dbg (stdout, "using backoff sleep of %lius for hogcpu\n", backoff);
+
+  for (i = 0; forks == 0 || i < forks; i++)
+    {
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          alarm (timeout);
+
+          /* Use a backoff sleep to ensure we get good fork throughput.  */
+          usleep (backoff);
+
+          while (1)
+            d = sqrt (rand ());
+
+          /* This case never falls through; alarm signal can cause exit.  */
+        case -1:               /* error */
+          if (ignore)
+            {
+              ++retval;
+              wrn (stderr, "hogcpu worker fork failed, continuing\n");
+              usleep (retry);
+              continue;
+            }
+
+          err (stderr, "hogcpu worker fork failed\n");
+          return 1;
+        default:               /* parent */
+          dbg (stdout, "--> hogcpu worker forked (%i)\n", pid);
+        }
+    }
+
+  /* Wait for our children to exit.  */
+  while (i)
+    {
+      int status, ret;
+
+      if ((pid = wait (&status)) > 0)
+        {
+          if ((WIFEXITED (status)) != 0)
+            {
+              if ((ret = WEXITSTATUS (status)) != 0)
+                {
+                  err (stderr, "hogcpu worker %i exited %i\n", pid, ret);
+                  retval += ret;
+                }
+              else
+                {
+                  dbg (stdout, "<-- hogcpu worker exited (%i)\n", pid);
+                }
+            }
+          else
+            {
+              dbg (stdout, "<-- hogcpu worker signalled (%i)\n", pid);
+            }
+
+          --i;
+        }
+      else
+        {
+          dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+          err (stderr, "detected missing hogcpu worker children\n");
+          ++retval;
+          break;
+        }
+    }
+
+  return retval;
+}
+
+int
+hogio (long long forks)
+{
+  long long i;
+  int pid, retval = 0;
+
+  /* Make local copies of global variables.  */
+  int ignore = global_ignore;
+  int retry = global_retry;
+  int timeout = global_timeout;
+  long backoff = global_backoff * forks;
+
+  dbg (stdout, "using backoff sleep of %lius for hogio\n", backoff);
+
+  for (i = 0; forks == 0 || i < forks; i++)
+    {
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          alarm (timeout);
+
+          /* Use a backoff sleep to ensure we get good fork throughput.  */
+          usleep (backoff);
+
+          while (1)
+            sync ();
+
+          /* This case never falls through; alarm signal can cause exit.  */
+        case -1:               /* error */
+          if (ignore)
+            {
+              ++retval;
+              wrn (stderr, "hogio worker fork failed, continuing\n");
+              usleep (retry);
+              continue;
+            }
+
+          err (stderr, "hogio worker fork failed\n");
+          return 1;
+        default:               /* parent */
+          dbg (stdout, "--> hogio worker forked (%i)\n", pid);
+        }
+    }
+
+  /* Wait for our children to exit.  */
+  while (i)
+    {
+      int status, ret;
+
+      if ((pid = wait (&status)) > 0)
+        {
+          if ((WIFEXITED (status)) != 0)
+            {
+              if ((ret = WEXITSTATUS (status)) != 0)
+                {
+                  err (stderr, "hogio worker %i exited %i\n", pid, ret);
+                  retval += ret;
+                }
+              else
+                {
+                  dbg (stdout, "<-- hogio worker exited (%i)\n", pid);
+                }
+            }
+          else
+            {
+              dbg (stdout, "<-- hogio worker signalled (%i)\n", pid);
+            }
+
+          --i;
+        }
+      else
+        {
+          dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+          err (stderr, "detected missing hogio worker children\n");
+          ++retval;
+          break;
+        }
+    }
+
+  return retval;
+}
+
+int
+hogvm (long long forks, long long chunks, long long bytes)
+{
+  long long i, j, k;
+  int pid, retval = 0;
+  char **ptr;
+
+  /* Make local copies of global variables.  */
+  int ignore = global_ignore;
+  int retry = global_retry;
+  int timeout = global_timeout;
+  long backoff = global_backoff * forks;
+
+  dbg (stdout, "using backoff sleep of %lius for hogvm\n", backoff);
+
+  if (bytes == 0)
+    {
+      /* 512MB is a guess at the largest value that can be malloced at once.  */
+      bytes = 512 * 1024 * 1024;
+    }
+
+  for (i = 0; forks == 0 || i < forks; i++)
+    {
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          alarm (timeout);
+
+          /* Use a backoff sleep to ensure we get good fork throughput.  */
+          usleep (backoff);
+
+          while (1)
+            {
+              ptr = (char **) malloc (chunks * sizeof (char *));
+              for (j = 0; chunks == 0 || j < chunks; j++)
+                {
+                  if ((ptr[j] = (char *) malloc (bytes * sizeof (char))))
+                    {
+                      for (k = 0; k < bytes; k++)
+                        ptr[j][k] = 'Z';   /* Ensure that COW happens.  */
+                      dbg (stdout, "hogvm worker malloced %lli bytes\n", k);
+                    }
+                  else if (ignore)
+                    {
+                      ++retval;
+                      wrn (stderr, "hogvm malloc failed, continuing\n");
+                      usleep (retry);
+                      continue;
+                    }
+                  else
+                    {
+                      ++retval;
+                      err (stderr, "hogvm malloc failed\n");
+                      break;
+                    }
+                }
+              if (global_vmhang && retval == 0)
+                {
+                  dbg (stdout, "sleeping forever with allocated memory\n");
+                  while (1)
+                    sleep (1024);
+                }
+              if (retval == 0)
+                {
+                  dbg (stdout,
+                       "hogvm worker freeing memory and starting over\n");
+                  for (j = 0; chunks == 0 || j < chunks; j++) {
+                      free (ptr[j]);
+                  }
+                  free(ptr);
+                  continue;
+                }
+
+              exit (retval);
+            }
+
+          /* This case never falls through; alarm signal can cause exit.  */
+        case -1:               /* error */
+          if (ignore)
+            {
+              ++retval;
+              wrn (stderr, "hogvm worker fork failed, continuing\n");
+              usleep (retry);
+              continue;
+            }
+
+          err (stderr, "hogvm worker fork failed\n");
+          return 1;
+        default:               /* parent */
+          dbg (stdout, "--> hogvm worker forked (%i)\n", pid);
+        }
+    }
+
+  /* Wait for our children to exit.  */
+  while (i)
+    {
+      int status, ret;
+
+      if ((pid = wait (&status)) > 0)
+        {
+          if ((WIFEXITED (status)) != 0)
+            {
+              if ((ret = WEXITSTATUS (status)) != 0)
+                {
+                  err (stderr, "hogvm worker %i exited %i\n", pid, ret);
+                  retval += ret;
+                }
+              else
+                {
+                  dbg (stdout, "<-- hogvm worker exited (%i)\n", pid);
+                }
+            }
+          else
+            {
+              dbg (stdout, "<-- hogvm worker signalled (%i)\n", pid);
+            }
+
+          --i;
+        }
+      else
+        {
+          dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+          err (stderr, "detected missing hogvm worker children\n");
+          ++retval;
+          break;
+        }
+    }
+
+  return retval;
+}
+
+int
+hoghdd (long long forks, int clean, long long files, long long bytes)
+{
+  long long i, j;
+  int fd, pid, retval = 0;
+  int chunk = (1024 * 1024) - 1;        /* Minimize slow writing.  */
+  char buff[chunk];
+
+  /* Make local copies of global variables.  */
+  int ignore = global_ignore;
+  int retry = global_retry;
+  int timeout = global_timeout;
+  long backoff = global_backoff * forks;
+
+  /* Initialize buffer with some random ASCII data.  */
+  dbg (stdout, "seeding buffer with random data\n");
+  for (i = 0; i < chunk - 1; i++)
+    {
+      j = rand ();
+      j = (j < 0) ? -j : j;
+      j %= 95;
+      j += 32;
+      buff[i] = j;
+    }
+  buff[i] = '\n';
+
+  dbg (stdout, "using backoff sleep of %lius for hoghdd\n", backoff);
+
+  for (i = 0; forks == 0 || i < forks; i++)
+    {
+      switch (pid = fork ())
+        {
+        case 0:                /* child */
+          alarm (timeout);
+
+          /* Use a backoff sleep to ensure we get good fork throughput.  */
+          usleep (backoff);
+
+          while (1)
+            {
+              for (i = 0; i < files; i++)
+                {
+                  char name[] = "./stress.XXXXXX";
+
+                  if ((fd = mkstemp (name)) < 0)
+                    {
+                      perror ("mkstemp");
+                      err (stderr, "mkstemp failed\n");
+                      exit (1);
+                    }
+
+                  if (clean == 0)
+                    {
+                      dbg (stdout, "unlinking %s\n", name);
+                      if (unlink (name))
+                        {
+                          err (stderr, "unlink failed\n");
+                          exit (1);
+                        }
+                    }
+
+                  dbg (stdout, "fast writing to %s\n", name);
+                  for (j = 0; bytes == 0 || j + chunk < bytes; j += chunk)
+                    {
+                      if (write (fd, buff, chunk) != chunk)
+                        {
+                          err (stderr, "write failed\n");
+                          exit (1);
+                        }
+                    }
+
+                  dbg (stdout, "slow writing to %s\n", name);
+                  for (; bytes == 0 || j < bytes - 1; j++)
+                    {
+                      if (write (fd, "Z", 1) != 1)
+                        {
+                          err (stderr, "write failed\n");
+                          exit (1);
+                        }
+                    }
+                  if (write (fd, "\n", 1) != 1)
+                    {
+                      err (stderr, "write failed\n");
+                      exit (1);
+                    }
+                  ++j;
+
+                  dbg (stdout, "closing %s after writing %lli bytes\n", name,
+                       j);
+                  close (fd);
+
+                  if (clean == 1)
+                    {
+                      if (unlink (name))
+                        {
+                          err (stderr, "unlink failed\n");
+                          exit (1);
+                        }
+                    }
+                }
+              if (retval == 0)
+                {
+                  dbg (stdout, "hoghdd worker starting over\n");
+                  continue;
+                }
+
+              exit (retval);
+            }
+
+          /* This case never falls through; alarm signal can cause exit.  */
+        case -1:               /* error */
+          if (ignore)
+            {
+              ++retval;
+              wrn (stderr, "hoghdd worker fork failed, continuing\n");
+              usleep (retry);
+              continue;
+            }
+
+          err (stderr, "hoghdd worker fork failed\n");
+          return 1;
+        default:               /* parent */
+          dbg (stdout, "--> hoghdd worker forked (%i)\n", pid);
+        }
+    }
+
+  /* Wait for our children to exit.  */
+  while (i)
+    {
+      int status, ret;
+
+      if ((pid = wait (&status)) > 0)
+        {
+          if ((WIFEXITED (status)) != 0)
+            {
+              if ((ret = WEXITSTATUS (status)) != 0)
+                {
+                  err (stderr, "hoghdd worker %i exited %i\n", pid, ret);
+                  retval += ret;
+                }
+              else
+                {
+                  dbg (stdout, "<-- hoghdd worker exited (%i)\n", pid);
+                }
+            }
+          else
+            {
+              dbg (stdout, "<-- hoghdd worker signalled (%i)\n", pid);
+            }
+
+          --i;
+        }
+      else
+        {
+          dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+          err (stderr, "detected missing hoghdd worker children\n");
+          ++retval;
+          break;
+        }
+    }
+
+  return retval;
+}
diff -urN kvm_runtest_2.bak/kvm_runtest_2.py kvm_runtest_2/kvm_runtest_2.py
--- kvm_runtest_2.bak/kvm_runtest_2.py	2009-04-29 06:17:29.000000000 -0400
+++ kvm_runtest_2/kvm_runtest_2.py	2009-04-29 08:06:32.000000000 -0400
@@ -36,6 +36,8 @@
                 "autotest":     test_routine("kvm_tests",           "run_autotest"),
                 "kvm_install":  test_routine("kvm_install",         "run_kvm_install"),
                 "linux_s3":     test_routine("kvm_tests",           "run_linux_s3"),
+                "ntp_server_setup": test_routine("kvm_tests",       "run_ntp_server_setup"),
+                "timedrift":    test_routine("kvm_tests",           "run_timedrift"),
                 }
 
         # Make it possible to import modules from the test's bindir
diff -urN kvm_runtest_2.bak/kvm_tests.cfg.sample kvm_runtest_2/kvm_tests.cfg.sample
--- kvm_runtest_2.bak/kvm_tests.cfg.sample	2009-04-29 06:17:29.000000000 -0400
+++ kvm_runtest_2/kvm_tests.cfg.sample	2009-05-12 04:48:51.000000000 -0400
@@ -81,6 +81,11 @@
     - linux_s3:      install setup
         type = linux_s3
 
+    - ntp_server_setup:
+        type = ntp_server_setup
+    - timedrift:      ntp_server_setup
+        type = timedrift
+        stress_test_specify = './stress -c 10 --timeout 15m'
 # NICs
 variants:
     - @rtl8139:
diff -urN kvm_runtest_2.bak/kvm_tests.py kvm_runtest_2/kvm_tests.py
--- kvm_runtest_2.bak/kvm_tests.py	2009-04-29 06:17:29.000000000 -0400
+++ kvm_runtest_2/kvm_tests.py	2009-05-12 08:16:11.000000000 -0400
@@ -394,3 +394,263 @@
     kvm_log.info("VM resumed after S3")
 
     session.close()
+
+def run_ntp_server_setup(test, params, env):
+    
+    """NTP server configuration and related network file modification
+    """
+
+    kvm_log.info("stop the iptables service if it is running for timedrift testing")
+
+    #if not os.system("/etc/init.d/iptables status"):
+    #    os.system("/etc/init.d/iptables stop")
+    if utils.system("/etc/init.d/iptables status", ignore_status=True) == 0:
+        utils.system("/etc/init.d/iptables stop", ignore_status=True)
+
+
+    # prevent the DHCP client from modifying ntp.conf
+    kvm_log.info("prevent the DHCP client from modifying ntp.conf")
+
+    config_file = "/etc/sysconfig/network"
+    network_file = open(config_file, "a")
+    string = "PEERNTP=no"
+
+    # append the setting only if grep does not find it already
+    if utils.system("grep %s %s" % (string, config_file), ignore_status=True) != 0:
+        network_file.write(string + '\n')
+    network_file.close()
+  
+    # stop the ntp service if it is running
+    kvm_log.info("stop ntp service if it is running")
+
+    #if not os.system("/etc/init.d/ntpd status"):
+    if utils.system("/etc/init.d/ntpd status", ignore_status=True) == 0:
+        utils.system("/etc/init.d/ntpd stop", ignore_status=True)
+    #    os.system("/etc/init.d/ntpd stop")
+    #    ntp_running = True
+
+    kvm_log.info("start ntp server on host with the custom config file.")
+
+    ntp_cmd = '''
+        echo "restrict default kod nomodify notrap nopeer noquery" > /etc/timedrift.ntp.conf;\
+        echo "restrict 127.0.0.1" >> /etc/timedrift.ntp.conf;\
+        echo "driftfile /var/lib/ntp/drift" >> /etc/timedrift.ntp.conf;\
+        echo "keys /etc/ntp/keys" >> /etc/timedrift.ntp.conf;\
+        echo "server 127.127.1.0" >> /etc/timedrift.ntp.conf;\
+        echo "fudge 127.127.1.0 stratum 1" >> /etc/timedrift.ntp.conf;\
+        ntpd -c /etc/timedrift.ntp.conf;
+        '''
+    #if os.system(ntp_cmd):
+    if utils.system(ntp_cmd, ignore_status=True) != 0:
+        raise error.TestFail, "NTP server has not started correctly..."
+
+    #kvm_log.info("sync system clock to BIOS")
+    #os.system("/sbin/hwclock --systohc")
+   
+def run_timedrift(test, params, env):
+    """Judge whether the guest clock suffers from time drift. Three stages:
+       1: sync the guest clock with the host if its offset is 1 sec or more.
+       2: run the CPU stress program (genload) on the guest.
+       3: run the analysis loop 20 times to determine whether the guest clock has drifted.
+    """
+    # variables used in the timedrift test case
+    #cpu_stress_program = 'cpu_stress.c'
+    cpu_stress_program = 'genload'
+    #remote_dir = '/root'
+    remote_dir = '/'
+
+    clock_resource_cmd = 'cat /sys/devices/system/clocksource/clocksource0/current_clocksource'
+
+    pwd = os.path.join(os.environ['AUTODIR'],'tests/kvm_runtest_2')
+    stress_test_dir = os.path.join(pwd, cpu_stress_program)
+
+    stress_test_specify = params.get('stress_test_specify')
+    kvm_log.info("stress command is :%s" % stress_test_specify)
+    #stress_cmdline = 'cd %s;gcc %s -lm;./a.out &' % (remote_dir, os.path.basename(stress_test_dir))
+    stress_cmdline = 'cd %s%s;make;%s &' % (remote_dir, os.path.basename(stress_test_dir), stress_test_specify)
+
+    stress_search_cmdline = "ps -ef|grep 'stress'|grep -v grep"
+
+    #hostname = os.environ.get("HOSTNAME")
+    hostname = socket.gethostname()
+    if "localhost.localdomain" == hostname:
+        hostname = os.popen('hostname').read().split('\n')[0]
+        kvm_log.info("socket.gethostname() returned localhost.localdomain; using the output of the hostname command instead.")
+
+    kvm_log.info("get host name :%s" % hostname)
+
+    # ntpdate info command and ntpdate sync command
+    ntpdate_info_cmd = "ntpdate -q %s" % hostname
+    ntpdate_sync_cmd = "ntpdate %s" % hostname
+
+    # get vm handle
+    vm = kvm_utils.env_get_vm(env,params.get("main_vm"))
+    if not vm:
+        raise error.TestError, "VM object not found in environment"
+    if not vm.is_alive():
+        raise error.TestError, "VM seems to be dead; Test requires a living VM"
+
+    kvm_log.info("Waiting for guest to be up...")
+
+    pxssh = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
+    if not pxssh:
+        raise error.TestFail, "Could not log into guest"
+
+    kvm_log.info("Logged into guest IN run_timedrift function.")
+
+    # clock resource get from host and guest
+    kvm_log.info("****************** clock resource info ******************")
+    host_clock_resource = os.popen(clock_resource_cmd).read().split('\n')[0]
+    kvm_log.info("the clock resource on HOST is :%s" % host_clock_resource)
+
+    pxssh.sendline(clock_resource_cmd)
+    s, o = pxssh.read_up_to_prompt()
+    guest_clock_resource = o.splitlines()[-2]
+    kvm_log.info("the clock resource on guest is :%s" % guest_clock_resource)
+    kvm_log.info("*********************************************************")
+
+    if host_clock_resource != guest_clock_resource:
+        #raise error.TestFail, "Host and Guest using different clock resource"
+        kvm_log.info("Host and guest use different clock sources; moving on.")
+    else:
+        kvm_log.info("Host and guest use the same clock source; moving on.")
+
+    # helper functions:
+    # ntpdate_op: log into the guest, run an ntpdate command and return its output.
+    # time_drift_or_not: extract the offset with a regular expression and check it for drift.
+    def ntpdate_op(command):
+        output = ""
+        # log in outside the try block so a failed login is reported as such
+        session = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
+        if not session:
+            raise error.TestFail, "Could not log into guest"
+
+        kvm_log.info("Logged in:(ntpdate_op)")
+
+        try:
+            while True:
+                session.sendline(command)
+                s, output = session.read_up_to_prompt()
+                if "time server" in output:
+                    # output is a string containing the ntpdate result from the guest
+                    session.close()
+                    return True, output
+        except Exception:
+            session.close()
+            return False, output
+        return False, output
+
+    def time_drift_or_not(output):
+        date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
+        num = float(date_string[0])
+        if num >= 1:
+            kvm_log.info("guest clock has drifted in this scenario :%s %s" % (date_string, num))
+            return False
+        else:
+            kvm_log.info("guest clock is still accurate at this stage :%s %s" % (date_string, num))
+            return True
+
+    # send the query command and read the output from the guest.
+    # this loop handles the different kinds of output we may get:
+    # we want the line matching "time server", which the script then analyzes to
+    # determine whether the guest clock needs to be synced with the host.
+    while True:
+        pxssh.sendline(ntpdate_info_cmd)
+        s, output = pxssh.read_up_to_prompt()
+        kvm_log.info("ntpdate query output from the guest: \n%s" % output)
+        if ("no server suitable" not in output) and ("time server" not in output):
+            kvm_log.info("unexpected output received; trying again")
+            continue
+        elif "no server suitable" in output:
+            kvm_log.info("NTP server does not seem to be ready yet")
+            time.sleep(30)
+            continue
+        elif "time server" in output:
+            # get the ntpdate info from guest
+            # kvm_log.info("Got the correct output for analyze. The output is below: \n%s" %output) 
+            break
+
+    kvm_log.info("got the ntpdate information from guest successfully :%s" % os.popen('date').read())
+
+    # decide whether the guest clock needs to be synced with the host
+    while True:
+        date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
+        num = float(date_string[0])
+        if num >= 1:
+            kvm_log.info("guest needs to sync with the server: %s" % hostname)
+            s, output = ntpdate_op(ntpdate_sync_cmd)
+            if s:
+                continue
+        else:
+            #pxssh.sendline("hwclock --systohc")
+            #kvm_log.info("guest clock sync procedure is finished. then sync the guest clock to guest bios.")
+
+            #pxssh.sendline("hwclock --show")
+            #s, o = pxssh.read_up_to_prompt()
+            #kvm_log.info("the date information got from guest bios is :\n%s" % o)
+
+            pxssh.sendline(ntpdate_info_cmd)
+            s, o = pxssh.read_up_to_prompt()
+            kvm_log.info("guest clock after sync with host is :\n%s" % o)
+
+            break
+
+    kvm_log.info("Timedrift Preparation *Finished* at last :%s" % os.popen('date').read())
+
+    if not vm.scp_to_remote(stress_test_dir, remote_dir):
+        raise error.TestError, "Could not copy program to guest."
+
+    pxssh.sendline(ntpdate_info_cmd)
+    s, o = pxssh.read_up_to_prompt()
+    kvm_log.info("the ntpdate query from host *BEFORE* running the cpu stress program.\n%s" % o)
+    pxssh.sendline(stress_cmdline)
+    s, o = pxssh.read_up_to_prompt()
+    kvm_log.info("running command line on guest and sleeping for 1200 secs.\n%s" % o)
+
+    time.sleep(1200)
+
+    while True:
+        if pxssh.get_command_status(stress_search_cmdline):
+            #(s, o) = pxssh.get_command_status_output(stress_search_cmdline)
+            #print "s is :%s" % s
+            #print "o is :%s" % o
+            #print "--------------------------------------------"
+            #aaa = pxssh.get_command_status(stress_search_cmdline)
+            #print "aaa is :%s" % aaa
+            #print "--------------------------------------------"
+
+            kvm_log.info("stress testing process has completed and quit.")
+            break
+        else:
+            kvm_log.info("stress testing on CPU has not finished yet; checking again after sleeping 60 secs.")
+            time.sleep(60)
+            continue
+
+    pxssh.sendline(ntpdate_info_cmd)
+    s, o = pxssh.read_up_to_prompt()
+    kvm_log.info("the ntpdate query from host *AFTER* running the cpu stress program.\n%s" % o)
+
+    pxssh.close()
+
+    # Sleep for analyze...
+    kvm_log.info("sleeping (180 secs) before analysis, starting at :%s" % os.popen('date').read())
+    time.sleep(180)
+    kvm_log.info("woke up; starting the analysis at :%s" % os.popen('date').read())
+    count = 0
+    for i in range(1, 21):
+        kvm_log.info("clock query number %s from guest." % i)
+        s, o = ntpdate_op(ntpdate_info_cmd)
+        
+        if not s:
+            raise error.TestFail, "Guest seems to hang or the ssh service on the guest has crashed"
+        
+        if not time_drift_or_not(o):
+            count += 1
+
+        if count == 5:
+            raise error.TestFail, "Timedrift test aborted because the guest clock has drifted too much"
+
+        kvm_log.info("*********************** Sleep 60 seconds for next loop *************************")
+        time.sleep(60)
+
+
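
A note on the offset check in time_drift_or_not: it can be tried outside
the test harness with a small standalone sketch like the one below. This
is not part of the patch. The sample ntpdate output, the helper names
parse_offset and has_drifted, and the stricter regular expression are
all assumptions made for illustration; the real ntpdate output may
differ between versions. Unlike the expression in the patch, this one
keeps the sign and only accepts a plain decimal number.

```python
import re

# Assumed sample of `ntpdate -q` output (hypothetical address and pid);
# the exact layout may differ between ntp versions.
SAMPLE = (
    "server 192.168.122.1, stratum 1, offset -1.250000, delay 0.02576\n"
    "12 May 21:07:24 ntpdate[4242]: step time server 192.168.122.1 "
    "offset -1.250000 sec\n"
)

# Capture a signed decimal number that is directly followed by " sec",
# which only the summary line carries.
OFFSET_RE = re.compile(r"offset\s+([+-]?\d+(?:\.\d+)?)\s+sec")


def parse_offset(output):
    """Return the signed offset in seconds, or None if it is not found."""
    match = OFFSET_RE.search(output)
    if match is None:
        return None
    return float(match.group(1))


def has_drifted(output, threshold=1.0):
    """Return True when the absolute offset reaches the threshold.

    The 1 sec default mirrors the limit used in the patch.
    """
    offset = parse_offset(output)
    if offset is None:
        raise ValueError("no offset found in ntpdate output")
    return abs(offset) >= threshold
```

Returning None for unparsable output lets the caller tell "no answer
from the server" apart from "no drift", which the float(date_string[0])
call in the patch cannot do.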

Thread overview: 13+ messages
2009-05-06  4:02 [KVM-AUTOTEST][PATCH] timedrift support Bear Yang
2009-05-06 13:02 ` Marcelo Tosatti
2009-05-11 10:40   ` Bear Yang
2009-05-11 11:05     ` Yaniv Kaul
2009-05-11 12:59     ` Lucas Meneghel Rodrigues
2009-05-12 12:31       ` Bear Yang
2009-05-12 13:07         ` Bear Yang [this message]
2009-05-13 18:34           ` Lucas Meneghel Rodrigues
2009-05-15 16:30             ` bear
2009-05-16 20:36               ` Yaniv Kaul
2009-05-18  1:54                 ` Bear Yang
2009-05-18  4:20                   ` Yaniv Kaul
2009-05-18  5:18                     ` Bear Yang
