From: Bear Yang <byang@redhat.com>
To: Lucas Meneghel Rodrigues <mrodrigu@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
uril@redhat.com, kvm@vger.kernel.org
Subject: Re: [KVM-AUTOTEST][PATCH] timedrift support
Date: Tue, 12 May 2009 21:07:24 +0800
Message-ID: <4A09748C.6040909@redhat.com>
In-Reply-To: <4A096C28.1060900@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 17786 bytes --]
Sorry, I forgot to attach my new patch.
Bear Yang wrote:
> Hi Lucas:
> First, thank you very much for your kind, careful comments and
> suggestions. I have now modified my scripts according to your feedback.
> 1. Added genload to timedrift. I am not sure whether it is right to
> include the CVS metadata; if it is not necessary, I will remove it
> next time.
> 2. Replaced os.system with utils.system.
> 3. Replaced os.environ.get('HOSTNAME') with socket.gethostname().
> 4. For the code snippet below:
> + if utils.system(ntp_cmd, ignore_status=True) != 0:
> + raise error.TestFail, "NTP server has not starting correctly..."
>
> Your suggestion was "Instead of the if clause we'd put a try/except
> block", but it is not clear to me how to do that. Could you please give
> me some guidance on this? Sorry.
>
> The other point is about factoring out the code that gets the VM handle:
>
> + # get vm handle
> + vm = kvm_utils.env_get_vm(env,params.get("main_vm"))
> + if not vm:
> + raise error.TestError, "VM object not found in environment"
> + if not vm.is_alive():
> + raise error.TestError, "VM seems to be dead; Test requires a
> living VM"
>
> I agree with you on this point. I remember that somebody did this
> before, but it seems upstream did not accept his modification.
>
> Have a good day
>
> thanks.
>
>
> Lucas Meneghel Rodrigues wrote:
>> On Mon, 2009-05-11 at 18:40 +0800, Bear Yang wrote:
>>
>>> Hello.
>>> I have modified my script according to Marcelo's suggestion and am
>>> resubmitting it to you all. :)
>>>
>>> Marcelo, it seems that no one except you cares about my script. I still
>>> want to say that any suggestion on it would be greatly appreciated.
>>>
>>> Thanks.
>>>
>>> Bear
>>>
>>>
>>
>> Hi Bear, sorry, I had some hectic days here so I still haven't reviewed
>> your patch.
>> As a general comment, I realize that on several occasions we are using
>> os.system() to execute commands on the host, when we would usually
>> prefer the utils.system() or utils.run() API, since it already
>> throws an exception when the exit code != 0 (you can always set
>> ignore_status=True to avoid this behavior if needed) and we are working
>> on better handling of stdout and stderr upstream.
>>
>> My comments follow:
>>
>> diff -urN kvm_runtest_2.bak/cpu_stress.c kvm_runtest_2/cpu_stress.c
>> --- kvm_runtest_2.bak/cpu_stress.c 1969-12-31 19:00:00.000000000
>> -0500
>> +++ kvm_runtest_2/cpu_stress.c 2009-05-05 22:35:34.000000000 -0400
>> @@ -0,0 +1,61 @@
>> +#define _GNU_SOURCE
>> +#include <stdio.h>
>> +#include <pthread.h>
>> +#include <sched.h>
>> +#include <stdlib.h>
>> +#include <fcntl.h>
>> +#include <math.h>
>> +#include <unistd.h>
>> +
>> +#define MAX_CPUS 256
>> +#define BUFFSIZE 1024
>> +
>> +
>> +void worker_child(int cpu)
>> +{
>> + int cur_freq;
>> + int min_freq;
>> + int max_freq;
>> + int last_freq;
>> + cpu_set_t mask;
>> + int i;
>> + double x;
>> + int d = 0;
>> + /*
>> + * bind this thread to the specified cpu
>> + */
>> + CPU_ZERO(&mask);
>> + CPU_SET(cpu, &mask);
>> + sched_setaffinity(0, CPU_SETSIZE, &mask);
>> +
>> + while (d++ != 500000) {
>> + for (i=0; i<100000; i++)
>> + x = sqrt(x);
>> + }
>> +
>> + _exit(0);
>> +
>> +}
>> +
>> +
>> +main() {
>> + cpu_set_t mask;
>> + int i;
>> + int code;
>> +
>> + if (sched_getaffinity(0, CPU_SETSIZE, &mask) < 0){
>> + perror ("sched_getaffinity");
>> + exit(1);
>> + }
>> +
>> + for (i=0; i<CPU_SETSIZE; i++)
>> + if (CPU_ISSET(i, &mask)){
>> + printf ("CPU%d\n",i);
>> + if (fork() == 0)
>> + worker_child(i);
>> + }
>> +
>> +
>> + wait(&code);
>> + exit (WEXITSTATUS(code));
>> +}
>>
>> I believe we might want to use a more complete stress program, one that
>> can also do I/O stress and put memory pressure on the host system. When I
>> need to stress a host, I end up hacking the stress.c program from LTP,
>> because it can do memory and I/O stress as well.
>> I will send you the stress.c program in a separate e-mail.
>>
>> diff -urN kvm_runtest_2.bak/kvm_runtest_2.py
>> kvm_runtest_2/kvm_runtest_2.py
>> --- kvm_runtest_2.bak/kvm_runtest_2.py 2009-04-29
>> 06:17:29.000000000 -0400
>> +++ kvm_runtest_2/kvm_runtest_2.py 2009-04-29 08:06:32.000000000
>> -0400
>> @@ -36,6 +36,8 @@
>> "autotest": test_routine("kvm_tests",
>> "run_autotest"),
>> "kvm_install": test_routine("kvm_install",
>> "run_kvm_install"),
>> "linux_s3": test_routine("kvm_tests",
>> "run_linux_s3"),
>> + "ntp_server_setup": test_routine("kvm_tests",
>> "run_ntp_server_setup"),
>> + "timedrift": test_routine("kvm_tests",
>> "run_timedrift"),
>> }
>>
>> # Make it possible to import modules from the test's bindir
>> diff -urN kvm_runtest_2.bak/kvm_tests.cfg.sample
>> kvm_runtest_2/kvm_tests.cfg.sample
>> --- kvm_runtest_2.bak/kvm_tests.cfg.sample 2009-04-29
>> 06:17:29.000000000 -0400
>> +++ kvm_runtest_2/kvm_tests.cfg.sample 2009-04-29
>> 08:09:36.000000000 -0400
>> @@ -81,6 +81,10 @@
>> - linux_s3: install setup
>> type = linux_s3
>>
>> + - ntp_server_setup:
>> + type = ntp_server_setup
>> + - timedrift: ntp_server_setup
>> + type = timedrift
>> # NICs
>> variants:
>> - @rtl8139:
>> diff -urN kvm_runtest_2.bak/kvm_tests.py kvm_runtest_2/kvm_tests.py
>> --- kvm_runtest_2.bak/kvm_tests.py 2009-04-29 06:17:29.000000000
>> -0400
>> +++ kvm_runtest_2/kvm_tests.py 2009-05-11 06:00:32.000000000 -0400
>> @@ -394,3 +394,247 @@
>> kvm_log.info("VM resumed after S3")
>>
>> session.close()
>> +
>> +def run_ntp_server_setup(test, params, env):
>> +
>> + """NTP server configuration and related network file
>> modification
>> + """
>> +
>> + kvm_log.info("stop the iptables service if it is running for
>> timedrift testing")
>> +
>> + if not os.system("/etc/init.d/iptables status"):
>> + os.system("/etc/init.d/iptables stop")
>> +
>> + # prevent dhcp client modify the ntp.conf
>> + kvm_log.info("prevent dhcp client modify the ntp.conf")
>> +
>> + config_file = "/etc/sysconfig/network"
>> + network_file = open("/etc/sysconfig/network", "a")
>> + string = "PEERNTP=no"
>> +
>> + if os.system("grep %s %s" % (string, config_file)):
>> + network_file.writelines(str(string)+'\n')
>> +
>> + network_file.close()
>> +
>> + # stop the ntp service if it is running
>> + kvm_log.info("stop ntp service if it is running")
>> +
>> + if not os.system("/etc/init.d/ntpd status"):
>> + os.system("/etc/init.d/ntpd stop")
>> + ntp_running = True
>> +
>> + kvm_log.info("start ntp server on host with the custom config
>> file.")
>> +
>> + ntp_cmd = '''
>> + echo "restrict default kod nomodify notrap nopeer noquery"
>> >> /etc/timedrift.ntp.conf;\
>> + echo "restrict 127.0.0.1" >> /etc/timedrift.ntp.conf;\
>> + echo "driftfile /var/lib/ntp/drift" >>
>> /etc/timedrift.ntp.conf;\
>> + echo "keys /etc/ntp/keys" >> /etc/timedrift.ntp.conf;\
>> + echo "server 127.127.1.0" >> /etc/timedrift.ntp.conf;\
>> + echo "fudge 127.127.1.0 stratum 1" >> /etc/timedrift.ntp.conf;\
>> + ntpd -c /etc/timedrift.ntp.conf;
>> + '''
>> + if os.system(ntp_cmd):
>> + raise error.TestFail, "NTP server has not starting correct..."
>>
>> Here you could have used the regular utils.system API instead of
>> os.system, since it integrates better with the autotest infrastructure.
>> Instead of the if clause we'd put a try/except block. Minor
>> nitpicking: "NTP server has not started correctly..."
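
The try/except form suggested here could look roughly like the sketch
below. It is only an illustration under stated assumptions: it uses the
standard library's subprocess.check_call as a stand-in for utils.system
(both raise when the command exits non-zero), and TestFail and
start_ntp_server are hypothetical names, not part of the patch. In the
autotest tree the except clause would presumably catch the exception
raised by utils.system and re-raise error.TestFail instead.

```python
import subprocess


class TestFail(Exception):
    """Hypothetical stand-in for autotest's error.TestFail."""


def start_ntp_server(ntp_cmd):
    """Run ntp_cmd through the shell; turn a non-zero exit into TestFail."""
    try:
        # check_call raises CalledProcessError when the exit status is
        # non-zero, so no explicit "if status != 0" test is needed.
        subprocess.check_call(ntp_cmd, shell=True)
    except subprocess.CalledProcessError as e:
        raise TestFail("NTP server has not started correctly "
                       "(exit status %d)" % e.returncode)
```

The point of the design is that the failure path lives in one except
clause, and the exit status is still available for the error message.
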
>>
>> + #kvm_log.info("sync system clock to BIOS")
>> + #os.system("/sbin/hwclock --systohc")
>> +
>> +def run_timedrift(test, params, env):
>> + """judge wether the guest clock will encounter timedrift prblem
>> or not. including three stages:
>>
>> Typo, "whether"
>>
>> + 1: try to sync the clock with host, if the offset value of
>> guest clock is large than 1 sec.
>> + 2: running the cpu stress testing program<cpu_stress.c> on guest
>> + 3: then run analyze loop totally 20 times to determine if the
>> clock on guest has time drift.
>> + """
>> + # variables using in timedrift testcase
>> + cpu_stress_program = "cpu_stress.c"
>> + remote_dir = '/root'
>> +
>> + clock_resource_cmd = "cat
>> /sys/devices/system/clocksource/clocksource0/current_clocksource"
>> +
>> + pwd = os.path.join(os.environ['AUTODIR'],'tests/kvm_runtest_2')
>> + cpu_stress_test = os.path.join(pwd, cpu_stress_program)
>> + cpu_stress_cmdline = 'cd %s;gcc %s -lm;./a.out &' % (remote_dir,
>> os.path.basename(cpu_stress_test))
>> +
>> + cpu_stress_search_cmdline = "ps -ef|grep 'a.out'|grep -v grep"
>> +
>> + hostname = os.environ.get("HOSTNAME")
>>
>> Can't we use socket.gethostname() here instead of relying on
>> environment variable values?
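
A minimal sketch of that idea, assuming the placeholder name
"localhost.localdomain" is the only case that needs a fallback.
resolve_hostname is a hypothetical helper, and falling back to
socket.getfqdn() is an assumption of this sketch rather than something
the patch does.

```python
import socket


def resolve_hostname(placeholder="localhost.localdomain"):
    """Return the host name as reported by the OS, not by $HOSTNAME."""
    name = socket.gethostname()
    if name == placeholder:
        # Assumption: the fully qualified name is a better guess than the
        # placeholder; it may still be the placeholder on some hosts.
        name = socket.getfqdn()
    return name
```
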
>> + if "localhost.localdomain" == hostname:
>> + hostname = os.popen('hostname').read().split('\n')[0]
>> + kvm_log.info("since get wrong hostname from python
>> evnironment, then use the hostname get from system call(hostname).")
>> +
>> + kvm_log.info("get host name :%s" % hostname)
>> +
>> + # ntpdate info command and ntpdate sync command
>> + ntpdate_info_cmd = "ntpdate -q %s" % hostname
>> + ntpdate_sync_cmd = "ntpdate %s" % hostname
>> +
>> + # get vm handle
>> + vm = kvm_utils.env_get_vm(env,params.get("main_vm"))
>> + if not vm:
>> + raise error.TestError, "VM object not found in environment"
>> + if not vm.is_alive():
>> + raise error.TestError, "VM seems to be dead; Test requires a
>> living VM"
>>
>> I am seeing this piece of code to get the VM handle in several tests,
>> I am starting to think we should factor it out into a utility function...
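
Such a utility function might be sketched as follows. The names
get_living_vm and TestError are hypothetical, and the sketch assumes env
behaves like a dict keyed by VM name; the patch itself looks the VM up
with kvm_utils.env_get_vm(env, params.get("main_vm")).

```python
class TestError(Exception):
    """Hypothetical stand-in for autotest's error.TestError."""


def get_living_vm(env, vm_name):
    """Return the VM object stored under vm_name, or raise TestError."""
    # Assumption: env is dict-like. In kvm_runtest_2 this lookup would go
    # through kvm_utils.env_get_vm() instead.
    vm = env.get(vm_name)
    if not vm:
        raise TestError("VM object not found in environment")
    if not vm.is_alive():
        raise TestError("VM seems to be dead; Test requires a living VM")
    return vm
```

Each test could then start with a single call such as
vm = get_living_vm(env, params.get("main_vm")) instead of repeating the
two checks.
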
>>
>> + kvm_log.info("Waiting for guest to be up...")
>> +
>> + pxssh = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
>> + if not pxssh:
>> + raise error.TestFail, "Could not log into guest"
>> +
>> + kvm_log.info("Logged into guest IN run_timedrift function.")
>> +
>> + # clock resource get from host and guest
>> + host_clock_resource =
>> os.popen(clock_resource_cmd).read().split('\n')[0]
>> + kvm_log.info("the clock resource on host is :%s" %
>> host_clock_resource)
>> +
>> + pxssh.sendline(clock_resource_cmd)
>> + s, o = pxssh.read_up_to_prompt()
>> + guest_clock_resource = o.splitlines()[-2]
>> + kvm_log.info("the clock resource on guest is :%s" %
>> guest_clock_resource)
>> +
>> + if host_clock_resource != guest_clock_resource:
>> + #raise error.TestFail, "Host and Guest using different clock
>> resource"
>> + kvm_log.info("Host and Guest using different clock
>> resource,Let's moving on.")
>> + else:
>> + kvm_log.info("Host and Guest using same clock resource,Let's
>> moving on.")
>>
>> Little mistake here, "Let's move on."
>>
>> + # helper function:
>> + # ntpdate_op: a entire process to get
>> ntpdate command line result from guest.
>> + # time_drift_or_not: get the numeric handing by regular
>> expression and make timedrift calulation.
>> + def ntpdate_op(command):
>> + output = []
>> + try:
>> + pxssh = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
>> + if not pxssh:
>> + raise error.TestFail, "Could not log into guest"
>> +
>> + kvm_log.info("Logged in:(ntpdate_op)")
>> +
>> + while True:
>> + pxssh.sendline(command)
>> + s, output = pxssh.read_up_to_prompt()
>> + if "time server" in output:
>> + # output is a string contain the (ntpdate -q)
>> infor on guest
>> + return True, output
>> + else:
>> + continue
>> + except:
>> + pxssh.close()
>> + return False, output
>> + return False, output
>> +
>> + def time_drift_or_not(output):
>> + date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
>> + num = float(date_string[0])
>> + if num >= 1:
>> + kvm_log.info("guest clock has drifted in this scenario
>> :%s %s" % (date_string, num))
>> + return False
>> + else:
>> + kvm_log.info("guest clock running veracious in now stage
>> :%s %s" % (date_string, num))
>> + return True
>> +
>> + # send the command and get the ouput from guest
>> + # this loop will pick out several conditions need to be process
>> + # Actually, we want to get the info match "time server", then
>> script can analyzing it to
>> + # determine if guest's clock need sync with host or not.
>> + while True:
>> + pxssh.sendline(ntpdate_info_cmd)
>> + s, output = pxssh.read_up_to_prompt()
>> + kvm_log.info("the ntpdate query info get from guest is
>> below: \n%s" %output)
>> + if ("no server suitable" not in output) and ("time server"
>> not in output):
>> + kvm_log.info("very creazying output got. let's try again")
>> + continue
>> + elif "no server suitable" in output:
>> + kvm_log.info("seems NTP server is not ready for servicing")
>> + time.sleep(30)
>> + continue
>> + elif "time server" in output:
>> + # get the ntpdate info from guest
>> + # kvm_log.info("Got the correct output for analyze. The
>> output is below: \n%s" %output)
>> + break
>> +
>> + kvm_log.info("get the ntpdate infomation from guest successfully
>> :%s" % os.popen('date').read())
>> +
>> + # judge the clock need to sync with host or not
>> + while True:
>> + date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
>> + num = float(date_string[0])
>> + if num >= 1:
>> + kvm_log.info("guest need sync with the server: %s" %
>> hostname)
>> + s, output = ntpdate_op(ntpdate_sync_cmd)
>> + if s:
>> + continue
>> + else:
>> + #pxssh.sendline("hwclock --systohc")
>> + #kvm_log.info("guest clock sync prcdure is finished.
>> then sync the guest clock to guest bios.")
>> +
>> + #pxssh.sendline("hwclock --show")
>> + #s, o = pxssh.read_up_to_prompt()
>> + #kvm_log.info("the date infomation get from guest bios
>> is :\n%s" % o)
>> +
>> + pxssh.sendline(ntpdate_info_cmd)
>> + s, o = pxssh.read_up_to_prompt()
>> + kvm_log.info("guest clock after sync with host is :\n%s"
>> % o)
>> +
>> + break
>> +
>> + kvm_log.info("Timedrift Preparation *Finished* at last :%s" %
>> os.popen('date').read())
>> +
>> + if not vm.scp_to_remote(cpu_stress_test, remote_dir):
>> + raise error.TestError, "Could not copy program to guest."
>> +
>> + pxssh.sendline(ntpdate_info_cmd)
>> + s, o = pxssh.read_up_to_prompt()
>> + kvm_log.info("the ntpdate query from host *BEFORE* running the
>> cpu stress program.\n%s" % o)
>> + pxssh.sendline(cpu_stress_cmdline)
>> + s, o = pxssh.read_up_to_prompt()
>> + kvm_log.info("running command line on guest and sleeping for
>> 1200 secs.\n%s" % o)
>> +
>> + time.sleep(1200)
>> +
>> + while True:
>> + if pxssh.get_command_status(cpu_stress_search_cmdline):
>> + #(s, o) =
>> pxssh.get_command_status_output(cpu_stress_search_cmdline)
>> + #print "s is :%s" % s
>> + #print "o is :%s" % o
>> + #print "--------------------------------------------"
>> + #aaa = pxssh.get_command_status(cpu_stress_search_cmdline)
>> + #print "aaa is :%s" % aaa
>> + #print "--------------------------------------------"
>> +
>> + print "stress testing process has been completed and quit."
>> + break
>> + else:
>> + print "stress testing on CPU has not finished
>> yet.waiting for next detect after sleep 60 secs."
>> + time.sleep(60)
>> + continue
>> +
>> + pxssh.sendline(ntpdate_info_cmd)
>> + s, o = pxssh.read_up_to_prompt()
>> + kvm_log.info("the ntpdate query from host *AFTER* running the
>> cpu stress program.\n%s" % o)
>> +
>> + pxssh.close()
>> +
>> + # Sleep for analyze...
>> + kvm_log.info("sleeping(180 secs) Starting... :%s" %
>> os.popen('date').read())
>> + time.sleep(180)
>> + kvm_log.info("wakeup to get the analyzing... :%s" %
>> os.popen('date').read())
>> + count = 0
>> + for i in range(1, 21):
>> + kvm_log.info("this is %s time to get clock info from guest."
>> % i)
>> + s, o = ntpdate_op(ntpdate_info_cmd)
>> +
>> + if not s:
>> + raise error.TestFail, "Guest seems hang or ssh service
>> based on guest has been crash down"
>> +
>> + if not time_drift_or_not(o):
>> + count += 1
>> +
>> + if count == 5:
>> + raise error.TestFail, "TimeDrift testing Abort because
>> guest's clock has drift too much"
>> +
>> + kvm_log.info("*********************** Sleep 30 seconds for
>> next loop *************************")
>> + time.sleep(60)
>> +
>>
>>
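
The offset check done by time_drift_or_not in the patch above can be
sketched on its own as below. This is an illustration, not the patch's
code: parse_offset and has_drifted are hypothetical names, the regular
expression is stricter than the one in the patch (it matches only a
signed decimal number), and the 1 second limit is taken from the patch.

```python
import re

# Matches e.g. "offset -1.250000 sec" and captures the signed number.
OFFSET_RE = re.compile(r"offset\s+([+-]?\d+(?:\.\d+)?)\s+sec")


def parse_offset(output):
    """Return the signed offset in seconds from ntpdate output, or None."""
    match = OFFSET_RE.search(output)
    return float(match.group(1)) if match else None


def has_drifted(output, limit=1.0):
    """True when the absolute offset reaches the limit (in seconds)."""
    offset = parse_offset(output)
    if offset is None:
        raise ValueError("no offset found in ntpdate output")
    return abs(offset) >= limit
```

Keeping the sign in the parsed value and comparing abs(offset) makes it
explicit that a guest clock running behind the host counts as drift too.
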
>
[-- Attachment #2: timedrift-3.patch --]
[-- Type: text/x-patch, Size: 45377 bytes --]
diff -urN kvm_runtest_2.bak/genload/CVS/Entries kvm_runtest_2/genload/CVS/Entries
--- kvm_runtest_2.bak/genload/CVS/Entries 1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/CVS/Entries 2009-05-11 16:47:49.000000000 -0400
@@ -0,0 +1,3 @@
+/Makefile/1.4/Mon Sep 29 16:22:14 2008//
+/README/1.1/Fri Dec 13 21:34:13 2002//
+/stress.c/1.3/Thu Jul 26 12:40:17 2007//
diff -urN kvm_runtest_2.bak/genload/CVS/Repository kvm_runtest_2/genload/CVS/Repository
--- kvm_runtest_2.bak/genload/CVS/Repository 1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/CVS/Repository 2009-05-11 16:47:44.000000000 -0400
@@ -0,0 +1 @@
+ltp/tools/genload
diff -urN kvm_runtest_2.bak/genload/CVS/Root kvm_runtest_2/genload/CVS/Root
--- kvm_runtest_2.bak/genload/CVS/Root 1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/CVS/Root 2009-05-11 16:47:44.000000000 -0400
@@ -0,0 +1 @@
+:pserver:anonymous@ltp.cvs.sourceforge.net:/cvsroot/ltp
diff -urN kvm_runtest_2.bak/genload/Makefile kvm_runtest_2/genload/Makefile
--- kvm_runtest_2.bak/genload/Makefile 1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/Makefile 2008-09-29 12:22:14.000000000 -0400
@@ -0,0 +1,14 @@
+CFLAGS+= -DPACKAGE=\"stress\" -DVERSION=\"0.17pre11\"
+
+LDLIBS+= -lm
+
+SRCS=$(wildcard *.c)
+TARGETS=$(patsubst %.c,%,$(SRCS))
+
+all: $(TARGETS)
+
+install:
+ @ln -f $(TARGETS) ../../testcases/bin/genload
+
+clean:
+ rm -fr $(TARGETS)
diff -urN kvm_runtest_2.bak/genload/README kvm_runtest_2/genload/README
--- kvm_runtest_2.bak/genload/README 1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/README 2002-12-13 16:34:13.000000000 -0500
@@ -0,0 +1,72 @@
+USAGE
+
+See the program's usage statement by invoking with --help.
+
+NOTES
+
+This program works really well for me, but it might not have some of the
+features that you want. If you would like, please extend the code and send
+me the patch[1]. Enjoy the program :-)
+
+Please use the context diff format. That is: save the original program
+as stress.c.orig, then make and test your desired changes to stress.c, then
+run 'diff -u stress.c.orig stress.c' to produce a context patch. Thanks.
+
+Amos Waterland <apw@rossby.metr.ou.edu>
+Norman, Oklahoma
+27 Nov 2001
+
+EXAMPLES
+[examples]
+
+The simple case is that you just want to bring the system load average up to
+an arbitrary value. The following forks 13 processes, each of which spins
+in a tight loop calculating the sqrt() of a random number acquired with
+rand().
+
+ % stress -c 13
+
+Long options are supported, as well as is making the output less verbose.
+The following forks 1024 processes, and only reports error messages if any.
+
+ % stress --quiet --hogcpu 1k
+
+To see how your system performs when it is I/O bound, use the -i switch.
+The following forks 4 processes, each of which spins in a tight loop calling
+sync(), which is a system call that flushes memory buffers to disk.
+
+ % stress -i 4
+
+Multiple hogs may be combined on the same command line. The following does
+everything the preceding examples did in one command, but also turns up the
+verbosity level as well as showing how to cause the command to
+self-terminate after 1 minute.
+
+ % stress -c 13 -i 4 --verbose --timeout 1m
+
+A value of 0 normally denotes infinity. The following is how to do a fork
+bomb (be careful with this).
+
+ % stress -c 0
+
+For the -m and -d options, a value of 0 means to redo their operation an
+infinite number of times. To allocate and free 128MB in a redo loop use the
+following command. This can be useful for "bouncing" against the system RAM
+ceiling.
+
+ % stress -m 0 --hogvm-bytes 128M
+
+For the -m and -d options, a negative value of n means to redo the operation
+abs(n) times. Here is how to allocate and free 5MB three times in a row.
+
+ % stress -m -3 --hogvm-bytes 5m
+
+You can write a file of arbitrary length to disk. The file is created with
+mkstemp() in the current directory, the default is to unlink it, but
+unlinking can be overridden with the --hoghdd-noclean flag.
+
+ % stress -d 1 --hoghdd-noclean --hoghdd-bytes 13
+
+Large file support is enabled.
+
+ % stress -d 1 --hoghdd-noclean --hoghdd-bytes 3G
diff -urN kvm_runtest_2.bak/genload/stress.c kvm_runtest_2/genload/stress.c
--- kvm_runtest_2.bak/genload/stress.c 1969-12-31 19:00:00.000000000 -0500
+++ kvm_runtest_2/genload/stress.c 2007-07-26 08:40:17.000000000 -0400
@@ -0,0 +1,983 @@
+/* A program to put stress on a POSIX system (stress).
+ *
+ * Copyright (C) 2001, 2002 Amos Waterland <awaterl@yahoo.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc., 59
+ * Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <ctype.h>
+#include <errno.h>
+#include <libgen.h>
+#include <math.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <signal.h>
+#include <time.h>
+#include <unistd.h>
+#include <sys/wait.h>
+
+/* By default, print all messages of severity info and above. */
+static int global_debug = 2;
+
+/* By default, just print warning for non-critical errors. */
+static int global_ignore = 1;
+
+/* By default, retry on non-critical errors every 50ms. */
+static int global_retry = 50000;
+
+/* By default, use this as backoff coefficient for good fork throughput. */
+static int global_backoff = 3000;
+
+/* By default, do not timeout. */
+static int global_timeout = 0;
+
+/* Name of this program */
+static char *global_progname = PACKAGE;
+
+/* By default, do not hang after allocating memory. */
+static int global_vmhang = 0;
+
+/* Implementation of runtime-selectable severity message printing. */
+#define dbg if (global_debug >= 3) \
+ fprintf (stdout, "%s: debug: (%d) ", global_progname, __LINE__), \
+ fprintf
+#define out if (global_debug >= 2) \
+ fprintf (stdout, "%s: info: ", global_progname), \
+ fprintf
+#define wrn if (global_debug >= 1) \
+ fprintf (stderr, "%s: warn: (%d) ", global_progname, __LINE__), \
+ fprintf
+#define err if (global_debug >= 0) \
+ fprintf (stderr, "%s: error: (%d) ", global_progname, __LINE__), \
+ fprintf
+
+/* Implementation of check for option argument correctness. */
+#define assert_arg(A) \
+ if (++i == argc || ((arg = argv[i])[0] == '-' && \
+ !isdigit ((int)arg[1]) )) \
+ { \
+ err (stderr, "missing argument to option '%s'\n", A); \
+ exit (1); \
+ }
+
+/* Prototypes for utility functions. */
+int usage (int status);
+int version (int status);
+long long atoll_s (const char *nptr);
+long long atoll_b (const char *nptr);
+
+/* Prototypes for the worker functions. */
+int hogcpu (long long forks);
+int hogio (long long forks);
+int hogvm (long long forks, long long chunks, long long bytes);
+int hoghdd (long long forks, int clean, long long files, long long bytes);
+
+int
+main (int argc, char **argv)
+{
+ int i, pid, children = 0, retval = 0;
+ long starttime, stoptime, runtime;
+
+ /* Variables that indicate which options have been selected. */
+ int do_dryrun = 0;
+ int do_timeout = 0;
+ int do_cpu = 0; /* Default to 1 fork. */
+ long long do_cpu_forks = 1;
+ int do_io = 0; /* Default to 1 fork. */
+ long long do_io_forks = 1;
+ int do_vm = 0; /* Default to 1 fork, 1 chunk of 256MB. */
+ long long do_vm_forks = 1;
+ long long do_vm_chunks = 1;
+ long long do_vm_bytes = 256 * 1024 * 1024;
+ int do_hdd = 0; /* Default to 1 fork, clean, 1 file of 1GB. */
+ long long do_hdd_forks = 1;
+ int do_hdd_clean = 0;
+ long long do_hdd_files = 1;
+ long long do_hdd_bytes = 1024 * 1024 * 1024;
+
+ /* Record our start time. */
+ if ((starttime = time (NULL)) == -1)
+ {
+ err (stderr, "failed to acquire current time\n");
+ exit (1);
+ }
+
+ /* SuSv3 does not define any error conditions for this function. */
+ global_progname = basename (argv[0]);
+
+ /* For portability, parse command line options without getopt_long. */
+ for (i = 1; i < argc; i++)
+ {
+ char *arg = argv[i];
+
+ if (strcmp (arg, "--help") == 0 || strcmp (arg, "-?") == 0)
+ {
+ usage (0);
+ }
+ else if (strcmp (arg, "--version") == 0)
+ {
+ version (0);
+ }
+ else if (strcmp (arg, "--verbose") == 0 || strcmp (arg, "-v") == 0)
+ {
+ global_debug = 3;
+ }
+ else if (strcmp (arg, "--quiet") == 0 || strcmp (arg, "-q") == 0)
+ {
+ global_debug = 0;
+ }
+ else if (strcmp (arg, "--dry-run") == 0 || strcmp (arg, "-n") == 0)
+ {
+ do_dryrun = 1;
+ }
+ else if (strcmp (arg, "--no-retry") == 0)
+ {
+ global_ignore = 0;
+ dbg (stdout, "turning off ignore of non-critical errors");
+ }
+ else if (strcmp (arg, "--retry-delay") == 0)
+ {
+ assert_arg ("--retry-delay");
+ global_retry = atoll (arg);
+ dbg (stdout, "setting retry delay to %dus\n", global_retry);
+ }
+ else if (strcmp (arg, "--backoff") == 0)
+ {
+ assert_arg ("--backoff");
+ global_backoff = atoll (arg);
+ if (global_backoff < 0)
+ {
+ err (stderr, "invalid backoff factor: %i\n", global_backoff);
+ exit (1);
+ }
+ dbg (stdout, "setting backoff coeffient to %dus\n", global_backoff);
+ }
+ else if (strcmp (arg, "--timeout") == 0 || strcmp (arg, "-t") == 0)
+ {
+ do_timeout = 1;
+ assert_arg ("--timeout");
+ global_timeout = atoll_s (arg);
+ dbg (stdout, "setting timeout to %ds\n", global_timeout);
+ }
+ else if (strcmp (arg, "--cpu") == 0 || strcmp (arg, "-c") == 0)
+ {
+ do_cpu = 1;
+ assert_arg ("--cpu");
+ do_cpu_forks = atoll_b (arg);
+ }
+ else if (strcmp (arg, "--io") == 0 || strcmp (arg, "-i") == 0)
+ {
+ do_io = 1;
+ assert_arg ("--io");
+ do_io_forks = atoll_b (arg);
+ }
+ else if (strcmp (arg, "--vm") == 0 || strcmp (arg, "-m") == 0)
+ {
+ do_vm = 1;
+ assert_arg ("--vm");
+ do_vm_forks = atoll_b (arg);
+ }
+ else if (strcmp (arg, "--vm-chunks") == 0)
+ {
+ assert_arg ("--vm-chunks");
+ do_vm_chunks = atoll_b (arg);
+ }
+ else if (strcmp (arg, "--vm-bytes") == 0)
+ {
+ assert_arg ("--vm-bytes");
+ do_vm_bytes = atoll_b (arg);
+ }
+ else if (strcmp (arg, "--vm-hang") == 0)
+ {
+ global_vmhang = 1;
+ }
+ else if (strcmp (arg, "--hdd") == 0 || strcmp (arg, "-d") == 0)
+ {
+ do_hdd = 1;
+ assert_arg ("--hdd");
+ do_hdd_forks = atoll_b (arg);
+ }
+ else if (strcmp (arg, "--hdd-noclean") == 0)
+ {
+ do_hdd_clean = 2;
+ }
+ else if (strcmp (arg, "--hdd-files") == 0)
+ {
+ assert_arg ("--hdd-files");
+ do_hdd_files = atoll_b (arg);
+ }
+ else if (strcmp (arg, "--hdd-bytes") == 0)
+ {
+ assert_arg ("--hdd-bytes");
+ do_hdd_bytes = atoll_b (arg);
+ }
+ else
+ {
+ err (stderr, "unrecognized option: %s\n", arg);
+ exit (1);
+ }
+ }
+
+ /* Hog CPU option. */
+ if (do_cpu)
+ {
+ out (stdout, "dispatching %lli hogcpu forks\n", do_cpu_forks);
+
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ if (do_dryrun)
+ exit (0);
+ exit (hogcpu (do_cpu_forks));
+ case -1: /* error */
+ err (stderr, "hogcpu dispatcher fork failed\n");
+ exit (1);
+ default: /* parent */
+ children++;
+ dbg (stdout, "--> hogcpu dispatcher forked (%i)\n", pid);
+ }
+ }
+
+ /* Hog I/O option. */
+ if (do_io)
+ {
+ out (stdout, "dispatching %lli hogio forks\n", do_io_forks);
+
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ if (do_dryrun)
+ exit (0);
+ exit (hogio (do_io_forks));
+ case -1: /* error */
+ err (stderr, "hogio dispatcher fork failed\n");
+ exit (1);
+ default: /* parent */
+ children++;
+ dbg (stdout, "--> hogio dispatcher forked (%i)\n", pid);
+ }
+ }
+
+ /* Hog VM option. */
+ if (do_vm)
+ {
+ out (stdout,
+ "dispatching %lli hogvm forks, each %lli chunks of %lli bytes\n",
+ do_vm_forks, do_vm_chunks, do_vm_bytes);
+
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ if (do_dryrun)
+ exit (0);
+ exit (hogvm (do_vm_forks, do_vm_chunks, do_vm_bytes));
+ case -1: /* error */
+ err (stderr, "hogvm dispatcher fork failed\n");
+ exit (1);
+ default: /* parent */
+ children++;
+ dbg (stdout, "--> hogvm dispatcher forked (%i)\n", pid);
+ }
+ }
+
+ /* Hog HDD option. */
+ if (do_hdd)
+ {
+ out (stdout, "dispatching %lli hoghdd forks, each %lli files of "
+ "%lli bytes\n", do_hdd_forks, do_hdd_files, do_hdd_bytes);
+
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ if (do_dryrun)
+ exit (0);
+ exit (hoghdd
+ (do_hdd_forks, do_hdd_clean, do_hdd_files, do_hdd_bytes));
+ case -1: /* error */
+ err (stderr, "hoghdd dispatcher fork failed\n");
+ exit (1);
+ default: /* parent */
+ children++;
+ dbg (stdout, "--> hoghdd dispatcher forked (%i)\n", pid);
+ }
+ }
+
+ /* We have no work to do, so bail out. */
+ if (children == 0)
+ usage (0);
+
+ /* Wait for our children to exit. */
+ while (children)
+ {
+ int status, ret;
+
+ if ((pid = wait (&status)) > 0)
+ {
+ if ((WIFEXITED (status)) != 0)
+ {
+ if ((ret = WEXITSTATUS (status)) != 0)
+ {
+ err (stderr, "dispatcher %i returned error %i\n", pid, ret);
+ retval += ret;
+ }
+ else
+ {
+ dbg (stdout, "<-- dispatcher return (%i)\n", pid);
+ }
+ }
+ else
+ {
+ err (stderr, "dispatcher did not exit normally\n");
+ ++retval;
+ }
+
+ --children;
+ }
+ else
+ {
+ dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+ err (stderr, "detected missing dispatcher children\n");
+ ++retval;
+ break;
+ }
+ }
+
+ /* Record our stop time. */
+ if ((stoptime = time (NULL)) == -1)
+ {
+ err (stderr, "failed to acquire current time\n");
+ exit (1);
+ }
+
+ /* Calculate our runtime. */
+ runtime = stoptime - starttime;
+
+ /* Print final status message. */
+ if (retval)
+ {
+ err (stderr, "failed run completed in %lis\n", runtime);
+ }
+ else
+ {
+ out (stdout, "successful run completed in %lis\n", runtime);
+ }
+
+ exit (retval);
+}
+
+int
+usage (int status)
+{
+ char *mesg =
+ "`%s' imposes certain types of compute stress on your system\n\n"
+ "Usage: %s [OPTION [ARG]] ...\n\n"
+ " -?, --help show this help statement\n"
+ " --version show version statement\n"
+ " -v, --verbose be verbose\n"
+ " -q, --quiet be quiet\n"
+ " -n, --dry-run show what would have been done\n"
+ " --no-retry exit rather than retry non-critical errors\n"
+ " --retry-delay n wait n us before continuing past error\n"
+ " -t, --timeout n timeout after n seconds\n"
+ " --backoff n wait for factor of n us before starting work\n"
+ " -c, --cpu n spawn n procs spinning on sqrt()\n"
+ " -i, --io n spawn n procs spinning on sync()\n"
+ " -m, --vm n spawn n procs spinning on malloc()\n"
+ " --vm-chunks c malloc c chunks (default is 1)\n"
+ " --vm-bytes b malloc chunks of b bytes (default is 256MB)\n"
+ " --vm-hang hang in a sleep loop after memory allocated\n"
+ " -d, --hdd n spawn n procs spinning on write()\n"
+ " --hdd-noclean do not unlink file to which random data written\n"
+ " --hdd-files f write to f files (default is 1)\n"
+ " --hdd-bytes b write b bytes (default is 1GB)\n\n"
+ "Infinity is denoted with 0. For -m, -d: n=0 means infinite redo,\n"
+ "n<0 means redo abs(n) times. Valid suffixes are m,h,d,y for time;\n"
+ "k,m,g for size.\n\n";
+
+ fprintf (stdout, mesg, global_progname, global_progname);
+
+ if (status <= 0)
+ exit (-1 * status);
+
+ return 0;
+}
+
+int
+version (int status)
+{
+ char *mesg = "%s %s\n";
+
+ fprintf (stdout, mesg, global_progname, VERSION);
+
+ if (status <= 0)
+ exit (-1 * status);
+
+ return 0;
+}
+
+/* Convert a string representation of a number with an optional size suffix
+ * to a long long.
+ */
+long long
+atoll_b (const char *nptr)
+{
+ int pos;
+ char suffix;
+ long long factor = 1;
+
+ if ((pos = strlen (nptr) - 1) < 0)
+ {
+ err (stderr, "invalid string\n");
+ exit (1);
+ }
+
+ switch (suffix = nptr[pos])
+ {
+ case 'k':
+ case 'K':
+ factor = 1024;
+ break;
+ case 'm':
+ case 'M':
+ factor = 1024 * 1024;
+ break;
+ case 'g':
+ case 'G':
+ factor = 1024 * 1024 * 1024;
+ break;
+ default:
+ if (suffix < '0' || suffix > '9')
+ {
+ err (stderr, "unrecognized suffix: %c\n", suffix);
+ exit (1);
+ }
+ }
+
+ factor = atoll (nptr) * factor;
+
+ return factor;
+}
+
+/* Convert a string representation of a number with an optional time suffix
+ * to a long long.
+ */
+long long
+atoll_s (const char *nptr)
+{
+ int pos;
+ char suffix;
+ long long factor = 1;
+
+ if ((pos = strlen (nptr) - 1) < 0)
+ {
+ err (stderr, "invalid string\n");
+ exit (1);
+ }
+
+ switch (suffix = nptr[pos])
+ {
+ case 's':
+ case 'S':
+ factor = 1;
+ break;
+ case 'm':
+ case 'M':
+ factor = 60;
+ break;
+ case 'h':
+ case 'H':
+ factor = 60 * 60;
+ break;
+ case 'd':
+ case 'D':
+ factor = 60 * 60 * 24;
+ break;
+ case 'y':
+ case 'Y':
+ factor = 60 * 60 * 24 * 365;
+ break;
+ default:
+ if (suffix < '0' || suffix > '9')
+ {
+ err (stderr, "unrecognized suffix: %c\n", suffix);
+ exit (1);
+ }
+ }
+
+ factor = atoll (nptr) * factor;
+
+ return factor;
+}
+
+int
+hogcpu (long long forks)
+{
+ long long i;
+ double d;
+ int pid, retval = 0;
+
+ /* Make local copies of global variables. */
+ int ignore = global_ignore;
+ int retry = global_retry;
+ int timeout = global_timeout;
+ long backoff = global_backoff * forks;
+
+ dbg (stdout, "using backoff sleep of %lius for hogcpu\n", backoff);
+
+ for (i = 0; forks == 0 || i < forks; i++)
+ {
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ alarm (timeout);
+
+ /* Use a backoff sleep to ensure we get good fork throughput. */
+ usleep (backoff);
+
+ while (1)
+ d = sqrt (rand ());
+
+ /* This case never falls through; alarm signal can cause exit. */
+ case -1: /* error */
+ if (ignore)
+ {
+ ++retval;
+ wrn (stderr, "hogcpu worker fork failed, continuing\n");
+ usleep (retry);
+ continue;
+ }
+
+ err (stderr, "hogcpu worker fork failed\n");
+ return 1;
+ default: /* parent */
+ dbg (stdout, "--> hogcpu worker forked (%i)\n", pid);
+ }
+ }
+
+ /* Wait for our children to exit. */
+ while (i)
+ {
+ int status, ret;
+
+ if ((pid = wait (&status)) > 0)
+ {
+ if ((WIFEXITED (status)) != 0)
+ {
+ if ((ret = WEXITSTATUS (status)) != 0)
+ {
+ err (stderr, "hogcpu worker %i exited %i\n", pid, ret);
+ retval += ret;
+ }
+ else
+ {
+ dbg (stdout, "<-- hogcpu worker exited (%i)\n", pid);
+ }
+ }
+ else
+ {
+ dbg (stdout, "<-- hogcpu worker signalled (%i)\n", pid);
+ }
+
+ --i;
+ }
+ else
+ {
+ dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+ err (stderr, "detected missing hogcpu worker children\n");
+ ++retval;
+ break;
+ }
+ }
+
+ return retval;
+}
+
+int
+hogio (long long forks)
+{
+ long long i;
+ int pid, retval = 0;
+
+ /* Make local copies of global variables. */
+ int ignore = global_ignore;
+ int retry = global_retry;
+ int timeout = global_timeout;
+ long backoff = global_backoff * forks;
+
+ dbg (stdout, "using backoff sleep of %lius for hogio\n", backoff);
+
+ for (i = 0; forks == 0 || i < forks; i++)
+ {
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ alarm (timeout);
+
+ /* Use a backoff sleep to ensure we get good fork throughput. */
+ usleep (backoff);
+
+ while (1)
+ sync ();
+
+ /* This case never falls through; alarm signal can cause exit. */
+ case -1: /* error */
+ if (ignore)
+ {
+ ++retval;
+ wrn (stderr, "hogio worker fork failed, continuing\n");
+ usleep (retry);
+ continue;
+ }
+
+ err (stderr, "hogio worker fork failed\n");
+ return 1;
+ default: /* parent */
+ dbg (stdout, "--> hogio worker forked (%i)\n", pid);
+ }
+ }
+
+ /* Wait for our children to exit. */
+ while (i)
+ {
+ int status, ret;
+
+ if ((pid = wait (&status)) > 0)
+ {
+ if ((WIFEXITED (status)) != 0)
+ {
+ if ((ret = WEXITSTATUS (status)) != 0)
+ {
+ err (stderr, "hogio worker %i exited %i\n", pid, ret);
+ retval += ret;
+ }
+ else
+ {
+ dbg (stdout, "<-- hogio worker exited (%i)\n", pid);
+ }
+ }
+ else
+ {
+ dbg (stdout, "<-- hogio worker signalled (%i)\n", pid);
+ }
+
+ --i;
+ }
+ else
+ {
+ dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+ err (stderr, "detected missing hogio worker children\n");
+ ++retval;
+ break;
+ }
+ }
+
+ return retval;
+}
+
+int
+hogvm (long long forks, long long chunks, long long bytes)
+{
+ long long i, j, k;
+ int pid, retval = 0;
+ char **ptr;
+
+ /* Make local copies of global variables. */
+ int ignore = global_ignore;
+ int retry = global_retry;
+ int timeout = global_timeout;
+ long backoff = global_backoff * forks;
+
+ dbg (stdout, "using backoff sleep of %lius for hogvm\n", backoff);
+
+ if (bytes == 0)
+ {
+ /* 512MB is a guess at the largest value that can be malloced at once. */
+ bytes = 512 * 1024 * 1024;
+ }
+
+ for (i = 0; forks == 0 || i < forks; i++)
+ {
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ alarm (timeout);
+
+ /* Use a backoff sleep to ensure we get good fork throughput. */
+ usleep (backoff);
+
+ while (1)
+ {
+ ptr = (char **) malloc (chunks * sizeof (char *));
+ for (j = 0; chunks == 0 || j < chunks; j++)
+ {
+ if ((ptr[j] = (char *) malloc (bytes * sizeof (char))))
+ {
+ for (k = 0; k < bytes; k++)
+ ptr[j][k] = 'Z'; /* Ensure that COW happens. */
+ dbg (stdout, "hogvm worker malloced %lli bytes\n", k);
+ }
+ else if (ignore)
+ {
+ ++retval;
+ wrn (stderr, "hogvm malloc failed, continuing\n");
+ usleep (retry);
+ continue;
+ }
+ else
+ {
+ ++retval;
+ err (stderr, "hogvm malloc failed\n");
+ break;
+ }
+ }
+ if (global_vmhang && retval == 0)
+ {
+ dbg (stdout, "sleeping forever with allocated memory\n");
+ while (1)
+ sleep (1024);
+ }
+ if (retval == 0)
+ {
+ dbg (stdout,
+ "hogvm worker freeing memory and starting over\n");
+ for (j = 0; chunks == 0 || j < chunks; j++) {
+ free (ptr[j]);
+ }
+ free(ptr);
+ continue;
+ }
+
+ exit (retval);
+ }
+
+ /* This case never falls through; alarm signal can cause exit. */
+ case -1: /* error */
+ if (ignore)
+ {
+ ++retval;
+ wrn (stderr, "hogvm worker fork failed, continuing\n");
+ usleep (retry);
+ continue;
+ }
+
+ err (stderr, "hogvm worker fork failed\n");
+ return 1;
+ default: /* parent */
+ dbg (stdout, "--> hogvm worker forked (%i)\n", pid);
+ }
+ }
+
+ /* Wait for our children to exit. */
+ while (i)
+ {
+ int status, ret;
+
+ if ((pid = wait (&status)) > 0)
+ {
+ if ((WIFEXITED (status)) != 0)
+ {
+ if ((ret = WEXITSTATUS (status)) != 0)
+ {
+ err (stderr, "hogvm worker %i exited %i\n", pid, ret);
+ retval += ret;
+ }
+ else
+ {
+ dbg (stdout, "<-- hogvm worker exited (%i)\n", pid);
+ }
+ }
+ else
+ {
+ dbg (stdout, "<-- hogvm worker signalled (%i)\n", pid);
+ }
+
+ --i;
+ }
+ else
+ {
+ dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+ err (stderr, "detected missing hogvm worker children\n");
+ ++retval;
+ break;
+ }
+ }
+
+ return retval;
+}
+
+int
+hoghdd (long long forks, int clean, long long files, long long bytes)
+{
+ long long i, j;
+ int fd, pid, retval = 0;
+ int chunk = (1024 * 1024) - 1; /* Minimize slow writing. */
+ char buff[chunk];
+
+ /* Make local copies of global variables. */
+ int ignore = global_ignore;
+ int retry = global_retry;
+ int timeout = global_timeout;
+ long backoff = global_backoff * forks;
+
+ /* Initialize buffer with some random ASCII data. */
+ dbg (stdout, "seeding buffer with random data\n");
+ for (i = 0; i < chunk - 1; i++)
+ {
+ j = rand ();
+ j = (j < 0) ? -j : j;
+ j %= 95;
+ j += 32;
+ buff[i] = j;
+ }
+ buff[i] = '\n';
+
+ dbg (stdout, "using backoff sleep of %lius for hoghdd\n", backoff);
+
+ for (i = 0; forks == 0 || i < forks; i++)
+ {
+ switch (pid = fork ())
+ {
+ case 0: /* child */
+ alarm (timeout);
+
+ /* Use a backoff sleep to ensure we get good fork throughput. */
+ usleep (backoff);
+
+ while (1)
+ {
+ for (i = 0; i < files; i++)
+ {
+ char name[] = "./stress.XXXXXX";
+
+ if ((fd = mkstemp (name)) < 0)
+ {
+ perror ("mkstemp");
+ err (stderr, "mkstemp failed\n");
+ exit (1);
+ }
+
+ if (clean == 0)
+ {
+ dbg (stdout, "unlinking %s\n", name);
+ if (unlink (name))
+ {
+ err (stderr, "unlink failed\n");
+ exit (1);
+ }
+ }
+
+ dbg (stdout, "fast writing to %s\n", name);
+ for (j = 0; bytes == 0 || j + chunk < bytes; j += chunk)
+ {
+ if (write (fd, buff, chunk) != chunk)
+ {
+ err (stderr, "write failed\n");
+ exit (1);
+ }
+ }
+
+ dbg (stdout, "slow writing to %s\n", name);
+ for (; bytes == 0 || j < bytes - 1; j++)
+ {
+ if (write (fd, "Z", 1) != 1)
+ {
+ err (stderr, "write failed\n");
+ exit (1);
+ }
+ }
+ if (write (fd, "\n", 1) != 1)
+ {
+ err (stderr, "write failed\n");
+ exit (1);
+ }
+ ++j;
+
+ dbg (stdout, "closing %s after writing %lli bytes\n", name,
+ j);
+ close (fd);
+
+ if (clean == 1)
+ {
+ if (unlink (name))
+ {
+ err (stderr, "unlink failed\n");
+ exit (1);
+ }
+ }
+ }
+ if (retval == 0)
+ {
+ dbg (stdout, "hoghdd worker starting over\n");
+ continue;
+ }
+
+ exit (retval);
+ }
+
+ /* This case never falls through; alarm signal can cause exit. */
+ case -1: /* error */
+ if (ignore)
+ {
+ ++retval;
+ wrn (stderr, "hoghdd worker fork failed, continuing\n");
+ usleep (retry);
+ continue;
+ }
+
+ err (stderr, "hoghdd worker fork failed\n");
+ return 1;
+ default: /* parent */
+ dbg (stdout, "--> hoghdd worker forked (%i)\n", pid);
+ }
+ }
+
+ /* Wait for our children to exit. */
+ while (i)
+ {
+ int status, ret;
+
+ if ((pid = wait (&status)) > 0)
+ {
+ if ((WIFEXITED (status)) != 0)
+ {
+ if ((ret = WEXITSTATUS (status)) != 0)
+ {
+ err (stderr, "hoghdd worker %i exited %i\n", pid, ret);
+ retval += ret;
+ }
+ else
+ {
+ dbg (stdout, "<-- hoghdd worker exited (%i)\n", pid);
+ }
+ }
+ else
+ {
+ dbg (stdout, "<-- hoghdd worker signalled (%i)\n", pid);
+ }
+
+ --i;
+ }
+ else
+ {
+ dbg (stdout, "wait() returned error: %s\n", strerror (errno));
+ err (stderr, "detected missing hoghdd worker children\n");
+ ++retval;
+ break;
+ }
+ }
+
+ return retval;
+}
diff -urN kvm_runtest_2.bak/kvm_runtest_2.py kvm_runtest_2/kvm_runtest_2.py
--- kvm_runtest_2.bak/kvm_runtest_2.py 2009-04-29 06:17:29.000000000 -0400
+++ kvm_runtest_2/kvm_runtest_2.py 2009-04-29 08:06:32.000000000 -0400
@@ -36,6 +36,8 @@
"autotest": test_routine("kvm_tests", "run_autotest"),
"kvm_install": test_routine("kvm_install", "run_kvm_install"),
"linux_s3": test_routine("kvm_tests", "run_linux_s3"),
+ "ntp_server_setup": test_routine("kvm_tests", "run_ntp_server_setup"),
+ "timedrift": test_routine("kvm_tests", "run_timedrift"),
}
# Make it possible to import modules from the test's bindir
diff -urN kvm_runtest_2.bak/kvm_tests.cfg.sample kvm_runtest_2/kvm_tests.cfg.sample
--- kvm_runtest_2.bak/kvm_tests.cfg.sample 2009-04-29 06:17:29.000000000 -0400
+++ kvm_runtest_2/kvm_tests.cfg.sample 2009-05-12 04:48:51.000000000 -0400
@@ -81,6 +81,11 @@
- linux_s3: install setup
type = linux_s3
+ - ntp_server_setup:
+ type = ntp_server_setup
+ - timedrift: ntp_server_setup
+ type = timedrift
+ stress_test_specify = './stress -c 10 --timeout 15m'
# NICs
variants:
- @rtl8139:
diff -urN kvm_runtest_2.bak/kvm_tests.py kvm_runtest_2/kvm_tests.py
--- kvm_runtest_2.bak/kvm_tests.py 2009-04-29 06:17:29.000000000 -0400
+++ kvm_runtest_2/kvm_tests.py 2009-05-12 08:16:11.000000000 -0400
@@ -394,3 +394,263 @@
kvm_log.info("VM resumed after S3")
session.close()
+
+def run_ntp_server_setup(test, params, env):
+
+ """NTP server configuration and related network file modification
+ """
+
+ kvm_log.info("stop the iptables service if it is running, for timedrift testing")
+
+ #if not os.system("/etc/init.d/iptables status"):
+ # os.system("/etc/init.d/iptables stop")
+ if utils.system("/etc/init.d/iptables status", ignore_status=True) == 0:
+ utils.system("/etc/init.d/iptables stop", ignore_status=True)
+
+
+ # prevent the DHCP client from modifying ntp.conf
+ kvm_log.info("prevent the DHCP client from modifying ntp.conf")
+
+ config_file = "/etc/sysconfig/network"
+ network_file = open(config_file, "a")
+ string = "PEERNTP=no"
+
+ if utils.system("grep %s %s" % (string, config_file), ignore_status=True) != 0:
+ network_file.write(string + '\n')
+
+ network_file.close()
+
+ # stop the ntp service if it is running
+ kvm_log.info("stop ntp service if it is running")
+
+ #if not os.system("/etc/init.d/ntpd status"):
+ if utils.system("/etc/init.d/ntpd status", ignore_status=True) == 0:
+ utils.system("/etc/init.d/ntpd stop", ignore_status=True)
+ # os.system("/etc/init.d/ntpd stop")
+ # ntp_running = True
+
+ kvm_log.info("start ntp server on host with the custom config file.")
+
+ ntp_cmd = '''
+ echo "restrict default kod nomodify notrap nopeer noquery" > /etc/timedrift.ntp.conf;\
+ echo "restrict 127.0.0.1" >> /etc/timedrift.ntp.conf;\
+ echo "driftfile /var/lib/ntp/drift" >> /etc/timedrift.ntp.conf;\
+ echo "keys /etc/ntp/keys" >> /etc/timedrift.ntp.conf;\
+ echo "server 127.127.1.0" >> /etc/timedrift.ntp.conf;\
+ echo "fudge 127.127.1.0 stratum 1" >> /etc/timedrift.ntp.conf;\
+ ntpd -c /etc/timedrift.ntp.conf;
+ '''
+ #if os.system(ntp_cmd):
+ if utils.system(ntp_cmd, ignore_status=True) != 0:
+ raise error.TestFail, "NTP server did not start correctly..."
+
+ #kvm_log.info("sync system clock to BIOS")
+ #os.system("/sbin/hwclock --systohc")
+
+def run_timedrift(test, params, env):
+ """Check whether the guest clock suffers from time drift. The test has three stages:
+ 1: sync the guest clock with the host if the guest clock offset is larger than 1 sec.
+ 2: run the CPU stress program (genload) on the guest.
+ 3: run the analysis loop 20 times to determine whether the guest clock has drifted.
+ """
+ # variables used in the timedrift test case
+ #cpu_stress_program = 'cpu_stress.c'
+ cpu_stress_program = 'genload'
+ #remote_dir = '/root'
+ remote_dir = '/'
+
+ clock_resource_cmd = 'cat /sys/devices/system/clocksource/clocksource0/current_clocksource'
+
+ pwd = os.path.join(os.environ['AUTODIR'],'tests/kvm_runtest_2')
+ stress_test_dir = os.path.join(pwd, cpu_stress_program)
+
+ stress_test_specify = params.get('stress_test_specify')
+ kvm_log.info("stress command is :%s" % stress_test_specify)
+ #stress_cmdline = 'cd %s;gcc %s -lm;./a.out &' % (remote_dir, os.path.basename(stress_test_dir))
+ stress_cmdline = 'cd %s%s;make;%s &' % (remote_dir, os.path.basename(stress_test_dir), stress_test_specify)
+
+ stress_search_cmdline = "ps -ef|grep 'stress'|grep -v grep"
+
+ #hostname = os.environ.get("HOSTNAME")
+ hostname = socket.gethostname()
+ if "localhost.localdomain" == hostname:
+ hostname = os.popen('hostname').read().split('\n')[0]
+ kvm_log.info("got a wrong hostname from the Python environment, so using the one from the hostname command instead.")
+
+ kvm_log.info("get host name :%s" % hostname)
+
+ # ntpdate info command and ntpdate sync command
+ ntpdate_info_cmd = "ntpdate -q %s" % hostname
+ ntpdate_sync_cmd = "ntpdate %s" % hostname
+
+ # get vm handle
+ vm = kvm_utils.env_get_vm(env,params.get("main_vm"))
+ if not vm:
+ raise error.TestError, "VM object not found in environment"
+ if not vm.is_alive():
+ raise error.TestError, "VM seems to be dead; Test requires a living VM"
+
+ kvm_log.info("Waiting for guest to be up...")
+
+ pxssh = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
+ if not pxssh:
+ raise error.TestFail, "Could not log into guest"
+
+ kvm_log.info("Logged into guest IN run_timedrift function.")
+
+ # clock resource get from host and guest
+ kvm_log.info("****************** clock resource info ******************")
+ host_clock_resource = os.popen(clock_resource_cmd).read().split('\n')[0]
+ kvm_log.info("the clock resource on HOST is :%s" % host_clock_resource)
+
+ pxssh.sendline(clock_resource_cmd)
+ s, o = pxssh.read_up_to_prompt()
+ guest_clock_resource = o.splitlines()[-2]
+ kvm_log.info("the clock resource on guest is :%s" % guest_clock_resource)
+ kvm_log.info("*********************************************************")
+
+ if host_clock_resource != guest_clock_resource:
+ #raise error.TestFail, "Host and Guest using different clock resource"
+ kvm_log.info("Host and guest are using different clock sources; moving on.")
+ else:
+ kvm_log.info("Host and guest are using the same clock source; moving on.")
+
+ # helper function:
+ # ntpdate_op: the whole process of getting the ntpdate command line result from the guest.
+ # time_drift_or_not: extract the offset with a regular expression and do the timedrift calculation.
+ def ntpdate_op(command):
+ output = []
+ try:
+ pxssh = kvm_utils.wait_for(vm.ssh_login, 240, 0, 2)
+ if not pxssh:
+ raise error.TestFail, "Could not log into guest"
+
+ kvm_log.info("Logged in:(ntpdate_op)")
+
+ while True:
+ pxssh.sendline(command)
+ s, output = pxssh.read_up_to_prompt()
+ if "time server" in output:
+ # output is a string containing the (ntpdate -q) info from the guest
+ return True, output
+ else:
+ continue
+ except Exception:
+ if pxssh: pxssh.close()
+ return False, output
+ return False, output
+
+ def time_drift_or_not(output):
+ date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
+ num = float(date_string[0])
+ if num >= 1:
+ kvm_log.info("guest clock has drifted in this scenario :%s %s" % (date_string, num))
+ return False
+ else:
+ kvm_log.info("guest clock is accurate at this stage :%s %s" % (date_string, num))
+ return True
+
+ # send the command and get the output from the guest
+ # this loop handles the several kinds of output that can come back
+ # We want output matching "time server" so that the script can analyze it to
+ # determine whether the guest's clock needs to sync with the host.
+ while True:
+ pxssh.sendline(ntpdate_info_cmd)
+ s, output = pxssh.read_up_to_prompt()
+ kvm_log.info("the ntpdate query info from the guest is below: \n%s" %output)
+ if ("no server suitable" not in output) and ("time server" not in output):
+ kvm_log.info("unexpected output received, trying again")
+ continue
+ elif "no server suitable" in output:
+ kvm_log.info("NTP server does not seem ready to serve requests yet")
+ time.sleep(30)
+ continue
+ elif "time server" in output:
+ # get the ntpdate info from guest
+ # kvm_log.info("Got the correct output for analyze. The output is below: \n%s" %output)
+ break
+
+ kvm_log.info("got the ntpdate information from guest successfully :%s" % os.popen('date').read())
+
+ # decide whether the guest clock needs to sync with the host
+ while True:
+ date_string = re.findall(r'offset [+-]?(.*) sec', output, re.M)
+ num = float(date_string[0])
+ if num >= 1:
+ kvm_log.info("guest needs to sync with the server: %s" % hostname)
+ s, output = ntpdate_op(ntpdate_sync_cmd)
+ if s:
+ continue
+ else:
+ #pxssh.sendline("hwclock --systohc")
+ #kvm_log.info("guest clock sync procedure is finished. then sync the guest clock to guest bios.")
+
+ #pxssh.sendline("hwclock --show")
+ #s, o = pxssh.read_up_to_prompt()
+ #kvm_log.info("the date information from guest bios is :\n%s" % o)
+
+ pxssh.sendline(ntpdate_info_cmd)
+ s, o = pxssh.read_up_to_prompt()
+ kvm_log.info("guest clock after sync with host is :\n%s" % o)
+
+ break
+
+ kvm_log.info("Timedrift Preparation *Finished* at last :%s" % os.popen('date').read())
+
+ if not vm.scp_to_remote(stress_test_dir, remote_dir):
+ raise error.TestError, "Could not copy program to guest."
+
+ pxssh.sendline(ntpdate_info_cmd)
+ s, o = pxssh.read_up_to_prompt()
+ kvm_log.info("the ntpdate query from host *BEFORE* running the cpu stress program.\n%s" % o)
+ pxssh.sendline(stress_cmdline)
+ s, o = pxssh.read_up_to_prompt()
+ kvm_log.info("running command line on guest and sleeping for 1200 secs.\n%s" % o)
+
+ time.sleep(1200)
+
+ while True:
+ if pxssh.get_command_status(stress_search_cmdline):
+ #(s, o) = pxssh.get_command_status_output(stress_search_cmdline)
+ #print "s is :%s" % s
+ #print "o is :%s" % o
+ #print "--------------------------------------------"
+ #aaa = pxssh.get_command_status(stress_search_cmdline)
+ #print "aaa is :%s" % aaa
+ #print "--------------------------------------------"
+
+ kvm_log.info("stress testing process has completed and exited.")
+ break
+ else:
+ kvm_log.info("stress testing on CPU has not finished yet; checking again after sleeping 60 secs.")
+ time.sleep(60)
+ continue
+
+ pxssh.sendline(ntpdate_info_cmd)
+ s, o = pxssh.read_up_to_prompt()
+ kvm_log.info("the ntpdate query from host *AFTER* running the cpu stress program.\n%s" % o)
+
+ pxssh.close()
+
+ # Sleep for analyze...
+ kvm_log.info("sleeping(180 secs) Starting... :%s" % os.popen('date').read())
+ time.sleep(180)
+ kvm_log.info("woke up to start the analysis... :%s" % os.popen('date').read())
+ count = 0
+ for i in range(1, 21):
+ kvm_log.info("attempt %s to get clock info from guest." % i)
+ s, o = ntpdate_op(ntpdate_info_cmd)
+
+ if not s:
+ raise error.TestFail, "Guest seems to hang or the ssh service on the guest has crashed"
+
+ if not time_drift_or_not(o):
+ count += 1
+
+ if count == 5:
+ raise error.TestFail, "TimeDrift testing aborted because the guest's clock has drifted too much"
+
+ kvm_log.info("*********************** Sleep 60 seconds for next loop *************************")
+ time.sleep(60)
+
+
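
The offset check in time_drift_or_not() can also be tried outside the test harness. Below is a minimal standalone sketch, written in current Python syntax rather than the Python 2 used in the patch, assuming the ntpdate summary line ends in "offset <seconds> sec". The names OFFSET_RE, parse_offset and has_drifted are illustrative only and are not part of the patch, and the sample line is made up for the example. Unlike the regex in the patch, it keeps the sign and returns None when no offset is present, so a caller can tell a missing offset from a small one.

```python
import re

# Matches the tail of an ntpdate summary line, e.g. "... offset -0.003512 sec".
# The per-server line ("offset 0.000123, delay ...") has no " sec" and is skipped.
OFFSET_RE = re.compile(r"offset\s+([+-]?\d+(?:\.\d+)?)\s+sec")


def parse_offset(output):
    """Return the last reported offset in seconds, or None if none is found."""
    matches = OFFSET_RE.findall(output)
    if not matches:
        return None
    return float(matches[-1])


def has_drifted(output, threshold=1.0):
    """Return True when the absolute offset reaches the threshold (1 sec in the patch)."""
    offset = parse_offset(output)
    if offset is None:
        raise ValueError("no offset found in ntpdate output")
    return abs(offset) >= threshold


# Illustrative sample line, not captured from a real run.
sample = "12 May 21:07:24 ntpdate[4242]: adjust time server 10.0.0.1 offset -1.250000 sec"
print(parse_offset(sample))
print(has_drifted(sample))
```

In the test itself, has_drifted() would take the string returned by ntpdate_op(), and the None case could be mapped to a retry instead of an exception.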
Thread overview: 13+ messages
2009-05-06 4:02 [KVM-AUTOTEST][PATCH] timedrift support Bear Yang
2009-05-06 13:02 ` Marcelo Tosatti
2009-05-11 10:40 ` Bear Yang
2009-05-11 11:05 ` Yaniv Kaul
2009-05-11 12:59 ` Lucas Meneghel Rodrigues
2009-05-12 12:31 ` Bear Yang
2009-05-12 13:07 ` Bear Yang [this message]
2009-05-13 18:34 ` Lucas Meneghel Rodrigues
2009-05-15 16:30 ` bear
2009-05-16 20:36 ` Yaniv Kaul
2009-05-18 1:54 ` Bear Yang
2009-05-18 4:20 ` Yaniv Kaul
2009-05-18 5:18 ` Bear Yang