* [KVM-AUTOTEST PATCH 1/7] KVM test: migration test: move the bulk of the code to a utility function
@ 2009-10-07 17:54 Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 2/7] KVM test: timedrift test: move the get_time() helper function to kvm_test_utils.py Michael Goldish
0 siblings, 1 reply; 12+ messages in thread
From: Michael Goldish @ 2009-10-07 17:54 UTC (permalink / raw)
To: autotest, kvm; +Cc: Michael Goldish
Move most of the code to a utility function in kvm_test_utils.py, in order to
make it reusable.
Signed-off-by: Michael Goldish <mgoldish@redhat.com>
---
client/tests/kvm/kvm_test_utils.py | 72 +++++++++++++++++++++++++++++++++++
client/tests/kvm/tests/migration.py | 62 +-----------------------------
2 files changed, 74 insertions(+), 60 deletions(-)
diff --git a/client/tests/kvm/kvm_test_utils.py b/client/tests/kvm/kvm_test_utils.py
index 601b350..096a056 100644
--- a/client/tests/kvm/kvm_test_utils.py
+++ b/client/tests/kvm/kvm_test_utils.py
@@ -59,3 +59,75 @@ def wait_for_login(vm, nic_index=0, timeout=240):
raise error.TestFail("Could not log into guest '%s'" % vm.name)
logging.info("Logged in")
return session
+
+
+def migrate(vm, env=None):
+ """
+ Migrate a VM locally and re-register it in the environment.
+
+ @param vm: The VM to migrate.
+ @param env: The environment dictionary. If omitted, the migrated VM will
+ not be registered.
+ @return: The post-migration VM.
+ """
+ # Helper functions
+ def mig_finished():
+ s, o = vm.send_monitor_cmd("info migrate")
+ return s == 0 and not "Migration status: active" in o
+
+ def mig_succeeded():
+ s, o = vm.send_monitor_cmd("info migrate")
+ return s == 0 and "Migration status: completed" in o
+
+ def mig_failed():
+ s, o = vm.send_monitor_cmd("info migrate")
+ return s == 0 and "Migration status: failed" in o
+
+ # See if migration is supported
+ s, o = vm.send_monitor_cmd("help info")
+ if not "info migrate" in o:
+ raise error.TestError("Migration is not supported")
+
+ # Clone the source VM and ask the clone to wait for incoming migration
+ dest_vm = vm.clone()
+ dest_vm.create(for_migration=True)
+
+ try:
+ # Define the migration command
+ cmd = "migrate -d tcp:localhost:%d" % dest_vm.migration_port
+ logging.debug("Migrating with command: %s" % cmd)
+
+ # Migrate
+ s, o = vm.send_monitor_cmd(cmd)
+ if s:
+ logging.error("Migration command failed (command: %r, output: %r)"
+ % (cmd, o))
+ raise error.TestFail("Migration command failed")
+
+ # Wait for migration to finish
+ if not kvm_utils.wait_for(mig_finished, 90, 2, 2,
+ "Waiting for migration to finish..."):
+ raise error.TestFail("Timeout elapsed while waiting for migration "
+ "to finish")
+
+ # Report migration status
+ if mig_succeeded():
+ logging.info("Migration finished successfully")
+ elif mig_failed():
+ raise error.TestFail("Migration failed")
+ else:
+ raise error.TestFail("Migration ended with unknown status")
+
+ # Kill the source VM
+ vm.destroy(gracefully=False)
+
+ # Replace the source VM with the new cloned VM
+ if env is not None:
+ kvm_utils.env_register_vm(env, vm.name, dest_vm)
+
+ # Return the new cloned VM
+ return dest_vm
+
+ except:
+ dest_vm.destroy()
+ raise
diff --git a/client/tests/kvm/tests/migration.py b/client/tests/kvm/tests/migration.py
index 2bbf17b..4b13b5d 100644
--- a/client/tests/kvm/tests/migration.py
+++ b/client/tests/kvm/tests/migration.py
@@ -21,79 +21,21 @@ def run_migration(test, params, env):
"""
vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
- # See if migration is supported
- s, o = vm.send_monitor_cmd("help info")
- if not "info migrate" in o:
- raise error.TestError("Migration is not supported")
-
# Log into guest and get the output of migration_test_command
session = kvm_test_utils.wait_for_login(vm)
migration_test_command = params.get("migration_test_command")
reference_output = session.get_command_output(migration_test_command)
session.close()
- # Clone the main VM and ask it to wait for incoming migration
- dest_vm = vm.clone()
- dest_vm.create(for_migration=True)
-
- try:
- # Define the migration command
- cmd = "migrate -d tcp:localhost:%d" % dest_vm.migration_port
- logging.debug("Migration command: %s" % cmd)
-
- # Migrate
- s, o = vm.send_monitor_cmd(cmd)
- if s:
- logging.error("Migration command failed (command: %r, output: %r)"
- % (cmd, o))
- raise error.TestFail("Migration command failed")
-
- # Define some helper functions
- def mig_finished():
- s, o = vm.send_monitor_cmd("info migrate")
- return s == 0 and not "Migration status: active" in o
-
- def mig_succeeded():
- s, o = vm.send_monitor_cmd("info migrate")
- return s == 0 and "Migration status: completed" in o
-
- def mig_failed():
- s, o = vm.send_monitor_cmd("info migrate")
- return s == 0 and "Migration status: failed" in o
-
- # Wait for migration to finish
- if not kvm_utils.wait_for(mig_finished, 90, 2, 2,
- "Waiting for migration to finish..."):
- raise error.TestFail("Timeout elapsed while waiting for migration "
- "to finish")
-
- # Report migration status
- if mig_succeeded():
- logging.info("Migration finished successfully")
- elif mig_failed():
- raise error.TestFail("Migration failed")
- else:
- raise error.TestFail("Migration ended with unknown status")
-
- # Kill the source VM
- vm.destroy(gracefully=False)
-
- # Replace the source VM with the new cloned VM
- kvm_utils.env_register_vm(env, params.get("main_vm"), dest_vm)
-
- except:
- dest_vm.destroy(gracefully=False)
- raise
+ # Migrate the VM
+ dest_vm = kvm_test_utils.migrate(vm, env)
# Log into guest and get the output of migration_test_command
logging.info("Logging into guest after migration...")
-
session = dest_vm.remote_login()
if not session:
raise error.TestFail("Could not log into guest after migration")
-
logging.info("Logged in after migration")
-
output = session.get_command_output(migration_test_command)
session.close()
--
1.5.4.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [KVM-AUTOTEST PATCH 2/7] KVM test: timedrift test: move the get_time() helper function to kvm_test_utils.py
2009-10-07 17:54 [KVM-AUTOTEST PATCH 1/7] KVM test: migration test: move the bulk of the code to a utility function Michael Goldish
@ 2009-10-07 17:54 ` Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Michael Goldish
0 siblings, 1 reply; 12+ messages in thread
From: Michael Goldish @ 2009-10-07 17:54 UTC (permalink / raw)
To: autotest, kvm; +Cc: Michael Goldish
Move get_time() to kvm_test_utils.py to make it reusable.
Signed-off-by: Michael Goldish <mgoldish@redhat.com>
---
client/tests/kvm/kvm_test_utils.py | 27 ++++++
client/tests/kvm/tests/timedrift.py | 164 ++++++++++++++++------------------
2 files changed, 104 insertions(+), 87 deletions(-)
diff --git a/client/tests/kvm/kvm_test_utils.py b/client/tests/kvm/kvm_test_utils.py
index 096a056..db9f666 100644
--- a/client/tests/kvm/kvm_test_utils.py
+++ b/client/tests/kvm/kvm_test_utils.py
@@ -131,3 +131,30 @@ def migrate(vm, env=None):
except:
dest_vm.destroy()
raise
+
+
+def get_time(session, time_command, time_filter_re, time_format):
+ """
+ Return the host time and guest time. If the guest time cannot be fetched
+ a TestError exception is raised.
+
+ Note that the shell session should be ready to receive commands
+ (i.e. should "display" a command prompt and should be done with all
+ previous commands).
+
+ @param session: A shell session.
+ @param time_command: Command to issue to get the current guest time.
+ @param time_filter_re: Regex filter to apply on the output of
+ time_command in order to get the current time.
+ @param time_format: Format string to pass to time.strptime() with the
+ result of the regex filter.
+ @return: A tuple containing the host time and guest time.
+ """
+ host_time = time.time()
+ session.sendline(time_command)
+ (match, s) = session.read_up_to_prompt()
+ if not match:
+ raise error.TestError("Could not get guest time")
+ s = re.findall(time_filter_re, s)[0]
+ guest_time = time.mktime(time.strptime(s, time_format))
+ return (host_time, guest_time)
diff --git a/client/tests/kvm/tests/timedrift.py b/client/tests/kvm/tests/timedrift.py
index fe0653e..146fa12 100644
--- a/client/tests/kvm/tests/timedrift.py
+++ b/client/tests/kvm/tests/timedrift.py
@@ -52,25 +52,6 @@ def run_timedrift(test, params, env):
for tid, mask in prev_masks.items():
commands.getoutput("taskset -p %s %s" % (mask, tid))
- def get_time(session, time_command, time_filter_re, time_format):
- """
- Returns the host time and guest time.
-
- @param session: A shell session.
- @param time_command: Command to issue to get the current guest time.
- @param time_filter_re: Regex filter to apply on the output of
- time_command in order to get the current time.
- @param time_format: Format string to pass to time.strptime() with the
- result of the regex filter.
- @return: A tuple containing the host time and guest time.
- """
- host_time = time.time()
- session.sendline(time_command)
- (match, s) = session.read_up_to_prompt()
- s = re.findall(time_filter_re, s)[0]
- guest_time = time.mktime(time.strptime(s, time_format))
- return (host_time, guest_time)
-
vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
session = kvm_test_utils.wait_for_login(vm)
@@ -97,84 +78,93 @@ def run_timedrift(test, params, env):
guest_load_sessions = []
host_load_sessions = []
- # Set the VM's CPU affinity
- prev_affinity = set_cpu_affinity(vm.get_pid(), cpu_mask)
-
try:
- # Get time before load
- (host_time_0, guest_time_0) = get_time(session, time_command,
- time_filter_re, time_format)
-
- # Run some load on the guest
- logging.info("Starting load on guest...")
- for i in range(guest_load_instances):
- load_session = vm.remote_login()
- if not load_session:
- raise error.TestFail("Could not log into guest")
- load_session.set_output_prefix("(guest load %d) " % i)
- load_session.set_output_func(logging.debug)
- load_session.sendline(guest_load_command)
- guest_load_sessions.append(load_session)
-
- # Run some load on the host
- logging.info("Starting load on host...")
- for i in range(host_load_instances):
- host_load_sessions.append(
- kvm_subprocess.run_bg(host_load_command,
- output_func=logging.debug,
- output_prefix="(host load %d) " % i,
- timeout=0.5))
- # Set the CPU affinity of the load process
- pid = host_load_sessions[-1].get_pid()
- set_cpu_affinity(pid, cpu_mask)
-
- # Sleep for a while (during load)
- logging.info("Sleeping for %s seconds..." % load_duration)
- time.sleep(load_duration)
-
- # Get time delta after load
- (host_time_1, guest_time_1) = get_time(session, time_command,
- time_filter_re, time_format)
-
- # Report results
- host_delta = host_time_1 - host_time_0
- guest_delta = guest_time_1 - guest_time_0
- drift = 100.0 * (host_delta - guest_delta) / host_delta
- logging.info("Host duration: %.2f" % host_delta)
- logging.info("Guest duration: %.2f" % guest_delta)
- logging.info("Drift: %.2f%%" % drift)
+ # Set the VM's CPU affinity
+ prev_affinity = set_cpu_affinity(vm.get_pid(), cpu_mask)
+
+ # Get time before load
+ # (ht stands for host time, gt stands for guest time)
+ (ht0, gt0) = kvm_test_utils.get_time(session,
+ time_command,
+ time_filter_re,
+ time_format)
+
+ try:
+ # Run some load on the guest
+ logging.info("Starting load on guest...")
+ for i in range(guest_load_instances):
+ load_session = vm.remote_login()
+ if not load_session:
+ raise error.TestFail("Could not log into guest")
+ load_session.set_output_prefix("(guest load %d) " % i)
+ load_session.set_output_func(logging.debug)
+ load_session.sendline(guest_load_command)
+ guest_load_sessions.append(load_session)
+
+ # Run some load on the host
+ logging.info("Starting load on host...")
+ for i in range(host_load_instances):
+ host_load_sessions.append(
+ kvm_subprocess.run_bg(host_load_command,
+ output_func=logging.debug,
+ output_prefix="(host load %d) " % i,
+ timeout=0.5))
+ # Set the CPU affinity of the load process
+ pid = host_load_sessions[-1].get_pid()
+ set_cpu_affinity(pid, cpu_mask)
+
+ # Sleep for a while (during load)
+ logging.info("Sleeping for %s seconds..." % load_duration)
+ time.sleep(load_duration)
+
+ # Get time delta after load
+ (ht1, gt1) = kvm_test_utils.get_time(session,
+ time_command,
+ time_filter_re,
+ time_format)
+
+ # Report results
+ host_delta = ht1 - ht0
+ guest_delta = gt1 - gt0
+ drift = 100.0 * (host_delta - guest_delta) / host_delta
+ logging.info("Host duration: %.2f" % host_delta)
+ logging.info("Guest duration: %.2f" % guest_delta)
+ logging.info("Drift: %.2f%%" % drift)
+
+ finally:
+ logging.info("Cleaning up...")
+ # Restore the VM's CPU affinity
+ restore_cpu_affinity(prev_affinity)
+ # Stop the guest load
+ if guest_load_stop_command:
+ session.get_command_output(guest_load_stop_command)
+ # Close all load shell sessions
+ for load_session in guest_load_sessions:
+ load_session.close()
+ for load_session in host_load_sessions:
+ load_session.close()
+
+ # Sleep again (rest)
+ logging.info("Sleeping for %s seconds..." % rest_duration)
+ time.sleep(rest_duration)
+
+ # Get time after rest
+ (ht2, gt2) = kvm_test_utils.get_time(session,
+ time_command,
+ time_filter_re,
+ time_format)
finally:
- logging.info("Cleaning up...")
- # Restore the VM's CPU affinity
- restore_cpu_affinity(prev_affinity)
- # Stop the guest load
- if guest_load_stop_command:
- session.get_command_output(guest_load_stop_command)
- # Close all load shell sessions
- for load_session in guest_load_sessions:
- load_session.close()
- for load_session in host_load_sessions:
- load_session.close()
-
- # Sleep again (rest)
- logging.info("Sleeping for %s seconds..." % rest_duration)
- time.sleep(rest_duration)
-
- # Get time after rest
- (host_time_2, guest_time_2) = get_time(session, time_command,
- time_filter_re, time_format)
+ session.close()
# Report results
- host_delta_total = host_time_2 - host_time_0
- guest_delta_total = guest_time_2 - guest_time_0
+ host_delta_total = ht2 - ht0
+ guest_delta_total = gt2 - gt0
drift_total = 100.0 * (host_delta_total - guest_delta_total) / host_delta
logging.info("Total host duration including rest: %.2f" % host_delta_total)
logging.info("Total guest duration including rest: %.2f" % guest_delta_total)
logging.info("Total drift after rest: %.2f%%" % drift_total)
- session.close()
-
# Fail the test if necessary
if drift > drift_threshold:
raise error.TestFail("Time drift too large: %.2f%%" % drift)
--
1.5.4.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 2/7] KVM test: timedrift test: move the get_time() helper function to kvm_test_utils.py Michael Goldish
@ 2009-10-07 17:54 ` Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 4/7] KVM test: move the reboot code to kvm_test_utils.py Michael Goldish
2009-10-12 15:28 ` [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Lucas Meneghel Rodrigues
0 siblings, 2 replies; 12+ messages in thread
From: Michael Goldish @ 2009-10-07 17:54 UTC (permalink / raw)
To: autotest, kvm; +Cc: Michael Goldish
This patch adds a new test that checks the timedrift introduced by migrations.
It uses the same parameters used by the timedrift test to get the guest time.
In addition, the number of migrations the test performs is controlled by the
parameter 'migration_iterations'.
Signed-off-by: Michael Goldish <mgoldish@redhat.com>
---
client/tests/kvm/kvm_tests.cfg.sample | 33 ++++---
client/tests/kvm/tests/timedrift_with_migration.py | 95 ++++++++++++++++++++
2 files changed, 115 insertions(+), 13 deletions(-)
create mode 100644 client/tests/kvm/tests/timedrift_with_migration.py
diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index 540d0a2..618c21e 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -100,19 +100,26 @@ variants:
type = linux_s3
- timedrift: install setup
- type = timedrift
extra_params += " -rtc-td-hack"
- # Pin the VM and host load to CPU #0
- cpu_mask = 0x1
- # Set the load and rest durations
- load_duration = 20
- rest_duration = 20
- # Fail if the drift after load is higher than 50%
- drift_threshold = 50
- # Fail if the drift after the rest period is higher than 10%
- drift_threshold_after_rest = 10
- # For now, make sure this test is executed alone
- used_cpus = 100
+ variants:
+ - with_load:
+ type = timedrift
+ # Pin the VM and host load to CPU #0
+ cpu_mask = 0x1
+ # Set the load and rest durations
+ load_duration = 20
+ rest_duration = 20
+ # Fail if the drift after load is higher than 50%
+ drift_threshold = 50
+ # Fail if the drift after the rest period is higher than 10%
+ drift_threshold_after_rest = 10
+ # For now, make sure this test is executed alone
+ used_cpus = 100
+ - with_migration:
+ type = timedrift_with_migration
+ migration_iterations = 3
+ drift_threshold = 10
+ drift_threshold_single = 3
- stress_boot: install setup
type = stress_boot
@@ -581,7 +588,7 @@ variants:
extra_params += " -smp 2"
used_cpus = 2
stress_boot: used_cpus = 10
- timedrift: used_cpus = 100
+ timedrift.with_load: used_cpus = 100
variants:
diff --git a/client/tests/kvm/tests/timedrift_with_migration.py b/client/tests/kvm/tests/timedrift_with_migration.py
new file mode 100644
index 0000000..139b663
--- /dev/null
+++ b/client/tests/kvm/tests/timedrift_with_migration.py
@@ -0,0 +1,95 @@
+import logging, time, commands, re
+from autotest_lib.client.common_lib import error
+import kvm_subprocess, kvm_test_utils, kvm_utils
+
+
+def run_timedrift_with_migration(test, params, env):
+ """
+ Time drift test with migration:
+
+ 1) Log into a guest.
+ 2) Take a time reading from the guest and host.
+ 3) Migrate the guest.
+ 4) Take a second time reading.
+ 5) If the drift (in seconds) is higher than a user specified value, fail.
+
+ @param test: KVM test object.
+ @param params: Dictionary with test parameters.
+ @param env: Dictionary with the test environment.
+ """
+ vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
+ session = kvm_test_utils.wait_for_login(vm)
+
+ # Collect test parameters:
+ # Command to run to get the current time
+ time_command = params.get("time_command")
+ # Filter which should match a string to be passed to time.strptime()
+ time_filter_re = params.get("time_filter_re")
+ # Time format for time.strptime()
+ time_format = params.get("time_format")
+ drift_threshold = float(params.get("drift_threshold", "10"))
+ drift_threshold_single = float(params.get("drift_threshold_single", "3"))
+ migration_iterations = int(params.get("migration_iterations", 1))
+
+ try:
+ # Get initial time
+ # (ht stands for host time, gt stands for guest time)
+ (ht0, gt0) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+
+ # Migrate
+ for i in range(migration_iterations):
+ # Get time before current iteration
+ (ht0_, gt0_) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+ session.close()
+ # Run current iteration
+ logging.info("Migrating: iteration %d of %d..." %
+ (i + 1, migration_iterations))
+ vm = kvm_test_utils.migrate(vm, env)
+ # Log in
+ logging.info("Logging in after migration...")
+ session = vm.remote_login()
+ if not session:
+ raise error.TestFail("Could not log in after migration")
+ logging.info("Logged in after migration")
+ # Get time after current iteration
+ (ht1_, gt1_) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+ # Report iteration results
+ host_delta = ht1_ - ht0_
+ guest_delta = gt1_ - gt0_
+ drift = abs(host_delta - guest_delta)
+ logging.info("Host duration (iteration %d): %.2f" %
+ (i + 1, host_delta))
+ logging.info("Guest duration (iteration %d): %.2f" %
+ (i + 1, guest_delta))
+ logging.info("Drift at iteration %d: %.2f seconds" %
+ (i + 1, drift))
+ # Fail if necessary
+ if drift > drift_threshold_single:
+ raise error.TestFail("Time drift too large at iteration %d: "
+ "%.2f seconds" % (i + 1, drift))
+
+ # Get final time
+ (ht1, gt1) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+
+ finally:
+ session.close()
+
+ # Report results
+ host_delta = ht1 - ht0
+ guest_delta = gt1 - gt0
+ drift = abs(host_delta - guest_delta)
+ logging.info("Host duration (%d migrations): %.2f" %
+ (migration_iterations, host_delta))
+ logging.info("Guest duration (%d migrations): %.2f" %
+ (migration_iterations, guest_delta))
+ logging.info("Drift after %d migrations: %.2f seconds" %
+ (migration_iterations, drift))
+
+ # Fail if necessary
+ if drift > drift_threshold:
+ raise error.TestFail("Time drift too large after %d migrations: "
+ "%.2f seconds" % (migration_iterations, drift))
--
1.5.4.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [KVM-AUTOTEST PATCH 4/7] KVM test: move the reboot code to kvm_test_utils.py
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Michael Goldish
@ 2009-10-07 17:54 ` Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 5/7] KVM test: new test timedrift_with_reboot Michael Goldish
2009-10-12 15:28 ` [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Lucas Meneghel Rodrigues
1 sibling, 1 reply; 12+ messages in thread
From: Michael Goldish @ 2009-10-07 17:54 UTC (permalink / raw)
To: autotest, kvm; +Cc: Michael Goldish
Move the reboot code from the boot test (tests/boot.py) to kvm_test_utils.py
in order to make it reusable.
Signed-off-by: Michael Goldish <mgoldish@redhat.com>
---
client/tests/kvm/kvm_test_utils.py | 44 ++++++++++++++++++++++++++++++++++++
client/tests/kvm/tests/boot.py | 33 +++++---------------------
2 files changed, 51 insertions(+), 26 deletions(-)
diff --git a/client/tests/kvm/kvm_test_utils.py b/client/tests/kvm/kvm_test_utils.py
index db9f666..4d7b1c3 100644
--- a/client/tests/kvm/kvm_test_utils.py
+++ b/client/tests/kvm/kvm_test_utils.py
@@ -61,6 +61,50 @@ def wait_for_login(vm, nic_index=0, timeout=240):
return session
+def reboot(vm, session, method="shell", sleep_before_reset=10, nic_index=0,
+ timeout=240):
+ """
+ Reboot the VM and wait for it to come back up by trying to log in until
+ timeout expires.
+
+ @param vm: VM object.
+ @param session: A shell session object.
+ @param method: Reboot method. Can be "shell" (send a shell reboot
+ command) or "system_reset" (send a system_reset monitor command).
+ @param nic_index: Index of NIC to access in the VM, when logging in after
+ rebooting.
+ @param timeout: Time to wait before giving up (after rebooting).
+ @return: A new shell session object.
+ """
+ if method == "shell":
+ # Send a reboot command to the guest's shell
+ session.sendline(vm.get_params().get("reboot_command"))
+ logging.info("Reboot command sent; waiting for guest to go down...")
+ elif method == "system_reset":
+ # Sleep for a while before sending the command
+ time.sleep(sleep_before_reset)
+ # Send a system_reset monitor command
+ vm.send_monitor_cmd("system_reset")
+ logging.info("system_reset monitor command sent; waiting for guest to "
+ "go down...")
+ else:
+ logging.error("Unknown reboot method: %s" % method)
+
+ # Wait for the session to become unresponsive and close it
+ if not kvm_utils.wait_for(lambda: not session.is_responsive(), 120, 0, 1):
+ raise error.TestFail("Guest refuses to go down")
+ session.close()
+
+ # Try logging into the guest until timeout expires
+ logging.info("Guest is down; waiting for it to go up again...")
+ session = kvm_utils.wait_for(lambda: vm.remote_login(nic_index=nic_index),
+ timeout, 0, 2)
+ if not session:
+ raise error.TestFail("Could not log into guest after reboot")
+ logging.info("Guest is up again")
+ return session
+
+
def migrate(vm, env=None):
"""
Migrate a VM locally and re-register it in the environment.
diff --git a/client/tests/kvm/tests/boot.py b/client/tests/kvm/tests/boot.py
index 282efda..cd1f1d4 100644
--- a/client/tests/kvm/tests/boot.py
+++ b/client/tests/kvm/tests/boot.py
@@ -19,33 +19,14 @@ def run_boot(test, params, env):
session = kvm_test_utils.wait_for_login(vm)
try:
- if params.get("reboot_method") == "shell":
- # Send a reboot command to the guest's shell
- session.sendline(vm.get_params().get("reboot_command"))
- logging.info("Reboot command sent; waiting for guest to go "
- "down...")
- elif params.get("reboot_method") == "system_reset":
- # Sleep for a while -- give the guest a chance to finish booting
- time.sleep(float(params.get("sleep_before_reset", 10)))
- # Send a system_reset monitor command
- vm.send_monitor_cmd("system_reset")
- logging.info("system_reset monitor command sent; waiting for "
- "guest to go down...")
- else: return
+ if not params.get("reboot_method"):
+ return
- # Wait for the session to become unresponsive
- if not kvm_utils.wait_for(lambda: not session.is_responsive(),
- 120, 0, 1):
- raise error.TestFail("Guest refuses to go down")
+ # Reboot the VM
+ session = kvm_test_utils.reboot(vm, session,
+ params.get("reboot_method"),
+ float(params.get("sleep_before_reset",
+ 10)))
finally:
session.close()
-
- logging.info("Guest is down; waiting for it to go up again...")
-
- session = kvm_utils.wait_for(vm.remote_login, 240, 0, 2)
- if not session:
- raise error.TestFail("Could not log into guest after reboot")
- session.close()
-
- logging.info("Guest is up again")
--
1.5.4.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [KVM-AUTOTEST PATCH 5/7] KVM test: new test timedrift_with_reboot
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 4/7] KVM test: move the reboot code to kvm_test_utils.py Michael Goldish
@ 2009-10-07 17:54 ` Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 6/7] KVM test: add option to kill all unresponsive VMs at the end of each test Michael Goldish
0 siblings, 1 reply; 12+ messages in thread
From: Michael Goldish @ 2009-10-07 17:54 UTC (permalink / raw)
To: autotest, kvm; +Cc: Michael Goldish
Checks the time drift introduced by one or more guest reboots (1 by default, controlled by the 'reboot_iterations' parameter).
Signed-off-by: Michael Goldish <mgoldish@redhat.com>
---
client/tests/kvm/kvm_tests.cfg.sample | 5 ++
client/tests/kvm/tests/timedrift_with_reboot.py | 88 +++++++++++++++++++++++
2 files changed, 93 insertions(+), 0 deletions(-)
create mode 100644 client/tests/kvm/tests/timedrift_with_reboot.py
diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index 618c21e..e80b645 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -120,6 +120,11 @@ variants:
migration_iterations = 3
drift_threshold = 10
drift_threshold_single = 3
+ - with_reboot:
+ type = timedrift_with_reboot
+ reboot_iterations = 1
+ drift_threshold = 10
+ drift_threshold_single = 3
- stress_boot: install setup
type = stress_boot
diff --git a/client/tests/kvm/tests/timedrift_with_reboot.py b/client/tests/kvm/tests/timedrift_with_reboot.py
new file mode 100644
index 0000000..642daaf
--- /dev/null
+++ b/client/tests/kvm/tests/timedrift_with_reboot.py
@@ -0,0 +1,88 @@
+import logging, time, commands, re
+from autotest_lib.client.common_lib import error
+import kvm_subprocess, kvm_test_utils, kvm_utils
+
+
+def run_timedrift_with_reboot(test, params, env):
+ """
+ Time drift test with reboot:
+
+ 1) Log into a guest.
+ 2) Take a time reading from the guest and host.
+ 3) Reboot the guest.
+ 4) Take a second time reading.
+ 5) If the drift (in seconds) is higher than a user specified value, fail.
+
+ @param test: KVM test object.
+ @param params: Dictionary with test parameters.
+ @param env: Dictionary with the test environment.
+ """
+ vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
+ session = kvm_test_utils.wait_for_login(vm)
+
+ # Collect test parameters:
+ # Command to run to get the current time
+ time_command = params.get("time_command")
+ # Filter which should match a string to be passed to time.strptime()
+ time_filter_re = params.get("time_filter_re")
+ # Time format for time.strptime()
+ time_format = params.get("time_format")
+ drift_threshold = float(params.get("drift_threshold", "10"))
+ drift_threshold_single = float(params.get("drift_threshold_single", "3"))
+ reboot_iterations = int(params.get("reboot_iterations", 1))
+
+ try:
+ # Get initial time
+ # (ht stands for host time, gt stands for guest time)
+ (ht0, gt0) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+
+ # Reboot
+ for i in range(reboot_iterations):
+ # Get time before current iteration
+ (ht0_, gt0_) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+ # Run current iteration
+ logging.info("Rebooting: iteration %d of %d..." %
+ (i + 1, reboot_iterations))
+ session = kvm_test_utils.reboot(vm, session)
+ # Get time after current iteration
+ (ht1_, gt1_) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+ # Report iteration results
+ host_delta = ht1_ - ht0_
+ guest_delta = gt1_ - gt0_
+ drift = abs(host_delta - guest_delta)
+ logging.info("Host duration (iteration %d): %.2f" %
+ (i + 1, host_delta))
+ logging.info("Guest duration (iteration %d): %.2f" %
+ (i + 1, guest_delta))
+ logging.info("Drift at iteration %d: %.2f seconds" %
+ (i + 1, drift))
+ # Fail if necessary
+ if drift > drift_threshold_single:
+ raise error.TestFail("Time drift too large at iteration %d: "
+ "%.2f seconds" % (i + 1, drift))
+
+ # Get final time
+ (ht1, gt1) = kvm_test_utils.get_time(session, time_command,
+ time_filter_re, time_format)
+
+ finally:
+ session.close()
+
+ # Report results
+ host_delta = ht1 - ht0
+ guest_delta = gt1 - gt0
+ drift = abs(host_delta - guest_delta)
+ logging.info("Host duration (%d reboots): %.2f" %
+ (reboot_iterations, host_delta))
+ logging.info("Guest duration (%d reboots): %.2f" %
+ (reboot_iterations, guest_delta))
+ logging.info("Drift after %d reboots: %.2f seconds" %
+ (reboot_iterations, drift))
+
+ # Fail if necessary
+ if drift > drift_threshold:
+ raise error.TestFail("Time drift too large after %d reboots: "
+ "%.2f seconds" % (reboot_iterations, drift))
--
1.5.4.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [KVM-AUTOTEST PATCH 6/7] KVM test: add option to kill all unresponsive VMs at the end of each test
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 5/7] KVM test: new test timedrift_with_reboot Michael Goldish
@ 2009-10-07 17:54 ` Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 7/7] KVM test: kvm_preprocessing.py: fix indentation and logging messages in postprocess_vm Michael Goldish
0 siblings, 1 reply; 12+ messages in thread
From: Michael Goldish @ 2009-10-07 17:54 UTC (permalink / raw)
To: autotest, kvm; +Cc: Michael Goldish
This is useful for tests that may leave VMs in a bad state but can't afford to
use kill_vm_on_error = yes.
For example, timedrift.with_reboot can fail because the reboot failed or
because the time drift was too large. In the latter case there's no reason to
kill the VM.
Signed-off-by: Michael Goldish <mgoldish@redhat.com>
---
client/tests/kvm/kvm_preprocessing.py | 12 ++++++++++++
client/tests/kvm/kvm_tests.cfg.sample | 1 +
2 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index 26f7f8e..e624a42 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -293,6 +293,18 @@ def postprocess(test, params, env):
int(params.get("post_command_timeout", "600")),
params.get("post_command_noncritical") == "yes")
+ # Kill all unresponsive VMs
+ if params.get("kill_unresponsive_vms") == "yes":
+ logging.debug("'kill_unresponsive_vms' specified; killing all VMs "
+ "that fail to respond to a remote login request...")
+ for vm in kvm_utils.env_get_all_vms(env):
+ if vm.is_alive():
+ session = vm.remote_login()
+ if session:
+ session.close()
+ else:
+ vm.destroy(gracefully=False)
+
# Kill the tailing threads of all VMs
for vm in kvm_utils.env_get_all_vms(env):
vm.kill_tail_thread()
diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
index e80b645..c4d8a60 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -13,6 +13,7 @@ convert_ppm_files_to_png_on_error = yes
#keep_ppm_files_on_error = yes
kill_vm = no
kill_vm_gracefully = yes
+kill_unresponsive_vms = yes
# Some default VM params
qemu_binary = qemu
--
1.5.4.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [KVM-AUTOTEST PATCH 7/7] KVM test: kvm_preprocessing.py: fix indentation and logging messages in postprocess_vm
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 6/7] KVM test: add option to kill all unresponsive VMs at the end of each test Michael Goldish
@ 2009-10-07 17:54 ` Michael Goldish
0 siblings, 0 replies; 12+ messages in thread
From: Michael Goldish @ 2009-10-07 17:54 UTC (permalink / raw)
To: autotest, kvm; +Cc: Michael Goldish
Signed-off-by: Michael Goldish <mgoldish@redhat.com>
---
client/tests/kvm/kvm_preprocessing.py | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index e624a42..5bae2bd 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -116,9 +116,12 @@ def postprocess_vm(test, params, env, name):
vm.send_monitor_cmd("screendump %s" % scrdump_filename)
if params.get("kill_vm") == "yes":
- if not kvm_utils.wait_for(vm.is_dead,
- float(params.get("kill_vm_timeout", 0)), 0.0, 1.0,
- "Waiting for VM to kill itself..."):
+ kill_vm_timeout = float(params.get("kill_vm_timeout", 0))
+ if kill_vm_timeout:
+ logging.debug("'kill_vm' specified; waiting for VM to shut down "
+ "before killing it...")
+ kvm_utils.wait_for(vm.is_dead, kill_vm_timeout, 0, 1)
+ else:
logging.debug("'kill_vm' specified; killing VM...")
vm.destroy(gracefully = params.get("kill_vm_gracefully") == "yes")
--
1.5.4.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 4/7] KVM test: move the reboot code to kvm_test_utils.py Michael Goldish
@ 2009-10-12 15:28 ` Lucas Meneghel Rodrigues
2009-10-27 9:32 ` Dor Laor
1 sibling, 1 reply; 12+ messages in thread
From: Lucas Meneghel Rodrigues @ 2009-10-12 15:28 UTC (permalink / raw)
To: Michael Goldish; +Cc: autotest, kvm
Hi Michael, I am reviewing your patchset and have just a minor remark
to make here:
On Wed, Oct 7, 2009 at 2:54 PM, Michael Goldish <mgoldish@redhat.com> wrote:
> This patch adds a new test that checks the timedrift introduced by migrations.
> It uses the same parameters used by the timedrift test to get the guest time.
> In addition, the number of migrations the test performs is controlled by the
> parameter 'migration_iterations'.
>
> Signed-off-by: Michael Goldish <mgoldish@redhat.com>
> ---
> client/tests/kvm/kvm_tests.cfg.sample | 33 ++++---
> client/tests/kvm/tests/timedrift_with_migration.py | 95 ++++++++++++++++++++
> 2 files changed, 115 insertions(+), 13 deletions(-)
> create mode 100644 client/tests/kvm/tests/timedrift_with_migration.py
>
> diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
> index 540d0a2..618c21e 100644
> --- a/client/tests/kvm/kvm_tests.cfg.sample
> +++ b/client/tests/kvm/kvm_tests.cfg.sample
> @@ -100,19 +100,26 @@ variants:
> type = linux_s3
>
> - timedrift: install setup
> - type = timedrift
> extra_params += " -rtc-td-hack"
> - # Pin the VM and host load to CPU #0
> - cpu_mask = 0x1
> - # Set the load and rest durations
> - load_duration = 20
> - rest_duration = 20
> - # Fail if the drift after load is higher than 50%
> - drift_threshold = 50
> - # Fail if the drift after the rest period is higher than 10%
> - drift_threshold_after_rest = 10
> - # For now, make sure this test is executed alone
> - used_cpus = 100
> + variants:
> + - with_load:
> + type = timedrift
> + # Pin the VM and host load to CPU #0
> + cpu_mask = 0x1
> + # Set the load and rest durations
> + load_duration = 20
> + rest_duration = 20
Even the default duration here seems way too brief; is there any
reason why 20s was chosen instead of, let's say, 1800s? I am under the
impression that 20s of load won't be enough to cause any noticeable
drift...
> + # Fail if the drift after load is higher than 50%
> + drift_threshold = 50
> + # Fail if the drift after the rest period is higher than 10%
> + drift_threshold_after_rest = 10
I am also curious about those thresholds and the reasoning behind them.
Is there any official agreement on what we consider to be an
unreasonable drift?
Another thing that stood out to me is the drift calculation: in the original
timedrift test, the guest drift is normalized against the host drift:
drift = 100.0 * (host_delta - guest_delta) / host_delta
While in the new drift tests, we consider only the guest drift. I
believe it is better to normalize all tests based on one drift
calculation criteria, and those values should be reviewed, and at
least a certain level of agreement on our development community should
be reached.
Other than this concern that came to my mind, the new tests look good
and work fine here. I had to do a slight rebase in one of the patches,
very minor stuff. The default values and the drift calculation can be
changed on a later time. Thanks!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration
[not found] <2046637733.55331255364857083.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com>
@ 2009-10-12 16:29 ` Michael Goldish
0 siblings, 0 replies; 12+ messages in thread
From: Michael Goldish @ 2009-10-12 16:29 UTC (permalink / raw)
To: Lucas Meneghel Rodrigues; +Cc: autotest, kvm
----- "Lucas Meneghel Rodrigues" <lmr@redhat.com> wrote:
> Hi Michael, I am reviewing your patchset and have just a minor remark
> to make here:
>
> On Wed, Oct 7, 2009 at 2:54 PM, Michael Goldish <mgoldish@redhat.com>
> wrote:
> > This patch adds a new test that checks the timedrift introduced by
> migrations.
> > It uses the same parameters used by the timedrift test to get the
> guest time.
> > In addition, the number of migrations the test performs is
> controlled by the
> > parameter 'migration_iterations'.
> >
> > Signed-off-by: Michael Goldish <mgoldish@redhat.com>
> > ---
> > client/tests/kvm/kvm_tests.cfg.sample | 33 ++++---
> > client/tests/kvm/tests/timedrift_with_migration.py | 95
> ++++++++++++++++++++
> > 2 files changed, 115 insertions(+), 13 deletions(-)
> > create mode 100644
> client/tests/kvm/tests/timedrift_with_migration.py
> >
> > diff --git a/client/tests/kvm/kvm_tests.cfg.sample
> b/client/tests/kvm/kvm_tests.cfg.sample
> > index 540d0a2..618c21e 100644
> > --- a/client/tests/kvm/kvm_tests.cfg.sample
> > +++ b/client/tests/kvm/kvm_tests.cfg.sample
> > @@ -100,19 +100,26 @@ variants:
> > type = linux_s3
> >
> > - timedrift: install setup
> > - type = timedrift
> > extra_params += " -rtc-td-hack"
> > - # Pin the VM and host load to CPU #0
> > - cpu_mask = 0x1
> > - # Set the load and rest durations
> > - load_duration = 20
> > - rest_duration = 20
> > - # Fail if the drift after load is higher than 50%
> > - drift_threshold = 50
> > - # Fail if the drift after the rest period is higher than
> 10%
> > - drift_threshold_after_rest = 10
> > - # For now, make sure this test is executed alone
> > - used_cpus = 100
> > + variants:
> > + - with_load:
> > + type = timedrift
> > + # Pin the VM and host load to CPU #0
> > + cpu_mask = 0x1
> > + # Set the load and rest durations
> > + load_duration = 20
> > + rest_duration = 20
>
> Even the default duration here seems way too brief here, is there any
> reason why 20s was chosen instead of, let's say, 1800s? I am under
> the
> impression that 20s of load won't be enough to cause any noticeable
> drift...
Apparently I've been working with a bad qemu version for quite a while,
because after 20s of load I often get a huge (80%) drift. This normally
shouldn't happen. We might want to wait a little longer than 20s, but
there's no need to wait as long as 1800s AFAIK. The test is meant to
catch drift problems, and apparently when there's a problem, it reveals
itself quickly. I'm not sure there are drift problems that reveal
themselves after only 1800s of load, but quite frankly, I know very little
about this, so the timeout value should be changed by the user.
> > + # Fail if the drift after load is higher than 50%
> > + drift_threshold = 50
> > + # Fail if the drift after the rest period is higher
> than 10%
> > + drift_threshold_after_rest = 10
>
> I am also curious about those tresholds and the reasoning behind
> them.
> Is there any official agreement on what we consider to be an
> unreasonable drift?
I really don't know. After asking around I got the impression that the
threshold should depend on the load. Theoretically it should be possible
to get any amount of drift with enough load -- that's the way I see it,
but I'm really not sure.
Maybe the best thing for us to do is run the time drift test a few times
with functional qemu versions as well as with broken ones (but not as broken
as the one I'm using), so we can see what thresholds best differentiate
between functional and broken code.
> Another thing that struck me out is drift calculation: On the
> original
> timedrift test, the guest drift is normalized against the host drift:
>
> drift = 100.0 * (host_delta - guest_delta) / host_delta
>
> While in the new drift tests, we consider only the guest drift. I
> believe is better to normalize all tests based on one drift
> calculation criteria, and those values should be reviewed, and at
> least a certain level of agreement on our development community
> should be reached.
The new tests use the host clock as reference like the original test.
We check how much time passed on the host, and then how much time passed
in the guest, and take the difference between those two. The result is an
absolute number of seconds because there's no "load duration" with which
to normalize the result. We're just interested in how much drift is caused
by a single reboot or a single migration (or a few). I don't think it
matters how long the reboot/migration procedure took -- we treat it as a
single discrete event.
> Other than this concern that came to my mind, the new tests look good
> and work fine here. I had to do a slight rebase in one of the
> patches,
> very minor stuff. The default values and the drift calculation can be
> changed on a later time. Thanks!
Since the two new tests are similar it may be a good idea to merge them
into one and put just the differing code in conditional blocks, e.g.
(common code)
if op == "migration":
(migration code)
elif op == "reboot":
(reboot code)
(common code)
And where we use logging.info() we can do something like:
logging.info("Drift after %d %ss: %s seconds" % (iterations, op, drift))
so for reboot we get "Drift after 5 reboots: 3 seconds".
If you think this is a good idea, we can do it in a different patch, or
we can do it now. I'm not even sure it's necessary because the tests
are rather simple as they are. Another question is what to call the merged
test -- timedrift with what?
Thanks,
Michael
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration
2009-10-12 15:28 ` [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Lucas Meneghel Rodrigues
@ 2009-10-27 9:32 ` Dor Laor
2009-10-28 6:54 ` Michael Goldish
0 siblings, 1 reply; 12+ messages in thread
From: Dor Laor @ 2009-10-27 9:32 UTC (permalink / raw)
To: Lucas Meneghel Rodrigues; +Cc: Michael Goldish, autotest, kvm
On 10/12/2009 05:28 PM, Lucas Meneghel Rodrigues wrote:
> Hi Michael, I am reviewing your patchset and have just a minor remark
> to make here:
>
> On Wed, Oct 7, 2009 at 2:54 PM, Michael Goldish<mgoldish@redhat.com> wrote:
>> This patch adds a new test that checks the timedrift introduced by migrations.
>> It uses the same parameters used by the timedrift test to get the guest time.
>> In addition, the number of migrations the test performs is controlled by the
>> parameter 'migration_iterations'.
>>
>> Signed-off-by: Michael Goldish<mgoldish@redhat.com>
>> ---
>> client/tests/kvm/kvm_tests.cfg.sample | 33 ++++---
>> client/tests/kvm/tests/timedrift_with_migration.py | 95 ++++++++++++++++++++
>> 2 files changed, 115 insertions(+), 13 deletions(-)
>> create mode 100644 client/tests/kvm/tests/timedrift_with_migration.py
>>
>> diff --git a/client/tests/kvm/kvm_tests.cfg.sample b/client/tests/kvm/kvm_tests.cfg.sample
>> index 540d0a2..618c21e 100644
>> --- a/client/tests/kvm/kvm_tests.cfg.sample
>> +++ b/client/tests/kvm/kvm_tests.cfg.sample
>> @@ -100,19 +100,26 @@ variants:
>> type = linux_s3
>>
>> - timedrift: install setup
>> - type = timedrift
>> extra_params += " -rtc-td-hack"
>> - # Pin the VM and host load to CPU #0
>> - cpu_mask = 0x1
>> - # Set the load and rest durations
>> - load_duration = 20
>> - rest_duration = 20
>> - # Fail if the drift after load is higher than 50%
>> - drift_threshold = 50
>> - # Fail if the drift after the rest period is higher than 10%
>> - drift_threshold_after_rest = 10
>> - # For now, make sure this test is executed alone
>> - used_cpus = 100
>> + variants:
>> + - with_load:
>> + type = timedrift
>> + # Pin the VM and host load to CPU #0
>> + cpu_mask = 0x1
Let's use -smp 2 always.
btw: we should not run the load test in parallel with the standard tests.
>> + # Set the load and rest durations
>> + load_duration = 20
>> + rest_duration = 20
>
> Even the default duration here seems way too brief here, is there any
> reason why 20s was chosen instead of, let's say, 1800s? I am under the
> impression that 20s of load won't be enough to cause any noticeable
> drift...
>
>> + # Fail if the drift after load is higher than 50%
>> + drift_threshold = 50
>> + # Fail if the drift after the rest period is higher than 10%
>> + drift_threshold_after_rest = 10
>
> I am also curious about those tresholds and the reasoning behind them.
> Is there any official agreement on what we consider to be an
> unreasonable drift?
>
> Another thing that struck me out is drift calculation: On the original
> timedrift test, the guest drift is normalized against the host drift:
>
> drift = 100.0 * (host_delta - guest_delta) / host_delta
>
> While in the new drift tests, we consider only the guest drift. I
> believe is better to normalize all tests based on one drift
> calculation criteria, and those values should be reviewed, and at
> least a certain level of agreement on our development community should
> be reached.
I think we don't need to calculate drift ratio. We should define a
threshold in seconds, let's say 2 seconds. Beyond that, there should not
be any drift.
Do we support migration to a different host? We should, especially in
this test too. The destination host reading should also be used.
Apart from that, good patchset, and it's a good thing you refactored some
of the code into shared utils.
>
> Other than this concern that came to my mind, the new tests look good
> and work fine here. I had to do a slight rebase in one of the patches,
> very minor stuff. The default values and the drift calculation can be
> changed on a later time. Thanks!
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration
2009-10-27 9:32 ` Dor Laor
@ 2009-10-28 6:54 ` Michael Goldish
2009-11-16 9:17 ` Dor Laor
0 siblings, 1 reply; 12+ messages in thread
From: Michael Goldish @ 2009-10-28 6:54 UTC (permalink / raw)
To: dlaor; +Cc: autotest, kvm, Lucas Meneghel Rodrigues
----- "Dor Laor" <dlaor@redhat.com> wrote:
> On 10/12/2009 05:28 PM, Lucas Meneghel Rodrigues wrote:
> > Hi Michael, I am reviewing your patchset and have just a minor
> remark
> > to make here:
> >
> > On Wed, Oct 7, 2009 at 2:54 PM, Michael Goldish<mgoldish@redhat.com>
> wrote:
> >> This patch adds a new test that checks the timedrift introduced by
> migrations.
> >> It uses the same parameters used by the timedrift test to get the
> guest time.
> >> In addition, the number of migrations the test performs is
> controlled by the
> >> parameter 'migration_iterations'.
> >>
> >> Signed-off-by: Michael Goldish<mgoldish@redhat.com>
> >> ---
> >> client/tests/kvm/kvm_tests.cfg.sample | 33
> ++++---
> >> client/tests/kvm/tests/timedrift_with_migration.py | 95
> ++++++++++++++++++++
> >> 2 files changed, 115 insertions(+), 13 deletions(-)
> >> create mode 100644
> client/tests/kvm/tests/timedrift_with_migration.py
> >>
> >> diff --git a/client/tests/kvm/kvm_tests.cfg.sample
> b/client/tests/kvm/kvm_tests.cfg.sample
> >> index 540d0a2..618c21e 100644
> >> --- a/client/tests/kvm/kvm_tests.cfg.sample
> >> +++ b/client/tests/kvm/kvm_tests.cfg.sample
> >> @@ -100,19 +100,26 @@ variants:
> >> type = linux_s3
> >>
> >> - timedrift: install setup
> >> - type = timedrift
> >> extra_params += " -rtc-td-hack"
> >> - # Pin the VM and host load to CPU #0
> >> - cpu_mask = 0x1
> >> - # Set the load and rest durations
> >> - load_duration = 20
> >> - rest_duration = 20
> >> - # Fail if the drift after load is higher than 50%
> >> - drift_threshold = 50
> >> - # Fail if the drift after the rest period is higher than
> 10%
> >> - drift_threshold_after_rest = 10
> >> - # For now, make sure this test is executed alone
> >> - used_cpus = 100
> >> + variants:
> >> + - with_load:
> >> + type = timedrift
> >> + # Pin the VM and host load to CPU #0
> >> + cpu_mask = 0x1
>
>
> Let's use -smp 2 always.
We can also just make -smp 2 the default for all tests. Does that sound
good?
> btw: we need not to parallel the load test with standard tests.
We already don't, because the load test has used_cpus = 100 which
forces it to run alone.
> >> + # Set the load and rest durations
> >> + load_duration = 20
> >> + rest_duration = 20
> >
> > Even the default duration here seems way too brief here, is there
> any
> > reason why 20s was chosen instead of, let's say, 1800s? I am under
> the
> > impression that 20s of load won't be enough to cause any noticeable
> > drift...
> >
> >> + # Fail if the drift after load is higher than 50%
> >> + drift_threshold = 50
> >> + # Fail if the drift after the rest period is
> higher than 10%
> >> + drift_threshold_after_rest = 10
> >
> > I am also curious about those tresholds and the reasoning behind
> them.
> > Is there any official agreement on what we consider to be an
> > unreasonable drift?
> >
> > Another thing that struck me out is drift calculation: On the
> original
> > timedrift test, the guest drift is normalized against the host
> drift:
> >
> > drift = 100.0 * (host_delta - guest_delta) / host_delta
> >
> > While in the new drift tests, we consider only the guest drift. I
> > believe is better to normalize all tests based on one drift
> > calculation criteria, and those values should be reviewed, and at
> > least a certain level of agreement on our development community
> should
> > be reached.
>
> I think we don't need to calculate drift ratio. We should define a
> threshold in seconds, let's say 2 seconds. Beyond that, there should
> not be any drift.
Are you talking about the timedrift with load or timedrift with
migration or reboot tests? I was told that when running the load test
for e.g 60 secs, the drift should be given in % of that duration.
In the case of migration and reboot, absolute durations are used (in
seconds, no %). Should we do that in the load test too?
> Do we support migration to a different host? We should, especially in
> this test too. The destination host reading should also be used.
> Apart for that, good patchset, and good thing you refactored some of
> the code to shared utils.
We don't, and it would be very messy to implement with the framework
right now. We should probably do that as some sort of server side test,
but we don't have server side tests right now, so doing it may take a
little time and effort. I got the impression that there are more
important things to do at the moment, but please correct me if I'm wrong.
> >
> > Other than this concern that came to my mind, the new tests look
> good
> > and work fine here. I had to do a slight rebase in one of the
> patches,
> > very minor stuff. The default values and the drift calculation can
> be
> > changed on a later time. Thanks!
> > --
> > To unsubscribe from this list: send the line "unsubscribe kvm" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration
2009-10-28 6:54 ` Michael Goldish
@ 2009-11-16 9:17 ` Dor Laor
0 siblings, 0 replies; 12+ messages in thread
From: Dor Laor @ 2009-11-16 9:17 UTC (permalink / raw)
To: Michael Goldish; +Cc: autotest, kvm, Lucas Meneghel Rodrigues
On 10/28/2009 08:54 AM, Michael Goldish wrote:
>
> ----- "Dor Laor"<dlaor@redhat.com> wrote:
>
>> On 10/12/2009 05:28 PM, Lucas Meneghel Rodrigues wrote:
>>> Hi Michael, I am reviewing your patchset and have just a minor
>> remark
>>> to make here:
>>>
>>> On Wed, Oct 7, 2009 at 2:54 PM, Michael Goldish<mgoldish@redhat.com>
>> wrote:
>>>> This patch adds a new test that checks the timedrift introduced by
>> migrations.
>>>> It uses the same parameters used by the timedrift test to get the
>> guest time.
>>>> In addition, the number of migrations the test performs is
>> controlled by the
>>>> parameter 'migration_iterations'.
>>>>
>>>> Signed-off-by: Michael Goldish<mgoldish@redhat.com>
>>>> ---
>>>> client/tests/kvm/kvm_tests.cfg.sample | 33
>> ++++---
>>>> client/tests/kvm/tests/timedrift_with_migration.py | 95
>> ++++++++++++++++++++
>>>> 2 files changed, 115 insertions(+), 13 deletions(-)
>>>> create mode 100644
>> client/tests/kvm/tests/timedrift_with_migration.py
>>>>
>>>> diff --git a/client/tests/kvm/kvm_tests.cfg.sample
>> b/client/tests/kvm/kvm_tests.cfg.sample
>>>> index 540d0a2..618c21e 100644
>>>> --- a/client/tests/kvm/kvm_tests.cfg.sample
>>>> +++ b/client/tests/kvm/kvm_tests.cfg.sample
>>>> @@ -100,19 +100,26 @@ variants:
>>>> type = linux_s3
>>>>
>>>> - timedrift: install setup
>>>> - type = timedrift
>>>> extra_params += " -rtc-td-hack"
>>>> - # Pin the VM and host load to CPU #0
>>>> - cpu_mask = 0x1
>>>> - # Set the load and rest durations
>>>> - load_duration = 20
>>>> - rest_duration = 20
>>>> - # Fail if the drift after load is higher than 50%
>>>> - drift_threshold = 50
>>>> - # Fail if the drift after the rest period is higher than
>> 10%
>>>> - drift_threshold_after_rest = 10
>>>> - # For now, make sure this test is executed alone
>>>> - used_cpus = 100
>>>> + variants:
>>>> + - with_load:
>>>> + type = timedrift
>>>> + # Pin the VM and host load to CPU #0
>>>> + cpu_mask = 0x1
>>
>>
>> Let's use -smp 2 always.
>
> We can also just make -smp 2 the default for all tests. Does that sound
> good?
Yes
>
>> btw: we need not to parallel the load test with standard tests.
>
> We already don't, because the load test has used_cpus = 100 which
> forces it to run alone.
Soon I'll have 100 on my laptop :), better change it to -1 or MAX_INT
>
>>>> + # Set the load and rest durations
>>>> + load_duration = 20
>>>> + rest_duration = 20
>>>
>>> Even the default duration here seems way too brief here, is there
>> any
>>> reason why 20s was chosen instead of, let's say, 1800s? I am under
>> the
>>> impression that 20s of load won't be enough to cause any noticeable
>>> drift...
>>>
>>>> + # Fail if the drift after load is higher than 50%
>>>> + drift_threshold = 50
>>>> + # Fail if the drift after the rest period is
>> higher than 10%
>>>> + drift_threshold_after_rest = 10
>>>
>>> I am also curious about those tresholds and the reasoning behind
>> them.
>>> Is there any official agreement on what we consider to be an
>>> unreasonable drift?
>>>
>>> Another thing that struck me out is drift calculation: On the
>> original
>>> timedrift test, the guest drift is normalized against the host
>> drift:
>>>
>>> drift = 100.0 * (host_delta - guest_delta) / host_delta
>>>
>>> While in the new drift tests, we consider only the guest drift. I
>>> believe is better to normalize all tests based on one drift
>>> calculation criteria, and those values should be reviewed, and at
>>> least a certain level of agreement on our development community
>> should
>>> be reached.
>>
>> I think we don't need to calculate drift ratio. We should define a
>> threshold in seconds, let's say 2 seconds. Beyond that, there should
>> not be any drift.
>
> Are you talking about the timedrift with load or timedrift with
> migration or reboot tests? I was told that when running the load test
> for e.g 60 secs, the drift should be given in % of that duration.
> In the case of migration and reboot, absolute durations are used (in
> seconds, no %). Should we do that in the load test too?
Yes, but: during extreme load, we do predict that a guest *without* pv
clock will drift and won't be able to catch up until the load stops, and
only then will it catch up. So my recommendation is to do the following:
- pvclock guest - can check with 'cat
/sys/devices/system/clocksource/clocksource0/current_clocksource ' don't
allow drift during huge loads.
Exist (+safe) for rhel5.4 guests and ~2.6.29 (from 2.6.27).
- non-pv clock - run the load, stop the load, wait 5 seconds, measure time
For both, use absolute times.
>
>> Do we support migration to a different host? We should, especially in
>> this test too. The destination host reading should also be used.
>> Apart for that, good patchset, and good thing you refactored some of
>> the code to shared utils.
>
> We don't, and it would be very messy to implement with the framework
> right now. We should probably do that as some sort of server side test,
> but we don't have server side tests right now, so doing it may take a
> little time and effort. I got the impression that there are more
> important things to do at the moment, but please correct me if I'm wrong.
It is needed and will help us check migration from one physical cpu
to another, different host versions and of course the potential time gap
bug upon migration.
>
>>>
>>> Other than this concern that came to my mind, the new tests look
>> good
>>> and work fine here. I had to do a slight rebase in one of the
>> patches,
>>> very minor stuff. The default values and the drift calculation can
>> be
>>> changed on a later time. Thanks!
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2009-11-16 9:17 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-07 17:54 [KVM-AUTOTEST PATCH 1/7] KVM test: migration test: move the bulk of the code to a utility function Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 2/7] KVM test: timedrift test: move the get_time() helper function to kvm_test_utils.py Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 4/7] KVM test: move the reboot code to kvm_test_utils.py Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 5/7] KVM test: new test timedrift_with_reboot Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 6/7] KVM test: add option to kill all unresponsive VMs at the end of each test Michael Goldish
2009-10-07 17:54 ` [KVM-AUTOTEST PATCH 7/7] KVM test: kvm_preprocessing.py: fix indentation and logging messages in postprocess_vm Michael Goldish
2009-10-12 15:28 ` [Autotest] [KVM-AUTOTEST PATCH 3/7] KVM test: new test timedrift_with_migration Lucas Meneghel Rodrigues
2009-10-27 9:32 ` Dor Laor
2009-10-28 6:54 ` Michael Goldish
2009-11-16 9:17 ` Dor Laor
[not found] <2046637733.55331255364857083.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com>
2009-10-12 16:29 ` Michael Goldish
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox