* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
[not found] <20240821223012.3757828-1-vipinsh@google.com>
@ 2024-08-22 20:55 ` Vipin Sharma
2024-11-01 22:13 ` Vipin Sharma
[not found] ` <20240821223012.3757828-2-vipinsh@google.com>
1 sibling, 1 reply; 10+ messages in thread
From: Vipin Sharma @ 2024-08-22 20:55 UTC (permalink / raw)
To: kvm, kvmarm, kvm-riscv, linux-arm-kernel
Cc: Paolo Bonzini, Sean Christopherson, Anup Patel,
Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
Marc Zyngier, Oliver Upton, linux-kernel
Oops! Adding arch mailing lists and maintainers which have an arch folder
in tools/testing/selftests/kvm.
On Wed, Aug 21, 2024 at 3:30 PM Vipin Sharma <vipinsh@google.com> wrote:
>
> This series introduces a KVM selftests runner to make it easier to
> run selftests with some interesting configurations and provide some
> enhancements over the existing kselftests runner.
>
> I would like to get early feedback from the community and see if this
> is something which can be useful for improving KVM selftests coverage
> and worth investing time in. Some specific questions:
>
> 1. Should this be done?
> 2. Which features are a must?
> 3. Any other way to write test configuration compared to what is done here?
>
> Note: the Python code written for the runner is not optimized, but it shows
> how this runner can be useful.
>
> What are the goals?
> - Run tests with more than just the default settings of KVM module
> parameters and the test itself.
> - Capture issues which only show up when certain combinations of module
> parameters and test options are used.
> - Provide a minimum set of tests which can be standardised for KVM patches.
> - Run tests in parallel.
> - Dump output in a hierarchical folder structure for easier tracking of
> failure/success output.
> - Feel free to add yours :)
>
> Why not use/extend kselftests?
> - Other subsystems' goals might not align, and it's going to be difficult to
> capture a broader set of requirements.
> - Instead of test configurations we would need separate shell scripts
> which act as tests for each test arg and module parameter
> combination. This would easily pollute the KVM selftests directory.
> - Easier to enhance features using Python packages than shell scripts.
>
> What does this runner do?
> - Reads a test configuration file (tests.json in patch 1).
> Configurations in JSON are written in a hierarchy where multiple suites
> exist and each suite contains multiple tests.
> - Provides a way to execute tests inside a suite in parallel.
> - Provides a way to dump output to a folder in a hierarchical manner.
> - Allows running selected suites, or specific tests in a suite.
> - Allows some setup and teardown for test suites and tests.
> - A timeout can be provided to limit test execution duration.
> - Allows running test suites or tests on a specific architecture only.
>
> The runner is written in Python and the goal is to only use standard library
> constructs. The runner will work on Python 3.6 and up.
>
> What does a test configuration file look like?
> Test configurations are written in JSON as it is easier to read and has
> built-in package support in Python. The root level is a JSON array denoting
> suites, and each suite can contain multiple tests in it using a JSON array.
>
> [
> {
> "suite": "dirty_log_perf_tests",
> "timeout_s": 300,
> "arch": "x86_64",
> "setup": "echo Setting up suite",
> "teardown": "echo tearing down suite",
> "tests": [
> {
> "name": "dirty_log_perf_test_max_vcpu_no_manual_protect",
> "command": "./dirty_log_perf_test -v $(grep -c ^processor /proc/cpuinfo) -g",
> "arch": "x86_64",
> "setup": "echo Setting up test",
> "teardown": "echo tearing down test",
> "timeout_s": 5
> }
> ]
> }
> ]
>
> Usage:
> Runner "runner.py" and test configuration "tests.json" lives in
> tool/testing/selftests/kvm directory.
>
> To run serially:
> ./runner.py tests.json
>
> To run specific test suites:
> ./runner.py tests.json dirty_log_perf_tests x86_sanity_tests
>
> To run specific test in a suite:
> ./runner.py tests.json x86_sanity_tests/vmx_msrs_test
>
> To run everything in parallel (runs tests inside a suite in parallel):
> ./runner.py -j 10 tests.json
>
> To dump output to disk:
> ./runner.py -j 10 tests.json -o sample_run
>
> Sample output (after removing timestamp, process ID, and logging
> level columns):
>
> ./runner.py tests.json -j 10 -o sample_run
> PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_no_manual_protect
> PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect
> PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect_random_access
> PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_10_vcpu_hugetlb
> PASSED: x86_sanity_tests/vmx_msrs_test
> SKIPPED: x86_sanity_tests/private_mem_conversions_test
> FAILED: x86_sanity_tests/apic_bus_clock_test
> PASSED: x86_sanity_tests/dirty_log_page_splitting_test
> --------------------------------------------------------------------------
> Test runner result:
> 1) dirty_log_perf_tests:
> 1) PASSED: dirty_log_perf_test_max_vcpu_no_manual_protect
> 2) PASSED: dirty_log_perf_test_max_vcpu_manual_protect
> 3) PASSED: dirty_log_perf_test_max_vcpu_manual_protect_random_access
> 4) PASSED: dirty_log_perf_test_max_10_vcpu_hugetlb
> 2) x86_sanity_tests:
> 1) PASSED: vmx_msrs_test
> 2) SKIPPED: private_mem_conversions_test
> 3) FAILED: apic_bus_clock_test
> 4) PASSED: dirty_log_page_splitting_test
> --------------------------------------------------------------------------
>
> Directory structure created:
>
> sample_run/
> |-- dirty_log_perf_tests
> | |-- dirty_log_perf_test_max_10_vcpu_hugetlb
> | | |-- command.stderr
> | | |-- command.stdout
> | | |-- setup.stderr
> | | |-- setup.stdout
> | | |-- teardown.stderr
> | | `-- teardown.stdout
> | |-- dirty_log_perf_test_max_vcpu_manual_protect
> | | |-- command.stderr
> | | `-- command.stdout
> | |-- dirty_log_perf_test_max_vcpu_manual_protect_random_access
> | | |-- command.stderr
> | | `-- command.stdout
> | `-- dirty_log_perf_test_max_vcpu_no_manual_protect
> | |-- command.stderr
> | `-- command.stdout
> `-- x86_sanity_tests
> |-- apic_bus_clock_test
> | |-- command.stderr
> | `-- command.stdout
> |-- dirty_log_page_splitting_test
> | |-- command.stderr
> | |-- command.stdout
> | |-- setup.stderr
> | |-- setup.stdout
> | |-- teardown.stderr
> | `-- teardown.stdout
> |-- private_mem_conversions_test
> | |-- command.stderr
> | `-- command.stdout
> `-- vmx_msrs_test
> |-- command.stderr
> `-- command.stdout
>
>
> Some other features for the future:
> - Provide a "precheck" command option in JSON, which can filter/skip tests if
> certain conditions are not met.
> - An iteration option in the runner. This will allow the same test suites to
> be run again.
>
> Vipin Sharma (1):
> KVM: selftests: Create KVM selftests runner to run interesting tests
>
> tools/testing/selftests/kvm/runner.py | 282 +++++++++++++++++++++++++
> tools/testing/selftests/kvm/tests.json | 60 ++++++
> 2 files changed, 342 insertions(+)
> create mode 100755 tools/testing/selftests/kvm/runner.py
> create mode 100644 tools/testing/selftests/kvm/tests.json
>
>
> base-commit: de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed
> --
> 2.46.0.184.g6999bdac58-goog
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] KVM: selftests: Create KVM selftests runner to run interesting tests
[not found] ` <20240821223012.3757828-2-vipinsh@google.com>
@ 2024-08-22 20:56 ` Vipin Sharma
0 siblings, 0 replies; 10+ messages in thread
From: Vipin Sharma @ 2024-08-22 20:56 UTC (permalink / raw)
To: KVM, kvmarm, kvm-riscv, linux-arm-kernel
Cc: Paolo Bonzini, Sean Christopherson, Anup Patel,
Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
Marc Zyngier, Oliver Upton, Linux Kernel Mailing List
Oops! Adding arch mailing lists and maintainers which have an arch folder
in tools/testing/selftests/kvm.
On Wed, Aug 21, 2024 at 3:30 PM Vipin Sharma <vipinsh@google.com> wrote:
>
> Create a selftest runner "runner.py" for KVM which can run tests with
> more interesting configurations than the default values. Read
> those configurations from "tests.json".
>
> Provide the runner some options to run differently:
> 1. Run using different configuration files.
> 2. Run a specific test suite or a test in a specific suite.
> 3. Allow some setup and teardown capability for each test and test suite
> execution.
> 4. A timeout value for tests.
> 5. Run test suites in parallel.
> 6. Dump stdout and stderr in a hierarchical folder structure.
> 7. Run/skip tests based on the platform they are executing on.
>
> Print summary of the run at the end.
>
> Add a starter test configuration file "tests.json" with some sample
> tests which the runner can use to execute tests.
>
> Signed-off-by: Vipin Sharma <vipinsh@google.com>
> ---
> tools/testing/selftests/kvm/runner.py | 282 +++++++++++++++++++++++++
> tools/testing/selftests/kvm/tests.json | 60 ++++++
> 2 files changed, 342 insertions(+)
> create mode 100755 tools/testing/selftests/kvm/runner.py
> create mode 100644 tools/testing/selftests/kvm/tests.json
>
> diff --git a/tools/testing/selftests/kvm/runner.py b/tools/testing/selftests/kvm/runner.py
> new file mode 100755
> index 000000000000..46f6c1c8ce2c
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/runner.py
> @@ -0,0 +1,282 @@
> +#!/usr/bin/env python3
> +
> +import argparse
> +import json
> +import subprocess
> +import os
> +import platform
> +import logging
> +import contextlib
> +import textwrap
> +import shutil
> +
> +from pathlib import Path
> +from multiprocessing import Pool
> +
> +logging.basicConfig(level=logging.INFO,
> + format = "%(asctime)s | %(process)d | %(levelname)8s | %(message)s")
> +
> +class Command:
> + """Executes a command
> +
> + Execute a command.
> + """
> + def __init__(self, id, command, timeout=None, command_artifacts_dir=None):
> + self.id = id
> + self.args = command
> + self.timeout = timeout
> + self.command_artifacts_dir = command_artifacts_dir
> +
> + def __run(self, command, timeout=None, output=None, error=None):
> + proc=subprocess.run(command, stdout=output,
> + stderr=error, universal_newlines=True,
> + shell=True, timeout=timeout)
> + return proc.returncode
> +
> + def run(self):
> + output = None
> + error = None
> + with contextlib.ExitStack() as stack:
> + if self.command_artifacts_dir is not None:
> + output_path = os.path.join(self.command_artifacts_dir, f"{self.id}.stdout")
> + error_path = os.path.join(self.command_artifacts_dir, f"{self.id}.stderr")
> + output = stack.enter_context(open(output_path, encoding="utf-8", mode = "w"))
> + error = stack.enter_context(open(error_path, encoding="utf-8", mode = "w"))
> + return self.__run(self.args, self.timeout, output, error)
> +
> +COMMAND_TIMED_OUT = "TIMED_OUT"
> +COMMAND_PASSED = "PASSED"
> +COMMAND_FAILED = "FAILED"
> +COMMAND_SKIPPED = "SKIPPED"
> +SETUP_FAILED = "SETUP_FAILED"
> +TEARDOWN_FAILED = "TEARDOWN_FAILED"
> +
> +def run_command(command):
> + if command is None:
> + return COMMAND_PASSED
> +
> + try:
> + ret = command.run()
> + if ret == 0:
> + return COMMAND_PASSED
> + elif ret == 4:
> + return COMMAND_SKIPPED
> + else:
> + return COMMAND_FAILED
> + except subprocess.TimeoutExpired as e:
> + logging.error(type(e).__name__ + str(e))
> + return COMMAND_TIMED_OUT
> +
> +class Test:
> + """A single test.
> +
> + A test which can be run on its own.
> + """
> + def __init__(self, test_json, timeout=None, suite_dir=None):
> + self.name = test_json["name"]
> + self.test_artifacts_dir = None
> + self.setup_command = None
> + self.teardown_command = None
> +
> + if suite_dir is not None:
> + self.test_artifacts_dir = os.path.join(suite_dir, self.name)
> +
> + test_timeout = test_json.get("timeout_s", timeout)
> +
> + self.test_command = Command("command", test_json["command"], test_timeout, self.test_artifacts_dir)
> + if "setup" in test_json:
> + self.setup_command = Command("setup", test_json["setup"], test_timeout, self.test_artifacts_dir)
> + if "teardown" in test_json:
> + self.teardown_command = Command("teardown", test_json["teardown"], test_timeout, self.test_artifacts_dir)
> +
> + def run(self):
> + if self.test_artifacts_dir is not None:
> + Path(self.test_artifacts_dir).mkdir(parents=True, exist_ok=True)
> +
> + setup_status = run_command(self.setup_command)
> + if setup_status != COMMAND_PASSED:
> + return SETUP_FAILED
> +
> + try:
> + status = run_command(self.test_command)
> + return status
> + finally:
> + teardown_status = run_command(self.teardown_command)
> + if (teardown_status != COMMAND_PASSED
> + and (status == COMMAND_PASSED or status == COMMAND_SKIPPED)):
> + return TEARDOWN_FAILED
> +
> +def run_test(test):
> + return test.run()
> +
> +class Suite:
> + """Collection of tests to run
> +
> + Group of tests.
> + """
> + def __init__(self, suite_json, platform_arch, artifacts_dir, test_filter):
> + self.suite_name = suite_json["suite"]
> + self.suite_artifacts_dir = None
> + self.setup_command = None
> + self.teardown_command = None
> + timeout = suite_json.get("timeout_s", None)
> +
> + if artifacts_dir is not None:
> + self.suite_artifacts_dir = os.path.join(artifacts_dir, self.suite_name)
> +
> + if "setup" in suite_json:
> + self.setup_command = Command("setup", suite_json["setup"], timeout, self.suite_artifacts_dir)
> + if "teardown" in suite_json:
> + self.teardown_command = Command("teardown", suite_json["teardown"], timeout, self.suite_artifacts_dir)
> +
> + self.tests = []
> + for test_json in suite_json["tests"]:
> + if len(test_filter) > 0 and test_json["name"] not in test_filter:
> + continue
> + if test_json.get("arch") is None or test_json["arch"] == platform_arch:
> + self.tests.append(Test(test_json, timeout, self.suite_artifacts_dir))
> +
> + def run(self, jobs=1):
> + result = {}
> + if len(self.tests) == 0:
> + return COMMAND_PASSED, result
> +
> + if self.suite_artifacts_dir is not None:
> + Path(self.suite_artifacts_dir).mkdir(parents = True, exist_ok = True)
> +
> + setup_status = run_command(self.setup_command)
> + if setup_status != COMMAND_PASSED:
> + return SETUP_FAILED, result
> +
> +
> + if jobs > 1:
> + with Pool(jobs) as p:
> + tests_status = p.map(run_test, self.tests)
> + for i,test in enumerate(self.tests):
> + logging.info(f"{tests_status[i]}: {self.suite_name}/{test.name}")
> + result[test.name] = tests_status[i]
> + else:
> + for test in self.tests:
> + status = run_test(test)
> + logging.info(f"{status}: {self.suite_name}/{test.name}")
> + result[test.name] = status
> +
> + teardown_status = run_command(self.teardown_command)
> + if teardown_status != COMMAND_PASSED:
> + return TEARDOWN_FAILED, result
> +
> +
> + return COMMAND_PASSED, result
> +
> +def load_tests(path):
> + with open(path) as f:
> + tests = json.load(f)
> + return tests
> +
> +
> +def run_suites(suites, jobs):
> + """Runs the tests.
> +
> + Run test suites from the tests file.
> + """
> + result = {}
> + for suite in suites:
> + result[suite.suite_name] = suite.run(jobs)
> + return result
> +
> +def parse_test_filter(test_suite_or_test):
> + test_filter = {}
> + if len(test_suite_or_test) == 0:
> + return test_filter
> + for test in test_suite_or_test:
> + test_parts = test.split("/")
> + if len(test_parts) > 2:
> + raise ValueError("Incorrect format of suite/test_name combo")
> + if test_parts[0] not in test_filter:
> + test_filter[test_parts[0]] = []
> + if len(test_parts) == 2:
> + test_filter[test_parts[0]].append(test_parts[1])
> +
> + return test_filter
> +
> +def parse_suites(suites_json, platform_arch, artifacts_dir, test_suite_or_test):
> + suites = []
> + test_filter = parse_test_filter(test_suite_or_test)
> + for suite_json in suites_json:
> + if len(test_filter) > 0 and suite_json["suite"] not in test_filter:
> + continue
> + if suite_json.get("arch") is None or suite_json["arch"] == platform_arch:
> + suites.append(Suite(suite_json,
> + platform_arch,
> + artifacts_dir,
> + test_filter.get(suite_json["suite"], [])))
> + return suites
> +
> +
> +def pretty_print(result):
> + logging.info("--------------------------------------------------------------------------")
> + if not result:
> + logging.warning("No test executed.")
> + return
> + logging.info("Test runner result:")
> + suite_count = 0
> + test_count = 0
> + for suite_name, suite_result in result.items():
> + suite_count += 1
> + logging.info(f"{suite_count}) {suite_name}:")
> + if suite_result[0] != COMMAND_PASSED:
> + logging.info(f"\t{suite_result[0]}")
> + test_count = 0
> + for test_name, test_result in suite_result[1].items():
> + test_count += 1
> + if test_result == "PASSED":
> + logging.info(f"\t{test_count}) {test_result}: {test_name}")
> + else:
> + logging.error(f"\t{test_count}) {test_result}: {test_name}")
> + logging.info("--------------------------------------------------------------------------")
> +
> +def args_parser():
> + parser = argparse.ArgumentParser(
> + prog = "KVM Selftests Runner",
> + description = "Run KVM selftests with different configurations",
> + formatter_class=argparse.RawTextHelpFormatter
> + )
> +
> + parser.add_argument("-o","--output",
> + help="Creates a folder to dump test results.")
> + parser.add_argument("-j", "--jobs", default = 1, type = int,
> + help="Number of parallel executions in a suite")
> + parser.add_argument("test_suites_json",
> + help = "File containing test suites to run")
> +
> + test_suite_or_test_help = textwrap.dedent("""\
> + Run specific test suite or specific test from the test suite.
> + If nothing specified then run all of the tests.
> +
> + Example:
> + runner.py tests.json A/a1 A/a4 B C/c1
> +
> + Assuming capital letters are test suites and small letters are tests.
> + Runner will:
> + - Run test a1 and a4 from the test suite A
> + - Run all tests from the test suite B
> + - Run test c1 from the test suite C"""
> + )
> + parser.add_argument("test_suite_or_test", nargs="*", help=test_suite_or_test_help)
> +
> +
> + return parser.parse_args()
> +
> +def main():
> + args = args_parser()
> + suites_json = load_tests(args.test_suites_json)
> + suites = parse_suites(suites_json, platform.machine(),
> + args.output, args.test_suite_or_test)
> +
> + if args.output is not None:
> + shutil.rmtree(args.output, ignore_errors=True)
> + result = run_suites(suites, args.jobs)
> + pretty_print(result)
> +
> +if __name__ == "__main__":
> + main()
> diff --git a/tools/testing/selftests/kvm/tests.json b/tools/testing/selftests/kvm/tests.json
> new file mode 100644
> index 000000000000..1c1c15a0e880
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/tests.json
> @@ -0,0 +1,60 @@
> +[
> + {
> + "suite": "dirty_log_perf_tests",
> + "timeout_s": 300,
> + "tests": [
> + {
> + "name": "dirty_log_perf_test_max_vcpu_no_manual_protect",
> + "command": "./dirty_log_perf_test -v $(grep -c ^processor /proc/cpuinfo) -g"
> + },
> + {
> + "name": "dirty_log_perf_test_max_vcpu_manual_protect",
> + "command": "./dirty_log_perf_test -v $(grep -c ^processor /proc/cpuinfo)"
> + },
> + {
> + "name": "dirty_log_perf_test_max_vcpu_manual_protect_random_access",
> + "command": "./dirty_log_perf_test -v $(grep -c ^processor /proc/cpuinfo) -a"
> + },
> + {
> + "name": "dirty_log_perf_test_max_10_vcpu_hugetlb",
> + "setup": "echo 5120 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages",
> + "command": "./dirty_log_perf_test -v 10 -s anonymous_hugetlb_2mb",
> + "teardown": "echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages"
> + }
> + ]
> + },
> + {
> + "suite": "x86_sanity_tests",
> + "arch" : "x86_64",
> + "tests": [
> + {
> + "name": "vmx_msrs_test",
> + "command": "./x86_64/vmx_msrs_test"
> + },
> + {
> + "name": "private_mem_conversions_test",
> + "command": "./x86_64/private_mem_conversions_test"
> + },
> + {
> + "name": "apic_bus_clock_test",
> + "command": "./x86_64/apic_bus_clock_test"
> + },
> + {
> + "name": "dirty_log_page_splitting_test",
> + "command": "./x86_64/dirty_log_page_splitting_test -b 2G -s anonymous_hugetlb_2mb",
> + "setup": "echo 2560 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages",
> + "teardown": "echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages"
> + }
> + ]
> + },
> + {
> + "suite": "arm_sanity_test",
> + "arch" : "aarch64",
> + "tests": [
> + {
> + "name": "page_fault_test",
> + "command": "./aarch64/page_fault_test"
> + }
> + ]
> + }
> +]
> \ No newline at end of file
> --
> 2.46.0.184.g6999bdac58-goog
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-08-22 20:55 ` [RFC PATCH 0/1] KVM selftests runner for running more than just default Vipin Sharma
@ 2024-11-01 22:13 ` Vipin Sharma
2024-11-06 17:06 ` Sean Christopherson
0 siblings, 1 reply; 10+ messages in thread
From: Vipin Sharma @ 2024-11-01 22:13 UTC (permalink / raw)
To: kvm, kvmarm, kvm-riscv, linux-arm-kernel
Cc: Paolo Bonzini, Sean Christopherson, Anup Patel,
Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
Marc Zyngier, Oliver Upton, linux-kernel
On Thu, Aug 22, 2024 at 1:55 PM Vipin Sharma <vipinsh@google.com> wrote:
>
> Oops! Adding archs mailing list and maintainers which have arch folder
> in tool/testing/selftests/kvm
>
> On Wed, Aug 21, 2024 at 3:30 PM Vipin Sharma <vipinsh@google.com> wrote:
> >
> > This series is introducing a KVM selftests runner to make it easier to
> > run selftests with some interesting configurations and provide some
> > enhancement over existing kselftests runner.
> >
> > I would like to get an early feedback from the community and see if this
> > is something which can be useful for improving KVM selftests coverage
> > and worthwhile investing time in it. Some specific questions:
> >
> > 1. Should this be done?
> > 2. What features are must?
> > 3. Any other way to write test configuration compared to what is done here?
> >
> > Note, python code written for runner is not optimized but shows how this
> > runner can be useful.
> >
> > What are the goals?
> > - Run tests with more than just the default settings of KVM module
> > parameters and test itself.
> > - Capture issues which only show when certain combination of module
> > parameter and tests options are used.
> > - Provide minimum testing which can be standardised for KVM patches.
> > - Run tests parallely.
> > - Dump output in a hierarchical folder structure for easier tracking of
> > failures/success output
> > - Feel free to add yours :)
> >
> > Why not use/extend kselftests?
> > - Other submodules goal might not align and its gonna be difficult to
> > capture broader set of requirements.
> > - Instead of test configuration we will need separate shell scripts
> > which will act as tests for each test arg and module parameter
> > combination. This will easily pollute the KVM selftests directory.
> > - Easier to enhance features using Python packages than shell scripts.
> >
> > What this runner do?
> > - Reads a test configuration file (tests.json in patch 1).
> > Configuration in json are written in hierarchy where multiple suites
> > exist and each suite contains multiple tests.
> > - Provides a way to execute tests inside a suite parallelly.
> > - Provides a way to dump output to a folder in a hierarchical manner.
> > - Allows to run selected suites, or tests in a specific suite.
> > - Allows to do some setup and teardown for test suites and tests.
> > - Timeout can be provided to limit test execution duration.
> > - Allows to run test suites or tests on specific architecture only.
> >
> > Runner is written in python and goal is to only use standard library
> > constructs. This runner will work on Python 3.6 and up
> >
> > What does a test configuration file looks like?
> > Test configuration are written in json as it is easier to read and has
> > inbuilt package support in Python. Root level is a json array denoting
> > suites and each suite can multiple tests in it using json array.
> >
> > [
> > {
> > "suite": "dirty_log_perf_tests",
> > "timeout_s": 300,
> > "arch": "x86_64",
> > "setup": "echo Setting up suite",
> > "teardown": "echo tearing down suite",
> > "tests": [
> > {
> > "name": "dirty_log_perf_test_max_vcpu_no_manual_protect",
> > "command": "./dirty_log_perf_test -v $(grep -c ^processor /proc/cpuinfo) -g",
> > "arch": "x86_64",
> > "setup": "echo Setting up test",
> > "teardown": "echo tearing down test",
> > "timeout_s": 5
> > }
> > ]
> > }
> > ]
> >
> > Usage:
> > Runner "runner.py" and test configuration "tests.json" lives in
> > tool/testing/selftests/kvm directory.
> >
> > To run serially:
> > ./runner.py tests.json
> >
> > To run specific test suites:
> > ./runner.py tests.json dirty_log_perf_tests x86_sanity_tests
> >
> > To run specific test in a suite:
> > ./runner.py tests.json x86_sanity_tests/vmx_msrs_test
> >
> > To run everything parallely (runs tests inside a suite parallely):
> > ./runner.py -j 10 tests.json
> >
> > To dump output to disk:
> > ./runner.py -j 10 tests.json -o sample_run
> >
> > Sample output (after removing timestamp, process ID, and logging
> > level columns):
> >
> > ./runner.py tests.json -j 10 -o sample_run
> > PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_no_manual_protect
> > PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect
> > PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect_random_access
> > PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_10_vcpu_hugetlb
> > PASSED: x86_sanity_tests/vmx_msrs_test
> > SKIPPED: x86_sanity_tests/private_mem_conversions_test
> > FAILED: x86_sanity_tests/apic_bus_clock_test
> > PASSED: x86_sanity_tests/dirty_log_page_splitting_test
> > --------------------------------------------------------------------------
> > Test runner result:
> > 1) dirty_log_perf_tests:
> > 1) PASSED: dirty_log_perf_test_max_vcpu_no_manual_protect
> > 2) PASSED: dirty_log_perf_test_max_vcpu_manual_protect
> > 3) PASSED: dirty_log_perf_test_max_vcpu_manual_protect_random_access
> > 4) PASSED: dirty_log_perf_test_max_10_vcpu_hugetlb
> > 2) x86_sanity_tests:
> > 1) PASSED: vmx_msrs_test
> > 2) SKIPPED: private_mem_conversions_test
> > 3) FAILED: apic_bus_clock_test
> > 4) PASSED: dirty_log_page_splitting_test
> > --------------------------------------------------------------------------
> >
> > Directory structure created:
> >
> > sample_run/
> > |-- dirty_log_perf_tests
> > | |-- dirty_log_perf_test_max_10_vcpu_hugetlb
> > | | |-- command.stderr
> > | | |-- command.stdout
> > | | |-- setup.stderr
> > | | |-- setup.stdout
> > | | |-- teardown.stderr
> > | | `-- teardown.stdout
> > | |-- dirty_log_perf_test_max_vcpu_manual_protect
> > | | |-- command.stderr
> > | | `-- command.stdout
> > | |-- dirty_log_perf_test_max_vcpu_manual_protect_random_access
> > | | |-- command.stderr
> > | | `-- command.stdout
> > | `-- dirty_log_perf_test_max_vcpu_no_manual_protect
> > | |-- command.stderr
> > | `-- command.stdout
> > `-- x86_sanity_tests
> > |-- apic_bus_clock_test
> > | |-- command.stderr
> > | `-- command.stdout
> > |-- dirty_log_page_splitting_test
> > | |-- command.stderr
> > | |-- command.stdout
> > | |-- setup.stderr
> > | |-- setup.stdout
> > | |-- teardown.stderr
> > | `-- teardown.stdout
> > |-- private_mem_conversions_test
> > | |-- command.stderr
> > | `-- command.stdout
> > `-- vmx_msrs_test
> > |-- command.stderr
> > `-- command.stdout
> >
> >
> > Some other features for future:
> > - Provide "precheck" command option in json, which can filter/skip tests if
> > certain conditions are not met.
> > - Iteration option in the runner. This will allow the same test suites to
> > run again.
> >
> > Vipin Sharma (1):
> > KVM: selftestsi: Create KVM selftests runnner to run interesting tests
> >
> > tools/testing/selftests/kvm/runner.py | 282 +++++++++++++++++++++++++
> > tools/testing/selftests/kvm/tests.json | 60 ++++++
> > 2 files changed, 342 insertions(+)
> > create mode 100755 tools/testing/selftests/kvm/runner.py
> > create mode 100644 tools/testing/selftests/kvm/tests.json
> >
> >
> > base-commit: de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed
> > --
> > 2.46.0.184.g6999bdac58-goog
> >
I had an offline discussion with Sean; here is a summary of what we
discussed (Sean, correct me if something is not aligned with our
discussion):
We need to have a roadmap for the runner in terms of features we support.
Phase 1: Having a basic selftest runner is useful which can:
- Run tests in parallel
- Provide a summary of what passed and failed, or of failures only.
- Dump output which can be easily accessed and parsed.
- Allow to run with different command line parameters.
The current patch does more than this and can be simplified.
Phase 2: Environment setup via runner
The current patch allows writing "setup" commands at the test suite and test
level in the JSON config file to set up the environment needed by a
test to run. This might not be ideal, as some settings are exposed
differently on different platforms.
For example,
To enable TDP:
- Intel needs ept=Y
- AMD needs npt=Y
- ARM always on.
To enable APIC virtualization
- Intel needs enable_apicv=Y
- AMD needs avic=Y
To enable/disable nested, both have the same file name "nested"
in their module params directory, which would need to be changed.
These kinds of settings become more verbose and are unnecessary on other
platforms. Instead, the runner should have some programming constructs
(API, command line options, defaults) to enable these options in a
generic way. For example, enabling/disabling nested can be exposed as a
command line option --enable_nested; then, based on the platform, the runner
can update the corresponding module param or ignore it.
This will easily extend to providing a sane configuration on the
corresponding platforms without lots of hardcoding in JSON. These
individual constructs will provide a generic view/option to exercise a KVM
feature, and under the hood will do things differently based on the
platform it is running on, like arm, x86-intel, x86-amd, s390, etc.
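
As a rough illustration (not part of the RFC patch; the generic feature names
and the resolve_param() helper below are hypothetical, and changing these
params in practice usually means reloading the module), the mapping could look
roughly like this inside the runner:

import os

# Sketch: map a generic runner option to the vendor-specific KVM module param.
# The sysfs paths are the standard module parameter locations; whichever
# vendor module is present on the host wins.
GENERIC_PARAMS = {
    "tdp":    {"kvm_intel": "ept",          "kvm_amd": "npt"},
    "apicv":  {"kvm_intel": "enable_apicv", "kvm_amd": "avic"},
    "nested": {"kvm_intel": "nested",       "kvm_amd": "nested"},
}

def resolve_param(feature):
    """Return (module, param) for the vendor on this host, or None (e.g. ARM)."""
    for module, param in GENERIC_PARAMS[feature].items():
        path = f"/sys/module/{module}/parameters/{param}"
        if os.path.exists(path):
            return module, param
    return None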
Phase 3: Provide a collection of interesting configurations
Specific individual constructs can be combined in a meaningful way to
provide interesting configurations to run on a platform. For example,
the user doesn't need to specify each individual configuration; instead,
some prebuilt configurations can be exposed, like
--stress_test_shadow_mmu and --test_basic_nested.
Tests need to handle the environment in which they are running
gracefully, which many tests already do, but not exhaustively. If some
setting is not provided or set up properly for their execution, then
they should fail/skip accordingly.
The runner will not be responsible for prechecking things on a test's behalf.
Next steps:
1. Consensus on above phases and features.
2. Start development.
Thanks,
Vipin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-11-01 22:13 ` Vipin Sharma
@ 2024-11-06 17:06 ` Sean Christopherson
2024-11-08 11:05 ` Andrew Jones
2024-11-15 20:18 ` Vipin Sharma
0 siblings, 2 replies; 10+ messages in thread
From: Sean Christopherson @ 2024-11-06 17:06 UTC (permalink / raw)
To: Vipin Sharma
Cc: kvm, kvmarm, kvm-riscv, linux-arm-kernel, Paolo Bonzini,
Anup Patel, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, Marc Zyngier, Oliver Upton, linux-kernel
On Fri, Nov 01, 2024, Vipin Sharma wrote:
> Had an offline discussion with Sean, providing a summary on what we
> discussed (Sean, correct me if something is not aligned from our
> discussion):
>
> We need to have a roadmap for the runner in terms of features we support.
>
> Phase 1: Having a basic selftest runner is useful which can:
>
> - Run tests parallely
Maybe with a (very conservative) per-test timeout? Selftests generally don't have
the same problems as KVM-Unit-Tests (KUT), as selftests are a little better at
guarding against waiting indefinitely, i.e. I don't think we need a configurable
timeout. But a 120 second timeout or so would be helpful.
E.g. I recently was testing a patch (of mine) that had a "minor" bug where it
caused KVM to do a remote TLB flush on *every* SPTE update in the shadow MMU,
which manifested as hilariously long runtimes for max_guest_memory_test. I was
_this_ close to not catching the bug (which would have been quite embarrassing),
because my hack-a-scripts don't use timeouts (I only noticed because a completely
unrelated bug was causing failures).
> - Provide a summary of what passed and failed, or only in case of failure.
I think a summary is always warranted. And for failures, it would be helpful to
spit out _what_ test failed, versus the annoying KUT runner's behavior of stating
only the number of passes/failures, which forces the user to go spelunking just
to find out what (sub)test failed.
I also think the runner should have a "heartbeat" mechanism, i.e. something that
communicates to the user that forward progress is being made. And IMO, that
mechanism should also spit out skips and failures (this could be optional though).
One of the flaws with the KUT runner is that it's either super noisy or super
quiet.
E.g. my mess of bash outputs this when running selftests in parallel (trimmed for
brevity):
Running selftests with npt_disabled
Waiting for 'access_tracking_perf_test', PID '92317'
Waiting for 'amx_test', PID '92318'
SKIPPED amx_test
Waiting for 'apic_bus_clock_test', PID '92319'
Waiting for 'coalesced_io_test', PID '92321'
Waiting for 'cpuid_test', PID '92324'
...
Waiting for 'hyperv_svm_test', PID '92552'
SKIPPED hyperv_svm_test
Waiting for 'hyperv_tlb_flush', PID '92563'
FAILED hyperv_tlb_flush : ret ='254'
Random seed: 0x6b8b4567
==== Test Assertion Failure ====
x86_64/hyperv_tlb_flush.c:117: val == expected
pid=92731 tid=93548 errno=4 - Interrupted system call
1 0x0000000000411566: assert_on_unhandled_exception at processor.c:627
2 0x000000000040889a: _vcpu_run at kvm_util.c:1649
3 (inlined by) vcpu_run at kvm_util.c:1660
4 0x00000000004041a1: vcpu_thread at hyperv_tlb_flush.c:548
5 0x000000000043a305: start_thread at pthread_create.o:?
6 0x000000000045f857: __clone3 at ??:?
val == expected
Waiting for 'kvm_binary_stats_test', PID '92579'
...
SKIPPED vmx_preemption_timer_test
Waiting for 'vmx_set_nested_state_test', PID '93316'
SKIPPED vmx_set_nested_state_test
Waiting for 'vmx_tsc_adjust_test', PID '93329'
SKIPPED vmx_tsc_adjust_test
Waiting for 'xapic_ipi_test', PID '93350'
Waiting for 'xapic_state_test', PID '93360'
Waiting for 'xcr0_cpuid_test', PID '93374'
Waiting for 'xen_shinfo_test', PID '93391'
Waiting for 'xen_vmcall_test', PID '93405'
Waiting for 'xss_msr_test', PID '93420'
It's far from perfect, e.g. just waits in alphabetical order, but it gives me
easy-to-read feedback and a signal that tests are indeed running and completing.
> - Dump output which can be easily accessed and parsed.
And persist the output/logs somewhere, e.g. so that the user can triage failures
after the fact.
> - Allow to run with different command line parameters.
Command line parameters for tests? If so, I would put this in phase 3. I.e. make
the goal of Phase 1 purely about running tests in parallel.
> Current patch does more than this and can be simplified.
>
> Phase 2: Environment setup via runner
>
> Current patch, allows to write "setup" commands at test suite and test
> level in the json config file to setup the environment needed by a
> test to run. This might not be ideal as some settings are exposed
> differently on different platforms.
>
> For example,
> To enable TDP:
> - Intel needs ept=Y
> - AMD needs npt=Y
> - ARM always on.
>
> To enable APIC virtualization
> - Intel needs enable_apicv=Y
> - AMD needs avic=Y
>
> To enable/disable nested, they both have the same file name "nested"
> in their module params directory which should be changed.
>
> These kinds of settings become more verbose and unnecessary on other
> platforms. Instead, runners should have some programming constructs
> (API, command line options, default) to enable these options in a
> generic way. For example, enable/disable nested can be exposed as a
> command line --enable_nested, then based on the platform, runner can
> update corresponding module param or ignore.
>
> This will easily extend to providing sane configuration on the
> corresponding platforms without lots of hardcoding in JSON. These
> individual constructs will provide a generic view/option to run a KVM
> feature, and under the hood will do things differently based on the
> platform it is running on like arm, x86-intel, x86-amd, s390, etc.
My main input on this front is that the runner needs to configure module params
(and other environment settings) _on behalf of the user_, i.e. in response to a
command line option (to the runner), not in response to per-test configurations.
One of my complaints with our internal infrastructure is that the testcases
themselves can dictate environment settings. There are certainly benefits to
that approach, but it really only makes sense at scale where there are many
machines available, i.e. where the runner can achieve parallelism by running
tests on multiple machines, and where the complexity of managing the environment
on a per-test basis is worth the payout.
For the upstream runner, I want to cater to developers, i.e. to people that are
running tests on one or two machines. And I want the runner to rip through tests
as fast as possible, i.e. I don't want tests to get serialized because each one
insists on being a special snowflake and doesn't play nice with other children.
Organizations that have a fleet of systems can pony up the resources to develop
their own support (on top?).
Selftests can and do check for module params, and should and do use TEST_REQUIRE()
to skip when a module param isn't set as needed. Extending that to arbitrary
sysfs knobs should be trivial. I.e. if we get _failures_ because of an incompatible
environment, then it's a test bug.
> Phase 3: Provide collection of interesting configurations
>
> Specific individual constructs can be combined in a meaningful way to
> provide interesting configurations to run on a platform. For example,
> user doesn't need to specify each individual configuration instead,
> some prebuilt configurations can be exposed like
> --stress_test_shadow_mmu, --test_basic_nested
IMO, this shouldn't be baked into the runner, i.e. should not surface as dedicated
command line options. Users shouldn't need to modify the runner just to bring
their own configuration. I also think configurations should be discoverable,
e.g. not hardcoded like KUT's unittest.cfg. A very real problem with KUT's
approach is that testing different combinations is frustratingly difficult,
because running a testcase with different configuration requires modifying a file
that is tracked by git.
There are underlying issues with KUT that essentially necessitate that approach,
e.g. x86 has several testcases that fail if run without the exact right config.
But that's just another reason to NOT follow KUT's pattern, e.g. to force us to
write robust tests.
E.g. instead of per-config command line options, let the user specify a file,
and/or a directory (using a well known filename pattern to detect configs).
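
Something like this (the filename pattern and function name are completely made
up, just to illustrate the discovery idea):

from pathlib import Path

def discover_configs(path):
    """Accept an explicit config file, or scan a directory for configs."""
    path = Path(path)
    if path.is_file():
        return [path]
    # Hypothetical well-known pattern; anything matching is picked up.
    return sorted(path.glob("*.tests.json"))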
> Tests need to handle the environment in which they are running
> gracefully, which many tests already do but not exhaustively. If some
> setting is not provided or set up properly for their execution then
> they should fail/skip accordingly.
This belongs in phase 2.
> Runner will not be responsible to precheck things on tests behalf.
>
>
> Next steps:
> 1. Consensus on above phases and features.
> 2. Start development.
>
> Thanks,
> Vipin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-11-06 17:06 ` Sean Christopherson
@ 2024-11-08 11:05 ` Andrew Jones
2024-11-14 17:42 ` Sean Christopherson
2024-11-15 20:18 ` Vipin Sharma
1 sibling, 1 reply; 10+ messages in thread
From: Andrew Jones @ 2024-11-08 11:05 UTC (permalink / raw)
To: Sean Christopherson
Cc: Vipin Sharma, kvm, kvmarm, kvm-riscv, linux-arm-kernel,
Paolo Bonzini, Anup Patel, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, Marc Zyngier, Oliver Upton, linux-kernel
On Wed, Nov 06, 2024 at 09:06:39AM -0800, Sean Christopherson wrote:
> On Fri, Nov 01, 2024, Vipin Sharma wrote:
> > Had an offline discussion with Sean, providing a summary on what we
> > discussed (Sean, correct me if something is not aligned from our
> > discussion):
> >
> > We need to have a roadmap for the runner in terms of features we support.
> >
> > Phase 1: Having a basic selftest runner is useful which can:
> >
> > - Run tests parallely
>
> Maybe with a (very conversative) per test timeout? Selftests generally don't have
> the same problems as KVM-Unit-Tests (KUT), as selftests are a little better at
> guarding against waiting indefinitely, i.e. I don't think we need a configurable
> timeout. But a 120 second timeout or so would be helpful.
>
> E.g. I recently was testing a patch (of mine) that had a "minor" bug where it
> caused KVM to do a remote TLB flush on *every* SPTE update in the shadow MMU,
> which manifested as hilariously long runtimes for max_guest_memory_test. I was
> _this_ close to not catching the bug (which would have been quite embarrasing),
> because my hack-a-scripts don't use timeouts (I only noticed because a completely
> unrelated bug was causing failures).
>
> > - Provide a summary of what passed and failed, or only in case of failure.
>
> I think a summary is always warranted. And for failures, it would be helpful to
> spit out _what_ test failed, versus the annoying KUT runner's behavior of stating
> only the number of passes/failures, which forces the user to go spelunking just
> to find out what (sub)test failed.
>
> I also think the runner should have a "heartbeat" mechanism, i.e. something that
> communicates to the user that forward progress is being made. And IMO, that
> mechanism should also spit out skips and failures (this could be optional though).
> One of the flaws with the KUT runner is that it's either super noisy and super
> quiet.
>
> E.g. my mess of bash outputs this when running selftests in parallel (trimmed for
> brevity):
>
> Running selftests with npt_disabled
> Waiting for 'access_tracking_perf_test', PID '92317'
> Waiting for 'amx_test', PID '92318'
> SKIPPED amx_test
> Waiting for 'apic_bus_clock_test', PID '92319'
> Waiting for 'coalesced_io_test', PID '92321'
> Waiting for 'cpuid_test', PID '92324'
>
> ...
>
> Waiting for 'hyperv_svm_test', PID '92552'
> SKIPPED hyperv_svm_test
> Waiting for 'hyperv_tlb_flush', PID '92563'
> FAILED hyperv_tlb_flush : ret ='254'
> Random seed: 0x6b8b4567
> ==== Test Assertion Failure ====
> x86_64/hyperv_tlb_flush.c:117: val == expected
> pid=92731 tid=93548 errno=4 - Interrupted system call
> 1 0x0000000000411566: assert_on_unhandled_exception at processor.c:627
> 2 0x000000000040889a: _vcpu_run at kvm_util.c:1649
> 3 (inlined by) vcpu_run at kvm_util.c:1660
> 4 0x00000000004041a1: vcpu_thread at hyperv_tlb_flush.c:548
> 5 0x000000000043a305: start_thread at pthread_create.o:?
> 6 0x000000000045f857: __clone3 at ??:?
> val == expected
> Waiting for 'kvm_binary_stats_test', PID '92579'
>
> ...
>
> SKIPPED vmx_preemption_timer_test
> Waiting for 'vmx_set_nested_state_test', PID '93316'
> SKIPPED vmx_set_nested_state_test
> Waiting for 'vmx_tsc_adjust_test', PID '93329'
> SKIPPED vmx_tsc_adjust_test
> Waiting for 'xapic_ipi_test', PID '93350'
> Waiting for 'xapic_state_test', PID '93360'
> Waiting for 'xcr0_cpuid_test', PID '93374'
> Waiting for 'xen_shinfo_test', PID '93391'
> Waiting for 'xen_vmcall_test', PID '93405'
> Waiting for 'xss_msr_test', PID '93420'
>
> It's far from perfect, e.g. just waits in alphabetical order, but it gives me
> easy to read feedback, and signal that tests are indeed running and completing.
>
> > - Dump output which can be easily accessed and parsed.
>
> And persist the output/logs somewhere, e.g. so that the user can triage failures
> after the fact.
>
> > - Allow to run with different command line parameters.
>
> Command line parameters for tests? If so, I would put this in phase 3. I.e. make
> the goal of Phase 1 purely about running tests in parallel.
>
> > Current patch does more than this and can be simplified.
> >
> > Phase 2: Environment setup via runner
> >
> > Current patch, allows to write "setup" commands at test suite and test
> > level in the json config file to setup the environment needed by a
> > test to run. This might not be ideal as some settings are exposed
> > differently on different platforms.
> >
> > For example,
> > To enable TDP:
> > - Intel needs ept=Y
> > - AMD needs npt=Y
> > - ARM always on.
> >
> > To enable APIC virtualization
> > - Intel needs enable_apicv=Y
> > - AMD needs avic=Y
> >
> > To enable/disable nested, they both have the same file name "nested"
> > in their module params directory which should be changed.
> >
> > These kinds of settings become more verbose and unnecessary on other
> > platforms. Instead, runners should have some programming constructs
> > (API, command line options, default) to enable these options in a
> > generic way. For example, enable/disable nested can be exposed as a
> > command line --enable_nested, then based on the platform, runner can
> > update corresponding module param or ignore.
> >
> > This will easily extend to providing sane configuration on the
> > corresponding platforms without lots of hardcoding in JSON. These
> > individual constructs will provide a generic view/option to run a KVM
> > feature, and under the hood will do things differently based on the
> > platform it is running on like arm, x86-intel, x86-amd, s390, etc.
>
> My main input on this front is that the runner needs to configure module params
> (and other environment settings) _on behalf of the user_, i.e. in response to a
> command line option (to the runner), not in response to per-test configurations.
>
> One of my complaints with our internal infrastructure is that the testcases
> themselves can dictate environment settings. There are certainly benefits to
> that approach, but it really only makes sense at scale where there are many
> machines available, i.e. where the runner can achieve parallelism by running
> tests on multiple machines, and where the complexity of managing the environment
> on a per-test basis is worth the payout.
>
> For the upstream runner, I want to cater to developers, i.e. to people that are
> running tests on one or two machines. And I want the runner to rip through tests
> as fast as possible, i.e. I don't want tests to get serialized because each one
> insists on being a special snowflake and doesn't play nice with other children.
> Organizations that the have a fleet of systems can pony up the resources to develop
> their own support (on top?).
>
> Selftests can and do check for module params, and should and do use TEST_REQUIRE()
> to skip when a module param isn't set as needed. Extending that to arbitrary
> sysfs knobs should be trivial. I.e. if we get _failures_ because of an incompatible
> environment, then it's a test bug.
>
> > Phase 3: Provide collection of interesting configurations
> >
> > Specific individual constructs can be combined in a meaningful way to
> > provide interesting configurations to run on a platform. For example,
> > user doesn't need to specify each individual configuration instead,
> > some prebuilt configurations can be exposed like
> > --stress_test_shadow_mmu, --test_basic_nested
>
> IMO, this shouldn't be baked into the runner, i.e. should not surface as dedicated
> command line options. Users shouldn't need to modify the runner just to bring
> their own configuration. I also think configurations should be discoverable,
> e.g. not hardcoded like KUT's unittest.cfg. A very real problem with KUT's
> approach is that testing different combinations is frustratingly difficult,
> because running a testcase with different configuration requires modifying a file
> that is tracked by git.
We have support in KUT for environment variables (which are stored in an
initrd). The feature hasn't been used too much, but x86 applies it to
configuration parameters needed to execute tests from grub; arm uses it
for an errata framework allowing tests to run on kernels which may not
include fixes for host-crashing bugs; and riscv is using it quite a bit
for providing test parameters and expected test results in order to allow
SBI tests to be run on a variety of SBI implementations. The environment
variables are provided in a text file which is not tracked by git. KVM
selftests can obviously also use environment variables by simply sourcing
them first in wrapper scripts for the tests.
>
> There are underlying issues with KUT that essentially necessitate that approach,
> e.g. x86 has several testcases that fail if run without the exact right config.
> But that's just another reason to NOT follow KUT's pattern, e.g. to force us to
> write robust tests.
>
> E.g. instead of per-config command line options, let the user specify a file,
> and/or a directory (using a well known filename pattern to detect configs).
Could also use an environment variable to specify a file which contains
a config in a test-specific format if parsing environment variables is
insufficient or awkward for configuring a test.
Thanks,
drew
>
> > Tests need to handle the environment in which they are running
> > gracefully, which many tests already do but not exhaustively. If some
> > setting is not provided or set up properly for their execution then
> > they should fail/skip accordingly.
>
> This belongs in phase 2.
>
> > Runner will not be responsible to precheck things on tests behalf.
> >
> >
> > Next steps:
> > 1. Consensus on above phases and features.
> > 2. Start development.
> >
> > Thanks,
> > Vipin
>
> --
> kvm-riscv mailing list
> kvm-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kvm-riscv
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-11-08 11:05 ` Andrew Jones
@ 2024-11-14 17:42 ` Sean Christopherson
2024-11-15 21:15 ` Vipin Sharma
0 siblings, 1 reply; 10+ messages in thread
From: Sean Christopherson @ 2024-11-14 17:42 UTC (permalink / raw)
To: Andrew Jones
Cc: Vipin Sharma, kvm, kvmarm, kvm-riscv, linux-arm-kernel,
Paolo Bonzini, Anup Patel, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, Marc Zyngier, Oliver Upton, linux-kernel
On Fri, Nov 08, 2024, Andrew Jones wrote:
> On Wed, Nov 06, 2024 at 09:06:39AM -0800, Sean Christopherson wrote:
> > On Fri, Nov 01, 2024, Vipin Sharma wrote:
> > > Phase 3: Provide collection of interesting configurations
> > >
> > > Specific individual constructs can be combined in a meaningful way to
> > > provide interesting configurations to run on a platform. For example,
> > > user doesn't need to specify each individual configuration instead,
> > > some prebuilt configurations can be exposed like
> > > --stress_test_shadow_mmu, --test_basic_nested
> >
> > IMO, this shouldn't be baked into the runner, i.e. should not surface as dedicated
> > command line options. Users shouldn't need to modify the runner just to bring
> > their own configuration. I also think configurations should be discoverable,
> > e.g. not hardcoded like KUT's unittest.cfg. A very real problem with KUT's
> > approach is that testing different combinations is frustratingly difficult,
> > because running a testcase with different configuration requires modifying a file
> > that is tracked by git.
>
> We have support in KUT for environment variables (which are stored in an
> initrd). The feature hasn't been used too much, but x86 applies it to
> configuration parameters needed to execute tests from grub, arm uses it
> for an errata framework allowing tests to run on kernels which may not
> include fixes to host-crashing bugs, and riscv is using them quite a bit
> for providing test parameters and test expected results in order to allow
> SBI tests to be run on a variety of SBI implementations. The environment
> variables are provided in a text file which is not tracked by git. kvm
> selftests can obviously also use environment variables by simply sourcing
> them first in wrapper scripts for the tests.
Oh hell no! :-)
For reproducibility, transparency, and determinism, environment variables are pure
evil. I don't want to discover that I wasn't actually testing what I thought I
was testing because I forgot to set/purge an environment variable. Ditto for
trying to reproduce a failure reported by someone.
KUT's usage to adjust to the *system* environment is somewhat understandable.
But for KVM selftests, there should be absolutely zero reason to need to fall
back to environment variables. Unlike KUT, which can run in a fairly large variety
of environments, e.g. bare metal vs. virtual, different VMMs, different firmware,
etc., KVM selftests effectively support exactly one environment.
And unlike KUT, KVM selftests are tightly coupled to the kernel. Yes, it's very
possible to run selftests against different kernels, but I don't think we should
go out of our way to support such usage. And if an environment needs to skip a
test, it should be super easy to do so if we decouple the test configuration
inputs from the test runner.
> > There are underlying issues with KUT that essentially necessitate that approach,
> > e.g. x86 has several testcases that fail if run without the exact right config.
> > But that's just another reason to NOT follow KUT's pattern, e.g. to force us to
> > write robust tests.
> >
> > E.g. instead of per-config command line options, let the user specify a file,
> > and/or a directory (using a well known filename pattern to detect configs).
>
> Could also use an environment variable to specify a file which contains
> a config in a test-specific format if parsing environment variables is
> insufficient or awkward for configuring a test.
There's no reason to use an environment variable for this. If we want to support
"advanced" setup via a test configuration, then that can simply go in the
configuration file that's passed to the runner.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-11-06 17:06 ` Sean Christopherson
2024-11-08 11:05 ` Andrew Jones
@ 2024-11-15 20:18 ` Vipin Sharma
1 sibling, 0 replies; 10+ messages in thread
From: Vipin Sharma @ 2024-11-15 20:18 UTC (permalink / raw)
To: Sean Christopherson
Cc: kvm, kvmarm, kvm-riscv, linux-arm-kernel, Paolo Bonzini,
Anup Patel, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, Marc Zyngier, Oliver Upton, linux-kernel
On 2024-11-06 09:06:39, Sean Christopherson wrote:
> On Fri, Nov 01, 2024, Vipin Sharma wrote:
> >
> > We need to have a roadmap for the runner in terms of features we support.
> >
> > Phase 1: Having a basic selftest runner is useful which can:
> >
> > - Run tests parallely
>
> Maybe with a (very conversative) per test timeout? Selftests generally don't have
> the same problems as KVM-Unit-Tests (KUT), as selftests are a little better at
> guarding against waiting indefinitely, i.e. I don't think we need a configurable
> timeout. But a 120 second timeout or so would be helpful.
>
> E.g. I recently was testing a patch (of mine) that had a "minor" bug where it
> caused KVM to do a remote TLB flush on *every* SPTE update in the shadow MMU,
> which manifested as hilariously long runtimes for max_guest_memory_test. I was
> _this_ close to not catching the bug (which would have been quite embarrasing),
> because my hack-a-scripts don't use timeouts (I only noticed because a completely
> unrelated bug was causing failures).
The RFC code has the feature to specify a timeout for each test. I will change
it so that a default value of 120 seconds is applied to all tests and remove the
need to write a timeout for each individual test in JSON. It will be simple to
add a command line option to override this value.
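
Roughly something like this (the flag name is tentative):

import argparse

parser = argparse.ArgumentParser()
# Default applies to every test unless the user overrides it.
parser.add_argument("--timeout", default=120, type=int,
                    help="Per-test timeout in seconds (default: 120)")
args = parser.parse_args()

# A test-level "timeout_s" in the JSON could still take precedence:
# test_timeout = test_json.get("timeout_s", args.timeout)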
>
> > - Provide a summary of what passed and failed, or only in case of failure.
>
> I think a summary is always warranted. And for failures, it would be helpful to
> spit out _what_ test failed, versus the annoying KUT runner's behavior of stating
> only the number of passes/failures, which forces the user to go spelunking just
> to find out what (sub)test failed.
The default will be to print all test statuses, one line per test. This is
what the RFC code does.
For the summary, I can add one line at the end which shows the count of passed,
skipped, and failed tests.
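
Something along these lines (a sketch on top of the result dict the RFC runner
already builds, i.e. {suite: (suite_status, {test: status})}):

import logging
from collections import Counter

def print_summary(result):
    counts = Counter(s for _, tests in result.values() for s in tests.values())
    logging.info("Total: %d passed, %d skipped, %d failed",
                 counts["PASSED"], counts["SKIPPED"], counts["FAILED"])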
Having an option to only print failed/skipped test will provide easy
way to identify them. When we print each test status then user has to
scroll up the terminal read each status to identify what failed. But with an
option to override default behavior and print only skipped, failed, or
both. This doesn't have to be in phase 1, but I think its useful and we
should add in future version.
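As a rough sketch (illustrative names, not the RFC code), the reporting side
could be as simple as:
    import collections
    def print_results(results, only=None):
        # results: list of (name, status) tuples collected by the runner,
        # e.g. ("x86_sanity_tests/amx_test", "PASSED").
        counts = collections.Counter(status for _, status in results)
        for name, status in results:
            if only is None or status in only:
                print("%s: %s" % (status, name))
        print("Summary: %d passed, %d skipped, %d failed" %
              (counts["PASSED"], counts["SKIPPED"], counts["FAILED"]))
The default would be print_results(results); a --failures-only style option
would translate to only={"FAILED", "SKIPPED"}.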
>
> I also think the runner should have a "heartbeat" mechanism, i.e. something that
> communicates to the user that forward progress is being made. And IMO, that
> mechanism should also spit out skips and failures (this could be optional though).
> One of the flaws with the KUT runner is that it's either super noisy or super
> quiet.
A true heartbeat feature would require the runner to send and receive messages
from individual selftests. That is going to be difficult and not worth the
complexity.
To show forward progress, each test's pass/fail status can be printed as soon
as it finishes (the RFC runner does this). Since we are planning to have a
default timeout, there will always be forward progress.
RFC runner has output like this:
2024-11-15 09:53:52,124 | 638866 | INFO | SKIPPED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_no_manual_protect
2024-11-15 09:53:52,124 | 638866 | INFO | SKIPPED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect
2024-11-15 09:53:52,124 | 638866 | INFO | SKIPPED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect_random_access
2024-11-15 09:53:52,124 | 638866 | INFO | SETUP_FAILED: dirty_log_perf_tests/dirty_log_perf_test_max_10_vcpu_hugetlb
2024-11-15 09:53:52,171 | 638866 | INFO | SKIPPED: x86_sanity_tests/vmx_msrs_test
2024-11-15 09:53:52,171 | 638866 | INFO | SKIPPED: x86_sanity_tests/private_mem_conversions_test
2024-11-15 09:53:52,171 | 638866 | INFO | SKIPPED: x86_sanity_tests/apic_bus_clock_test
2024-11-15 09:53:52,171 | 638866 | INFO | SETUP_FAILED: x86_sanity_tests/dirty_log_page_splitting_test
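For the parallel case, the same immediate reporting falls out naturally from
concurrent.futures (a minimal sketch reusing run_test() from the earlier
snippet; illustrative only, not the RFC code):
    from concurrent.futures import ThreadPoolExecutor, as_completed
    def run_all(tests, max_workers=8):
        # tests: list of (name, command, timeout_s) tuples.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(run_test, cmd, t): name
                       for name, cmd, t in tests}
            for future in as_completed(futures):
                # Each result is printed as soon as that test finishes, which
                # doubles as the "forward progress" signal.
                status, _ = future.result()
                print("%s: %s" % (status, futures[future]))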
>
> E.g. my mess of bash outputs this when running selftests in parallel (trimmed for
> brevity):
>
> Running selftests with npt_disabled
> Waiting for 'access_tracking_perf_test', PID '92317'
> Waiting for 'amx_test', PID '92318'
> SKIPPED amx_test
> Waiting for 'apic_bus_clock_test', PID '92319'
> Waiting for 'coalesced_io_test', PID '92321'
> Waiting for 'cpuid_test', PID '92324'
>
> ...
>
> Waiting for 'hyperv_svm_test', PID '92552'
> SKIPPED hyperv_svm_test
> Waiting for 'hyperv_tlb_flush', PID '92563'
> FAILED hyperv_tlb_flush : ret ='254'
> Random seed: 0x6b8b4567
> ==== Test Assertion Failure ====
> x86_64/hyperv_tlb_flush.c:117: val == expected
> pid=92731 tid=93548 errno=4 - Interrupted system call
> 1 0x0000000000411566: assert_on_unhandled_exception at processor.c:627
> 2 0x000000000040889a: _vcpu_run at kvm_util.c:1649
> 3 (inlined by) vcpu_run at kvm_util.c:1660
> 4 0x00000000004041a1: vcpu_thread at hyperv_tlb_flush.c:548
> 5 0x000000000043a305: start_thread at pthread_create.o:?
> 6 0x000000000045f857: __clone3 at ??:?
> val == expected
This is interesting, i.e. printing the error message inline for failed tests.
I can add this feature.
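A small sketch of what that could look like on the runner side (illustrative
names; assumes the test's stdout/stderr was captured as in the earlier snippet):
    def report(name, status, output):
        print("%s: %s" % (status, name))
        if status == "FAILED":
            # Echo the captured output inline, indented, so the assertion
            # message and backtrace are visible without digging through logs.
            for line in output.splitlines():
                print("    " + line)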
> Waiting for 'kvm_binary_stats_test', PID '92579'
>
> ...
>
> SKIPPED vmx_preemption_timer_test
> Waiting for 'vmx_set_nested_state_test', PID '93316'
> SKIPPED vmx_set_nested_state_test
> Waiting for 'vmx_tsc_adjust_test', PID '93329'
> SKIPPED vmx_tsc_adjust_test
> Waiting for 'xapic_ipi_test', PID '93350'
> Waiting for 'xapic_state_test', PID '93360'
> Waiting for 'xcr0_cpuid_test', PID '93374'
> Waiting for 'xen_shinfo_test', PID '93391'
> Waiting for 'xen_vmcall_test', PID '93405'
> Waiting for 'xss_msr_test', PID '93420'
>
> It's far from perfect, e.g. just waits in alphabetical order, but it gives me
> easy to read feedback, and signal that tests are indeed running and completing.
>
> > - Dump output which can be easily accessed and parsed.
>
> And persist the output/logs somewhere, e.g. so that the user can triage failures
> after the fact.
>
The RFC runner does this, but not by default. It provides an option to pass a
path where output is dumped in a hierarchical folder structure.
Do you think this should be enabled by default? I prefer not, because then it
becomes a chore to clean up the directories later.
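When the option is given, the dump itself is straightforward (a sketch assuming
one directory per suite/test; names are illustrative, not the RFC code):
    import os
    def dump_output(base_dir, suite, test, output):
        # e.g. <base_dir>/dirty_log_perf_tests/<test_name>/stdout.log
        test_dir = os.path.join(base_dir, suite, test)
        os.makedirs(test_dir, exist_ok=True)
        with open(os.path.join(test_dir, "stdout.log"), "w") as f:
            f.write(output)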
> > - Allow to run with different command line parameters.
>
> Command line parameters for tests? If so, I would put this in phase 3. I.e. make
> the goal of Phase 1 purely about running tests in parallel.
Okay. Only default execution in phase 1.
>
> > Current patch does more than this and can be simplified.
> >
> > Phase 2: Environment setup via runner
> >
> > Current patch, allows to write "setup" commands at test suite and test
> > level in the json config file to setup the environment needed by a
> > test to run. This might not be ideal as some settings are exposed
> > differently on different platforms.
> >
> > For example,
> > To enable TDP:
> > - Intel needs ept=Y
> > - AMD needs npt=Y
> > - ARM always on.
> >
> > To enable APIC virtualization
> > - Intel needs enable_apicv=Y
> > - AMD needs avic=Y
> >
> > To enable/disable nested, they both have the same file name "nested"
> > in their module params directory which should be changed.
> >
> > These kinds of settings become more verbose and unnecessary on other
> > platforms. Instead, runners should have some programming constructs
> > (API, command line options, default) to enable these options in a
> > generic way. For example, enable/disable nested can be exposed as a
> > command line --enable_nested, then based on the platform, runner can
> > update corresponding module param or ignore.
> >
> > This will easily extend to providing sane configuration on the
> > corresponding platforms without lots of hardcoding in JSON. These
> > individual constructs will provide a generic view/option to run a KVM
> > feature, and under the hood will do things differently based on the
> > platform it is running on like arm, x86-intel, x86-amd, s390, etc.
>
> My main input on this front is that the runner needs to configure module params
> (and other environment settings) _on behalf of the user_, i.e. in response to a
> command line option (to the runner), not in response to per-test configurations.
>
> One of my complaints with our internal infrastructure is that the testcases
> themselves can dictate environment settings. There are certainly benefits to
> that approach, but it really only makes sense at scale where there are many
> machines available, i.e. where the runner can achieve parallelism by running
> tests on multiple machines, and where the complexity of managing the environment
> on a per-test basis is worth the payout.
>
> For the upstream runner, I want to cater to developers, i.e. to people that are
> running tests on one or two machines. And I want the runner to rip through tests
> as fast as possible, i.e. I don't want tests to get serialized because each one
> insists on being a special snowflake and doesn't play nice with other children.
> Organizations that have a fleet of systems can pony up the resources to develop
> their own support (on top?).
>
> Selftests can and do check for module params, and should and do use TEST_REQUIRE()
> to skip when a module param isn't set as needed. Extending that to arbitrary
> sysfs knobs should be trivial. I.e. if we get _failures_ because of an incompatible
> environment, then it's a test bug.
I agree, the RFC runner's approach is not great in this regard; it is very
verbose and hinders parallel execution.
Just for the record, not all options can be generic; there may be some
arch-specific command line options. We should first strive for generic options,
or name them in a way that they can be applied to other architectures when (if
ever) they add support. I don't have a good playbook for this; I think it will
be handled case by case.
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-11-14 17:42 ` Sean Christopherson
@ 2024-11-15 21:15 ` Vipin Sharma
2024-11-21 0:29 ` Sean Christopherson
0 siblings, 1 reply; 10+ messages in thread
From: Vipin Sharma @ 2024-11-15 21:15 UTC (permalink / raw)
To: Sean Christopherson
Cc: Andrew Jones, kvm, kvmarm, kvm-riscv, linux-arm-kernel,
Paolo Bonzini, Anup Patel, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, Marc Zyngier, Oliver Upton, linux-kernel
On 2024-11-14 09:42:32, Sean Christopherson wrote:
> On Fri, Nov 08, 2024, Andrew Jones wrote:
> > On Wed, Nov 06, 2024 at 09:06:39AM -0800, Sean Christopherson wrote:
> > > On Fri, Nov 01, 2024, Vipin Sharma wrote:
> > > > Phase 3: Provide collection of interesting configurations
> > > >
> > > > Specific individual constructs can be combined in a meaningful way to
> > > > provide interesting configurations to run on a platform. For example,
> > > > user doesn't need to specify each individual configuration instead,
> > > > some prebuilt configurations can be exposed like
> > > > --stress_test_shadow_mmu, --test_basic_nested
> > >
> > > IMO, this shouldn't be baked into the runner, i.e. should not surface as dedicated
> > > command line options. Users shouldn't need to modify the runner just to bring
> > > their own configuration. I also think configurations should be discoverable,
> > > e.g. not hardcoded like KUT's unittest.cfg. A very real problem with KUT's
> > > approach is that testing different combinations is frustratingly difficult,
> > > because running a testcase with different configuration requires modifying a file
> > > that is tracked by git.
I was thinking of folks who send upstream patches; they might not have
interesting configurations to test with. If we don't provide an option, then
they might not be able to test different scenarios.
I do agree a command line option might not be a great choice here; we should
keep them granular.
What if there is a shell script which runs the runner with different
combinations? There should be a default configuration provided to ease testing
of patches for folks who might not be aware of the configurations which
maintainers generally use.
The end goal is to give the patch submitter good confidence that they have done
adequate testing.
> >
> > We have support in KUT for environment variables (which are stored in an
> > initrd). The feature hasn't been used too much, but x86 applies it to
> > configuration parameters needed to execute tests from grub, arm uses it
> > for an errata framework allowing tests to run on kernels which may not
> > include fixes to host-crashing bugs, and riscv is using them quite a bit
> > for providing test parameters and test expected results in order to allow
> > SBI tests to be run on a variety of SBI implementations. The environment
> > variables are provided in a text file which is not tracked by git. kvm
> > selftests can obviously also use environment variables by simply sourcing
> > them first in wrapper scripts for the tests.
>
> Oh hell no! :-)
>
> For reproducibility, transparency, determinism, environment variables are pure
> evil. I don't want to discover that I wasn't actually testing what I thought I
> was testing because I forgot to set/purge an environment variable. Ditto for
> trying to reproduce a failure reported by someone.
>
> KUT's usage to adjust to the *system* environment is somewhat understandable.
> But for KVM selftests, there should be absolutely zero reason to need to fall
> back to environment variables. Unlike KUT, which can run in a fairly large variety
> of environments, e.g. bare metal vs. virtual, different VMMs, different firmware,
> etc., KVM selftests effectively support exactly one environment.
>
> And unlike KUT, KVM selftests are tightly coupled to the kernel. Yes, it's very
> possible to run selftests against different kernels, but I don't think we should
> go out of our way to support such usage. And if an environment needs to skip a
> test, it should be super easy to do so if we decouple the test configuration
> inputs from the test runner.
Also, keeping things out of tree won't help other developers much. I want the
majority of the configurations which maintainers/regular contributors maintain
locally to be upstreamed and consolidated.
>
> > > There are underlying issues with KUT that essentially necessitate that approach,
> > > e.g. x86 has several testcases that fail if run without the exact right config.
> > > But that's just another reason to NOT follow KUT's pattern, e.g. to force us to
> > > write robust tests.
> > >
> > > E.g. instead of per-config command line options, let the user specify a file,
> > > and/or a directory (using a well known filename pattern to detect configs).
> >
> > Could also use an environment variable to specify a file which contains
> > a config in a test-specific format if parsing environment variables is
> > insufficient or awkward for configuring a test.
>
> There's no reason to use an environment variable for this. If we want to support
> "advanced" setup via a test configuration, then that can simply go in configuration
> file that's passed to the runner.
Can you specify what this test configuration file/directory will look like?
Also, is it going to be one file per test? This might become ugly soon.
This brings up the question of how to handle test execution when individual
tests are run with different command line parameters that need a specific
environment.
Some parameters will need a very specific module or sysfs setting which might
conflict with other tests. This is why I had "test_suite" in my JSON, which can
provide module, sysfs, or other host settings. But this also added the cost of
duplicating tests across suites.
I guess the shell script I talked about a few paragraphs above can have
specific runner invocations which set up the requirements of a test and execute
just that test (the RFC runner has the capability to execute a specific test).
Open to suggestions on a better approach.
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-11-15 21:15 ` Vipin Sharma
@ 2024-11-21 0:29 ` Sean Christopherson
2024-11-22 22:37 ` Vipin Sharma
0 siblings, 1 reply; 10+ messages in thread
From: Sean Christopherson @ 2024-11-21 0:29 UTC (permalink / raw)
To: Vipin Sharma
Cc: Andrew Jones, kvm, kvmarm, kvm-riscv, linux-arm-kernel,
Paolo Bonzini, Anup Patel, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, Marc Zyngier, Oliver Upton, linux-kernel
On Fri, Nov 15, 2024, Vipin Sharma wrote:
> On 2024-11-14 09:42:32, Sean Christopherson wrote:
> > On Fri, Nov 08, 2024, Andrew Jones wrote:
> > > On Wed, Nov 06, 2024 at 09:06:39AM -0800, Sean Christopherson wrote:
> > > > On Fri, Nov 01, 2024, Vipin Sharma wrote:
> > > > > Phase 3: Provide collection of interesting configurations
> > > > >
> > > > > Specific individual constructs can be combined in a meaningful way to
> > > > > provide interesting configurations to run on a platform. For example,
> > > > > user doesn't need to specify each individual configuration instead,
> > > > > some prebuilt configurations can be exposed like
> > > > > --stress_test_shadow_mmu, --test_basic_nested
> > > >
> > > > IMO, this shouldn't be baked into the runner, i.e. should not surface as dedicated
> > > > command line options. Users shouldn't need to modify the runner just to bring
> > > > their own configuration. I also think configurations should be discoverable,
> > > > e.g. not hardcoded like KUT's unittest.cfg. A very real problem with KUT's
> > > > approach is that testing different combinations is frustratingly difficult,
> > > > because running a testcase with different configuration requires modifying a file
> > > > that is tracked by git.
>
> I was thinking of folks who send upstream patches, they might not have
> interesting configurations to run to test. If we don't provide an option
> then they might not be able to test different scenarios.
Yeah, I'm not saying upstream can't provide testcases, just that the existence of
testcases shouldn't be hardcoded into the runner.
> I do agree command line option might not be a great choice here, we
> should keep them granular.
>
> What if there is a shell script which has some runner commands with
> different combinations? There should be a default configuration provided
> to ease testing of patches for folks who might not be aware of the
> configurations which maintainers generally use.
>
> End goal is to provide good confidence to the patch submitter that they have
> done good testing.
As discussed off-list, I think having one testcase per file is the way to go.
- Very discoverable (literally files)
- The user (or a shell script) can use regexes, globbing, etc., to select which
tests to run
- Creating "suites" is similarly easy, e.g. by having a list of files/testscase,
or maybe by directory layout
Keeping track of testcases (and their content), e.g. to avoid duplicates, might
be an issue, but I think we can mitigate that by establishing and following
guidelines for naming, e.g. so that the name of a testcase gives the user a
decent idea of what it does.
> > > Could also use an environment variable to specify a file which contains
> > > a config in a test-specific format if parsing environment variables is
> > > insufficient or awkward for configuring a test.
> >
> > There's no reason to use a environment variable for this. If we want to support
> > "advanced" setup via a test configuration, then that can simply go in configuration
> > file that's passed to the runner.
>
> Can you guys specify What does this test configuration file/directory
> will look like? Also, is it gonna be a one file for one test? This might
> become ugly soon.
As above, I like the idea of one file per testcase. I'm not anticipating thousands
of tests. Regardless of how we organize things, mentally keeping track of that
many tests would be extremely difficult. E.g. testcases would likely bitrot and/or
we'd end up with a lot of overlap. And if we do get anywhere near that number of
testcases, they'll need to be organized in some way.
One idea would be to create a directory per KVM selftest, and then put the testcases for
that test in said directory. We could even do something clever like fail the
build if a test doesn't have a corresponding directory (and a default testcase?).
E.g. tools/testing/selftests/kvm/testcases, with sub-directories following the
tests themselves and separated by architecture as appropriate.
That gives us decent organization. If each test has a testcase directory, it's easy to
get a list of testcases. At that point, the name of the testcase can be used to
organize and describe, e.g. by tying the name to the (most interesting) parameters.
Hmm, and for collections of multiple testcases, what if we added a separate
"testsuites" directory, with different syntax? E.g. one file per testsuite, which
is basically a list of testcases. Maybe with some magic to allow testsuites to
"include" arch specific info?
E.g. something like this
$ tree test*
testcases
└── max_guest_memory_test
└── 128gib.allvcpus.test
testsuites
└── mmu.suite
3 directories, 2 files
$ cat testsuites/mmu.suite
$ARCH/mmu.suite
max_guest_memory_test/128gib.allvcpus.test
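Purely as a sketch of the idea (not a settled format), the runner could expand
such a suite file by substituting $ARCH and skipping arch-specific includes
that don't exist; everything below is illustrative:
    import os
    import platform
    def expand_suite(suite_path, testcases_dir):
        arch = platform.machine()            # e.g. "x86_64", "aarch64"
        testcases = []
        with open(suite_path) as f:
            for line in f:
                entry = line.strip().replace("$ARCH", arch)
                if not entry or entry.startswith("#"):
                    continue
                if entry.endswith(".suite"):
                    nested = os.path.join(os.path.dirname(suite_path), entry)
                    if os.path.exists(nested):   # arch include may not exist
                        testcases += expand_suite(nested, testcases_dir)
                else:
                    testcases.append(os.path.join(testcases_dir, entry))
        return testcases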
> This brings the question on how to handle the test execution when we are using
> different command line parameters for individual tests which need some
> specific environmnet?
>
> Some parameters will need a very specific module or sysfs setting which
> might conflict with other tests. This is why I had "test_suite" in my
> json, which can provide some module, sysfs, or other host settings. But
> this also added cost of duplicating tests for each/few suites.
IMO, this should be handled by the user, or their CI environment, not by the upstream
runner. Reconfiguring module params or sysfs knobs is inherently environment
specific. E.g. not all KVM module params are writable post-load, and so changing
a param might require stopping all VMs on the system, or even a full kernel reboot
if KVM is built-in. There may also be knobs that require root access, that are
dependent on hardware and/or kernel config, etc.
I really don't want to build all that into the upstream test runner, certainly not
in the first few phases. I 100% think the runner should be constructed in such a
way that people/organizations/CI pipelines can build infrastructure on top, I just
don't think it's a big need or a good fit for upstream.
> I guess the shell script I talked about few paragraphs above, can have
> some specific runner invocations which will set specific requirements of
> the test and execute that specific test (RFC runner has the capabilty to execute
> specific test).
>
> Open to suggestions on a better approach.
* Re: [RFC PATCH 0/1] KVM selftests runner for running more than just default
2024-11-21 0:29 ` Sean Christopherson
@ 2024-11-22 22:37 ` Vipin Sharma
0 siblings, 0 replies; 10+ messages in thread
From: Vipin Sharma @ 2024-11-22 22:37 UTC (permalink / raw)
To: Sean Christopherson
Cc: Andrew Jones, kvm, kvmarm, kvm-riscv, linux-arm-kernel,
Paolo Bonzini, Anup Patel, Christian Borntraeger, Janosch Frank,
Claudio Imbrenda, Marc Zyngier, Oliver Upton, linux-kernel
On 2024-11-20 16:29:16, Sean Christopherson wrote:
> On Fri, Nov 15, 2024, Vipin Sharma wrote:
> > On 2024-11-14 09:42:32, Sean Christopherson wrote:
> > > On Fri, Nov 08, 2024, Andrew Jones wrote:
> > > > On Wed, Nov 06, 2024 at 09:06:39AM -0800, Sean Christopherson wrote:
> > > > > On Fri, Nov 01, 2024, Vipin Sharma wrote:
> As discussed off-list, I think having one testcase per file is the way to go.
>
> - Very discoverable (literally files)
> - The user (or a shell script) can use regexes, globbing, etc., to select which
> tests to run
> - Creating "suites" is similarly easy, e.g. by having a list of files/testcases,
> or maybe by directory layout
>
> Keeping track of testcases (and their content), e.g. to avoid duplicates, might
> be an issue, but I think we can mitigate that by establishing and following
> guidelines for naming, e.g. so that the name of a testcase gives the user a
> decent idea of what it does.
>
> > > > Could also use an environment variable to specify a file which contains
> > > > a config in a test-specific format if parsing environment variables is
> > > > insufficient or awkward for configuring a test.
> > >
> > There's no reason to use an environment variable for this. If we want to support
> > > "advanced" setup via a test configuration, then that can simply go in configuration
> > > file that's passed to the runner.
> >
> > Can you guys specify What does this test configuration file/directory
> > will look like? Also, is it gonna be a one file for one test? This might
> > become ugly soon.
>
> As above, I like the idea of one file per testcase. I'm not anticipating thousands
> of tests. Regardless of how we organize things, mentally keeping track of that
> many tests would be extremely difficult. E.g. testcases would likely bitrot and/or
> we'd end up with a lot of overlap. And if we do get anywhere near that number of
> testcases, they'll need to be organized in some way.
>
> One idea would be to create a directory per KVM selftest, and then put the testcases for
> that test in said directory. We could even do something clever like fail the
> build if a test doesn't have a corresponding directory (and a default testcase?).
>
> E.g. tools/testing/selftests/kvm/testcases, with sub-directories following the
> tests themselves and separated by architecture as appropriate.
>
> That gives us decent organization. If each test has a testcase directory, it's easy to
> get a list of testcases. At that point, the name of the testcase can be used to
> organize and describe, e.g. by tying the name to the (most interesting) parameters.
Sounds good. Extending your example for testcases, below is the layout I am
imagining:
testcases/
├── aarch64
│ └── arch_timer
│ └── default
├── memslot_modification_stress_test
│ ├── 128gib.allvcpus.partitioned_memory_access
│ ├── default
│ └── x86_64
│ └── disable_slot_zap_quirk
├── riscv
│ └── arch_timer
│ └── default
├── s390x
├── steal_time
│ └── default
└── x86_64
├── amx_test
│ └── default
└── private_mem_conversions_test
├── 2vcpu.2memslots
└── default
1. Testcases will follow the directory structure of the test source files.
2. "default" will contain just the path of the test relative to the
   tools/testing/selftests/kvm directory, with no arguments provided in it.
3. Extra testcases can be provided as separate files with meaningful names.
4. Some tests (e.g. memslot_modification_stress_test) have arch-specific
   options. Those testcases will be placed under an arch-specific directory
   inside that test's testcase directory.
5. The runner will be given a directory path and will run all files in that
   directory and its subdirectories recursively.
6. The runner will also filter out testcases based on filepath. For example,
   when running on an ARM platform it will ignore any path which contains
   x86_64, s390x, or riscv anywhere in it (see the sketch after this list).
7. If the user wants to save the runner's output, the dump will follow the
   same directory structure.
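A minimal sketch of the discovery and filtering in points 5 and 6 (standard
library only; the arch naming is illustrative and may need a small translation
table, e.g. platform.machine() reports "riscv64" while the directory above is
"riscv"):
    import os
    import platform
    ARCH_DIRS = {"x86_64", "aarch64", "s390x", "riscv"}
    def discover_testcases(root):
        host_arch = platform.machine()       # e.g. "x86_64", "aarch64"
        skip = ARCH_DIRS - {host_arch}
        testcases = []
        for dirpath, _, filenames in os.walk(root):
            parts = set(os.path.relpath(dirpath, root).split(os.sep))
            if parts & skip:
                continue    # testcase belongs to some other architecture
            for filename in filenames:
                testcases.append(os.path.join(dirpath, filename))
        return sorted(testcases)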
>
> Hmm, and for collections of multiple testcases, what if we added a separate
> "testsuites" directory, with different syntax? E.g. one file per testsuite, which
> is basically a list of testcases. Maybe with some magic to allow testsuites to
> "include" arch specific info?
>
> E.g. something like this
>
> $ tree test*
> testcases
> └── max_guest_memory_test
> └── 128gib.allvcpus.test
> testsuites
> └── mmu.suite
>
> 3 directories, 2 files
> $ cat testsuites/mmu.suite
> $ARCH/mmu.suite
> max_guest_memory_test/128gib.allvcpus.test
This can be one option. For now let's table the testsuites discussion; we will
revisit it after phase 1 is completed.
>
> > This brings the question on how to handle the test execution when we are using
> > different command line parameters for individual tests which need some
> > specific environmnet?
> >
> > Some parameters will need a very specific module or sysfs setting which
> > might conflict with other tests. This is why I had "test_suite" in my
> > json, which can provide some module, sysfs, or other host settings. But
> > this also added cost of duplicating tests for each/few suites.
>
> IMO, this should be handled by the user, or their CI environment, not by the upstream
> runner. Reconfiguring module params or sysfs knobs is inherently environment
> specific. E.g. not all KVM module params are writable post-load, and so changing
> a param might require stopping all VMs on the system, or even a full kernel reboot
> if KVM is built-in. There may also be knobs that require root access, that are
> dependent on hardware and/or kernel config, etc.
>
> I really don't want to build all that into the upstream test runner, certainly not
> in the first few phases. I 100% think the runner should be constructed in such a
> way that people/organizations/CI pipelines can build infrastructure on top, I just
> don't think it's a big need or a good fit for upstream.
For the phase 2 discussion, you responded:
> Phase 2: Environment setup via runner
>
> Current patch, allows to write "setup" commands at test suite and test
> level in the json config file to setup the environment needed by a
> test to run. This might not be ideal as some settings are exposed
> differently on different platforms.
>
> For example,
> To enable TDP:
> - Intel needs ept=Y
> - AMD needs npt=Y
> - ARM always on.
>
> To enable APIC virtualization
> - Intel needs enable_apicv=Y
> - AMD needs avic=Y
>
> To enable/disable nested, they both have the same file name "nested"
> in their module params directory which should be changed.
>
> These kinds of settings become more verbose and unnecessary on other
> platforms. Instead, runners should have some programming constructs
> (API, command line options, default) to enable these options in a
> generic way. For example, enable/disable nested can be exposed as a
> command line --enable_nested, then based on the platform, runner can
> update corresponding module param or ignore.
>
> This will easily extend to providing sane configuration on the
> corresponding platforms without lots of hardcoding in JSON. These
> individual constructs will provide a generic view/option to run a KVM
> feature, and under the hood will do things differently based on the
> platform it is running on like arm, x86-intel, x86-amd, s390, etc.
My main input on this front is that the runner needs to configure module params
(and other environment settings) _on behalf of the user_, i.e. in response to a
command line option (to the runner), not in response to per-test configurations.
Should we still write code in the runner for setting parameters which are
writable/modifiable, or just leave it out? Our current plan is to provide some
command line options to set those things, but it is true that we might not be
able to set everything via the runner.
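For example, an --enable_nested option could map to the vendor module param and
simply report when the knob can't be changed (a sketch only; the
/sys/module/kvm_intel/parameters/nested and /sys/module/kvm_amd/parameters/nested
paths are the "nested" params mentioned above, everything else is illustrative):
    import os
    NESTED_PARAMS = (
        "/sys/module/kvm_intel/parameters/nested",
        "/sys/module/kvm_amd/parameters/nested",
    )
    def enable_nested():
        for path in NESTED_PARAMS:
            if not os.path.exists(path):
                continue    # other vendor's module, or no such knob here
            try:
                with open(path, "w") as f:
                    f.write("1")
                return True
            except OSError:
                # Not writable post-load, or not running as root: tell the
                # user instead of silently testing the wrong configuration.
                return False
        # No nested knob on this platform; nothing to do ("or ignore" above).
        return True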
end of thread
Thread overview: 10+ messages
[not found] <20240821223012.3757828-1-vipinsh@google.com>
2024-08-22 20:55 ` [RFC PATCH 0/1] KVM selftests runner for running more than just default Vipin Sharma
2024-11-01 22:13 ` Vipin Sharma
2024-11-06 17:06 ` Sean Christopherson
2024-11-08 11:05 ` Andrew Jones
2024-11-14 17:42 ` Sean Christopherson
2024-11-15 21:15 ` Vipin Sharma
2024-11-21 0:29 ` Sean Christopherson
2024-11-22 22:37 ` Vipin Sharma
2024-11-15 20:18 ` Vipin Sharma
[not found] ` <20240821223012.3757828-2-vipinsh@google.com>
2024-08-22 20:56 ` [PATCH 1/1] KVM: selftestsi: Create KVM selftests runnner to run interesting tests Vipin Sharma