Re: [PATCH v7 06/11] KVM: selftests: Add NX huge pages test

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Ben Gardon <bgardon@google.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>,
	David Matlack <dmatlack@google.com>,
	Jim Mattson <jmattson@google.com>,
	David Dunn <daviddunn@google.com>,
	Jing Zhang <jingzhangos@google.com>,
	Junaid Shahid <junaids@google.com>
Subject: Re: [PATCH v7 06/11] KVM: selftests: Add NX huge pages test
Date: Thu, 5 May 2022 18:59:43 +0000	[thread overview]
Message-ID: <YnQen5rO3t/2WiLi@google.com> (raw)
In-Reply-To: <20220503183045.978509-7-bgardon@google.com>

On Tue, May 03, 2022, Ben Gardon wrote:
> There's currently no test coverage of NX hugepages in KVM selftests, so
> add a basic test to ensure that the feature works as intended.

Please describe how the test actually validates the feature, here and/or as a
file comment.  Doesn't have to be super detailed, but people shouldn't have to
read the code just to understand that the test uses stats to verify the KVM is
(or isn't) splitting pages as expected.

> Reviewed-by: David Matlack <dmatlack@google.com>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |  10 +
>  .../selftests/kvm/include/kvm_util_base.h     |   1 +
>  tools/testing/selftests/kvm/lib/kvm_util.c    |  80 ++++++++
>  .../selftests/kvm/x86_64/nx_huge_pages_test.c | 180 ++++++++++++++++++
>  .../kvm/x86_64/nx_huge_pages_test.sh          |  36 ++++

Test needs to be added to .gitignore.

>  5 files changed, 307 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
>  create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

...

> + * Input Args:
> + *   vm - the VM for which the stat should be read
> + *   stat_name - the name of the stat to read
> + *   max_elements - the maximum number of 8-byte values to read into data
> + *
> + * Output Args:
> + *   data - the buffer into which stat data should be read
> + *
> + * Return:
> + *   The number of data elements read into data or -ERRNO on error.
> + *
> + * Read the data values of a specified stat from the binary stats interface.
> + */
> +static int __vm_get_stat(struct kvm_vm *vm, const char *stat_name,
> +			 uint64_t *data, ssize_t max_elements)
> +{
> +	struct kvm_stats_desc *stats_desc;
> +	struct kvm_stats_header header;
> +	struct kvm_stats_desc *desc;
> +	size_t size_desc;
> +	int stats_fd;
> +	int ret = -EINVAL;
> +	int i;
> +
> +	stats_fd = vm_get_stats_fd(vm);
> +
> +	read_stats_header(stats_fd, &header);
> +
> +	stats_desc = read_stats_desc(stats_fd, &header);
> +
> +	size_desc = sizeof(struct kvm_stats_desc) + header.name_size;
> +
> +	/* Read kvm stats data one by one */

Bad copy+paste.  This doesn't read every stat, it reads only the specified stat.

> +	for (i = 0; i < header.num_desc; ++i) {
> +		desc = (void *)stats_desc + (i * size_desc);

		desc = get_stats_descriptor(vm->stats_desc, i, vm->stats_header);

> +
> +		if (strcmp(desc->name, stat_name))
> +			continue;
> +
> +		ret = read_stat_data(stats_fd, &header, desc, data,
> +				     max_elements);
> +
> +		break;
> +	}
> +
> +	free(stats_desc);
> +	close(stats_fd);
> +	return ret;
> +}
> +
> +/*
> + * Read the value of the named stat
> + *
> + * Input Args:
> + *   vm - the VM for which the stat should be read
> + *   stat_name - the name of the stat to read
> + *
> + * Output Args: None
> + *
> + * Return:
> + *   The value of the stat
> + *
> + * Reads the value of the named stat through the binary stat interface. If
> + * the named stat has multiple data elements, only the first will be returned.
> + */
> +uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)

One read_stat_data() doesn't return an int, this can be inlined.  Yeah, it'll
expose __vm_get_stat(), but IMO there's no point in having __vm_get_stat() unless
it's exposed.  At that point, drop the function comment and defer to the inner
helper for that detailed info (or even drop it entirely).

> +{
> +	uint64_t data;
> +	int ret;
> +
> +	ret = __vm_get_stat(vm, stat_name, &data, 1);
> +	TEST_ASSERT(ret == 1,
> +		    "Stat %s expected to have 1 element, but %d returned",
> +		    stat_name, ret);
> +	return data;
> +}
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> new file mode 100644
> index 000000000000..238a6047791c
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> @@ -0,0 +1,180 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * tools/testing/selftests/kvm/nx_huge_page_test.c
> + *
> + * Usage: to be run via nx_huge_page_test.sh, which does the necessary
> + * environment setup and teardown
> + *
> + * Copyright (C) 2022, Google LLC.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <fcntl.h>
> +#include <stdint.h>
> +#include <time.h>
> +
> +#include <test_util.h>
> +#include "kvm_util.h"
> +
> +#define HPAGE_SLOT		10
> +#define HPAGE_GVA		(23*1024*1024)
> +#define HPAGE_GPA		(10*1024*1024)

Is there any special meaning behind the addresses?  If not, IMO it's safer to
define the GPA to be 4gb, that way there's no chance of this colliding with
memslot0 created by the framework.  10mb (if my math isn't terrible) isn't all
that high.  And unless the GPA and GVA need to be different for some reason, just
identity map them.

> +#define HPAGE_SLOT_NPAGES	(512 * 3)
> +#define PAGE_SIZE		4096

commit e852be8b148e ("kvm: selftests: introduce and use more page size-related constants")
in kvm/master defines PAGE_SIZE.  I bring that up, because rather than open code
the "512" math in multiple places, I would much rather do something like

#define PAGE_SIZE_2MB		(PAGE_SIZE * 512)

or whatever, and then use that.

> +
> +/*
> + * Passed by nx_huge_pages_test.sh to provide an easy warning if this test is
> + * being run without it.
> + */
> +#define MAGIC_TOKEN 887563923
> +
> +/*
> + * x86 opcode for the return instruction. Used to call into, and then
> + * immediately return from, memory backed with hugepages.
> + */
> +#define RETURN_OPCODE 0xC3
> +
> +/*
> + * Exit the VM after each memory access so that the userspace component of the
> + * test can make assertions about the pages backing the VM.
> + *
> + * See the below for an explanation of how each access should affect the
> + * backing mappings.
> + */
> +void guest_code(void)
> +{
> +	uint64_t hpage_1 = HPAGE_GVA;
> +	uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
> +	uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
> +
> +	READ_ONCE(*(uint64_t *)hpage_1);
> +	GUEST_SYNC(1);
> +
> +	READ_ONCE(*(uint64_t *)hpage_2);
> +	GUEST_SYNC(2);
> +
> +	((void (*)(void)) hpage_1)();

LOL, nice.  It'd be very, very helpful for readers to add a helper for this, e.g.

static guest_do_CALL(void *target)
{
	((void (*)(void)) target)();
}

and then the usage should be a little more obvious.

	guest_do_CALL(hpage_1);

> +	GUEST_SYNC(3);
> +
> +	((void (*)(void)) hpage_3)();
> +	GUEST_SYNC(4);
> +
> +	READ_ONCE(*(uint64_t *)hpage_1);
> +	GUEST_SYNC(5);
> +
> +	READ_ONCE(*(uint64_t *)hpage_3);
> +	GUEST_SYNC(6);
> +}

...

> +int main(int argc, char **argv)
> +{
> +       struct kvm_vm *vm;
> +       struct timespec ts;
> +       void *hva;
> +
> +       if (argc != 2 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {

Since this will take multiple params, I think it makes senes to skip on:

	if (argc < 2 || strtol(argv[1], NULL, 0) != MAGIC_TOKEN) {

And the error out on the remaining params.

> +               printf("This test must be run through nx_huge_pages_test.sh");

Needs a newline at the end.  Even better, just use print_skip().  And I strongly
prefer that the skip message complain about not getting the correct magic token,
not about the script.  It's totally valid to run the test without the script.
I think it's a good idea to provide a redirect to the script, but yell about the
magic token first.

> +               return KSFT_SKIP;
> +       }

...

> +	hva = addr_gpa2hva(vm, HPAGE_GPA);
> +	memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);

Heh, no need to set the entire page, only the first byte needs to be written.  If
you're going for paranoia, write 0xcc to the entire slot and then write only the
first byte to RET.

> +	/*
> +	 * Give recovery thread time to run. The wrapper script sets
> +	 * recovery_period_ms to 100, so wait 5x that.
> +	 */
> +	ts.tv_sec = 0;
> +	ts.tv_nsec = 500000000;
> +	nanosleep(&ts, NULL);

Please pass in the configured nx_huge_pages_recovery_period_ms, or alternatively
read it from within the test.  I assume the test will fail if the period is too
low, so some sanity checks are likely in order.  It'll be unfortunate if someone
changes the script and forgets to update the test.

> +
> +	/*
> +	 * Now that the reclaimer has run, all the split pages should be gone.
> +	 */
> +	check_2m_page_count(vm, 1);
> +	check_split_count(vm, 0);
> +
> +	/*
> +	 * The 4k mapping on hpage 3 should have been removed, so check that
> +	 * reading from it causes a huge page mapping to be installed.
> +	 */
> +	vcpu_run(vm, 0);
> +	check_2m_page_count(vm, 2);
> +	check_split_count(vm, 0);
> +
> +	kvm_vm_free(vm);
> +
> +	return 0;
> +}
> +
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> new file mode 100755
> index 000000000000..60bfed8181b9
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -0,0 +1,36 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0-only */
> +#
> +# Wrapper script which performs setup and cleanup for nx_huge_pages_test.
> +# Makes use of root privileges to set up huge pages and KVM module parameters.
> +#
> +# tools/testing/selftests/kvm/nx_huge_page_test.sh
> +# Copyright (C) 2022, Google LLC.
> +
> +set -e
> +
> +NX_HUGE_PAGES=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages)
> +NX_HUGE_PAGES_RECOVERY_RATIO=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> +NX_HUGE_PAGES_RECOVERY_PERIOD=$(sudo cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> +HUGE_PAGES=$(sudo cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)

I think we can omit sudo on these.  Since we're assuming sysfs is mounted at the
normal path, we can also likely assume it's mounted with normal permissions too,
i.e. read-only for non-root users.

> +
> +set +e
> +
> +(
> +	set -e
> +
> +	sudo echo 1 > /sys/module/kvm/parameters/nx_huge_pages

"sudo echo" doesn't work, the redirection is done by the shell, not echo itself
(which is run with sudo).  This needs to be e.g.

	echo 1 | sudo tee -a /sys/module/kvm/parameters/nx_huge_pages > /dev/null

> +	sudo echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +	sudo echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +	sudo echo 3 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

So, what happens if the user is running something else that needs huge pages?  Feels
like the script should read the value and add three, not just blindly write '3'.

> +
> +	"$(dirname $0)"/nx_huge_pages_test 887563923
> +)
> +RET=$?
> +
> +sudo echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> +sudo echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +sudo echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +sudo echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +exit $RET
> -- 
> 2.36.0.464.gb9c8b46e94-goog
>

next prev parent reply	other threads:[~2022-05-05 18:59 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-03 18:30 [PATCH v7 00/11] KVM: x86: Add a cap to disable NX hugepages on a VM Ben Gardon
2022-05-03 18:30 ` [PATCH v7 01/11] KVM: selftests: Remove dynamic memory allocation for stats header Ben Gardon
2022-05-03 18:30 ` [PATCH v7 02/11] KVM: selftests: Read binary stats header in lib Ben Gardon
2022-05-03 18:30 ` [PATCH v7 03/11] KVM: selftests: Read binary stats desc " Ben Gardon
2022-05-05 17:08   ` Sean Christopherson
2022-05-05 17:13     ` Sean Christopherson
2022-05-03 18:30 ` [PATCH v7 04/11] KVM: selftests: Clean up coding style in binary stats test Ben Gardon
2022-05-03 18:30 ` [PATCH v7 05/11] KVM: selftests: Read binary stat data in lib Ben Gardon
2022-05-05 18:06   ` Sean Christopherson
2022-05-05 18:38     ` Sean Christopherson
2022-05-03 18:30 ` [PATCH v7 06/11] KVM: selftests: Add NX huge pages test Ben Gardon
2022-05-05 18:59   ` Sean Christopherson [this message]
2022-05-03 18:30 ` [PATCH v7 07/11] KVM: x86: Fix errant brace in KVM capability handling Ben Gardon
2022-05-03 18:30 ` [PATCH v7 08/11] KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis Ben Gardon
2022-05-03 18:30 ` [PATCH v7 09/11] KVM: selftests: Factor out calculation of pages needed for a VM Ben Gardon
2022-05-03 18:30 ` [PATCH v7 10/11] KVM: selftests: Test disabling NX hugepages on " Ben Gardon
2022-05-03 18:34   ` Ben Gardon
2022-05-05 19:14   ` Sean Christopherson
2022-05-03 18:30 ` [PATCH v7 11/11] KVM: selftests: Cache binary stats metadata for duration of test Ben Gardon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YnQen5rO3t/2WiLi@google.com \
    --to=seanjc@google.com \
    --cc=bgardon@google.com \
    --cc=daviddunn@google.com \
    --cc=dmatlack@google.com \
    --cc=jingzhangos@google.com \
    --cc=jmattson@google.com \
    --cc=junaids@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox