From: Paolo Bonzini <pbonzini@redhat.com>
To: Junaid Shahid <junaids@google.com>, kvm@vger.kernel.org
Cc: jmattson@google.com, seanjc@google.com, bgardon@google.com,
dmatlack@google.com
Subject: Re: [PATCH] kvm: x86: mmu: Make NX huge page recovery period configurable
Date: Sat, 16 Oct 2021 09:00:22 +0200 [thread overview]
Message-ID: <f513e90b-8eed-53ef-6c80-e588f1260e18@redhat.com> (raw)
In-Reply-To: <20211016005052.3820023-1-junaids@google.com>
On 16/10/21 02:50, Junaid Shahid wrote:
> Currently, the NX huge page recovery thread wakes up every minute and
> zaps 1/nx_huge_pages_recovery_ratio of the total number of split NX
> huge pages at a time. This is intended to ensure that only a
> relatively small number of pages get zapped at a time. But for very
> large VMs (or more specifically, VMs with a large number of
> executable pages), a period of 1 minute could still result in this
> number being too high (unless the ratio is changed significantly,
> but that can result in split pages lingering on for too long).
>
> This change makes the period configurable instead of fixing it at
> 1 minute (though that is still kept as the default). Users of
> large VMs can then adjust both the ratio and the period together to
> reduce the number of pages zapped at one time while still
> maintaining the same overall duration for cycling through the
> entire list (e.g. the period could be set to 1 second and the ratio
> to 3600 to maintain the overall cycling period of 1 hour).
The patch itself looks good, but perhaps the default (corresponding to a
period of 0) could be 3600000 / ratio, so that adjusting the ratio alone
is enough?
Paolo
>
> Signed-off-by: Junaid Shahid <junaids@google.com>
> ---
> .../admin-guide/kernel-parameters.txt | 8 ++++-
> arch/x86/kvm/mmu/mmu.c | 33 +++++++++++++------
> 2 files changed, 30 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 91ba391f9b32..8e2b426726e5 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2353,7 +2353,13 @@
> [KVM] Controls how many 4KiB pages are periodically zapped
> back to huge pages. 0 disables the recovery, otherwise if
> the value is N KVM will zap 1/Nth of the 4KiB pages every
> - minute. The default is 60.
> + period (see below). The default is 60.
> +
> + kvm.nx_huge_pages_recovery_period_ms=
> + [KVM] Controls the time period at which KVM zaps 4KiB pages
> + back to huge pages. 0 disables the recovery, otherwise if
> + the value is N, KVM will zap a portion (see ratio above) of
> + the pages every N msecs. The default is 60000 (i.e. 1 min).
>
> kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
> Default is 1 (enabled)
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 24a9f4c3f5e7..47e113fc05ab 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -61,28 +61,33 @@ int __read_mostly nx_huge_pages = -1;
> #ifdef CONFIG_PREEMPT_RT
> /* Recovery can cause latency spikes, disable it for PREEMPT_RT. */
> static uint __read_mostly nx_huge_pages_recovery_ratio = 0;
> +static uint __read_mostly nx_huge_pages_recovery_period_ms = 0;
> #else
> static uint __read_mostly nx_huge_pages_recovery_ratio = 60;
> +static uint __read_mostly nx_huge_pages_recovery_period_ms = 60000;
> #endif
>
> static int set_nx_huge_pages(const char *val, const struct kernel_param *kp);
> -static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp);
> +static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel_param *kp);
>
> static const struct kernel_param_ops nx_huge_pages_ops = {
> .set = set_nx_huge_pages,
> .get = param_get_bool,
> };
>
> -static const struct kernel_param_ops nx_huge_pages_recovery_ratio_ops = {
> - .set = set_nx_huge_pages_recovery_ratio,
> +static const struct kernel_param_ops nx_huge_pages_recovery_param_ops = {
> + .set = set_nx_huge_pages_recovery_param,
> .get = param_get_uint,
> };
>
> module_param_cb(nx_huge_pages, &nx_huge_pages_ops, &nx_huge_pages, 0644);
> __MODULE_PARM_TYPE(nx_huge_pages, "bool");
> -module_param_cb(nx_huge_pages_recovery_ratio, &nx_huge_pages_recovery_ratio_ops,
> +module_param_cb(nx_huge_pages_recovery_ratio, &nx_huge_pages_recovery_param_ops,
> &nx_huge_pages_recovery_ratio, 0644);
> __MODULE_PARM_TYPE(nx_huge_pages_recovery_ratio, "uint");
> +module_param_cb(nx_huge_pages_recovery_period_ms, &nx_huge_pages_recovery_param_ops,
> + &nx_huge_pages_recovery_period_ms, 0644);
> +__MODULE_PARM_TYPE(nx_huge_pages_recovery_period_ms, "uint");
>
> static bool __read_mostly force_flush_and_sync_on_reuse;
> module_param_named(flush_on_reuse, force_flush_and_sync_on_reuse, bool, 0644);
> @@ -6088,18 +6093,23 @@ void kvm_mmu_module_exit(void)
> mmu_audit_disable();
> }
>
> -static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp)
> +static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel_param *kp)
> {
> - unsigned int old_val;
> + bool was_recovery_enabled, is_recovery_enabled;
> int err;
>
> - old_val = nx_huge_pages_recovery_ratio;
> + was_recovery_enabled = nx_huge_pages_recovery_ratio &&
> + nx_huge_pages_recovery_period_ms;
> +
> err = param_set_uint(val, kp);
> if (err)
> return err;
>
> + is_recovery_enabled = nx_huge_pages_recovery_ratio &&
> + nx_huge_pages_recovery_period_ms;
> +
> if (READ_ONCE(nx_huge_pages) &&
> - !old_val && nx_huge_pages_recovery_ratio) {
> + !was_recovery_enabled && is_recovery_enabled) {
> struct kvm *kvm;
>
> mutex_lock(&kvm_lock);
> @@ -6162,8 +6172,11 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
>
> static long get_nx_lpage_recovery_timeout(u64 start_time)
> {
> - return READ_ONCE(nx_huge_pages) && READ_ONCE(nx_huge_pages_recovery_ratio)
> - ? start_time + 60 * HZ - get_jiffies_64()
> + uint ratio = READ_ONCE(nx_huge_pages_recovery_ratio);
> + uint period = READ_ONCE(nx_huge_pages_recovery_period_ms);
> +
> + return READ_ONCE(nx_huge_pages) && ratio && period
> + ? start_time + msecs_to_jiffies(period) - get_jiffies_64()
> : MAX_SCHEDULE_TIMEOUT;
> }
>
>
2021-10-16 0:50 [PATCH] kvm: x86: mmu: Make NX huge page recovery period configurable Junaid Shahid
2021-10-16 7:00 ` Paolo Bonzini [this message]
2021-10-18 18:55 ` Junaid Shahid