public inbox for kvm@vger.kernel.org
From: Paolo Bonzini <pbonzini@redhat.com>
To: Junaid Shahid <junaids@google.com>, kvm@vger.kernel.org
Cc: jmattson@google.com, seanjc@google.com, bgardon@google.com,
	dmatlack@google.com
Subject: Re: [PATCH] kvm: x86: mmu: Make NX huge page recovery period configurable
Date: Sat, 16 Oct 2021 09:00:22 +0200	[thread overview]
Message-ID: <f513e90b-8eed-53ef-6c80-e588f1260e18@redhat.com> (raw)
In-Reply-To: <20211016005052.3820023-1-junaids@google.com>

On 16/10/21 02:50, Junaid Shahid wrote:
> Currently, the NX huge page recovery thread wakes up every minute and
> zaps 1/nx_huge_pages_recovery_ratio of the total number of split NX
> huge pages at a time. This is intended to ensure that only a
> relatively small number of pages get zapped at a time. But for very
> large VMs (or more specifically, VMs with a large number of
> executable pages), a period of 1 minute could still result in this
> number being too high (unless the ratio is changed significantly,
> but that can result in split pages lingering on for too long).
> 
> This change makes the period configurable instead of fixing it at
> 1 minute (though that is still kept as the default). Users of
> large VMs can then adjust both the ratio and the period together to
> reduce the number of pages zapped at one time while still
> maintaining the same overall duration for cycling through the
> entire list (e.g. the period could be set to 1 second and the ratio
> to 3600 to maintain the overall cycling period of 1 hour).

The patch itself looks good, but perhaps the default (i.e. the value used 
when the period parameter is left at 0) could be 3600000 / ratio, so that 
adjusting the ratio alone is enough?

Paolo

> 
> Signed-off-by: Junaid Shahid <junaids@google.com>
> ---
>   .../admin-guide/kernel-parameters.txt         |  8 ++++-
>   arch/x86/kvm/mmu/mmu.c                        | 33 +++++++++++++------
>   2 files changed, 30 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 91ba391f9b32..8e2b426726e5 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2353,7 +2353,13 @@
>   			[KVM] Controls how many 4KiB pages are periodically zapped
>   			back to huge pages.  0 disables the recovery, otherwise if
>   			the value is N KVM will zap 1/Nth of the 4KiB pages every
> -			minute.  The default is 60.
> +			period (see below).  The default is 60.
> +
> +	kvm.nx_huge_pages_recovery_period_ms=
> +			[KVM] Controls the time period at which KVM zaps 4KiB pages
> +			back to huge pages. 0 disables the recovery, otherwise if
> +			the value is N, KVM will zap a portion (see ratio above) of
> +			the pages every N msecs. The default is 60000 (i.e. 1 min).
>   
>   	kvm-amd.nested=	[KVM,AMD] Allow nested virtualization in KVM/SVM.
>   			Default is 1 (enabled)
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 24a9f4c3f5e7..47e113fc05ab 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -61,28 +61,33 @@ int __read_mostly nx_huge_pages = -1;
>   #ifdef CONFIG_PREEMPT_RT
>   /* Recovery can cause latency spikes, disable it for PREEMPT_RT.  */
>   static uint __read_mostly nx_huge_pages_recovery_ratio = 0;
> +static uint __read_mostly nx_huge_pages_recovery_period_ms = 0;
>   #else
>   static uint __read_mostly nx_huge_pages_recovery_ratio = 60;
> +static uint __read_mostly nx_huge_pages_recovery_period_ms = 60000;
>   #endif
>   
>   static int set_nx_huge_pages(const char *val, const struct kernel_param *kp);
> -static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp);
> +static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel_param *kp);
>   
>   static const struct kernel_param_ops nx_huge_pages_ops = {
>   	.set = set_nx_huge_pages,
>   	.get = param_get_bool,
>   };
>   
> -static const struct kernel_param_ops nx_huge_pages_recovery_ratio_ops = {
> -	.set = set_nx_huge_pages_recovery_ratio,
> +static const struct kernel_param_ops nx_huge_pages_recovery_param_ops = {
> +	.set = set_nx_huge_pages_recovery_param,
>   	.get = param_get_uint,
>   };
>   
>   module_param_cb(nx_huge_pages, &nx_huge_pages_ops, &nx_huge_pages, 0644);
>   __MODULE_PARM_TYPE(nx_huge_pages, "bool");
> -module_param_cb(nx_huge_pages_recovery_ratio, &nx_huge_pages_recovery_ratio_ops,
> +module_param_cb(nx_huge_pages_recovery_ratio, &nx_huge_pages_recovery_param_ops,
>   		&nx_huge_pages_recovery_ratio, 0644);
>   __MODULE_PARM_TYPE(nx_huge_pages_recovery_ratio, "uint");
> +module_param_cb(nx_huge_pages_recovery_period_ms, &nx_huge_pages_recovery_param_ops,
> +		&nx_huge_pages_recovery_period_ms, 0644);
> +__MODULE_PARM_TYPE(nx_huge_pages_recovery_period_ms, "uint");
>   
>   static bool __read_mostly force_flush_and_sync_on_reuse;
>   module_param_named(flush_on_reuse, force_flush_and_sync_on_reuse, bool, 0644);
> @@ -6088,18 +6093,23 @@ void kvm_mmu_module_exit(void)
>   	mmu_audit_disable();
>   }
>   
> -static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp)
> +static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel_param *kp)
>   {
> -	unsigned int old_val;
> +	bool was_recovery_enabled, is_recovery_enabled;
>   	int err;
>   
> -	old_val = nx_huge_pages_recovery_ratio;
> +	was_recovery_enabled = nx_huge_pages_recovery_ratio &&
> +			       nx_huge_pages_recovery_period_ms;
> +
>   	err = param_set_uint(val, kp);
>   	if (err)
>   		return err;
>   
> +	is_recovery_enabled = nx_huge_pages_recovery_ratio &&
> +			      nx_huge_pages_recovery_period_ms;
> +
>   	if (READ_ONCE(nx_huge_pages) &&
> -	    !old_val && nx_huge_pages_recovery_ratio) {
> +	    !was_recovery_enabled && is_recovery_enabled) {
>   		struct kvm *kvm;
>   
>   		mutex_lock(&kvm_lock);
> @@ -6162,8 +6172,11 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
>   
>   static long get_nx_lpage_recovery_timeout(u64 start_time)
>   {
> -	return READ_ONCE(nx_huge_pages) && READ_ONCE(nx_huge_pages_recovery_ratio)
> -		? start_time + 60 * HZ - get_jiffies_64()
> +	uint ratio = READ_ONCE(nx_huge_pages_recovery_ratio);
> +	uint period = READ_ONCE(nx_huge_pages_recovery_period_ms);
> +
> +	return READ_ONCE(nx_huge_pages) && ratio && period
> +		? start_time + msecs_to_jiffies(period) - get_jiffies_64()
>   		: MAX_SCHEDULE_TIMEOUT;
>   }
>   
> 


Thread overview: 3+ messages
2021-10-16  0:50 [PATCH] kvm: x86: mmu: Make NX huge page recovery period configurable Junaid Shahid
2021-10-16  7:00 ` Paolo Bonzini [this message]
2021-10-18 18:55   ` Junaid Shahid
