Re: [PATCH] i386/kvm: Prefault memory on page state change

From: Tom Lendacky <thomas.lendacky@amd.com>
To: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	"Roth, Michael" <Michael.Roth@amd.com>
Subject: Re: [PATCH] i386/kvm: Prefault memory on page state change
Date: Mon, 2 Jun 2025 08:17:53 -0500
Message-ID: <4bcfe4ff-939e-f669-8d80-4077cc7aeb60@amd.com>
In-Reply-To: <f5411c42340bd2f5c14972551edb4e959995e42b.1743193824.git.thomas.lendacky@amd.com>

On 3/28/25 15:30, Lendacky, Thomas wrote:
> A page state change is typically followed by an access of the page(s) and
> results in another VMEXIT in order to map the page into the nested page
> table. Depending on the size of the page state change request, this can
> generate a number of additional VMEXITs. For example, under SNP, when
> Linux is utilizing lazy memory acceptance, memory is typically accepted in
> 4M chunks. A page state change request is submitted to mark the pages as
> private, followed by validation of the memory. Since the guest_memfd
> currently only supports 4K pages, each page validation will result in a
> VMEXIT to map the page, resulting in 1024 additional exits.
> 
> When performing a page state change, invoke KVM_PRE_FAULT_MEMORY for the
> size of the page state change in order to pre-map the pages and avoid the
> additional VMEXITs. This helps speed up boot times.

Ping...

> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  accel/kvm/kvm-all.c   |  2 ++
>  include/system/kvm.h  |  1 +
>  target/i386/kvm/kvm.c | 31 ++++++++++++++++++++++++++-----
>  3 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index f89568bfa3..0cd487cea7 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -93,6 +93,7 @@ bool kvm_allowed;
>  bool kvm_readonly_mem_allowed;
>  bool kvm_vm_attributes_allowed;
>  bool kvm_msi_use_devid;
> +bool kvm_pre_fault_memory_supported;
>  static bool kvm_has_guest_debug;
>  static int kvm_sstep_flags;
>  static bool kvm_immediate_exit;
> @@ -2732,6 +2733,7 @@ static int kvm_init(MachineState *ms)
>          kvm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
>          kvm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
>          (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +    kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
>  
>      if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
>          s->kernel_irqchip_split = mc->default_kernel_irqchip_split ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
> diff --git a/include/system/kvm.h b/include/system/kvm.h
> index ab17c09a55..492ea8a383 100644
> --- a/include/system/kvm.h
> +++ b/include/system/kvm.h
> @@ -42,6 +42,7 @@ extern bool kvm_gsi_routing_allowed;
>  extern bool kvm_gsi_direct_mapping;
>  extern bool kvm_readonly_mem_allowed;
>  extern bool kvm_msi_use_devid;
> +extern bool kvm_pre_fault_memory_supported;
>  
>  #define kvm_enabled()           (kvm_allowed)
>  /**
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 6c749d4ee8..7c39d30c5f 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -5999,9 +5999,11 @@ static bool host_supports_vmx(void)
>   * because private/shared page tracking is already provided through other
>   * means, these 2 use-cases should be treated as being mutually-exclusive.
>   */
> -static int kvm_handle_hc_map_gpa_range(struct kvm_run *run)
> +static int kvm_handle_hc_map_gpa_range(X86CPU *cpu, struct kvm_run *run)
>  {
> +    struct kvm_pre_fault_memory mem;
>      uint64_t gpa, size, attributes;
> +    int ret;
>  
>      if (!machine_require_guest_memfd(current_machine))
>          return -EINVAL;
> @@ -6012,13 +6014,32 @@ static int kvm_handle_hc_map_gpa_range(struct kvm_run *run)
>  
>      trace_kvm_hc_map_gpa_range(gpa, size, attributes, run->hypercall.flags);
>  
> -    return kvm_convert_memory(gpa, size, attributes & KVM_MAP_GPA_RANGE_ENCRYPTED);
> +    ret = kvm_convert_memory(gpa, size, attributes & KVM_MAP_GPA_RANGE_ENCRYPTED);
> +    if (ret || !kvm_pre_fault_memory_supported) {
> +        return ret;
> +    }
> +
> +    /*
> +     * Opportunistically pre-fault memory in. Failures are ignored so that any
> +     * errors in faulting in the memory will get captured in KVM page fault
> +     * path when the guest first accesses the page.
> +     */
> +    memset(&mem, 0, sizeof(mem));
> +    mem.gpa = gpa;
> +    mem.size = size;
> +    while (mem.size) {
> +        if (kvm_vcpu_ioctl(CPU(cpu), KVM_PRE_FAULT_MEMORY, &mem)) {
> +            break;
> +        }
> +    }
> +
> +    return 0;
>  }
>  
> -static int kvm_handle_hypercall(struct kvm_run *run)
> +static int kvm_handle_hypercall(X86CPU *cpu, struct kvm_run *run)
>  {
>      if (run->hypercall.nr == KVM_HC_MAP_GPA_RANGE)
> -        return kvm_handle_hc_map_gpa_range(run);
> +        return kvm_handle_hc_map_gpa_range(cpu, run);
>  
>      return -EINVAL;
>  }
> @@ -6118,7 +6139,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
>          break;
>  #endif
>      case KVM_EXIT_HYPERCALL:
> -        ret = kvm_handle_hypercall(run);
> +        ret = kvm_handle_hypercall(cpu, run);
>          break;
>      default:
>          fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
> 
> base-commit: 0f15892acaf3f50ecc20c6dad4b3ebdd701aa93e
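
For anyone skimming the thread, here is a rough standalone sketch of the
two ideas in the patch: the exit count from the commit message (4M / 4K)
and the retry loop around the pre-fault call. It does not touch /dev/kvm.
The struct pre_fault_range, mock_pre_fault() and pre_fault_all() names are
made up for illustration, and the mock assumes the call advances gpa and
shrinks size as it makes progress, which is what the while (mem.size) loop
in the patch appears to rely on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_pre_fault_memory (gpa and size only). */
struct pre_fault_range {
    uint64_t gpa;
    uint64_t size;
};

/* Faults taken if every page in the range is mapped on first touch. */
static uint64_t faults_without_prefault(uint64_t range_bytes,
                                        uint64_t page_bytes)
{
    return range_bytes / page_bytes;
}

/*
 * Mock of the pre-fault call (not the real ioctl): maps at most max_bytes
 * per call and advances the range. Once gpa reaches fail_at it returns -1
 * and leaves the range untouched.
 */
static int mock_pre_fault(struct pre_fault_range *r, uint64_t max_bytes,
                          uint64_t fail_at)
{
    uint64_t done;

    if (r->gpa >= fail_at) {
        return -1;
    }
    done = r->size < max_bytes ? r->size : max_bytes;
    if (r->gpa + done > fail_at) {
        done = fail_at - r->gpa;
    }
    r->gpa += done;
    r->size -= done;
    return 0;
}

/*
 * Same shape as the loop in the patch: keep calling until the range is
 * consumed and stop quietly on the first error. Returns the call count.
 */
static unsigned pre_fault_all(struct pre_fault_range *r, uint64_t max_bytes,
                              uint64_t fail_at)
{
    unsigned calls = 0;

    assert(max_bytes > 0); /* a zero step would never make progress */
    while (r->size) {
        calls++;
        if (mock_pre_fault(r, max_bytes, fail_at)) {
            break;
        }
    }
    return calls;
}
```

With a 4M range and 4K pages, faults_without_prefault() gives the 1024
figure from the commit message. When the mock fails partway, the leftover
size stays non-zero and the caller simply moves on, mirroring the
"failures are ignored" comment in the patch.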

Thread overview: 8+ messages
2025-03-28 20:30 [PATCH] i386/kvm: Prefault memory on page state change Tom Lendacky
2025-06-02 13:17 ` Tom Lendacky [this message]
2025-06-03  7:41 ` Xiaoyao Li
2025-06-03 11:47   ` Xiaoyao Li
2025-06-03 13:40     ` Tom Lendacky
2025-06-03 15:00     ` Paolo Bonzini
2025-06-03 15:31       ` Xiaoyao Li
2025-06-03 16:12       ` Xiaoyao Li
