From: Andrei Vagin <avagin@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Andrei Vagin <avagin@google.com>,
	Sean Christopherson <seanjc@google.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Jianfeng Tan <henry.tjf@antfin.com>,
	Adin Scannell <ascannell@google.com>,
	Konstantin Bogomolov <bogomolov@google.com>,
	Etienne Perot <eperot@google.com>
Subject: [PATCH 3/5] KVM/x86: add a new hypercall to execute host system calls.
Date: Fri, 22 Jul 2022 16:02:39 -0700
Message-ID: <20220722230241.1944655-4-avagin@google.com>
In-Reply-To: <20220722230241.1944655-1-avagin@google.com>

There is a class of applications that use KVM to manage multiple address
spaces rather than as an isolation boundary. In all other respects, they
are normal processes that execute system calls, handle signals, etc.
Currently, each time such a process needs to interact with the operating
system, it has to switch to the host and back to the guest. These full
switches are expensive and significantly increase the overhead of system
calls. The new hypercall reduces this overhead by more than a factor of
two.

The new hypercall lets a guest execute host system calls. As with native
system calls, seccomp filters run before the call is executed. The
hypercall takes one argument: a pointer to a pt_regs structure in the host
address space. The structure supplies the registers for the system call
according to the usual calling convention: the system call number is read
from %rax, arguments are passed in %rdi, %rsi, %rdx, %r10, %r8 and %r9,
and the return code is stored in %rax.
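
The register convention described above can be sketched from the guest
side as follows. This is an illustration only, not part of the patch:
struct hc_regs, hc_regs_prepare() and hc_host_syscall() are hypothetical
names, the structure is assumed to mirror the x86-64 pt_regs layout, and
the caller is assumed to know the host address of the register block.

```c
#include <assert.h>
#include <string.h>

/*
 * Assumption: same field order as the x86-64 struct pt_regs. Named
 * hc_regs here to avoid clashing with the real type.
 */
struct hc_regs {
	unsigned long r15, r14, r13, r12, bp, bx;
	unsigned long r11, r10, r9, r8;
	unsigned long ax, cx, dx, si, di;
	unsigned long orig_ax;
	unsigned long ip, cs, flags, sp, ss;
};

static_assert(sizeof(struct hc_regs) == 21 * sizeof(unsigned long),
	      "hc_regs must not contain padding");

/* Hypothetical helper: fill the register block for one system call. */
static inline void hc_regs_prepare(struct hc_regs *regs, unsigned long nr,
				   unsigned long a1, unsigned long a2,
				   unsigned long a3, unsigned long a4,
				   unsigned long a5, unsigned long a6)
{
	memset(regs, 0, sizeof(*regs));
	regs->ax  = nr;	/* system call number; replaced by the result */
	regs->di  = a1;
	regs->si  = a2;
	regs->dx  = a3;
	regs->r10 = a4;
	regs->r8  = a5;
	regs->r9  = a6;
}

#if defined(__x86_64__)
#define KVM_HC_HOST_SYSCALL 13

/*
 * Hypothetical wrapper, meant to run in guest ring 0. KVM hypercalls take
 * the number in %rax and the first argument in %rbx. VMCALL is the Intel
 * instruction; AMD uses VMMCALL.
 */
static inline long hc_host_syscall(unsigned long host_addr)
{
	long ret;

	__asm__ volatile("vmcall"
			 : "=a"(ret)
			 : "a"((unsigned long)KVM_HC_HOST_SYSCALL),
			   "b"(host_addr)
			 : "memory");
	return ret;
}
#endif
```

A guest would call hc_regs_prepare() on a block that is also mapped in
the host, then pass the host address of that block to hc_host_syscall().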

The hypercall returns 0 if a system call has been executed. Otherwise,
it returns an error code.
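
Note that there are two results to check: the hypercall's own return
value, and the system call's return code written back to the ax field. A
minimal sketch of how a caller might separate them is shown below. The
helper names are hypothetical, and the 4095 bound is the usual Linux
convention for encoding errno values in a return register.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO 4095	/* results in [-4095, -1] encode an errno */

/*
 * Hypothetical helper. hc_ret is the value the hypercall itself returned;
 * regs_ax is the ax field read back from the register block.
 * Returns false if the hypercall failed, so no system call was executed.
 * Otherwise stores the system call's return code in *result.
 */
static inline bool host_syscall_result(long hc_ret, unsigned long regs_ax,
				       long *result)
{
	assert(result != NULL);

	if (hc_ret != 0)
		return false;
	*result = (long)regs_ax;
	return true;
}

/* True if a system call return code encodes an error. */
static inline bool host_syscall_failed(long result)
{
	return (unsigned long)result >= (unsigned long)-MAX_ERRNO;
}
```

The unsigned comparison keeps large valid results, such as addresses
returned by mmap(), from being mistaken for errors.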

Signed-off-by: Andrei Vagin <avagin@google.com>
---
 Documentation/virt/kvm/x86/hypercalls.rst | 18 +++++++++++++
 arch/x86/kvm/x86.c                        | 33 +++++++++++++++++++++++
 include/uapi/linux/kvm_para.h             |  1 +
 3 files changed, 52 insertions(+)

diff --git a/Documentation/virt/kvm/x86/hypercalls.rst b/Documentation/virt/kvm/x86/hypercalls.rst
index e56fa8b9cfca..eb18f2128bfe 100644
--- a/Documentation/virt/kvm/x86/hypercalls.rst
+++ b/Documentation/virt/kvm/x86/hypercalls.rst
@@ -190,3 +190,21 @@ the KVM_CAP_EXIT_HYPERCALL capability. Userspace must enable that capability
 before advertising KVM_FEATURE_HC_MAP_GPA_RANGE in the guest CPUID.  In
 addition, if the guest supports KVM_FEATURE_MIGRATION_CONTROL, userspace
 must also set up an MSR filter to process writes to MSR_KVM_MIGRATION_CONTROL.
+
+9. KVM_HC_HOST_SYSCALL
+----------------------
+:Architecture: x86
+:Status: active
+:Purpose: Execute a specified host system call.
+
+- a0: pointer to a pt_regs structure in the host address space.
+
+This hypercall lets a guest execute host system calls. Its only argument
+points to the process registers, which serve as both input and output
+parameters.
+
+Returns 0 if the requested syscall has been executed. Otherwise, it returns an
+error code.
+
+**Implementation note**: The KVM_CAP_PV_HOST_SYSCALL capability has to be set
+to use this hypercall.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 19e634768161..aa54e180c9d4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -81,6 +81,7 @@
 #include <asm/emulate_prefix.h>
 #include <asm/sgx.h>
 #include <clocksource/hyperv_timer.h>
+#include <asm/syscall.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -9253,6 +9254,27 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
 	return kvm_skip_emulated_instruction(vcpu);
 }
 
+static int kvm_pv_host_syscall(unsigned long a0)
+{
+	struct pt_regs pt_regs = {};
+	unsigned long sysno;
+
+	if (copy_from_user(&pt_regs, (void __user *)a0, sizeof(pt_regs)))
+		return -EFAULT;
+
+	sysno = pt_regs.ax;
+	pt_regs.orig_ax = pt_regs.ax;
+	pt_regs.ax = -ENOSYS;
+
+	do_ksyscall_64(sysno, &pt_regs);
+
+	pt_regs.orig_ax = -1;
+	if (copy_to_user((void __user *)a0, &pt_regs, sizeof(pt_regs)))
+		return -EFAULT;
+
+	return 0;
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -9318,6 +9340,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		kvm_sched_yield(vcpu, a0);
 		ret = 0;
 		break;
+
 	case KVM_HC_MAP_GPA_RANGE: {
 		u64 gpa = a0, npages = a1, attrs = a2;
 
@@ -9340,6 +9363,16 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		vcpu->arch.complete_userspace_io = complete_hypercall_exit;
 		return 0;
 	}
+
+	case KVM_HC_HOST_SYSCALL:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_HOST_SYSCALL))
+			break;
+
+		kvm_vcpu_srcu_read_unlock(vcpu);
+		ret = kvm_pv_host_syscall(a0);
+		kvm_vcpu_srcu_read_lock(vcpu);
+		break;
+
 	default:
 		ret = -KVM_ENOSYS;
 		break;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 960c7e93d1a9..3fcfb3241f35 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -30,6 +30,7 @@
 #define KVM_HC_SEND_IPI		10
 #define KVM_HC_SCHED_YIELD		11
 #define KVM_HC_MAP_GPA_RANGE		12
+#define KVM_HC_HOST_SYSCALL		13
 
 /*
  * hypercalls use architecture specific
-- 
2.37.1.359.gd136c6c3e2-goog


Thread overview: 18+ messages
2022-07-22 23:02 [PATCH 0/5] KVM/x86: add a new hypercall to execute host system Andrei Vagin
2022-07-22 23:02 ` [PATCH 1/5] kernel: add a new helper to execute system calls from kernel code Andrei Vagin
2022-07-22 23:02 ` [PATCH 2/5] kvm/x86: add controls to enable/disable paravirtualized system calls Andrei Vagin
2022-07-22 23:02 ` Andrei Vagin [this message]
2022-07-22 23:02 ` [PATCH 4/5] selftests/kvm/x86_64: set rax before vmcall Andrei Vagin
2022-08-01 11:32   ` Vitaly Kuznetsov
2022-08-01 12:43     ` Paolo Bonzini
2022-07-22 23:02 ` [PATCH 5/5] selftests/kvm/x86_64: add tests for KVM_HC_HOST_SYSCALL Andrei Vagin
2022-07-22 23:41 ` [PATCH 0/5] KVM/x86: add a new hypercall to execute host system Sean Christopherson
2022-07-26  8:33   ` Andrei Vagin
2022-07-26 10:27     ` Paolo Bonzini
2022-07-27  6:44       ` Andrei Vagin
2022-07-26 15:10     ` Sean Christopherson
2022-07-26 22:10       ` Thomas Gleixner
2022-07-27  1:03         ` Andrei Vagin
2022-08-22 20:26           ` Andrei Vagin
2022-07-27  0:25       ` Andrei Vagin
2022-07-26 21:27   ` Thomas Gleixner
