From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D43C7171092 for ; Thu, 18 Apr 2024 18:26:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713464775; cv=none; b=PR0PhugD1ME/FOHvhMGh/5iZclArlfjM7JJZ13gT1YDcR/xXt+PKKAuNrbD+6LE5oGPbWbQ4B18+ALAfA7RSPhHSxBgUi3n+XO42XMi0tyLXykY6fA/SUxb+gC1QdyNw+DBCgkUzlpFZH6NAI91jBIu6eqkY4Dru5jw8RbMbjfg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713464775; c=relaxed/simple; bh=FhDtn/1Oncebi5qTzuSBS0nrpK3TbxVu093eT7IPIoA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=hVCFbKPMzmDN3H12MRB9AdSRAIN7SYrtR395RgQFcBnZxhWHZPMs33l68Uu7YQM6Dx5isg+p6YnQ/HFGkvBJfLsT87oYEU6A45OpTEyCstyKW5jkeSJ+sRFf4bqgrBMuol21ubnF5N8/JEN3BfiDMunpLJEQSPfxOfiZhsYInOE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=hTskooSC; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="hTskooSC" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-1e2a553aad6so13252475ad.1 for ; Thu, 18 Apr 2024 11:26:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1713464773; x=1714069573; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=RmUziw85ELNs5DunGMczbwZRDhKt3LNvBNFMebu6TKQ=; b=hTskooSCAkgxaDEFuU3btjif1PSvVS2WsIJ0A3+Ahx4yIBxpUevhls1HOla39JZslH u7bpeZrFVoFVPeiO23p7L5F3DNN131xbSIekK2QzjeGPcvZYDVjIaGZxDCuu9IeRdqAZ A3NakElPETPYe0pP/w78yYkxylgZU4PFquRgxBV9iOOjMGf+Di6bLvGe9GpGDM/OFFj5 3qLskCTb5cBTO+3QpmwKUNV4Aw9tXPAZk5lau/FbvB+Bj7jMNphteEKDoOkDKXRajM1q bgnVA3Ubwp3x4v+/QEoQr0PmAPEhmsZx01NCzlJT8GPaQvfK6cjIrrd8p0W6tqie6SF7 e07w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1713464773; x=1714069573; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=RmUziw85ELNs5DunGMczbwZRDhKt3LNvBNFMebu6TKQ=; b=nbr3jGrlerDJqBHeNQuofPS7PHRuhdG/CoQYKpDfDjXh/f6MyKJt68x46gb7uD/yNe mhvJcEjtlIkOudhaHL5P61lGPD7TsVkmso/a4pZbsqSETFUxjJ8JndJTwTbswTB1sLsM 1DNaMzdRQbrMk+9UYtwR6h8tPcW12T2HFT6pT250WoDV5bNYjv0zeOcfK77MFGz0yZc0 MHjyB35dkjYWSUNUr6UwXHlT84TW4dVQRPZZOf4lj32C+JcZrz2sE20JwzI4XSblbdll 3XQ9RflwEM4MqqCQm+jPbdHSa8GArkI2phSFuVgACujFjUbT4O1LTbdgonhLaJUejaBf 5/lw== X-Forwarded-Encrypted: i=1; AJvYcCUqFj5F4LMtDAPvG6tXQltst/DiGOJNjC0P2GsUGM5hpgm+wY/FXcmmgtPuWPqO3PWd2nJTQ7BkqOmpg0+3bbFBD/ez X-Gm-Message-State: AOJu0YywLLhLktkuq3GcH7BnQFYYLtwRhf5xcnkDrIUPxQOiL3enh9TY NTQXjnFKW01XUAuBD3HelRrduV/fpkjCJ4Qm/Jz8cOYS3AxCTEseQLfH0rft81nxMSu6V3BGgvU 7yQ== X-Google-Smtp-Source: AGHT+IHi4MYznpirLYLsYvqS5mZfsgjkGMGyYW3vGG4C7v/qV79tzP+wzMFnQyGkJtLGYkIfMLxse2hFPSI= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:902:e5c2:b0:1e0:afa0:cc94 with SMTP id u2-20020a170902e5c200b001e0afa0cc94mr8974plf.7.1713464773080; Thu, 18 Apr 2024 11:26:13 -0700 (PDT) Date: Thu, 18 Apr 2024 11:26:11 -0700 In-Reply-To: Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <8f64043a6c393c017347bf8954d92b84b58603ec.1708933498.git.isaku.yamahata@intel.com> <54ae3bbb-34dc-4b10-a14e-2af9e9240ef1@intel.com> <39e9c5606b525f1b2e915be08cc95ac3aecc658b.camel@intel.com> Message-ID: Subject: Re: [PATCH v19 007/130] x86/virt/tdx: Export SEAMCALL functions From: Sean Christopherson To: "kirill.shutemov@linux.intel.com" Cc: Rick P Edgecombe , Tina Zhang , Dave Hansen , Hang Yuan , "x86@kernel.org" , Kai Huang , Bo2 Chen , "sagis@google.com" , "linux-kernel@vger.kernel.org" , "isaku.yamahata@gmail.com" , Erdem Aktas , "kvm@vger.kernel.org" , "pbonzini@redhat.com" , Isaku Yamahata Content-Type: text/plain; charset="us-ascii" On Thu, Apr 18, 2024, kirill.shutemov@linux.intel.com wrote: > On Tue, Apr 16, 2024 at 07:45:18PM +0000, Edgecombe, Rick P wrote: > > On Wed, 2024-04-10 at 15:49 +0300, Kirill A. Shutemov wrote: > > > On Fri, Mar 15, 2024 at 09:33:20AM -0700, Sean Christopherson wrote: > > > > So my feedback is to not worry about the exports, and instead focus on > > > > figuring > > > > out a way to make the generated code less bloated and easier to read/debug. > > > > > > I think it was mistake trying to centralize TDCALL/SEAMCALL calls into > > > few megawrappers. I think we can get better results by shifting leaf > > > function wrappers into assembly. > > > > > > We are going to have more assembly, but it should produce better result. > > > Adding macros can help to write such wrapper and minimizer boilerplate. > > > > > > Below is an example of how it can look like. It's not complete. I only > > > converted TDCALLs, but TDVMCALLs or SEAMCALLs. TDVMCALLs are going to be > > > more complex. > > > > > > Any opinions? Is it something worth investing more time? > > > > We discussed offline how implementing these for each TDVM/SEAMCALL increases the > > chances of a bug in just one TDVM/SEAMCALL. Which could making debugging > > problems more challenging. Kirill raised the possibility of some code generating > > solution like cpufeatures.h, that could take a spec and generate correct calls. > > > > So far no big wins have presented themselves. Kirill, do we think the path to > > move the messy part out-of-line will not work? > > I converted all TDCALL and TDVMCALL leafs to direct assembly wrappers. > Here's WIP branch: https://github.com/intel/tdx/commits/guest-tdx-asm/ > > I still need to clean it up and write commit messages and comments for all > wrappers. > > Now I think it worth the shot. > > Any feedback? I find it hard to review for correctness, and extremely susceptible to developer error. E.g. lots of copy+paste, and manual encoding of RCX to expose registers. It also bleeds TDX ABI into C code, e.g. /* * As per TDX GHCI CPUID ABI, r12-r15 registers contain contents of * EAX, EBX, ECX, EDX registers after the CPUID instruction execution. * So copy the register contents back to pt_regs. */ regs->ax = args.r12; regs->bx = args.r13; regs->cx = args.r14; regs->dx = args.r15; Oh, and it requires input/output paramters, which is quite gross for C code *and* for assembly code, e.g. u64 tdvmcall_map_gpa(u64 *gpa, u64 size); and then the accompanying assembly code: FRAME_BEGIN save_regs r12,r13 movq (%rdi), %r12 movq %rsi, %r13 movq $(TDX_R10 | TDX_R11 | TDX_R12 | TDX_R13), %rcx tdvmcall $TDVMCALL_MAP_GPA movq %r11, (%rdi) restore_regs r13,r12 FRAME_END RET I think having one trampoline makes sense, e.g. to minimize the probability of leaking register state to the VMM. The part that I don't like, and which generates awful code, is shoving register state into a memory structure. The annoying part with the TDX ABI is that it heavily uses r8-r15, and asm() constraints don't play nice with r8-15. But that doesn't mean we can't use asm() with macros, it just means we have to play games with registers. Because REG-REG moves are super cheap, and ignoring the fatal error goofiness, there are at most four inputs. That means having a single trampoline take *all* possible inputs is a non-issue. And we can avoiding polluting the inline code if we bury the register shuffling in the trampoline. And if we use asm() wrappers to call the trampoline, then the trampoline doesn't need to precisely follow the C calling convention. I.e. the trampoline can return with the outputs still in r12-r15, and let the asm() wrappers extract the outputs they want. As it stands, TDVMCALLs either have 0, 1, or 4 outputs. I.e. we only need three asm() wrappers. We could get away with one wrapper, but then users of the wrappers would need dummy variables for inputs *and* outputs, and the outputs get gross. Completely untested, but this is what I'm thinking. Side topic, I think making "tdcall" a macro that takes a leaf is a mistake. If/when an assembler learns what tdcall is, we're going to have to rewrite all of that code. And what a coincidence, my suggestion needs a bare TDCALL! :-) Side topic #2, I don't think the trampoline needs a stack frame, its a leaf function. Side topic #3, the ud2 to induce panic should be out-of-line. Weird? Yeah. But at least we one need to document one weird calling convention, and the ugliness is contained to three macros and a small assembly function. .pushsection .noinstr.text, "ax" SYM_FUNC_START(tdvmcall_trampoline) movq $TDX_HYPERCALL_STANDARD, %r10 movq %rax, %r11 movq %rdi, %r12 movq %rsi, %r13 movq %rdx, %r14 movq %rcx, %r15 movq $(TDX_R10 | TDX_R11 | TDX_R12 | TDX_R13 | TDX_R14 | TDX_R15), %rcx tdcall testq %rax, %rax jnz .Lpanic ret .Lpanic: ud2 SYM_FUNC_END(tdvmcall_trampoline) .popsection #define TDVMCALL(reason, in1, in2, in3, in4) \ ({ \ long __ret; \ \ asm( \ "call tdvmcall_trampoline\n\t" \ "mov %%r10, %0\n\t" \ : "=r" (__ret) \ : "D" (in1), "S"(in2), "d"(in3), "c" (in4) \ : "r12", "r13", "r14", "r15" \ ); \ __ret; \ }) #define TDVMCALL_1(reason, in1, in2, in3, in4, out1) \ ({ \ long __ret; \ \ asm( \ "call tdvmcall_trampoline\n\t" \ "mov %%r10, %0\n\t" \ "mov %%r12, %1\n\t" \ : "=r"(__ret) "=r" (out1) \ : "a"(reason), "D" (in1), "S"(in2), "d"(in3), "c" (in4) \ : "r12", "r13", "r14", "r15" \ ); \ __ret; \ }) #define TDVMCALL_4(reason, in1, in2, in3, in4, out1, out2, out3, out4) \ ({ \ long __ret; \ \ asm( \ "call tdvmcall_trampoline\n\t" \ "mov %%r10, %0\n\t" \ "mov %%r12, %1\n\t" \ "mov %%r13, %2\n\t" \ "mov %%r14, %3\n\t" \ "mov %%r15, %4\n\t" \ : "=r" (__ret), \ "=r" (out1), "=r" (out2), "=r" (out3), "=r" (out4) \ : "a"(reason), "D" (in1), "S"(in2), "d"(in3), "c" (in4) \ [reason] "i" (reason) \ : "r12", "r13", "r14", "r15" \ ); \ __ret; \ }) static int handle_halt(struct ve_info *ve) { if (TDVMCALL(EXIT_REASON_HALT, irqs_disabled(), 0, 0, 0)) return -EIO; return ve_instr_len(ve); } void __cpuidle tdx_safe_halt(void) { WARN_ONCE(TDVMCALL(EXIT_REASON_HALT, false, 0, 0, 0), "HLT instruction emulation failed"); } static int read_msr(struct pt_regs *regs, struct ve_info *ve) { u64 val; if (TDVMCALL_1(EXIT_REASON_MSR_READ, regs->cx, 0, 0, 0, val)) return -EIO; regs->ax = lower_32_bits(val); regs->dx = upper_32_bits(val); return ve_instr_len(ve); } static int write_msr(struct pt_regs *regs, struct ve_info *ve) { u64 val = (u64)regs->dx << 32 | regs->ax; if (TDVMCALL(EXIT_REASON_MSR_WRITE, regs->cx, val, 0, 0)) return -EIO; return ve_instr_len(ve); } static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve) { /* * Only allow VMM to control range reserved for hypervisor * communication. * * Return all-zeros for any CPUID outside the range. It matches CPU * behaviour for non-supported leaf. */ if (regs->ax < 0x40000000 || regs->ax > 0x4FFFFFFF) { regs->ax = regs->bx = regs->cx = regs->dx = 0; return ve_instr_len(ve); } if (TDVMCALL_4(EXIT_REASON_CPUID, regs->ax, regs->cx, 0, 0, regs->ax, regs->bx, regs->cx, regs->dx)) return -EIO; return ve_instr_len(ve); } static bool mmio_read(int size, u64 gpa, u64 *val) { *val = 0; return !TDVMCALL_1(EXIT_REASON_EPT_VIOLATION, size, EPT_READ, gpa, 0, val); } static bool mmio_write(int size, u64 gpa, u64 val) { return !TDVMCALL(EXIT_REASON_EPT_VIOLATION, size, EPT_WRITE, gpa, val); }