From: Brian Gerst
To: linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ingo Molnar, "H . Peter Anvin", Thomas Gleixner, Borislav Petkov,
    Ard Biesheuvel, Uros Bizjak, Brian Gerst
Subject: [PATCH v6 08/15] x86/percpu/64: Use relative percpu offsets
Date: Thu, 23 Jan 2025 14:07:40 -0500
Message-ID: <20250123190747.745588-9-brgerst@gmail.com>
In-Reply-To: <20250123190747.745588-1-brgerst@gmail.com>
References: <20250123190747.745588-1-brgerst@gmail.com>
X-Mailer: git-send-email 2.47.1
X-Mailing-List: linux-kernel@vger.kernel.org

The percpu section is currently linked at absolute address 0, because
older compilers hardcoded the stack protector canary value at a fixed
offset from the start of the GS segment.  Now that the canary is a
normal percpu variable, the percpu section does not need to be linked
at a specific address.

x86-64 will now calculate the percpu offsets as the delta between the
initial percpu address and the dynamically allocated memory, like other
architectures.  Note that GSBASE is limited to the canonical address
width (48 or 57 bits, sign-extended).  As long as the kernel text,
modules, and the dynamically allocated percpu memory are all in the
negative address space, the delta will not overflow this limit.

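For illustration only, and not part of the patch: a minimal userspace
sketch of the offset calculation and the canonical-width constraint
described above.  The helper names percpu_delta() and is_canonical(),
and all addresses used with them, are hypothetical and do not come from
the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the value that would be loaded into GSBASE for
 * one CPU, i.e. the delta between that CPU's dynamically allocated
 * percpu area and the link-time start of the percpu section.
 * Unsigned subtraction wraps modulo 2^64, which is the intended
 * two's-complement behaviour.
 */
static inline uint64_t percpu_delta(uint64_t cpu_area_base,
				    uint64_t percpu_link_start)
{
	return cpu_area_base - percpu_link_start;
}

/*
 * Hypothetical helper: true if addr is canonical for the given
 * virtual address width, meaning bits 63..(va_bits - 1) are either
 * all zero or all one (a sign-extended va_bits-wide value).
 */
static inline bool is_canonical(uint64_t addr, unsigned int va_bits)
{
	assert(va_bits >= 1 && va_bits <= 64);

	uint64_t top = addr >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return top == 0 || top == all_ones;
}
```

With an assumed percpu link address of 0xffffffff83000000 and a per-CPU
area at 0xffff888100000000 (both in the negative half), percpu_delta()
yields 0xffff88817d000000, which is canonical at 48 bits.  If the area
were instead at 0x00007fffff000000 in the positive half, the delta
would be 0x80007c000000, which is not canonical at 48 bits and so could
not be written to GSBASE on a 48-bit machine.
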
Signed-off-by: Brian Gerst
Reviewed-by: Ard Biesheuvel
Reviewed-by: Uros Bizjak
---
 arch/x86/include/asm/processor.h |  6 +++++-
 arch/x86/kernel/head_64.S        | 19 +++++++++----------
 arch/x86/kernel/setup_percpu.c   | 12 ++----------
 arch/x86/kernel/vmlinux.lds.S    | 29 +----------------------------
 arch/x86/platform/pvh/head.S     |  5 ++---
 arch/x86/tools/relocs.c          | 10 +++-------
 arch/x86/xen/xen-head.S          |  9 ++++-----
 init/Kconfig                     |  2 +-
 8 files changed, 27 insertions(+), 65 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a4687122951f..b8fee88dac3d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -431,7 +431,11 @@ DECLARE_INIT_PER_CPU(fixed_percpu_data);
 
 static inline unsigned long cpu_kernelmode_gs_base(int cpu)
 {
-	return (unsigned long)per_cpu(fixed_percpu_data.gs_base, cpu);
+#ifdef CONFIG_SMP
+	return per_cpu_offset(cpu);
+#else
+	return 0;
+#endif
 }
 
 extern asmlinkage void entry_SYSCALL32_ignore(void);
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index c3d73c04603f..905d8be93220 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -61,11 +61,14 @@ SYM_CODE_START_NOALIGN(startup_64)
 	/* Set up the stack for verify_cpu() */
 	leaq	__top_init_kernel_stack(%rip), %rsp
 
-	/* Setup GSBASE to allow stack canary access for C code */
+	/*
+	 * Set up GSBASE.
+	 * Note that, on SMP, the boot cpu uses init data section until
+	 * the per cpu areas are set up.
+	 */
 	movl	$MSR_GS_BASE, %ecx
-	leaq	INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
-	movl	%edx, %eax
-	shrq	$32, %rdx
+	xorl	%eax, %eax
+	xorl	%edx, %edx
 	wrmsr
 
 	call	startup_64_setup_gdt_idt
@@ -359,16 +362,12 @@ SYM_INNER_LABEL(common_startup_64, SYM_L_LOCAL)
 	movl	%eax,%fs
 	movl	%eax,%gs
 
-	/* Set up %gs.
-	 *
-	 * The base of %gs always points to fixed_percpu_data.
+	/*
+	 * Set up GSBASE.
 	 * Note that, on SMP, the boot cpu uses init data section until
 	 * the per cpu areas are set up.
 	 */
 	movl	$MSR_GS_BASE,%ecx
-#ifndef CONFIG_SMP
-	leaq	INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
-#endif
 	movl	%edx, %eax
 	shrq	$32, %rdx
 	wrmsr
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index b30d6e180df7..1e7be9409aa2 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -23,18 +23,10 @@
 #include
 #include
 
-#ifdef CONFIG_X86_64
-#define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load)
-#else
-#define BOOT_PERCPU_OFFSET 0
-#endif
-
-DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off) = BOOT_PERCPU_OFFSET;
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long, this_cpu_off);
 EXPORT_PER_CPU_SYMBOL(this_cpu_off);
 
-unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init = {
-	[0 ... NR_CPUS-1] = BOOT_PERCPU_OFFSET,
-};
+unsigned long __per_cpu_offset[NR_CPUS] __ro_after_init;
 EXPORT_SYMBOL(__per_cpu_offset);
 
 /*
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 0deb4887d6e9..8a598515239a 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -112,12 +112,6 @@ ASSERT(__relocate_kernel_end - __relocate_kernel_start <= KEXEC_CONTROL_CODE_MAX
 PHDRS {
 	text PT_LOAD FLAGS(5);          /* R_E */
 	data PT_LOAD FLAGS(6);          /* RW_ */
-#ifdef CONFIG_X86_64
-#ifdef CONFIG_SMP
-	percpu PT_LOAD FLAGS(6);        /* RW_ */
-#endif
-	init PT_LOAD FLAGS(7);          /* RWE */
-#endif
 	note PT_NOTE FLAGS(0);          /* ___ */
 }
 
@@ -216,21 +210,7 @@ SECTIONS
 		__init_begin = .; /* paired with __init_end */
 	}
 
-#if defined(CONFIG_X86_64) && defined(CONFIG_SMP)
-	/*
-	 * percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the
-	 * output PHDR, so the next output section - .init.text - should
-	 * start another segment - init.
-	 */
-	PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
-	ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START,
-	       "per-CPU data too large - increase CONFIG_PHYSICAL_START")
-#endif
-
 	INIT_TEXT_SECTION(PAGE_SIZE)
-#ifdef CONFIG_X86_64
-	:init
-#endif
 
 	/*
 	 * Section for code used exclusively before alternatives are run. All
@@ -347,9 +327,7 @@ SECTIONS
 		EXIT_DATA
 	}
 
-#if !defined(CONFIG_X86_64) || !defined(CONFIG_SMP)
 	PERCPU_SECTION(INTERNODE_CACHE_BYTES)
-#endif
 
 	RUNTIME_CONST_VARIABLES
 	RUNTIME_CONST(ptr, USER_PTR_MAX)
@@ -497,16 +475,11 @@ PROVIDE(__ref_stack_chk_guard = __stack_chk_guard);
  * Per-cpu symbols which need to be offset from __per_cpu_load
  * for the boot processor.
  */
-#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x) + __per_cpu_load
+#define INIT_PER_CPU(x) init_per_cpu__##x = ABSOLUTE(x)
 INIT_PER_CPU(gdt_page);
 INIT_PER_CPU(fixed_percpu_data);
 INIT_PER_CPU(irq_stack_backing_store);
 
-#ifdef CONFIG_SMP
-. = ASSERT((fixed_percpu_data == 0),
-           "fixed_percpu_data is not at start of per-cpu area");
-#endif
-
 #ifdef CONFIG_MITIGATION_UNRET_ENTRY
 . = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
 #endif
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index fa0072e0ca43..84bb46f86421 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -179,9 +179,8 @@ SYM_CODE_START(pvh_start_xen)
 	 * the per cpu areas are set up.
 	 */
 	movl	$MSR_GS_BASE,%ecx
-	leaq	INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
-	movq	%edx, %eax
-	shrq	$32, %rdx
+	xorl	%eax, %eax
+	xorl	%edx, %edx
 	wrmsr
 
 	/* Call xen_prepare_pvh() via the kernel virtual mapping */
diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 33dffc5c30b5..9aebc3b18d73 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -835,12 +835,7 @@ static void percpu_init(void)
  */
 static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
 {
-	int shndx = sym_index(sym);
-
-	return (shndx == per_cpu_shndx) &&
-		strcmp(symname, "__init_begin") &&
-		strcmp(symname, "__per_cpu_load") &&
-		strncmp(symname, "init_per_cpu_", 13);
+	return 0;
 }
 
 
@@ -1062,7 +1057,8 @@ static int cmp_relocs(const void *va, const void *vb)
 
 static void sort_relocs(struct relocs *r)
 {
-	qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
+	if (r->count)
+		qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
 }
 
 static int write32(uint32_t v, FILE *f)
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 5d3866ec3100..0aed24540212 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -31,15 +31,14 @@ SYM_CODE_START(startup_xen)
 
 	leaq	__top_init_kernel_stack(%rip), %rsp
 
-	/* Set up %gs.
-	 *
-	 * The base of %gs always points to fixed_percpu_data.
+	/*
+	 * Set up GSBASE.
 	 * Note that, on SMP, the boot cpu uses init data section until
 	 * the per cpu areas are set up.
 	 */
 	movl	$MSR_GS_BASE,%ecx
-	movq	$INIT_PER_CPU_VAR(fixed_percpu_data),%rax
-	cdq
+	xorl	%eax, %eax
+	xorl	%edx, %edx
 	wrmsr
 
 	mov	%rsi, %rdi
diff --git a/init/Kconfig b/init/Kconfig
index 7fe82a46e88c..01d36a84cf66 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1873,7 +1873,7 @@ config KALLSYMS_ALL
 config KALLSYMS_ABSOLUTE_PERCPU
 	bool
 	depends on KALLSYMS
-	default X86_64 && SMP
+	default n
 
 # end of the "standard kernel features (expert users)" menu
 
-- 
2.47.1