From: "tip-bot for Chang S. Bae" <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: tglx@linutronix.de, luto@kernel.org, peterz@infradead.org,
	chang.seok.bae@intel.com, luto@amacapital.net, hpa@zytor.com,
	dave.hansen@linux.intel.com, linux-kernel@vger.kernel.org,
	mingo@kernel.org, bp@alien8.de, markus.t.metzger@intel.com,
	torvalds@linux-foundation.org, dvlasenk@redhat.com,
	riel@surriel.com, ravi.v.shankar@intel.com, brgerst@gmail.com
Subject: [tip:x86/asm] x86/fsgsbase/64: Introduce FS/GS base helper functions
Date: Mon, 8 Oct 2018 02:55:28 -0700
Message-ID: <tip-b1378a561fd16afdd96ef0bc912b1bcd2b85a68e@git.kernel.org>
In-Reply-To: <1537312139-5580-3-git-send-email-chang.seok.bae@intel.com>

Commit-ID:  b1378a561fd16afdd96ef0bc912b1bcd2b85a68e
Gitweb:     https://git.kernel.org/tip/b1378a561fd16afdd96ef0bc912b1bcd2b85a68e
Author:     Chang S. Bae <chang.seok.bae@intel.com>
AuthorDate: Tue, 18 Sep 2018 16:08:53 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 8 Oct 2018 10:41:08 +0200

x86/fsgsbase/64: Introduce FS/GS base helper functions

Introduce FS/GS base access functionality via <asm/fsgsbase.h>,
not yet used by anything directly.

Factor out task_seg_base() from x86/ptrace.c and rename it to
x86_fsgsbase_read_task() to make it part of the new helpers.

This will allow us to enhance FSGSBASE support and eventually enable
the FSBASE/GSBASE instructions.

An "inactive" GS base is the user GS base that was saved at kernel
entry and belongs to an inactive (non-running or stopped) user task.
(The typical ptrace model.)

Here are the new functions:

  x86_fsbase_read_task()
  x86_gsbase_read_task()
  x86_fsbase_write_task()
  x86_gsbase_write_task()
  x86_fsbase_read_cpu()
  x86_fsbase_write_cpu()
  x86_gsbase_read_cpu_inactive()
  x86_gsbase_write_cpu_inactive()

As an advantage of the unified namespace we can now see all FS/GSBASE
API use in the kernel via the following 'git grep' pattern:

  $ git grep 'x86_.*sbase'

[ mingo: Wrote new changelog. ]

Based-on-code-from: Andy Lutomirski <luto@kernel.org>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1537312139-5580-3-git-send-email-chang.seok.bae@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fsgsbase.h |  50 ++++++++++++++++
 arch/x86/kernel/process_64.c    | 124 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/ptrace.c        |  51 ++---------------
 3 files changed, 179 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
new file mode 100644
index 000000000000..1ab465ee23fe
--- /dev/null
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_FSGSBASE_H
+#define _ASM_FSGSBASE_H 1
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_X86_64
+
+#include <asm/msr-index.h>
+
+unsigned long x86_fsgsbase_read_task(struct task_struct *task,
+				     unsigned short selector);
+
+/*
+ * Read/write a task's fsbase or gsbase. This returns the value that
+ * the FS/GS base would have (if the task were to be resumed). These
+ * work on current or on a different non-running task.
+ */
+unsigned long x86_fsbase_read_task(struct task_struct *task);
+unsigned long x86_gsbase_read_task(struct task_struct *task);
+int x86_fsbase_write_task(struct task_struct *task, unsigned long fsbase);
+int x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase);
+
+/* Helper functions for reading/writing FS/GS base */
+
+static inline unsigned long x86_fsbase_read_cpu(void)
+{
+	unsigned long fsbase;
+
+	rdmsrl(MSR_FS_BASE, fsbase);
+	return fsbase;
+}
+
+void x86_fsbase_write_cpu(unsigned long fsbase);
+
+static inline unsigned long x86_gsbase_read_cpu_inactive(void)
+{
+	unsigned long gsbase;
+
+	rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
+	return gsbase;
+}
+
+void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
+
+#endif /* CONFIG_X86_64 */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_FSGSBASE_H */
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ea5ea850348d..2a53ff8d1baf 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -54,6 +54,7 @@
 #include <asm/vdso.h>
 #include <asm/intel_rdt_sched.h>
 #include <asm/unistd.h>
+#include <asm/fsgsbase.h>
 #ifdef CONFIG_IA32_EMULATION
 /* Not included via unistd.h */
 #include <asm/unistd_32_ia32.h>
@@ -286,6 +287,129 @@ static __always_inline void load_seg_legacy(unsigned short prev_index,
 	}
 }
 
+unsigned long x86_fsgsbase_read_task(struct task_struct *task,
+				     unsigned short selector)
+{
+	unsigned short idx = selector >> 3;
+	unsigned long base;
+
+	if (likely((selector & SEGMENT_TI_MASK) == 0)) {
+		if (unlikely(idx >= GDT_ENTRIES))
+			return 0;
+
+		/*
+		 * There are no user segments in the GDT with nonzero bases
+		 * other than the TLS segments.
+		 */
+		if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX)
+			return 0;
+
+		idx -= GDT_ENTRY_TLS_MIN;
+		base = get_desc_base(&task->thread.tls_array[idx]);
+	} else {
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+		struct ldt_struct *ldt;
+
+		/*
+		 * If performance here mattered, we could protect the LDT
+		 * with RCU.  This is a slow path, though, so we can just
+		 * take the mutex.
+		 */
+		mutex_lock(&task->mm->context.lock);
+		ldt = task->mm->context.ldt;
+		if (unlikely(idx >= ldt->nr_entries))
+			base = 0;
+		else
+			base = get_desc_base(ldt->entries + idx);
+		mutex_unlock(&task->mm->context.lock);
+#else
+		base = 0;
+#endif
+	}
+
+	return base;
+}
+
+void x86_fsbase_write_cpu(unsigned long fsbase)
+{
+	/*
+	 * Set the selector to 0 to indicate that the segment base has
+	 * been overwritten; the context-switch code checks for this and
+	 * skips the segment load.
+	 */
+	loadseg(FS, 0);
+	wrmsrl(MSR_FS_BASE, fsbase);
+}
+
+void x86_gsbase_write_cpu_inactive(unsigned long gsbase)
+{
+	/* Set the selector to 0 for the same reason as %fs above. */
+	loadseg(GS, 0);
+	wrmsrl(MSR_KERNEL_GS_BASE, gsbase);
+}
+
+unsigned long x86_fsbase_read_task(struct task_struct *task)
+{
+	unsigned long fsbase;
+
+	if (task == current)
+		fsbase = x86_fsbase_read_cpu();
+	else if (task->thread.fsindex == 0)
+		fsbase = task->thread.fsbase;
+	else
+		fsbase = x86_fsgsbase_read_task(task, task->thread.fsindex);
+
+	return fsbase;
+}
+
+unsigned long x86_gsbase_read_task(struct task_struct *task)
+{
+	unsigned long gsbase;
+
+	if (task == current)
+		gsbase = x86_gsbase_read_cpu_inactive();
+	else if (task->thread.gsindex == 0)
+		gsbase = task->thread.gsbase;
+	else
+		gsbase = x86_fsgsbase_read_task(task, task->thread.gsindex);
+
+	return gsbase;
+}
+
+int x86_fsbase_write_task(struct task_struct *task, unsigned long fsbase)
+{
+	/*
+	 * Not strictly needed for %fs, but do it for symmetry
+	 * with %gs
+	 */
+	if (unlikely(fsbase >= TASK_SIZE_MAX))
+		return -EPERM;
+
+	preempt_disable();
+	task->thread.fsbase = fsbase;
+	if (task == current)
+		x86_fsbase_write_cpu(fsbase);
+	task->thread.fsindex = 0;
+	preempt_enable();
+
+	return 0;
+}
+
+int x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase)
+{
+	if (unlikely(gsbase >= TASK_SIZE_MAX))
+		return -EPERM;
+
+	preempt_disable();
+	task->thread.gsbase = gsbase;
+	if (task == current)
+		x86_gsbase_write_cpu_inactive(gsbase);
+	task->thread.gsindex = 0;
+	preempt_enable();
+
+	return 0;
+}
+
 int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 		unsigned long arg, struct task_struct *p, unsigned long tls)
 {
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 3acbf45cb7fb..fbde2a7ce377 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -39,7 +39,7 @@
 #include <asm/hw_breakpoint.h>
 #include <asm/traps.h>
 #include <asm/syscall.h>
-#include <asm/mmu_context.h>
+#include <asm/fsgsbase.h>
 
 #include "tls.h"
 
@@ -343,49 +343,6 @@ static int set_segment_reg(struct task_struct *task,
 	return 0;
 }
 
-static unsigned long task_seg_base(struct task_struct *task,
-				   unsigned short selector)
-{
-	unsigned short idx = selector >> 3;
-	unsigned long base;
-
-	if (likely((selector & SEGMENT_TI_MASK) == 0)) {
-		if (unlikely(idx >= GDT_ENTRIES))
-			return 0;
-
-		/*
-		 * There are no user segments in the GDT with nonzero bases
-		 * other than the TLS segments.
-		 */
-		if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX)
-			return 0;
-
-		idx -= GDT_ENTRY_TLS_MIN;
-		base = get_desc_base(&task->thread.tls_array[idx]);
-	} else {
-#ifdef CONFIG_MODIFY_LDT_SYSCALL
-		struct ldt_struct *ldt;
-
-		/*
-		 * If performance here mattered, we could protect the LDT
-		 * with RCU.  This is a slow path, though, so we can just
-		 * take the mutex.
-		 */
-		mutex_lock(&task->mm->context.lock);
-		ldt = task->mm->context.ldt;
-		if (unlikely(idx >= ldt->nr_entries))
-			base = 0;
-		else
-			base = get_desc_base(ldt->entries + idx);
-		mutex_unlock(&task->mm->context.lock);
-#else
-		base = 0;
-#endif
-	}
-
-	return base;
-}
-
 #endif	/* CONFIG_X86_32 */
 
 static unsigned long get_flags(struct task_struct *task)
@@ -482,13 +439,15 @@ static unsigned long getreg(struct task_struct *task, unsigned long offset)
 		if (task->thread.fsindex == 0)
 			return task->thread.fsbase;
 		else
-			return task_seg_base(task, task->thread.fsindex);
+			return x86_fsgsbase_read_task(task,
+						      task->thread.fsindex);
 	}
 	case offsetof(struct user_regs_struct, gs_base): {
 		if (task->thread.gsindex == 0)
 			return task->thread.gsbase;
 		else
-			return task_seg_base(task, task->thread.gsindex);
+			return x86_fsgsbase_read_task(task,
+						      task->thread.gsindex);
 	}
 #endif
 	}

Thread overview: 27+ messages
2018-09-18 23:08 [PATCH v6 0/8] x86: infrastructure to enable FSGSBASE Chang S. Bae
2018-09-18 23:08 ` [PATCH v6 1/8] x86/arch_prctl/64: Make ptrace read FS/GS base accurately Chang S. Bae
2018-10-08  9:54   ` [tip:x86/asm] x86/fsgsbase/64: Fix ptrace() to read the " tip-bot for Andy Lutomirski
2018-10-08  9:59   ` [tip:x86/asm] x86/segments: Introduce the 'CPUNODE' naming to better document the segment limit CPU/node NR trick tip-bot for Ingo Molnar
2018-10-08  9:59   ` [tip:x86/asm] x86/fsgsbase/64: Clean up various details tip-bot for Ingo Molnar
2018-09-18 23:08 ` [PATCH v6 2/8] x86/fsgsbase/64: Introduce FS/GS base helper functions Chang S. Bae
2018-10-08  9:55   ` tip-bot for Chang S. Bae [this message]
2018-10-24 19:01   ` [regression in -rc1] " Andy Lutomirski
2018-10-24 19:13     ` Bae, Chang Seok
2018-10-24 19:22       ` Andy Lutomirski
2018-10-24 19:29         ` Bae, Chang Seok
2018-10-24 19:43           ` Andy Lutomirski
2018-10-24 22:50             ` Bae, Chang Seok
2018-10-25 22:37     ` Andy Lutomirski
2018-09-18 23:08 ` [PATCH v6 3/8] x86/fsgsbase/64: Make ptrace use correct FS/GS base helpers Chang S. Bae
2018-10-08  9:56   ` [tip:x86/asm] x86/fsgsbase/64: Make ptrace use the new " tip-bot for Chang S. Bae
2018-09-18 23:08 ` [PATCH v6 4/8] x86/fsgsbase/64: Use FS/GS base helpers in core dump Chang S. Bae
2018-10-08  9:56   ` [tip:x86/asm] x86/fsgsbase/64: Convert the ELF core dump code to the new FSGSBASE helpers tip-bot for Chang S. Bae
2018-09-18 23:08 ` [PATCH v6 5/8] x86/fsgsbase/64: Factor out load FS/GS segments from __switch_to() Chang S. Bae
2018-10-08  9:57   ` [tip:x86/asm] x86/fsgsbase/64: Factor out FS/GS segment loading " tip-bot for Chang S. Bae
2018-09-18 23:08 ` [PATCH v6 6/8] x86/segments/64: Rename PER_CPU segment to CPU_NUMBER Chang S. Bae
2018-10-08  9:57   ` [tip:x86/asm] x86/segments/64: Rename the GDT PER_CPU entry " tip-bot for Chang S. Bae
2018-09-18 23:08 ` [PATCH v6 7/8] x86/vdso: Introduce helper functions for CPU and node number Chang S. Bae
2018-10-08  9:58   ` [tip:x86/asm] " tip-bot for Chang S. Bae
2018-09-18 23:08 ` [PATCH v6 8/8] x86/vdso: Move out the CPU initialization Chang S. Bae
2018-10-08  8:36   ` Ingo Molnar
2018-10-08  9:58   ` [tip:x86/asm] x86/vdso: Initialize the CPU/node NR segment descriptor earlier tip-bot for Chang S. Bae
