Linux Documentation
* [PATCH v2 51/53] usb: fix the comment with regards to DocBook
From: Mauro Carvalho Chehab @ 2017-05-16 12:16 UTC (permalink / raw)
  To: Linux Doc Mailing List
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel,
	Jonathan Corbet, David Woodhouse, Brian Norris, Boris Brezillon,
	Marek Vasut, Richard Weinberger, Cyrille Pitchen, linux-mtd,
	Felipe Balbi, Greg Kroah-Hartman, linux-usb
In-Reply-To: <cover.1494935649.git.mchehab@s-opensource.com>

The USB gadget documentation is no longer in DocBook format.
The main file was converted to ReST and is stored at
Documentation/driver-api/usb/gadget.rst, but there are
still several plain text files related to gadget under
Documentation/usb.

So, be generic and just mention documentation
without specifying where it is.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
 drivers/usb/gadget/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/gadget/Kconfig b/drivers/usb/gadget/Kconfig
index c164d6b788c3..b3c879b75a39 100644
--- a/drivers/usb/gadget/Kconfig
+++ b/drivers/usb/gadget/Kconfig
@@ -41,7 +41,7 @@ menuconfig USB_GADGET
 	   don't have this kind of hardware (except maybe inside Linux PDAs).
 
 	   For more information, see <http://www.linux-usb.org/gadget> and
-	   the kernel DocBook documentation for this API.
+	   the kernel documentation for this API.
 
 if USB_GADGET
 
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH v2 50/53] fs: fix the location of the kernel-api book
From: Mauro Carvalho Chehab @ 2017-05-16 12:16 UTC (permalink / raw)
  To: Linux Doc Mailing List
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel,
	Jonathan Corbet, David Woodhouse, Brian Norris, Boris Brezillon,
	Marek Vasut, Richard Weinberger, Cyrille Pitchen, linux-mtd,
	Greg Kroah-Hartman
In-Reply-To: <cover.1494935649.git.mchehab@s-opensource.com>

The kernel-api book is now part of the core-api. Update its
location.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
 fs/debugfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index e892ae7d89f8..77440e4aa9d4 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -9,7 +9,7 @@
  *	2 as published by the Free Software Foundation.
  *
  *  debugfs is for people to use instead of /proc or /sys.
- *  See Documentation/DocBook/kernel-api for more details.
+ *  See ./Documentation/core-api/kernel-api.rst for more details.
  *
  */
 
-- 
2.9.3


* [PATCH v2 53/53] kernel-doc: describe the ``literal`` syntax
From: Mauro Carvalho Chehab @ 2017-05-16 12:16 UTC (permalink / raw)
  To: Linux Doc Mailing List
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel,
	Jonathan Corbet, David Woodhouse, Brian Norris, Boris Brezillon,
	Marek Vasut, Richard Weinberger, Cyrille Pitchen, linux-mtd,
	Markus Heiser, Jani Nikula, Daniel Vetter
In-Reply-To: <cover.1494935649.git.mchehab@s-opensource.com>

Changeset b97f193abf83 ("scripts/kernel-doc: fix parser
for apostrophes") added support for ``literal`` inside
kernel-doc, in order to allow using the "%" symbol inside
a literal block, as this is used in the printk() description.

Document it.

Fixes: b97f193abf83 ("scripts/kernel-doc: fix parser for apostrophes")
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
---
 Documentation/doc-guide/kernel-doc.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/doc-guide/kernel-doc.rst b/Documentation/doc-guide/kernel-doc.rst
index b32e4813ff6f..b24854b5d6be 100644
--- a/Documentation/doc-guide/kernel-doc.rst
+++ b/Documentation/doc-guide/kernel-doc.rst
@@ -149,6 +149,16 @@ Domain`_ references.
 ``%CONST``
   Name of a constant. (No cross-referencing, just formatting.)
 
+````literal````
+  A literal block that should be handled as-is. The output will use a
+  ``monospaced font``.
+
+  Useful if you need to use special characters that would otherwise have some
+  meaning either by the kernel-doc script or by reStructuredText.
+
+  This is particularly useful if you need to use things like ``%ph`` inside
+  a function description.
+
 ``$ENVVAR``
   Name of an environment variable. (No cross-referencing, just formatting.)
 
-- 
2.9.3


* Re: [PATCH v2 53/53] kernel-doc: describe the ``literal`` syntax
From: Markus Heiser @ 2018-06-06 16:31 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Doc Mailing List, Mauro Carvalho Chehab, linux-kernel,
	Jonathan Corbet, David Woodhouse, Brian Norris, Boris Brezillon,
	Marek Vasut, Richard Weinberger, Cyrille Pitchen, linux-mtd,
	Jani Nikula, Daniel Vetter
In-Reply-To: <5d47c31b59f6c2c7ddce0fcb1be0f85ad39a56fe.1494935649.git.mchehab@s-opensource.com>

Hi Mauro,

> On 16.05.2017 at 14:16, Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote:
> 
> changeset b97f193abf83 ("scripts/kernel-doc: fix parser
> for apostrophes") added support for ``literal`` inside
> kernel-doc, in order to allow using the "%" symbol inside
> a literal block, as this is used at printk() description.
> 
> Document it.
> 
> Fixes: b97f193abf83 ("scripts/kernel-doc: fix parser for apostrophes")
> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
> ---
> Documentation/doc-guide/kernel-doc.rst | 10 ++++++++++
> 1 file changed, 10 insertions(+)
> 
> diff --git a/Documentation/doc-guide/kernel-doc.rst b/Documentation/doc-guide/kernel-doc.rst
> index b32e4813ff6f..b24854b5d6be 100644
> --- a/Documentation/doc-guide/kernel-doc.rst
> +++ b/Documentation/doc-guide/kernel-doc.rst
> @@ -149,6 +149,16 @@ Domain`_ references.
> ``%CONST``
>   Name of a constant. (No cross-referencing, just formatting.)
> 
> +````literal````
> +  A literal block that should be handled as-is. The output will use a
> +  ``monospaced font``.

Just nitpicking: this is not a literal block; it is called an inline literal [1].

To be complete: "blocks" are always indented.

[1] http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#inline-literals
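
To make the distinction concrete, here is a minimal sketch of a kernel-doc
comment that uses both forms. The function toy_hex_dump() and its parameters
are made up for this illustration and are not part of the kernel; the body is
plain userspace C so that it can be compiled on its own. The ``%ph`` in the
running text is an inline literal, while the indented lines after the "::"
form a literal block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/**
 * toy_hex_dump() - format a small buffer as space-separated hex bytes
 * @out: destination string
 * @out_len: size of @out in bytes
 * @buf: bytes to format
 * @len: number of bytes in @buf
 *
 * The output resembles what the ``%ph`` printk() specifier produces.
 * The double backquotes above mark an inline literal. A literal block,
 * by contrast, is introduced by a double colon and is always indented::
 *
 *	char out[16];
 *	toy_hex_dump(out, sizeof(out), buf, 3);
 *
 * Return: number of characters written, excluding the trailing NUL.
 */
static size_t toy_hex_dump(char *out, size_t out_len,
			   const unsigned char *buf, size_t len)
{
	size_t pos = 0;
	size_t i;

	assert(out != NULL && out_len > 0);
	out[0] = '\0';
	for (i = 0; i < len; i++) {
		/* "xx" for the first byte, " xx" afterwards, plus the NUL. */
		size_t need = i ? 3 : 2;

		if (pos + need + 1 > out_len)
			break;
		pos += (size_t)snprintf(out + pos, out_len - pos,
					i ? " %02x" : "%02x", buf[i]);
	}
	return pos;
}
```

Only the comment matters for the documentation question; the function body is
there so that the example is a complete compilation unit.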

--- Markus --


* Re: [PATCH] cpuset: Enforce that a child's cpus must be a subset of the parent
From: Tejun Heo @ 2018-06-06 20:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Zefan Li, Johannes Weiner, Ingo Molnar, cgroups,
	linux-kernel, linux-doc, kernel-team, pjt, luto, Mike Galbraith,
	torvalds, Roman Gushchin, Juri Lelli, Patrick Bellasi
In-Reply-To: <20180531163826.GO12180@hirez.programming.kicks-ass.net>

Hello, Peter.

Sorry about the late reply.

On Thu, May 31, 2018 at 06:38:26PM +0200, Peter Zijlstra wrote:
> > Yeah, for cpuset, it's messier, but it isn't different from hotunplug
> > scenario, right?  I think the best we can do there is putting ancestor
> > operation on an equal footing as hotplug ops.
> 
> Right, but hotplug is exceedingly rare, while I get the impression you
> think it is perfectly fine to rescind on your resource grants.

Well, yeah, for a trivial example, imagine dynamic workload management
where you wanna restrict what a side-loaded batch workload can do on
and off peak hours.  All other controllers can do that.  It'd be a
really odd design trade-off if we make that really clumsy for cpuset
especially given that we wouldn't be gaining any actual
functionalities.

Thanks.

-- 
tejun

* [PATCH] doc: Update synchronize_rcu() definition in whatisRCU.txt
From: Andrea Parri @ 2018-06-07 10:01 UTC (permalink / raw)
  To: linux-kernel, linux-doc
  Cc: Andrea Parri, Paul E . McKenney, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Jonathan Corbet

The synchronize_rcu() definition based on RW-locks in whatisRCU.txt
does not meet the "Memory-Barrier Guarantees" in Requirements.html;
for example, the following SB-like test:

    P0:                      P1:

    WRITE_ONCE(x, 1);        WRITE_ONCE(y, 1);
    synchronize_rcu();       smp_mb();
    r0 = READ_ONCE(y);       r1 = READ_ONCE(x);

should not be allowed to reach the state "r0 = 0 AND r1 = 0", but
the current write_lock()+write_unlock() definition cannot ensure
this.  Remedy this by inserting an smp_mb__after_spinlock().

Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
---
 Documentation/RCU/whatisRCU.txt | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index a27fbfb0efb82..86a54ff911fc2 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -586,6 +586,7 @@ It is extremely simple:
 	void synchronize_rcu(void)
 	{
 		write_lock(&rcu_gp_mutex);
+		smp_mb__after_spinlock();
 		write_unlock(&rcu_gp_mutex);
 	}
 
@@ -607,12 +608,15 @@ don't forget about them when submitting patches making use of RCU!]
 
 The rcu_read_lock() and rcu_read_unlock() primitive read-acquire
 and release a global reader-writer lock.  The synchronize_rcu()
-primitive write-acquires this same lock, then immediately releases
-it.  This means that once synchronize_rcu() exits, all RCU read-side
-critical sections that were in progress before synchronize_rcu() was
-called are guaranteed to have completed -- there is no way that
-synchronize_rcu() would have been able to write-acquire the lock
-otherwise.
+primitive write-acquires this same lock, then releases it.  This means
+that once synchronize_rcu() exits, all RCU read-side critical sections
+that were in progress before synchronize_rcu() was called are guaranteed
+to have completed -- there is no way that synchronize_rcu() would have
+been able to write-acquire the lock otherwise.  The smp_mb__after_spinlock()
+promotes synchronize_rcu() to a full memory barrier in compliance with
+the "Memory-Barrier Guarantees" listed in:
+
+	Documentation/RCU/Design/Requirements/Requirements.html.
 
 It is possible to nest rcu_read_lock(), since reader-writer locks may
 be recursively acquired.  Note also that rcu_read_lock() is immune
-- 
2.7.4


* [PATCH v4 1/2] arm64: KVM: export the capability to set guest SError syndrome
From: Dongjiu Geng @ 2018-06-07 20:03 UTC (permalink / raw)
  To: rkrcmar, corbet, christoffer.dall, marc.zyngier, linux,
	catalin.marinas, will.deacon, kvm, linux-doc, james.morse,
	gengdongjiu, linux-arm-kernel, linux-kernel, linux-acpi
In-Reply-To: <1528401833-29963-1-git-send-email-gengdongjiu@huawei.com>

For the arm64 RAS Extension, user space can inject a virtual SError
with a specified ESR. User space therefore needs to know whether KVM
supports injecting such an SError; this patch adds a capability that
user space can query.

KVM checks whether the system supports the RAS Extension. If it does,
KVM returns true to user space; otherwise it returns false.

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
---
 Documentation/virtual/kvm/api.txt | 11 +++++++++++
 arch/arm64/kvm/reset.c            |  3 +++
 include/uapi/linux/kvm.h          |  1 +
 3 files changed, 15 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 758bf40..fdac969 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4603,3 +4603,14 @@ Architectures: s390
 This capability indicates that kvm will implement the interfaces to handle
 reset, migration and nested KVM for branch prediction blocking. The stfle
 facility 82 should not be provided to the guest without this capability.
+
+8.14 KVM_CAP_ARM_INJECT_SERROR_ESR
+
+Architectures: arm, arm64
+
+This capability indicates that userspace can specify the syndrome value reported
+to the guest OS when the guest takes a virtual SError interrupt exception.
+If KVM has this capability, userspace can only specify the ISS field for the ESR
+syndrome; it cannot specify the EC field, which is not under the control of KVM.
+If this virtual SError is taken to EL1 using AArch64, this value will be reported
+in the ISS field of ESR_EL1.
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 3256b92..38c8a64 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -77,6 +77,9 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_PMU_V3:
 		r = kvm_arm_support_pmu_v3();
 		break;
+	case KVM_CAP_ARM_INJECT_SERROR_ESR:
+		r = cpus_have_const_cap(ARM64_HAS_RAS_EXTN);
+		break;
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
 		r = 1;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b02c41e..e88f976 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -948,6 +948,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_BPB 152
 #define KVM_CAP_GET_MSR_FEATURES 153
 #define KVM_CAP_HYPERV_EVENTFD 154
+#define KVM_CAP_ARM_INJECT_SERROR_ESR 155
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.7.4
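
The api.txt text in this patch says that userspace controls only the ISS
field of the syndrome and never the EC field. A minimal sketch of what that
restriction means at the bit level follows. The TOY_ macros and toy_
functions are local stand-ins invented for this example; their values follow
the AArch64 ESR layout (ISS in bits [24:0], EC in bits [31:26]), and nothing
here is the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ESR_ISS_MASK	0x01FFFFFFULL		/* bits [24:0] */
#define TOY_ESR_EC_SHIFT	26
#define TOY_ESR_EC_MASK		(0x3FULL << TOY_ESR_EC_SHIFT)	/* bits [31:26] */

/* Extract the exception class from a syndrome value. */
static unsigned int toy_esr_ec(uint64_t esr)
{
	return (unsigned int)((esr & TOY_ESR_EC_MASK) >> TOY_ESR_EC_SHIFT);
}

/* Keep only the part of a user-supplied syndrome that userspace may set. */
static uint64_t toy_vsesr_from_user(uint64_t user_esr)
{
	uint64_t vsesr = user_esr & TOY_ESR_ISS_MASK;

	/* Whatever userspace passed in, no EC bits survive the mask. */
	assert((vsesr & TOY_ESR_EC_MASK) == 0);
	return vsesr;
}
```

For example, a user-supplied value of 0xBE000011 carries EC 0x2F in its top
bits; after masking, only the ISS value 0x11 remains.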


* [PATCH v4 0/2] support exception state migration and set VSESR_EL2 by user space
From: Dongjiu Geng @ 2018-06-07 20:03 UTC (permalink / raw)
  To: rkrcmar, corbet, christoffer.dall, marc.zyngier, linux,
	catalin.marinas, will.deacon, kvm, linux-doc, james.morse,
	gengdongjiu, linux-arm-kernel, linux-kernel, linux-acpi

This patch series is split out from https://www.spinics.net/lists/kvm/msg168917.html

1. Detect whether KVM can set the guest SError syndrome.
2. Support setting VSESR_EL2 and injecting an SError from user space.
3. Support live migration by keeping the SError pending state and the VSESR_EL2 value.

The user space patch is here: https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg06965.html

Dongjiu Geng (2):
  arm64: KVM: export the capability to set guest SError syndrome
  arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS

 Documentation/virtual/kvm/api.txt    | 42 +++++++++++++++++++++++++++++++++---
 arch/arm/include/asm/kvm_host.h      |  6 ++++++
 arch/arm/include/uapi/asm/kvm.h      | 12 +++++++++++
 arch/arm/kvm/guest.c                 | 12 +++++++++++
 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  7 ++++++
 arch/arm64/include/uapi/asm/kvm.h    | 13 +++++++++++
 arch/arm64/kvm/guest.c               | 36 +++++++++++++++++++++++++++++++
 arch/arm64/kvm/inject_fault.c        |  6 +++---
 arch/arm64/kvm/reset.c               |  4 ++++
 include/uapi/linux/kvm.h             |  1 +
 virt/kvm/arm/arm.c                   | 19 ++++++++++++++++
 12 files changed, 157 insertions(+), 6 deletions(-)

-- 
2.7.4


* [PATCH v4 2/2] arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS
From: Dongjiu Geng @ 2018-06-07 20:03 UTC (permalink / raw)
  To: rkrcmar, corbet, christoffer.dall, marc.zyngier, linux,
	catalin.marinas, will.deacon, kvm, linux-doc, james.morse,
	gengdongjiu, linux-arm-kernel, linux-kernel, linux-acpi
In-Reply-To: <1528401833-29963-1-git-send-email-gengdongjiu@huawei.com>

When migrating VMs, user space may need to know the exception
state. For example, if KVM on machine A has made an SError pending,
then after migration to machine B, KVM there also needs to pend an SError.

This new IOCTL exports user-invisible states related to SError.
Together with appropriate user space changes, user space can get/set
the SError exception state to do migrate/snapshot/suspend.
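
The intended flow can be modelled in a few lines of plain C. This is a toy
model only, written to show the get-on-A, set-on-B round trip; struct
toy_vcpu, struct toy_events and the toy_ functions are hypothetical stand-ins
and not the kernel implementation. In particular, the toy records no syndrome
when none is supplied, whereas the patch calls kvm_inject_vabt() in that case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ISS_MASK 0x01FFFFFFULL /* ESR ISS field, bits [24:0] */

struct toy_vcpu {
	bool serror_pending;	/* models the HCR_VSE bit */
	uint64_t vsesr;		/* models VSESR_EL2 */
};

struct toy_events {
	uint8_t serror_pending;
	uint8_t serror_has_esr;
	uint64_t serror_esr;
};

/* Models the GET side; has_ras says whether the host has the RAS Extension. */
static void toy_get_events(const struct toy_vcpu *v, bool has_ras,
			   struct toy_events *ev)
{
	assert(v != NULL && ev != NULL);
	ev->serror_pending = v->serror_pending;
	ev->serror_has_esr = has_ras;
	/* Report a syndrome only when it is meaningful, never a stale one. */
	ev->serror_esr = (ev->serror_pending && ev->serror_has_esr) ?
			 v->vsesr : 0;
}

/* Models the SET side on the destination host. */
static int toy_set_events(struct toy_vcpu *v, bool has_ras,
			  const struct toy_events *ev)
{
	assert(v != NULL && ev != NULL);
	if (ev->serror_pending && ev->serror_has_esr) {
		if (!has_ras)
			return -EINVAL;	/* cannot honour a syndrome here */
		v->vsesr = ev->serror_esr & TOY_ISS_MASK;
		v->serror_pending = true;
	} else if (ev->serror_pending) {
		v->vsesr = 0;		/* toy choice, see the note above */
		v->serror_pending = true;
	}
	return 0;
}
```

A migration tool would call the GET side on the source, transfer the
structure, and call the SET side on the destination; a destination without
the RAS Extension rejects a request that carries a syndrome.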

Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
---
Changes since v3:
1. Fix the memset() issue in kvm_arm_vcpu_get_events()

Changes since v2:
1. Add the kvm_vcpu_events structure definition for the arm platform to avoid build errors.

Changes since v1:
Address Marc's comments; thanks to Marc for the review.
1. serror_has_esr is always true when ARM64_HAS_RAS_EXTN is set
2. Remove the spurious blank line in kvm_arm_vcpu_set_events()
3. Rename pend_guest_serror() to kvm_set_sei_esr()
4. Make kvm_arm_vcpu_get_events() do all the work rather than having this split responsibility.
5. Use sizeof(events) instead of sizeof(struct kvm_vcpu_events)

This patch series is split out from https://www.spinics.net/lists/kvm/msg168917.html
The user space patch is here: https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg06965.html

Changes since V12:
1. Change (vcpu->arch.hcr_el2 & HCR_VSE) to !!(vcpu->arch.hcr_el2 & HCR_VSE) in kvm_arm_vcpu_get_events()

Changes since V11:
Address James's comments; thanks, James.
1. Align the struct of kvm_vcpu_events to 64 bytes
2. Avoid exposing the stale ESR value in kvm_arm_vcpu_get_events()
3. Rename the variable 'injected' to 'serror_pending' in kvm_arm_vcpu_set_events()
4. Change to sizeof(events) from sizeof(struct kvm_vcpu_events) in kvm_arch_vcpu_ioctl()

Changes since V10:
Address James's comments; thanks, James.
1. Merge the helper function with the user.
2. Move the ISS_MASK into pend_guest_serror() to clear the top bits
3. Make the kvm_vcpu_events struct align to 4 bytes
4. Add some checks in kvm_arm_vcpu_set_events()
5. Check the return value of kvm_arm_vcpu_get/set_events().
6. Initialise kvm_vcpu_events to 0 so that padding transferred to user space doesn't
   contain kernel stack.
---
 Documentation/virtual/kvm/api.txt    | 31 ++++++++++++++++++++++++++++---
 arch/arm/include/asm/kvm_host.h      |  6 ++++++
 arch/arm/include/uapi/asm/kvm.h      | 12 ++++++++++++
 arch/arm/kvm/guest.c                 | 12 ++++++++++++
 arch/arm64/include/asm/kvm_emulate.h |  5 +++++
 arch/arm64/include/asm/kvm_host.h    |  7 +++++++
 arch/arm64/include/uapi/asm/kvm.h    | 13 +++++++++++++
 arch/arm64/kvm/guest.c               | 36 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/inject_fault.c        |  6 +++---
 arch/arm64/kvm/reset.c               |  1 +
 virt/kvm/arm/arm.c                   | 19 +++++++++++++++++++
 11 files changed, 142 insertions(+), 6 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index fdac969..8896737 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -835,11 +835,13 @@ struct kvm_clock_data {
 
 Capability: KVM_CAP_VCPU_EVENTS
 Extended by: KVM_CAP_INTR_SHADOW
-Architectures: x86
+Architectures: x86, arm, arm64
 Type: vm ioctl
 Parameters: struct kvm_vcpu_event (out)
 Returns: 0 on success, -1 on error
 
+X86:
+
 Gets currently pending exceptions, interrupts, and NMIs as well as related
 states of the vcpu.
 
@@ -881,15 +883,32 @@ Only two fields are defined in the flags field:
 - KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
   smi contains a valid state.
 
+ARM, ARM64:
+
+Gets currently pending SError exceptions as well as related states of the vcpu.
+
+struct kvm_vcpu_events {
+	struct {
+		__u8 serror_pending;
+		__u8 serror_has_esr;
+		/* Align it to 8 bytes */
+		__u8 pad[6];
+		__u64 serror_esr;
+	} exception;
+	__u32 reserved[12];
+};
+
 4.32 KVM_SET_VCPU_EVENTS
 
-Capability: KVM_CAP_VCPU_EVENTS
+Capability: KVM_CAP_VCPU_EVENTS
 Extended by: KVM_CAP_INTR_SHADOW
-Architectures: x86
+Architectures: x86, arm, arm64
 Type: vm ioctl
 Parameters: struct kvm_vcpu_event (in)
 Returns: 0 on success, -1 on error
 
+X86:
+
 Set pending exceptions, interrupts, and NMIs as well as related states of the
 vcpu.
 
@@ -910,6 +929,12 @@ shall be written into the VCPU.
 
 KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.
 
+ARM, ARM64:
+
+Set pending SError exceptions as well as related states of the vcpu.
+
+See KVM_GET_VCPU_EVENTS for the data structure.
+
 
 4.33 KVM_GET_DEBUGREGS
 
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index c7c28c8..39f9901 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -213,6 +213,12 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
 int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
 int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
 int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events);
+
+int kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events);
+
 unsigned long kvm_call_hyp(void *hypfn, ...);
 void force_vm_exit(const cpumask_t *mask);
 
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index caae484..c3e6975 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -124,6 +124,18 @@ struct kvm_sync_regs {
 struct kvm_arch_memory_slot {
 };
 
+/* for KVM_GET/SET_VCPU_EVENTS */
+struct kvm_vcpu_events {
+	struct {
+		__u8 serror_pending;
+		__u8 serror_has_esr;
+		/* Align it to 8 bytes */
+		__u8 pad[6];
+		__u64 serror_esr;
+	} exception;
+	__u32 reserved[12];
+};
+
 /* If you need to interpret the index values, here is the key: */
 #define KVM_REG_ARM_COPROC_MASK		0x000000000FFF0000
 #define KVM_REG_ARM_COPROC_SHIFT	16
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index a18f33e..c685f0e 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -261,6 +261,18 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 	return -EINVAL;
 }
 
+int kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events)
+{
+	return -EINVAL;
+}
+
+int kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events)
+{
+	return -EINVAL;
+}
+
 int __attribute_const__ kvm_target_cpu(void)
 {
 	switch (read_cpuid_part()) {
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 1dab3a9..18f61ff 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -81,6 +81,11 @@ static inline unsigned long *vcpu_hcr(struct kvm_vcpu *vcpu)
 	return (unsigned long *)&vcpu->arch.hcr_el2;
 }
 
+static inline unsigned long vcpu_get_vsesr(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.vsesr_el2;
+}
+
 static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
 {
 	vcpu->arch.vsesr_el2 = vsesr;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 469de8a..357304a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -335,6 +335,11 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu);
 int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
 int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
 int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events);
+
+int kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events);
 
 #define KVM_ARCH_WANT_MMU_NOTIFIER
 int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
@@ -363,6 +368,8 @@ void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);
+
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
 void __kvm_set_tpidr_el2(u64 tpidr_el2);
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 04b3256..df4faee 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -39,6 +39,7 @@
 #define __KVM_HAVE_GUEST_DEBUG
 #define __KVM_HAVE_IRQ_LINE
 #define __KVM_HAVE_READONLY_MEM
+#define __KVM_HAVE_VCPU_EVENTS
 
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
 
@@ -153,6 +154,18 @@ struct kvm_sync_regs {
 struct kvm_arch_memory_slot {
 };
 
+/* for KVM_GET/SET_VCPU_EVENTS */
+struct kvm_vcpu_events {
+	struct {
+		__u8 serror_pending;
+		__u8 serror_has_esr;
+		/* Align it to 8 bytes */
+		__u8 pad[6];
+		__u64 serror_esr;
+	} exception;
+	__u32 reserved[12];
+};
+
 /* If you need to interpret the index values, here is the key: */
 #define KVM_REG_ARM_COPROC_MASK		0x000000000FFF0000
 #define KVM_REG_ARM_COPROC_SHIFT	16
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 56a0260..60028f7 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -289,6 +289,42 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 	return -EINVAL;
 }
 
+int kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events)
+{
+	memset(events, 0, sizeof(*events));
+
+	events->exception.serror_pending = !!(vcpu->arch.hcr_el2 & HCR_VSE);
+	events->exception.serror_has_esr =
+					cpus_have_const_cap(ARM64_HAS_RAS_EXTN);
+
+	if (events->exception.serror_pending &&
+		events->exception.serror_has_esr)
+		events->exception.serror_esr = vcpu_get_vsesr(vcpu);
+	else
+		events->exception.serror_esr = 0;
+
+	return 0;
+}
+
+int kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
+			struct kvm_vcpu_events *events)
+{
+	bool serror_pending = events->exception.serror_pending;
+	bool has_esr = events->exception.serror_has_esr;
+
+	if (serror_pending && has_esr) {
+		if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+			return -EINVAL;
+
+		kvm_set_sei_esr(vcpu, events->exception.serror_esr);
+	} else if (serror_pending) {
+		kvm_inject_vabt(vcpu);
+	}
+
+	return 0;
+}
+
 int __attribute_const__ kvm_target_cpu(void)
 {
 	unsigned long implementor = read_cpuid_implementor();
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index d8e7165..a55e91d 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -164,9 +164,9 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 		inject_undef64(vcpu);
 }
 
-static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
+void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
 {
-	vcpu_set_vsesr(vcpu, esr);
+	vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
 	*vcpu_hcr(vcpu) |= HCR_VSE;
 }
 
@@ -184,5 +184,5 @@ static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
  */
 void kvm_inject_vabt(struct kvm_vcpu *vcpu)
 {
-	pend_guest_serror(vcpu, ESR_ELx_ISV);
+	kvm_set_sei_esr(vcpu, ESR_ELx_ISV);
 }
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 38c8a64..20e919a 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -82,6 +82,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
+	case KVM_CAP_VCPU_EVENTS:
 		r = 1;
 		break;
 	default:
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a4c1b76..79ecba9 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1107,6 +1107,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_arm_vcpu_has_attr(vcpu, &attr);
 		break;
 	}
+	case KVM_GET_VCPU_EVENTS: {
+		struct kvm_vcpu_events events;
+
+		if (kvm_arm_vcpu_get_events(vcpu, &events))
+			return -EINVAL;
+
+		if (copy_to_user(argp, &events, sizeof(events)))
+			return -EFAULT;
+
+		return 0;
+	}
+	case KVM_SET_VCPU_EVENTS: {
+		struct kvm_vcpu_events events;
+
+		if (copy_from_user(&events, argp, sizeof(events)))
+			return -EFAULT;
+
+		return kvm_arm_vcpu_set_events(vcpu, &events);
+	}
 	default:
 		r = -EINVAL;
 	}
-- 
2.7.4
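
The layout of the kvm_vcpu_events structure added above can be sanity-checked
outside the kernel. The sketch below mirrors the structure with <stdint.h>
types in place of __u8, __u32 and __u64; the name struct toy_kvm_vcpu_events
is made up so that it cannot clash with the real uapi header. It assumes the
usual ABI in which a 64-bit integer is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the structure from this patch, for layout checks only. */
struct toy_kvm_vcpu_events {
	struct {
		uint8_t serror_pending;
		uint8_t serror_has_esr;
		/* Pads the two flag bytes up to 8 bytes. */
		uint8_t pad[6];
		uint64_t serror_esr;
	} exception;
	uint32_t reserved[12];
};

/* 2 flag bytes + 6 pad bytes place the syndrome at offset 8. */
static_assert(offsetof(struct toy_kvm_vcpu_events, exception.serror_esr) == 8,
	      "serror_esr should start at byte 8");

/* 16 bytes of exception state + 12 * 4 reserved bytes = 64 bytes. */
static_assert(sizeof(struct toy_kvm_vcpu_events) == 64,
	      "structure should be 64 bytes");
```

Because the explicit pad[] leaves no compiler-inserted holes, zeroing the
whole structure before filling it in is enough to avoid copying stray stack
bytes to user space.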


* [PATCH 0/5] Control Flow Enforcement - Part (1)
From: Yu-cheng Yu @ 2018-06-07 14:35 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu

Control flow enforcement technology (CET) is an upcoming Intel
processor family feature that prevents return/jmp-oriented
programming attacks.  It has two components: shadow stack (SHSTK)
and indirect branch tracking (IBT).

The specification is at:

  https://software.intel.com/sites/default/files/managed/4d/2a/
  control-flow-enforcement-technology-preview.pdf

The SHSTK is a secondary stack allocated from system memory.
The CALL instruction stores a secure copy of the return address
on the SHSTK; the RET instruction compares the return address
from the program stack to the SHSTK copy.  Any mismatch
triggers a control protection fault.
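
As a rough software illustration of the CALL/RET behaviour described above,
the sketch below keeps a second stack of return addresses and compares on
return. It is a toy model with invented names (struct toy_shstk, toy_call(),
toy_ret()) and a fixed depth chosen for the example; the real mechanism is
implemented by the processor and is not visible as C code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SHSTK_DEPTH 64

struct toy_shstk {
	uintptr_t slots[TOY_SHSTK_DEPTH];
	size_t top;	/* number of entries currently stored */
};

/* Models CALL: keep a protected copy of the return address. */
static bool toy_call(struct toy_shstk *s, uintptr_t ret_addr)
{
	assert(s != NULL);
	if (s->top == TOY_SHSTK_DEPTH)
		return false;	/* toy overflow, not modelled further */
	s->slots[s->top++] = ret_addr;
	return true;
}

/*
 * Models RET: compare the address taken from the program stack with the
 * shadow copy. Returns false where hardware would raise a control
 * protection fault.
 */
static bool toy_ret(struct toy_shstk *s, uintptr_t addr_from_program_stack)
{
	assert(s != NULL);
	if (s->top == 0)
		return false;
	return s->slots[--s->top] == addr_from_program_stack;
}
```

A return whose address was overwritten on the program stack no longer matches
the shadow copy, so toy_ret() reports the mismatch.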

When the IBT is enabled, the processor verifies an indirect
CALL/JMP destination is an ENDBR instruction; otherwise, it
raises a control protection fault.  The compiler inserts ENDBRs
at all valid branch targets.

CET can be enabled for both kernel and user mode protection.
The Linux kernel patches being posted are for user-mode
protection.  They are grouped into four series:

  (1) CPUID enumeration, CET XSAVES system states, and
      documentation;
  (2) Kernel config, exception handling, and memory management
      changes;
  (3) SHSTK support;
  (4) IBT support, command-line tool, PTRACE.

Yu-cheng Yu (5):
  x86/cpufeatures: Add CPUIDs for Control-flow Enforcement Technology
    (CET)
  x86/fpu/xstate: Change some names to separate XSAVES system and user
    states
  x86/fpu/xstate: Enable XSAVES system states
  x86/fpu/xstate: Add XSAVES system states for shadow stack
  Documentation/x86: Add CET description

 Documentation/admin-guide/kernel-parameters.txt |   6 +
 Documentation/x86/intel_cet.txt                 | 161 ++++++++++++++++++++++++
 arch/x86/include/asm/cpufeatures.h              |   2 +
 arch/x86/include/asm/fpu/internal.h             |   6 +-
 arch/x86/include/asm/fpu/types.h                |  22 ++++
 arch/x86/include/asm/fpu/xstate.h               |  31 ++---
 arch/x86/include/uapi/asm/processor-flags.h     |   2 +
 arch/x86/kernel/cpu/scattered.c                 |   1 +
 arch/x86/kernel/fpu/core.c                      |  11 +-
 arch/x86/kernel/fpu/init.c                      |  10 --
 arch/x86/kernel/fpu/signal.c                    |   6 +-
 arch/x86/kernel/fpu/xstate.c                    | 152 +++++++++++++---------
 12 files changed, 319 insertions(+), 91 deletions(-)
 create mode 100644 Documentation/x86/intel_cet.txt

-- 
2.15.1


* [PATCH 3/5] x86/fpu/xstate: Enable XSAVES system states
From: Yu-cheng Yu @ 2018-06-07 14:35 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143544.3477-1-yu-cheng.yu@intel.com>

XSAVES saves both system and user states.  The Linux kernel
currently does not save/restore any system states.  This patch
creates the framework for supporting system states.
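
As a rough user-space sketch of how the masks relate (the struct and
function names here are hypothetical, and the per-feature
boot_cpu_has() filtering done in fpu__init_system_xstate() is left
out): the combined mask is limited to what the kernel supports, the
user part goes to XCR0, and the remainder goes to IA32_XSS.

```c
#include <stdint.h>

/* Hypothetical container for the masks computed at boot. */
struct xfeature_masks {
	uint64_t all;	/* user + system states managed by the kernel */
	uint64_t user;	/* subset programmed into XCR0 */
	uint64_t xss;	/* subset programmed into IA32_XSS */
};

static struct xfeature_masks
split_xfeatures(uint64_t cpu_user, uint64_t cpu_system,
		uint64_t kernel_supported)
{
	struct xfeature_masks m;

	/* Everything the CPU enumerates, limited to what the kernel knows. */
	m.all = (cpu_user | cpu_system) & kernel_supported;
	/* Only user states may be set in XCR0. */
	m.user = m.all & cpu_user;
	/* Whatever remains is a system state and belongs in IA32_XSS. */
	m.xss = m.all & ~m.user;
	return m;
}
```

With no system states in the supported mask, xss is 0 and the result
matches the behavior before this patch.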

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/fpu/internal.h |   3 +-
 arch/x86/include/asm/fpu/xstate.h   |   9 +--
 arch/x86/kernel/fpu/core.c          |   7 ++-
 arch/x86/kernel/fpu/init.c          |  10 ----
 arch/x86/kernel/fpu/xstate.c        | 112 ++++++++++++++++++++++--------------
 5 files changed, 80 insertions(+), 61 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index f1f9bf91a0ab..1f447865db3a 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -45,7 +45,6 @@ extern void fpu__init_cpu_xstate(void);
 extern void fpu__init_system(struct cpuinfo_x86 *c);
 extern void fpu__init_check_bugs(void);
 extern void fpu__resume_cpu(void);
-extern u64 fpu__get_supported_xfeatures_mask(void);
 
 /*
  * Debugging facility:
@@ -94,7 +93,7 @@ static inline void fpstate_init_xstate(struct xregs_state *xsave)
 	 * trigger #GP:
 	 */
 	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
-			xfeatures_mask_user;
+			xfeatures_mask_all;
 }
 
 static inline void fpstate_init_fxstate(struct fxregs_state *fx)
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 9b382e5157ed..a32dc5f8c963 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -19,10 +19,10 @@
 #define XSAVE_YMM_SIZE	    256
 #define XSAVE_YMM_OFFSET    (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
 
-/* System features */
-#define XFEATURE_MASK_SYSTEM (XFEATURE_MASK_PT)
-
-/* All currently supported features */
+/*
+ * SUPPORTED_XFEATURES_MASK indicates all features
+ * implemented in and supported by the kernel.
+ */
 #define SUPPORTED_XFEATURES_MASK (XFEATURE_MASK_FP | \
 				  XFEATURE_MASK_SSE | \
 				  XFEATURE_MASK_YMM | \
@@ -40,6 +40,7 @@
 #endif
 
 extern u64 xfeatures_mask_user;
+extern u64 xfeatures_mask_all;
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 
 extern void __init update_regset_xstate_info(unsigned int size,
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d654b2f9a6c4..12474f019a14 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -364,8 +364,13 @@ void fpu__drop(struct fpu *fpu)
  */
 static inline void copy_init_fpstate_user_settings_to_fpregs(void)
 {
+	/*
+	 * Only XSAVES user states are copied.
+	 * System states are preserved.
+	 */
 	if (use_xsave())
-		copy_kernel_to_xregs(&init_fpstate.xsave, -1);
+		copy_kernel_to_xregs(&init_fpstate.xsave,
+				     xfeatures_mask_user);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
 		copy_kernel_to_fxregs(&init_fpstate.fxsave);
 	else
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 761c3a5a9e07..eaf9d9d479a5 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -222,16 +222,6 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 	fpu_user_xstate_size = fpu_kernel_xstate_size;
 }
 
-/*
- * Find supported xfeatures based on cpu features and command-line input.
- * This must be called after fpu__init_parse_early_param() is called and
- * xfeatures_mask is enumerated.
- */
-u64 __init fpu__get_supported_xfeatures_mask(void)
-{
-	return SUPPORTED_XFEATURES_MASK;
-}
-
 /* Legacy code to initialize eager fpu mode. */
 static void __init fpu__init_system_ctx_switch(void)
 {
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 19f8df54c72a..dd2c561c4544 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -51,13 +51,16 @@ static short xsave_cpuid_features[] __initdata = {
 };
 
 /*
- * Mask of xstate features supported by the CPU and the kernel:
+ * Mask of xstate features supported by the CPU and the kernel.
+ * This is the result from CPUID query, SUPPORTED_XFEATURES_MASK,
+ * and boot_cpu_has().
  */
 u64 xfeatures_mask_user __read_mostly;
+u64 xfeatures_mask_all __read_mostly;
 
 static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_comp_offsets[sizeof(xfeatures_mask_user)*8];
+static unsigned int xstate_comp_offsets[sizeof(xfeatures_mask_all)*8];
 
 /*
  * The XSAVE area of kernel can be in standard or compacted format;
@@ -82,7 +85,7 @@ void fpu__xstate_clear_all_cpu_caps(void)
  */
 int cpu_has_xfeatures(u64 xfeatures_needed, const char **feature_name)
 {
-	u64 xfeatures_missing = xfeatures_needed & ~xfeatures_mask_user;
+	u64 xfeatures_missing = xfeatures_needed & ~xfeatures_mask_all;
 
 	if (unlikely(feature_name)) {
 		long xfeature_idx, max_idx;
@@ -164,7 +167,7 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
 	 * None of the feature bits are in init state. So nothing else
 	 * to do for us, as the memory layout is up to date.
 	 */
-	if ((xfeatures & xfeatures_mask_user) == xfeatures_mask_user)
+	if ((xfeatures & xfeatures_mask_all) == xfeatures_mask_all)
 		return;
 
 	/*
@@ -219,30 +222,31 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
  */
 void fpu__init_cpu_xstate(void)
 {
-	if (!boot_cpu_has(X86_FEATURE_XSAVE) || !xfeatures_mask_user)
+	if (!boot_cpu_has(X86_FEATURE_XSAVE) || !xfeatures_mask_all)
 		return;
+
+	cr4_set_bits(X86_CR4_OSXSAVE);
+
 	/*
-	 * Make it clear that XSAVES system states are not yet
-	 * implemented should anyone expect it to work by changing
-	 * bits in XFEATURE_MASK_* macros and XCR0.
+	 * XCR_XFEATURE_ENABLED_MASK sets the features that are managed
+	 * by XSAVE{C, OPT} and XRSTOR.  Only XSAVE user states can be
+	 * set here.
 	 */
-	WARN_ONCE((xfeatures_mask_user & XFEATURE_MASK_SYSTEM),
-		"x86/fpu: XSAVES system states are not yet implemented.\n");
+	xsetbv(XCR_XFEATURE_ENABLED_MASK,
+	       xfeatures_mask_user);
 
-	xfeatures_mask_user &= ~XFEATURE_MASK_SYSTEM;
-
-	cr4_set_bits(X86_CR4_OSXSAVE);
-	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user);
+	/*
+	 * MSR_IA32_XSS sets which XSAVES system states to be managed by
+	 * XSAVES.  Only XSAVES system states can be set here.
+	 */
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
+		wrmsrl(MSR_IA32_XSS,
+		       xfeatures_mask_all & ~xfeatures_mask_user);
 }
 
-/*
- * Note that in the future we will likely need a pair of
- * functions here: one for user xstates and the other for
- * system xstates.  For now, they are the same.
- */
 static int xfeature_enabled(enum xfeature xfeature)
 {
-	return !!(xfeatures_mask_user & BIT_ULL(xfeature));
+	return !!(xfeatures_mask_all & BIT_ULL(xfeature));
 }
 
 /*
@@ -348,7 +352,7 @@ static int xfeature_is_aligned(int xfeature_nr)
  */
 static void __init setup_xstate_comp(void)
 {
-	unsigned int xstate_comp_sizes[sizeof(xfeatures_mask_user)*8];
+	unsigned int xstate_comp_sizes[sizeof(xfeatures_mask_all)*8];
 	int i;
 
 	/*
@@ -422,7 +426,7 @@ static void __init setup_init_fpu_buf(void)
 
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
 		init_fpstate.xsave.header.xcomp_bv =
-			BIT_ULL(63) | xfeatures_mask_user;
+			BIT_ULL(63) | xfeatures_mask_all;
 
 	/*
 	 * Init all the features state with header.xfeatures being 0x0
@@ -441,11 +445,10 @@ static int xfeature_uncompacted_offset(int xfeature_nr)
 	u32 eax, ebx, ecx, edx;
 
 	/*
-	 * Only XSAVES supports system states and it uses compacted
-	 * format. Checking a system state's uncompacted offset is
-	 * an error.
+	 * Checking a system or unsupported state's uncompacted offset
+	 * is an error.
 	 */
-	if (XFEATURE_MASK_SYSTEM & (1 << xfeature_nr)) {
+	if (~xfeatures_mask_user & BIT_ULL(xfeature_nr)) {
 		WARN_ONCE(1, "No fixed offset for xstate %d\n", xfeature_nr);
 		return -1;
 	}
@@ -482,7 +485,7 @@ int using_compacted_format(void)
 int validate_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or system features may be set */
-	if (hdr->xfeatures & (~xfeatures_mask_user | XFEATURE_MASK_SYSTEM))
+	if (hdr->xfeatures & ~xfeatures_mask_user)
 		return -EINVAL;
 
 	/* Userspace must use the uncompacted format */
@@ -617,15 +620,12 @@ static void do_extra_xstate_size_checks(void)
 
 
 /*
- * Get total size of enabled xstates in XCR0/xfeatures_mask_user.
+ * Get total size of enabled xstates in XCR0 | IA32_XSS.
  *
  * Note the SDM's wording here.  "sub-function 0" only enumerates
  * the size of the *user* states.  If we use it to size a buffer
  * that we use 'XSAVES' on, we could potentially overflow the
  * buffer because 'XSAVES' saves system states too.
- *
- * Note that we do not currently set any bits on IA32_XSS so
- * 'XCR0 | IA32_XSS == XCR0' for now.
  */
 static unsigned int __init get_xsaves_size(void)
 {
@@ -707,6 +707,7 @@ static int init_xstate_size(void)
  */
 static void fpu__init_disable_system_xstate(void)
 {
+	xfeatures_mask_all = 0;
 	xfeatures_mask_user = 0;
 	cr4_clear_bits(X86_CR4_OSXSAVE);
 	fpu__xstate_clear_all_cpu_caps();
@@ -722,6 +723,8 @@ void __init fpu__init_system_xstate(void)
 	static int on_boot_cpu __initdata = 1;
 	int err;
 	int i;
+	u64 cpu_user_xfeatures_mask;
+	u64 cpu_system_xfeatures_mask;
 
 	WARN_ON_FPU(!on_boot_cpu);
 	on_boot_cpu = 0;
@@ -742,10 +745,24 @@ void __init fpu__init_system_xstate(void)
 		return;
 	}
 
+	/*
+	 * Find user states supported by the processor.
+	 * Only these bits can be set in XCR0.
+	 */
 	cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
-	xfeatures_mask_user = eax + ((u64)edx << 32);
+	cpu_user_xfeatures_mask = eax + ((u64)edx << 32);
+
+	/*
+	 * Find system states supported by the processor.
+	 * Only these bits can be set in IA32_XSS MSR.
+	 */
+	cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
+	cpu_system_xfeatures_mask = ecx + ((u64)edx << 32);
 
-	if ((xfeatures_mask_user & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
+	xfeatures_mask_all = cpu_user_xfeatures_mask |
+			     cpu_system_xfeatures_mask;
+
+	if ((xfeatures_mask_all & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
 		/*
 		 * This indicates that something really unexpected happened
 		 * with the enumeration.  Disable XSAVE and try to continue
@@ -760,10 +777,11 @@ void __init fpu__init_system_xstate(void)
 	 */
 	for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
 		if (!boot_cpu_has(xsave_cpuid_features[i]))
-			xfeatures_mask_user &= ~BIT_ULL(i);
+			xfeatures_mask_all &= ~BIT_ULL(i);
 	}
 
-	xfeatures_mask_user &= fpu__get_supported_xfeatures_mask();
+	xfeatures_mask_all &= SUPPORTED_XFEATURES_MASK;
+	xfeatures_mask_user = xfeatures_mask_all & cpu_user_xfeatures_mask;
 
 	/* Enable xstate instructions to be able to continue with initialization: */
 	fpu__init_cpu_xstate();
@@ -775,8 +793,7 @@ void __init fpu__init_system_xstate(void)
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * system xstates:
 	 */
-	update_regset_xstate_info(fpu_user_xstate_size,
-				  xfeatures_mask_user & ~XFEATURE_MASK_SYSTEM);
+	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_user);
 
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
@@ -784,7 +801,7 @@ void __init fpu__init_system_xstate(void)
 	print_xstate_offset_size();
 
 	pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
-		xfeatures_mask_user,
+		xfeatures_mask_all,
 		fpu_kernel_xstate_size,
 		boot_cpu_has(X86_FEATURE_XSAVES) ? "compacted" : "standard");
 	return;
@@ -804,6 +821,13 @@ void fpu__resume_cpu(void)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE))
 		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user);
+
+	/*
+	 * Restore IA32_XSS
+	 */
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
+		wrmsrl(MSR_IA32_XSS,
+		       xfeatures_mask_all & ~xfeatures_mask_user);
 }
 
 /*
@@ -853,9 +877,9 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	/*
 	 * We should not ever be requesting features that we
 	 * have not enabled.  Remember that pcntxt_mask is
-	 * what we write to the XCR0 register.
+	 * what we write to the XCR0 | IA32_XSS registers.
 	 */
-	WARN_ONCE(!(xfeatures_mask_user & xstate_feature),
+	WARN_ONCE(!(xfeatures_mask_all & xstate_feature),
 		  "get of unsupported state");
 	/*
 	 * This assumes the last 'xsave*' instruction to
@@ -1005,7 +1029,7 @@ int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int of
 	 */
 	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
-	header.xfeatures &= ~XFEATURE_MASK_SYSTEM;
+	header.xfeatures &= xfeatures_mask_user;
 
 	/*
 	 * Copy xregs_state->header:
@@ -1089,7 +1113,7 @@ int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned i
 	 */
 	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
-	header.xfeatures &= ~XFEATURE_MASK_SYSTEM;
+	header.xfeatures &= xfeatures_mask_user;
 
 	/*
 	 * Copy xregs_state->header:
@@ -1182,7 +1206,7 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 	 * The state that came in from userspace was user-state only.
 	 * Mask all the user states out of 'xfeatures':
 	 */
-	xsave->header.xfeatures &= XFEATURE_MASK_SYSTEM;
+	xsave->header.xfeatures &= (xfeatures_mask_all & ~xfeatures_mask_user);
 
 	/*
 	 * Add back in the features that came in from userspace:
@@ -1238,7 +1262,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 	 * The state that came in from userspace was user-state only.
 	 * Mask all the user states out of 'xfeatures':
 	 */
-	xsave->header.xfeatures &= XFEATURE_MASK_SYSTEM;
+	xsave->header.xfeatures &= (xfeatures_mask_all & ~xfeatures_mask_user);
 
 	/*
 	 * Add back in the features that came in from userspace:
-- 
2.15.1


* [PATCH 6/9] x86/mm: Introduce ptep_set_wrprotect_flush and related functions
From: Yu-cheng Yu @ 2018-06-07 14:37 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143705.3531-1-yu-cheng.yu@intel.com>

The functions ptep_set_wrprotect() and huge_ptep_set_wrprotect()
are used by copy_page_range() and copy_hugetlb_page_range() when
copying PTEs.

On x86, when the shadow stack is enabled, only a shadow stack
PTE has the read-only and _PAGE_DIRTY_HW combination.  Upon
making a dirty PTE read-only, we move its _PAGE_DIRTY_HW to
_PAGE_DIRTY_SW.

When ptep_set_wrprotect() moves _PAGE_DIRTY_HW to _PAGE_DIRTY_SW,
if the PTE is writable and the mm is shared, another task could
race to set _PAGE_DIRTY_HW again.

Introduce ptep_set_wrprotect_flush(), pmdp_set_wrprotect_flush(),
and huge_ptep_set_wrprotect_flush() to make sure this does not
happen.
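
A single-threaded sketch of the flag move, for illustration only.  The
EX_* bit positions are placeholders rather than the real x86
definitions, and ex_move_flags()/ex_wrprotect() are hypothetical
stand-ins for pte_move_flags() and the new helpers.  The TLB flush
that closes the race is not modeled.

```c
#include <stdint.h>

/* Placeholder bit positions; the real values live in the x86 headers. */
#define EX_PAGE_RW		(UINT64_C(1) << 1)
#define EX_PAGE_DIRTY_HW	(UINT64_C(1) << 6)
#define EX_PAGE_DIRTY_SW	(UINT64_C(1) << 58)

/* If 'from' is set in the PTE value, clear it and set 'to' instead. */
static uint64_t ex_move_flags(uint64_t pte, uint64_t from, uint64_t to)
{
	if (pte & from)
		pte = (pte & ~from) | to;
	return pte;
}

/*
 * Write-protect a PTE value.  A dirty PTE keeps its dirty state in the
 * software bit, so read-only + DIRTY_HW stays unique to shadow stack.
 */
static uint64_t ex_wrprotect(uint64_t pte)
{
	pte &= ~EX_PAGE_RW;
	return ex_move_flags(pte, EX_PAGE_DIRTY_HW, EX_PAGE_DIRTY_SW);
}
```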

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/pgtable.h | 56 +++++++++++++++++++++++++++++++++++-------
 include/asm-generic/pgtable.h  | 26 ++++++++++++++++++++
 2 files changed, 73 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 0996f8a6979a..1053b940b35c 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1148,11 +1148,27 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 	return pte;
 }
 
-#define __HAVE_ARCH_PTEP_SET_WRPROTECT
-static inline void ptep_set_wrprotect(struct mm_struct *mm,
-				      unsigned long addr, pte_t *ptep)
-{
-	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT_FLUSH
+extern pte_t ptep_clear_flush(struct vm_area_struct *vma,
+			      unsigned long address,
+			      pte_t *ptep);
+static inline void ptep_set_wrprotect_flush(struct vm_area_struct *vma,
+					    unsigned long addr, pte_t *ptep)
+{
+	bool rw;
+
+	rw = test_and_clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
+	if (IS_ENABLED(CONFIG_X86_INTEL_SHADOW_STACK_USER)) {
+		struct mm_struct *mm = vma->vm_mm;
+		pte_t pte;
+
+		if (rw && (atomic_read(&mm->mm_users) > 1))
+			pte = ptep_clear_flush(vma, addr, ptep);
+		else
+			pte = *ptep;
+		pte = pte_move_flags(pte, _PAGE_DIRTY_HW, _PAGE_DIRTY_SW);
+		set_pte_at(mm, addr, ptep, pte);
+	}
 }
 
 #define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
@@ -1198,11 +1214,33 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
 	return native_pudp_get_and_clear(pudp);
 }
 
-#define __HAVE_ARCH_PMDP_SET_WRPROTECT
-static inline void pmdp_set_wrprotect(struct mm_struct *mm,
-				      unsigned long addr, pmd_t *pmdp)
+#define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT_FLUSH
+static inline void huge_ptep_set_wrprotect_flush(struct vm_area_struct *vma,
+						 unsigned long addr, pte_t *ptep)
 {
-	clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
+	ptep_set_wrprotect_flush(vma, addr, ptep);
+}
+
+#define __HAVE_ARCH_PMDP_SET_WRPROTECT_FLUSH
+extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma,
+				   unsigned long address,
+				   pmd_t *pmdp);
+static inline void pmdp_set_wrprotect_flush(struct vm_area_struct *vma,
+					    unsigned long addr, pmd_t *pmdp)
+{	bool rw;
+
+	rw = test_and_clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
+	if (IS_ENABLED(CONFIG_X86_INTEL_SHADOW_STACK_USER)) {
+		struct mm_struct *mm = vma->vm_mm;
+		pmd_t pmd;
+
+		if (rw && (atomic_read(&mm->mm_users) > 1))
+			pmd = pmdp_huge_clear_flush(vma, addr, pmdp);
+		else
+			pmd = *pmdp;
+		pmd = pmd_move_flags(pmd, _PAGE_DIRTY_HW, _PAGE_DIRTY_SW);
+		set_pmd_at(mm, addr, pmdp, pmd);
+	}
 }
 
 #define pud_write pud_write
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 3f6f998509f0..9bcfdfc045bb 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -121,6 +121,15 @@ static inline int pmdp_clear_flush_young(struct vm_area_struct *vma,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
 
+#ifndef __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT_FLUSH
+static inline void huge_ptep_set_wrprotect_flush(struct vm_area_struct *vma,
+						 unsigned long addr,
+						 pte_t *ptep)
+{
+	huge_ptep_set_wrprotect(vma->vm_mm, addr, ptep);
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,
@@ -226,6 +235,15 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
+#ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT_FLUSH
+static inline void ptep_set_wrprotect_flush(struct vm_area_struct *vma,
+					    unsigned long address,
+					    pte_t *ptep)
+{
+	ptep_set_wrprotect(vma->vm_mm, address, ptep);
+}
+#endif
+
 #ifndef pte_savedwrite
 #define pte_savedwrite pte_write
 #endif
@@ -266,6 +284,14 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
+#ifndef __HAVE_ARCH_PMDP_SET_WRPROTECT_FLUSH
+static inline void pmdp_set_wrprotect_flush(struct vm_area_struct *vma,
+					    unsigned long address,
+					    pmd_t *pmdp)
+{
+	pmdp_set_wrprotect(vma->vm_mm, address, pmdp);
+}
+#endif
 #ifndef __HAVE_ARCH_PUDP_SET_WRPROTECT
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
 static inline void pudp_set_wrprotect(struct mm_struct *mm,
-- 
2.15.1


* [PATCH 05/10] x86/cet: ELF header parsing of Control Flow Enforcement
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143807.3611-1-yu-cheng.yu@intel.com>

Look in .note.gnu.property of an ELF file and check if shadow stack needs
to be enabled for the task.
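
For readers unfamiliar with the note layout, the lookup can be
sketched in user space as follows.  This is not the code in this
patch: the EX_* constants mirror the values defined here, all ex_*
names are hypothetical, and the sketch assumes a single note that
starts at buf, with align being 4 or 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_NT_GNU_PROPERTY_TYPE_0	5
#define EX_FEATURE_1_AND		0xc0000002u
#define EX_FEATURE_1_IBT		0x1u
#define EX_FEATURE_1_SHSTK		0x2u

static uint32_t ex_read_u32(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));	/* avoids unaligned access */
	return v;
}

static size_t ex_round_up(size_t n, size_t align)
{
	return (n + align - 1) / align * align;
}

/*
 * Scan one note (12-byte header, name, descriptor) starting at buf.
 * Returns 1 and fills *features if a GNU_PROPERTY_X86_FEATURE_1_AND
 * property is found, 0 otherwise.
 */
static int ex_find_features(const uint8_t *buf, size_t size, size_t align,
			    uint32_t *features)
{
	uint32_t namesz, descsz, type;
	size_t off, end, step;

	if (size < 16)
		return 0;
	namesz = ex_read_u32(buf);
	descsz = ex_read_u32(buf + 4);
	type = ex_read_u32(buf + 8);
	if (type != EX_NT_GNU_PROPERTY_TYPE_0 || namesz != 4 ||
	    memcmp(buf + 12, "GNU", 4) != 0)
		return 0;

	off = ex_round_up(12 + namesz, align);
	if (off > size || descsz > size - off)
		return 0;
	end = off + descsz;

	/* Each property is: u32 type, u32 datasz, then padded data. */
	while (end - off >= 8) {
		uint32_t pr_type = ex_read_u32(buf + off);
		uint32_t pr_datasz = ex_read_u32(buf + off + 4);

		off += 8;
		if (pr_datasz > end - off)
			break;
		if (pr_type == EX_FEATURE_1_AND && pr_datasz == 4) {
			*features = ex_read_u32(buf + off);
			return 1;
		}
		step = ex_round_up(pr_datasz, align);
		if (step > end - off)
			break;
		off += step;
	}
	return 0;
}
```

A caller would test the returned value against EX_FEATURE_1_SHSTK and
EX_FEATURE_1_IBT to decide which features the binary requests.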

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/Kconfig                         |   4 +
 arch/x86/include/asm/elf.h               |   5 +
 arch/x86/include/uapi/asm/elf_property.h |  16 +++
 arch/x86/kernel/Makefile                 |   2 +
 arch/x86/kernel/elf.c                    | 220 +++++++++++++++++++++++++++++++
 fs/binfmt_elf.c                          |  16 +++
 include/uapi/linux/elf.h                 |   1 +
 7 files changed, 264 insertions(+)
 create mode 100644 arch/x86/include/uapi/asm/elf_property.h
 create mode 100644 arch/x86/kernel/elf.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index dd580d4910fc..24339a5299da 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1931,12 +1931,16 @@ config X86_INTEL_CET
 config ARCH_HAS_SHSTK
 	def_bool n
 
+config ARCH_HAS_PROGRAM_PROPERTIES
+	def_bool n
+
 config X86_INTEL_SHADOW_STACK_USER
 	prompt "Intel Shadow Stack for user-mode"
 	def_bool n
 	depends on CPU_SUP_INTEL && X86_64
 	select X86_INTEL_CET
 	select ARCH_HAS_SHSTK
+	select ARCH_HAS_PROGRAM_PROPERTIES
 	---help---
 	  Shadow stack provides hardware protection against program stack
 	  corruption.  Only when all the following are true will an application
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 0d157d2a1e2a..5b5f169c5c07 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -382,4 +382,9 @@ struct va_alignment {
 
 extern struct va_alignment va_align;
 extern unsigned long align_vdso_addr(unsigned long);
+
+#ifdef CONFIG_ARCH_HAS_PROGRAM_PROPERTIES
+extern int arch_setup_features(void *ehdr, void *phdr, struct file *file,
+			       bool interp);
+#endif
 #endif /* _ASM_X86_ELF_H */
diff --git a/arch/x86/include/uapi/asm/elf_property.h b/arch/x86/include/uapi/asm/elf_property.h
new file mode 100644
index 000000000000..343a871b8fc1
--- /dev/null
+++ b/arch/x86/include/uapi/asm/elf_property.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_ASM_X86_ELF_PROPERTY_H
+#define _UAPI_ASM_X86_ELF_PROPERTY_H
+
+/*
+ * pr_type
+ */
+#define GNU_PROPERTY_X86_FEATURE_1_AND (0xc0000002)
+
+/*
+ * Bits for GNU_PROPERTY_X86_FEATURE_1_AND
+ */
+#define GNU_PROPERTY_X86_FEATURE_1_SHSTK	(0x00000002)
+#define GNU_PROPERTY_X86_FEATURE_1_IBT		(0x00000001)
+
+#endif /* _UAPI_ASM_X86_ELF_PROPERTY_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 7ea5e099d558..cbf983f44b61 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -140,6 +140,8 @@ obj-$(CONFIG_UNWINDER_GUESS)		+= unwind_guess.o
 
 obj-$(CONFIG_X86_INTEL_CET)		+= cet.o
 
+obj-$(CONFIG_ARCH_HAS_PROGRAM_PROPERTIES) += elf.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/elf.c b/arch/x86/kernel/elf.c
new file mode 100644
index 000000000000..8e2719d8dc86
--- /dev/null
+++ b/arch/x86/kernel/elf.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Look at an ELF file's .note.gnu.property and determine if the file
+ * supports shadow stack and/or indirect branch tracking.
+ * The path from the ELF header to the note section is the following:
+ * elfhdr->elf_phdr->elf_note->x86_note_gnu_property[].
+ */
+
+#include <asm/cet.h>
+#include <asm/elf_property.h>
+#include <uapi/linux/elf-em.h>
+#include <linux/binfmts.h>
+#include <linux/elf.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/uaccess.h>
+#include <linux/string.h>
+
+#define ELF_NOTE_DESC_OFFSET(n, align) \
+	round_up(sizeof(*n) + n->n_namesz, (align))
+
+#define ELF_NOTE_NEXT_OFFSET(n, align) \
+	round_up(ELF_NOTE_DESC_OFFSET(n, align) + n->n_descsz, (align))
+
+static int find_cet(u8 *buf, u32 size, u32 align, int *shstk, int *ibt)
+{
+	unsigned long start = (unsigned long)buf;
+	struct elf_note *note = (struct elf_note *)buf;
+
+	*shstk = 0;
+	*ibt = 0;
+
+	/*
+	 * Go through the x86_note_gnu_property array pointed to by
+	 * buf and look for shadow stack and indirect branch
+	 * tracking features.
+	 * The GNU_PROPERTY_X86_FEATURE_1_AND entry contains only
+	 * one u32 as data.  Do not go beyond buf + size.
+	 */
+
+	while ((unsigned long) (note + 1) - start < size) {
+		/* Find the NT_GNU_PROPERTY_TYPE_0 note. */
+		if (note->n_namesz == 4 &&
+		    note->n_type == NT_GNU_PROPERTY_TYPE_0 &&
+		    memcmp(note + 1, "GNU", 4) == 0) {
+			u8 *ptr, *ptr_end;
+
+			/* Check for invalid property. */
+			if (note->n_descsz < 8 ||
+			   (note->n_descsz % align) != 0)
+				return 0;
+
+			/* Start and end of property array. */
+			ptr = (u8 *)(note + 1) + 4;
+			ptr_end = ptr + note->n_descsz;
+
+			while (1) {
+				u32 type = *(u32 *)ptr;
+				u32 datasz = *(u32 *)(ptr + 4);
+
+				ptr += 8;
+				if ((ptr + datasz) > ptr_end)
+					break;
+
+				if (type == GNU_PROPERTY_X86_FEATURE_1_AND &&
+				    datasz == 4) {
+					u32 p = *(u32 *)ptr;
+
+					if (p & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
+						*shstk = 1;
+					if (p & GNU_PROPERTY_X86_FEATURE_1_IBT)
+						*ibt = 1;
+					return 1;
+				}
+			}
+		}
+
+		/*
+		 * Note sections like .note.ABI-tag and .note.gnu.build-id
+		 * are aligned to 4 bytes in 64-bit ELF objects.
+		 */
+		note = (void *)note + ELF_NOTE_NEXT_OFFSET(note, align);
+	}
+
+	return 0;
+}
+
+static int check_pt_note_segment(struct file *file,
+				 unsigned long note_size, loff_t *pos,
+				 u32 align, int *shstk, int *ibt)
+{
+	int retval;
+	char *note_buf;
+
+	/*
+	 * Try to read in the whole PT_NOTE segment.
+	 */
+	note_buf = kmalloc(note_size, GFP_KERNEL);
+	if (!note_buf)
+		return -ENOMEM;
+	retval = kernel_read(file, note_buf, note_size, pos);
+	if (retval != note_size) {
+		kfree(note_buf);
+		return (retval < 0) ? retval : -EIO;
+	}
+
+	retval = find_cet(note_buf, note_size, align, shstk, ibt);
+	kfree(note_buf);
+	return retval;
+}
+
+#ifdef CONFIG_COMPAT
+static int check_pt_note_32(struct file *file, struct elf32_phdr *phdr,
+			    int phnum, int *shstk, int *ibt)
+{
+	int i;
+	int found = 0;
+
+	/*
+	 * Go through all PT_NOTE segments and find NT_GNU_PROPERTY_TYPE_0.
+	 */
+	for (i = 0; i < phnum; i++, phdr++) {
+		loff_t pos;
+
+		/*
+		 * NT_GNU_PROPERTY_TYPE_0 note is aligned to 4 bytes
+		 * in 32-bit binaries.
+		 */
+		if ((phdr->p_type != PT_NOTE) || (phdr->p_align != 4))
+			continue;
+
+		pos = phdr->p_offset;
+		found = check_pt_note_segment(file, phdr->p_filesz,
+					      &pos, phdr->p_align,
+					      shstk, ibt);
+		if (found)
+			break;
+	}
+	return found;
+}
+#endif
+
+#ifdef CONFIG_X86_64
+static int check_pt_note_64(struct file *file, struct elf64_phdr *phdr,
+			    int phnum, int *shstk, int *ibt)
+{
+	int found = 0;
+
+	/*
+	 * Go through all PT_NOTE segments and find NT_GNU_PROPERTY_TYPE_0.
+	 */
+	for (; phnum > 0; phnum--, phdr++) {
+		loff_t pos;
+
+		/*
+		 * NT_GNU_PROPERTY_TYPE_0 note is aligned to 8 bytes
+		 * in 64-bit binaries.
+		 */
+		if ((phdr->p_type != PT_NOTE) || (phdr->p_align != 8))
+			continue;
+
+		pos = phdr->p_offset;
+		found = check_pt_note_segment(file, phdr->p_filesz,
+					      &pos, phdr->p_align,
+					      shstk, ibt);
+
+		if (found)
+			break;
+	}
+	return found;
+}
+#endif
+
+int arch_setup_features(void *ehdr_p, void *phdr_p,
+			struct file *file, bool interp)
+{
+	int err = 0;
+	int shstk = 0;
+	int ibt = 0;
+
+	struct elf64_hdr *ehdr64 = ehdr_p;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
+		return 0;
+
+	if (ehdr64->e_ident[EI_CLASS] == ELFCLASS64) {
+		struct elf64_phdr *phdr64 = phdr_p;
+
+		err = check_pt_note_64(file, phdr64, ehdr64->e_phnum,
+				       &shstk, &ibt);
+		if (err < 0)
+			goto out;
+	} else {
+#ifdef CONFIG_COMPAT
+		struct elf32_hdr *ehdr32 = ehdr_p;
+
+		if (ehdr32->e_ident[EI_CLASS] == ELFCLASS32) {
+			struct elf32_phdr *phdr32 = phdr_p;
+
+			err = check_pt_note_32(file, phdr32, ehdr32->e_phnum,
+					       &shstk, &ibt);
+			if (err < 0)
+				goto out;
+		}
+#endif
+	}
+
+	current->thread.cet.shstk_enabled = 0;
+	current->thread.cet.shstk_base = 0;
+	current->thread.cet.shstk_size = 0;
+	if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
+		if (shstk) {
+			err = cet_setup_shstk();
+			if (err < 0)
+				goto out;
+		}
+	}
+out:
+	return err;
+}
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 4ad6f669fe34..9ddc6d01e779 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1081,6 +1081,22 @@ static int load_elf_binary(struct linux_binprm *bprm)
 		goto out_free_dentry;
 	}
 
+#ifdef CONFIG_ARCH_HAS_PROGRAM_PROPERTIES
+
+	if (interpreter) {
+		retval = arch_setup_features(&loc->interp_elf_ex,
+					     interp_elf_phdata,
+					     interpreter, true);
+	} else {
+		retval = arch_setup_features(&loc->elf_ex,
+					     elf_phdata,
+					     bprm->file, false);
+	}
+
+	if (retval < 0)
+		goto out_free_dentry;
+#endif
+
 	if (elf_interpreter) {
 		unsigned long interp_map_addr = 0;
 
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index e2535d6dcec7..f69ed8702271 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -372,6 +372,7 @@ typedef struct elf64_shdr {
 #define NT_PRFPREG	2
 #define NT_PRPSINFO	3
 #define NT_TASKSTRUCT	4
+#define NT_GNU_PROPERTY_TYPE_0 5
 #define NT_AUXV		6
 /*
  * Note to userspace developers: size of NT_SIGINFO note may increase
-- 
2.15.1


* [PATCH 0/7] Control Flow Enforcement - Part (4)
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu

This series introduces CET indirect branch tracking (IBT).

The major task of indirect branch tracking is for the compiler to
insert the ENDBR instructions at all valid branch targets.

The kernel provides:
	CPUID enumeration and feature setup;
	Legacy bitmap allocation;
	Some basic supporting routines.

This series also includes a CET command-line utility and
PTRACE support.

H.J. Lu (2):
  x86: Insert endbr32/endbr64 to vDSO
  tools: Add cetcmd

Yu-cheng Yu (5):
  x86/cet: Add Kconfig option for user-mode Indirect Branch Tracking
  x86/cet: User-mode indirect branch tracking support
  mm/mmap: Add IBT bitmap size to address space limit check
  x86/cet: add arch_prctl functions for indirect branch tracking
  x86/cet: Add PTRACE interface for CET

 arch/x86/Kconfig                               |  12 +++
 arch/x86/entry/vdso/.gitignore                 |   4 +
 arch/x86/entry/vdso/Makefile                   |  34 +++++++
 arch/x86/entry/vdso/endbr.sh                   |  32 ++++++
 arch/x86/include/asm/cet.h                     |   9 ++
 arch/x86/include/asm/disabled-features.h       |   8 +-
 arch/x86/include/asm/fpu/regset.h              |   7 +-
 arch/x86/include/uapi/asm/prctl.h              |   1 +
 arch/x86/include/uapi/asm/resource.h           |   5 +
 arch/x86/kernel/cet.c                          |  73 ++++++++++++++
 arch/x86/kernel/cet_prctl.c                    |  54 +++++++++-
 arch/x86/kernel/cpu/common.c                   |  20 +++-
 arch/x86/kernel/elf.c                          |  19 +++-
 arch/x86/kernel/fpu/regset.c                   |  41 ++++++++
 arch/x86/kernel/process.c                      |   2 +
 arch/x86/kernel/ptrace.c                       |  16 +++
 include/uapi/asm-generic/resource.h            |   3 +
 include/uapi/linux/elf.h                       |   1 +
 mm/mmap.c                                      |   8 +-
 tools/Makefile                                 |  13 +--
 tools/arch/x86/include/uapi/asm/elf_property.h |  16 +++
 tools/arch/x86/include/uapi/asm/prctl.h        |  33 ++++++
 tools/cet/.gitignore                           |   1 +
 tools/cet/Makefile                             |  11 ++
 tools/cet/cetcmd.c                             | 134 +++++++++++++++++++++++++
 tools/include/uapi/asm/elf_property.h          |   4 +
 tools/include/uapi/asm/prctl.h                 |   4 +
 27 files changed, 549 insertions(+), 16 deletions(-)
 create mode 100644 arch/x86/entry/vdso/endbr.sh
 create mode 100644 tools/arch/x86/include/uapi/asm/elf_property.h
 create mode 100644 tools/arch/x86/include/uapi/asm/prctl.h
 create mode 100644 tools/cet/.gitignore
 create mode 100644 tools/cet/Makefile
 create mode 100644 tools/cet/cetcmd.c
 create mode 100644 tools/include/uapi/asm/elf_property.h
 create mode 100644 tools/include/uapi/asm/prctl.h

-- 
2.15.1


* [PATCH 4/7] x86/cet: add arch_prctl functions for indirect branch tracking
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143855.3681-1-yu-cheng.yu@intel.com>

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/cet.h        |  1 +
 arch/x86/include/uapi/asm/prctl.h |  1 +
 arch/x86/kernel/cet_prctl.c       | 54 ++++++++++++++++++++++++++++++++++++---
 arch/x86/kernel/elf.c             | 12 ++++++---
 arch/x86/kernel/process.c         |  1 +
 5 files changed, 62 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h
index d07bdeb27db4..5b71a2b44eb1 100644
--- a/arch/x86/include/asm/cet.h
+++ b/arch/x86/include/asm/cet.h
@@ -19,6 +19,7 @@ struct cet_stat {
 	unsigned int	ibt_enabled:1;
 	unsigned int	locked:1;
 	unsigned int	exec_shstk:2;
+	unsigned int	exec_ibt:2;
 };
 
 #ifdef CONFIG_X86_INTEL_CET
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index f9965403b655..fef476d2d2f6 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -20,6 +20,7 @@
 #define ARCH_CET_EXEC		0x3004
 #define ARCH_CET_ALLOC_SHSTK	0x3005
 #define ARCH_CET_PUSH_SHSTK	0x3006
+#define ARCH_CET_LEGACY_BITMAP	0x3007
 
 /*
  * Settings for ARCH_CET_EXEC
diff --git a/arch/x86/kernel/cet_prctl.c b/arch/x86/kernel/cet_prctl.c
index 326996e2ea80..948f7ba98dc2 100644
--- a/arch/x86/kernel/cet_prctl.c
+++ b/arch/x86/kernel/cet_prctl.c
@@ -19,6 +19,7 @@
  * ARCH_CET_EXEC: set default features for exec()
  * ARCH_CET_ALLOC_SHSTK: allocate shadow stack
  * ARCH_CET_PUSH_SHSTK: put a return address on shadow stack
+ * ARCH_CET_LEGACY_BITMAP: allocate legacy bitmap
  */
 
 static int handle_get_status(unsigned long arg2)
@@ -28,8 +29,12 @@ static int handle_get_status(unsigned long arg2)
 
 	if (current->thread.cet.shstk_enabled)
 		features |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+	if (current->thread.cet.ibt_enabled)
+		features |= GNU_PROPERTY_X86_FEATURE_1_IBT;
 	if (current->thread.cet.exec_shstk == CET_EXEC_ALWAYS_ON)
 		cet_exec |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+	if (current->thread.cet.exec_ibt == CET_EXEC_ALWAYS_ON)
+		cet_exec |= GNU_PROPERTY_X86_FEATURE_1_IBT;
 	shstk_size = current->thread.cet.exec_shstk_size;
 
 	if (in_compat_syscall()) {
@@ -94,9 +99,18 @@ static int handle_set_exec(unsigned long arg2)
 			return -EPERM;
 	}
 
+	if (features & GNU_PROPERTY_X86_FEATURE_1_IBT) {
+		if (!cpu_feature_enabled(X86_FEATURE_IBT))
+			return -EINVAL;
+		if ((current->thread.cet.exec_ibt == CET_EXEC_ALWAYS_ON) &&
+		    (cet_exec != CET_EXEC_ALWAYS_ON))
+			return -EPERM;
+	}
+
 	if (features & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
 		current->thread.cet.exec_shstk = cet_exec;
-
+	if (features & GNU_PROPERTY_X86_FEATURE_1_IBT)
+		current->thread.cet.exec_ibt = cet_exec;
 	current->thread.cet.exec_shstk_size = shstk_size;
 	return 0;
 }
@@ -167,9 +181,36 @@ static int handle_alloc_shstk(unsigned long arg2)
 	return 0;
 }
 
+static int handle_bitmap(unsigned long arg2)
+{
+	unsigned long addr, size;
+
+	if (current->thread.cet.ibt_enabled) {
+		if (!current->thread.cet.ibt_bitmap_addr)
+			cet_setup_ibt_bitmap();
+		addr = current->thread.cet.ibt_bitmap_addr;
+		size = current->thread.cet.ibt_bitmap_size;
+	} else {
+		addr = 0;
+		size = 0;
+	}
+
+	if (in_compat_syscall()) {
+		if (put_user(addr, (unsigned int __user *)arg2) ||
+		    put_user(size, (unsigned int __user *)arg2 + 1))
+			return -EFAULT;
+	} else {
+		if (put_user(addr, (unsigned long __user *)arg2) ||
+		    put_user(size, (unsigned long __user *)arg2 + 1))
+			return -EFAULT;
+	}
+	return 0;
+}
+
 int prctl_cet(int option, unsigned long arg2)
 {
-	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
+	if (!cpu_feature_enabled(X86_FEATURE_SHSTK) &&
+	    !cpu_feature_enabled(X86_FEATURE_IBT))
 		return -EINVAL;
 
 	switch (option) {
@@ -181,7 +222,8 @@ int prctl_cet(int option, unsigned long arg2)
 			return -EPERM;
 		if (arg2 & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
 			cet_disable_free_shstk(current);
-
+		if (arg2 & GNU_PROPERTY_X86_FEATURE_1_IBT)
+			cet_disable_ibt();
 		return 0;
 
 	case ARCH_CET_LOCK:
@@ -197,6 +239,12 @@ int prctl_cet(int option, unsigned long arg2)
 	case ARCH_CET_PUSH_SHSTK:
 		return handle_push_shstk(arg2);
 
+	/*
+	 * Allocate legacy bitmap and return address & size to user.
+	 */
+	case ARCH_CET_LEGACY_BITMAP:
+		return handle_bitmap(arg2);
+
 	default:
 		return -EINVAL;
 	}
diff --git a/arch/x86/kernel/elf.c b/arch/x86/kernel/elf.c
index a3995c8c2fc2..c2a89f3c7186 100644
--- a/arch/x86/kernel/elf.c
+++ b/arch/x86/kernel/elf.c
@@ -230,10 +230,14 @@ int arch_setup_features(void *ehdr_p, void *phdr_p,
 	}
 
 	if (cpu_feature_enabled(X86_FEATURE_IBT)) {
-		if (ibt) {
-			err = cet_setup_ibt();
-			if (err < 0)
-				goto out;
+		int exec = current->thread.cet.exec_ibt;
+
+		if (exec != CET_EXEC_ALWAYS_OFF) {
+			if (ibt || (exec == CET_EXEC_ALWAYS_ON)) {
+				err = cet_setup_ibt();
+				if (err < 0)
+					goto out;
+			}
 		}
 	}
 
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9bec164e7958..c69576b4abd1 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -801,6 +801,7 @@ long do_arch_prctl_common(struct task_struct *task, int option,
 	case ARCH_CET_EXEC:
 	case ARCH_CET_ALLOC_SHSTK:
 	case ARCH_CET_PUSH_SHSTK:
+	case ARCH_CET_LEGACY_BITMAP:
 		return prctl_cet(option, cpuid_enabled);
 	}
 
-- 
2.15.1


* [PATCH 3/7] mm/mmap: Add IBT bitmap size to address space limit check
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143855.3681-1-yu-cheng.yu@intel.com>

The indirect branch tracking legacy bitmap occupies a large range of
address space, which can make may_expand_vm() fail its address space
limit check.  For an IBT-enabled task, add the bitmap size to the
address space limit.
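
To make the arithmetic concrete, here is a small user-space model of the
bitmap size and of the adjusted limit check.  It is a sketch only: it
assumes 4 KiB pages and uses uint64_t where the kernel uses unsigned long,
and all function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define SKETCH_BITS_PER_BYTE	8

/* One bit per page of user address space, as in cet_setup_ibt_bitmap(). */
static inline uint64_t ibt_bitmap_bytes(uint64_t task_size)
{
	return (task_size >> SKETCH_PAGE_SHIFT) / SKETCH_BITS_PER_BYTE;
}

/*
 * Mirror of the patched check: add the extra allowance, but keep the
 * original limit if the unsigned sum wraps (e.g. RLIM_INFINITY).
 */
static inline uint64_t effective_as_limit(uint64_t as_limit, uint64_t extra)
{
	uint64_t plus = as_limit + extra;	/* unsigned wrap is well defined */

	return plus > as_limit ? plus : as_limit;
}

/* True if growing by npages pages stays within the adjusted limit. */
static inline bool sketch_may_expand(uint64_t total_vm_pages, uint64_t npages,
				     uint64_t as_limit, uint64_t extra)
{
	uint64_t limit = effective_as_limit(as_limit, extra);

	return total_vm_pages + npages <= (limit >> SKETCH_PAGE_SHIFT);
}
```

With a 47-bit user address space (an assumption matching 4-level paging on
x86-64), ibt_bitmap_bytes((1ULL << 47) - 4096) comes to just under 4 GiB,
which is why the bitmap alone can exceed a modest RLIMIT_AS.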

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/uapi/asm/resource.h | 5 +++++
 include/uapi/asm-generic/resource.h  | 3 +++
 mm/mmap.c                            | 8 +++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/resource.h b/arch/x86/include/uapi/asm/resource.h
index 04bc4db8921b..0741b2a6101a 100644
--- a/arch/x86/include/uapi/asm/resource.h
+++ b/arch/x86/include/uapi/asm/resource.h
@@ -1 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+#ifdef CONFIG_X86_INTEL_CET
+#define rlimit_as_extra() current->thread.cet.ibt_bitmap_size
+#endif
+
 #include <asm-generic/resource.h>
diff --git a/include/uapi/asm-generic/resource.h b/include/uapi/asm-generic/resource.h
index f12db7a0da64..8a7608a09700 100644
--- a/include/uapi/asm-generic/resource.h
+++ b/include/uapi/asm-generic/resource.h
@@ -58,5 +58,8 @@
 # define RLIM_INFINITY		(~0UL)
 #endif
 
+#ifndef rlimit_as_extra
+#define rlimit_as_extra() 0
+#endif
 
 #endif /* _UAPI_ASM_GENERIC_RESOURCE_H */
diff --git a/mm/mmap.c b/mm/mmap.c
index e7d1fcb7ec58..5c07f052bed7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3255,7 +3255,13 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
  */
 bool may_expand_vm(struct mm_struct *mm, vm_flags_t flags, unsigned long npages)
 {
-	if (mm->total_vm + npages > rlimit(RLIMIT_AS) >> PAGE_SHIFT)
+	unsigned long as_limit = rlimit(RLIMIT_AS);
+	unsigned long as_limit_plus = as_limit + rlimit_as_extra();
+
+	if (as_limit_plus > as_limit)
+		as_limit = as_limit_plus;
+
+	if (mm->total_vm + npages > as_limit >> PAGE_SHIFT)
 		return false;
 
 	if (is_data_mapping(flags) &&
-- 
2.15.1


* [PATCH 7/7] x86/cet: Add PTRACE interface for CET
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143855.3681-1-yu-cheng.yu@intel.com>

Add PTRACE interface for CET MSRs.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/fpu/regset.h |  7 ++++---
 arch/x86/kernel/fpu/regset.c      | 41 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/ptrace.c          | 16 +++++++++++++++
 include/uapi/linux/elf.h          |  1 +
 4 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/regset.h b/arch/x86/include/asm/fpu/regset.h
index d5bdffb9d27f..edad0d889084 100644
--- a/arch/x86/include/asm/fpu/regset.h
+++ b/arch/x86/include/asm/fpu/regset.h
@@ -7,11 +7,12 @@
 
 #include <linux/regset.h>
 
-extern user_regset_active_fn regset_fpregs_active, regset_xregset_fpregs_active;
+extern user_regset_active_fn regset_fpregs_active, regset_xregset_fpregs_active,
+				cetregs_active;
 extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get,
-				xstateregs_get;
+				xstateregs_get, cetregs_get;
 extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set,
-				 xstateregs_set;
+				 xstateregs_set, cetregs_set;
 
 /*
  * xstateregs_active == regset_fpregs_active. Please refer to the comment
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index bc02f5144b95..7008eb084d36 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -160,6 +160,47 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	return ret;
 }
 
+int cetregs_active(struct task_struct *target, const struct user_regset *regset)
+{
+#ifdef CONFIG_X86_INTEL_CET
+	if (target->thread.cet.shstk_enabled || target->thread.cet.ibt_enabled)
+		return regset->n;
+#endif
+	return 0;
+}
+
+int cetregs_get(struct task_struct *target, const struct user_regset *regset,
+		unsigned int pos, unsigned int count,
+		void *kbuf, void __user *ubuf)
+{
+	struct fpu *fpu = &target->thread.fpu;
+	struct cet_user_state *cetregs;
+
+	if (!boot_cpu_has(X86_FEATURE_SHSTK))
+		return -ENODEV;
+
+	cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_MASK_SHSTK_USER);
+
+	fpu__prepare_read(fpu);
+	return user_regset_copyout(&pos, &count, &kbuf, &ubuf, cetregs, 0, -1);
+}
+
+int cetregs_set(struct task_struct *target, const struct user_regset *regset,
+		  unsigned int pos, unsigned int count,
+		  const void *kbuf, const void __user *ubuf)
+{
+	struct fpu *fpu = &target->thread.fpu;
+	struct cet_user_state *cetregs;
+
+	if (!boot_cpu_has(X86_FEATURE_SHSTK))
+		return -ENODEV;
+
+	cetregs = get_xsave_addr(&fpu->state.xsave, XFEATURE_MASK_SHSTK_USER);
+
+	fpu__prepare_write(fpu);
+	return user_regset_copyin(&pos, &count, &kbuf, &ubuf, cetregs, 0, -1);
+}
+
 #if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
 
 /*
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index ed5c4cdf0a34..a4501b8d086a 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -49,7 +49,9 @@ enum x86_regset {
 	REGSET_IOPERM64 = REGSET_XFP,
 	REGSET_XSTATE,
 	REGSET_TLS,
+	REGSET_CET64 = REGSET_TLS,
 	REGSET_IOPERM32,
+	REGSET_CET32,
 };
 
 struct pt_regs_offset {
@@ -1276,6 +1278,13 @@ static struct user_regset x86_64_regsets[] __ro_after_init = {
 		.size = sizeof(long), .align = sizeof(long),
 		.active = ioperm_active, .get = ioperm_get
 	},
+	[REGSET_CET64] = {
+		.core_note_type = NT_X86_CET,
+		.n = sizeof(struct cet_user_state) / sizeof(u64),
+		.size = sizeof(u64), .align = sizeof(u64),
+		.active = cetregs_active, .get = cetregs_get,
+		.set = cetregs_set
+	},
 };
 
 static const struct user_regset_view user_x86_64_view = {
@@ -1331,6 +1340,13 @@ static struct user_regset x86_32_regsets[] __ro_after_init = {
 		.size = sizeof(u32), .align = sizeof(u32),
 		.active = ioperm_active, .get = ioperm_get
 	},
+	[REGSET_CET32] = {
+		.core_note_type = NT_X86_CET,
+		.n = sizeof(struct cet_user_state) / sizeof(u64),
+		.size = sizeof(u64), .align = sizeof(u64),
+		.active = cetregs_active, .get = cetregs_get,
+		.set = cetregs_set
+	},
 };
 
 static const struct user_regset_view user_x86_32_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index f69ed8702271..0dd1f9dc6e14 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -401,6 +401,7 @@ typedef struct elf64_shdr {
 #define NT_386_TLS	0x200		/* i386 TLS slots (struct user_desc) */
 #define NT_386_IOPERM	0x201		/* x86 io permission bitmap (1=deny) */
 #define NT_X86_XSTATE	0x202		/* x86 extended state using xsave */
+#define NT_X86_CET	0x203		/* x86 cet state */
 #define NT_S390_HIGH_GPRS	0x300	/* s390 upper register halves */
 #define NT_S390_TIMER	0x301		/* s390 timer register */
 #define NT_S390_TODCMP	0x302		/* s390 TOD clock comparator register */
-- 
2.15.1


* [PATCH 2/7] x86/cet: User-mode indirect branch tracking support
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143855.3681-1-yu-cheng.yu@intel.com>

Add routines to enable and disable user-mode indirect branch
tracking, along with supporting helpers.
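
The enable/disable routines boil down to read-modify-write updates of the
IA32_U_CET MSR.  The sketch below models those updates on a plain 64-bit
value so the bit manipulation can be followed outside the kernel.  The bit
positions are my reading of the Intel SDM (the macro definitions are not
part of this patch), and the sketch_* names are hypothetical.

```c
#include <stdint.h>

/* IA32_U_CET control bits; positions assumed from the Intel SDM. */
#define MSR_IA32_CET_ENDBR_EN		(1ULL << 2)
#define MSR_IA32_CET_LEG_IW_EN		(1ULL << 3)
#define MSR_IA32_CET_NO_TRACK_EN	(1ULL << 4)

#define SKETCH_PAGE_MASK		(~0xfffULL)	/* assumption: 4 KiB pages */

/* Model of cet_setup_ibt(): turn on ENDBR checking and NOTRACK handling. */
static inline uint64_t sketch_enable_ibt(uint64_t msr)
{
	return msr | MSR_IA32_CET_ENDBR_EN | MSR_IA32_CET_NO_TRACK_EN;
}

/* Model of cet_setup_ibt_bitmap(): set the page-aligned base and LEG_IW_EN. */
static inline uint64_t sketch_set_bitmap(uint64_t msr, uint64_t bitmap_addr)
{
	return msr | MSR_IA32_CET_LEG_IW_EN | (bitmap_addr & SKETCH_PAGE_MASK);
}

/* Model of cet_disable_ibt(): clear the three IBT control bits. */
static inline uint64_t sketch_disable_ibt(uint64_t msr)
{
	return msr & ~(MSR_IA32_CET_ENDBR_EN | MSR_IA32_CET_LEG_IW_EN |
		       MSR_IA32_CET_NO_TRACK_EN);
}
```

In this model, as in the patch as posted, disabling clears only the control
bits; any bitmap base address already written stays in the upper bits of
the value.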

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/cet.h               |  8 ++++
 arch/x86/include/asm/disabled-features.h |  8 +++-
 arch/x86/kernel/cet.c                    | 73 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/common.c             | 20 ++++++++-
 arch/x86/kernel/elf.c                    | 15 ++++++-
 arch/x86/kernel/process.c                |  1 +
 6 files changed, 122 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h
index a2a53fe4d5e6..d07bdeb27db4 100644
--- a/arch/x86/include/asm/cet.h
+++ b/arch/x86/include/asm/cet.h
@@ -13,7 +13,10 @@ struct cet_stat {
 	unsigned long	shstk_base;
 	unsigned long	shstk_size;
 	unsigned long	exec_shstk_size;
+	unsigned long	ibt_bitmap_addr;
+	unsigned long	ibt_bitmap_size;
 	unsigned int	shstk_enabled:1;
+	unsigned int	ibt_enabled:1;
 	unsigned int	locked:1;
 	unsigned int	exec_shstk:2;
 };
@@ -29,6 +32,9 @@ void cet_disable_shstk(void);
 void cet_disable_free_shstk(struct task_struct *p);
 int cet_restore_signal(unsigned long ssp);
 int cet_setup_signal(int ia32, unsigned long addr);
+int cet_setup_ibt(void);
+int cet_setup_ibt_bitmap(void);
+void cet_disable_ibt(void);
 #else
 static inline int prctl_cet(int option, unsigned long arg2) { return 0; }
 static inline unsigned long cet_get_shstk_ptr(void) { return 0; }
@@ -41,6 +47,8 @@ static inline void cet_disable_shstk(void) {}
 static inline void cet_disable_free_shstk(struct task_struct *p) {}
 static inline int cet_restore_signal(unsigned long ssp) { return 0; }
 static inline int cet_setup_signal(int ia32, unsigned long addr) { return 0; }
+static inline int cet_setup_ibt(void) { return 0; }
+static inline void cet_disable_ibt(void) {}
 #endif
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 3624a11e5ba6..ce5bdaf0f1ff 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -62,6 +62,12 @@
 #define DISABLE_SHSTK	(1<<(X86_FEATURE_SHSTK & 31))
 #endif
 
+#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+#define DISABLE_IBT	0
+#else
+#define DISABLE_IBT	(1<<(X86_FEATURE_IBT & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -72,7 +78,7 @@
 #define DISABLED_MASK4	(DISABLE_PCID)
 #define DISABLED_MASK5	0
 #define DISABLED_MASK6	0
-#define DISABLED_MASK7	(DISABLE_PTI)
+#define DISABLED_MASK7	(DISABLE_PTI|DISABLE_IBT)
 #define DISABLED_MASK8	0
 #define DISABLED_MASK9	(DISABLE_MPX)
 #define DISABLED_MASK10	0
diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
index 1b7089dcf1ea..4df4b583311f 100644
--- a/arch/x86/kernel/cet.c
+++ b/arch/x86/kernel/cet.c
@@ -12,6 +12,8 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/sched/signal.h>
+#include <linux/vmalloc.h>
+#include <linux/bitops.h>
 #include <asm/msr.h>
 #include <asm/user.h>
 #include <asm/fpu/xstate.h>
@@ -222,3 +224,74 @@ int cet_setup_signal(int ia32, unsigned long rstor_addr)
 
 	return cet_push_shstk(ia32, ssp, rstor_addr);
 }
+
+static unsigned long ibt_mmap(unsigned long addr, unsigned long len)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long populate;
+
+	down_write(&mm->mmap_sem);
+	addr = do_mmap(NULL, addr, len, PROT_READ | PROT_WRITE,
+		       MAP_ANONYMOUS | MAP_PRIVATE,
+		       VM_DONTDUMP, 0, &populate, NULL);
+	up_write(&mm->mmap_sem);
+
+	if (populate)
+		mm_populate(addr, populate);
+
+	return addr;
+}
+
+int cet_setup_ibt(void)
+{
+	u64 r;
+
+	if (!cpu_feature_enabled(X86_FEATURE_IBT))
+		return -EOPNOTSUPP;
+
+	rdmsrl(MSR_IA32_U_CET, r);
+	r |= (MSR_IA32_CET_ENDBR_EN | MSR_IA32_CET_NO_TRACK_EN);
+	wrmsrl(MSR_IA32_U_CET, r);
+	current->thread.cet.ibt_enabled = 1;
+	return 0;
+}
+
+int cet_setup_ibt_bitmap(void)
+{
+	u64 r;
+	unsigned long bitmap;
+	unsigned long size;
+
+	if (!cpu_feature_enabled(X86_FEATURE_IBT))
+		return -EOPNOTSUPP;
+
+	size = TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE;
+	bitmap = ibt_mmap(0, size);
+
+	if (bitmap >= TASK_SIZE)
+		return -ENOMEM;
+
+	bitmap &= PAGE_MASK;
+
+	rdmsrl(MSR_IA32_U_CET, r);
+	r |= (MSR_IA32_CET_LEG_IW_EN | bitmap);
+	wrmsrl(MSR_IA32_U_CET, r);
+
+	current->thread.cet.ibt_bitmap_addr = bitmap;
+	current->thread.cet.ibt_bitmap_size = size;
+	return 0;
+}
+
+void cet_disable_ibt(void)
+{
+	u64 r;
+
+	if (!cpu_feature_enabled(X86_FEATURE_IBT))
+		return;
+
+	rdmsrl(MSR_IA32_U_CET, r);
+	r &= ~(MSR_IA32_CET_ENDBR_EN | MSR_IA32_CET_LEG_IW_EN |
+	       MSR_IA32_CET_NO_TRACK_EN);
+	wrmsrl(MSR_IA32_U_CET, r);
+	current->thread.cet.ibt_enabled = 0;
+}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f54fabdaef60..4041d6b94455 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -403,7 +403,8 @@ __setup("nopku", setup_disable_pku);
 
 static __always_inline void setup_cet(struct cpuinfo_x86 *c)
 {
-	if (cpu_feature_enabled(X86_FEATURE_SHSTK))
+	if (cpu_feature_enabled(X86_FEATURE_SHSTK) ||
+	    cpu_feature_enabled(X86_FEATURE_IBT))
 		cr4_set_bits(X86_CR4_CET);
 }
 
@@ -424,6 +425,23 @@ static __init int setup_disable_shstk(char *s)
 __setup("noshstk", setup_disable_shstk);
 #endif
 
+#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+static __init int setup_disable_ibt(char *s)
+{
+	/* require an exact match without trailing characters */
+	if (strlen(s))
+		return 0;
+
+	if (!boot_cpu_has(X86_FEATURE_IBT))
+		return 1;
+
+	setup_clear_cpu_cap(X86_FEATURE_IBT);
+	pr_info("x86: 'noibt' specified, disabling Branch Tracking\n");
+	return 1;
+}
+__setup("noibt", setup_disable_ibt);
+#endif
+
 /*
  * Some CPU features depend on higher CPUID levels, which may not always
  * be available due to CPUID level capping or broken virtualization
diff --git a/arch/x86/kernel/elf.c b/arch/x86/kernel/elf.c
index de08d41971f6..a3995c8c2fc2 100644
--- a/arch/x86/kernel/elf.c
+++ b/arch/x86/kernel/elf.c
@@ -18,6 +18,7 @@
 #include <linux/fs.h>
 #include <linux/uaccess.h>
 #include <linux/string.h>
+#include <linux/compat.h>
 
 #define ELF_NOTE_DESC_OFFSET(n, align) \
 	round_up(sizeof(*n) + n->n_namesz, (align))
@@ -183,7 +184,8 @@ int arch_setup_features(void *ehdr_p, void *phdr_p,
 
 	struct elf64_hdr *ehdr64 = ehdr_p;
 
-	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
+	if (!cpu_feature_enabled(X86_FEATURE_SHSTK) &&
+	    !cpu_feature_enabled(X86_FEATURE_IBT))
 		return 0;
 
 	if (ehdr64->e_ident[EI_CLASS] == ELFCLASS64) {
@@ -211,6 +213,9 @@ int arch_setup_features(void *ehdr_p, void *phdr_p,
 	current->thread.cet.shstk_enabled = 0;
 	current->thread.cet.shstk_base = 0;
 	current->thread.cet.shstk_size = 0;
+	current->thread.cet.ibt_enabled = 0;
+	current->thread.cet.ibt_bitmap_addr = 0;
+	current->thread.cet.ibt_bitmap_size = 0;
 	current->thread.cet.locked = 0;
 	if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
 		int exec = current->thread.cet.exec_shstk;
@@ -224,6 +229,14 @@ int arch_setup_features(void *ehdr_p, void *phdr_p,
 		}
 	}
 
+	if (cpu_feature_enabled(X86_FEATURE_IBT)) {
+		if (ibt) {
+			err = cet_setup_ibt();
+			if (err < 0)
+				goto out;
+		}
+	}
+
 	/*
 	 * Lockout CET features if no interpreter
 	 */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 54ad1863c6d2..9bec164e7958 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -139,6 +139,7 @@ void flush_thread(void)
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
 	cet_disable_shstk();
+	cet_disable_ibt();
 	fpu__clear(&tsk->thread.fpu);
 }
 
-- 
2.15.1


* [PATCH 1/7] x86/cet: Add Kconfig option for user-mode Indirect Branch Tracking
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143855.3681-1-yu-cheng.yu@intel.com>

Most of the user-mode indirect branch tracking support is done by
GCC, which inserts ENDBR64/ENDBR32 instructions at branch targets.
The kernel provides CPUID enumeration, feature MSR setup, and
allocation of the legacy bitmap.
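
For readers unfamiliar with the legacy bitmap, the following sketch shows
the indexing scheme as I understand it from the Intel SDM: one bit per
4 KiB page, where a set bit marks a page of legacy code that is exempt
from ENDBR checking.  The helper names are hypothetical, and the buffer
here is an ordinary array rather than the kernel-allocated mapping.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4 KiB pages */

/* Index of the bitmap bit covering the page that contains addr. */
static inline uint64_t legacy_bit_index(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Mark the page containing addr as legacy (no ENDBR required). */
static inline void legacy_mark_page(uint8_t *bitmap, uint64_t addr)
{
	uint64_t idx = legacy_bit_index(addr);

	bitmap[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Returns 1 if the page containing addr is marked as legacy. */
static inline int legacy_page_marked(const uint8_t *bitmap, uint64_t addr)
{
	uint64_t idx = legacy_bit_index(addr);

	return (bitmap[idx / 8] >> (idx % 8)) & 1;
}
```

Because every page of the address space needs a bit, the bitmap's size
scales with the size of the user address space rather than with the amount
of legacy code actually loaded.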

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/Kconfig | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 24339a5299da..27bfbd137fbe 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1953,6 +1953,18 @@ config X86_INTEL_SHADOW_STACK_USER
 
 	  If unsure, say y.
 
+config X86_INTEL_BRANCH_TRACKING_USER
+	prompt "Intel Indirect Branch Tracking for user-mode"
+	def_bool n
+	depends on CPU_SUP_INTEL && X86_64
+	select X86_INTEL_CET
+	select ARCH_HAS_PROGRAM_PROPERTIES
+	---help---
+	  Indirect Branch Tracking provides hardware protection against
+	  call-/jmp-oriented programming attacks.
+
+	  If unsure, say y
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
-- 
2.15.1


* [PATCH 6/7] tools: Add cetcmd
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
In-Reply-To: <20180607143855.3681-1-yu-cheng.yu@intel.com>

From: "H.J. Lu" <hjl.tools@gmail.com>

Introduce a CET command-line utility.  It lets a system administrator
enable or disable CET features and set the default shadow stack size.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
---
 tools/Makefile                                 |  13 +--
 tools/arch/x86/include/uapi/asm/elf_property.h |  16 +++
 tools/arch/x86/include/uapi/asm/prctl.h        |  33 ++++++
 tools/cet/.gitignore                           |   1 +
 tools/cet/Makefile                             |  11 ++
 tools/cet/cetcmd.c                             | 134 +++++++++++++++++++++++++
 tools/include/uapi/asm/elf_property.h          |   4 +
 tools/include/uapi/asm/prctl.h                 |   4 +
 8 files changed, 210 insertions(+), 6 deletions(-)
 create mode 100644 tools/arch/x86/include/uapi/asm/elf_property.h
 create mode 100644 tools/arch/x86/include/uapi/asm/prctl.h
 create mode 100644 tools/cet/.gitignore
 create mode 100644 tools/cet/Makefile
 create mode 100644 tools/cet/cetcmd.c
 create mode 100644 tools/include/uapi/asm/elf_property.h
 create mode 100644 tools/include/uapi/asm/prctl.h

diff --git a/tools/Makefile b/tools/Makefile
index be02c8b904db..bdca71e61d22 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -10,6 +10,7 @@ help:
 	@echo 'Possible targets:'
 	@echo ''
 	@echo '  acpi                   - ACPI tools'
+	@echo '  cet                    - Intel CET tools'
 	@echo '  cgroup                 - cgroup tools'
 	@echo '  cpupower               - a tool for all things x86 CPU power'
 	@echo '  firewire               - the userspace part of nosy, an IEEE-1394 traffic sniffer'
@@ -59,7 +60,7 @@ acpi: FORCE
 cpupower: FORCE
 	$(call descend,power/$@)
 
-cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi: FORCE
+cet cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi: FORCE
 	$(call descend,$@)
 
 liblockdep: FORCE
@@ -91,7 +92,7 @@ freefall: FORCE
 kvm_stat: FORCE
 	$(call descend,kvm/$@)
 
-all: acpi cgroup cpupower gpio hv firewire liblockdep \
+all: acpi cet cgroup cpupower gpio hv firewire liblockdep \
 		perf selftests spi turbostat usb \
 		virtio vm bpf x86_energy_perf_policy \
 		tmon freefall iio objtool kvm_stat wmi
@@ -102,7 +103,7 @@ acpi_install:
 cpupower_install:
 	$(call descend,power/$(@:_install=),install)
 
-cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install:
+cet_install cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install:
 	$(call descend,$(@:_install=),install)
 
 liblockdep_install:
@@ -123,7 +124,7 @@ freefall_install:
 kvm_stat_install:
 	$(call descend,kvm/$(@:_install=),install)
 
-install: acpi_install cgroup_install cpupower_install gpio_install \
+install: acpi_install cet_install cgroup_install cpupower_install gpio_install \
 		hv_install firewire_install iio_install liblockdep_install \
 		perf_install selftests_install turbostat_install usb_install \
 		virtio_install vm_install bpf_install x86_energy_perf_policy_install \
@@ -136,7 +137,7 @@ acpi_clean:
 cpupower_clean:
 	$(call descend,power/cpupower,clean)
 
-cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean:
+cet_clean cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean:
 	$(call descend,$(@:_clean=),clean)
 
 liblockdep_clean:
@@ -170,7 +171,7 @@ freefall_clean:
 build_clean:
 	$(call descend,build,clean)
 
-clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
+clean: acpi_clean cet_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
 		perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
 		vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
 		freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
diff --git a/tools/arch/x86/include/uapi/asm/elf_property.h b/tools/arch/x86/include/uapi/asm/elf_property.h
new file mode 100644
index 000000000000..343a871b8fc1
--- /dev/null
+++ b/tools/arch/x86/include/uapi/asm/elf_property.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _UAPI_ASM_X86_ELF_PROPERTY_H
+#define _UAPI_ASM_X86_ELF_PROPERTY_H
+
+/*
+ * pr_type
+ */
+#define GNU_PROPERTY_X86_FEATURE_1_AND (0xc0000002)
+
+/*
+ * Bits for GNU_PROPERTY_X86_FEATURE_1_AND
+ */
+#define GNU_PROPERTY_X86_FEATURE_1_SHSTK	(0x00000002)
+#define GNU_PROPERTY_X86_FEATURE_1_IBT		(0x00000001)
+
+#endif /* _UAPI_ASM_X86_ELF_PROPERTY_H */
diff --git a/tools/arch/x86/include/uapi/asm/prctl.h b/tools/arch/x86/include/uapi/asm/prctl.h
new file mode 100644
index 000000000000..fef476d2d2f6
--- /dev/null
+++ b/tools/arch/x86/include/uapi/asm/prctl.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_X86_PRCTL_H
+#define _ASM_X86_PRCTL_H
+
+#define ARCH_SET_GS		0x1001
+#define ARCH_SET_FS		0x1002
+#define ARCH_GET_FS		0x1003
+#define ARCH_GET_GS		0x1004
+
+#define ARCH_GET_CPUID		0x1011
+#define ARCH_SET_CPUID		0x1012
+
+#define ARCH_MAP_VDSO_X32	0x2001
+#define ARCH_MAP_VDSO_32	0x2002
+#define ARCH_MAP_VDSO_64	0x2003
+
+#define ARCH_CET_STATUS		0x3001
+#define ARCH_CET_DISABLE	0x3002
+#define ARCH_CET_LOCK		0x3003
+#define ARCH_CET_EXEC		0x3004
+#define ARCH_CET_ALLOC_SHSTK	0x3005
+#define ARCH_CET_PUSH_SHSTK	0x3006
+#define ARCH_CET_LEGACY_BITMAP	0x3007
+
+/*
+ * Settings for ARCH_CET_EXEC
+ */
+#define CET_EXEC_ELF_PROPERTY	0
+#define CET_EXEC_ALWAYS_OFF	1
+#define CET_EXEC_ALWAYS_ON	2
+#define CET_EXEC_MAX CET_EXEC_ALWAYS_ON
+
+#endif /* _ASM_X86_PRCTL_H */
diff --git a/tools/cet/.gitignore b/tools/cet/.gitignore
new file mode 100644
index 000000000000..bd100f593454
--- /dev/null
+++ b/tools/cet/.gitignore
@@ -0,0 +1 @@
+cetcmd
diff --git a/tools/cet/Makefile b/tools/cet/Makefile
new file mode 100644
index 000000000000..fae42b84d796
--- /dev/null
+++ b/tools/cet/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for CET tools
+
+CFLAGS = -O2 -g -Wall -Wextra -I../include/uapi
+
+all: cetcmd
+%: %.c
+	$(CC) $(CFLAGS) -o $@ $^
+
+clean:
+	$(RM) cetcmd
diff --git a/tools/cet/cetcmd.c b/tools/cet/cetcmd.c
new file mode 100644
index 000000000000..dbbfb5267c1f
--- /dev/null
+++ b/tools/cet/cetcmd.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <getopt.h>
+#include <errno.h>
+#include <error.h>
+#include <asm/elf_property.h>
+#include <asm/prctl.h>
+
+enum command_line_switch {
+	OPTION_ELF_PROPERTY = 150,
+	OPTION_ALWAYS_OFF,
+	OPTION_ALWAYS_ON,
+	OPTION_SHSTK_SIZE
+};
+
+static const struct option options[] = {
+	{"property",	no_argument, 0, OPTION_ELF_PROPERTY},
+	{"off",		no_argument, 0, OPTION_ALWAYS_OFF},
+	{"on",		no_argument, 0, OPTION_ALWAYS_ON},
+	{"shstk-size",	required_argument, 0, OPTION_SHSTK_SIZE},
+	{"feature",	required_argument, 0, 'f'},
+	{"help",	no_argument, 0, 'h'},
+	{0,		no_argument, 0, 0}
+};
+
+__attribute__((__noreturn__))
+static void
+usage(FILE *stream, int exit_status)
+{
+	fprintf(stream, "Usage: %s <option(s)> -- command [args]\n",
+		program_invocation_short_name);
+	fprintf(stream, " Run command with CET features\n");
+	fprintf(stream, " The options are:\n");
+	fprintf(stream,
+		"\t--property                Enable CET features based on ELF property note\n"
+		"\t--off                     Always disable CET features\n"
+		"\t--on                      Always enable CET features\n"
+		"\t-f, --feature [ibt|shstk] Control CET [IBT|SHSTK] feature\n"
+		"\t--shstk-size SIZE         Set shadow stack size\n"
+		"\t-h, --help                Display this information\n");
+
+	exit(exit_status);
+}
+
+extern int arch_prctl(int, unsigned long *);
+
+int
+main(int argc, char *const *argv, char *const *envp)
+{
+	int c;
+	unsigned long values[3] = {0, -1, 0};
+	unsigned long status[3];
+	unsigned long shstk_size = -1;
+	char **args;
+	size_t i, num_of_args;
+
+	while ((c = getopt_long(argc, argv, "f:h",
+				options, (int *) 0)) != EOF) {
+		switch (c) {
+		case OPTION_ELF_PROPERTY:
+			values[1] = CET_EXEC_ELF_PROPERTY;
+			break;
+
+		case OPTION_ALWAYS_OFF:
+			values[1] = CET_EXEC_ALWAYS_OFF;
+			break;
+
+		case OPTION_ALWAYS_ON:
+			values[1] = CET_EXEC_ALWAYS_ON;
+			break;
+
+		case OPTION_SHSTK_SIZE:
+			shstk_size = strtol(optarg, NULL, 0);
+			break;
+
+		case 'f':
+			if (strcasecmp(optarg, "ibt") == 0)
+				values[0] = GNU_PROPERTY_X86_FEATURE_1_IBT;
+			else if (strcasecmp(optarg, "shstk") == 0)
+				values[0] = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+			else
+				usage(stderr, EXIT_FAILURE);
+			break;
+
+		case 'h':
+			usage(stdout, EXIT_SUCCESS);
+
+		default:
+			usage(stderr, EXIT_FAILURE);
+		}
+	}
+
+	if ((optind + 1) > argc ||
+	    (values[1] == (unsigned long)-1 &&
+	     shstk_size == (unsigned long)-1))
+		usage(stderr, EXIT_FAILURE);
+
+	/* If --shstk-size isn't used, get the current shadow stack size. */
+	if (shstk_size == (unsigned long)-1) {
+		if (arch_prctl(ARCH_CET_STATUS, status) < 0)
+			error(EXIT_FAILURE, errno, "arch_prctl(ARCH_CET_STATUS) failed");
+		shstk_size = status[2];
+	}
+
+	/* If --property/--off/--on aren't used, clear all features. */
+	if (values[1] == (unsigned long)-1) {
+		values[0] = 0;
+		values[1] = 0;
+	} else {
+		if (values[0] == 0)
+			values[0] = (GNU_PROPERTY_X86_FEATURE_1_IBT |
+				     GNU_PROPERTY_X86_FEATURE_1_SHSTK);
+	}
+
+	values[2] = shstk_size;
+	if (arch_prctl(ARCH_CET_EXEC, values) < 0)
+		error(EXIT_FAILURE, errno, "arch_prctl(ARCH_CET_EXEC) failed");
+
+	num_of_args = argc - optind + 1;
+	args = malloc(num_of_args * sizeof(char *));
+	if (args == NULL)
+		error(EXIT_FAILURE, errno, "malloc failed");
+
+	for (i = 0; i < num_of_args - 1; i++)
+		args[i] = argv[optind + i];
+	args[i] = NULL;
+
+	return execvpe(argv[optind], args, envp);
+}
diff --git a/tools/include/uapi/asm/elf_property.h b/tools/include/uapi/asm/elf_property.h
new file mode 100644
index 000000000000..1281b4e1a578
--- /dev/null
+++ b/tools/include/uapi/asm/elf_property.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#if defined(__i386__) || defined(__x86_64__)
+#include "../../arch/x86/include/uapi/asm/elf_property.h"
+#endif
\ No newline at end of file
diff --git a/tools/include/uapi/asm/prctl.h b/tools/include/uapi/asm/prctl.h
new file mode 100644
index 000000000000..b0894b828b06
--- /dev/null
+++ b/tools/include/uapi/asm/prctl.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#if defined(__i386__) || defined(__x86_64__)
+#include "../../arch/x86/include/uapi/asm/prctl.h"
+#endif
\ No newline at end of file
-- 
2.15.1
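
cetcmd copies the trailing command and its arguments into a fresh
NULL-terminated vector before calling execvpe().  The sizing is easy to
get wrong by one, so here is that step in isolation.  This is only an
illustrative sketch, not part of the patch: build_exec_args() is a
hypothetical helper name, and "first" stands in for optind.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): return a newly allocated,
 * NULL-terminated copy of argv[first] .. argv[argc - 1].
 * Returns NULL if "first" is out of range or the allocation fails.
 */
char **build_exec_args(int argc, char *const *argv, int first)
{
	size_t i, count;
	char **args;

	if (first < 0 || first > argc)
		return NULL;

	/* Number of real arguments, not counting the terminator. */
	count = (size_t)(argc - first);

	/* One extra slot holds the terminating NULL. */
	args = malloc((count + 1) * sizeof(*args));
	if (args == NULL)
		return NULL;

	for (i = 0; i < count; i++)
		args[i] = argv[first + i];
	args[count] = NULL;

	return args;
}
```

A caller would pass the result to execvpe(args[0], args, envp) and free
it only if the exec fails.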


* [PATCH 5/7] x86: Insert endbr32/endbr64 to vDSO
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
In-Reply-To: <20180607143855.3681-1-yu-cheng.yu@intel.com>

From: "H.J. Lu" <hjl.tools@gmail.com>

When Intel indirect branch tracking (IBT) is enabled, vDSO functions that
may be called indirectly should have endbr32 or endbr64 as their first
instruction.  Compile the vDSO with -fcf-protection=branch -mibt when the
compiler supports these options.  Otherwise, insert endbr32 or endbr64 by
hand into the assembly code generated by the compiler.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
---
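
For reviewers who would rather not read the shell, the fallback filter in
endbr.sh can be modeled in C roughly as follows.  This is only an
illustrative sketch, not part of the patch: insert_endbr() and
is_vdso_entry_label() are hypothetical names, the byte sequences and
label list are copied from endbr.sh, and input lines are assumed to fit
in the fixed-size buffer.

```c
#include <stdio.h>
#include <string.h>

/* Instruction encodings, as emitted by endbr.sh. */
#define ENDBR32 ".byte 0xf3,0x0f,0x1e,0xfb"
#define ENDBR64 ".byte 0xf3,0x0f,0x1e,0xfa"

/* vDSO entry points that endbr.sh looks for. */
static const char *const vdso_entry_labels[] = {
	"__vdso_clock_gettime:",
	"__vdso_getcpu:",
	"__vdso_gettimeofday:",
	"__vdso_time:",
};

/* Return 1 if the whole line is one of the entry labels, else 0. */
int is_vdso_entry_label(const char *line)
{
	size_t i;

	for (i = 0; i < sizeof(vdso_entry_labels) / sizeof(vdso_entry_labels[0]); i++)
		if (strcmp(line, vdso_entry_labels[i]) == 0)
			return 1;
	return 0;
}

/*
 * Copy assembly text from "in" to "out".  After an entry label, skip
 * tab-indented directives (lines starting with a tab and a dot) and
 * emit the endbr bytes just before the first tab-indented instruction.
 * Assumes no line is longer than the buffer.
 */
void insert_endbr(FILE *in, FILE *out, int is_32bit)
{
	char line[4096];
	int need_endbr = 0;

	while (fgets(line, sizeof(line), in)) {
		line[strcspn(line, "\n")] = '\0';	/* strip the newline */

		if (is_vdso_entry_label(line)) {
			need_endbr = 1;
		} else if (need_endbr && line[0] == '\t' && line[1] != '.') {
			fprintf(out, "\t%s\n", is_32bit ? ENDBR32 : ENDBR64);
			need_endbr = 0;
		}
		fprintf(out, "%s\n", line);
	}
}
```

As in the script, directives such as .cfi_startproc stay directly under
the label, and the endbr bytes land in front of the first real
instruction.
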
 arch/x86/entry/vdso/.gitignore |  4 ++++
 arch/x86/entry/vdso/Makefile   | 34 ++++++++++++++++++++++++++++++++++
 arch/x86/entry/vdso/endbr.sh   | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+)
 create mode 100644 arch/x86/entry/vdso/endbr.sh

diff --git a/arch/x86/entry/vdso/.gitignore b/arch/x86/entry/vdso/.gitignore
index aae8ffdd5880..552941fdfae0 100644
--- a/arch/x86/entry/vdso/.gitignore
+++ b/arch/x86/entry/vdso/.gitignore
@@ -5,3 +5,7 @@ vdso32-sysenter-syms.lds
 vdso32-int80-syms.lds
 vdso-image-*.c
 vdso2c
+vclock_gettime.S
+vgetcpu.S
+vclock_gettime.asm
+vgetcpu.asm
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index d998a487c9b1..cccafc36a831 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -118,6 +118,40 @@ $(obj)/%-x32.o: $(obj)/%.o FORCE
 
 targets += vdsox32.lds $(vobjx32s-y)
 
+ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
+  ifeq ($(call cc-option-yn, -fcf-protection=branch -mibt), y)
+    $(obj)/vclock_gettime.o $(obj)/vgetcpu.o $(obj)/vdso32/vclock_gettime.o: KBUILD_CFLAGS += -fcf-protection=branch -mibt
+  else
+    endbr := $(srctree)/$(src)/endbr.sh
+    quiet_cmd_endbr = ENDBR $@
+	  cmd_endbr = $(CONFIG_SHELL) '$(endbr)' $< $@
+
+    quiet_cmd_cc_asm_c = CC $(quiet_modtag) $@
+	  cmd_cc_asm_c = $(CC) $(c_flags) $(DISABLE_LTO) -S -o $@ $<
+
+$(obj)/%.asm: $(src)/%.c
+	$(call if_changed_dep,cc_asm_c)
+
+$(obj)/vclock_gettime.S: $(obj)/vclock_gettime.asm
+	$(call if_changed,endbr)
+
+$(obj)/vgetcpu.S: $(obj)/vgetcpu.asm
+	$(call if_changed,endbr)
+
+$(obj)/vclock_gettime.o: $(obj)/vclock_gettime.S
+	 $(call if_changed_rule,as_o_S)
+
+$(obj)/vgetcpu.o: $(obj)/vgetcpu.S
+	 $(call if_changed_rule,as_o_S)
+
+$(obj)/vdso32/vclock_gettime.S: $(obj)/vdso32/vclock_gettime.asm
+	$(call if_changed,endbr)
+
+$(obj)/vdso32/vclock_gettime.o: $(obj)/vdso32/vclock_gettime.S
+	 $(call if_changed_rule,as_o_S)
+  endif
+endif
+
 $(obj)/%.so: OBJCOPYFLAGS := -S
 $(obj)/%.so: $(obj)/%.so.dbg
 	$(call if_changed,objcopy)
diff --git a/arch/x86/entry/vdso/endbr.sh b/arch/x86/entry/vdso/endbr.sh
new file mode 100644
index 000000000000..983dbec182f2
--- /dev/null
+++ b/arch/x86/entry/vdso/endbr.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+in="$1"
+out="$2"
+
+case "$in" in
+*/vdso32/*)
+    endbr=".byte 0xf3,0x0f,0x1e,0xfb"
+    ;;
+*)
+    endbr=".byte 0xf3,0x0f,0x1e,0xfa"
+    ;;
+esac
+
+need_endbr=no
+while IFS= read line ; do
+   case "$line" in
+   __vdso_clock_gettime:|__vdso_getcpu:|__vdso_gettimeofday:|__vdso_time:)
+	need_endbr=yes
+	;;
+   "	."*)
+	;;
+   "	"*)
+	if [ "$need_endbr" = "yes" ]; then
+	    need_endbr=no
+	    echo "	$endbr"
+	fi
+	;;
+   esac
+   echo "$line"
+done < "$in" > "$out"
-- 
2.15.1


* Re: [PATCH 2/5] x86/fpu/xstate: Change some names to separate XSAVES system and user states
From: Andy Lutomirski @ 2018-06-07 15:38 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: LKML, linux-doc, Linux-MM, linux-arch, X86 ML, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, H. J. Lu, Shanbhogue, Vedvyas,
	Ravi V. Shankar, Dave Hansen, Jonathan Corbet, Oleg Nesterov,
	Arnd Bergmann, mike.kravetz
In-Reply-To: <20180607143544.3477-3-yu-cheng.yu@intel.com>

On Thu, Jun 7, 2018 at 7:40 AM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote:
>
> To support XSAVES system states, change some names to distinguish
> user and system states.
>
> Change:
>   supervisor to system
>   copy_init_fpstate_to_fpregs() to copy_init_fpstate_user_settings_to_fpregs()
>   xfeatures_mask to xfeatures_mask_user
>   XCNTXT_MASK to SUPPORTED_XFEATURES_MASK (states supported)

How about copy_init_user_fpstate_to_fpregs()?  It's shorter and more
to the point.

--Andy

* Re: [PATCH 5/5] Documentation/x86: Add CET description
From: Andy Lutomirski @ 2018-06-07 15:39 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: LKML, linux-doc, Linux-MM, linux-arch, X86 ML, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, H. J. Lu, Shanbhogue, Vedvyas,
	Ravi V. Shankar, Dave Hansen, Jonathan Corbet, Oleg Nesterov,
	Arnd Bergmann, mike.kravetz
In-Reply-To: <20180607143544.3477-6-yu-cheng.yu@intel.com>

On Thu, Jun 7, 2018 at 7:40 AM Yu-cheng Yu <yu-cheng.yu@intel.com> wrote:

Fix the subject line, please.  This is more than just docs.

>
> Explain how CET works and the noshstk/noibt kernel parameters.

Maybe no_cet_shstk and no_cet_ibt?  noshstk sounds like gibberish and
people might need a reminder.

* [PATCH 08/10] mm: Prevent mremap of shadow stack
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143807.3611-1-yu-cheng.yu@intel.com>

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 mm/mremap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 049470aa1e3e..70f20edb248e 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -525,7 +525,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 		unsigned long, new_addr)
 {
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma = find_vma(mm, addr);
 	unsigned long ret = -EINVAL;
 	unsigned long charged = 0;
 	bool locked = false;
@@ -533,6 +533,9 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 	LIST_HEAD(uf_unmap_early);
 	LIST_HEAD(uf_unmap);
 
+	if (vma && (vma->vm_flags & VM_SHSTK))
+		return ret;
+
 	if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE))
 		return ret;
 
-- 
2.15.1


* [PATCH 07/10] mm: Prevent mprotect from changing shadow stack
From: Yu-cheng Yu @ 2018-06-07 14:38 UTC (permalink / raw)
  To: linux-kernel, linux-doc, linux-mm, linux-arch, x86,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, H.J. Lu,
	Vedvyas Shanbhogue, Ravi V. Shankar, Dave Hansen, Andy Lutomirski,
	Jonathan Corbet, Oleg Nesterov, Arnd Bergmann, Mike Kravetz
  Cc: Yu-cheng Yu
In-Reply-To: <20180607143807.3611-1-yu-cheng.yu@intel.com>

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 mm/mprotect.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 625608bc8962..128dcb880c12 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -446,6 +446,15 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 	error = -ENOMEM;
 	if (!vma)
 		goto out;
+
+	/*
+	 * Do not allow changing shadow stack memory.
+	 */
+	if (vma->vm_flags & VM_SHSTK) {
+		error = -EINVAL;
+		goto out;
+	}
+
 	prev = vma->vm_prev;
 	if (unlikely(grows & PROT_GROWSDOWN)) {
 		if (vma->vm_start >= end)
-- 
2.15.1

