* [PATCH 01/17] prmem: linker section for static write rare
[not found] <20181023213504.28905-1-igor.stoppa@huawei.com>
@ 2018-10-23 21:34 ` Igor Stoppa
2018-10-23 21:34 ` Igor Stoppa
2018-10-23 21:35 ` [PATCH 16/17] prmem: pratomic-long Igor Stoppa
1 sibling, 1 reply; 16+ messages in thread
From: Igor Stoppa @ 2018-10-23 21:34 UTC (permalink / raw)
To: Mimi Zohar, Kees Cook, Matthew Wilcox, Dave Chinner, James Morris,
Michal Hocko, kernel-hardening, linux-integrity,
linux-security-module
Cc: igor.stoppa, Dave Hansen, Jonathan Corbet, Laura Abbott,
Arnd Bergmann, Thomas Gleixner, Kate Stewart, Greg Kroah-Hartman,
Philippe Ombredanne, linux-arch, linux-kernel
Introduce a section and a label for statically allocated write rare
data.
The label is named "__wr_after_init".
As the name implies, once the init phase has completed, this section
can be modified only by invoking write rare functions.
NOTE:
this needs rework, because the current write-rare mechanism works
only on x86_64 and not arm64, due to arm64 mappings.
Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Kate Stewart <kstewart@linuxfoundation.org>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Philippe Ombredanne <pombredanne@nexb.com>
CC: linux-arch@vger.kernel.org
CC: linux-kernel@vger.kernel.org
---
include/asm-generic/vmlinux.lds.h | 20 ++++++++++++++++++++
include/linux/cache.h | 17 +++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index d7701d466b60..fd40a15e3b24 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -300,6 +300,25 @@
. = __start_init_task + THREAD_SIZE; \
__end_init_task = .;
+/*
+ * Allow architectures to handle wr_after_init data on their
+ * own by defining an empty WR_AFTER_INIT_DATA.
+ * However, pages containing wr_after_init data must not hold anything
+ * else, both to avoid accidentally unprotecting data that is supposed
+ * to stay read-only at all times, and to avoid protecting data that
+ * is supposed to stay writable at all times.
+ */
+#ifndef WR_AFTER_INIT_DATA
+#define WR_AFTER_INIT_DATA(align) \
+ . = ALIGN(PAGE_SIZE); \
+ __start_wr_after_init = .; \
+ . = ALIGN(align); \
+ *(.data..wr_after_init) \
+ . = ALIGN(PAGE_SIZE); \
+ __end_wr_after_init = .; \
+ . = ALIGN(align);
+#endif
+
/*
* Allow architectures to handle ro_after_init data on their
* own by defining an empty RO_AFTER_INIT_DATA.
@@ -320,6 +339,7 @@
__start_rodata = .; \
*(.rodata) *(.rodata.*) \
RO_AFTER_INIT_DATA /* Read only after init */ \
+ WR_AFTER_INIT_DATA(align) /* wr after init */ \
KEEP(*(__vermagic)) /* Kernel version magic */ \
. = ALIGN(8); \
__start___tracepoints_ptrs = .; \
diff --git a/include/linux/cache.h b/include/linux/cache.h
index 750621e41d1c..9a7e7134b887 100644
--- a/include/linux/cache.h
+++ b/include/linux/cache.h
@@ -31,6 +31,23 @@
#define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
#endif
+/*
+ * __wr_after_init is used to mark objects that cannot be modified
+ * directly after init (i.e. after mark_rodata_ro() has been called).
+ * These objects become effectively read-only, from the perspective of
+ * performing a direct write, like a variable assignment.
+ * However, they can be altered through a dedicated function.
+ * It is intended for objects which are only occasionally modified
+ * after init, so seldom that the extra cost of the indirect
+ * modification is either negligible or worth paying for the sake of
+ * the protection gained.
+ */
+#ifndef __wr_after_init
+#define __wr_after_init \
+ __attribute__((__section__(".data..wr_after_init")))
+#endif
+
+
#ifndef ____cacheline_aligned
#define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
#endif
--
2.17.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 16/17] prmem: pratomic-long
[not found] <20181023213504.28905-1-igor.stoppa@huawei.com>
2018-10-23 21:34 ` [PATCH 01/17] prmem: linker section for static write rare Igor Stoppa
@ 2018-10-23 21:35 ` Igor Stoppa
2018-10-23 21:35 ` Igor Stoppa
2018-10-25 0:13 ` Peter Zijlstra
1 sibling, 2 replies; 16+ messages in thread
From: Igor Stoppa @ 2018-10-23 21:35 UTC (permalink / raw)
To: Mimi Zohar, Kees Cook, Matthew Wilcox, Dave Chinner, James Morris,
Michal Hocko, kernel-hardening, linux-integrity,
linux-security-module
Cc: igor.stoppa, Dave Hansen, Jonathan Corbet, Laura Abbott,
Will Deacon, Peter Zijlstra, Boqun Feng, Arnd Bergmann,
linux-arch, linux-kernel
Minimalistic functionality for a write rare version of
atomic_long_t data.
Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: linux-arch@vger.kernel.org
CC: linux-kernel@vger.kernel.org
---
MAINTAINERS | 1 +
include/linux/pratomic-long.h | 73 +++++++++++++++++++++++++++++++++++
2 files changed, 74 insertions(+)
create mode 100644 include/linux/pratomic-long.h
diff --git a/MAINTAINERS b/MAINTAINERS
index e7f7cb1682a6..9d72688d00a3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9466,6 +9466,7 @@ F: mm/test_pmalloc.c
F: Documentation/core-api/prmem.rst
F: include/linux/prlist.h
F: lib/test_prlist.c
+F: include/linux/pratomic-long.h
MEMORY MANAGEMENT
L: linux-mm@kvack.org
diff --git a/include/linux/pratomic-long.h b/include/linux/pratomic-long.h
new file mode 100644
index 000000000000..8f1408593733
--- /dev/null
+++ b/include/linux/pratomic-long.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Atomic operations for write rare memory */
+#ifndef _LINUX_PRATOMIC_LONG_H
+#define _LINUX_PRATOMIC_LONG_H
+#include <linux/prmem.h>
+#include <linux/compiler.h>
+#include <asm-generic/atomic-long.h>
+
+struct pratomic_long_t {
+ atomic_long_t l __aligned(sizeof(atomic_long_t));
+} __aligned(sizeof(atomic_long_t));
+
+#define PRATOMIC_LONG_INIT(i) { \
+ .l = ATOMIC_LONG_INIT((i)), \
+}
+
+static __always_inline
+bool __pratomic_long_op(bool inc, struct pratomic_long_t *l)
+{
+ struct page *page;
+ uintptr_t base;
+ uintptr_t offset;
+ unsigned long flags;
+ size_t size = sizeof(*l);
+ bool is_virt = __is_wr_after_init(l, size);
+
+ if (WARN(!(is_virt || likely(__is_wr_pool(l, size))),
+ WR_ERR_RANGE_MSG))
+ return false;
+ local_irq_save(flags);
+ if (is_virt)
+ page = virt_to_page(l);
+ else
+ vmalloc_to_page(l);
+ offset = (~PAGE_MASK) & (uintptr_t)l;
+ base = (uintptr_t)vmap(&page, 1, VM_MAP, PAGE_KERNEL);
+ if (WARN(!base, WR_ERR_PAGE_MSG)) {
+ local_irq_restore(flags);
+ return false;
+ }
+ if (inc)
+ atomic_long_inc((atomic_long_t *)(base + offset));
+ else
+ atomic_long_dec((atomic_long_t *)(base + offset));
+ vunmap((void *)base);
+ local_irq_restore(flags);
+ return true;
+
+}
+
+/**
+ * pratomic_long_inc - atomic increment of rare write long
+ * @l: address of the variable of type struct pratomic_long_t
+ *
+ * Return: true on success, false otherwise
+ */
+static __always_inline bool pratomic_long_inc(struct pratomic_long_t *l)
+{
+ return __pratomic_long_op(true, l);
+}
+
+/**
+ * pratomic_long_dec - atomic decrement of rare write long
+ * @l: address of the variable of type struct pratomic_long_t
+ *
+ * Return: true on success, false otherwise
+ */
+static __always_inline bool pratomic_long_dec(struct pratomic_long_t *l)
+{
+ return __pratomic_long_op(false, l);
+}
+
+#endif
--
2.17.1
* Re: [PATCH 16/17] prmem: pratomic-long
2018-10-23 21:35 ` [PATCH 16/17] prmem: pratomic-long Igor Stoppa
2018-10-23 21:35 ` Igor Stoppa
@ 2018-10-25 0:13 ` Peter Zijlstra
2018-10-25 0:13 ` Peter Zijlstra
2018-10-29 21:17 ` Igor Stoppa
1 sibling, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2018-10-25 0:13 UTC (permalink / raw)
To: Igor Stoppa
Cc: Mimi Zohar, Kees Cook, Matthew Wilcox, Dave Chinner, James Morris,
Michal Hocko, kernel-hardening, linux-integrity,
linux-security-module, igor.stoppa, Dave Hansen, Jonathan Corbet,
Laura Abbott, Will Deacon, Boqun Feng, Arnd Bergmann, linux-arch,
linux-kernel
On Wed, Oct 24, 2018 at 12:35:03AM +0300, Igor Stoppa wrote:
> +static __always_inline
> +bool __pratomic_long_op(bool inc, struct pratomic_long_t *l)
> +{
> + struct page *page;
> + uintptr_t base;
> + uintptr_t offset;
> + unsigned long flags;
> + size_t size = sizeof(*l);
> + bool is_virt = __is_wr_after_init(l, size);
> +
> + if (WARN(!(is_virt || likely(__is_wr_pool(l, size))),
> + WR_ERR_RANGE_MSG))
> + return false;
> + local_irq_save(flags);
> + if (is_virt)
> + page = virt_to_page(l);
> + else
> + vmalloc_to_page(l);
> + offset = (~PAGE_MASK) & (uintptr_t)l;
> + base = (uintptr_t)vmap(&page, 1, VM_MAP, PAGE_KERNEL);
> + if (WARN(!base, WR_ERR_PAGE_MSG)) {
> + local_irq_restore(flags);
> + return false;
> + }
> + if (inc)
> + atomic_long_inc((atomic_long_t *)(base + offset));
> + else
> + atomic_long_dec((atomic_long_t *)(base + offset));
> + vunmap((void *)base);
> + local_irq_restore(flags);
> + return true;
> +
> +}
That's just hideously nasty.. and horribly broken.
We're not going to duplicate all these kernel interfaces wrapped in gunk
like that. Also, you _cannot_ call vunmap() with IRQs disabled. Clearly
you've never tested this with debug bits enabled.
* Re: [PATCH 16/17] prmem: pratomic-long
2018-10-25 0:13 ` Peter Zijlstra
2018-10-25 0:13 ` Peter Zijlstra
@ 2018-10-29 21:17 ` Igor Stoppa
2018-10-29 21:17 ` Igor Stoppa
2018-10-30 15:58 ` Peter Zijlstra
1 sibling, 2 replies; 16+ messages in thread
From: Igor Stoppa @ 2018-10-29 21:17 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Mimi Zohar, Kees Cook, Matthew Wilcox, Dave Chinner, James Morris,
Michal Hocko, kernel-hardening, linux-integrity,
linux-security-module, igor.stoppa, Dave Hansen, Jonathan Corbet,
Laura Abbott, Will Deacon, Boqun Feng, Arnd Bergmann, linux-arch,
linux-kernel
On 25/10/2018 01:13, Peter Zijlstra wrote:
> On Wed, Oct 24, 2018 at 12:35:03AM +0300, Igor Stoppa wrote:
>> +static __always_inline
>> +bool __pratomic_long_op(bool inc, struct pratomic_long_t *l)
>> +{
>> + struct page *page;
>> + uintptr_t base;
>> + uintptr_t offset;
>> + unsigned long flags;
>> + size_t size = sizeof(*l);
>> + bool is_virt = __is_wr_after_init(l, size);
>> +
>> + if (WARN(!(is_virt || likely(__is_wr_pool(l, size))),
>> + WR_ERR_RANGE_MSG))
>> + return false;
>> + local_irq_save(flags);
>> + if (is_virt)
>> + page = virt_to_page(l);
>> + else
>> + vmalloc_to_page(l);
>> + offset = (~PAGE_MASK) & (uintptr_t)l;
>> + base = (uintptr_t)vmap(&page, 1, VM_MAP, PAGE_KERNEL);
>> + if (WARN(!base, WR_ERR_PAGE_MSG)) {
>> + local_irq_restore(flags);
>> + return false;
>> + }
>> + if (inc)
>> + atomic_long_inc((atomic_long_t *)(base + offset));
>> + else
>> + atomic_long_dec((atomic_long_t *)(base + offset));
>> + vunmap((void *)base);
>> + local_irq_restore(flags);
>> + return true;
>> +
>> +}
>
> That's just hideously nasty.. and horribly broken.
>
> We're not going to duplicate all these kernel interfaces wrapped in gunk
> like that.
One possibility would be to have macros which use typeof() on the
parameter being passed, to decide which implementation to use: regular
or write-rare.
This means that type punning would still be needed, to select the
implementation.
Would this be enough? Is there some better way?
> Also, you _cannot_ call vunmap() with IRQs disabled. Clearly
> you've never tested this with debug bits enabled.
I thought I had them. And I _did_ have them enabled, at some point.
But I must have messed up with the configuration and I failed to notice
this.
I can think of a way it might work, albeit not a very pretty one:
* for vmap(): if I understand correctly, it might sleep while obtaining
memory for the mapping. That part could be executed before disabling
interrupts, and the rest of the function after interrupts are disabled.
* for vunmap(): once the write is done, also change the alternate
mapping to read-only, then enable interrupts and destroy the alternate
mapping. Making the secondary mapping read-only as well makes it as
secure as the primary, which means it can remain visible even with
interrupts enabled.
--
igor
* Re: [PATCH 16/17] prmem: pratomic-long
2018-10-29 21:17 ` Igor Stoppa
2018-10-29 21:17 ` Igor Stoppa
@ 2018-10-30 15:58 ` Peter Zijlstra
2018-10-30 15:58 ` Peter Zijlstra
2018-10-30 16:28 ` Will Deacon
1 sibling, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2018-10-30 15:58 UTC (permalink / raw)
To: Igor Stoppa
Cc: Mimi Zohar, Kees Cook, Matthew Wilcox, Dave Chinner, James Morris,
Michal Hocko, kernel-hardening, linux-integrity,
linux-security-module, igor.stoppa, Dave Hansen, Jonathan Corbet,
Laura Abbott, Will Deacon, Boqun Feng, Arnd Bergmann, linux-arch,
linux-kernel
On Mon, Oct 29, 2018 at 11:17:14PM +0200, Igor Stoppa wrote:
>
>
> On 25/10/2018 01:13, Peter Zijlstra wrote:
> > On Wed, Oct 24, 2018 at 12:35:03AM +0300, Igor Stoppa wrote:
> > > +static __always_inline
> > > +bool __pratomic_long_op(bool inc, struct pratomic_long_t *l)
> > > +{
> > > + struct page *page;
> > > + uintptr_t base;
> > > + uintptr_t offset;
> > > + unsigned long flags;
> > > + size_t size = sizeof(*l);
> > > + bool is_virt = __is_wr_after_init(l, size);
> > > +
> > > + if (WARN(!(is_virt || likely(__is_wr_pool(l, size))),
> > > + WR_ERR_RANGE_MSG))
> > > + return false;
> > > + local_irq_save(flags);
> > > + if (is_virt)
> > > + page = virt_to_page(l);
> > > + else
> > > + vmalloc_to_page(l);
> > > + offset = (~PAGE_MASK) & (uintptr_t)l;
> > > + base = (uintptr_t)vmap(&page, 1, VM_MAP, PAGE_KERNEL);
> > > + if (WARN(!base, WR_ERR_PAGE_MSG)) {
> > > + local_irq_restore(flags);
> > > + return false;
> > > + }
> > > + if (inc)
> > > + atomic_long_inc((atomic_long_t *)(base + offset));
> > > + else
> > > + atomic_long_dec((atomic_long_t *)(base + offset));
> > > + vunmap((void *)base);
> > > + local_irq_restore(flags);
> > > + return true;
> > > +
> > > +}
> >
> > That's just hideously nasty.. and horribly broken.
> >
> > We're not going to duplicate all these kernel interfaces wrapped in gunk
> > like that.
>
> one possibility would be to have macros which use typeof() on the parameter
> being passed, to decide what implementation to use: regular or write-rare
>
> This means that type punning would still be needed, to select the
> implementation.
>
> Would this be enough? Is there some better way?
Like mentioned elsewhere; if you do write_enable() + write_disable()
thingies, it all becomes:
write_enable();
atomic_foo(&bar);
write_disable();
No magic gunk infested duplication at all. Of course, ideally you'd then
teach objtool about this (or a GCC plugin I suppose) to ensure any
enable reached a disable.
The alternative is something like:
#define ALLOW_WRITE(stmt) do { write_enable(); do { stmt; } while (0); write_disable(); } while (0)
which then allows you to write:
ALLOW_WRITE(atomic_foo(&bar));
No duplication.
> > Also, you _cannot_ call vunmap() with IRQs disabled. Clearly
> > you've never tested this with debug bits enabled.
>
> I thought I had them. And I _did_ have them enabled, at some point.
> But I must have messed up with the configuration and I failed to notice
> this.
>
> I can think of a way it might work, albeit it's not going to be very pretty:
>
> * for the vmap(): if I understand correctly, it might sleep while obtaining
> memory for creating the mapping. This part could be executed before
> disabling interrupts. The rest of the function, instead, would be executed
> after interrupts are disabled.
>
> * for vunmap(): after the writing is done, change also the alternate mapping
> to read only, then enable interrupts and destroy the alternate mapping.
> Making also the secondary mapping read only makes it equally secure as the
> primary, which means that it can be visible also with interrupts enabled.
That doesn't work if you wanted to do this write while you already have
IRQs disabled for example.
* Re: [PATCH 16/17] prmem: pratomic-long
2018-10-30 15:58 ` Peter Zijlstra
2018-10-30 15:58 ` Peter Zijlstra
@ 2018-10-30 16:28 ` Will Deacon
2018-10-30 16:28 ` Will Deacon
2018-10-31 9:10 ` Peter Zijlstra
1 sibling, 2 replies; 16+ messages in thread
From: Will Deacon @ 2018-10-30 16:28 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Igor Stoppa, Mimi Zohar, Kees Cook, Matthew Wilcox, Dave Chinner,
James Morris, Michal Hocko, kernel-hardening, linux-integrity,
linux-security-module, igor.stoppa, Dave Hansen, Jonathan Corbet,
Laura Abbott, Boqun Feng, Arnd Bergmann, linux-arch, linux-kernel
On Tue, Oct 30, 2018 at 04:58:41PM +0100, Peter Zijlstra wrote:
> On Mon, Oct 29, 2018 at 11:17:14PM +0200, Igor Stoppa wrote:
> >
> >
> > On 25/10/2018 01:13, Peter Zijlstra wrote:
> > > On Wed, Oct 24, 2018 at 12:35:03AM +0300, Igor Stoppa wrote:
> > > > +static __always_inline
> > > > +bool __pratomic_long_op(bool inc, struct pratomic_long_t *l)
> > > > +{
> > > > + struct page *page;
> > > > + uintptr_t base;
> > > > + uintptr_t offset;
> > > > + unsigned long flags;
> > > > + size_t size = sizeof(*l);
> > > > + bool is_virt = __is_wr_after_init(l, size);
> > > > +
> > > > + if (WARN(!(is_virt || likely(__is_wr_pool(l, size))),
> > > > + WR_ERR_RANGE_MSG))
> > > > + return false;
> > > > + local_irq_save(flags);
> > > > + if (is_virt)
> > > > + page = virt_to_page(l);
> > > > + else
> > > > + page = vmalloc_to_page(l);
> > > > + offset = (~PAGE_MASK) & (uintptr_t)l;
> > > > + base = (uintptr_t)vmap(&page, 1, VM_MAP, PAGE_KERNEL);
> > > > + if (WARN(!base, WR_ERR_PAGE_MSG)) {
> > > > + local_irq_restore(flags);
> > > > + return false;
> > > > + }
> > > > + if (inc)
> > > > + atomic_long_inc((atomic_long_t *)(base + offset));
> > > > + else
> > > > + atomic_long_dec((atomic_long_t *)(base + offset));
> > > > + vunmap((void *)base);
> > > > + local_irq_restore(flags);
> > > > + return true;
> > > > +
> > > > +}
> > >
> > > That's just hideously nasty.. and horribly broken.
> > >
> > > We're not going to duplicate all these kernel interfaces wrapped in gunk
> > > like that.
> >
> > one possibility would be to have macros which use typeof() on the parameter
> > being passed, to decide what implementation to use: regular or write-rare
> >
> > This means that type punning would still be needed, to select the
> > implementation.
> >
> > Would this be enough? Is there some better way?
>
> Like mentioned elsewhere; if you do write_enable() + write_disable()
> thingies, it all becomes:
>
> write_enable();
> atomic_foo(&bar);
> write_disable();
>
> No magic gunk infested duplication at all. Of course, ideally you'd then
> teach objtool about this (or a GCC plugin I suppose) to ensure any
> enable reached a disable.
Isn't the issue here that we don't want to change the page tables for the
mapping of &bar, but instead want to create a temporary writable alias
at a random virtual address?
So you'd want:
wbar = write_enable(&bar);
atomic_foo(wbar);
write_disable(wbar);
which is probably better expressed as a map/unmap API. I suspect this
would also be the only way to do things for cmpxchg() loops, where you
want to create the mapping outside of the loop to minimise your time in
the critical section.
Will
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 16/17] prmem: pratomic-long
2018-10-30 16:28 ` Will Deacon
2018-10-30 16:28 ` Will Deacon
@ 2018-10-31 9:10 ` Peter Zijlstra
2018-10-31 9:10 ` Peter Zijlstra
2018-11-01 3:28 ` Kees Cook
1 sibling, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2018-10-31 9:10 UTC (permalink / raw)
To: Will Deacon
Cc: Igor Stoppa, Mimi Zohar, Kees Cook, Matthew Wilcox, Dave Chinner,
James Morris, Michal Hocko, kernel-hardening, linux-integrity,
linux-security-module, igor.stoppa, Dave Hansen, Jonathan Corbet,
Laura Abbott, Boqun Feng, Arnd Bergmann, linux-arch, linux-kernel
On Tue, Oct 30, 2018 at 04:28:16PM +0000, Will Deacon wrote:
> On Tue, Oct 30, 2018 at 04:58:41PM +0100, Peter Zijlstra wrote:
> > Like mentioned elsewhere; if you do write_enable() + write_disable()
> > thingies, it all becomes:
> >
> > write_enable();
> > atomic_foo(&bar);
> > write_disable();
> >
> > No magic gunk infested duplication at all. Of course, ideally you'd then
> > teach objtool about this (or a GCC plugin I suppose) to ensure any
> > enable reached a disable.
>
> Isn't the issue here that we don't want to change the page tables for the
> mapping of &bar, but instead want to create a temporary writable alias
> at a random virtual address?
>
> So you'd want:
>
> wbar = write_enable(&bar);
> atomic_foo(wbar);
> write_disable(wbar);
>
> which is probably better expressed as a map/unmap API. I suspect this
> would also be the only way to do things for cmpxchg() loops, where you
> want to create the mapping outside of the loop to minimise your time in
> the critical section.
Ah, so I was thinking that the alternative mm would have stuff in the
same location, just RW instead of RO.
But yes, if we, like Andy suggests, use the userspace address range for
the aliases, then we need to do as you suggest.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 16/17] prmem: pratomic-long
2018-10-31 9:10 ` Peter Zijlstra
2018-10-31 9:10 ` Peter Zijlstra
@ 2018-11-01 3:28 ` Kees Cook
2018-11-01 3:28 ` Kees Cook
1 sibling, 1 reply; 16+ messages in thread
From: Kees Cook @ 2018-11-01 3:28 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Will Deacon, Igor Stoppa, Mimi Zohar, Matthew Wilcox,
Dave Chinner, James Morris, Michal Hocko, Kernel Hardening,
linux-integrity, linux-security-module, Igor Stoppa, Dave Hansen,
Jonathan Corbet, Laura Abbott, Boqun Feng, Arnd Bergmann,
linux-arch, LKML
On Wed, Oct 31, 2018 at 2:10 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Oct 30, 2018 at 04:28:16PM +0000, Will Deacon wrote:
>> On Tue, Oct 30, 2018 at 04:58:41PM +0100, Peter Zijlstra wrote:
>> > Like mentioned elsewhere; if you do write_enable() + write_disable()
>> > thingies, it all becomes:
>> >
>> > write_enable();
>> > atomic_foo(&bar);
>> > write_disable();
>> >
>> > No magic gunk infested duplication at all. Of course, ideally you'd then
>> > teach objtool about this (or a GCC plugin I suppose) to ensure any
>> > enable reached a disable.
>>
>> Isn't the issue here that we don't want to change the page tables for the
>> mapping of &bar, but instead want to create a temporary writable alias
>> at a random virtual address?
>>
>> So you'd want:
>>
>> wbar = write_enable(&bar);
>> atomic_foo(wbar);
>> write_disable(wbar);
>>
>> which is probably better expressed as a map/unmap API. I suspect this
>> would also be the only way to do things for cmpxchg() loops, where you
>> want to create the mapping outside of the loop to minimise your time in
>> the critical section.
>
> Ah, so I was thinking that the alternative mm would have stuff in the
> same location, just RW instead of RO.
I was hoping for the same location too. That allows us to use a gcc
plugin to mark, say, function pointer tables, as read-only, and
annotate their rare updates with write_rare() without any
recalculation.
-Kees
--
Kees Cook
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2018-11-01 12:35 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20181023213504.28905-1-igor.stoppa@huawei.com>
2018-10-23 21:34 ` [PATCH 01/17] prmem: linker section for static write rare Igor Stoppa
2018-10-23 21:35 ` [PATCH 16/17] prmem: pratomic-long Igor Stoppa
2018-10-25  0:13   ` Peter Zijlstra
2018-10-29 21:17     ` Igor Stoppa
2018-10-30 15:58       ` Peter Zijlstra
2018-10-30 16:28         ` Will Deacon
2018-10-31  9:10           ` Peter Zijlstra
2018-11-01  3:28             ` Kees Cook
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).