* [PATCH v2] fs/proc: Expose RSEQ configuration
@ 2021-01-14 18:54 Piotr Figiel
2021-01-15 12:23 ` kernel test robot
2021-01-15 15:44 ` Mathieu Desnoyers
0 siblings, 2 replies; 5+ messages in thread
From: Piotr Figiel @ 2021-01-14 18:54 UTC (permalink / raw)
To: Alexey Dobriyan, Eric W. Biederman, Andrew Morton, Kees Cook,
Alexey Gladkov, Christian Brauner, Michel Lespinasse,
Bernd Edlinger, Andrei Vagin, mathieu.desnoyers
Cc: linux-kernel, linux-fsdevel, posk, kyurtsever, ckennelly, pjt,
Piotr Figiel
For userspace checkpoint and restore (C/R) some way of getting process
state containing RSEQ configuration is needed.
There are two ways this information is going to be used:
- to re-enable RSEQ for threads which had it enabled before C/R
- to detect if a thread was in a critical section during C/R
Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
using the address registered before C/R.
Detection whether the thread is in a critical section during C/R is
needed to enforce behavior of RSEQ abort during C/R. Attaching with
ptrace() before registers are dumped itself doesn't cause RSEQ abort.
Restoring the instruction pointer within the critical section is
problematic because rseq_cs may get cleared before the control is
passed to the migrated application code leading to RSEQ invariants not
being preserved.
Signed-off-by: Piotr Figiel <figiel@google.com>
---
v2:
- fixed string formatting for 32-bit architectures
v1:
- https://lkml.kernel.org/r/20210113174127.2500051-1-figiel@google.com
---
fs/proc/base.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index b3422cda2a91..7cc36a224b8b 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -662,6 +662,21 @@ static int proc_pid_syscall(struct seq_file *m, struct pid_namespace *ns,
return 0;
}
+
+#ifdef CONFIG_RSEQ
+static int proc_pid_rseq(struct seq_file *m, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *task)
+{
+ int res = lock_trace(task);
+
+ if (res)
+ return res;
+ seq_printf(m, "%tx %08x\n", (ptrdiff_t)((uintptr_t)task->rseq),
+ task->rseq_sig);
+ unlock_trace(task);
+ return 0;
+}
+#endif /* CONFIG_RSEQ */
#endif /* CONFIG_HAVE_ARCH_TRACEHOOK */
/************************************************************************/
@@ -3182,6 +3197,9 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
ONE("syscall", S_IRUSR, proc_pid_syscall),
+#ifdef CONFIG_RSEQ
+ ONE("rseq", S_IRUSR, proc_pid_rseq),
+#endif
#endif
REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
ONE("stat", S_IRUGO, proc_tgid_stat),
@@ -3522,6 +3540,9 @@ static const struct pid_entry tid_base_stuff[] = {
&proc_pid_set_comm_operations, {}),
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
ONE("syscall", S_IRUSR, proc_pid_syscall),
+#ifdef CONFIG_RSEQ
+ ONE("rseq", S_IRUSR, proc_pid_rseq),
+#endif
#endif
REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
ONE("stat", S_IRUGO, proc_tid_stat),
--
2.30.0.284.gd98b1dd5eaa7-goog
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] fs/proc: Expose RSEQ configuration
2021-01-14 18:54 [PATCH v2] fs/proc: Expose RSEQ configuration Piotr Figiel
@ 2021-01-15 12:23 ` kernel test robot
2021-01-15 15:44 ` Mathieu Desnoyers
1 sibling, 0 replies; 5+ messages in thread
From: kernel test robot @ 2021-01-15 12:23 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 8720 bytes --]
Hi Piotr,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on userns/for-next]
[also build test WARNING on linus/master hnaz-linux-mm/master v5.11-rc3 next-20210115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Piotr-Figiel/fs-proc-Expose-RSEQ-configuration/20210115-114533
base: https://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-next
config: powerpc64-randconfig-r021-20210115 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 5b42fd8dd4e7e29125a09a41a33af7c9cb57d144)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc64 cross compiling tool for clang build
# apt-get install binutils-powerpc64-linux-gnu
# https://github.com/0day-ci/linux/commit/dc5b7016b885415391e9d88d7349de14353c7249
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Piotr-Figiel/fs-proc-Expose-RSEQ-configuration/20210115-114533
git checkout dc5b7016b885415391e9d88d7349de14353c7249
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:78:1: note: expanded from here
__do_insw
^
arch/powerpc/include/asm/io.h:557:56: note: expanded from macro '__do_insw'
#define __do_insw(p, b, n) readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:80:1: note: expanded from here
__do_insl
^
arch/powerpc/include/asm/io.h:558:56: note: expanded from macro '__do_insl'
#define __do_insl(p, b, n) readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:82:1: note: expanded from here
__do_outsb
^
arch/powerpc/include/asm/io.h:559:58: note: expanded from macro '__do_outsb'
#define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:51:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outsw, (unsigned long p, const void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:84:1: note: expanded from here
__do_outsw
^
arch/powerpc/include/asm/io.h:560:58: note: expanded from macro '__do_outsw'
#define __do_outsw(p, b, n) writesw((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:53:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outsl, (unsigned long p, const void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:86:1: note: expanded from here
__do_outsl
^
arch/powerpc/include/asm/io.h:561:58: note: expanded from macro '__do_outsl'
#define __do_outsl(p, b, n) writesl((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
~~~~~~~~~~~~~~~~~~~~~^
>> fs/proc/base.c:674:30: warning: format specifies type 'unsigned ptrdiff_t' (aka 'unsigned int') but the argument has type 'ptrdiff_t' (aka 'long') [-Wformat]
seq_printf(m, "%tx %08x\n", (ptrdiff_t)((uintptr_t)task->rseq),
~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%lx
7 warnings generated.
vim +674 fs/proc/base.c
665
666 #ifdef CONFIG_RSEQ
667 static int proc_pid_rseq(struct seq_file *m, struct pid_namespace *ns,
668 struct pid *pid, struct task_struct *task)
669 {
670 int res = lock_trace(task);
671
672 if (res)
673 return res;
> 674 seq_printf(m, "%tx %08x\n", (ptrdiff_t)((uintptr_t)task->rseq),
675 task->rseq_sig);
676 unlock_trace(task);
677 return 0;
678 }
679 #endif /* CONFIG_RSEQ */
680 #endif /* CONFIG_HAVE_ARCH_TRACEHOOK */
681
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 25620 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] fs/proc: Expose RSEQ configuration
@ 2021-01-15 12:23 ` kernel test robot
0 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2021-01-15 12:23 UTC (permalink / raw)
To: Piotr Figiel, Alexey Dobriyan, Eric W. Biederman, Andrew Morton,
Kees Cook, Alexey Gladkov, Christian Brauner, Michel Lespinasse,
Bernd Edlinger, Andrei Vagin
Cc: kbuild-all, clang-built-linux, Linux Memory Management List
[-- Attachment #1: Type: text/plain, Size: 8557 bytes --]
Hi Piotr,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on userns/for-next]
[also build test WARNING on linus/master hnaz-linux-mm/master v5.11-rc3 next-20210115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Piotr-Figiel/fs-proc-Expose-RSEQ-configuration/20210115-114533
base: https://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git for-next
config: powerpc64-randconfig-r021-20210115 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 5b42fd8dd4e7e29125a09a41a33af7c9cb57d144)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc64 cross compiling tool for clang build
# apt-get install binutils-powerpc64-linux-gnu
# https://github.com/0day-ci/linux/commit/dc5b7016b885415391e9d88d7349de14353c7249
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Piotr-Figiel/fs-proc-Expose-RSEQ-configuration/20210115-114533
git checkout dc5b7016b885415391e9d88d7349de14353c7249
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:78:1: note: expanded from here
__do_insw
^
arch/powerpc/include/asm/io.h:557:56: note: expanded from macro '__do_insw'
#define __do_insw(p, b, n) readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:80:1: note: expanded from here
__do_insl
^
arch/powerpc/include/asm/io.h:558:56: note: expanded from macro '__do_insl'
#define __do_insl(p, b, n) readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:82:1: note: expanded from here
__do_outsb
^
arch/powerpc/include/asm/io.h:559:58: note: expanded from macro '__do_outsb'
#define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:51:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outsw, (unsigned long p, const void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:84:1: note: expanded from here
__do_outsw
^
arch/powerpc/include/asm/io.h:560:58: note: expanded from macro '__do_outsw'
#define __do_outsw(p, b, n) writesw((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
~~~~~~~~~~~~~~~~~~~~~^
In file included from fs/proc/base.c:68:
In file included from include/linux/swap.h:9:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:10:
In file included from arch/powerpc/include/asm/hardirq.h:6:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/powerpc/include/asm/io.h:619:
arch/powerpc/include/asm/io-defs.h:53:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outsl, (unsigned long p, const void *b, unsigned long c),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al; \
^~~~~~~~~~~~~~
<scratch space>:86:1: note: expanded from here
__do_outsl
^
arch/powerpc/include/asm/io.h:561:58: note: expanded from macro '__do_outsl'
#define __do_outsl(p, b, n) writesl((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
~~~~~~~~~~~~~~~~~~~~~^
>> fs/proc/base.c:674:30: warning: format specifies type 'unsigned ptrdiff_t' (aka 'unsigned int') but the argument has type 'ptrdiff_t' (aka 'long') [-Wformat]
seq_printf(m, "%tx %08x\n", (ptrdiff_t)((uintptr_t)task->rseq),
~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
%lx
7 warnings generated.
vim +674 fs/proc/base.c
665
666 #ifdef CONFIG_RSEQ
667 static int proc_pid_rseq(struct seq_file *m, struct pid_namespace *ns,
668 struct pid *pid, struct task_struct *task)
669 {
670 int res = lock_trace(task);
671
672 if (res)
673 return res;
> 674 seq_printf(m, "%tx %08x\n", (ptrdiff_t)((uintptr_t)task->rseq),
675 task->rseq_sig);
676 unlock_trace(task);
677 return 0;
678 }
679 #endif /* CONFIG_RSEQ */
680 #endif /* CONFIG_HAVE_ARCH_TRACEHOOK */
681
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25620 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] fs/proc: Expose RSEQ configuration
2021-01-14 18:54 [PATCH v2] fs/proc: Expose RSEQ configuration Piotr Figiel
2021-01-15 12:23 ` kernel test robot
@ 2021-01-15 15:44 ` Mathieu Desnoyers
2021-01-18 17:25 ` Piotr Figiel
1 sibling, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2021-01-15 15:44 UTC (permalink / raw)
To: Piotr Figiel, Peter Zijlstra, paulmck, Boqun Feng
Cc: Alexey Dobriyan, Eric W. Biederman, Andrew Morton, Kees Cook,
Alexey Gladkov, Christian Brauner, Michel Lespinasse,
Bernd Edlinger, Andrei Vagin, linux-kernel, linux-fsdevel,
Peter Oskolkov, Kamil Yurtsever, Chris Kennelly, Paul Turner
----- On Jan 14, 2021, at 1:54 PM, Piotr Figiel figiel@google.com wrote:
Added PeterZ, Paul and Boqun to CC. They are also listed as maintainers of rseq.
Please CC them in your next round of patches.
> For userspace checkpoint and restore (C/R) some way of getting process
> state containing RSEQ configuration is needed.
>
> There are two ways this information is going to be used:
> - to re-enable RSEQ for threads which had it enabled before C/R
> - to detect if a thread was in a critical section during C/R
>
> Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
> using the address registered before C/R.
Indeed, if the process goes through a checkpoint/restore while within a
rseq c.s., that critical section should abort. Given that it's only the
restored process which resumes user-space execution, there should be some
way to ensure that the rseq tls pointer is restored before that thread goes
back to user-space, or some way to ensure the rseq TLS is registered
before that thread returns to the saved instruction pointer.
How do you plan to re-register the rseq TLS for each thread upon restore ?
I suspect you move the return IP to the abort either at checkpoint or restore
if you detect that the thread is running in a rseq critical section.
>
> Detection whether the thread is in a critical section during C/R is
> needed to enforce behavior of RSEQ abort during C/R. Attaching with
> ptrace() before registers are dumped itself doesn't cause RSEQ abort.
Right, because the RSEQ abort is only done when going back to user-space,
and AFAIU the checkpointed process will cease to exist, and won't go back
to user-space, therefore bypassing any RSEQ abort.
> Restoring the instruction pointer within the critical section is
> problematic because rseq_cs may get cleared before the control is
> passed to the migrated application code leading to RSEQ invariants not
> being preserved.
The commit message should state that both the per-thread rseq TLS area address
and the signature are dumped within this new proc file.
>
> Signed-off-by: Piotr Figiel <figiel@google.com>
>
> ---
>
> v2:
> - fixed string formatting for 32-bit architectures
>
> v1:
> - https://lkml.kernel.org/r/20210113174127.2500051-1-figiel@google.com
>
> ---
> fs/proc/base.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index b3422cda2a91..7cc36a224b8b 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -662,6 +662,21 @@ static int proc_pid_syscall(struct seq_file *m, struct
> pid_namespace *ns,
>
> return 0;
> }
> +
> +#ifdef CONFIG_RSEQ
> +static int proc_pid_rseq(struct seq_file *m, struct pid_namespace *ns,
> + struct pid *pid, struct task_struct *task)
> +{
> + int res = lock_trace(task);
AFAIU lock_trace prevents concurrent exec() from modifying the task's content.
What prevents a concurrent rseq register/unregister to be executed concurrently
with proc_pid_rseq ?
> +
> + if (res)
> + return res;
> + seq_printf(m, "%tx %08x\n", (ptrdiff_t)((uintptr_t)task->rseq),
I wonder if all those parentheses are needed. Wouldn't it be enough to have:
(ptrdiff_t)(uintptr_t)task->rseq
?
Thanks,
Mathieu
> + task->rseq_sig);
> + unlock_trace(task);
> + return 0;
> +}
> +#endif /* CONFIG_RSEQ */
> #endif /* CONFIG_HAVE_ARCH_TRACEHOOK */
>
> /************************************************************************/
> @@ -3182,6 +3197,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> REG("comm", S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
> #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
> ONE("syscall", S_IRUSR, proc_pid_syscall),
> +#ifdef CONFIG_RSEQ
> + ONE("rseq", S_IRUSR, proc_pid_rseq),
> +#endif
> #endif
> REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
> ONE("stat", S_IRUGO, proc_tgid_stat),
> @@ -3522,6 +3540,9 @@ static const struct pid_entry tid_base_stuff[] = {
> &proc_pid_set_comm_operations, {}),
> #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
> ONE("syscall", S_IRUSR, proc_pid_syscall),
> +#ifdef CONFIG_RSEQ
> + ONE("rseq", S_IRUSR, proc_pid_rseq),
> +#endif
> #endif
> REG("cmdline", S_IRUGO, proc_pid_cmdline_ops),
> ONE("stat", S_IRUGO, proc_tid_stat),
> --
> 2.30.0.284.gd98b1dd5eaa7-goog
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] fs/proc: Expose RSEQ configuration
2021-01-15 15:44 ` Mathieu Desnoyers
@ 2021-01-18 17:25 ` Piotr Figiel
0 siblings, 0 replies; 5+ messages in thread
From: Piotr Figiel @ 2021-01-18 17:25 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Peter Zijlstra, paulmck, Boqun Feng, Alexey Dobriyan,
Eric W. Biederman, Andrew Morton, Kees Cook, Alexey Gladkov,
Christian Brauner, Michel Lespinasse, Bernd Edlinger,
Andrei Vagin, linux-kernel, linux-fsdevel, Peter Oskolkov,
Kamil Yurtsever, Chris Kennelly, Paul Turner
Hi, thanks for review.
On Fri, Jan 15, 2021 at 10:44:20AM -0500, Mathieu Desnoyers wrote:
> ----- On Jan 14, 2021, at 1:54 PM, Piotr Figiel figiel@google.com wrote:
> Added PeterZ, Paul and Boqun to CC. They are also listed as maintainers of rseq.
> Please CC them in your next round of patches.
OK.
> > Since C/R preserves TLS memory and addresses RSEQ ABI will be
> > restored using the address registered before C/R.
> How do you plan to re-register the rseq TLS for each thread upon
> restore ?
In CRIU restorer there is a moment when the code runs on behalf of the
restored thread after the memory is already restored but before the
control is passed to the application code. I'm going to use rseq()
syscall there with the checkpointed values of ABI address and signatures
(obtained via the newly added procfs file).
> I suspect you move the return IP to the abort either at checkpoint or
> restore if you detect that the thread is running in a rseq critical
> section.
Actually in the prototype implementation I use PTRACE_SINGLESTEP during
checkpointing (with some safeguards) to force the kernel to jump out of
the critical section before registers values are fetched. This has the
drawback though that the first instruction of abort handler is executed
upon checkpointing.
I'll likely rework it to update instruction pointer by getting abort
address with PTRACE_PEEKTEXT (via RSEQ ABI).
I think an option is to have a kernel interface to trigger the abort on
userspace's request without need for some hacks. This could be a ptrace
extension. Alternatively attach could trigger RSEQ logic, but this is
potentially a breaking change for debuggers.
> > Detection whether the thread is in a critical section during C/R is
> > needed to enforce behavior of RSEQ abort during C/R. Attaching with
> > ptrace() before registers are dumped itself doesn't cause RSEQ
> > abort.
> Right, because the RSEQ abort is only done when going back to
> user-space, and AFAIU the checkpointed process will cease to exist,
> and won't go back to user-space, therefore bypassing any RSEQ abort.
The checkpointed process doesn't have to cease to exist, actually it can
continue, and when it's unpaused kernel will schedule the process and
should call the abort handler for RSEQ CS. But this will happen on the
checkpointing side after process state was already checkpointed.
For C/R is important that the checkpoint (serialized process state) is
safe wrt RSEQ.
> > Restoring the instruction pointer within the critical section is
> > problematic because rseq_cs may get cleared before the control is
> > passed to the migrated application code leading to RSEQ invariants
> > not being preserved.
> The commit message should state that both the per-thread rseq TLS area
> address and the signature are dumped within this new proc file.
I'll include this in v3, thanks.
> AFAIU lock_trace prevents concurrent exec() from modifying the task's
> content. What prevents a concurrent rseq register/unregister to be
> executed concurrently with proc_pid_rseq ?
Yes, in this shape only ptrace prevents, as it was the intended use
case. Do you think it would make sense to add a mutex on task_struct for
the purpose of casual reader (sys admin?) consistency? This would be
locked only here and in the syscall during setting.
(Alternatively SMP barrier could be used to enforce the order so that
the ABI address is always written first, and the signature wouldn't make
sense on ABI address = 0, but probably being simply consistent is
better).
> I wonder if all those parentheses are needed. Wouldn't it be enough to have:
Will remove thanks.
Best regards, Piotr.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-01-18 17:27 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-01-14 18:54 [PATCH v2] fs/proc: Expose RSEQ configuration Piotr Figiel
2021-01-15 12:23 ` kernel test robot
2021-01-15 12:23 ` kernel test robot
2021-01-15 15:44 ` Mathieu Desnoyers
2021-01-18 17:25 ` Piotr Figiel
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.