* Re: [PATCH] riscv: Fix vector state restore in rt_sigreturn()
2024-04-03 7:26 [PATCH] riscv: Fix vector state restore in rt_sigreturn() Björn Töpel
@ 2024-04-03 10:12 ` Andy Chiu
2024-04-03 17:33 ` Vineet Gupta
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Andy Chiu @ 2024-04-03 10:12 UTC (permalink / raw)
To: Björn Töpel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
Björn Töpel, Conor Dooley, Heiko Stuebner, Vincent Chen,
Ben Dooks, Greentime Hu, Haorong Lu, Jerry Shih, Nick Knight,
linux-kernel, Vineet Gupta, Charlie Jenkins, Vineet Gupta
On Wed, Apr 3, 2024 at 3:27 PM Björn Töpel <bjorn@kernel.org> wrote:
>
> From: Björn Töpel <bjorn@rivosinc.com>
>
> The RISC-V Vector specification states in "Appendix D: Calling
> Convention for Vector State" [1] that "Executing a system call causes
> all caller-saved vector registers (v0-v31, vl, vtype) and vstart to
> become unspecified.". In the RISC-V kernel this is called "discarding
> the vstate".
>
> Returning from a signal handler via the rt_sigreturn() syscall, vector
> discard is also performed. However, this is not an issue since the
> vector state should be restored from the sigcontext, and therefore not
> care about the vector discard.
>
> The "live state" is the actual vector register in the running context,
> and the "vstate" is the vector state of the task. A dirty live state,
> means that the vstate and live state are not in synch.
>
> When vectorized user_from_copy() was introduced, an bug sneaked in at
> the restoration code, related to the discard of the live state.
>
> An example when this go wrong:
>
> 1. A userland application is executing vector code
> 2. The application receives a signal, and the signal handler is
> entered.
> 3. The application returns from the signal handler, using the
> rt_sigreturn() syscall.
> 4. The live vector state is discarded upon entering the
> rt_sigreturn(), and the live state is marked as "dirty", indicating
> that the live state need to be synchronized with the current
> vstate.
> 5. rt_sigreturn() restores the vstate, except the Vector registers,
> from the sigcontext
> 6. rt_sigreturn() restores the Vector registers, from the sigcontext,
> and now the vectorized user_from_copy() is used. The dirty live
> state from the discard is saved to the vstate, making the vstate
> corrupt.
> 7. rt_sigreturn() returns to the application, which crashes due to
> corrupted vstate.
>
> Note that the vectorized user_from_copy() is invoked depending on the
> value of CONFIG_RISCV_ISA_V_UCOPY_THRESHOLD. Default is 768, which
> means that vlen has to be larger than 128b for this bug to trigger.
>
> The fix is simply to mark the live state as non-dirty/clean prior
> performing the vstate restore.
>
> Link: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-8abdb41-2024-03-26/unpriv-isa-asciidoc.pdf # [1]
> Reported-by: Charlie Jenkins <charlie@rivosinc.com>
> Reported-by: Vineet Gupta <vgupta@kernel.org>
> Fixes: c2a658d41924 ("riscv: lib: vectorize copy_to_user/copy_from_user")
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Thanks for the findings!
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
> ---
> arch/riscv/kernel/signal.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
> index 501e66debf69..5a2edd7f027e 100644
> --- a/arch/riscv/kernel/signal.c
> +++ b/arch/riscv/kernel/signal.c
> @@ -119,6 +119,13 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
> struct __sc_riscv_v_state __user *state = sc_vec;
> void __user *datap;
>
> + /*
> + * Mark the vstate as clean prior performing the actual copy,
> + * to avoid getting the vstate incorrectly clobbered by the
> + * discarded vector state.
> + */
> + riscv_v_vstate_set_restore(current, regs);
> +
> /* Copy everything of __sc_riscv_v_state except datap. */
> err = __copy_from_user(¤t->thread.vstate, &state->v_state,
> offsetof(struct __riscv_v_ext_state, datap));
> @@ -133,13 +140,7 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
> * Copy the whole vector content from user space datap. Use
> * copy_from_user to prevent information leak.
> */
> - err = copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
> - if (unlikely(err))
> - return err;
> -
> - riscv_v_vstate_set_restore(current, regs);
> -
> - return err;
> + return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
> }
> #else
> #define save_v_state(task, regs) (0)
>
> base-commit: 7115ff4a8bfed3b9294bad2e111744e6abeadf1a
> --
> 2.40.1
>
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] riscv: Fix vector state restore in rt_sigreturn()
2024-04-03 7:26 [PATCH] riscv: Fix vector state restore in rt_sigreturn() Björn Töpel
2024-04-03 10:12 ` Andy Chiu
@ 2024-04-03 17:33 ` Vineet Gupta
2024-04-04 19:40 ` patchwork-bot+linux-riscv
2024-12-04 4:45 ` Yangyu Chen
3 siblings, 0 replies; 5+ messages in thread
From: Vineet Gupta @ 2024-04-03 17:33 UTC (permalink / raw)
To: Björn Töpel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Andy Chiu, linux-riscv
Cc: Björn Töpel, Conor Dooley, Heiko Stuebner, Vincent Chen,
Ben Dooks, Greentime Hu, Haorong Lu, Jerry Shih, Nick Knight,
linux-kernel, Charlie Jenkins, Vineet Gupta
On 4/3/24 00:26, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
>
> The RISC-V Vector specification states in "Appendix D: Calling
> Convention for Vector State" [1] that "Executing a system call causes
> all caller-saved vector registers (v0-v31, vl, vtype) and vstart to
> become unspecified.". In the RISC-V kernel this is called "discarding
> the vstate".
>
> Returning from a signal handler via the rt_sigreturn() syscall, vector
> discard is also performed. However, this is not an issue since the
> vector state should be restored from the sigcontext, and therefore not
> care about the vector discard.
>
> The "live state" is the actual vector register in the running context,
> and the "vstate" is the vector state of the task. A dirty live state,
> means that the vstate and live state are not in synch.
>
> When vectorized user_from_copy() was introduced, an bug sneaked in at
> the restoration code, related to the discard of the live state.
>
> An example when this go wrong:
>
> 1. A userland application is executing vector code
> 2. The application receives a signal, and the signal handler is
> entered.
> 3. The application returns from the signal handler, using the
> rt_sigreturn() syscall.
> 4. The live vector state is discarded upon entering the
> rt_sigreturn(), and the live state is marked as "dirty", indicating
> that the live state need to be synchronized with the current
> vstate.
> 5. rt_sigreturn() restores the vstate, except the Vector registers,
> from the sigcontext
> 6. rt_sigreturn() restores the Vector registers, from the sigcontext,
> and now the vectorized user_from_copy() is used. The dirty live
> state from the discard is saved to the vstate, making the vstate
> corrupt.
> 7. rt_sigreturn() returns to the application, which crashes due to
> corrupted vstate.
>
> Note that the vectorized user_from_copy() is invoked depending on the
> value of CONFIG_RISCV_ISA_V_UCOPY_THRESHOLD. Default is 768, which
> means that vlen has to be larger than 128b for this bug to trigger.
>
> The fix is simply to mark the live state as non-dirty/clean prior
> performing the vstate restore.
>
> Link: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-8abdb41-2024-03-26/unpriv-isa-asciidoc.pdf # [1]
> Reported-by: Charlie Jenkins <charlie@rivosinc.com>
> Reported-by: Vineet Gupta <vgupta@kernel.org>
> Fixes: c2a658d41924 ("riscv: lib: vectorize copy_to_user/copy_from_user")
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Vineet Gupta <vineetg@rivosinc.com>
For completeness (and fun)
1. The issue was triggered on dual core spike run with a seemingly
benign workload (the key is repeated fork/execve/exit with a little I/O)
some-shell-script.sh
#!/bin/bash
(while true; do ls; done) &
for i in $seq (1 20); do
<long running job>
done
2. The issue initially appears as follows: Vector store instruction,
before starting to run invalidates it's own context (page fault ->
preemption -> handle-signal -> sigreturn -> VILL / V-clobber), so when
it eventually runs, it takes an illegal instruction exception, taking
down the entire program.
Thx,
-Vineet
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] riscv: Fix vector state restore in rt_sigreturn()
2024-04-03 7:26 [PATCH] riscv: Fix vector state restore in rt_sigreturn() Björn Töpel
2024-04-03 10:12 ` Andy Chiu
2024-04-03 17:33 ` Vineet Gupta
@ 2024-04-04 19:40 ` patchwork-bot+linux-riscv
2024-12-04 4:45 ` Yangyu Chen
3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+linux-riscv @ 2024-04-04 19:40 UTC (permalink / raw)
To: =?utf-8?b?QmrDtnJuIFTDtnBlbCA8Ympvcm5Aa2VybmVsLm9yZz4=?=
Cc: linux-riscv, paul.walmsley, palmer, aou, andy.chiu, bjorn,
conor.dooley, heiko, vincent.chen, ben.dooks, greentime.hu,
ancientmodern4, jerry.shih, nick.knight, linux-kernel, vineetg,
charlie, vgupta
Hello:
This patch was applied to riscv/linux.git (fixes)
by Palmer Dabbelt <palmer@rivosinc.com>:
On Wed, 3 Apr 2024 09:26:38 +0200 you wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
>
> The RISC-V Vector specification states in "Appendix D: Calling
> Convention for Vector State" [1] that "Executing a system call causes
> all caller-saved vector registers (v0-v31, vl, vtype) and vstart to
> become unspecified.". In the RISC-V kernel this is called "discarding
> the vstate".
>
> [...]
Here is the summary with links:
- riscv: Fix vector state restore in rt_sigreturn()
https://git.kernel.org/riscv/c/c27fa53b858b
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] riscv: Fix vector state restore in rt_sigreturn()
2024-04-03 7:26 [PATCH] riscv: Fix vector state restore in rt_sigreturn() Björn Töpel
` (2 preceding siblings ...)
2024-04-04 19:40 ` patchwork-bot+linux-riscv
@ 2024-12-04 4:45 ` Yangyu Chen
3 siblings, 0 replies; 5+ messages in thread
From: Yangyu Chen @ 2024-12-04 4:45 UTC (permalink / raw)
To: Björn Töpel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Andy Chiu, linux-riscv
Cc: Björn Töpel, Conor Dooley, Heiko Stuebner, Vincent Chen,
Ben Dooks, Greentime Hu, Haorong Lu, Jerry Shih, Nick Knight,
linux-kernel, Vineet Gupta, Charlie Jenkins, Vineet Gupta, stable
I think this patch should also be backported to the v6.6 LTS tree.
Since it should recolonize as Fixes: 8ee0b41898 ("riscv: signal:
Add sigcontext save/restore for vector") and that commit first
appears since v6.5-rc1 and this patch land to master branch since
v6.9-rc3
Thanks,
Yangyu Chen
On 4/3/24 15:26, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
> The RISC-V Vector specification states in "Appendix D: Calling
> Convention for Vector State" [1] that "Executing a system call causes
> all caller-saved vector registers (v0-v31, vl, vtype) and vstart to
> become unspecified.". In the RISC-V kernel this is called "discarding
> the vstate".
> Returning from a signal handler via the rt_sigreturn() syscall, vector
> discard is also performed. However, this is not an issue since the
> vector state should be restored from the sigcontext, and therefore not
> care about the vector discard.
> The "live state" is the actual vector register in the running context,
> and the "vstate" is the vector state of the task. A dirty live state,
> means that the vstate and live state are not in synch.
> When vectorized user_from_copy() was introduced, an bug sneaked in at
> the restoration code, related to the discard of the live state.
> An example when this go wrong:
> 1. A userland application is executing vector code
> 2. The application receives a signal, and the signal handler is
> entered.
> 3. The application returns from the signal handler, using the
> rt_sigreturn() syscall.
> 4. The live vector state is discarded upon entering the
> rt_sigreturn(), and the live state is marked as "dirty", indicating
> that the live state need to be synchronized with the current
> vstate.
> 5. rt_sigreturn() restores the vstate, except the Vector registers,
> from the sigcontext
> 6. rt_sigreturn() restores the Vector registers, from the sigcontext,
> and now the vectorized user_from_copy() is used. The dirty live
> state from the discard is saved to the vstate, making the vstate
> corrupt.
> 7. rt_sigreturn() returns to the application, which crashes due to
> corrupted vstate.
> Note that the vectorized user_from_copy() is invoked depending on the
> value of CONFIG_RISCV_ISA_V_UCOPY_THRESHOLD. Default is 768, which
> means that vlen has to be larger than 128b for this bug to trigger.
> The fix is simply to mark the live state as non-dirty/clean prior
> performing the vstate restore.
> Link: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-8abdb41-2024-03-26/unpriv-isa-asciidoc.pdf # [1]
> Reported-by: Charlie Jenkins <charlie@rivosinc.com>
> Reported-by: Vineet Gupta <vgupta@kernel.org>
> Fixes: c2a658d41924 ("riscv: lib: vectorize copy_to_user/copy_from_user")
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
> ---
> arch/riscv/kernel/signal.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
> diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
> index 501e66debf69..5a2edd7f027e 100644
> --- a/arch/riscv/kernel/signal.c
> +++ b/arch/riscv/kernel/signal.c
> @@ -119,6 +119,13 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
> struct __sc_riscv_v_state __user *state = sc_vec;
> void __user *datap;
> + /*
> + * Mark the vstate as clean prior performing the actual copy,
> + * to avoid getting the vstate incorrectly clobbered by the
> + * discarded vector state.
> + */
> + riscv_v_vstate_set_restore(current, regs);
> +
> /* Copy everything of __sc_riscv_v_state except datap. */
> err = __copy_from_user(¤t->thread.vstate, &state->v_state,
> offsetof(struct __riscv_v_ext_state, datap));
> @@ -133,13 +140,7 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
> * Copy the whole vector content from user space datap. Use
> * copy_from_user to prevent information leak.
> */
> - err = copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
> - if (unlikely(err))
> - return err;
> -
> - riscv_v_vstate_set_restore(current, regs);
> -
> - return err;
> + return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize);
> }
> #else
> #define save_v_state(task, regs) (0)
> base-commit: 7115ff4a8bfed3b9294bad2e111744e6abeadf1a
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 5+ messages in thread