* [PATCH v1 0/1] 68020/030 signal handling following exceptions @ 2023-05-06 9:38 Finn Thain 2023-05-06 9:38 ` [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 Finn Thain 0 siblings, 1 reply; 6+ messages in thread From: Finn Thain @ 2023-05-06 9:38 UTC (permalink / raw) To: Geert Uytterhoeven Cc: linux-kernel, linux-m68k, Michael Schmitz, Andreas Schwab, stable This patch prevents potential stack corruption on 68020/030 when delivering signals following a bus error or (theoretically) an address error. Changed since RFC: - Dropped patch 1 because, as Andreas pointed out, it will not work properly. Finn Thain (1): m68k: Move signal frame following exception on 68020/030 arch/m68k/kernel/signal.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) -- 2.37.5 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 2023-05-06 9:38 [PATCH v1 0/1] 68020/030 signal handling following exceptions Finn Thain @ 2023-05-06 9:38 ` Finn Thain 2023-05-22 11:41 ` Geert Uytterhoeven 0 siblings, 1 reply; 6+ messages in thread From: Finn Thain @ 2023-05-06 9:38 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Michael Schmitz, Andreas Schwab, stable, linux-m68k, linux-kernel On 68030/020, an instruction such as, moveml %a2-%a3/%a5,%sp@- may cause a stack page fault during instruction execution (i.e. not at an instruction boundary) and produce a format 0xB exception frame. In this situation, the value of USP will be unreliable. If a signal is to be delivered following the exception, this USP value is used to calculate the location for a signal frame. This can result in a corrupted user stack. The corruption was detected in dash (actually in glibc) where it showed up as an intermittent "stack smashing detected" message and crash following signal delivery for SIGCHLD. It was hard to reproduce that failure because delivery of the signal raced with the page fault and because the kernel places an unpredictable gap of up to 7 bytes between the USP and the signal frame. A format 0xB exception frame can be produced by a bus error or an address error. The 68030 Users Manual says that address errors occur immediately upon detection during instruction prefetch. The instruction pipeline allows prefetch to overlap with other instructions, which means an address error can arise during the execution of a different instruction. So it seems likely that this patch may help in the address error case also. Reported-and-tested-by: Stan Johnson <userm57@yahoo.com> Link: https://lore.kernel.org/all/CAMuHMdW3yD22_ApemzW_6me3adq6A458u1_F0v-1EYwK_62jPA@mail.gmail.com/ Cc: Michael Schmitz <schmitzmic@gmail.com> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: stable@vger.kernel.org Co-developed-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Finn Thain <fthain@linux-m68k.org> --- This is the same patch that was posted previously. The commit log has been revised and tags added. --- arch/m68k/kernel/signal.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c index b9f6908a31bc..8aeafbb083f7 100644 --- a/arch/m68k/kernel/signal.c +++ b/arch/m68k/kernel/signal.c @@ -858,11 +858,16 @@ static inline int rt_setup_ucontext(struct ucontext __user *uc, struct pt_regs * } static inline void __user * -get_sigframe(struct ksignal *ksig, size_t frame_size) +get_sigframe(struct ksignal *ksig, struct pt_regs *tregs, size_t frame_size) { unsigned long usp = sigsp(rdusp(), ksig); + unsigned long gap = 0; - return (void __user *)((usp - frame_size) & -8UL); + if (CPU_IS_020_OR_030 && tregs->format == 0xb) + /* USP is unreliable so use worst-case value */ + gap = 256; + + return (void __user *)((usp - gap - frame_size) & -8UL); } static int setup_frame(struct ksignal *ksig, sigset_t *set, @@ -880,7 +885,7 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set, return -EFAULT; } - frame = get_sigframe(ksig, sizeof(*frame) + fsize); + frame = get_sigframe(ksig, tregs, sizeof(*frame) + fsize); if (fsize) err |= copy_to_user (frame + 1, regs + 1, fsize); @@ -952,7 +957,7 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set, return -EFAULT; } - frame = get_sigframe(ksig, sizeof(*frame)); + frame = get_sigframe(ksig, tregs, sizeof(*frame)); if (fsize) err |= copy_to_user (&frame->uc.uc_extra, regs + 1, fsize); -- 2.37.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 2023-05-06 9:38 ` [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 Finn Thain @ 2023-05-22 11:41 ` Geert Uytterhoeven 2023-05-23 1:11 ` Michael Schmitz 2023-05-25 2:13 ` Finn Thain 0 siblings, 2 replies; 6+ messages in thread From: Geert Uytterhoeven @ 2023-05-22 11:41 UTC (permalink / raw) To: Finn Thain Cc: Michael Schmitz, Andreas Schwab, stable, linux-m68k, linux-kernel On Sat, May 6, 2023 at 11:36 AM Finn Thain <fthain@linux-m68k.org> wrote: > On 68030/020, an instruction such as, moveml %a2-%a3/%a5,%sp@- may cause > a stack page fault during instruction execution (i.e. not at an > instruction boundary) and produce a format 0xB exception frame. > > In this situation, the value of USP will be unreliable. If a signal is to > be delivered following the exception, this USP value is used to calculate > the location for a signal frame. This can result in a corrupted user > stack. > > The corruption was detected in dash (actually in glibc) where it showed > up as an intermittent "stack smashing detected" message and crash > following signal delivery for SIGCHLD. > > It was hard to reproduce that failure because delivery of the signal > raced with the page fault and because the kernel places an unpredictable > gap of up to 7 bytes between the USP and the signal frame. > > A format 0xB exception frame can be produced by a bus error or an address > error. The 68030 Users Manual says that address errors occur immediately > upon detection during instruction prefetch. The instruction pipeline > allows prefetch to overlap with other instructions, which means an > address error can arise during the execution of a different instruction. > So it seems likely that this patch may help in the address error case also. > > Reported-and-tested-by: Stan Johnson <userm57@yahoo.com> > Link: https://lore.kernel.org/all/CAMuHMdW3yD22_ApemzW_6me3adq6A458u1_F0v-1EYwK_62jPA@mail.gmail.com/ > Cc: Michael Schmitz <schmitzmic@gmail.com> > Cc: Andreas Schwab <schwab@linux-m68k.org> > Cc: stable@vger.kernel.org > Co-developed-by: Michael Schmitz <schmitzmic@gmail.com> > Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> > Signed-off-by: Finn Thain <fthain@linux-m68k.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> i.e. will queue as a fix in the m68k for-v6.4 branch. I plan to send this upstream later this week, so any additional testing would be appreciated. Thanks! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 2023-05-22 11:41 ` Geert Uytterhoeven @ 2023-05-23 1:11 ` Michael Schmitz 2023-05-23 6:59 ` Geert Uytterhoeven 2023-05-25 2:13 ` Finn Thain 1 sibling, 1 reply; 6+ messages in thread From: Michael Schmitz @ 2023-05-23 1:11 UTC (permalink / raw) To: Geert Uytterhoeven, Finn Thain Cc: Andreas Schwab, stable, linux-m68k, linux-kernel Hi Geert, On 22/05/23 23:41, Geert Uytterhoeven wrote: > On Sat, May 6, 2023 at 11:36 AM Finn Thain <fthain@linux-m68k.org> wrote: >> On 68030/020, an instruction such as, moveml %a2-%a3/%a5,%sp@- may cause >> a stack page fault during instruction execution (i.e. not at an >> instruction boundary) and produce a format 0xB exception frame. >> >> In this situation, the value of USP will be unreliable. If a signal is to >> be delivered following the exception, this USP value is used to calculate >> the location for a signal frame. This can result in a corrupted user >> stack. >> >> The corruption was detected in dash (actually in glibc) where it showed >> up as an intermittent "stack smashing detected" message and crash >> following signal delivery for SIGCHLD. >> >> It was hard to reproduce that failure because delivery of the signal >> raced with the page fault and because the kernel places an unpredictable >> gap of up to 7 bytes between the USP and the signal frame. >> >> A format 0xB exception frame can be produced by a bus error or an address >> error. The 68030 Users Manual says that address errors occur immediately >> upon detection during instruction prefetch. The instruction pipeline >> allows prefetch to overlap with other instructions, which means an >> address error can arise during the execution of a different instruction. >> So it seems likely that this patch may help in the address error case also. >> >> Reported-and-tested-by: Stan Johnson <userm57@yahoo.com> >> Link: https://lore.kernel.org/all/CAMuHMdW3yD22_ApemzW_6me3adq6A458u1_F0v-1EYwK_62jPA@mail.gmail.com/ >> Cc: Michael Schmitz <schmitzmic@gmail.com> >> Cc: Andreas Schwab <schwab@linux-m68k.org> >> Cc: stable@vger.kernel.org >> Co-developed-by: Michael Schmitz <schmitzmic@gmail.com> >> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> >> Signed-off-by: Finn Thain <fthain@linux-m68k.org> > Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> > i.e. will queue as a fix in the m68k for-v6.4 branch. > > I plan to send this upstream later this week, so any additional > testing would be appreciated. I've given this some lengthy stress testing, and haven't seen it fail once. In contrast, various attempts of mine to improve on the concept (by only moving the signal frame away from the USP in case it's likely to clash) sometimes came up against a kernel bus error in setup_frame() when copying the signo to the signal frame. I must be making some incorrect assumptions still ... Limiting the signal frame shift to bus fault exceptions that happen mid-instruction is not too much of an overhead even in low memory settings, and using 256 bytes (the largest possible operand size, i.e. the largest adjustment to USP that might occur on completion of the interrupted instruction) did not seem to cause any issues with stack growth either. I can give this some more testing in ARAnyM (extending the stack shift to format 7 frames) but I'd say it's got as much testing on 030 hardware as we can do. Cheers, Michael > > Thanks! > > Gr{oetje,eeting}s, > > Geert > > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like that. > -- Linus Torvalds ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 2023-05-23 1:11 ` Michael Schmitz @ 2023-05-23 6:59 ` Geert Uytterhoeven 0 siblings, 0 replies; 6+ messages in thread From: Geert Uytterhoeven @ 2023-05-23 6:59 UTC (permalink / raw) To: Michael Schmitz Cc: Finn Thain, Andreas Schwab, stable, linux-m68k, linux-kernel Hi Michael, On Tue, May 23, 2023 at 3:11 AM Michael Schmitz <schmitzmic@gmail.com> wrote: > On 22/05/23 23:41, Geert Uytterhoeven wrote: > > On Sat, May 6, 2023 at 11:36 AM Finn Thain <fthain@linux-m68k.org> wrote: > >> On 68030/020, an instruction such as, moveml %a2-%a3/%a5,%sp@- may cause > >> a stack page fault during instruction execution (i.e. not at an > >> instruction boundary) and produce a format 0xB exception frame. > >> > >> In this situation, the value of USP will be unreliable. If a signal is to > >> be delivered following the exception, this USP value is used to calculate > >> the location for a signal frame. This can result in a corrupted user > >> stack. > >> > >> The corruption was detected in dash (actually in glibc) where it showed > >> up as an intermittent "stack smashing detected" message and crash > >> following signal delivery for SIGCHLD. > >> > >> It was hard to reproduce that failure because delivery of the signal > >> raced with the page fault and because the kernel places an unpredictable > >> gap of up to 7 bytes between the USP and the signal frame. > >> > >> A format 0xB exception frame can be produced by a bus error or an address > >> error. The 68030 Users Manual says that address errors occur immediately > >> upon detection during instruction prefetch. The instruction pipeline > >> allows prefetch to overlap with other instructions, which means an > >> address error can arise during the execution of a different instruction. > >> So it seems likely that this patch may help in the address error case also. > >> > >> Reported-and-tested-by: Stan Johnson <userm57@yahoo.com> > >> Link: https://lore.kernel.org/all/CAMuHMdW3yD22_ApemzW_6me3adq6A458u1_F0v-1EYwK_62jPA@mail.gmail.com/ > >> Cc: Michael Schmitz <schmitzmic@gmail.com> > >> Cc: Andreas Schwab <schwab@linux-m68k.org> > >> Cc: stable@vger.kernel.org > >> Co-developed-by: Michael Schmitz <schmitzmic@gmail.com> > >> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> > >> Signed-off-by: Finn Thain <fthain@linux-m68k.org> > > Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> > > i.e. will queue as a fix in the m68k for-v6.4 branch. > > > > I plan to send this upstream later this week, so any additional > > testing would be appreciated. > > I've given this some lengthy stress testing, and haven't seen it fail once. > > In contrast, various attempts of mine to improve on the concept (by only > moving the signal frame away from the USP in case it's likely to clash) > sometimes came up against a kernel bus error in setup_frame() when > copying the signo to the signal frame. I must be making some incorrect > assumptions still ... > > Limiting the signal frame shift to bus fault exceptions that happen > mid-instruction is not too much of an overhead even in low memory > settings, and using 256 bytes (the largest possible operand size, i.e. > the largest adjustment to USP that might occur on completion of the > interrupted instruction) did not seem to cause any issues with stack > growth either. > > I can give this some more testing in ARAnyM (extending the stack shift > to format 7 frames) but I'd say it's got as much testing on 030 hardware > as we can do. Thank you for your continued testing! And thanks a lot to anyone involved in nailing this issue! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 2023-05-22 11:41 ` Geert Uytterhoeven 2023-05-23 1:11 ` Michael Schmitz @ 2023-05-25 2:13 ` Finn Thain 1 sibling, 0 replies; 6+ messages in thread From: Finn Thain @ 2023-05-25 2:13 UTC (permalink / raw) To: Geert Uytterhoeven Cc: Michael Schmitz, Andreas Schwab, stable, linux-m68k, linux-kernel On Mon, 22 May 2023, Geert Uytterhoeven wrote: > > Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> > i.e. will queue as a fix in the m68k for-v6.4 branch. > Thanks Geert! > I plan to send this upstream later this week, so any additional > testing would be appreciated. > If there is a userland program that needs all of its sigaltstack space, and cannot accomodate a 256 byte margin, it will be hard to find that program. Often these signal stacks are used for handling (would-be) fatal signals that may not happen in regression testing. If we fix the kernel and a failure eventually does show up in a userland program, we may be able to patch that program. Looking through Debian's codebase, which is easily searchable, I found lots of different ways to determine the signal stack size. But it's not clear to me which programs calculate it correctly. Many programs take the architecture's SIGSTKSZ or MINSIGSTKSZ or getauxval(AT_MINSIGSTKSZ) or sysconf(_SC_SIGSTKSZ), or some combination, and then apply an arbitrary safety factor. Some add a constant or apply a lower bound that may reflect the actual requirements of the signal handler. Some just use page_size(). https://sources.debian.org/src/rust-generator/0.7.1-1/src/stack/unix.rs/?hl=85#L84 https://sources.debian.org/src/ocaml/4.13.1-4/runtime/signals_nat.c/?hl=293#L289 https://sources.debian.org/src/ruby3.1/3.1.2-7/signal.c/?hl=538#L531 https://sources.debian.org/src/python3.11/3.11.2-6/Modules/faulthandler.c/?hl=1385#L1368 https://sources.debian.org/src/llvm-toolchain-9/1:9.0.1-20/llvm/lib/Support/Unix/Signals.inc/?hl=259#L250 https://sources.debian.org/src/musl/1.2.3-1/src/aio/aio.c/?hl=98#L83 https://sources.debian.org/src/libreoffice/4:7.4.5-2/external/breakpad/SIGSTKSZ.patch/?hl=4#L4 https://sources.debian.org/src/libuv1/1.44.2-1/src/unix/thread.c/?hl=168#L165 https://sources.debian.org/src/stress-ng/0.15.08-1/core-helper.c/?hl=3290#L3276 https://sources.debian.org/src/cysignals/1.11.2+ds-2/src/cysignals/implementation.c/?hl=446#L442 https://sources.debian.org/src/vim/2:9.0.1378-2/src/os_unix.c/?hl=828#L811 https://sources.debian.org/src/m4/1.4.19-3/m4/sigaltstack.m4/?hl=88#L72 https://sources.debian.org/src/varnish/7.1.1-1.1/bin/varnishd/cache/cache_main.c/?hl=342#L330 https://sources.debian.org/src/sfxr-qt/1.5.0+ds-2/debian/patches/catch2-patch/?hl=3223#L3121 ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-05-25 2:15 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-05-06 9:38 [PATCH v1 0/1] 68020/030 signal handling following exceptions Finn Thain 2023-05-06 9:38 ` [PATCH v1 1/1] m68k: Move signal frame following exception on 68020/030 Finn Thain 2023-05-22 11:41 ` Geert Uytterhoeven 2023-05-23 1:11 ` Michael Schmitz 2023-05-23 6:59 ` Geert Uytterhoeven 2023-05-25 2:13 ` Finn Thain
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).