From: "Michał Pecio" <michal.pecio@gmail.com>
To: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org,
Catalin Marinas <catalin.marinas@arm.com>,
Linux kernel regressions list <regressions@lists.linux.dev>,
Kees Cook <kees@kernel.org>
Subject: Re: cacheflush completely broken, suspecting PAN+LPAE
Date: Tue, 12 Nov 2024 10:32:29 +0100
Message-ID: <20241112103229.566b1ff3@foxbook>
In-Reply-To: <CACRpkdYsgnBFsyF9upr=xoLdMJ25DCXMEJ-co7rJTnRanJncug@mail.gmail.com>
Hi Linus,
On Tue, 12 Nov 2024 02:15:19 +0100, Linus Walleij wrote:
> We are trying to locate the issue, which I think is the same as this
> but not sure:
> https://bugzilla.kernel.org/show_bug.cgi?id=219247
You can verify by asking the reporter to run the crashing program under
strace. If the SIGSEGV follows a failed cacheflush, it is most likely
my bug.
A straightforward repro of this bug:
gdb
GUILE_JIT_THRESHOLD=0 gdb
GUILE_JIT_THRESHOLD=-1 gdb
Expected outcome: the first two segfault; the third, which disables the
Guile JIT, reaches the gdb command prompt.
> I have been trying to replicate it on a Chromebook but didn't get so
> far yet because the installation is pretty idiomatic :/ also it
> only appears in a single Qt program and is not as predictable as here.
My bug also appears in a single program ;) The system otherwise works
fine, but any JIT is broken by this kind of bug. The failure may be
random if the caches happen to resynchronize by a fluke, but with gdb it
has failed every time so far.
> But. It appears the code is issuing cacheflush() which I guess ends
> up in arm_syscall() here:
>
> case NR(cacheflush):
> return do_cache_op(regs->ARM_r0, regs->ARM_r1, regs->ARM_r2);
>
> To here:
>
> static inline int
> do_cache_op(unsigned long start, unsigned long end, int flags)
> {
> if (end < start || flags)
> return -EINVAL;
>
> if (!access_ok((void __user *)start, end - start))
> return -EFAULT;
>
> return __do_cache_op(start, end);
> }
Yep. I added printks here, and it is specifically the call to
flush_icache_range() from __do_cache_op() which returns -EFAULT.
> Here userspace access should be fine because we have entered a
> syscall from userspace. I tried to emulate the situation with this
> program:
>
> #include <stdlib.h>
> #include <stdio.h>
> #include <errno.h>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/mman.h>
>
> #define NR_cacheflush 0xf0002
>
> /* libgcc */
> extern void __clear_cache(void *, void *);
>
> int main (int argc, char **argv) {
> void *addr;
> int ret;
>
> printf("Test()\n");
> addr = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
> if (addr == MAP_FAILED) {
> printf("mmap() failed\n");
> exit(1);
> }
This seems incomplete: there is no call to __clear_cache(). If you add
one at the end then yes, it should fail. Confirm it with strace.
> I added prints in the cacheflush trap:
>
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index 480e307501bb..400650519bd1 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -592,11 +592,14 @@ __do_cache_op(unsigned long start, unsigned long end)
> static inline int
> do_cache_op(unsigned long start, unsigned long end, int flags)
> {
> + pr_info("%s(%08lx-%08lx)\n", __func__, start, end);
> if (end < start || flags)
> return -EINVAL;
>
> - if (!access_ok((void __user *)start, end - start))
> + if (!access_ok((void __user *)start, end - start)) {
> + pr_err("ACCESS NOT OK\n");
> return -EFAULT;
> + }
>
> return __do_cache_op(start, end);
> }
You also need to check what __do_cache_op() returns, since in my case
that is where the -EFAULT comes from.
Regards,
Michal