From: Greg KH <gregkh@linuxfoundation.org>
To: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: stable@vger.kernel.org, sashal@kernel.org, x86@kernel.org,
kernel@gpiccoli.net, kernel-dev@igalia.com
Subject: Re: [PATCH 5.10.y] x86/mm: Remove broken vsyscall emulation code from the page fault code
Date: Wed, 12 Jun 2024 15:57:01 +0200
Message-ID: <2024061249-sadly-tripping-c315@gregkh>
In-Reply-To: <20240602152525.78730-1-gpiccoli@igalia.com>

On Sun, Jun 02, 2024 at 12:24:34PM -0300, Guilherme G. Piccoli wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
>
> commit 02b670c1f88e78f42a6c5aee155c7b26960ca054 upstream.
>
> The syzbot-reported stack trace from hell in this discussion thread
> actually has three nested page faults:
>
> https://lore.kernel.org/r/000000000000d5f4fc0616e816d4@google.com
>
> ... and I think that's actually the important thing here:
>
>  - the first page fault is from user space, and triggers the vsyscall
>    emulation.
>
>  - the second page fault is from __do_sys_gettimeofday(), and that should
>    just have caused the exception that then sets the return value to
>    -EFAULT
>
>  - the third nested page fault is due to _raw_spin_unlock_irqrestore() ->
>    preempt_schedule() -> trace_sched_switch(), which then causes a BPF
>    trace program to run, which does that bpf_probe_read_compat(), which
>    causes that page fault under pagefault_disable().
>
> It's quite the nasty backtrace, and there's a lot going on.
>
> The problem is literally the vsyscall emulation, which sets
>
>         current->thread.sig_on_uaccess_err = 1;
>
> and that causes the fixup_exception() code to send the signal *despite* the
> exception being caught.
>
> And I think that is in fact completely bogus. It's completely bogus
> exactly because it sends that signal even when it *shouldn't* be sent -
> like for the BPF user mode trace gathering.
>
> In other words, I think the whole "sig_on_uaccess_err" thing is entirely
> broken, because it makes any nested page-faults do all the wrong things.
>
> Now, arguably, I don't think anybody should enable vsyscall emulation any
> more, but this test case clearly does.
>
> I think we should just make the "send SIGSEGV" be something that the
> vsyscall emulation does on its own, not this broken per-thread state for
> something that isn't actually per thread.
>
> The x86 page fault code actually tried to deal with the "incorrect nesting"
> by having that:
>
>                 if (in_interrupt())
>                         return;
>
> which ignores the sig_on_uaccess_err case when it happens in interrupts,
> but as shown by this example, these nested page faults do not need to be
> about interrupts at all.
>
> IOW, I think the only right thing is to remove that horrendously broken
> code.
>
> The attached patch looks like the ObviouslyCorrect(tm) thing to do.
>
> NOTE! This broken code goes back to this commit in 2011:
>
> 4fc3490114bb ("x86-64: Set siginfo and context on vsyscall emulation faults")
>
> ... and back then the reason was to get all the siginfo details right.
> Honestly, I do not for a moment believe that it's worth getting the siginfo
> details right here, but part of the commit says:
>
> This fixes issues with UML when vsyscall=emulate.
>
> ... and so my patch to remove this garbage will probably break UML in this
> situation.
>
> I do not believe that anybody should be running with vsyscall=emulate in
> 2024 in the first place, much less if you are doing things like UML. But
> let's see if somebody screams.
>
> Reported-and-tested-by: syzbot+83e7f982ca045ab4405c@syzkaller.appspotmail.com
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Tested-by: Jiri Olsa <jolsa@kernel.org>
> Acked-by: Andy Lutomirski <luto@kernel.org>
> Link: https://lore.kernel.org/r/CAHk-=wh9D6f7HUkDgZHKmDCHUQmp+Co89GP+b8+z+G56BKeyNg@mail.gmail.com
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> [gpiccoli: Backport the patch due to differences in the trees. The main change
> between 5.10.y and 5.15.y is due to renaming the fixup function, by
> commit 6456a2a69ee1 ("x86/fault: Rename no_context() to kernelmode_fixup_or_oops()").
>
> Following 2 commits cause divergence in the diffs too (in the removed lines):
> cd072dab453a ("x86/fault: Add a helper function to sanitize error code")
> d4ffd5df9d18 ("x86/fault: Fix wrong signal when vsyscall fails with pkey")
>
> Finally, there is context adjustment in the processor.h file.]
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> ---
>
>
> Hi folks, this was backported by AUTOSEL up to 5.15.y; I'm manually submitting
> the backport to 5.4.y and 5.10.y. I've briefly detailed the changes needed
> because of other, unrelated missing patches, but these are really simple and
> non-intrusive. Nevertheless, I've explicitly CCed x86 ML to be sure the
> maintainers are aware of the backport, and if anybody thinks we shouldn't
> do it for these (very) old releases, please respond here.

Both now queued up, thanks.

greg k-h