* Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
  [not found] <4fc1b4e8f1fb4c8c81f280db09178797@intel.com>
From: Andy Lutomirski @ 2021-03-08 19:00 UTC
To: Luck, Tony, Oleg Nesterov, Linux API
Cc: Aili Yao, Andy Lutomirski, HORIGUCHI NAOYA, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, yangfeng1, Linux-MM, LKML

> On Mar 8, 2021, at 10:31 AM, Luck, Tony <tony.luck@intel.com> wrote:
>
>>
>> Can you point me at that SIGBUS code in a current kernel?
>
> It is in kill_me_maybe(). mce_vaddr is set up when we disassemble whatever get_user()
> or copy from user variant was in use in the kernel when the poison memory was consumed.
>
>         if (p->mce_vaddr != (void __user *)-1l) {
>                 force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT);

Hmm. On the one hand, no one has complained yet. On the other hand, hardware that supports this isn’t exactly common.

We may need some actual ABI design here. We also need to make sure that things like io_uring accesses or, more generally, anything using the use_mm / use_temporary_mm ends up either sending no signal or sending a signal to the right target.

>
> Would it be any better if we used the BUS_MCEERR_AO code that goes into siginfo?

Dunno.

>
> That would make it match up better with what happens when poison is found
> asynchronously by the patrol scrubber. I.e. the semantics are:
>
> AR: You just touched poison at this address and need to do something about that.
> AO: Just letting you know that you have some poison at the address in siginfo.
>
> -Tony

* Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
From: Aili Yao @ 2021-03-11 1:19 UTC
To: Andy Lutomirski
Cc: Luck, Tony, Oleg Nesterov, Linux API, Andy Lutomirski, HORIGUCHI NAOYA, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, yangfeng1, Linux-MM, LKML, yaoaili, sunhao2

On Mon, 8 Mar 2021 11:00:28 -0800
Andy Lutomirski <luto@amacapital.net> wrote:

> > On Mar 8, 2021, at 10:31 AM, Luck, Tony <tony.luck@intel.com> wrote:
> >
> >>
> >> Can you point me at that SIGBUS code in a current kernel?
> >
> > It is in kill_me_maybe(). mce_vaddr is set up when we disassemble whatever get_user()
> > or copy from user variant was in use in the kernel when the poison memory was consumed.
> >
> >         if (p->mce_vaddr != (void __user *)-1l) {
> >                 force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT);
>
> Hmm. On the one hand, no one has complained yet. On the other hand, hardware that supports this isn’t exactly common.
>
> We may need some actual ABI design here. We also need to make sure that things like io_uring accesses or, more generally, anything using the use_mm / use_temporary_mm ends up either sending no signal or sending a signal to the right target.
>
> >
> > Would it be any better if we used the BUS_MCEERR_AO code that goes into siginfo?
>
> Dunno.

I have one thought here but don't know if it's proper:

The previous patch uses force_sig_mceerr() to signal the user process in such a scenario; with this method
the SIGBUS can't be ignored, as force_sig_mceerr() was designed to ensure.

If the user process doesn't want this signal, will it set the signal to be ignored?
Maybe we can use send_sig_mceerr() instead of force_sig_mceerr(): if the process wants to
ignore the SIGBUS, it can do so, or it can handle the SIGBUS?

--
Thanks!
Aili Yao

* Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
From: Andy Lutomirski @ 2021-03-11 1:28 UTC
To: Aili Yao
Cc: Luck, Tony, Oleg Nesterov, Linux API, Andy Lutomirski, HORIGUCHI NAOYA, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, yangfeng1, Linux-MM, LKML, sunhao2

On Wed, Mar 10, 2021 at 5:19 PM Aili Yao <yaoaili@kingsoft.com> wrote:
>
> On Mon, 8 Mar 2021 11:00:28 -0800
> Andy Lutomirski <luto@amacapital.net> wrote:
>
> > > On Mar 8, 2021, at 10:31 AM, Luck, Tony <tony.luck@intel.com> wrote:
> > >
> > >>
> > >> Can you point me at that SIGBUS code in a current kernel?
> > >
> > > It is in kill_me_maybe(). mce_vaddr is set up when we disassemble whatever get_user()
> > > or copy from user variant was in use in the kernel when the poison memory was consumed.
> > >
> > >         if (p->mce_vaddr != (void __user *)-1l) {
> > >                 force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT);
> >
> > Hmm. On the one hand, no one has complained yet. On the other hand, hardware that supports this isn’t exactly common.
> >
> > We may need some actual ABI design here. We also need to make sure that things like io_uring accesses or, more generally, anything using the use_mm / use_temporary_mm ends up either sending no signal or sending a signal to the right target.
> >
> > >
> > > Would it be any better if we used the BUS_MCEERR_AO code that goes into siginfo?
> >
> > Dunno.
>
> I have one thought here but don't know if it's proper:
>
> The previous patch uses force_sig_mceerr() to signal the user process in such a scenario; with this method
> the SIGBUS can't be ignored, as force_sig_mceerr() was designed to ensure.
>
> If the user process doesn't want this signal, will it set the signal to be ignored?
> Maybe we can use send_sig_mceerr() instead of force_sig_mceerr(): if the process wants to
> ignore the SIGBUS, it can do so, or it can handle the SIGBUS?

I don't think the signal blocking mechanism makes sense for this.
Blocking a signal is for saying that, if another process sends the
signal (or an async event like ctrl-C), then the process doesn't want
it. Blocking doesn't block synchronous things like faults.

I think we need to at least fix the existing bug before we add more
signals. AFAICS the MCE_IN_KERNEL_COPYIN code is busted for kernel
threads.

--Andy

* Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
From: Aili Yao @ 2021-03-11 2:01 UTC
To: Andy Lutomirski
Cc: Luck, Tony, Oleg Nesterov, Linux API, Andy Lutomirski, HORIGUCHI NAOYA, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, yangfeng1, Linux-MM, LKML, sunhao2, yaoaili, suhua1

On Wed, 10 Mar 2021 17:28:12 -0800
Andy Lutomirski <luto@amacapital.net> wrote:

> On Wed, Mar 10, 2021 at 5:19 PM Aili Yao <yaoaili@kingsoft.com> wrote:
> >
> > On Mon, 8 Mar 2021 11:00:28 -0800
> > Andy Lutomirski <luto@amacapital.net> wrote:
> >
> > > > On Mar 8, 2021, at 10:31 AM, Luck, Tony <tony.luck@intel.com> wrote:
> > > >
> > > >>
> > > >> Can you point me at that SIGBUS code in a current kernel?
> > > >
> > > > It is in kill_me_maybe(). mce_vaddr is set up when we disassemble whatever get_user()
> > > > or copy from user variant was in use in the kernel when the poison memory was consumed.
> > > >
> > > >         if (p->mce_vaddr != (void __user *)-1l) {
> > > >                 force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT);
> > >
> > > Hmm. On the one hand, no one has complained yet. On the other hand, hardware that supports this isn’t exactly common.
> > >
> > > We may need some actual ABI design here. We also need to make sure that things like io_uring accesses or, more generally, anything using the use_mm / use_temporary_mm ends up either sending no signal or sending a signal to the right target.
> > >
> > > >
> > > > Would it be any better if we used the BUS_MCEERR_AO code that goes into siginfo?
> > >
> > > Dunno.
> >
> > I have one thought here but don't know if it's proper:
> >
> > The previous patch uses force_sig_mceerr() to signal the user process in such a scenario; with this method
> > the SIGBUS can't be ignored, as force_sig_mceerr() was designed to ensure.
> >
> > If the user process doesn't want this signal, will it set the signal to be ignored?
> > Maybe we can use send_sig_mceerr() instead of force_sig_mceerr(): if the process wants to
> > ignore the SIGBUS, it can do so, or it can handle the SIGBUS?
>
> I don't think the signal blocking mechanism makes sense for this.
> Blocking a signal is for saying that, if another process sends the
> signal (or an async event like ctrl-C), then the process doesn't want
> it. Blocking doesn't block synchronous things like faults.
>
> I think we need to at least fix the existing bug before we add more
> signals. AFAICS the MCE_IN_KERNEL_COPYIN code is busted for kernel
> threads.

Got this, Thanks!

I read https://man7.org/linux/man-pages/man2/write.2.html, and it seems the write syscall is not
expected to raise a signal; maybe a specific error code for this scenario is enough.

--
Thanks!
Aili Yao

* RE: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
From: Luck, Tony @ 2021-03-11 16:52 UTC
To: Andy Lutomirski, Aili Yao
Cc: Oleg Nesterov, Linux API, Andy Lutomirski, HORIGUCHI NAOYA, Dave Hansen, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, yangfeng1@kingsoft.com, Linux-MM, LKML, sunhao2@kingsoft.com

> I think we need to at least fix the existing bug before we add more
> signals. AFAICS the MCE_IN_KERNEL_COPYIN code is busted for kernel
> threads.

Can a kernel thread do get_user() or copy_from_user()? Do we have kernel threads
that have an associated user address space to copy from?

-Tony

* Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
From: Peter Zijlstra @ 2021-03-11 16:56 UTC
To: Luck, Tony
Cc: Andy Lutomirski, Aili Yao, Oleg Nesterov, Linux API, Andy Lutomirski, HORIGUCHI NAOYA, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, yangfeng1@kingsoft.com, Linux-MM, LKML, sunhao2@kingsoft.com

On Thu, Mar 11, 2021 at 04:52:10PM +0000, Luck, Tony wrote:
> > I think we need to at least fix the existing bug before we add more
> > signals. AFAICS the MCE_IN_KERNEL_COPYIN code is busted for kernel
> > threads.
>
> Can a kernel thread do get_user() or copy_from_user()? Do we have kernel threads
> that have an associated user address space to copy from?

kthread_use_mm()

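The concern raised across these messages is that a kernel thread which has borrowed a user address space (via kthread_use_mm()) may consume poison during a copy from user memory, and a signal forced on that thread would not reach the process that owns the memory. The following is a small userspace model of one possible policy that satisfies Andy's "no signal or the right target" requirement. It is not kernel code and not a proposed patch: task_model, mce_action, NO_VADDR and choose_action are hypothetical names, and what the real kill_me_maybe() does when mce_vaddr is -1 is not shown in the thread, so the model only labels that case as "other path".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models the (void __user *)-1l sentinel from the quoted kernel snippet. */
#define NO_VADDR UINTPTR_MAX

/* Hypothetical stand-in for the few task properties the policy looks at. */
struct task_model {
    bool is_kthread;     /* kernel thread, possibly using a borrowed mm */
    uintptr_t mce_vaddr; /* user address recorded for the poisoned access */
};

enum mce_action {
    MCE_NO_SIGNAL,  /* do not signal the current task */
    MCE_SIGBUS_AR,  /* force SIGBUS with BUS_MCEERR_AR at mce_vaddr */
    MCE_OTHER_PATH  /* no user address recorded; handling not shown here */
};

/*
 * One possible policy: never signal a kernel thread, because the memory
 * belongs to whichever process lent it the mm; otherwise follow the branch
 * quoted from kill_me_maybe().
 */
static enum mce_action choose_action(const struct task_model *t)
{
    assert(t != NULL);

    if (t->is_kthread)
        return MCE_NO_SIGNAL;
    if (t->mce_vaddr != NO_VADDR)
        return MCE_SIGBUS_AR;
    return MCE_OTHER_PATH;
}
```

Sending the signal to the owner of the borrowed mm would be the other option Andy mentions; the model does not attempt it because identifying that owner is exactly the open design question.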