segfaults of processes while being killed after commit "mm: make the page fault mmap locking killable"

From: Fiona Ebner <f.ebner@proxmox.com>
To: torvalds@linux-foundation.org, akpm@linux-foundation.org
Cc: Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Wolfgang Bumiller <w.bumiller@proxmox.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: segfaults of processes while being killed after commit "mm: make the page fault mmap locking killable"
Date: Tue, 25 Jul 2023 13:16:27 +0200
Message-ID: <8d063a26-43f5-0bb7-3203-c6a04dc159f8@proxmox.com>

Hi,

we are seeing processes log segfaults while they are being killed. This
started with kernels that include commit
eda0047296a16d65a7f2bc60a408f70d178b2014 ("mm: make the page fault mmap
locking killable") and still happens on v6.5-rc3, which is the kernel
this report is based on.

Unfortunately, I don't have a simple reproducer; the one I have is big
and quite racy. My working theory for what happens is (see also the
bpftrace script and output [0]):

Since get_mmap_lock_carefully() now uses mmap_read_lock_killable(), if
rwsem_down_read_slowpath() is taken and there is a fatal signal
pending, rwsem_down_read_slowpath() returns -EINTR. This is propagated
up to get_mmap_lock_carefully(), which returns the boolean negation
!mmap_read_lock_killable(mm), i.e. 0.

Then lock_mm_and_find_vma() returns NULL

>     if (!get_mmap_lock_carefully(mm, regs))
>         return NULL;

and so do_user_addr_fault()

>     vma = lock_mm_and_find_vma(mm, address, regs);
>     if (unlikely(!vma)) {
>         bad_area_nosemaphore(regs, error_code, address);
>         return;
>     }

will end up without a vma and cause/log the segfault. Of course the
process is already being killed, but I'd argue it is very confusing to
users when apparent segfaults from such processes are being logged by
the kernel.
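
To make the chain above easier to follow, here is a minimal userspace
sketch of how the -EINTR ends up looking like a missing vma. This is not
kernel code: every model_* name and struct vma_model are hypothetical
stand-ins, and the trylock fast path that get_mmap_lock_carefully() tries
first is left out on purpose.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; only its presence matters here. */
struct vma_model {
    unsigned long start;
    unsigned long end;
};

/*
 * Models mmap_read_lock_killable(): 0 on success, -EINTR when the
 * slow path is interrupted by a pending fatal signal.
 */
static int model_read_lock_killable(bool fatal_signal_pending)
{
    return fatal_signal_pending ? -EINTR : 0;
}

/*
 * Models the tail of get_mmap_lock_carefully(): the boolean negation
 * of the lock result, so -EINTR becomes false.
 */
static bool model_get_lock_carefully(bool fatal_signal_pending)
{
    return !model_read_lock_killable(fatal_signal_pending);
}

/*
 * Models lock_mm_and_find_vma(): NULL is returned both when there is
 * no VMA and when the lock could not be taken.
 */
static const struct vma_model *
model_find_vma(const struct vma_model *vma, bool fatal_signal_pending)
{
    if (!model_get_lock_carefully(fatal_signal_pending))
        return NULL;
    return vma;
}

/*
 * Models the check in do_user_addr_fault(): true means the fault would
 * be handed to bad_area_nosemaphore().
 */
static bool model_reports_bad_area(const struct vma_model *vma,
                                   bool fatal_signal_pending)
{
    return model_find_vma(vma, fatal_signal_pending) == NULL;
}
```

In this model, a valid vma combined with a pending fatal signal is
indistinguishable from a fault at an unmapped address: both make
model_reports_bad_area() return true.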

Happy to provide other traces or information if required!

Best Regards,
Fiona

[0]:

I ended up with the following bpftrace script:

> #include <linux/signal.h>
> #include <linux/sched.h>
> 
> kprobe:down_read_killable {
>     printf("%s %d %d\n", func, pid, tid);
> }
> 
> kprobe:rwsem_down_read_slowpath {
>     printf("%s %d %d\n", func, pid, tid);
> }
> 
> kretprobe:rwsem_down_read_slowpath {
>     printf("%s %d %d retval 0x%x\n", func, pid, tid, retval);
>     printf("%s\n", kstack());
> }
> 
> kprobe:bad_area_nosemaphore {
>     printf("%s %d %d %s pending signal: %d\n", func, pid, tid, comm,
>     curtask->pending.signal.sig[0]
>     );
>     if (curtask->pending.signal.sig[0]) {
>         printf("%s\n", kstack());
>     }
> }

and here is a capture of a process running into the segfault. AFAIU, the
pending signal translates to SIGKILL and the return value from
down_read_killable() is -EINTR.

> down_read_killable 987299 987299
> rwsem_down_read_slowpath 987299 987299
> down_read_killable 987299 987299 retval 0xfffffffc
> 
>         down_read_killable+72
>         lock_mm_and_find_vma+167
>         do_user_addr_fault+477
>         exc_page_fault+131
>         asm_exc_page_fault+39
> 
> bad_area_nosemaphore 987299 987299 pverados pending signal: 256
> 
>         bad_area_nosemaphore+1
>         exc_page_fault+131
>         asm_exc_page_fault+39
> 
> bad_area_nosemaphore 987299 987299 pverados pending signal: 256
> 
>         bad_area_nosemaphore+1
>         exc_page_fault+131
>         asm_exc_page_fault+39
>         rep_movs_alternative+96
>         show_opcodes+118
>         __bad_area_nosemaphore+640
>         bad_area_nosemaphore+22
>         do_user_addr_fault+708
>         exc_page_fault+131
>         asm_exc_page_fault+39
> 
> bad_area_nosemaphore 987299 987299 pverados pending signal: 256
> 
>         bad_area_nosemaphore+1
>         exc_page_fault+131
>         asm_exc_page_fault+39
>         rep_movs_alternative+15
>         show_opcodes+118
>         __bad_area_nosemaphore+640
>         bad_area_nosemaphore+22
>         do_user_addr_fault+708
>         exc_page_fault+131
>         asm_exc_page_fault+39
>
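
The two numbers in the capture can be decoded with a small sketch like
the one below. It assumes the usual Linux layout, where bit n-1 of the
first sigset word stands for signal n, and that the retval printed with
%x is a 32-bit two's complement value. The helper names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Returns the lowest signal number set in one sigset word, or 0 if no
 * bit is set. Bit (n - 1) represents signal n.
 */
static int lowest_pending_signal(uint64_t sigword)
{
    for (int sig = 1; sig <= 64; sig++) {
        if (sigword & (UINT64_C(1) << (sig - 1)))
            return sig;
    }
    return 0;
}

/*
 * Reinterprets a value printed with %x as the signed int return code
 * it came from.
 */
static int retval_as_errno(uint32_t raw)
{
    return (int)(int32_t)raw;
}
```

With these helpers, a pending word of 256 maps to signal 9 (SIGKILL on
Linux), and 0xfffffffc maps to -4, which is -EINTR on Linux.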

Thread overview: 6+ messages
2023-07-25 11:16 Fiona Ebner [this message]
2023-07-25 16:38 ` segfaults of processes while being killed after commit "mm: make the page fault mmap locking killable" Linus Torvalds
2023-07-26  6:51   ` Thomas Lamprecht
2023-07-26  8:19   ` Fiona Ebner
2023-07-26 17:59     ` Linus Torvalds
2023-07-27  7:57       ` Fiona Ebner
