public inbox for linux-kernel@vger.kernel.org
* [GIT PULL] execve updates for v6.12-rc1
@ 2024-09-16  8:39 Kees Cook
  2024-09-18 10:40 ` pr-tracker-bot
  2024-09-26 18:29 ` Vegard Nossum
  0 siblings, 2 replies; 8+ messages in thread
From: Kees Cook @ 2024-09-16  8:39 UTC
  To: Linus Torvalds
  Cc: linux-kernel, Allen Pais, Brian Mak, Eric W. Biederman, Jeff Xu,
	Kees Cook, Roman Kisel

Hi Linus,

Please pull these execve updates for v6.12-rc1. Note there is a trivial
merge conflict between this and mm, which was resolved in -next with:
https://lore.kernel.org/linux-next/20240909171843.78c294da@canb.auug.org.au/

Thanks!

-Kees

The following changes since commit de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed:

  Linux 6.11-rc2 (2024-08-04 13:50:53 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git tags/execve-v6.12-rc1

for you to fetch changes up to 44f65d900698278a8451988abe0d5ca37fd46882:

  binfmt_elf: mseal address zero (2024-08-14 09:56:48 -0700)

----------------------------------------------------------------
execve updates for v6.12-rc1

- binfmt_elf: Dump smaller VMAs first in ELF cores (Brian Mak)

- binfmt_elf: mseal address zero (Jeff Xu)

- binfmt_elf, coredump: Log the reason of the failed core dumps
  (Roman Kisel)

----------------------------------------------------------------
Brian Mak (1):
      binfmt_elf: Dump smaller VMAs first in ELF cores

Jeff Xu (1):
      binfmt_elf: mseal address zero

Roman Kisel (2):
      coredump: Standartize and fix logging
      binfmt_elf, coredump: Log the reason of the failed core dumps

 fs/binfmt_elf.c          |  53 +++++++++++----
 fs/coredump.c            | 166 ++++++++++++++++++++++++++++++++++-------------
 include/linux/coredump.h |  30 ++++++++-
 include/linux/mm.h       |  10 +++
 kernel/signal.c          |  21 +++++-
 mm/mseal.c               |   2 +-
 6 files changed, 220 insertions(+), 62 deletions(-)

-- 
Kees Cook


* Re: [GIT PULL] execve updates for v6.12-rc1
  2024-09-16  8:39 [GIT PULL] execve updates for v6.12-rc1 Kees Cook
@ 2024-09-18 10:40 ` pr-tracker-bot
  2024-09-26 18:29 ` Vegard Nossum
  1 sibling, 0 replies; 8+ messages in thread
From: pr-tracker-bot @ 2024-09-18 10:40 UTC
  To: Kees Cook
  Cc: Linus Torvalds, linux-kernel, Allen Pais, Brian Mak,
	Eric W. Biederman, Jeff Xu, Kees Cook, Roman Kisel

The pull request you sent on Mon, 16 Sep 2024 01:39:39 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git tags/execve-v6.12-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/667495de218c25e909c6b33ed647b592a8a71a02

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


* Re: [GIT PULL] execve updates for v6.12-rc1
  2024-09-16  8:39 [GIT PULL] execve updates for v6.12-rc1 Kees Cook
  2024-09-18 10:40 ` pr-tracker-bot
@ 2024-09-26 18:29 ` Vegard Nossum
  2024-09-26 18:43   ` Linus Torvalds
  2024-09-28 21:09   ` Kees Cook
  1 sibling, 2 replies; 8+ messages in thread
From: Vegard Nossum @ 2024-09-26 18:29 UTC
  To: Kees Cook, Linus Torvalds
  Cc: linux-kernel, Allen Pais, Brian Mak, Eric W. Biederman, Jeff Xu,
	Roman Kisel, regressions


On 16/09/2024 10:39, Kees Cook wrote:
> Hi Linus,
> 
> Please pull these execve updates for v6.12-rc1. Note there is a trivial
> merge conflict between this and mm, which was resolved in -next with:
> https://lore.kernel.org/linux-next/20240909171843.78c294da@canb.auug.org.au/
> 
> Thanks!
> 
> -Kees
> 
> The following changes since commit de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed:
> 
>    Linux 6.11-rc2 (2024-08-04 13:50:53 -0700)
> 
> are available in the Git repository at:
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git tags/execve-v6.12-rc1
> 
> for you to fetch changes up to 44f65d900698278a8451988abe0d5ca37fd46882:
> 
>    binfmt_elf: mseal address zero (2024-08-14 09:56:48 -0700)
> 
> ----------------------------------------------------------------
> execve updates for v6.12-rc1
> 
> - binfmt_elf: Dump smaller VMAs first in ELF cores (Brian Mak)
> 
> - binfmt_elf: mseal address zero (Jeff Xu)
> 
> - binfmt_elf, coredump: Log the reason of the failed core dumps
>    (Roman Kisel)

Hi,

This last commit seems to introduce a regression for me, creating a
completely unkillable process (but idle/0% CPU) that is stuck here:

$ sudo cat /proc/2453/stack
[<0>] do_exit+0xee/0xac0
[<0>] do_group_exit+0x34/0x90
[<0>] get_signal+0xa63/0xa70
[<0>] arch_do_signal_or_restart+0x42/0x260
[<0>] irqentry_exit_to_user_mode+0x1e0/0x250
[<0>] irqentry_exit+0x43/0x50
[<0>] exc_page_fault+0x94/0x1d0
[<0>] asm_exc_page_fault+0x27/0x30

$ cat /proc/2453/status
...
State:  I (idle)
...
TracerPid:      0
...
Kthread:        0
VmPeak:     2240 kB
VmSize:     2240 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:       568 kB
VmRSS:       568 kB
RssAnon:             136 kB
RssFile:             432 kB
RssShmem:              0 kB
VmData:      420 kB
VmStk:       132 kB
VmExe:      1644 kB
VmLib:        16 kB
VmPTE:        60 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
CoreDumping:    1
THP_enabled:    1
untag_mask:     0xffffffffffffffff
Threads:        1
SigQ:   0/62622
SigPnd: 0000000000000100
ShdPnd: 0000000000000100
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 00000000000020db
...

The process is so unkillable I can't even shut my laptop down without
holding the power button for 5 seconds -- apart from that, everything
works correctly.
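
As a reading aid for the status dump above, here is a minimal sketch of how the hexadecimal signal masks can be decoded. It assumes the usual /proc/<pid>/status convention that bit (n - 1) of a mask stands for signal number n. The helper names decode_sigmask and signal_name are hypothetical and exist only for this illustration.

```python
import signal


def decode_sigmask(hex_mask):
    """Return the signal numbers whose bits are set in a status mask."""
    mask = int(hex_mask, 16)
    # Assumption: bit (n - 1) of the mask corresponds to signal number n.
    return [n for n in range(1, 65) if mask & (1 << (n - 1))]


def signal_name(signum):
    """Best-effort name lookup; names depend on the platform running this."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "SIG%d" % signum


# Mask values copied from the status output above.
for field, mask in [("SigPnd", "0000000000000100"),
                    ("SigCgt", "00000000000020db")]:
    numbers = decode_sigmask(mask)
    print(field, numbers, [signal_name(n) for n in numbers])
```

Under that assumption, SigPnd and ShdPnd both decode to signal 9, which is SIGKILL on Linux. That would be consistent with a kill having been sent but never acted on.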

Bisection ended up here:

# first bad commit: [fb97d2eb542faf19a8725afbd75cbc2518903210] binfmt_elf, coredump: Log the reason of the failed core dumps

I have to admit I don't immediately see what's wrong with the patch.


Vegard


* Re: [GIT PULL] execve updates for v6.12-rc1
  2024-09-26 18:29 ` Vegard Nossum
@ 2024-09-26 18:43   ` Linus Torvalds
  2024-09-26 19:09     ` Eric W. Biederman
  2024-09-28 21:09   ` Kees Cook
  1 sibling, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2024-09-26 18:43 UTC
  To: Vegard Nossum
  Cc: Kees Cook, linux-kernel, Allen Pais, Brian Mak, Eric W. Biederman,
	Jeff Xu, Roman Kisel, regressions

On Thu, 26 Sept 2024 at 11:29, Vegard Nossum <vegard.nossum@oracle.com> wrote:
>
> # first bad commit: [fb97d2eb542faf19a8725afbd75cbc2518903210]
> binfmt_elf, coredump: Log the reason of the failed core dumps
>
> I have to admit I don't immediately see what's wrong with the patch.

That commit looks entirely broken.

I *suspect* that the problem is the crazy "get_task_comm()" call that
takes the task lock inside coredump_report_failure().

But honestly, I'm not going to bother even trying to debug this. The
whole notion was broken. People who have problems with truncated
core-files should be looking at their debuggers, not asking the kernel
for help.

               Linus


* Re: [GIT PULL] execve updates for v6.12-rc1
  2024-09-26 18:43   ` Linus Torvalds
@ 2024-09-26 19:09     ` Eric W. Biederman
  2024-09-26 19:17       ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2024-09-26 19:09 UTC
  To: Linus Torvalds
  Cc: Vegard Nossum, Kees Cook, linux-kernel, Allen Pais, Brian Mak,
	Jeff Xu, Roman Kisel, regressions

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, 26 Sept 2024 at 11:29, Vegard Nossum <vegard.nossum@oracle.com> wrote:
>>
>> # first bad commit: [fb97d2eb542faf19a8725afbd75cbc2518903210]
>> binfmt_elf, coredump: Log the reason of the failed core dumps
>>
>> I have to admit I don't immediately see what's wrong with the patch.
>
> That commit looks entirely broken.
>
> I *suspect* that the problem is the crazy "get_task_comm()" call that
> takes the task lock inside coredump_report_failure().
>
> But honestly, I'm not going to bother even trying to debug this. The
> whole notion was broken. People who have problems with truncated
> core-files should be looking at their debuggers, not asking the kernel
> for help.

One of the common causes for coredump truncation is weird interactions
between io_uring and the coredump code.  (AKA kernel bugs).

That is something you can't ask your debugger to tell you.

So from 10,000 feet I think the idea is sane.

Eric


* Re: [GIT PULL] execve updates for v6.12-rc1
  2024-09-26 19:09     ` Eric W. Biederman
@ 2024-09-26 19:17       ` Linus Torvalds
  2024-09-26 20:37         ` Eric W. Biederman
  0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2024-09-26 19:17 UTC
  To: Eric W. Biederman
  Cc: Vegard Nossum, Kees Cook, linux-kernel, Allen Pais, Brian Mak,
	Jeff Xu, Roman Kisel, regressions

On Thu, 26 Sept 2024 at 12:10, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> One of the common causes for coredump truncation is weird interactions
> between io_uring and the coredump code.  (AKA kernel bugs).
>
> That is something you can't ask your debugger to tell you.
>
> So from 10,000 feet I think the idea is sane.

What? No. Adding printk's to chase kernel bugs is certainly a
time-honored tradition. But we don't leave them in the kernel sources
for posterity.

And none of the coredump failure reports had anything to do with
io_uring bugs anyway. They were literally "print out when disk filled
up or core dumps weren't enabled".

If you didn't get a core dump because the kernel didn't have core
dumps configured, we shouldn't print out some babying kernel message
about that.

None of this has anything to do with io_uring or kernel bugs.

                  Linus


* Re: [GIT PULL] execve updates for v6.12-rc1
  2024-09-26 19:17       ` Linus Torvalds
@ 2024-09-26 20:37         ` Eric W. Biederman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2024-09-26 20:37 UTC
  To: Linus Torvalds
  Cc: Vegard Nossum, Kees Cook, linux-kernel, Allen Pais, Brian Mak,
	Jeff Xu, Roman Kisel, regressions

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, 26 Sept 2024 at 12:10, Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
>> One of the common causes for coredump truncation is weird interactions
>> between io_uring and the coredump code.  (AKA kernel bugs).
>>
>> That is something you can't ask your debugger to tell you.
>>
>> So from 10,000 feet I think the idea is sane.
>
> What? No. Adding printk's to chase kernel bugs is certainly a
> time-honored tradition. But we don't leave them in the kernel sources
> for posterity.

No argument from me there.  We certainly don't leave them enabled by
default.  Although in truth most of the failures the coredump code
can experience are cases that should never happen in normal operation.

> And none of the coredump failure reports had anything to do with
> io_uring bugs anyway. They were literally "print out when disk filled
> up or core dumps weren't enabled".

dump_interrupted was instrumented.  That is what io_uring was
triggering.  In fact, I think dump_interrupted still has problems with
dumping to a pipe.

> If you didn't get a core dump because the kernel didn't have core
> dumps configured, we shouldn't print out some babying kernel message
> about that.

Some of them are certainly silly, or excessive.

> None of this has anything to do with io_uring or kernel bugs.

I respectfully disagree.

A huge part of the problem is that when io_uring triggers
dump_interrupted it is so subtle that people don't have a clue what is
going on.  I am not saying it is necessarily io_uring; that is just the
one I have debugged and tried to sort out.  Other kernel subsystems
could have similar weird interactions, but io_uring, where it plays with
TIF_NOTIFY_SIGNAL, has caused problems in the past.

I don't vouch for this implementation or think it is necessarily
the right way to get better information out, but the coredump code
is very much a black box that is quite difficult for people to work
with.

What I know is that recently truncated core dumps have been on people's
radar enough that we received two separate patches from two different
organizations to do something about them.  That says to me that this is
an actual problem that people are experiencing, not some theoretical thing.

I am all for reverting code that doesn't work, and for looking for
better solutions, but simply saying to people that their pain is not a
real problem seems terribly wrong.

Eric


* Re: [GIT PULL] execve updates for v6.12-rc1
  2024-09-26 18:29 ` Vegard Nossum
  2024-09-26 18:43   ` Linus Torvalds
@ 2024-09-28 21:09   ` Kees Cook
  1 sibling, 0 replies; 8+ messages in thread
From: Kees Cook @ 2024-09-28 21:09 UTC
  To: Vegard Nossum
  Cc: Linus Torvalds, linux-kernel, Allen Pais, Brian Mak,
	Eric W. Biederman, Jeff Xu, Roman Kisel, regressions

On Thu, Sep 26, 2024 at 08:29:01PM +0200, Vegard Nossum wrote:
> This last commit seems to introduce a regression for me, creating a
> completely unkillable process (but idle/0% CPU) that is stuck here:

I've sent a potential fix here:
https://lore.kernel.org/all/20240928210830.work.307-kees@kernel.org/

-- 
Kees Cook

