public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [REGRESSION] mold linker depends on ETXTBSY, but open(2) no longer returns it
       [not found] <CACKH++YAtEMYu2nTLUyfmxZoGO37fqogKMDkBpddmNaz5HE6ng@mail.gmail.com>
@ 2024-11-11 12:02 ` Thorsten Leemhuis
  2024-11-26 22:33   ` Rui Ueyama
  0 siblings, 1 reply; 5+ messages in thread
From: Thorsten Leemhuis @ 2024-11-11 12:02 UTC (permalink / raw)
  To: brauner, Rui Ueyama; +Cc: regressions, LKML, Linux-fsdevel, Alexander Viro

[adding a few CCs, dropping stable]

On 28.10.24 12:15, Rui Ueyama wrote:
> I'm the creator and the maintainer of the mold linker
> (https://github.com/rui314/mold). Recently, we discovered that mold
> started causing process crashes in certain situations due to a change
> in the Linux kernel. Here are the details:
> 
> - In general, overwriting an existing file is much faster than
> creating an empty file and writing to it on Linux, so mold attempts to
> reuse an existing executable file if it exists.
> 
> - If a program is running, opening the executable file for writing
> previously failed with ETXTBSY. If that happens, mold falls back to
> creating a new file.
> 
> - However, the Linux kernel recently changed the behavior so that
> writing to an executable file is now always permitted
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a010c412853).

FWIW, that is 2a010c41285345 ("fs: don't block i_writecount during
exec") [v6.11-rc1] from Christian Brauner.

> That caused mold to write to an executable file even if there's a
> process running that file. Since changes to mmap'ed files are
> immediately visible to other processes, any processes running that
> file would almost certainly crash in a very mysterious way.
> Identifying the cause of these random crashes took us a few days.
> 
> Rejecting writes to an executable file that is currently running is a
> well-known behavior, and Linux had operated that way for a very long
> time. So, I don’t believe relying on this behavior was our mistake;
> rather, I see this as a regression in the Linux kernel.
> 
> Here is a bug report to the mold linker:
> https://github.com/rui314/mold/issues/1361

Thx for the report. I might be missing something, but from here it looks
like nothing happened. So please allow me to ask:

What's the status? Did anyone look into this? Is this sill happening?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [REGRESSION] mold linker depends on ETXTBSY, but open(2) no longer returns it
  2024-11-11 12:02 ` [REGRESSION] mold linker depends on ETXTBSY, but open(2) no longer returns it Thorsten Leemhuis
@ 2024-11-26 22:33   ` Rui Ueyama
  2024-11-27 12:11     ` Christian Brauner
  0 siblings, 1 reply; 5+ messages in thread
From: Rui Ueyama @ 2024-11-26 22:33 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: brauner, regressions, LKML, Linux-fsdevel, Alexander Viro

On Mon, Nov 11, 2024 at 9:02 PM Thorsten Leemhuis
<regressions@leemhuis.info> wrote:
>
> [adding a few CCs, dropping stable]
>
> On 28.10.24 12:15, Rui Ueyama wrote:
> > I'm the creator and the maintainer of the mold linker
> > (https://github.com/rui314/mold). Recently, we discovered that mold
> > started causing process crashes in certain situations due to a change
> > in the Linux kernel. Here are the details:
> >
> > - In general, overwriting an existing file is much faster than
> > creating an empty file and writing to it on Linux, so mold attempts to
> > reuse an existing executable file if it exists.
> >
> > - If a program is running, opening the executable file for writing
> > previously failed with ETXTBSY. If that happens, mold falls back to
> > creating a new file.
> >
> > - However, the Linux kernel recently changed the behavior so that
> > writing to an executable file is now always permitted
> > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a010c412853).
>
> FWIW, that is 2a010c41285345 ("fs: don't block i_writecount during
> exec") [v6.11-rc1] from Christian Brauner.
>
> > That caused mold to write to an executable file even if there's a
> > process running that file. Since changes to mmap'ed files are
> > immediately visible to other processes, any processes running that
> > file would almost certainly crash in a very mysterious way.
> > Identifying the cause of these random crashes took us a few days.
> >
> > Rejecting writes to an executable file that is currently running is a
> > well-known behavior, and Linux had operated that way for a very long
> > time. So, I don’t believe relying on this behavior was our mistake;
> > rather, I see this as a regression in the Linux kernel.
> >
> > Here is a bug report to the mold linker:
> > https://github.com/rui314/mold/issues/1361
>
> Thx for the report. I might be missing something, but from here it looks
> like nothing happened. So please allow me to ask:
>
> What's the status? Did anyone look into this? Is this sill happening?

Ping? I think this is a fairly major kernel regression. We can't ask
all our mold linker users to upgrade their linker before upgrading
their kernel. Isn't "Never break userland" the kernel's policy? I
wonder why no action or even a discussion has taken place so far.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [REGRESSION] mold linker depends on ETXTBSY, but open(2) no longer returns it
  2024-11-26 22:33   ` Rui Ueyama
@ 2024-11-27 12:11     ` Christian Brauner
  2024-11-27 13:39       ` Rui Ueyama
  2024-11-27 15:54       ` Linus Torvalds
  0 siblings, 2 replies; 5+ messages in thread
From: Christian Brauner @ 2024-11-27 12:11 UTC (permalink / raw)
  To: Linus Torvalds, Jan Kara, Amir Goldstein, Josef Bacik
  Cc: Thorsten Leemhuis, regressions, LKML, Linux-fsdevel,
	Alexander Viro, Rui Ueyama

On Wed, Nov 27, 2024 at 07:33:53AM +0900, Rui Ueyama wrote:
> On Mon, Nov 11, 2024 at 9:02 PM Thorsten Leemhuis
> <regressions@leemhuis.info> wrote:
> >
> > [adding a few CCs, dropping stable]
> >
> > On 28.10.24 12:15, Rui Ueyama wrote:
> > > I'm the creator and the maintainer of the mold linker
> > > (https://github.com/rui314/mold). Recently, we discovered that mold
> > > started causing process crashes in certain situations due to a change
> > > in the Linux kernel. Here are the details:
> > >
> > > - In general, overwriting an existing file is much faster than
> > > creating an empty file and writing to it on Linux, so mold attempts to
> > > reuse an existing executable file if it exists.
> > >
> > > - If a program is running, opening the executable file for writing
> > > previously failed with ETXTBSY. If that happens, mold falls back to
> > > creating a new file.
> > >
> > > - However, the Linux kernel recently changed the behavior so that
> > > writing to an executable file is now always permitted
> > > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a010c412853).
> >
> > FWIW, that is 2a010c41285345 ("fs: don't block i_writecount during
> > exec") [v6.11-rc1] from Christian Brauner.
> >
> > > That caused mold to write to an executable file even if there's a
> > > process running that file. Since changes to mmap'ed files are
> > > immediately visible to other processes, any processes running that
> > > file would almost certainly crash in a very mysterious way.
> > > Identifying the cause of these random crashes took us a few days.
> > >
> > > Rejecting writes to an executable file that is currently running is a
> > > well-known behavior, and Linux had operated that way for a very long
> > > time. So, I don’t believe relying on this behavior was our mistake;
> > > rather, I see this as a regression in the Linux kernel.
> > >
> > > Here is a bug report to the mold linker:
> > > https://github.com/rui314/mold/issues/1361
> >
> > Thx for the report. I might be missing something, but from here it looks
> > like nothing happened. So please allow me to ask:
> >
> > What's the status? Did anyone look into this? Is this sill happening?

Linus, it seems that the mold linker relies on the deny_write_access()
mechanism for executables. The mold linker tries to open a file for
writing and if ETXTBSY is returned mold falls back to creating a new
file.

There is now a fix in mold upstream in
https://github.com/rui314/mold/commit/8e4f7b53832d8af4f48a633a8385cbc932d1944e

However, mold upstream still insists on a revert (no judgement on my
part in case that sentence is misinterpreted).

Note, that the revert will cause issues for the fanotify pre-content
hook patch series in [1] which was the cause for the removal of the
deny_write_access() protection for executables so that on page faults
the contents of executables could be filled-in by userspace. This is
useful when dealing with very large executables and is apparently used
by Meta.

[1]: https://lore.kernel.org/r/20241121112218.8249-1-jack@suse.cz

While Amir tells me that they may have a way around this I expect this
to be hacky.

This will also trigger a revert/rework of the LTP testsuite which has
adapted various tests to the deny_write_access() removal for
executables.

There's been some delay in responding to this after my initial comment
on Github because I entered into a month of sickness. So I just got
reminded of this issue now. In any case, here's a tag that you can pull
if you agree with the revert. 

The following changes since commit 7eef7e306d3c40a0c5b9ff6adc9b273cc894dbd5:

  Merge tag 'for-6.13/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm (2024-11-25 18:54:00 -0800)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.13.exec.deny_write_access.revert

for you to fetch changes up to 3b832035387ff508fdcf0fba66701afc78f79e3d:

  Revert "fs: don't block i_writecount during exec" (2024-11-27 12:51:30 +0100)

----------------------------------------------------------------
vfs-6.13.exec.deny_write_access.revert

----------------------------------------------------------------
Christian Brauner (1):
      Revert "fs: don't block i_writecount during exec"

 fs/binfmt_elf.c       |  2 ++
 fs/binfmt_elf_fdpic.c |  5 ++++-
 fs/binfmt_misc.c      |  7 +++++--
 fs/exec.c             | 23 +++++++++++++++--------
 kernel/fork.c         | 26 +++++++++++++++++++++++---
 5 files changed, 49 insertions(+), 14 deletions(-)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [REGRESSION] mold linker depends on ETXTBSY, but open(2) no longer returns it
  2024-11-27 12:11     ` Christian Brauner
@ 2024-11-27 13:39       ` Rui Ueyama
  2024-11-27 15:54       ` Linus Torvalds
  1 sibling, 0 replies; 5+ messages in thread
From: Rui Ueyama @ 2024-11-27 13:39 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Jan Kara, Amir Goldstein, Josef Bacik,
	Thorsten Leemhuis, regressions, LKML, Linux-fsdevel,
	Alexander Viro

On Wed, Nov 27, 2024 at 9:11 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Wed, Nov 27, 2024 at 07:33:53AM +0900, Rui Ueyama wrote:
> > On Mon, Nov 11, 2024 at 9:02 PM Thorsten Leemhuis
> > <regressions@leemhuis.info> wrote:
> > >
> > > [adding a few CCs, dropping stable]
> > >
> > > On 28.10.24 12:15, Rui Ueyama wrote:
> > > > I'm the creator and the maintainer of the mold linker
> > > > (https://github.com/rui314/mold). Recently, we discovered that mold
> > > > started causing process crashes in certain situations due to a change
> > > > in the Linux kernel. Here are the details:
> > > >
> > > > - In general, overwriting an existing file is much faster than
> > > > creating an empty file and writing to it on Linux, so mold attempts to
> > > > reuse an existing executable file if it exists.
> > > >
> > > > - If a program is running, opening the executable file for writing
> > > > previously failed with ETXTBSY. If that happens, mold falls back to
> > > > creating a new file.
> > > >
> > > > - However, the Linux kernel recently changed the behavior so that
> > > > writing to an executable file is now always permitted
> > > > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a010c412853).
> > >
> > > FWIW, that is 2a010c41285345 ("fs: don't block i_writecount during
> > > exec") [v6.11-rc1] from Christian Brauner.
> > >
> > > > That caused mold to write to an executable file even if there's a
> > > > process running that file. Since changes to mmap'ed files are
> > > > immediately visible to other processes, any processes running that
> > > > file would almost certainly crash in a very mysterious way.
> > > > Identifying the cause of these random crashes took us a few days.
> > > >
> > > > Rejecting writes to an executable file that is currently running is a
> > > > well-known behavior, and Linux had operated that way for a very long
> > > > time. So, I don’t believe relying on this behavior was our mistake;
> > > > rather, I see this as a regression in the Linux kernel.
> > > >
> > > > Here is a bug report to the mold linker:
> > > > https://github.com/rui314/mold/issues/1361
> > >
> > > Thx for the report. I might be missing something, but from here it looks
> > > like nothing happened. So please allow me to ask:
> > >
> > > What's the status? Did anyone look into this? Is this sill happening?
>
> Linus, it seems that the mold linker relies on the deny_write_access()
> mechanism for executables. The mold linker tries to open a file for
> writing and if ETXTBSY is returned mold falls back to creating a new
> file.
>
> There is now a fix in mold upstream in
> https://github.com/rui314/mold/commit/8e4f7b53832d8af4f48a633a8385cbc932d1944e
>
> However, mold upstream still insists on a revert (no judgement on my
> part in case that sentence is misinterpreted).

I don't have a strong opinion on whether returning ETXTBSY is
desirable or not. We can cooperate to make a smooth transition to the
new behavior of open(2). That being said, making an abrupt kernel
change that breaks userland in a very mysterious way is, in my
opinion, not acceptable. I'm not personally affected by this issue,
but I needed to speak for our users who may upgrade their kernels
before upgrading their linker.

> Note, that the revert will cause issues for the fanotify pre-content
> hook patch series in [1] which was the cause for the removal of the
> deny_write_access() protection for executables so that on page faults
> the contents of executables could be filled-in by userspace. This is
> useful when dealing with very large executables and is apparently used
> by Meta.
>
> [1]: https://lore.kernel.org/r/20241121112218.8249-1-jack@suse.cz
>
> While Amir tells me that they may have a way around this I expect this
> to be hacky.
>
> This will also trigger a revert/rework of the LTP testsuite which has
> adapted various tests to the deny_write_access() removal for
> executables.
>
> There's been some delay in responding to this after my initial comment
> on Github because I entered into a month of sickness. So I just got
> reminded of this issue now. In any case, here's a tag that you can pull
> if you agree with the revert.
>
> The following changes since commit 7eef7e306d3c40a0c5b9ff6adc9b273cc894dbd5:
>
>   Merge tag 'for-6.13/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm (2024-11-25 18:54:00 -0800)
>
> are available in the Git repository at:
>
>   git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.13.exec.deny_write_access.revert
>
> for you to fetch changes up to 3b832035387ff508fdcf0fba66701afc78f79e3d:
>
>   Revert "fs: don't block i_writecount during exec" (2024-11-27 12:51:30 +0100)
>
> ----------------------------------------------------------------
> vfs-6.13.exec.deny_write_access.revert
>
> ----------------------------------------------------------------
> Christian Brauner (1):
>       Revert "fs: don't block i_writecount during exec"
>
>  fs/binfmt_elf.c       |  2 ++
>  fs/binfmt_elf_fdpic.c |  5 ++++-
>  fs/binfmt_misc.c      |  7 +++++--
>  fs/exec.c             | 23 +++++++++++++++--------
>  kernel/fork.c         | 26 +++++++++++++++++++++++---
>  5 files changed, 49 insertions(+), 14 deletions(-)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [REGRESSION] mold linker depends on ETXTBSY, but open(2) no longer returns it
  2024-11-27 12:11     ` Christian Brauner
  2024-11-27 13:39       ` Rui Ueyama
@ 2024-11-27 15:54       ` Linus Torvalds
  1 sibling, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2024-11-27 15:54 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Amir Goldstein, Josef Bacik, Thorsten Leemhuis,
	regressions, LKML, Linux-fsdevel, Alexander Viro, Rui Ueyama

On Wed, 27 Nov 2024 at 04:11, Christian Brauner <brauner@kernel.org> wrote:
>
> Linus, it seems that the mold linker relies on the deny_write_access()
> mechanism for executables. The mold linker tries to open a file for
> writing and if ETXTBSY is returned mold falls back to creating a new
> file.

Uhhuh. Ok, unfortunate, but this is clearly a real use case, so of
course we'll revert the kernel change.

           Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-11-27 15:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CACKH++YAtEMYu2nTLUyfmxZoGO37fqogKMDkBpddmNaz5HE6ng@mail.gmail.com>
2024-11-11 12:02 ` [REGRESSION] mold linker depends on ETXTBSY, but open(2) no longer returns it Thorsten Leemhuis
2024-11-26 22:33   ` Rui Ueyama
2024-11-27 12:11     ` Christian Brauner
2024-11-27 13:39       ` Rui Ueyama
2024-11-27 15:54       ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox