[PATCH] tmpfs: don't interrupt fallocate with EINTR

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] tmpfs: don't interrupt fallocate with EINTR
@ 2024-05-15 22:10 Jan Kara
  2024-05-15 23:09 ` Matthew Wilcox
  2024-06-03 14:35 ` Christian Brauner
  0 siblings, 2 replies; 11+ messages in thread
From: Jan Kara @ 2024-05-15 22:10 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-fsdevel, Hugh Dickins, linux-mm, Mikulas Patocka, Jan Kara

From: Mikulas Patocka <mpatocka@redhat.com>

I have a program that sets up a periodic timer with 10ms interval. When
the program attempts to call fallocate(2) on tmpfs, it goes into an
infinite loop. fallocate(2) takes longer than 10ms, so it gets
interrupted by a signal and it returns EINTR. On EINTR, the fallocate
call is restarted, going into the same loop again.

Let's change the signal_pending() check in shmem_fallocate() loop to
fatal_signal_pending(). This solves the problem of shmem_fallocate()
constantly restarting. Since most other filesystem's fallocate methods
don't react to signals, it is unlikely userspace really relies on timely
delivery of non-fatal signals while fallocate is running. Also the
comment before the signal check:

/*
 * Good, the fallocate(2) manpage permits EINTR: we may have
 * been interrupted because we are using up too much memory.
 */

indicates that the check was mainly added for OOM situations in which
case the process will be sent a fatal signal so this change preserves
the behavior in OOM situations.

[JK: Update changelog and comment based on upstream discussion]

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/shmem.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 1f84a41aeb85..9c148f9723f4 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3167,10 +3167,13 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 		struct folio *folio;

 		/*
-		 * Good, the fallocate(2) manpage permits EINTR: we may have
-		 * been interrupted because we are using up too much memory.
+		 * Check for fatal signal so that we abort early in OOM
+		 * situations. We don't want to abort in case of non-fatal
+		 * signals as large fallocate can take noticeable time and
+		 * e.g. periodic timers may result in fallocate constantly
+		 * restarting.
 		 */
-		if (signal_pending(current))
+		if (fatal_signal_pending(current))
 			error = -EINTR;
 		else if (shmem_falloc.nr_unswapped > shmem_falloc.nr_falloced)
 			error = -ENOMEM;
-- 
2.35.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-05-15 22:10 [PATCH] tmpfs: don't interrupt fallocate with EINTR Jan Kara
@ 2024-05-15 23:09 ` Matthew Wilcox
  2024-06-03 14:35 ` Christian Brauner
  1 sibling, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2024-05-15 23:09 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christian Brauner, linux-fsdevel, Hugh Dickins, linux-mm,
	Mikulas Patocka

On Thu, May 16, 2024 at 12:10:44AM +0200, Jan Kara wrote:
> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> I have a program that sets up a periodic timer with 10ms interval. When
> the program attempts to call fallocate(2) on tmpfs, it goes into an
> infinite loop. fallocate(2) takes longer than 10ms, so it gets
> interrupted by a signal and it returns EINTR. On EINTR, the fallocate
> call is restarted, going into the same loop again.
> 
> Let's change the signal_pending() check in shmem_fallocate() loop to
> fatal_signal_pending(). This solves the problem of shmem_fallocate()
> constantly restarting. Since most other filesystem's fallocate methods
> don't react to signals, it is unlikely userspace really relies on timely
> delivery of non-fatal signals while fallocate is running. Also the
> comment before the signal check:
> 
> /*
>  * Good, the fallocate(2) manpage permits EINTR: we may have
>  * been interrupted because we are using up too much memory.
>  */
> 
> indicates that the check was mainly added for OOM situations in which
> case the process will be sent a fatal signal so this change preserves
> the behavior in OOM situations.
> 
> [JK: Update changelog and comment based on upstream discussion]
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-05-15 22:10 [PATCH] tmpfs: don't interrupt fallocate with EINTR Jan Kara
  2024-05-15 23:09 ` Matthew Wilcox
@ 2024-06-03 14:35 ` Christian Brauner
  1 sibling, 0 replies; 11+ messages in thread
From: Christian Brauner @ 2024-06-03 14:35 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christian Brauner, linux-fsdevel, Hugh Dickins, linux-mm,
	Mikulas Patocka

On Thu, 16 May 2024 00:10:44 +0200, Jan Kara wrote:
> I have a program that sets up a periodic timer with 10ms interval. When
> the program attempts to call fallocate(2) on tmpfs, it goes into an
> infinite loop. fallocate(2) takes longer than 10ms, so it gets
> interrupted by a signal and it returns EINTR. On EINTR, the fallocate
> call is restarted, going into the same loop again.
> 
> Let's change the signal_pending() check in shmem_fallocate() loop to
> fatal_signal_pending(). This solves the problem of shmem_fallocate()
> constantly restarting. Since most other filesystem's fallocate methods
> don't react to signals, it is unlikely userspace really relies on timely
> delivery of non-fatal signals while fallocate is running. Also the
> comment before the signal check:
> 
> [...]

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/1] tmpfs: don't interrupt fallocate with EINTR
      https://git.kernel.org/vfs/vfs/c/f113ef08b6bd


^ permalink raw reply	[flat|nested] 11+ messages in thread

[parent not found: <ef5c3b-fcd0-db5c-8d4-eeae79e62267@redhat.com>]

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
       [not found] <ef5c3b-fcd0-db5c-8d4-eeae79e62267@redhat.com>
@ 2024-03-05  8:42 ` Christian Brauner
  2024-03-05  9:34   ` Mikulas Patocka
  2024-03-06 17:49   ` Jan Kara
  0 siblings, 2 replies; 11+ messages in thread
From: Christian Brauner @ 2024-03-05  8:42 UTC (permalink / raw)
  To: Mikulas Patocka, Hugh Dickins
  Cc: Alexander Viro, Jan Kara, linux-fsdevel, linux-mm

On Mon, Mar 04, 2024 at 07:43:39PM +0100, Mikulas Patocka wrote:
> Hi
> 
> I have a program that sets up a periodic timer with 10ms interval. When
> the program attempts to call fallocate on tmpfs, it goes into an infinite
> loop. fallocate takes longer than 10ms, so it gets interrupted by a
> signal and it returns EINTR. On EINTR, the fallocate call is restarted,
> going into the same loop again.
> 
> fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> 
> Should there be fatal_signal_pending instead of signal_pending in the
> shmem_fallocate loop?
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> 
> ---
>  mm/shmem.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6/mm/shmem.c
> ===================================================================
> --- linux-2.6.orig/mm/shmem.c	2024-01-18 19:18:31.000000000 +0100
> +++ linux-2.6/mm/shmem.c	2024-03-04 19:05:25.000000000 +0100
> @@ -3143,7 +3143,7 @@ static long shmem_fallocate(struct file
>  		 * Good, the fallocate(2) manpage permits EINTR: we may have
>  		 * been interrupted because we are using up too much memory.
>  		 */
> -		if (signal_pending(current))
> +		if (fatal_signal_pending(current))

I think that's likely wrong and probably would cause regressions as
there may be users relying on this?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-03-05  8:42 ` Christian Brauner
@ 2024-03-05  9:34   ` Mikulas Patocka
  2024-03-05 10:10     ` Christian Brauner
  2024-03-06 17:49   ` Jan Kara
  1 sibling, 1 reply; 11+ messages in thread
From: Mikulas Patocka @ 2024-03-05  9:34 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Hugh Dickins, Alexander Viro, Jan Kara, linux-fsdevel, linux-mm



On Tue, 5 Mar 2024, Christian Brauner wrote:

> On Mon, Mar 04, 2024 at 07:43:39PM +0100, Mikulas Patocka wrote:
> > 
> > Index: linux-2.6/mm/shmem.c
> > ===================================================================
> > --- linux-2.6.orig/mm/shmem.c	2024-01-18 19:18:31.000000000 +0100
> > +++ linux-2.6/mm/shmem.c	2024-03-04 19:05:25.000000000 +0100
> > @@ -3143,7 +3143,7 @@ static long shmem_fallocate(struct file
> >  		 * Good, the fallocate(2) manpage permits EINTR: we may have
> >  		 * been interrupted because we are using up too much memory.
> >  		 */
> > -		if (signal_pending(current))
> > +		if (fatal_signal_pending(current))
> 
> I think that's likely wrong and probably would cause regressions as
> there may be users relying on this?

ext4 fallocate doesn't return -EINTR. So, userspace code can't rely on it.

Mikulas



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-03-05  9:34   ` Mikulas Patocka
@ 2024-03-05 10:10     ` Christian Brauner
  2024-03-05 14:03       ` Mikulas Patocka
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2024-03-05 10:10 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Hugh Dickins, Alexander Viro, Jan Kara, linux-fsdevel, linux-mm

On Tue, Mar 05, 2024 at 10:34:26AM +0100, Mikulas Patocka wrote:
> 
> 
> On Tue, 5 Mar 2024, Christian Brauner wrote:
> 
> > On Mon, Mar 04, 2024 at 07:43:39PM +0100, Mikulas Patocka wrote:
> > > 
> > > Index: linux-2.6/mm/shmem.c
> > > ===================================================================
> > > --- linux-2.6.orig/mm/shmem.c	2024-01-18 19:18:31.000000000 +0100
> > > +++ linux-2.6/mm/shmem.c	2024-03-04 19:05:25.000000000 +0100
> > > @@ -3143,7 +3143,7 @@ static long shmem_fallocate(struct file
> > >  		 * Good, the fallocate(2) manpage permits EINTR: we may have
> > >  		 * been interrupted because we are using up too much memory.
> > >  		 */
> > > -		if (signal_pending(current))
> > > +		if (fatal_signal_pending(current))
> > 
> > I think that's likely wrong and probably would cause regressions as
> > there may be users relying on this?
> 
> ext4 fallocate doesn't return -EINTR. So, userspace code can't rely on it.

I'm confused what does this have to do with ext4 since this is about
tmpfs. Also note, that fallocate(2) documents EINTR as a valid return
value. And fwiw, the manpage also states that "EINTR  A signal was
caught during execution; see signal(7)." not a "fatal signal".

Aside from that. If a user sends SIGUSR1 then with the code as it is now
that fallocate call will be interrupted. With your change that SIGUSR1
won't do anything anymore. Instead userspace would need to send SIGKILL.
So userspace that uses SIGUSR1 will suddenly hang.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-03-05 10:10     ` Christian Brauner
@ 2024-03-05 14:03       ` Mikulas Patocka
  2024-03-07 10:47         ` Christian Brauner
  0 siblings, 1 reply; 11+ messages in thread
From: Mikulas Patocka @ 2024-03-05 14:03 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Hugh Dickins, Alexander Viro, Jan Kara, linux-fsdevel, linux-mm



On Tue, 5 Mar 2024, Christian Brauner wrote:

> On Tue, Mar 05, 2024 at 10:34:26AM +0100, Mikulas Patocka wrote:
> > 
> > 
> > On Tue, 5 Mar 2024, Christian Brauner wrote:
> > 
> > > On Mon, Mar 04, 2024 at 07:43:39PM +0100, Mikulas Patocka wrote:
> > > > 
> > > > Index: linux-2.6/mm/shmem.c
> > > > ===================================================================
> > > > --- linux-2.6.orig/mm/shmem.c	2024-01-18 19:18:31.000000000 +0100
> > > > +++ linux-2.6/mm/shmem.c	2024-03-04 19:05:25.000000000 +0100
> > > > @@ -3143,7 +3143,7 @@ static long shmem_fallocate(struct file
> > > >  		 * Good, the fallocate(2) manpage permits EINTR: we may have
> > > >  		 * been interrupted because we are using up too much memory.
> > > >  		 */
> > > > -		if (signal_pending(current))
> > > > +		if (fatal_signal_pending(current))
> > > 
> > > I think that's likely wrong and probably would cause regressions as
> > > there may be users relying on this?
> > 
> > ext4 fallocate doesn't return -EINTR. So, userspace code can't rely on it.
> 
> I'm confused what does this have to do with ext4 since this is about
> tmpfs.

You said that applications may rely on -EINTR and I said they don't 
because ext4 doesn't return -EINTR.

> Also note, that fallocate(2) documents EINTR as a valid return
> value. And fwiw, the manpage also states that "EINTR  A signal was
> caught during execution; see signal(7)." not a "fatal signal".

Yes, but how should the userspace use the fallocate call reliably? Block 
all the signals around the call to fallocate? What to do if I use some 
library that calls fallocate and retries on EINTR?

> Aside from that. If a user sends SIGUSR1 then with the code as it is now
> that fallocate call will be interrupted. With your change that SIGUSR1
> won't do anything anymore. Instead userspace would need to send SIGKILL.
> So userspace that uses SIGUSR1 will suddenly hang.

It will survive one SIGUSR, but it hangs if the signal is being sent at a 
periodic interval.

A quick search shows that people are already adding loops when fallocate 
returns EINTR. All these loops will livelock when a signal is repeatedly 
being delivered: 
https://forge.chapril.org/hardcoresushi/libgocryptfs/commit/8518d6d7bde33fdc7ef5bcb7c3c7709404392ad8?style=unified&whitespace= 
https://postgrespro.com/media/maillist-attaches/pgsql-hackers/2022/07/1/20220701154105.jjfutmngoedgiad3@alvherre.pgsql/v2-0001-retry-ftruncate.patch 
https://lists.nongnu.org/archive/html/qemu-devel/2015-02/msg01116.html

Here, Postgres developers hit the same problem with retrying (they have 
5ms timer):
https://www.postgresql.org/message-id/CA%2BhUKGKS2Radu-1Ewhe1-LEj19C-3XAQ7wnkQMb4e9E9q9ZXSg%40mail.gmail.com

Mikulas



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-03-05 14:03       ` Mikulas Patocka
@ 2024-03-07 10:47         ` Christian Brauner
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Brauner @ 2024-03-07 10:47 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Hugh Dickins, Alexander Viro, Jan Kara, linux-fsdevel, linux-mm

On Tue, Mar 05, 2024 at 03:03:53PM +0100, Mikulas Patocka wrote:
> 
> 
> On Tue, 5 Mar 2024, Christian Brauner wrote:
> 
> > On Tue, Mar 05, 2024 at 10:34:26AM +0100, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Tue, 5 Mar 2024, Christian Brauner wrote:
> > > 
> > > > On Mon, Mar 04, 2024 at 07:43:39PM +0100, Mikulas Patocka wrote:
> > > > > 
> > > > > Index: linux-2.6/mm/shmem.c
> > > > > ===================================================================
> > > > > --- linux-2.6.orig/mm/shmem.c	2024-01-18 19:18:31.000000000 +0100
> > > > > +++ linux-2.6/mm/shmem.c	2024-03-04 19:05:25.000000000 +0100
> > > > > @@ -3143,7 +3143,7 @@ static long shmem_fallocate(struct file
> > > > >  		 * Good, the fallocate(2) manpage permits EINTR: we may have
> > > > >  		 * been interrupted because we are using up too much memory.
> > > > >  		 */
> > > > > -		if (signal_pending(current))
> > > > > +		if (fatal_signal_pending(current))
> > > > 
> > > > I think that's likely wrong and probably would cause regressions as
> > > > there may be users relying on this?
> > > 
> > > ext4 fallocate doesn't return -EINTR. So, userspace code can't rely on it.
> > 
> > I'm confused what does this have to do with ext4 since this is about
> > tmpfs.
> 
> You said that applications may rely on -EINTR and I said they don't 
> because ext4 doesn't return -EINTR.
> 
> > Also note, that fallocate(2) documents EINTR as a valid return
> > value. And fwiw, the manpage also states that "EINTR  A signal was
> > caught during execution; see signal(7)." not a "fatal signal".
> 
> Yes, but how should the userspace use the fallocate call reliably? Block 
> all the signals around the call to fallocate? What to do if I use some 
> library that calls fallocate and retries on EINTR?
> 
> > Aside from that. If a user sends SIGUSR1 then with the code as it is now
> > that fallocate call will be interrupted. With your change that SIGUSR1
> > won't do anything anymore. Instead userspace would need to send SIGKILL.
> > So userspace that uses SIGUSR1 will suddenly hang.
> 
> It will survive one SIGUSR, but it hangs if the signal is being sent at a 
> periodic interval.
> 
> A quick search shows that people are already adding loops when fallocate 
> returns EINTR. All these loops will livelock when a signal is repeatedly 
> being delivered: 
> https://forge.chapril.org/hardcoresushi/libgocryptfs/commit/8518d6d7bde33fdc7ef5bcb7c3c7709404392ad8?style=unified&whitespace= 
> https://postgrespro.com/media/maillist-attaches/pgsql-hackers/2022/07/1/20220701154105.jjfutmngoedgiad3@alvherre.pgsql/v2-0001-retry-ftruncate.patch 
> https://lists.nongnu.org/archive/html/qemu-devel/2015-02/msg01116.html
> 
> Here, Postgres developers hit the same problem with retrying (they have 
> 5ms timer):
> https://www.postgresql.org/message-id/CA%2BhUKGKS2Radu-1Ewhe1-LEj19C-3XAQ7wnkQMb4e9E9q9ZXSg%40mail.gmail.com

All fair points.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-03-05  8:42 ` Christian Brauner
  2024-03-05  9:34   ` Mikulas Patocka
@ 2024-03-06 17:49   ` Jan Kara
  2024-03-07 10:45     ` Christian Brauner
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Kara @ 2024-03-06 17:49 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Mikulas Patocka, Hugh Dickins, Alexander Viro, Jan Kara,
	linux-fsdevel, linux-mm

Hello,

On Tue 05-03-24 09:42:27, Christian Brauner wrote:
> On Mon, Mar 04, 2024 at 07:43:39PM +0100, Mikulas Patocka wrote:
> > I have a program that sets up a periodic timer with 10ms interval. When
> > the program attempts to call fallocate on tmpfs, it goes into an infinite
> > loop. fallocate takes longer than 10ms, so it gets interrupted by a
> > signal and it returns EINTR. On EINTR, the fallocate call is restarted,
> > going into the same loop again.
> > 
> > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > 
> > Should there be fatal_signal_pending instead of signal_pending in the
> > shmem_fallocate loop?
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > ---
> >  mm/shmem.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Index: linux-2.6/mm/shmem.c
> > ===================================================================
> > --- linux-2.6.orig/mm/shmem.c	2024-01-18 19:18:31.000000000 +0100
> > +++ linux-2.6/mm/shmem.c	2024-03-04 19:05:25.000000000 +0100
> > @@ -3143,7 +3143,7 @@ static long shmem_fallocate(struct file
> >  		 * Good, the fallocate(2) manpage permits EINTR: we may have
> >  		 * been interrupted because we are using up too much memory.
> >  		 */
> > -		if (signal_pending(current))
> > +		if (fatal_signal_pending(current))
> 
> I think that's likely wrong and probably would cause regressions as
> there may be users relying on this?

I understand your concern about userspace regressions but is the EINTR
behavior that useful? Sure, something can be relying on terminating
fallocate(2) with any signal but since tmpfs is the only filesystem having
this behavior, it is fair to say there are even higher chances some
application will be surprised by this behavior when used on tmpfs as
Mikulas was? So I wouldn't be that opposed to this change. *But* tmpfs has
a comment explaining the signal_pending() check:

                /*
                 * Good, the fallocate(2) manpage permits EINTR: we may have
                 * been interrupted because we are using up too much memory.
                 */

Now I'd expect the signal to be fatal in this case but we definitely need
to make sure this is the case if we want to consider changing the test.
Hugh?

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-03-06 17:49   ` Jan Kara
@ 2024-03-07 10:45     ` Christian Brauner
  2024-03-07 14:59       ` Matthew Wilcox
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2024-03-07 10:45 UTC (permalink / raw)
  To: Jan Kara
  Cc: Mikulas Patocka, Hugh Dickins, Alexander Viro, linux-fsdevel,
	linux-mm

On Wed, Mar 06, 2024 at 06:49:11PM +0100, Jan Kara wrote:
> Hello,
> 
> On Tue 05-03-24 09:42:27, Christian Brauner wrote:
> > On Mon, Mar 04, 2024 at 07:43:39PM +0100, Mikulas Patocka wrote:
> > > I have a program that sets up a periodic timer with 10ms interval. When
> > > the program attempts to call fallocate on tmpfs, it goes into an infinite
> > > loop. fallocate takes longer than 10ms, so it gets interrupted by a
> > > signal and it returns EINTR. On EINTR, the fallocate call is restarted,
> > > going into the same loop again.
> > > 
> > > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > > fallocate(19, FALLOC_FL_KEEP_SIZE, 0, 206057565) = -1 EINTR (Přerušené volání systému)
> > > --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
> > > sigreturn({mask=[]})                    = -1 EINTR (Přerušené volání systému)
> > > 
> > > Should there be fatal_signal_pending instead of signal_pending in the
> > > shmem_fallocate loop?
> > > 
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > 
> > > ---
> > >  mm/shmem.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > Index: linux-2.6/mm/shmem.c
> > > ===================================================================
> > > --- linux-2.6.orig/mm/shmem.c	2024-01-18 19:18:31.000000000 +0100
> > > +++ linux-2.6/mm/shmem.c	2024-03-04 19:05:25.000000000 +0100
> > > @@ -3143,7 +3143,7 @@ static long shmem_fallocate(struct file
> > >  		 * Good, the fallocate(2) manpage permits EINTR: we may have
> > >  		 * been interrupted because we are using up too much memory.
> > >  		 */
> > > -		if (signal_pending(current))
> > > +		if (fatal_signal_pending(current))
> > 
> > I think that's likely wrong and probably would cause regressions as
> > there may be users relying on this?
> 
> I understand your concern about userspace regressions but is the EINTR
> behavior that useful? Sure, something can be relying on terminating

I don't know.

> fallocate(2) with any signal but since tmpfs is the only filesystem having

Hugetlbfs has the same logic.

> this behavior, it is fair to say there are even higher chances some
> application will be surprised by this behavior when used on tmpfs as
> Mikulas was? So I wouldn't be that opposed to this change. *But* tmpfs has
> a comment explaining the signal_pending() check:
> 
>                 /*
>                  * Good, the fallocate(2) manpage permits EINTR: we may have
>                  * been interrupted because we are using up too much memory.
>                  */
> 
> Now I'd expect the signal to be fatal in this case but we definitely need
> to make sure this is the case if we want to consider changing the test.

Right now fallocate() is restartable. You could get EINTR and then
retry. Changing this to fatal_signal_pending() would mean that this
property is lost. The task will have to be wiped.

If this is only done for the sake of the OOM killer then we can probably
try and change it. But then we'd need to also reflect that on the
manpage.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] tmpfs: don't interrupt fallocate with EINTR
  2024-03-07 10:45     ` Christian Brauner
@ 2024-03-07 14:59       ` Matthew Wilcox
  0 siblings, 0 replies; 11+ messages in thread
From: Matthew Wilcox @ 2024-03-07 14:59 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jan Kara, Mikulas Patocka, Hugh Dickins, Alexander Viro,
	linux-fsdevel, linux-mm

On Thu, Mar 07, 2024 at 11:45:59AM +0100, Christian Brauner wrote:
> Right now fallocate() is restartable. You could get EINTR and then
> retry. Changing this to fatal_signal_pending() would mean that this
> property is lost. The task will have to be wiped.

People made much the same argument when I removed the nfs 'intr'
mount option.  It hasn't actually caused any issues that I've seen.

> If this is only done for the sake of the OOM killer then we can probably
> try and change it. But then we'd need to also reflect that on the
> manpage.

It's part of POSIX, I wouldn't remove the manpage.  I think it's a QoI
issue; we should only check for fatal signals.


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-06-03 14:35 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-15 22:10 [PATCH] tmpfs: don't interrupt fallocate with EINTR Jan Kara
2024-05-15 23:09 ` Matthew Wilcox
2024-06-03 14:35 ` Christian Brauner
     [not found] <ef5c3b-fcd0-db5c-8d4-eeae79e62267@redhat.com>
2024-03-05  8:42 ` Christian Brauner
2024-03-05  9:34   ` Mikulas Patocka
2024-03-05 10:10     ` Christian Brauner
2024-03-05 14:03       ` Mikulas Patocka
2024-03-07 10:47         ` Christian Brauner
2024-03-06 17:49   ` Jan Kara
2024-03-07 10:45     ` Christian Brauner
2024-03-07 14:59       ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).