linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] ext4: Prevent an infinite loop in the lazyinit thread.
@ 2024-11-06 13:47 Mathieu Othacehe
  2024-11-06 13:47 ` [PATCH 1/1] " Mathieu Othacehe
  0 siblings, 1 reply; 4+ messages in thread
From: Mathieu Othacehe @ 2024-11-06 13:47 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: linux-ext4, linux-kernel, lukas.skupinski, anton.reding,
	Mathieu Othacehe

Hello,

Under the following conditions, the lazyinit thread can reschedule itself
indefinitely without doing anything, consuming a large amount of the system
resources:

In the ext4_run_li_request function, a start_time timestamp is taken. Right
before elr->lr_timeout is computed, in the same function, the system clock is
updated in userspace, from the Unix Epoch to the current time. The
elr->lr_timeout takes a large value. The elr->lr_next_sched is then set to a
value far away in the future.

/*
 * Away from jiffies because of a time jump when computing
 * elr->lr_timeout.
 */
elr->lr_next_sched = jiffies + elr->lr_timeout;

Back, in the ext4_lazyinit_thread that called the ext4_run_li_request, the
following condition can be false:

// elr->lr_next_sched > next_wakeup
if (time_before(elr->lr_next_sched, next_wakeup))
        next_wakeup = elr->lr_next_sched;

so that next_wakeup is not updated. Assuming that next_wakeup was not updated
above and still has the MAX_JIFFY_OFFSET value, the following condition will
be true:

// next_wakeup == MAX_JIFFY_OFFSET
if ((time_after_eq(cur, next_wakeup)) ||
    (MAX_JIFFY_OFFSET == next_wakeup)) {
	cond_resched();
	continue;
}

causing us to process the li_request_list again. If we now have jiffies < 
elr->lr_next_sched, as we have already elr->lr_next_sched > next_wakeup, we
will just continue without updating next_wakeup,

// jiffies < elr->lr_next_sched && elr->lr_next_sched > next_wakeup
if (time_before(jiffies, elr->lr_next_sched)) {
	if (time_before(elr->lr_next_sched, next_wakeup))
		next_wakeup = elr->lr_next_sched;
	continue;
}

and again, we will call cond_resched because next_wakeup is not updated, and
we now have an infinite loop.

This was put into evidence with the following values:

jiffies = 4294938821
elr->lr_next_sched = 1966790060
next_wakeup = 1073741822 (MAX_JIFFY_OFFSET)

on an armv7 (32 bits) system, without an RTC, while updating the system clock
during the lazyinit thread is working.

Fix that by using ktime_get_ns insted of ktime_get_real_ns and by using a
boolean instead of MAX_JIFFY_OFFSET to determine whether the next_wakeup value
has been set.

Thanks,

Mathieu

Mathieu Othacehe (1):
  ext4: Prevent an infinite loop in the lazyinit thread.

 fs/ext4/super.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

-- 
2.46.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 1/1] ext4: Prevent an infinite loop in the lazyinit thread.
  2024-11-06 13:47 [PATCH 0/1] ext4: Prevent an infinite loop in the lazyinit thread Mathieu Othacehe
@ 2024-11-06 13:47 ` Mathieu Othacehe
  2024-11-08 10:49   ` Jan Kara
  2024-11-14 13:53   ` Theodore Ts'o
  0 siblings, 2 replies; 4+ messages in thread
From: Mathieu Othacehe @ 2024-11-06 13:47 UTC (permalink / raw)
  To: Theodore Ts'o, Andreas Dilger
  Cc: linux-ext4, linux-kernel, lukas.skupinski, anton.reding,
	Mathieu Othacehe

Use ktime_get_ns instead of ktime_get_real_ns when computing the lr_timeout
not to be affected by system time jumps.

Use a boolean instead of the MAX_JIFFY_OFFSET value to determine whether
the next_wakeup value has been set. Comparing elr->lr_next_sched to
MAX_JIFFY_OFFSET can cause the lazyinit thread to loop indefinitely.

Co-developed-by: Lukas Skupinski <lukas.skupinski@landisgyr.com>
Signed-off-by: Lukas Skupinski <lukas.skupinski@landisgyr.com>
Signed-off-by: Mathieu Othacehe <othacehe@gnu.org>
---
 fs/ext4/super.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9fcf44064c6a6..b4839ccd83ad5 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3712,12 +3712,12 @@ static int ext4_run_li_request(struct ext4_li_request *elr)
 		ret = 1;
 
 	if (!ret) {
-		start_time = ktime_get_real_ns();
+		start_time = ktime_get_ns();
 		ret = ext4_init_inode_table(sb, group,
 					    elr->lr_timeout ? 0 : 1);
 		trace_ext4_lazy_itable_init(sb, group);
 		if (elr->lr_timeout == 0) {
-			elr->lr_timeout = nsecs_to_jiffies((ktime_get_real_ns() - start_time) *
+			elr->lr_timeout = nsecs_to_jiffies((ktime_get_ns() - start_time) *
 				EXT4_SB(elr->lr_super)->s_li_wait_mult);
 		}
 		elr->lr_next_sched = jiffies + elr->lr_timeout;
@@ -3777,8 +3777,9 @@ static int ext4_lazyinit_thread(void *arg)
 
 cont_thread:
 	while (true) {
-		next_wakeup = MAX_JIFFY_OFFSET;
+		bool next_wakeup_initialized = false;
 
+		next_wakeup = 0;
 		mutex_lock(&eli->li_list_mtx);
 		if (list_empty(&eli->li_request_list)) {
 			mutex_unlock(&eli->li_list_mtx);
@@ -3791,8 +3792,11 @@ static int ext4_lazyinit_thread(void *arg)
 					 lr_request);
 
 			if (time_before(jiffies, elr->lr_next_sched)) {
-				if (time_before(elr->lr_next_sched, next_wakeup))
+				if (!next_wakeup_initialized ||
+				    time_before(elr->lr_next_sched, next_wakeup)) {
 					next_wakeup = elr->lr_next_sched;
+					next_wakeup_initialized = true;
+				}
 				continue;
 			}
 			if (down_read_trylock(&elr->lr_super->s_umount)) {
@@ -3820,16 +3824,18 @@ static int ext4_lazyinit_thread(void *arg)
 				elr->lr_next_sched = jiffies +
 					get_random_u32_below(EXT4_DEF_LI_MAX_START_DELAY * HZ);
 			}
-			if (time_before(elr->lr_next_sched, next_wakeup))
+			if (!next_wakeup_initialized ||
+			    time_before(elr->lr_next_sched, next_wakeup)) {
 				next_wakeup = elr->lr_next_sched;
+				next_wakeup_initialized = true;
+			}
 		}
 		mutex_unlock(&eli->li_list_mtx);
 
 		try_to_freeze();
 
 		cur = jiffies;
-		if ((time_after_eq(cur, next_wakeup)) ||
-		    (MAX_JIFFY_OFFSET == next_wakeup)) {
+		if (!next_wakeup_initialized || time_after_eq(cur, next_wakeup)) {
 			cond_resched();
 			continue;
 		}
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] ext4: Prevent an infinite loop in the lazyinit thread.
  2024-11-06 13:47 ` [PATCH 1/1] " Mathieu Othacehe
@ 2024-11-08 10:49   ` Jan Kara
  2024-11-14 13:53   ` Theodore Ts'o
  1 sibling, 0 replies; 4+ messages in thread
From: Jan Kara @ 2024-11-08 10:49 UTC (permalink / raw)
  To: Mathieu Othacehe
  Cc: Theodore Ts'o, Andreas Dilger, linux-ext4, linux-kernel,
	lukas.skupinski, anton.reding

On Wed 06-11-24 14:47:41, Mathieu Othacehe wrote:
> Use ktime_get_ns instead of ktime_get_real_ns when computing the lr_timeout
> not to be affected by system time jumps.
> 
> Use a boolean instead of the MAX_JIFFY_OFFSET value to determine whether
> the next_wakeup value has been set. Comparing elr->lr_next_sched to
> MAX_JIFFY_OFFSET can cause the lazyinit thread to loop indefinitely.
> 
> Co-developed-by: Lukas Skupinski <lukas.skupinski@landisgyr.com>
> Signed-off-by: Lukas Skupinski <lukas.skupinski@landisgyr.com>
> Signed-off-by: Mathieu Othacehe <othacehe@gnu.org>

Nice catch! The patch looks good so feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

But the analysis you've done in patch 0/1 would ideally be in the changelog
of this patch so that we can easily get back to it in the future in git logs.
Maybe Ted can handle that on commit?

								Honza

> ---
>  fs/ext4/super.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 9fcf44064c6a6..b4839ccd83ad5 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3712,12 +3712,12 @@ static int ext4_run_li_request(struct ext4_li_request *elr)
>  		ret = 1;
>  
>  	if (!ret) {
> -		start_time = ktime_get_real_ns();
> +		start_time = ktime_get_ns();
>  		ret = ext4_init_inode_table(sb, group,
>  					    elr->lr_timeout ? 0 : 1);
>  		trace_ext4_lazy_itable_init(sb, group);
>  		if (elr->lr_timeout == 0) {
> -			elr->lr_timeout = nsecs_to_jiffies((ktime_get_real_ns() - start_time) *
> +			elr->lr_timeout = nsecs_to_jiffies((ktime_get_ns() - start_time) *
>  				EXT4_SB(elr->lr_super)->s_li_wait_mult);
>  		}
>  		elr->lr_next_sched = jiffies + elr->lr_timeout;
> @@ -3777,8 +3777,9 @@ static int ext4_lazyinit_thread(void *arg)
>  
>  cont_thread:
>  	while (true) {
> -		next_wakeup = MAX_JIFFY_OFFSET;
> +		bool next_wakeup_initialized = false;
>  
> +		next_wakeup = 0;
>  		mutex_lock(&eli->li_list_mtx);
>  		if (list_empty(&eli->li_request_list)) {
>  			mutex_unlock(&eli->li_list_mtx);
> @@ -3791,8 +3792,11 @@ static int ext4_lazyinit_thread(void *arg)
>  					 lr_request);
>  
>  			if (time_before(jiffies, elr->lr_next_sched)) {
> -				if (time_before(elr->lr_next_sched, next_wakeup))
> +				if (!next_wakeup_initialized ||
> +				    time_before(elr->lr_next_sched, next_wakeup)) {
>  					next_wakeup = elr->lr_next_sched;
> +					next_wakeup_initialized = true;
> +				}
>  				continue;
>  			}
>  			if (down_read_trylock(&elr->lr_super->s_umount)) {
> @@ -3820,16 +3824,18 @@ static int ext4_lazyinit_thread(void *arg)
>  				elr->lr_next_sched = jiffies +
>  					get_random_u32_below(EXT4_DEF_LI_MAX_START_DELAY * HZ);
>  			}
> -			if (time_before(elr->lr_next_sched, next_wakeup))
> +			if (!next_wakeup_initialized ||
> +			    time_before(elr->lr_next_sched, next_wakeup)) {
>  				next_wakeup = elr->lr_next_sched;
> +				next_wakeup_initialized = true;
> +			}
>  		}
>  		mutex_unlock(&eli->li_list_mtx);
>  
>  		try_to_freeze();
>  
>  		cur = jiffies;
> -		if ((time_after_eq(cur, next_wakeup)) ||
> -		    (MAX_JIFFY_OFFSET == next_wakeup)) {
> +		if (!next_wakeup_initialized || time_after_eq(cur, next_wakeup)) {
>  			cond_resched();
>  			continue;
>  		}
> -- 
> 2.46.0
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/1] ext4: Prevent an infinite loop in the lazyinit thread.
  2024-11-06 13:47 ` [PATCH 1/1] " Mathieu Othacehe
  2024-11-08 10:49   ` Jan Kara
@ 2024-11-14 13:53   ` Theodore Ts'o
  1 sibling, 0 replies; 4+ messages in thread
From: Theodore Ts'o @ 2024-11-14 13:53 UTC (permalink / raw)
  To: Andreas Dilger, Mathieu Othacehe
  Cc: Theodore Ts'o, linux-ext4, linux-kernel, lukas.skupinski,
	anton.reding


On Wed, 06 Nov 2024 14:47:41 +0100, Mathieu Othacehe wrote:
> Use ktime_get_ns instead of ktime_get_real_ns when computing the lr_timeout
> not to be affected by system time jumps.
> 
> Use a boolean instead of the MAX_JIFFY_OFFSET value to determine whether
> the next_wakeup value has been set. Comparing elr->lr_next_sched to
> MAX_JIFFY_OFFSET can cause the lazyinit thread to loop indefinitely.
> 
> [...]

Applied, thanks!

[1/1] ext4: Prevent an infinite loop in the lazyinit thread.
      commit: e06a8c24f6445c2f1b5255caa4f63b38e31c43fa

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-11-14 13:53 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-06 13:47 [PATCH 0/1] ext4: Prevent an infinite loop in the lazyinit thread Mathieu Othacehe
2024-11-06 13:47 ` [PATCH 1/1] " Mathieu Othacehe
2024-11-08 10:49   ` Jan Kara
2024-11-14 13:53   ` Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).