[BUG] infinite loop when unmounting ext3/4

linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [BUG] infinite loop when unmounting ext3/4
@ 2015-07-15 14:30 Eryu Guan
  2015-07-16  6:18 ` Joseph Qi
  0 siblings, 1 reply; 7+ messages in thread
From: Eryu Guan @ 2015-07-15 14:30 UTC (permalink / raw)
  To: linux-ext4; +Cc: joseph.qi

Hi all,

I found an infinite loop when unmounting a known bad ext3 image (using
ext4 driver) with 4.2-rc1 kernel.

The fs image can be found here
https://bugzilla.kernel.org/show_bug.cgi?id=10882#c1

Reproduce steps:
  mount -o loop ext3.img /mnt/ext3
  rm -rf /mnt/ext3/{dev,proc,sys}
  umount /mnt/ext3	# never return

And this issue was introduced by
6f6a6fd jbd2: fix ocfs2 corrupt when updating journal superblock fails

It's looping in
fs/jbd2/journal.c:jbd2_journal_destroy()
...
1693         while (journal->j_checkpoint_transactions != NULL) {                                                                             
1694                 spin_unlock(&journal->j_list_lock);                                                                                      
1695                 mutex_lock(&journal->j_checkpoint_mutex);                                                                                
1696                 jbd2_log_do_checkpoint(journal);                                                                                         
1697                 mutex_unlock(&journal->j_checkpoint_mutex);                                                                              
1698                 spin_lock(&journal->j_list_lock);                                                                                        
1699         }
...

Because jbd2_cleanup_journal_tail() is returning -EIO on aborted journal
now instead of 1, and jbd2_log_do_checkpoint() won't cleanup
journal->j_checkpoint_transactions in this while loop.

A quick hack would be

diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 4227dc4..1b2ea47 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -220,11 +220,13 @@ int jbd2_log_do_checkpoint(journal_t *journal)
         * don't need checkpointing, just eliminate them from the
         * journal straight away.
         */
-       result = jbd2_cleanup_journal_tail(journal);
-       trace_jbd2_checkpoint(journal, result);
-       jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
-       if (result <= 0)
-               return result;
+       if (!is_journal_aborted(journal)) {
+               result = jbd2_cleanup_journal_tail(journal);
+               trace_jbd2_checkpoint(journal, result);
+               jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
+               if (result <= 0)
+                       return result;
+       }
 
        /*
         * OK, we need to start writing disk blocks.  Take one transaction

to restore the old behavior (continue on aborted journal). Maybe someone
has a better fix.

Thanks,
Eryu

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [BUG] infinite loop when unmounting ext3/4
  2015-07-15 14:30 [BUG] infinite loop when unmounting ext3/4 Eryu Guan
@ 2015-07-16  6:18 ` Joseph Qi
  2015-07-20  7:23   ` Eryu Guan
  0 siblings, 1 reply; 7+ messages in thread
From: Joseph Qi @ 2015-07-16  6:18 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-ext4, Theodore Ts'o

I got the problem.
I am not sure why old code uses "result <= 0" even if
it won't return negative value. Could we use "result == 0" instead of
"result <= 0"?

On 2015/7/15 22:30, Eryu Guan wrote:
> Hi all,
> 
> I found an infinite loop when unmounting a known bad ext3 image (using
> ext4 driver) with 4.2-rc1 kernel.
> 
> The fs image can be found here
> https://bugzilla.kernel.org/show_bug.cgi?id=10882#c1
> 
> Reproduce steps:
>   mount -o loop ext3.img /mnt/ext3
>   rm -rf /mnt/ext3/{dev,proc,sys}
>   umount /mnt/ext3	# never return
> 
> And this issue was introduced by
> 6f6a6fd jbd2: fix ocfs2 corrupt when updating journal superblock fails
> 
> It's looping in
> fs/jbd2/journal.c:jbd2_journal_destroy()
> ...
> 1693         while (journal->j_checkpoint_transactions != NULL) {                                                                             
> 1694                 spin_unlock(&journal->j_list_lock);                                                                                      
> 1695                 mutex_lock(&journal->j_checkpoint_mutex);                                                                                
> 1696                 jbd2_log_do_checkpoint(journal);                                                                                         
> 1697                 mutex_unlock(&journal->j_checkpoint_mutex);                                                                              
> 1698                 spin_lock(&journal->j_list_lock);                                                                                        
> 1699         }
> ...
> 
> Because jbd2_cleanup_journal_tail() is returning -EIO on aborted journal
> now instead of 1, and jbd2_log_do_checkpoint() won't cleanup
> journal->j_checkpoint_transactions in this while loop.
> 
> A quick hack would be
> 
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 4227dc4..1b2ea47 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -220,11 +220,13 @@ int jbd2_log_do_checkpoint(journal_t *journal)
>          * don't need checkpointing, just eliminate them from the
>          * journal straight away.
>          */
> -       result = jbd2_cleanup_journal_tail(journal);
> -       trace_jbd2_checkpoint(journal, result);
> -       jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> -       if (result <= 0)
> -               return result;
> +       if (!is_journal_aborted(journal)) {
> +               result = jbd2_cleanup_journal_tail(journal);
> +               trace_jbd2_checkpoint(journal, result);
> +               jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> +               if (result <= 0)
> +                       return result;
> +       }
>  
>         /*
>          * OK, we need to start writing disk blocks.  Take one transaction
> 
> to restore the old behavior (continue on aborted journal). Maybe someone
> has a better fix.
> 
> Thanks,
> Eryu
> 
> .
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] infinite loop when unmounting ext3/4
  2015-07-16  6:18 ` Joseph Qi
@ 2015-07-20  7:23   ` Eryu Guan
  2015-07-23 20:41     ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Eryu Guan @ 2015-07-20  7:23 UTC (permalink / raw)
  To: Joseph Qi; +Cc: linux-ext4, Theodore Ts'o

On Thu, Jul 16, 2015 at 02:18:16PM +0800, Joseph Qi wrote:
> I got the problem.
> I am not sure why old code uses "result <= 0" even if
> it won't return negative value. Could we use "result == 0" instead of
> "result <= 0"?

I thought about this too, but I'm not sure if it has other side effects.
Someone else familiar with this code could comment on this?

Thanks,
Eryu
> 
> On 2015/7/15 22:30, Eryu Guan wrote:
> > Hi all,
> > 
> > I found an infinite loop when unmounting a known bad ext3 image (using
> > ext4 driver) with 4.2-rc1 kernel.
> > 
> > The fs image can be found here
> > https://bugzilla.kernel.org/show_bug.cgi?id=10882#c1
> > 
> > Reproduce steps:
> >   mount -o loop ext3.img /mnt/ext3
> >   rm -rf /mnt/ext3/{dev,proc,sys}
> >   umount /mnt/ext3	# never return
> > 
> > And this issue was introduced by
> > 6f6a6fd jbd2: fix ocfs2 corrupt when updating journal superblock fails
> > 
> > It's looping in
> > fs/jbd2/journal.c:jbd2_journal_destroy()
> > ...
> > 1693         while (journal->j_checkpoint_transactions != NULL) {                                                                             
> > 1694                 spin_unlock(&journal->j_list_lock);                                                                                      
> > 1695                 mutex_lock(&journal->j_checkpoint_mutex);                                                                                
> > 1696                 jbd2_log_do_checkpoint(journal);                                                                                         
> > 1697                 mutex_unlock(&journal->j_checkpoint_mutex);                                                                              
> > 1698                 spin_lock(&journal->j_list_lock);                                                                                        
> > 1699         }
> > ...
> > 
> > Because jbd2_cleanup_journal_tail() is returning -EIO on aborted journal
> > now instead of 1, and jbd2_log_do_checkpoint() won't cleanup
> > journal->j_checkpoint_transactions in this while loop.
> > 
> > A quick hack would be
> > 
> > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> > index 4227dc4..1b2ea47 100644
> > --- a/fs/jbd2/checkpoint.c
> > +++ b/fs/jbd2/checkpoint.c
> > @@ -220,11 +220,13 @@ int jbd2_log_do_checkpoint(journal_t *journal)
> >          * don't need checkpointing, just eliminate them from the
> >          * journal straight away.
> >          */
> > -       result = jbd2_cleanup_journal_tail(journal);
> > -       trace_jbd2_checkpoint(journal, result);
> > -       jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > -       if (result <= 0)
> > -               return result;
> > +       if (!is_journal_aborted(journal)) {
> > +               result = jbd2_cleanup_journal_tail(journal);
> > +               trace_jbd2_checkpoint(journal, result);
> > +               jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > +               if (result <= 0)
> > +                       return result;
> > +       }
> >  
> >         /*
> >          * OK, we need to start writing disk blocks.  Take one transaction
> > 
> > to restore the old behavior (continue on aborted journal). Maybe someone
> > has a better fix.
> > 
> > Thanks,
> > Eryu
> > 
> > .
> > 
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] infinite loop when unmounting ext3/4
  2015-07-20  7:23   ` Eryu Guan
@ 2015-07-23 20:41     ` Jan Kara
  2015-07-27 19:54       ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2015-07-23 20:41 UTC (permalink / raw)
  To: Eryu Guan; +Cc: Joseph Qi, linux-ext4, Theodore Ts'o

On Mon 20-07-15 15:23:56, Eryu Guan wrote:
> On Thu, Jul 16, 2015 at 02:18:16PM +0800, Joseph Qi wrote:
> > I got the problem.
> > I am not sure why old code uses "result <= 0" even if
> > it won't return negative value. Could we use "result == 0" instead of
> > "result <= 0"?
> 
> I thought about this too, but I'm not sure if it has other side effects.
> Someone else familiar with this code could comment on this?

Well, we should rather decide, what is the right behavior of the
checkpointing code when the journal is aborted. When journal gets aborted,
we are in serious trouble. Our standard answer to this is to stop modifying
the filesystem as that has a chance of corrupting it even more. I think
that avoiding checkpointing on a filesystem with aborted journal is thus
what we really want.

To fix the issue you've reported, we just need to teach
jbd2_log_do_checkpoint() to cleanup all the buffers in the transactions
when the journal is aborted so that journal_destroy() can proceed. I can
have a look at it sometime next week unless someone beats me to it.

								Honza

> > On 2015/7/15 22:30, Eryu Guan wrote:
> > > Hi all,
> > > 
> > > I found an infinite loop when unmounting a known bad ext3 image (using
> > > ext4 driver) with 4.2-rc1 kernel.
> > > 
> > > The fs image can be found here
> > > https://bugzilla.kernel.org/show_bug.cgi?id=10882#c1
> > > 
> > > Reproduce steps:
> > >   mount -o loop ext3.img /mnt/ext3
> > >   rm -rf /mnt/ext3/{dev,proc,sys}
> > >   umount /mnt/ext3	# never return
> > > 
> > > And this issue was introduced by
> > > 6f6a6fd jbd2: fix ocfs2 corrupt when updating journal superblock fails
> > > 
> > > It's looping in
> > > fs/jbd2/journal.c:jbd2_journal_destroy()
> > > ...
> > > 1693         while (journal->j_checkpoint_transactions != NULL) {                                                                             
> > > 1694                 spin_unlock(&journal->j_list_lock);                                                                                      
> > > 1695                 mutex_lock(&journal->j_checkpoint_mutex);                                                                                
> > > 1696                 jbd2_log_do_checkpoint(journal);                                                                                         
> > > 1697                 mutex_unlock(&journal->j_checkpoint_mutex);                                                                              
> > > 1698                 spin_lock(&journal->j_list_lock);                                                                                        
> > > 1699         }
> > > ...
> > > 
> > > Because jbd2_cleanup_journal_tail() is returning -EIO on aborted journal
> > > now instead of 1, and jbd2_log_do_checkpoint() won't cleanup
> > > journal->j_checkpoint_transactions in this while loop.
> > > 
> > > A quick hack would be
> > > 
> > > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> > > index 4227dc4..1b2ea47 100644
> > > --- a/fs/jbd2/checkpoint.c
> > > +++ b/fs/jbd2/checkpoint.c
> > > @@ -220,11 +220,13 @@ int jbd2_log_do_checkpoint(journal_t *journal)
> > >          * don't need checkpointing, just eliminate them from the
> > >          * journal straight away.
> > >          */
> > > -       result = jbd2_cleanup_journal_tail(journal);
> > > -       trace_jbd2_checkpoint(journal, result);
> > > -       jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > -       if (result <= 0)
> > > -               return result;
> > > +       if (!is_journal_aborted(journal)) {
> > > +               result = jbd2_cleanup_journal_tail(journal);
> > > +               trace_jbd2_checkpoint(journal, result);
> > > +               jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > +               if (result <= 0)
> > > +                       return result;
> > > +       }
> > >  
> > >         /*
> > >          * OK, we need to start writing disk blocks.  Take one transaction
> > > 
> > > to restore the old behavior (continue on aborted journal). Maybe someone
> > > has a better fix.
> > > 
> > > Thanks,
> > > Eryu
> > > 
> > > .
> > > 
> > 
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] infinite loop when unmounting ext3/4
  2015-07-23 20:41     ` Jan Kara
@ 2015-07-27 19:54       ` Jan Kara
  2015-07-28  7:15         ` Eryu Guan
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2015-07-27 19:54 UTC (permalink / raw)
  To: Eryu Guan; +Cc: Joseph Qi, linux-ext4, Theodore Ts'o

[-- Attachment #1: Type: text/plain, Size: 4997 bytes --]

On Thu 23-07-15 22:41:55, Jan Kara wrote:
> On Mon 20-07-15 15:23:56, Eryu Guan wrote:
> > On Thu, Jul 16, 2015 at 02:18:16PM +0800, Joseph Qi wrote:
> > > I got the problem.
> > > I am not sure why old code uses "result <= 0" even if
> > > it won't return negative value. Could we use "result == 0" instead of
> > > "result <= 0"?
> > 
> > I thought about this too, but I'm not sure if it has other side effects.
> > Someone else familiar with this code could comment on this?
> 
> Well, we should rather decide, what is the right behavior of the
> checkpointing code when the journal is aborted. When journal gets aborted,
> we are in serious trouble. Our standard answer to this is to stop modifying
> the filesystem as that has a chance of corrupting it even more. I think
> that avoiding checkpointing on a filesystem with aborted journal is thus
> what we really want.
> 
> To fix the issue you've reported, we just need to teach
> jbd2_log_do_checkpoint() to cleanup all the buffers in the transactions
> when the journal is aborted so that journal_destroy() can proceed. I can
> have a look at it sometime next week unless someone beats me to it.

OK, attached is the patch that fixes the issue for me. Ted, can you please
pick it up? Thanks!

								Honza
> > > On 2015/7/15 22:30, Eryu Guan wrote:
> > > > Hi all,
> > > > 
> > > > I found an infinite loop when unmounting a known bad ext3 image (using
> > > > ext4 driver) with 4.2-rc1 kernel.
> > > > 
> > > > The fs image can be found here
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=10882#c1
> > > > 
> > > > Reproduce steps:
> > > >   mount -o loop ext3.img /mnt/ext3
> > > >   rm -rf /mnt/ext3/{dev,proc,sys}
> > > >   umount /mnt/ext3	# never return
> > > > 
> > > > And this issue was introduced by
> > > > 6f6a6fd jbd2: fix ocfs2 corrupt when updating journal superblock fails
> > > > 
> > > > It's looping in
> > > > fs/jbd2/journal.c:jbd2_journal_destroy()
> > > > ...
> > > > 1693         while (journal->j_checkpoint_transactions != NULL) {                                                                             
> > > > 1694                 spin_unlock(&journal->j_list_lock);                                                                                      
> > > > 1695                 mutex_lock(&journal->j_checkpoint_mutex);                                                                                
> > > > 1696                 jbd2_log_do_checkpoint(journal);                                                                                         
> > > > 1697                 mutex_unlock(&journal->j_checkpoint_mutex);                                                                              
> > > > 1698                 spin_lock(&journal->j_list_lock);                                                                                        
> > > > 1699         }
> > > > ...
> > > > 
> > > > Because jbd2_cleanup_journal_tail() is returning -EIO on aborted journal
> > > > now instead of 1, and jbd2_log_do_checkpoint() won't cleanup
> > > > journal->j_checkpoint_transactions in this while loop.
> > > > 
> > > > A quick hack would be
> > > > 
> > > > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> > > > index 4227dc4..1b2ea47 100644
> > > > --- a/fs/jbd2/checkpoint.c
> > > > +++ b/fs/jbd2/checkpoint.c
> > > > @@ -220,11 +220,13 @@ int jbd2_log_do_checkpoint(journal_t *journal)
> > > >          * don't need checkpointing, just eliminate them from the
> > > >          * journal straight away.
> > > >          */
> > > > -       result = jbd2_cleanup_journal_tail(journal);
> > > > -       trace_jbd2_checkpoint(journal, result);
> > > > -       jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > > -       if (result <= 0)
> > > > -               return result;
> > > > +       if (!is_journal_aborted(journal)) {
> > > > +               result = jbd2_cleanup_journal_tail(journal);
> > > > +               trace_jbd2_checkpoint(journal, result);
> > > > +               jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > > +               if (result <= 0)
> > > > +                       return result;
> > > > +       }
> > > >  
> > > >         /*
> > > >          * OK, we need to start writing disk blocks.  Take one transaction
> > > > 
> > > > to restore the old behavior (continue on aborted journal). Maybe someone
> > > > has a better fix.
> > > > 
> > > > Thanks,
> > > > Eryu
> > > > 
> > > > .
> > > > 
> > > 
> > > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-jbd2-Avoid-infinite-loop-when-destroying-aborted-jou.patch --]
[-- Type: text/x-patch, Size: 6178 bytes --]

>From 2839d314353b6d9e88355dc0b30927d38b2cb7b7 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.com>
Date: Mon, 27 Jul 2015 21:42:08 +0200
Subject: [PATCH] jbd2: Avoid infinite loop when destroying aborted journal

Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal
superblock fails" changed jbd2_cleanup_journal_tail() to return EIO when
the journal is aborted. That makes logic in jbd2_log_do_checkpoint()
bail out which is fine, except that jbd2_journal_destroy() expects
jbd2_log_do_checkpoint() to always make a progress in cleaning the
journal. Without it jbd2_journal_destroy() just loops in an infinite
loop.

Fix jbd2_journal_destroy() to cleanup journal checkpoint lists of
jbd2_log_do_checkpoint() fails with error.

Reported-by: Eryu Guan <guaneryu@gmail.com>
Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a
Signed-off-by: Jan Kara <jack@suse.com>
---
 fs/jbd2/checkpoint.c | 39 +++++++++++++++++++++++++++++++++------
 fs/jbd2/commit.c     |  2 +-
 fs/jbd2/journal.c    | 11 ++++++++++-
 include/linux/jbd2.h |  3 ++-
 4 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 4227dc4f7437..8c44654ce274 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -417,12 +417,12 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
  * journal_clean_one_cp_list
  *
  * Find all the written-back checkpoint buffers in the given list and
- * release them.
+ * release them. If 'destroy' is set, clean all buffers unconditionally.
  *
  * Called with j_list_lock held.
  * Returns 1 if we freed the transaction, 0 otherwise.
  */
-static int journal_clean_one_cp_list(struct journal_head *jh)
+static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy)
 {
 	struct journal_head *last_jh;
 	struct journal_head *next_jh = jh;
@@ -436,7 +436,10 @@ static int journal_clean_one_cp_list(struct journal_head *jh)
 	do {
 		jh = next_jh;
 		next_jh = jh->b_cpnext;
-		ret = __try_to_free_cp_buf(jh);
+		if (!destroy)
+			ret = __try_to_free_cp_buf(jh);
+		else
+			ret = __jbd2_journal_remove_checkpoint(jh) + 1;
 		if (!ret)
 			return freed;
 		if (ret == 2)
@@ -459,10 +462,11 @@ static int journal_clean_one_cp_list(struct journal_head *jh)
  * journal_clean_checkpoint_list
  *
  * Find all the written-back checkpoint buffers in the journal and release them.
+ * If 'destroy' is set, release all buffers unconditionally.
  *
  * Called with j_list_lock held.
  */
-void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
+void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy)
 {
 	transaction_t *transaction, *last_transaction, *next_transaction;
 	int ret;
@@ -476,7 +480,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
 	do {
 		transaction = next_transaction;
 		next_transaction = transaction->t_cpnext;
-		ret = journal_clean_one_cp_list(transaction->t_checkpoint_list);
+		ret = journal_clean_one_cp_list(transaction->t_checkpoint_list,
+						destroy);
 		/*
 		 * This function only frees up some memory if possible so we
 		 * dont have an obligation to finish processing. Bail out if
@@ -492,7 +497,7 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
 		 * we can possibly see not yet submitted buffers on io_list
 		 */
 		ret = journal_clean_one_cp_list(transaction->
-				t_checkpoint_io_list);
+				t_checkpoint_io_list, destroy);
 		if (need_resched())
 			return;
 		/*
@@ -506,6 +511,28 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
 }
 
 /*
+ * Remove buffers from all checkpoint lists as journal is aborted and we just
+ * need to free memory
+ */
+void jbd2_journal_destroy_checkpoint(journal_t *journal)
+{
+	/*
+	 * We loop because __jbd2_journal_clean_checkpoint_list() may abort
+	 * early due to a need of rescheduling.
+	 */
+	while (1) {
+		spin_lock(&journal->j_list_lock);
+		if (!journal->j_checkpoint_transactions) {
+			spin_unlock(&journal->j_list_lock);
+			break;
+		}
+		__jbd2_journal_clean_checkpoint_list(journal, true);
+		spin_unlock(&journal->j_list_lock);
+		cond_resched();
+	}
+}
+
+/*
  * journal_remove_checkpoint: called after a buffer has been committed
  * to disk (either by being write-back flushed to disk, or being
  * committed to the log).
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index b73e0215baa7..362e5f614450 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -510,7 +510,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 	 * frees some memory
 	 */
 	spin_lock(&journal->j_list_lock);
-	__jbd2_journal_clean_checkpoint_list(journal);
+	__jbd2_journal_clean_checkpoint_list(journal, false);
 	spin_unlock(&journal->j_list_lock);
 
 	jbd_debug(3, "JBD2: commit phase 1\n");
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 4ff3fad4e9e3..2721513adb1f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1693,8 +1693,17 @@ int jbd2_journal_destroy(journal_t *journal)
 	while (journal->j_checkpoint_transactions != NULL) {
 		spin_unlock(&journal->j_list_lock);
 		mutex_lock(&journal->j_checkpoint_mutex);
-		jbd2_log_do_checkpoint(journal);
+		err = jbd2_log_do_checkpoint(journal);
 		mutex_unlock(&journal->j_checkpoint_mutex);
+		/*
+		 * If checkpointing failed, just free the buffers to avoid
+		 * looping forever
+		 */
+		if (err) {
+			jbd2_journal_destroy_checkpoint(journal);
+			spin_lock(&journal->j_list_lock);
+			break;
+		}
 		spin_lock(&journal->j_list_lock);
 	}
 
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index edb640ae9a94..eb1cebed3f36 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1042,8 +1042,9 @@ void jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block);
 extern void jbd2_journal_commit_transaction(journal_t *);
 
 /* Checkpoint list management */
-void __jbd2_journal_clean_checkpoint_list(journal_t *journal);
+void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy);
 int __jbd2_journal_remove_checkpoint(struct journal_head *);
+void jbd2_journal_destroy_checkpoint(journal_t *journal);
 void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *);
 
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [BUG] infinite loop when unmounting ext3/4
  2015-07-27 19:54       ` Jan Kara
@ 2015-07-28  7:15         ` Eryu Guan
  2015-07-28 19:04           ` Theodore Ts'o
  0 siblings, 1 reply; 7+ messages in thread
From: Eryu Guan @ 2015-07-28  7:15 UTC (permalink / raw)
  To: Jan Kara; +Cc: Joseph Qi, linux-ext4, Theodore Ts'o

On Mon, Jul 27, 2015 at 09:54:40PM +0200, Jan Kara wrote:
> On Thu 23-07-15 22:41:55, Jan Kara wrote:
> > On Mon 20-07-15 15:23:56, Eryu Guan wrote:
> > > On Thu, Jul 16, 2015 at 02:18:16PM +0800, Joseph Qi wrote:
> > > > I got the problem.
> > > > I am not sure why old code uses "result <= 0" even if
> > > > it won't return negative value. Could we use "result == 0" instead of
> > > > "result <= 0"?
> > > 
> > > I thought about this too, but I'm not sure if it has other side effects.
> > > Someone else familiar with this code could comment on this?
> > 
> > Well, we should rather decide, what is the right behavior of the
> > checkpointing code when the journal is aborted. When journal gets aborted,
> > we are in serious trouble. Our standard answer to this is to stop modifying
> > the filesystem as that has a chance of corrupting it even more. I think
> > that avoiding checkpointing on a filesystem with aborted journal is thus
> > what we really want.
> > 
> > To fix the issue you've reported, we just need to teach
> > jbd2_log_do_checkpoint() to cleanup all the buffers in the transactions
> > when the journal is aborted so that journal_destroy() can proceed. I can
> > have a look at it sometime next week unless someone beats me to it.
> 
> OK, attached is the patch that fixes the issue for me. Ted, can you please
> pick it up? Thanks!

I confirmed that the attached patch fixed the infinite loop on aborted
journal. Thanks!

Eryu
> 
> 								Honza
> > > > On 2015/7/15 22:30, Eryu Guan wrote:
> > > > > Hi all,
> > > > > 
> > > > > I found an infinite loop when unmounting a known bad ext3 image (using
> > > > > ext4 driver) with 4.2-rc1 kernel.
> > > > > 
> > > > > The fs image can be found here
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=10882#c1
> > > > > 
> > > > > Reproduce steps:
> > > > >   mount -o loop ext3.img /mnt/ext3
> > > > >   rm -rf /mnt/ext3/{dev,proc,sys}
> > > > >   umount /mnt/ext3	# never return
> > > > > 
> > > > > And this issue was introduced by
> > > > > 6f6a6fd jbd2: fix ocfs2 corrupt when updating journal superblock fails
> > > > > 
> > > > > It's looping in
> > > > > fs/jbd2/journal.c:jbd2_journal_destroy()
> > > > > ...
> > > > > 1693         while (journal->j_checkpoint_transactions != NULL) {                                                                             
> > > > > 1694                 spin_unlock(&journal->j_list_lock);                                                                                      
> > > > > 1695                 mutex_lock(&journal->j_checkpoint_mutex);                                                                                
> > > > > 1696                 jbd2_log_do_checkpoint(journal);                                                                                         
> > > > > 1697                 mutex_unlock(&journal->j_checkpoint_mutex);                                                                              
> > > > > 1698                 spin_lock(&journal->j_list_lock);                                                                                        
> > > > > 1699         }
> > > > > ...
> > > > > 
> > > > > Because jbd2_cleanup_journal_tail() is returning -EIO on aborted journal
> > > > > now instead of 1, and jbd2_log_do_checkpoint() won't cleanup
> > > > > journal->j_checkpoint_transactions in this while loop.
> > > > > 
> > > > > A quick hack would be
> > > > > 
> > > > > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> > > > > index 4227dc4..1b2ea47 100644
> > > > > --- a/fs/jbd2/checkpoint.c
> > > > > +++ b/fs/jbd2/checkpoint.c
> > > > > @@ -220,11 +220,13 @@ int jbd2_log_do_checkpoint(journal_t *journal)
> > > > >          * don't need checkpointing, just eliminate them from the
> > > > >          * journal straight away.
> > > > >          */
> > > > > -       result = jbd2_cleanup_journal_tail(journal);
> > > > > -       trace_jbd2_checkpoint(journal, result);
> > > > > -       jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > > > -       if (result <= 0)
> > > > > -               return result;
> > > > > +       if (!is_journal_aborted(journal)) {
> > > > > +               result = jbd2_cleanup_journal_tail(journal);
> > > > > +               trace_jbd2_checkpoint(journal, result);
> > > > > +               jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > > > +               if (result <= 0)
> > > > > +                       return result;
> > > > > +       }
> > > > >  
> > > > >         /*
> > > > >          * OK, we need to start writing disk blocks.  Take one transaction
> > > > > 
> > > > > to restore the old behavior (continue on aborted journal). Maybe someone
> > > > > has a better fix.
> > > > > 
> > > > > Thanks,
> > > > > Eryu
> > > > > 
> > > > > .
> > > > > 
> > > > 
> > > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > -- 
> > Jan Kara <jack@suse.com>
> > SUSE Labs, CR
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

> From 2839d314353b6d9e88355dc0b30927d38b2cb7b7 Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.com>
> Date: Mon, 27 Jul 2015 21:42:08 +0200
> Subject: [PATCH] jbd2: Avoid infinite loop when destroying aborted journal
> 
> Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal
> superblock fails" changed jbd2_cleanup_journal_tail() to return EIO when
> the journal is aborted. That makes logic in jbd2_log_do_checkpoint()
> bail out which is fine, except that jbd2_journal_destroy() expects
> jbd2_log_do_checkpoint() to always make a progress in cleaning the
> journal. Without it jbd2_journal_destroy() just loops in an infinite
> loop.
> 
> Fix jbd2_journal_destroy() to cleanup journal checkpoint lists of
> jbd2_log_do_checkpoint() fails with error.
> 
> Reported-by: Eryu Guan <guaneryu@gmail.com>
> Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a
> Signed-off-by: Jan Kara <jack@suse.com>
> ---
>  fs/jbd2/checkpoint.c | 39 +++++++++++++++++++++++++++++++++------
>  fs/jbd2/commit.c     |  2 +-
>  fs/jbd2/journal.c    | 11 ++++++++++-
>  include/linux/jbd2.h |  3 ++-
>  4 files changed, 46 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 4227dc4f7437..8c44654ce274 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -417,12 +417,12 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
>   * journal_clean_one_cp_list
>   *
>   * Find all the written-back checkpoint buffers in the given list and
> - * release them.
> + * release them. If 'destroy' is set, clean all buffers unconditionally.
>   *
>   * Called with j_list_lock held.
>   * Returns 1 if we freed the transaction, 0 otherwise.
>   */
> -static int journal_clean_one_cp_list(struct journal_head *jh)
> +static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy)
>  {
>  	struct journal_head *last_jh;
>  	struct journal_head *next_jh = jh;
> @@ -436,7 +436,10 @@ static int journal_clean_one_cp_list(struct journal_head *jh)
>  	do {
>  		jh = next_jh;
>  		next_jh = jh->b_cpnext;
> -		ret = __try_to_free_cp_buf(jh);
> +		if (!destroy)
> +			ret = __try_to_free_cp_buf(jh);
> +		else
> +			ret = __jbd2_journal_remove_checkpoint(jh) + 1;
>  		if (!ret)
>  			return freed;
>  		if (ret == 2)
> @@ -459,10 +462,11 @@ static int journal_clean_one_cp_list(struct journal_head *jh)
>   * journal_clean_checkpoint_list
>   *
>   * Find all the written-back checkpoint buffers in the journal and release them.
> + * If 'destroy' is set, release all buffers unconditionally.
>   *
>   * Called with j_list_lock held.
>   */
> -void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
> +void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy)
>  {
>  	transaction_t *transaction, *last_transaction, *next_transaction;
>  	int ret;
> @@ -476,7 +480,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
>  	do {
>  		transaction = next_transaction;
>  		next_transaction = transaction->t_cpnext;
> -		ret = journal_clean_one_cp_list(transaction->t_checkpoint_list);
> +		ret = journal_clean_one_cp_list(transaction->t_checkpoint_list,
> +						destroy);
>  		/*
>  		 * This function only frees up some memory if possible so we
>  		 * dont have an obligation to finish processing. Bail out if
> @@ -492,7 +497,7 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
>  		 * we can possibly see not yet submitted buffers on io_list
>  		 */
>  		ret = journal_clean_one_cp_list(transaction->
> -				t_checkpoint_io_list);
> +				t_checkpoint_io_list, destroy);
>  		if (need_resched())
>  			return;
>  		/*
> @@ -506,6 +511,28 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
>  }
>  
>  /*
> + * Remove buffers from all checkpoint lists as journal is aborted and we just
> + * need to free memory
> + */
> +void jbd2_journal_destroy_checkpoint(journal_t *journal)
> +{
> +	/*
> +	 * We loop because __jbd2_journal_clean_checkpoint_list() may abort
> +	 * early due to a need of rescheduling.
> +	 */
> +	while (1) {
> +		spin_lock(&journal->j_list_lock);
> +		if (!journal->j_checkpoint_transactions) {
> +			spin_unlock(&journal->j_list_lock);
> +			break;
> +		}
> +		__jbd2_journal_clean_checkpoint_list(journal, true);
> +		spin_unlock(&journal->j_list_lock);
> +		cond_resched();
> +	}
> +}
> +
> +/*
>   * journal_remove_checkpoint: called after a buffer has been committed
>   * to disk (either by being write-back flushed to disk, or being
>   * committed to the log).
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index b73e0215baa7..362e5f614450 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -510,7 +510,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>  	 * frees some memory
>  	 */
>  	spin_lock(&journal->j_list_lock);
> -	__jbd2_journal_clean_checkpoint_list(journal);
> +	__jbd2_journal_clean_checkpoint_list(journal, false);
>  	spin_unlock(&journal->j_list_lock);
>  
>  	jbd_debug(3, "JBD2: commit phase 1\n");
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index 4ff3fad4e9e3..2721513adb1f 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -1693,8 +1693,17 @@ int jbd2_journal_destroy(journal_t *journal)
>  	while (journal->j_checkpoint_transactions != NULL) {
>  		spin_unlock(&journal->j_list_lock);
>  		mutex_lock(&journal->j_checkpoint_mutex);
> -		jbd2_log_do_checkpoint(journal);
> +		err = jbd2_log_do_checkpoint(journal);
>  		mutex_unlock(&journal->j_checkpoint_mutex);
> +		/*
> +		 * If checkpointing failed, just free the buffers to avoid
> +		 * looping forever
> +		 */
> +		if (err) {
> +			jbd2_journal_destroy_checkpoint(journal);
> +			spin_lock(&journal->j_list_lock);
> +			break;
> +		}
>  		spin_lock(&journal->j_list_lock);
>  	}
>  
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index edb640ae9a94..eb1cebed3f36 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1042,8 +1042,9 @@ void jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block);
>  extern void jbd2_journal_commit_transaction(journal_t *);
>  
>  /* Checkpoint list management */
> -void __jbd2_journal_clean_checkpoint_list(journal_t *journal);
> +void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy);
>  int __jbd2_journal_remove_checkpoint(struct journal_head *);
> +void jbd2_journal_destroy_checkpoint(journal_t *journal);
>  void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *);
>  
>  
> -- 
> 2.1.4
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] infinite loop when unmounting ext3/4
  2015-07-28  7:15         ` Eryu Guan
@ 2015-07-28 19:04           ` Theodore Ts'o
  0 siblings, 0 replies; 7+ messages in thread
From: Theodore Ts'o @ 2015-07-28 19:04 UTC (permalink / raw)
  To: Eryu Guan; +Cc: Jan Kara, Joseph Qi, linux-ext4

On Tue, Jul 28, 2015 at 03:15:26PM +0800, Eryu Guan wrote:
> > > To fix the issue you've reported, we just need to teach
> > > jbd2_log_do_checkpoint() to cleanup all the buffers in the transactions
> > > when the journal is aborted so that journal_destroy() can proceed. I can
> > > have a look at it sometime next week unless someone beats me to it.
> > 
> > OK, attached is the patch that fixes the issue for me. Ted, can you please
> > pick it up? Thanks!
> 
> I confirmed that the attached patch fixed the infinite loop on aborted
> journal. Thanks!

Applied, thanks.

						- Ted

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-07-28 19:04 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-15 14:30 [BUG] infinite loop when unmounting ext3/4 Eryu Guan
2015-07-16  6:18 ` Joseph Qi
2015-07-20  7:23   ` Eryu Guan
2015-07-23 20:41     ` Jan Kara
2015-07-27 19:54       ` Jan Kara
2015-07-28  7:15         ` Eryu Guan
2015-07-28 19:04           ` Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).