Re: [PATCH v2] f2fs: fix sbi->extent_list corruption issue

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sahitya Tummala <stummala@codeaurora.org>
To: Chao Yu <yuchao0@huawei.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] f2fs: fix sbi->extent_list corruption issue
Date: Thu, 10 Jan 2019 12:09:21 +0530	[thread overview]
Message-ID: <20190110063921.GA7308@codeaurora.org> (raw)
In-Reply-To: <9e8fb00b-75e7-1f69-0255-26de800a6688@huawei.com>

On Wed, Jan 09, 2019 at 02:13:02PM +0800, Chao Yu wrote:
> On 2019/1/9 12:38, Jaegeuk Kim wrote:
> > On 01/07, Chao Yu wrote:
> >> On 2019/1/5 4:33, Jaegeuk Kim wrote:
> >>> On 01/04, Sahitya Tummala wrote:
> >>>> On Mon, Nov 26, 2018 at 10:17:20AM +0530, Sahitya Tummala wrote:
> >>>>> When there is a failure in f2fs_fill_super() after/during
> >>>>> the recovery of fsync'd nodes, it frees the current sbi and
> >>>>> retries again. This time the mount is successful, but the files
> >>>>> that got recovered before retry, still holds the extent tree,
> >>>>> whose extent nodes list is corrupted since sbi and sbi->extent_list
> >>>>> is freed up. The list_del corruption issue is observed when the
> >>>>> file system is getting unmounted and when those recoverd files extent
> >>>>> node is being freed up in the below context.
> >>>>>
> >>>>> list_del corruption. prev->next should be fffffff1e1ef5480, but was (null)
> >>>>> <...>
> >>>>> kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
> >>>>> task: fffffff1f46f2280 task.stack: ffffff8008068000
> >>>>> lr : __list_del_entry_valid+0x94/0xb4
> >>>>> pc : __list_del_entry_valid+0x94/0xb4
> >>>>> <...>
> >>>>> Call trace:
> >>>>> __list_del_entry_valid+0x94/0xb4
> >>>>> __release_extent_node+0xb0/0x114
> >>>>> __free_extent_tree+0x58/0x7c
> >>>>> f2fs_shrink_extent_tree+0xdc/0x3b0
> >>>>> f2fs_leave_shrinker+0x28/0x7c
> >>>>> f2fs_put_super+0xfc/0x1e0
> >>>>> generic_shutdown_super+0x70/0xf4
> >>>>> kill_block_super+0x2c/0x5c
> >>>>> kill_f2fs_super+0x44/0x50
> >>>>> deactivate_locked_super+0x60/0x8c
> >>>>> deactivate_super+0x68/0x74
> >>>>> cleanup_mnt+0x40/0x78
> >>>>> __cleanup_mnt+0x1c/0x28
> >>>>> task_work_run+0x48/0xd0
> >>>>> do_notify_resume+0x678/0xe98
> >>>>> work_pending+0x8/0x14
> >>>>>
> >>>>> Fix this by cleaning up inodes, extent tree and nodes of those
> >>>>> recovered files before freeing up sbi and before next retry.
> >>>>>
> >>>> Hi Jaegeuk, Chao,
> >>>>
> >>>> I have observed another scenario where the similar list corruption issue
> >>>> can happen with sbi->inode_list as well. If recover_fsync_data()
> >>>> fails at some point in write_checkpoint() due to some error and if
> >>>> those recovered inodes are still dirty, then after the mount is
> >>>> successful, this issue is observed when that dirty inode is under 
> >>>> writeback.
> >>>
> >>> recover_fsync_data() does iget/iput in pair, and destroy_fsync_dnodes() drops
> >>> its dirty list and call iput(), when there is an error. So, after then, there'd
> >>> be no dirty inodes. If there's no error, checkpoint() flushes quota/dentry pages
> >>> in dirty inodes as well. Can we check where this dirty inode came from?
> >>
> >> I guess it comes from:
> >>
> >> f2fs_recover_fsync_data()
> >>
> >> 	/* Needed for iput() to work correctly and not trash data */
> >> 	sbi->sb->s_flags |= SB_ACTIVE;
> >>
> >> iput_final()
> >>
> >> 	if (!drop && (sb->s_flags & SB_ACTIVE)) {
> >> 		inode_add_lru(inode);
> >> 		spin_unlock(&inode->i_lock);
> >> 		return;
> >> 	}
> >>
> >> So dirty data in those inode can be remained after iput(), then meta/node
> >> can be persisted during next checkpoint, if checkpoint failed due to error,
> >> dirty inode remain in system. IIUC.
> > 
> > 
> > 749         err = recover_data(sbi, &inode_list, &tmp_inode_list, &dir_list);
> > 750         if (!err)
> > 751                 f2fs_bug_on(sbi, !list_empty(&inode_list));
> > 752         else {
> > 753                 /* restore s_flags to let iput() trash data */
> > 754                 sbi->sb->s_flags = s_flags;
> > 755         }
> > 
> > We deactivate sb before iput?
> 
> We only restore s_flags in error path of recover_data? I remember Sahitya
> said his case is encountering error in later checkpoint()?
> 

Thanks Chao and Jaegeuk for taking a look at this.
I might be wrong earlier on the point of error.
The case that Jaegeuk has addressed below looks good for now.
https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/commit/?h=dev&id=cd2dcebde8b8088b8ad77184ee3a1e81fd3da6ba

> Thanks,
> 
> > 
> >>
> >>>
> >>> Oh, one sceanrio can be an error by f2fs_disable_checkpoint() which will do GC.
> >>>
> >>>>
> >>>> [   90.400500] list_del corruption. prev->next should be ffffffed1f566208, but was (null)
> >>>> [   90.675349] Call trace:
> >>>> [   90.677869]  __list_del_entry_valid+0x94/0xb4
> >>>> [   90.682351]  remove_dirty_inode+0xac/0x114
> >>>> [   90.686563]  __f2fs_write_data_pages+0x6a8/0x6c8
> >>>> [   90.691302]  f2fs_write_data_pages+0x40/0x4c
> >>>> [   90.695695]  do_writepages+0x80/0xf0
> >>>> [   90.699372]  __writeback_single_inode+0xdc/0x4ac
> >>>> [   90.704113]  writeback_sb_inodes+0x280/0x440
> >>>> [   90.708501]  wb_writeback+0x1b8/0x3d0
> >>>> [   90.712267]  wb_workfn+0x1a8/0x4d4
> >>>> [   90.715765]  process_one_work+0x1c0/0x3d4
> >>>> [   90.719883]  worker_thread+0x224/0x344
> >>>> [   90.723739]  kthread+0x120/0x130
> >>>> [   90.727055]  ret_from_fork+0x10/0x18
> >>>>
> >>>> I think it is better to cleanup those inodes completely before freeing sbi
> >>>> and before next retry as done in this patch. Would you like to re-consider
> >>>> this patch for this new issue?
> >>>
> >>> The patch was merged in mainline already.
> >>> Could you take a look at this patch?
> >>>
> >>> >From cb1d20e640402beed300c2bdce79311ee8a781ad Mon Sep 17 00:00:00 2001
> >>> From: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> Date: Fri, 4 Jan 2019 12:29:00 -0800
> >>> Subject: [PATCH] f2fs: sync filesystem after roll-forward recovery
> >>
> >> You mean android kernel mainline?
> > 
> > I meant the previous patch was upstreamed. We need another patch to address this
> > second issue.
> > 
> > Thanks,
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Some works after roll-forward recovery can get an error which will release
> >>> all the data structures. Let's flush them in order to make it clean.
> >>>
> >>> One possible corruption came from:
> >>>
> >>> [   90.400500] list_del corruption. prev->next should be ffffffed1f566208, but was (null)
> >>> [   90.675349] Call trace:
> >>> [   90.677869]  __list_del_entry_valid+0x94/0xb4
> >>> [   90.682351]  remove_dirty_inode+0xac/0x114
> >>> [   90.686563]  __f2fs_write_data_pages+0x6a8/0x6c8
> >>> [   90.691302]  f2fs_write_data_pages+0x40/0x4c
> >>> [   90.695695]  do_writepages+0x80/0xf0
> >>> [   90.699372]  __writeback_single_inode+0xdc/0x4ac
> >>> [   90.704113]  writeback_sb_inodes+0x280/0x440
> >>> [   90.708501]  wb_writeback+0x1b8/0x3d0
> >>> [   90.712267]  wb_workfn+0x1a8/0x4d4
> >>> [   90.715765]  process_one_work+0x1c0/0x3d4
> >>> [   90.719883]  worker_thread+0x224/0x344
> >>> [   90.723739]  kthread+0x120/0x130
> >>> [   90.727055]  ret_from_fork+0x10/0x18
> >>>
> >>> Reported-by: Sahitya Tummala <stummala@codeaurora.org>
> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> >>> ---
> >>>  fs/f2fs/super.c | 8 ++++++--
> >>>  1 file changed, 6 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >>> index 547cb7459be7..bb02186293a3 100644
> >>> --- a/fs/f2fs/super.c
> >>> +++ b/fs/f2fs/super.c
> >>> @@ -3357,7 +3357,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>>  	if (test_opt(sbi, DISABLE_CHECKPOINT)) {
> >>>  		err = f2fs_disable_checkpoint(sbi);
> >>>  		if (err)
> >>> -			goto free_meta;
> >>> +			goto sync_free_meta;
> >>>  	} else if (is_set_ckpt_flags(sbi, CP_DISABLED_FLAG)) {
> >>>  		f2fs_enable_checkpoint(sbi);
> >>>  	}
> >>> @@ -3370,7 +3370,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>>  		/* After POR, we can run background GC thread.*/
> >>>  		err = f2fs_start_gc_thread(sbi);
> >>>  		if (err)
> >>> -			goto free_meta;
> >>> +			goto sync_free_meta;
> >>>  	}
> >>>  	kvfree(options);
> >>>  
> >>> @@ -3392,6 +3392,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
> >>>  	f2fs_update_time(sbi, REQ_TIME);
> >>>  	return 0;
> >>>  
> >>> +sync_free_meta:
> >>> +	/* safe to flush all the data */
> >>> +	sync_filesystem(sbi->sb);
> >>> +
> >>>  free_meta:
> >>>  	/* flush dirty orphan inode objects */
> >>>  	f2fs_sync_inode_meta(sbi);
> >>>
> > 
> > .
> > 
> 

-- 
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

     prev parent reply	other threads:[~2019-01-10  6:39 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-26  4:47 [PATCH v2] f2fs: fix sbi->extent_list corruption issue Sahitya Tummala
2018-11-26  7:17 ` Chao Yu
2018-11-26  7:17   ` Chao Yu
2018-11-27  0:30 ` Jaegeuk Kim
2018-11-27  1:42   ` Chao Yu
2018-11-27  1:42     ` Chao Yu
2018-11-29  3:32     ` Sahitya Tummala
2018-11-30 20:33       ` Jaegeuk Kim
2018-12-07  9:47         ` Chao Yu
2018-12-07  9:47           ` Chao Yu
2018-12-12  3:17           ` Sahitya Tummala
2018-12-12  3:36             ` Chao Yu
2018-12-12  3:36               ` Chao Yu
2018-12-14  7:56               ` Sahitya Tummala
2018-12-14 14:25                 ` Jaegeuk Kim
2018-12-18 10:28                   ` Chao Yu
2018-12-18 10:28                     ` Chao Yu
2018-12-18 22:47                     ` Jaegeuk Kim
2018-12-19  3:43                       ` Chao Yu
2018-12-19  3:43                         ` Chao Yu
2018-12-19 23:42                         ` Jaegeuk Kim
2019-01-04  8:05 ` Sahitya Tummala
2019-01-04 20:33   ` Jaegeuk Kim
2019-01-04 20:33     ` Jaegeuk Kim
2019-01-07  2:25     ` Chao Yu
2019-01-07  2:25       ` Chao Yu
2019-01-09  4:38       ` Jaegeuk Kim
2019-01-09  6:13         ` Chao Yu
2019-01-09  6:13           ` Chao Yu
2019-01-10  6:39           ` Sahitya Tummala [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190110063921.GA7308@codeaurora.org \
    --to=stummala@codeaurora.org \
    --cc=jaegeuk@kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yuchao0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.