linux-f2fs-devel.lists.sourceforge.net archive mirror
From: Chao Yu <chao2.yu@samsung.com>
To: 'He YunLei' <heyunlei@huawei.com>,
	jaegeuk@kernel.org, linux-f2fs-devel@lists.sourceforge.net
Cc: 'Biao He' <hebiao6@huawei.com>
Subject: Re: [PATCH] f2fs: avoid hungtask problem caused by losing wake_up
Date: Wed, 24 Feb 2016 11:46:38 +0800
Message-ID: <00c401d16eb6$14754cf0$3d5fe6d0$@samsung.com>
In-Reply-To: <56CC442C.1050202@huawei.com>

Hi Yunlei,

> -----Original Message-----
> From: He YunLei [mailto:heyunlei@huawei.com]
> Sent: Tuesday, February 23, 2016 7:36 PM
> To: Chao Yu; jaegeuk@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Cc: bintian.wang@huawei.com; 'Biao He'
> Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by losing wake_up
> 
> On 2016/2/23 17:15, Chao Yu wrote:
> Hi Chao,
> 
> > Hi Yunlei,
> >
> >> -----Original Message-----
> >> From: He YunLei [mailto:heyunlei@huawei.com]
> >> Sent: Tuesday, February 23, 2016 3:03 PM
> >> To: Chao Yu; jaegeuk@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> >> Cc: bintian.wang@huawei.com; 'Biao He'
> >> Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by losing wake_up
> >>
> >> On 2016/2/23 13:44, Chao Yu wrote:
> >>> Hi Yunlei,
> >> Hi Chao,
> >>>
> >>>> -----Original Message-----
> >>>> From: Yunlei He [mailto:heyunlei@huawei.com]
> >>>> Sent: Tuesday, February 23, 2016 12:08 PM
> >>>> To: chao2.yu@samsung.com; jaegeuk@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> >>>> Cc: bintian.wang@huawei.com; Yunlei He; Biao He
> >>>> Subject: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by losing wake_up
> >>>>
> >>>> A task sleeping in D state in wait_on_all_pages_writeback should be
> >>>> woken by function f2fs_write_end_io when all writeback pages have been
> >>>> successfully written to the device. It's possible that wake_up comes
> >>>> between get_pages and io_schedule. In this case the wake_up may be
> >>>> lost and the task stays in D state even if all pages have been
> >>>> written back to the device, and finally the whole system ends up in
> >>>> the hungtask state.
> >>>
> >>> I haven't encountered such an issue so far; did you hit this in the
> >>> real world?
> >>>
> >> Yes, I have encountered it: the whole file system was blocked in function
> >> wait_on_all_pages_writeback for more than 120s while writing a checkpoint,
> >> and no error was reported by the storage device driver.
> >
> > Is this reproducible? If so, could you please share the details?
> > And did this occur on a very large f2fs image?
> >
> >>>>
> >>>>                   if (!get_pages(sbi, F2FS_WRITEBACK))
> >>>>                            break;
> >>>> 					<---------  wake_up
> >>>
> >>> wake_up will put all tasks linked in sbi->cp_wait on the run-queue, so
> >>> here it should be safe to call io_schedule; after being rescheduled,
> >>> it will get the chance to check the above condition and break out.
> >>>
> >>> Thanks,
> >>
> >> Here, we just suspect that something weird may prevent
> >> wait_on_all_pages_writeback from being woken up. wake_up is triggered only
> >> once, by the last bio's end_io function; if the thread happens to miss it,
> >> the thread will stay in D state forever. So we changed the code to make
> >> wait_on_all_pages_writeback wake up periodically and then check the condition.
> >
> > Got it.
> >
> > The patch can fix the issue where the checkpointer waits forever in case
> > write_end_io failed to call wake_up for some reason.

I found one possible case:

CPU0:					CPU1:
 - write_checkpoint
  - do_checkpoint
   - wait_on_all_pages_writeback
					 - f2fs_write_end_io
					  - wake_up
					This is the last page under writeback,
					but there is no sleeper in the
					sbi->cp_wait wait queue, so wake_up
					is not called.
    - prepare_to_wait(TASK_UNINTERRUPTIBLE)
    Here, the current task is preempted,
    but there will be no waker to wake up
    this task, since the last write_end_io
    has already been called. So the current
    task will sleep forever.
    - io_schedule

What do you think of it?
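
The interleaving above is the usual lost wake-up pattern: the condition is
checked and the sleep is entered without anything that orders them against
the waker. As a rough user-space analogy only (this is not f2fs code, and
pthreads merely stand in for the kernel wait queue), the sketch below keeps
the check and the sleep under one lock. All names in it (wb_lock, wb_done,
writeback_pages, end_io_thread, run_demo) are hypothetical. Unlike the patch
below, it also decrements the counter under the lock, which is the simplest
way to make the user-space version correct.

```c
/* User-space analogy only: pthreads stand in for the kernel wait queue. */
#include <assert.h>
#include <pthread.h>

#define NR_PAGES 4	/* hypothetical number of pages under writeback */

static pthread_mutex_t wb_lock = PTHREAD_MUTEX_INITIALIZER; /* role of cp_wb_lock */
static pthread_cond_t wb_done = PTHREAD_COND_INITIALIZER;   /* role of cp_wait */
static int writeback_pages;                                  /* role of the F2FS_WRITEBACK count */

/* Completion side: drop the count under the lock, wake waiters on the last page. */
static void *end_io_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < NR_PAGES; i++) {
		pthread_mutex_lock(&wb_lock);
		assert(writeback_pages > 0);
		if (--writeback_pages == 0)
			pthread_cond_broadcast(&wb_done);
		pthread_mutex_unlock(&wb_lock);
	}
	return NULL;
}

/*
 * Waiter side: the condition is tested with the lock held, and
 * pthread_cond_wait() releases the lock and sleeps atomically, so a
 * wake-up cannot slip in between the check and the sleep.
 */
static int wait_for_writeback(void)
{
	int remaining;

	pthread_mutex_lock(&wb_lock);
	while (writeback_pages > 0)
		pthread_cond_wait(&wb_done, &wb_lock);
	remaining = writeback_pages;
	pthread_mutex_unlock(&wb_lock);
	return remaining;
}

/* Start one completion thread, wait for it, and return the remaining count. */
static int run_demo(void)
{
	pthread_t t;
	int remaining;

	pthread_mutex_lock(&wb_lock);
	writeback_pages = NR_PAGES;
	pthread_mutex_unlock(&wb_lock);

	if (pthread_create(&t, NULL, end_io_thread, NULL) != 0)
		return -1;
	remaining = wait_for_writeback();
	pthread_join(t, NULL);
	return remaining;
}
```

Calling run_demo() from a small main() (built with -pthread) should return
once the count reaches zero, whichever thread runs first. The periodic
re-check from the original io_schedule_timeout(5*HZ) proposal would roughly
correspond to using pthread_cond_timedwait() in the waiter loop instead.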

And if this is right, the following patch can fix this issue.

---
 fs/f2fs/checkpoint.c | 14 +++++++++-----
 fs/f2fs/data.c       |  9 +++++++--
 fs/f2fs/f2fs.h       |  3 ++-
 fs/f2fs/super.c      |  1 +
 4 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 9d277f8..9446c3d 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -914,15 +914,19 @@ static void wait_on_all_pages_writeback(struct f2fs_sb_info *sbi)
 {
 	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
+	spin_lock(&sbi->cp_wb_lock);
 
-		if (!get_pages(sbi, F2FS_WRITEBACK))
-			break;
+	while (get_pages(sbi, F2FS_WRITEBACK)) {
+		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
 
+		spin_unlock(&sbi->cp_wb_lock);
 		io_schedule();
+		spin_lock(&sbi->cp_wb_lock);
+
+		finish_wait(&sbi->cp_wait, &wait);
 	}
-	finish_wait(&sbi->cp_wait, &wait);
+
+	spin_unlock(&sbi->cp_wb_lock);
 }
 
 static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index e5c762b..e31deb97 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -59,6 +59,7 @@ static void f2fs_write_end_io(struct bio *bio)
 {
 	struct f2fs_sb_info *sbi = bio->bi_private;
 	struct bio_vec *bvec;
+	unsigned long flags;
 	int i;
 
 	bio_for_each_segment_all(bvec, bio, i) {
@@ -74,8 +75,12 @@ static void f2fs_write_end_io(struct bio *bio)
 		dec_page_count(sbi, F2FS_WRITEBACK);
 	}
 
-	if (!get_pages(sbi, F2FS_WRITEBACK) && wq_has_sleeper(&sbi->cp_wait))
-		wake_up(&sbi->cp_wait);
+	if (!get_pages(sbi, F2FS_WRITEBACK)) {
+		spin_lock_irqsave(&sbi->cp_wb_lock, flags);
+		if (wq_has_sleeper(&sbi->cp_wait))
+			wake_up(&sbi->cp_wait);
+		spin_unlock_irqrestore(&sbi->cp_wb_lock, flags);
+	}
 
 	bio_put(bio);
 }
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 0d25430..fd47984 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -727,7 +727,8 @@ struct f2fs_sb_info {
 	struct rw_semaphore cp_rwsem;		/* blocking FS operations */
 	struct rw_semaphore node_write;		/* locking node writes */
 	struct mutex writepages;		/* mutex for writepages() */
-	wait_queue_head_t cp_wait;
+	wait_queue_head_t cp_wait;		/* for wait pages writeback */
+	spinlock_t cp_wb_lock;			/* for protect cp_wait */
 	unsigned long last_time[MAX_TIME];	/* to store time in jiffies */
 	long interval_time[MAX_TIME];		/* to store thresholds */
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 7b62016..5316c7a 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1374,6 +1374,7 @@ try_onemore:
 
 	init_rwsem(&sbi->cp_rwsem);
 	init_waitqueue_head(&sbi->cp_wait);
+	spin_lock_init(&sbi->cp_wb_lock);
 	init_sb_info(sbi);
 
 	/* get an inode for meta space */
-- 
2.7.0

> >
> > But I suspect it is more likely that we are stuck because there are pages
> > remaining cached in the bio buffer without being submitted. To make sure,
> > maybe in wait_on_all_pages_writeback we could add a print to see whether
> > sbi->write_io[].bio is valid or not.
> >
> We used a tool to dump the f2fs_sb_info information and found that:
> 
> 	write_io[DATA].bio = 0;
> 	write_io[NODE].bio = 0;
> 	write_io[META].bio = 0;
> 
> 	nr_pages[F2FS_WRITEBACK] = 0;
> 	nr_pages[F2FS_DIRTY_DENTS] = 0;
> 	nr_pages[F2FS_DIRTY_NODES] = 13;

Weird, the dirty nodes count should be 0.

Thanks

> 	nr_pages[F2FS_DIRTY_META] = 0;
> 	nr_pages[F2FS_INMEM_PAGES] = 0;
> 
> So we believe that the block device is ok!
> 
> Thanks,
> 
> > Thanks,
> >
> >>
> >>>
> >>>>                   io_schedule();
> >>>>
> >>>> Signed-off-by: Yunlei He <heyunlei@huawei.com>
> >>>> Signed-off-by: Biao He <hebiao6@huawei.com>
> >>>> ---
> >>>>    fs/f2fs/checkpoint.c | 2 +-
> >>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> >>>> index 2bac8a1..f55355d 100644
> >>>> --- a/fs/f2fs/checkpoint.c
> >>>> +++ b/fs/f2fs/checkpoint.c
> >>>> @@ -920,7 +920,7 @@ static void wait_on_all_pages_writeback(struct f2fs_sb_info *sbi)
> >>>>    		if (!get_pages(sbi, F2FS_WRITEBACK))
> >>>>    			break;
> >>>>
> >>>> -		io_schedule();
> >>>> +		io_schedule_timeout(5*HZ);
> >>>>    	}
> >>>>    	finish_wait(&sbi->cp_wait, &wait);
> >>>>    }
> >>>> --
> >>>> 1.9.1




Thread overview: 16+ messages
2016-02-23  4:07 [PATCH] f2fs: avoid hungtask problem caused by losing wake_up Yunlei He
2016-02-23  5:44 ` Chao Yu
2016-02-23  7:02   ` He YunLei
2016-02-23  9:15     ` Chao Yu
2016-02-23 11:36       ` He YunLei
2016-02-24  3:46         ` Chao Yu [this message]
2016-02-24  7:32           ` He YunLei
2016-02-24  8:05             ` Chao Yu
2016-02-24  9:45               ` hebiao (G)
2016-02-25  9:32                 ` Chao Yu
2016-02-25  7:36           ` He YunLei
2016-02-25  9:41             ` Chao Yu
2016-02-25 19:03               ` Jaegeuk Kim
2016-02-26  1:15                 ` Chao Yu
2016-02-23  9:32     ` Shawn Lin
2016-02-23 11:45       ` He YunLei
