Re: [f2fs-dev] [PATCH] f2fs: fix long latency due to discard during umount

From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Sahitya Tummala <stummala@codeaurora.org>
Cc: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: fix long latency due to discard during umount
Date: Fri, 13 Mar 2020 08:38:47 -0700
Message-ID: <20200313153847.GA185439@google.com>
In-Reply-To: <20200313051245.GK20234@codeaurora.org>

On 03/13, Sahitya Tummala wrote:
> On Thu, Mar 12, 2020 at 06:45:35PM -0700, Jaegeuk Kim wrote:
> > On 03/13, Sahitya Tummala wrote:
> > > On Thu, Mar 12, 2020 at 10:02:42AM -0700, Jaegeuk Kim wrote:
> > > > On 03/12, Sahitya Tummala wrote:
> > > > > F2FS already has a default timeout of 5 secs for discards that
> > > > > can be issued during umount, but it can take more than the 5 sec
> > > > > timeout if the underlying UFS device queue is already full and there
> > > > > are no more available free tags to be used. In that case, submit_bio()
> > > > > will wait for the already queued discard requests to complete to get
> > > > > a free tag, which can potentially take way more than 5 sec.
> > > > > 
> > > > > Fix this by submitting the discard requests with REQ_NOWAIT
> > > > > flags during umount. This will return -EAGAIN for UFS queue/tag full
> > > > > scenario without waiting in the context of submit_bio(). The FS can
> > > > > then handle these requests by retrying again within the stipulated
> > > > > discard timeout period to avoid long latencies.
> > > > > 
> > > > > Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
> > > > > ---
> > > > >  fs/f2fs/segment.c | 14 +++++++++++++-
> > > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > > > index fb3e531..a06bbac 100644
> > > > > --- a/fs/f2fs/segment.c
> > > > > +++ b/fs/f2fs/segment.c
> > > > > @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > > >  	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> > > > >  	struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> > > > >  					&(dcc->fstrim_list) : &(dcc->wait_list);
> > > > > -	int flag = dpolicy->sync ? REQ_SYNC : 0;
> > > > > +	int flag;
> > > > >  	block_t lstart, start, len, total_len;
> > > > >  	int err = 0;
> > > > >  
> > > > > +	flag = dpolicy->sync ? REQ_SYNC : 0;
> > > > > +	flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> > > > > +
> > > > >  	if (dc->state != D_PREP)
> > > > >  		return 0;
> > > > >  
> > > > > @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> > > > >  		bio->bi_end_io = f2fs_submit_discard_endio;
> > > > >  		bio->bi_opf |= flag;
> > > > >  		submit_bio(bio);
> > > > > +		if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> > > > > +			dc->state = D_PREP;
> > > > > +			err = dc->error;
> > > > > +			break;
> > > > > +		}
> > > > >  
> > > > >  		atomic_inc(&dcc->issued_discard);
> > > > >  
> > > > > @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> > > > >  			}
> > > > >  
> > > > >  			__submit_discard_cmd(sbi, dpolicy, dc, &issued);
> > > > > +			if (dc->error == -EAGAIN) {
> > > > > +				congestion_wait(BLK_RW_ASYNC, HZ/50);
> > > > 
> > > > 						--> needs to be DEFAULT_IO_TIMEOUT
> > > 
> > > Yes, I will update it.
> > > 
> > > > 
> > > > > +				__relocate_discard_cmd(dcc, dc);
> > > > 
> > > > It seems we need to submit bio first, and then move dc to wait_list, if there's
> > > > no error, in __submit_discard_cmd().
> > > 
> > > Yes, that is not changed; it still happens for the failed request
> > > that is requeued here, when it gets submitted again later.
> > > 
> > > I am requeuing the discard request that failed with -EAGAIN from
> > > wait_list back to dcc->pend_list[]. The next call to __submit_discard_cmd()
> > > will call submit_bio() for this request again and then move it to wait_list.
> > > Please let me know if I am missing anything.
> > 
> > This patch has no problem, but I'm thinking that __submit_discard_cmd() should
> > be able to return any value while keeping the assumption that the wait list
> > holds only submitted commands.
> 
> I think dc->queued will indicate that dc has been moved to wait_list. This can
> be used along with the return value to take the right action. Can you check if
> this works?

I mean, why can't we do this *in* __submit_discard_cmd()? Otherwise, existing and
future callers would have to handle the errors every time.
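
To make the point concrete, here is a rough userspace sketch, not kernel code.
Every name in it (struct cmd, struct queue_model, model_submit_cmd() and so on)
is hypothetical and only models the idea: the submit helper moves a command to
the wait list only after a successful submission, and on -EAGAIN leaves it in
the pending state, so the caller has nothing to relocate.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are the real f2fs API. */
enum cmd_state { CMD_PREP, CMD_SUBMITTED };

struct cmd {
	enum cmd_state state;
	bool on_wait_list;	/* models membership in the wait list */
	int error;
};

struct queue_model {
	int free_tags;		/* models free tags in the device queue */
};

/* Models a no-wait submission: fails fast instead of blocking for a tag. */
static int model_submit_nowait(struct queue_model *q)
{
	if (q->free_tags == 0)
		return -EAGAIN;
	q->free_tags--;
	return 0;
}

/*
 * Submit first, and move the command to the wait list only on success.
 * On error the command stays in CMD_PREP (still pending), so the wait
 * list never holds an unsubmitted command and callers need no cleanup.
 */
static int model_submit_cmd(struct queue_model *q, struct cmd *c)
{
	int err;

	if (c->state != CMD_PREP)
		return 0;

	err = model_submit_nowait(q);
	c->error = err;
	if (err)
		return err;

	c->state = CMD_SUBMITTED;
	c->on_wait_list = true;
	return 0;
}

/*
 * Caller side: retry a bounded number of times. The free_tags++ below
 * is an assumption standing in for "an in-flight request completed
 * while we backed off"; a real caller would wait instead.
 */
static int model_issue_with_retry(struct queue_model *q, struct cmd *c,
				  int max_tries)
{
	int err = -EAGAIN;
	int i;

	for (i = 0; i < max_tries && err == -EAGAIN; i++) {
		err = model_submit_cmd(q, c);
		if (err == -EAGAIN)
			q->free_tags++;
	}
	return err;
}
```

With this shape the caller only looks at the return value; it never has to
inspect a per-command flag to find out which list the command ended up on.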

> 
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index a06bbac..91df060 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1478,7 +1478,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>         struct list_head *pend_list;
>         struct discard_cmd *dc, *tmp;
>         struct blk_plug plug;
> -       int i, issued = 0;
> +       int i, err, issued = 0;
>         bool io_interrupted = false;
> 
>         if (dpolicy->timeout != 0)
> @@ -1517,8 +1517,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>                                 break;
>                         }
> 
> -                       __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> -                       if (dc->error == -EAGAIN) {
> +                       err = __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> +                       if (err && err != -EAGAIN) {
> +                               __remove_discard_cmd(sbi, dc);
> +                       } else if (err == -EAGAIN && dc->queued) {
>                                 congestion_wait(BLK_RW_ASYNC, HZ/50);
>                                 __relocate_discard_cmd(dcc, dc);
>                         }
> 
> thanks,
> > 
> > > 
> > > Thanks,
> > > 
> > > > 
> > > > > +			}
> > > > >  
> > > > >  			if (issued >= dpolicy->max_requests)
> > > > >  				break;
> > > > > -- 
> > > > > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> > > > > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.
> > > 
> > > -- 
> > > --
> > > Sent by a consultant of the Qualcomm Innovation Center, Inc.
> > > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
> 
> -- 
> --
> Sent by a consultant of the Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


