From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Can Guo <cang@codeaurora.org>
Cc: bvanassche@acm.org, linux-scsi@vger.kernel.org,
linux-kernel@vger.kernel.org,
linux-f2fs-devel@lists.sourceforge.net, avri.altman@wdc.com,
alim.akhtar@samsung.com, kernel-team@android.com
Subject: Re: [f2fs-dev] [PATCH v3 1/5] scsi: ufs: atomic update for clkgating_enable
Date: Mon, 26 Oct 2020 12:48:17 -0700
Message-ID: <20201026194817.GA359340@google.com>
In-Reply-To: <6c029b64cb4d78e7624bc896f9c9f16d@codeaurora.org>

On 10/26, Can Guo wrote:
> On 2020-10-26 14:13, Jaegeuk Kim wrote:
> > On 10/26, Can Guo wrote:
> > > On 2020-10-24 23:06, Jaegeuk Kim wrote:
> > > > From: Jaegeuk Kim <jaegeuk@google.com>
> > > >
> > > > > When giving a stress test which enables/disables clkgating, we hit
> > > > > device timeout sometimes. This patch avoids subtle racy condition to
> > > > > address it.
> > > > >
> > > > > If we use __ufshcd_release(), I've seen that gate_work can be called in
> > > > > parallel with ungate_work, which results in UFS timeout when doing
> > > > > hibern8. Should avoid it.
> > > >
> > >
> > > > I don't understand this comment. gate_work and ungate_work are queued
> > > > on an ordered workqueue and an ordered workqueue executes at most one
> > > > work item at any given time in the queued order. How can the two run
> > > > in parallel?
> >
> > When I hit UFS stuck, I saw this by clkgating tracepoint.
> >
> > - REQ_CLK_OFF
> > - CLKS_OFF
> > - REQ_CLK_OFF
> > - REQ_CLKS_ON
> > ..
> >
>
> I don't see how can you tell that the two works are running in parallel
> just from above trace. May I know what is the exact error by "UFS timeout
> when doing hibern8"?
>
> By using __ufshcd_release() here, I do see one potential issue if your test
> quickly toggles on/off of clk_gating - disable it, enable it, disable it and
> enable it, which will cause that __ufshcd_release() being called twice,
> meaning we queue two gate_works back to back. So can you try below code and
> let me know if it helps or not? I am OK with your current change, but I
> would like to understand the problem. Thanks.
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 1791bce..3eee438 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -2271,6 +2271,8 @@ static void ufshcd_gate_work(struct work_struct *work)
>  	unsigned long flags;
>
>  	spin_lock_irqsave(hba->host->host_lock, flags);
> +	if (hba->clk_gating.state == CLKS_OFF)
> +		goto rel_lock;
>  	/*
>  	 * In case you are here to cancel this work the gating state
>  	 * would be marked as REQ_CLKS_ON. In this case save time by

This doesn't help. So I checked this again and, as you said, I now suspect
__ufshcd_release(), which changes the state to REQ_CLKS_OFF even when it is
already CLKS_OFF. With the change below, I can't see the issue anymore. Let me
send v4.

---
 drivers/scsi/ufs/ufshcd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index b8f573a02713..cc8d5f0c3fdc 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1745,7 +1745,8 @@ static void __ufshcd_release(struct ufs_hba *hba)
 	if (hba->clk_gating.active_reqs || hba->clk_gating.is_suspended ||
 	    hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL ||
 	    ufshcd_any_tag_in_use(hba) || hba->outstanding_tasks ||
-	    hba->active_uic_cmd || hba->uic_async_done)
+	    hba->active_uic_cmd || hba->uic_async_done ||
+	    hba->clk_gating.state == CLKS_OFF)
 		return;
 
 	hba->clk_gating.state = REQ_CLKS_OFF;
--
2.29.0.rc1.297.gfa9743e501-goog
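
To make the suspected sequence concrete, here is a minimal user-space sketch of
the state transitions. It is not driver code: struct gating_model,
model_release(), model_gate_work() and model_run() are hypothetical names made
up for illustration, and the model keeps only active_reqs and the gating state
out of all the conditions __ufshcd_release() really checks. It only shows that
a second release on CLKS_OFF moves the state back to REQ_CLKS_OFF and schedules
gate work again unless the extra check is present.

```c
#include <stdbool.h>

/* State names mirror the clkgating tracepoint output quoted in this thread. */
enum clk_gating_state { CLKS_OFF, CLKS_ON, REQ_CLKS_OFF, REQ_CLKS_ON };

/* Hypothetical model; NOT the real struct ufs_hba. */
struct gating_model {
	enum clk_gating_state state;
	int active_reqs;
	int gate_works_queued;	/* how many times gate work would be scheduled */
};

/*
 * Models __ufshcd_release(). 'guard' selects whether the
 * "state == CLKS_OFF" early return from the patch above is present.
 */
static void model_release(struct gating_model *m, bool guard)
{
	if (m->active_reqs || (guard && m->state == CLKS_OFF))
		return;

	m->state = REQ_CLKS_OFF;
	m->gate_works_queued++;
}

/* Models gate work finishing: a pending gate request turns the clocks off. */
static void model_gate_work(struct gating_model *m)
{
	if (m->state == REQ_CLKS_OFF)
		m->state = CLKS_OFF;
}

/* Release, let gate work finish, then release again while already CLKS_OFF. */
static int model_run(bool guard)
{
	struct gating_model m = { .state = CLKS_ON };

	model_release(&m, guard);	/* CLKS_ON -> REQ_CLKS_OFF */
	model_gate_work(&m);		/* REQ_CLKS_OFF -> CLKS_OFF */
	model_release(&m, guard);	/* the suspicious second release */

	return m.gate_works_queued;
}
```

In this model, model_run(false) counts two scheduled gate works and
model_run(true) counts one, which matches the REQ_CLK_OFF / CLKS_OFF /
REQ_CLK_OFF pattern in the trace only under the simplifications stated above.
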
>
> Regards,
>
> Can Guo.
>
> > By using active_req, I don't see any problem.
> >
> > >
> > > Thanks,
> > >
> > > Can Guo.
> > >
> > > > Signed-off-by: Jaegeuk Kim <jaegeuk@google.com>
> > > > ---
> > > > drivers/scsi/ufs/ufshcd.c | 12 ++++++------
> > > > 1 file changed, 6 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > > > index b8f573a02713..e0b479f9eb8a 100644
> > > > --- a/drivers/scsi/ufs/ufshcd.c
> > > > +++ b/drivers/scsi/ufs/ufshcd.c
> > > > @@ -1807,19 +1807,19 @@ static ssize_t
> > > > ufshcd_clkgate_enable_store(struct device *dev,
> > > > return -EINVAL;
> > > >
> > > > value = !!value;
> > > > +
> > > > + spin_lock_irqsave(hba->host->host_lock, flags);
> > > > if (value == hba->clk_gating.is_enabled)
> > > > goto out;
> > > >
> > > > - if (value) {
> > > > - ufshcd_release(hba);
> > > > - } else {
> > > > - spin_lock_irqsave(hba->host->host_lock, flags);
> > > > + if (value)
> > > > + hba->clk_gating.active_reqs--;
> > > > + else
> > > > hba->clk_gating.active_reqs++;
> > > > - spin_unlock_irqrestore(hba->host->host_lock, flags);
> > > > - }
> > > >
> > > > hba->clk_gating.is_enabled = value;
> > > > out:
> > > > + spin_unlock_irqrestore(hba->host->host_lock, flags);
> > > > return count;
> > > > }