All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jean Rene Dawin <jdawin@math.uni-bielefeld.de>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: "H. Nikolaus Schaller" <hns@goldelico.com>,
	Huijin Park <huijin.park@samsung.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Discussions about the Letux Kernel 
	<letux-kernel@openphoenux.org>,
	Linux-OMAP <linux-omap@vger.kernel.org>,
	linux-mmc@vger.kernel.org, Tony Lindgren <tony@atomide.com>
Subject: Re: [BUG] mmc: core: adjust polling interval for CMD1
Date: Wed, 2 Mar 2022 09:20:34 +0100	[thread overview]
Message-ID: <20220302082034.GA5723@math.uni-bielefeld.de> (raw)
In-Reply-To: <CAPDyKFrz_2Vp64SUzB8CiHJLTjO8Hx8m3QEhY1VU2ksZhVEx7A@mail.gmail.com>

Ulf Hansson wrote on Tue  1/03/22 14:38:
> On Thu, 17 Feb 2022 at 21:12, H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >
> > Hi,
> > Jean Rene Dawin did report to me a problem on the Beagle Bone Black starting
> > with our disto kernel based on v5.17-rc1:
> >
> > >> since kernel 5.17-rc1 I noticed slower emmc performance on Beaglebone
> > >> Black, but didn't check the logs.
> > >> When I tried to run 5.17.0-rc3-letux+ it booted fine, but during IO
> > >> traffic there were messages like
> > >>
> > >> [  662.529584] mmc1: error -110 doing runtime resume
> > >> [  669.293590] mmc1: Card stuck being busy! __mmc_poll_for_busy
> > >>
> > >> [  739.076072] mmc1: Timeout waiting for hardware interrupt.
> > >> [  739.145676] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
> > >> [  739.231053] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
> > >> [  739.316472] mmc1: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000400
> > >> [  739.401937] mmc1: sdhci: Argument:  0x00342d30 | Trn mode: 0x00000023
> > >> [  739.487439] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
> > >> [  739.573007] mmc1: sdhci: Power:     0x0000000f | Blk gap:  0x00000000
> > >> [  739.658609] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00003c07
> > >> [  739.744224] mmc1: sdhci: Timeout:   0x00000007 | Int stat: 0x00000002
> > >> [  739.829896] mmc1: sdhci: Int enab:  0x027f000b | Sig enab: 0x027f000b
> > >> [  739.915623] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
> > >> [  740.001394] mmc1: sdhci: Caps:      0x07e10080 | Caps_1:   0x00000000
> > >> [  740.087208] mmc1: sdhci: Cmd:       0x0000193a | Max curr: 0x00000000
> > >> [  740.173051] mmc1: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x00000000
> > >> [  740.258928] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
> > >> [  740.344854] mmc1: sdhci: Host ctl2: 0x00000000
> > >> [  740.402796] mmc1: sdhci: ============================================
> > >>
> > >> and finally IO errors and a corrupted filesystem.
> > >>
> > >> 5.17.0-rc4-letux+ shows the same behaviour.
> >
> > I checked with my BeagleBoard Black (am3358) and can confirm this observation.
> > It happens only with the integrated eMMC but not with the µSD connected to
> > the other mmc interface.
> >
> > A git bisect found:
> >
> > 76bfc7ccc2fa9d382576f6013b57a0ef93d5a722 is the first bad commit
> > commit 76bfc7ccc2fa9d382576f6013b57a0ef93d5a722
> > Author: Huijin Park <huijin.park@samsung.com>
> > Date:   Thu Nov 4 15:32:31 2021 +0900
> >
> >   mmc: core: adjust polling interval for CMD1
> >
> >   In mmc_send_op_cond(), loops are continuously performed at the same
> >   interval of 10 ms.  However the behaviour is not good for some eMMC
> >   which can be out from a busy state earlier than 10 ms if normal.
> >
> >   Rather than fixing about the interval time in mmc_send_op_cond(),
> >   let's instead convert into using the common __mmc_poll_for_busy().
> >
> >   The reason for adjusting the interval time is that it is important
> >   to reduce the eMMC initialization time, especially in devices that
> >   use eMMC as rootfs.
> >
> >   Test log(eMMC:KLM8G1GETF-B041):
> >
> >   before: 12 ms (0.311016 - 0.298729)
> >   [    0.295823] mmc0: starting CMD0 arg 00000000 flags 000000c0
> >   [    0.298729] mmc0: starting CMD1 arg 40000080 flags 000000e1<-start
> >   [    0.311016] mmc0: starting CMD1 arg 40000080 flags 000000e1<-finish
> >   [    0.311336] mmc0: starting CMD2 arg 00000000 flags 00000007
> >
> >   after: 2 ms (0.301270 - 0.298762)
> >   [    0.295862] mmc0: starting CMD0 arg 00000000 flags 000000c0
> >   [    0.298762] mmc0: starting CMD1 arg 40000080 flags 000000e1<-start
> >   [    0.299067] mmc0: starting CMD1 arg 40000080 flags 000000e1
> >   [    0.299441] mmc0: starting CMD1 arg 40000080 flags 000000e1
> >   [    0.299879] mmc0: starting CMD1 arg 40000080 flags 000000e1
> >   [    0.300446] mmc0: starting CMD1 arg 40000080 flags 000000e1
> >   [    0.301270] mmc0: starting CMD1 arg 40000080 flags 000000e1<-finish
> >   [    0.301572] mmc0: starting CMD2 arg 00000000 flags 00000007
> >
> >   Signed-off-by: Huijin Park <huijin.park@samsung.com>
> >   Link: https://lore.kernel.org/r/20211104063231.2115-3-huijin.park@samsung.com
> >   Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> >
> > Reverting this makes v5.17-rc[1-4] work.
> >
> > Any suggestions or fixes?
> >
> > BR and thanks,
> > Nikolaus Schaller
> >
> > Reported-by: jdawin@math.uni-bielefeld.de
> >
> 
> Thanks for reporting and bisecting!
> 
> What changed with the offending commit is two things:
> 
> 1) We are sending the CMD1 more frequently, initially in the loop in
> the __mmc_poll_for_busy. Step by step, we increase the polling period.
> 2) We may end up using a slightly shorter total timeout for polling
> time, compared to what we used before the offending commit.
> 
> Hopefully the problem is related to 2), in which case I think the
> below patch should help. Can you please give it a try?
> 
> Kind regards
> Uffe
> 
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Date: Tue, 1 Mar 2022 14:24:21 +0100
> Subject: [PATCH] mmc: core: Extend timeout to 2s for MMC_SEND_OP_COND
> 
> It looks like the timeout for the MMC_SEND_OP_COND (CMD1) might have become
> a bit too small due to recent changes. Therefore, let's extend it to 2s,
> which is probably more inline with its previous value, to fix the reported
> timeout problems.
> 
> While at it, let's add a define for the timeout value, rather than using
> a hard-coded value for it.
> 
> Reported-by: Jean Rene Dawin <jdawin@math.uni-bielefeld.de>
> Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
> Cc: Huijin Park <huijin.park@samsung.com>
> Fixes: 76bfc7ccc2fa ("mmc: core: adjust polling interval for CMD1")
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/mmc/core/mmc_ops.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> index d63d1c735335..1f57174b3cf3 100644
> --- a/drivers/mmc/core/mmc_ops.c
> +++ b/drivers/mmc/core/mmc_ops.c
> @@ -21,6 +21,7 @@
> 
>  #define MMC_BKOPS_TIMEOUT_MS           (120 * 1000) /* 120s */
>  #define MMC_SANITIZE_TIMEOUT_MS                (240 * 1000) /* 240s */
> +#define MMC_OP_COND_TIMEOUT_MS         2000 /* 2s */
> 
>  static const u8 tuning_blk_pattern_4bit[] = {
>         0xff, 0x0f, 0xff, 0x00, 0xff, 0xcc, 0xc3, 0xcc,
> @@ -232,7 +233,8 @@ int mmc_send_op_cond(struct mmc_host *host, u32
> ocr, u32 *rocr)
>         cmd.arg = mmc_host_is_spi(host) ? 0 : ocr;
>         cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R3 | MMC_CMD_BCR;
> 
> -       err = __mmc_poll_for_busy(host, 1000, &__mmc_send_op_cond_cb, &cb_data);
> +       err = __mmc_poll_for_busy(host, MMC_OP_COND_TIMEOUT_MS,
> +                                 &__mmc_send_op_cond_cb, &cb_data);
>         if (err)
>                 return err;
> 
> -- 
> 2.25.1

Hi,

thanks. But testing with this patch still gives the same errors:

[   52.259940] mmc1: Card stuck being busy! __mmc_poll_for_busy
[   52.273380] mmc1: error -110 doing runtime resume

and the system gets stuck eventually.

Regards,
Jean Rene Dawin

  reply	other threads:[~2022-03-02  8:29 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-17 20:11 [BUG] mmc: core: adjust polling interval for CMD1 H. Nikolaus Schaller
2022-03-01 13:38 ` Ulf Hansson
2022-03-02  8:20   ` Jean Rene Dawin [this message]
2022-03-02  9:40     ` H. Nikolaus Schaller
2022-03-03 10:40       ` Ulf Hansson
2022-03-04  9:28         ` Jean Rene Dawin
2022-03-04 10:08           ` Ulf Hansson
     [not found] <CGME20220303121624epcas1p3781f58f2f04b0a26c35311aa0f4defcf@epcas1p3.samsung.com>
2022-03-03 12:16 ` Huijin Park
2022-03-04  9:42   ` Jean Rene Dawin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220302082034.GA5723@math.uni-bielefeld.de \
    --to=jdawin@math.uni-bielefeld.de \
    --cc=hns@goldelico.com \
    --cc=huijin.park@samsung.com \
    --cc=letux-kernel@openphoenux.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mmc@vger.kernel.org \
    --cc=linux-omap@vger.kernel.org \
    --cc=tony@atomide.com \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.