public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed
From: Miquel Raynal <miquel.raynal@bootlin.com>
To: Sean Nyekjaer <sean@geanix.com>
Cc: s.hauer@pengutronix.de, richard@nod.at, linux-pm@vger.kernel.org,
	han.xu@nxp.com, linux-mtd@lists.infradead.org
Subject: Re: [BUG RESEND] unsuspend failure under load
Date: Mon, 23 Aug 2021 10:18:02 +0200	[thread overview]
Message-ID: <20210823101802.27823313@xps13> (raw)
In-Reply-To: <20210706093541.f4kjxqrkft7fqbcx@skn-laptop>

Hello,

I think we should involve Richard as well in the discussion.

Sean Nyekjaer <sean@geanix.com> wrote on Tue, 6 Jul 2021 11:35:41 +0200:

> On Mon, Jul 05, 2021 at 08:58:38AM +0200, Sean Nyekjaer wrote:
> > Hi Miquel and Sascha,  
> + Richard + linux-pm
> > 
> > We are having some trouble when our i.MX6 unsuspends while writes to
> > ubifs is in progess. In the log it looks like it syncing the filesystem
> > before suspend.
> > 
> > The SoC a i.MX6ul/ull, the issue is (lucky for us) quite easy to reproduce.
> > The reproduce script: [0]
> > Kernel log when it happens: [1]
> > 
> > I have bisected the bug to: ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
> > 
> > Any idea to where I should start looking? or to what happens?
> > 
> > Esben have posted to patches that relates to suspend/unsuspend but it
> > doesn't seem to releated to this issue.
> > 5bc6bb603b4d ("mtd: rawnand: gpmi: Fix suspend/resume problem")
> > d70486668cdf ("mtd: rawnand: gpmi: Restore nfc timing setup after suspend/resume")
> > 
> > /Sean  
> 
> nand_resume() is called some time after ubi_io_write tries to write. Thats why
> mtd_write() is returning -EBUSY.

Just to be sure:
- platform resumes
- your app started a write before being suspended
- the write gets refused because the suspended state has not been
  cleared yet
Am I understanding this issue correctly?

But I would expect "Filesystems sync" to actually let the lower layers
the time to flush the data to the storage devices, suspending without
waiting for this to happen looks strange to me.

Sascha, Richard, what's your input?

> I have tried patch [3], and it seems to fix it.
> I think it would be okay to add the retry option, but the mdelay is not
> obviously a nogo.
> 
> Any idea to how we could wait here for the nand_resume() to be called?
> 
> @linux-pm:
> I have noticed "Filsystems sync" happens before "Freezing user space
> processes".
> If I apply patch [4] (without [3]), it would also fix our issue. But I
> don't have en insight in to what impact the change might have.
> 
> /Sean
> 
> > 
> > [0]
> > #!/bin/sh
> > dd if=/dev/urandom of=/tmp/test50M bs=1M count=50
> > cp /tmp/test50M /data/ &
> > echo mem > /sys/power/state
> > 
> > [1]
> > root@iwg26-v2:/data/root# ./ubicrash.sh
> > 50+0 records in
> > 50+0 records out
> > PM: suspend entry (deep)
> > Filesystems sync: 33.642 seconds
> > Freezing user space processes ... (elapsed 0.004 seconds) done.
> > OOM killer disabled.
> > Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done.
> > printk: Suspending console(s) (use no_console_suspend to debug)
> > <SUSPEND/WAKE>
> > PM: suspend devices took 0.040 seconds
> > Disabling non-boot CPUs ...
> > ubi0 error: ubi_io_write: error -16 while writing 4096 bytes to PEB 544:53248, written 0 bytes
> > CPU: 0 PID: 69 Comm: kworker/u2:2 Not tainted 5.13.0 #3
> > Hardware name: Freescale i.MX6 Ultralite (Device Tree)
> > Workqueue: writeback wb_workfn (flush-ubifs_0_8)
> > [<c010d9b0>] (unwind_backtrace) from [<c010a28c>] (show_stack+0x10/0x14)
> > [<c010a28c>] (show_stack) from [<c0970798>] (dump_stack+0xc0/0xdc)
> > [<c0970798>] (dump_stack) from [<c05dfe10>] (ubi_io_write+0x510/0x6b0)
> > [<c05dfe10>] (ubi_io_write) from [<c05dcd90>] (ubi_eba_write_leb+0x388/0x910)
> > [<c05dcd90>] (ubi_eba_write_leb) from [<c05daf34>] (ubi_leb_write+0xd0/0xe8)
> > [<c05daf34>] (ubi_leb_write) from [<c03cfeb4>] (ubifs_leb_write+0x68/0x104)  
> 
> [ ... ]
> 
> > UBIFS error (ubi0:8 pid 157): make_reservation: cannot reserve 4144 bytes in jhead 2, error -30
> > UBIFS error (ubi0:8 pid 157): do_writepage: cannot write page 10962 of inode 821, error -30
> > UBIFS error (ubi0:8 pid 157): make_reservation: cannot reserve 4144 bytes in jhead 2, error -30
> > UBIFS error (ubi0:8 pid 157): do_writepage: cannot write page 10963 of inode 821, error -30
> > UBIFS error (ubi0:8 pid 157): make_reservation: cannot reserve 696 bytes in jhead 2, error -30
> > UBIFS error (ubi0:8 pid 157): do_writepage: cannot write page 0 of inode 819, error -30
> > UBIFS error (ubi0:8 pid 157): make_reservation: cannot reserve 4144 bytes in jhead 2, error -30  
> 
> [3]:
> diff --git a/drivers/mtd/ubi/io.c b/drivers/mtd/ubi/io.c
> index 14d890b00d2c..b24c571fa022 100644
> --- a/drivers/mtd/ubi/io.c
> +++ b/drivers/mtd/ubi/io.c
> @@ -268,8 +269,18 @@ int ubi_io_write(struct ubi_device *ubi, const void *buf, int pnum, int offset,
>  	}
> 
>  	addr = (loff_t)pnum * ubi->peb_size + offset;
> +retry:
>  	err = mtd_write(ubi->mtd, addr, len, &written, buf);
>  	if (err) {
> +		if (retries++ < UBI_IO_RETRIES) {
> +			ubi_warn(ubi, "error %d while writing %d bytes to PEB %d:%d, written %zd bytes",
> +				 err, len, pnum, offset, written);
> +			mdelay(25); yield();
> +			goto retry;
> +		}
> +
>  		ubi_err(ubi, "error %d while writing %d bytes to PEB %d:%d, written %zd bytes",
>  			err, len, pnum, offset, written);
>  		dump_stack();
> 
> [4]:
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 32391acc806b..61a213ad5a13 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -563,18 +563,18 @@ static int enter_state(suspend_state_t state)
>  	if (state == PM_SUSPEND_TO_IDLE)
>  		s2idle_begin();
>  
> -	if (sync_on_suspend_enabled) {
> -		trace_suspend_resume(TPS("sync_filesystems"), 0, true);
> -		ksys_sync_helper();
> -		trace_suspend_resume(TPS("sync_filesystems"), 0, false);
> -	}
> -
>  	pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]);
>  	pm_suspend_clear_flags();
>  	error = suspend_prepare(state);
>  	if (error)
>  		goto Unlock;
>  
> +	if (sync_on_suspend_enabled) {
> +		trace_suspend_resume(TPS("sync_filesystems"), 0, true);
> +		ksys_sync_helper();
> +		trace_suspend_resume(TPS("sync_filesystems"), 0, false);
> +	}
> +
>  	if (suspend_test(TEST_FREEZER))
>  		goto Finish;
>  

Thanks,
Miquèl

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

  parent reply	other threads:[~2021-08-23  8:18 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-05  6:58 [BUG RESEND] mtd: rawnand: gpmi: unsuspend failure under load Sean Nyekjaer
2021-07-06  9:35 ` [BUG RESEND] " Sean Nyekjaer
2021-08-17 17:29   ` Sean Nyekjaer
2021-08-23  8:18   ` Miquel Raynal [this message]
2021-08-24  7:53     ` Sean Nyekjaer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210823101802.27823313@xps13 \
    --to=miquel.raynal@bootlin.com \
    --cc=han.xu@nxp.com \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=richard@nod.at \
    --cc=s.hauer@pengutronix.de \
    --cc=sean@geanix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox