Trouble with new marvell_nand driver on PXA3xx

linux-mtd.lists.infradead.org archive mirror

* Trouble with new marvell_nand driver on PXA3xx
@ 2018-09-24  6:45 Daniel Mack
  2018-09-24  7:20 ` Miquel Raynal
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Daniel Mack @ 2018-09-24  6:45 UTC (permalink / raw)
  To: Miquel Raynal, Boris Brezillon, linux-mtd
  Cc: linux-mtd@lists.infradead.org, Chris Packham

Hi Miquel,

I'm having issues using the new marvell_nand driver on a PXA3xx based 
platform. My test does a ubiformat on the chip, then creates a volume, 
mounts it and runs bonnie++ on the file system. After some time (usually 
less than half a minute), the driver spits out a warning like the one 
below, and eventually the UBI layer bails out, which leads to a r/o 
remount and (possibly) file system corruptions.

FWIW, this is the test script I'm using:

 > #!/bin/sh
 >
 > UBIDEV=0
 > UBIMTD=3
 >
 > umount /mnt
 > ubidetach /dev/ubi_ctrl -d $UBIDEV
 > ubiformat -y /dev/mtd$UBIMTD
 > ubiattach /dev/ubi_ctrl -d $UBIDEV -m $UBIMTD
 > ubimkvol /dev/ubi$UBIDEV -N test -m
 > mount -t ubifs ubi0:test /mnt
 > bonnie++ -d /mnt -u 0:0


The legacy pxa3xx_nand driver didn't have this issue, but my system was 
also running a much older kernel with that. I'm currently still 
struggling to resurrect the old code, but I'm running into "Wait time 
out!!!" conditions immediately right now. Not sure what's going on.

Interestingly, I can't seem to reproduce the bug with any of the mtd 
kernel tests, I've tried all of them, several times, and all succeed. So 
a file system test that includes the UBI/UBIFS layers seems to trigger 
different things in the driver than the tests that operate on the 
mtd device directly.

I've also tried this with and without the keep-config DT property, but 
that didn't change anything.

Could you try my script on some other device that runs the new driver 
and see if you can reproduce? If bonnie++ is unavailable, extracting a 
bigger tarball a couple of times will also trigger the bug at some point.

Meanwhile, I can start poking around in the driver. I'd be grateful for 
a hint on where to start.


Thanks,
Daniel


/ # bonnie++ -d /mnt -u 0:0
Using uid:0, gid:0.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...
[ 1290.458167] marvell-nfc 43100000.nand-controller: Timeout waiting for RB signal
[ 1290.465532] ubi0 error: ubi_io_write: error -110 while writing 2048 bytes to PEB 905:67584, written 0 bytes
[ 1290.475325] CPU: 0 PID: 1340 Comm: bonnie++ Not tainted 4.19.0-rc4+ #438
[ 1290.482055] Hardware name: Marvell PXA3xx (Device Tree Support)
[ 1290.487934] Backtrace:
[ 1290.490488] [<c0106120>] (dump_backtrace) from [<c01063dc>] (show_stack+0x18/0x1c)
[ 1290.498092]  r6:00000389 r5:00000000 r4:07130800 r3:2ebc2099
[ 1290.503745] [<c01063c4>] (show_stack) from [<c0686a64>] (dump_stack+0x20/0x28)
[ 1290.511013] [<c0686a44>] (dump_stack) from [<c045cf08>] (ubi_io_write+0x418/0x6bc)
[ 1290.518637] [<c045caf0>] (ubi_io_write) from [<c0459f80>] (ubi_eba_write_leb+0xc0/0x6f8)
[ 1290.526692]  r10:00000000 r9:c6769600 r8:00000104 r7:00000104 r6:c686ec00 r5:c602d000
[ 1290.534546]  r4:00000000
[ 1290.537088] [<c0459ec0>] (ubi_eba_write_leb) from [<c04587f8>] (ubi_leb_write+0xc4/0xdc)
[ 1290.545209]  r10:00000000 r9:c6769600 r8:00000800 r7:00000104 r6:00000080 r5:c6754000
[ 1290.553061]  r4:000007ff
[ 1290.555606] [<c0458734>] (ubi_leb_write) from [<c02d9420>] (ubifs_leb_write+0x88/0xf8)
[ 1290.563552]  r6:0000f800 r5:c6029000 r4:c6754000
[ 1290.568235] [<c02d9398>] (ubifs_leb_write) from [<c02da5b0>] (ubifs_wbuf_write_nolock+0x328/0x704)
[ 1290.577141]  r8:00000188 r7:c6beddc0 r6:00000188 r5:c6029000 r4:c67b5480
[ 1290.583909] [<c02da288>] (ubifs_wbuf_write_nolock) from [<c02cd260>] (write_head.constprop.1+0x3c/0x5c)
[ 1290.593333]  r10:00000000 r9:c6769648 r8:c6ce4000 r7:c6beddc0 r6:c38809f0 r5:c6769600
[ 1290.601177]  r4:c67b5480
[ 1290.603720] [<c02cd224>] (write_head.constprop.1) from [<c02cd5dc>] (ubifs_jnl_update+0x35c/0x5ec)
[ 1290.612690]  r4:c6029000 r3:c6bedd48
[ 1290.616261] [<c02cd280>] (ubifs_jnl_update) from [<c02d3ce0>] (ubifs_create+0x134/0x1ec)
[ 1290.624384]  r10:c6ce4180 r9:c6ce4168 r8:c38767f8 r7:c6029000 r6:c38809f0 r5:00000000
[ 1290.632238]  r4:c6ce4000
[ 1290.634785] [<c02d3bac>] (ubifs_create) from [<c01d0ba0>] (path_openat+0x770/0xe3c)
[ 1290.642475]  r10:c6ce4000 r9:c38767f8 r8:00000241 r7:c38767f8 r6:00000000 r5:c666c280
[ 1290.650321]  r4:c6bede98
[ 1290.652858] [<c01d0430>] (path_openat) from [<c01d12b8>] (do_filp_open+0x4c/0xb0)
[ 1290.660376]  r10:00020000 r9:c6bec000 r8:c6068000 r7:00000001 r6:c6bedf50 r5:c0a03008
[ 1290.668221]  r4:00000004
[ 1290.670761] [<c01d126c>] (do_filp_open) from [<c01befac>] (do_sys_open+0x124/0x1e0)
[ 1290.678447]  r7:00000241 r6:ffffff9c r5:c0a03008 r4:00000004
[ 1290.684086] [<c01bee88>] (do_sys_open) from [<c01bf0dc>] (sys_creat+0x28/0x30)
[ 1290.691351]  r10:00020000 r9:c6bec000 r8:c01011e4 r7:00000008 r6:00000000 r5:0000f6c4
[ 1290.699204]  r4:b6c1f18c
[ 1290.701743] [<c01bf0b4>] (sys_creat) from [<c0101000>] (ret_fast_syscall+0x0/0x50)
[ 1290.709338] Exception stack(0xc6bedfa8 to 0xc6bedff0)
[ 1290.714367] dfa0:                   b6c1f18c 0000f6c4 b6c1f18c 00000180 00000064 00000000
[ 1290.722566] dfc0: b6c1f18c 0000f6c4 00000000 00000008 beb2d8bc 0002da58 00000000 0002c170
[ 1290.730768] dfe0: 00028010 beb2d850 00013f3c b6d20648
[ 1290.736257] ubi0: dumping 2048 bytes of data from PEB 905, offset 67584
...


* Re: Trouble with new marvell_nand driver on PXA3xx
  2018-09-24  6:45 Trouble with new marvell_nand driver on PXA3xx Daniel Mack
@ 2018-09-24  7:20 ` Miquel Raynal
  2018-09-24  8:14   ` Daniel Mack
  2018-09-24 19:57   ` Daniel Mack
  2018-09-24  8:09 ` Boris Brezillon
  2018-09-26 21:19 ` Daniel Mack
  2 siblings, 2 replies; 8+ messages in thread
From: Miquel Raynal @ 2018-09-24  7:20 UTC (permalink / raw)
  To: Daniel Mack; +Cc: Boris Brezillon, linux-mtd, Chris Packham

Hi Daniel,

Daniel Mack <daniel@zonque.org> wrote on Mon, 24 Sep 2018 08:45:44
+0200:

> Hi Miquel,
> 
> I'm having issues using the new marvell_nand driver on a PXA3xx based platform. My test does a ubiformat on the chip, then creates a volume, mounts it and runs bonnie++ on the file system. After some time (usually less than half a minute), the driver spits out a warning like the one below, and eventually the UBI layer bails out, which leads to a r/o remount and (possibly) file system corruptions.
> 
> FWIW, this is the test script I'm using:
> 
>  > #!/bin/sh
>  >
>  > UBIDEV=0
>  > UBIMTD=3
>  >
>  > umount /mnt
>  > ubidetach /dev/ubi_ctrl -d $UBIDEV
>  > ubiformat -y /dev/mtd$UBIMTD
>  > ubiattach /dev/ubi_ctrl -d $UBIDEV -m $UBIMTD
>  > ubimkvol /dev/ubi$UBIDEV -N test -m
>  > mount -t ubifs ubi0:test /mnt
>  > bonnie++ -d /mnt -u 0:0
> 
> 
> The legacy pxa3xx_nand driver didn't have this issue, but my system was also running a much older kernel with that. I'm currently still struggling to resurrect the old code, but I'm running into "Wait time out!!!" conditions immediately right now. Not sure what's going on.
> 
> Interestingly, I can't seem to reproduce the bug with any of the mtd kernel tests, I've tried all of them, several times, and all succeed. So a file system test that includes the UBI/UBIFS layers seems to trigger different things in the driver than the tests that operate on the mtd device directly.
> 
> I've also tried this with and without the keep-config DT property, but that didn't change anything.
> 
> Could you try my script on some other device that runs the new driver and see if you can reproduce? If bonnie++ is unavailable, extracting a bigger tarball a couple of times will also trigger the bug at some point.
> 
> Meanwhile, I can start poking around in the driver. I'd be grateful for a hint on where to start.
> 

Interesting, thanks for the feedback.

Right now I have no idea of what happens, but you might want to add a
dump_stack() at the "Timeout waiting for RB signal" error to see what
path in the driver failed.

You might also try with and without DMA?

I'll try to take the time this week to check if it is pxa-related by
testing this on an armada board.

Thanks,
Miquèl


* Re: Trouble with new marvell_nand driver on PXA3xx
  2018-09-24  6:45 Trouble with new marvell_nand driver on PXA3xx Daniel Mack
  2018-09-24  7:20 ` Miquel Raynal
@ 2018-09-24  8:09 ` Boris Brezillon
  2018-09-24  8:30   ` Daniel Mack
  2018-09-26 21:19 ` Daniel Mack
  2 siblings, 1 reply; 8+ messages in thread
From: Boris Brezillon @ 2018-09-24  8:09 UTC (permalink / raw)
  To: Daniel Mack; +Cc: Miquel Raynal, linux-mtd, Chris Packham

Hi Daniel,

On Mon, 24 Sep 2018 08:45:44 +0200
Daniel Mack <daniel@zonque.org> wrote:

> Hi Miquel,
> 
> I'm having issues using the new marvell_nand driver on a PXA3xx based 
> platform. My test does a ubiformat on the chip, then creates a volume, 
> mounts it and runs bonnie++ on the file system. After some time (usually 
> less than half a minute), the driver spits out a warning like the one 
> below, and eventually the UBI layer bails out, which leads to a r/o 
> remount and (possibly) file system corruptions.
> 
> FWIW, this is the test script I'm using:
> 
>  > #!/bin/sh
>  >
>  > UBIDEV=0
>  > UBIMTD=3
>  >
>  > umount /mnt
>  > ubidetach /dev/ubi_ctrl -d $UBIDEV
>  > ubiformat -y /dev/mtd$UBIMTD
>  > ubiattach /dev/ubi_ctrl -d $UBIDEV -m $UBIMTD
>  > ubimkvol /dev/ubi$UBIDEV -N test -m
>  > mount -t ubifs ubi0:test /mnt
>  > bonnie++ -d /mnt -u 0:0
> 
> 
> The legacy pxa3xx_nand driver didn't have this issue, but my system was 
> also running a much older kernel with that. I'm currently still 
> struggling to resurrect the old code, but I'm running into "Wait time 
> out!!!" conditions immediately right now. Not sure what's going on.

Hm, so that means the old driver has pretty much the same issue.

> 
> Interestingly, I can't seem to reproduce the bug with any of the mtd 
> kernel tests, I've tried all of them, several times, and all succeed. So 
> a file system test that includes the UBI/UBIFS layers seems to trigger 
> different things in the driver than the tests that operate on the 
> mtd device directly.

Looking at the backtrace, it seems to fail on a high PEB num. Are you
interfacing with a dual-die chip? Can you share the part number of your
chip?

> 
> I've also tried this with and without the keep-config DT property, but 
> that didn't change anything.
> 
> Could you try my script on some other device that runs the new driver 
> and see if you can reproduce? If bonnie++ is unavailable, extracting a 
> bigger tarball a couple of times will also trigger the bug at some point.
> 
> Meanwhile, I can start poking around in the driver. I'd be grateful for 
> a hint on where to start.

You can try to run the mtd tests on eraseblock 905, just to check if
they pass or not. Also, when you run the ubi/ubifs/bonnie++ tests, does
it always fail on the same PEB?

Regards,

Boris


* Re: Trouble with new marvell_nand driver on PXA3xx
  2018-09-24  7:20 ` Miquel Raynal
@ 2018-09-24  8:14   ` Daniel Mack
  2018-09-24 19:57   ` Daniel Mack
  1 sibling, 0 replies; 8+ messages in thread
From: Daniel Mack @ 2018-09-24  8:14 UTC (permalink / raw)
  To: Miquel Raynal; +Cc: Boris Brezillon, linux-mtd, Chris Packham

Hi Miquel,

On 24/9/2018 9:20 AM, Miquel Raynal wrote:
> Daniel Mack <daniel@zonque.org> wrote on Mon, 24 Sep 2018 08:45:44 
> +0200:
>> I'm having issues using the new marvell_nand driver on a PXA3xx
>> based platform. My test does a ubiformat on the chip, then creates
>> a volume, mounts it and runs bonnie++ on the file system. After
>> some time (usually less than half a minute), the driver spits out a
>> warning like the one below, and eventually the UBI layer bails out,
>> which leads to a r/o remount and (possibly) file system
>> corruptions.
>> 

[...]

> Interesting, thanks for the feedback.
> 
> Right now I have no idea of what happens, but you might want to add
> a dump_stack() at the "Timeout waiting for RB signal" error to see
> what path in the driver failed.

Sure, will do that tonight.

> You might also try with and without DMA?

I did that, and disabling DMA didn't help.

> I'll try to take the time this week to check if it is pxa-related by 
> testing this on an armada board.

That'd help, yes.


Thanks,
Daniel


* Re: Trouble with new marvell_nand driver on PXA3xx
  2018-09-24  8:09 ` Boris Brezillon
@ 2018-09-24  8:30   ` Daniel Mack
  2018-09-24  9:04     ` Boris Brezillon
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Mack @ 2018-09-24  8:30 UTC (permalink / raw)
  To: Boris Brezillon; +Cc: Miquel Raynal, linux-mtd, Chris Packham

Hi Boris,

On 24/9/2018 10:09 AM, Boris Brezillon wrote:
> On Mon, 24 Sep 2018 08:45:44 +0200
> Daniel Mack <daniel@zonque.org> wrote:

[...]

>> The legacy pxa3xx_nand driver didn't have this issue, but my system was
>> also running a much older kernel with that. I'm currently still
>> struggling to resurrect the old code, but I'm running into "Wait time
>> out!!!" conditions immediately right now. Not sure what's going on.
> 
> Hm, so that means the old driver has pretty much the same issue.

I thought so too, but unlike the new driver, the old one bails out 
pretty much immediately, and also when using the in-kernel mtd tests, so 
the behavior seems different. Not sure if that's really the same effect.

>> Interestingly, I can't seem to reproduce the bug with any of the mtd
>> kernel tests, I've tried all of them, several times, and all succeed. So
>> a file system test that includes the UBI/UBIFS layers seems to trigger
>> different things in the driver than the tests that operate on the
>> mtd device directly.
> 
> Looking at the backtrace, it seems to fail on a high PEB num. Are you
> interfacing with a dual-die chip? Can you share the part number of your
> chip?

The chip is a NAND01GR3B2BZA which is identified like this during probe:

[    3.980817] nand: device found, Manufacturer ID: 0x20, Chip ID: 0xa1
[    3.988736] nand: ST Micro NAND 128MiB 1,8V 8-bit
[    3.994644] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[    4.003021] marvell-nfc 43100000.nand-controller: No minimum ECC strength, using 1b/512B
[    4.011219] Scanning device for bad blocks
[    4.128042] Bad eraseblock 399 at 0x0000031e0000
[    4.168843] Bad eraseblock 528 at 0x000004200000
[    4.174076] Bad eraseblock 529 at 0x000004220000
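
As a sanity check on the geometry in this log: the bad-block addresses are consistent with a 128 KiB erase size, since eraseblock N starts at N * 131072 bytes. Below is a minimal shell sketch of that arithmetic. The helper names are made up for illustration and are not part of mtd-utils. Keep in mind that the eraseblock numbers from the bad-block scan count from the start of the chip, whereas the PEB numbers UBI reports count from the start of the MTD device it is attached to, so the two are not directly comparable when UBI sits on a partition.

```shell
#!/bin/sh
# Illustrative helpers only: these names are hypothetical, not part of mtd-utils.

ERASE_SIZE=$((128 * 1024))   # "erase size: 128 KiB" from the probe log
PAGE_SIZE=2048               # "page size: 2048" from the probe log

# Byte address of the start of eraseblock $1, formatted like the
# "Bad eraseblock N at 0x..." lines in the kernel log.
eb_to_addr() {
    printf '0x%012x\n' $(( $1 * ERASE_SIZE ))
}

# Page index inside an eraseblock for byte offset $1 within that block.
offset_to_page() {
    echo $(( $1 / PAGE_SIZE ))
}

eb_to_addr 399         # 0x0000031e0000, matches the log
eb_to_addr 528         # 0x000004200000, matches the log
offset_to_page 67584   # 33, the page within the PEB for "PEB 905:67584"
```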

Note that the hardware design is now almost 10 years old.

> You can try to run the mtd tests on eraseblock 905, just to check if
> they pass or not.

Will do when I'm back on that board, but just for my better 
understanding: aren't the tests running on all eraseblocks anyway? Would 
it make a difference to just let it run on a specific one?

> Also, when you run the ubi/ubifs/bonnie++ tests, does
> it always fail on the same PEB?

Nope. My backlog shows the issue for PEB 465, 572, 569, 586, 612, 729, 
905, 962 etc.


Thanks for your help,
Daniel


* Re: Trouble with new marvell_nand driver on PXA3xx
  2018-09-24  8:30   ` Daniel Mack
@ 2018-09-24  9:04     ` Boris Brezillon
  0 siblings, 0 replies; 8+ messages in thread
From: Boris Brezillon @ 2018-09-24  9:04 UTC (permalink / raw)
  To: Daniel Mack; +Cc: Miquel Raynal, linux-mtd, Chris Packham

On Mon, 24 Sep 2018 10:30:47 +0200
Daniel Mack <daniel@zonque.org> wrote:

> >> Interestingly, I can't seem to reproduce the bug with any of the mtd
> >> kernel tests, I've tried all of them, several times, and all succeed. So
> >> a file system test that includes the UBI/UBIFS layers seems to trigger
> >> different things in the driver than the tests that operate on the
> >> mtd device directly.  
> > 
> > Looking at the backtrace, it seems to fail on a high PEB num. Are you
> > interfacing with a dual-die chip? Can you share the part number of your
> > chip?  
> 
> The chip is a NAND01GR3B2BZA which is identified like this during probe:

Okay, so it's definitely not caused by a missing R/B pin on a multi-die
chip (this chip only has one die).


* Re: Trouble with new marvell_nand driver on PXA3xx
  2018-09-24  7:20 ` Miquel Raynal
  2018-09-24  8:14   ` Daniel Mack
@ 2018-09-24 19:57   ` Daniel Mack
  1 sibling, 0 replies; 8+ messages in thread
From: Daniel Mack @ 2018-09-24 19:57 UTC (permalink / raw)
  To: Miquel Raynal; +Cc: Boris Brezillon, linux-mtd, Chris Packham

Hi Miquel,

On 24/9/2018 9:20 AM, Miquel Raynal wrote:
> Right now I have no idea of what happens, but you might want to add a
> dump_stack() at the "Timeout waiting for RB signal" error to see what
> path in the driver failed.

I've triggered it several times now, and all traces show 
marvell_nfc_hw_ecc_hmg_write_page() as a stack parent.

So it seems to affect the writing routines only.


Any idea yet?

Thanks,
Daniel


> [  365.951351] WARNING: CPU: 0 PID: 1305 at drivers/mtd/nand/raw/marvell_nand.c:629 marvell_nfc_wait_op+0x88/0xb8
> [  365.961375] Modules linked in: pxamci
> [  365.965047] CPU: 0 PID: 1305 Comm: bonnie++ Not tainted 4.19.0-rc5+ #445
> [  365.971768] Hardware name: Marvell PXA3xx (Device Tree Support)
> [  365.977712] Backtrace: 
> [  365.980192] [<c0106120>] (dump_backtrace) from [<c01063dc>] (show_stack+0x18/0x1c)
> [  365.987789]  r6:00000000 r5:c07eb850 r4:00000000 r3:c6923b63
> [  365.993437] [<c01063c4>] (show_stack) from [<c0686c04>] (dump_stack+0x20/0x28)
> [  366.000716] [<c0686be4>] (dump_stack) from [<c0112054>] (__warn+0xe0/0x10c)
> [  366.007724] [<c0111f74>] (__warn) from [<c011219c>] (warn_slowpath_null+0x44/0x50)
> [  366.015249]  r9:89705f41 r8:36b4a597 r7:00000042 r6:c07eb850 r5:00000275 r4:c0450a48
> [  366.023033] [<c0112158>] (warn_slowpath_null) from [<c0450a48>] (marvell_nfc_wait_op+0x88/0xb8)
> [  366.031754]  r6:c64a8c90 r5:c64a8c70 r4:00000000
> [  366.036362] [<c04509c0>] (marvell_nfc_wait_op) from [<c0450e78>] (marvell_nfc_hw_ecc_hmg_do_write_page+0x19c/0x1cc)
> [  366.046801]  r7:c0a03008 r6:c64a8c70 r5:00000028 r4:c6610010
> [  366.052438] [<c0450cdc>] (marvell_nfc_hw_ecc_hmg_do_write_page) from [<c0450f38>] (marvell_nfc_hw_ecc_hmg_write_page+0x3c/0x54)
> [  366.063912]  r10:c674a000 r9:00000800 r8:00000000 r7:c674a000 r6:c68bbb4c r5:c674a000
> [  366.071753]  r4:c6610010
> [  366.074304] [<c0450efc>] (marvell_nfc_hw_ecc_hmg_write_page) from [<c0444d7c>] (nand_do_write_ops+0x3a0/0x3ec)
> [  366.084304]  r5:00000800 r4:c6610010
> [  366.087948] [<c04449dc>] (nand_do_write_ops) from [<c04467ac>] (nand_write_oob+0x68/0x84)
> [  366.096080]  r10:c0a03008 r9:c674a000 r8:c68bbbc4 r7:00000000 r6:05d1e000 r5:c6610010
> [  366.104243]  r4:c68bbb4c
> [  366.106896] [<c0446744>] (nand_write_oob) from [<c043908c>] (part_write_oob+0x38/0x40)
> [  366.114770]  r7:00000000 r6:05bfe000 r5:00000000 r4:00120000
> [  366.120492] [<c0439054>] (part_write_oob) from [<c0435a70>] (mtd_write+0xdc/0x12c)
> [  366.128087]  r5:c661dc00 r4:00000800
> [  366.131670] [<c0435994>] (mtd_write) from [<c045d074>] (ubi_io_write+0x3e8/0x6bc)
> [  366.139178]  r10:0001e000 r9:00000000 r8:0001e000 r7:00000000 r6:000002df r5:00000000
> [  366.147025]  r4:05bfe000
> [  366.149561] [<c045cc8c>] (ubi_io_write) from [<c045a11c>] (ubi_eba_write_leb+0xc0/0x6f8)
> [  366.157672]  r10:00000000 r9:c68c0600 r8:0000002e r7:0000002e r6:c6755c00 r5:c664f000
> [  366.165444]  r4:00000000
> [  366.168042] [<c045a05c>] (ubi_eba_write_leb) from [<c0458994>] (ubi_leb_write+0xc4/0xdc)
> [  366.176090]  r10:00000000 r9:c68c0600 r8:00000800 r7:0000002e r6:00000080 r5:c674a000
> [  366.183928]  r4:000007ff
> [  366.186463] [<c04588d0>] (ubi_leb_write) from [<c02d9420>] (ubifs_leb_write+0x88/0xf8)
> [  366.194394]  r6:0001d000 r5:c664a000 r4:c674a000
> [  366.199061] [<c02d9398>] (ubifs_leb_write) from [<c02da5b0>] (ubifs_wbuf_write_nolock+0x328/0x704)
> [  366.208035]  r8:00000190 r7:c68bbdc0 r6:00000190 r5:c664a000 r4:c674fa80
> [  366.214730] [<c02da288>] (ubifs_wbuf_write_nolock) from [<c02cd260>] (write_head.constprop.1+0x3c/0x5c)
> [  366.224132]  r10:00000000 r9:c68c0650 r8:c18d8848 r7:c68bbdc0 r6:c19826a0 r5:c68c0600
> [  366.231973]  r4:c674fa80
> [  366.234508] [<c02cd224>] (write_head.constprop.1) from [<c02cd5dc>] (ubifs_jnl_update+0x35c/0x5ec)
> [  366.243470]  r4:c664a000 r3:c68bbd48
> [  366.247109] [<c02cd280>] (ubifs_jnl_update) from [<c02d3ce0>] (ubifs_create+0x134/0x1ec)
> [  366.255154]  r10:c18d89c8 r9:c18d89b0 r8:c19733b8 r7:c664a000 r6:c19826a0 r5:00000000
> [  366.262989]  r4:c18d8848
> [  366.265529] [<c02d3bac>] (ubifs_create) from [<c01d0ba0>] (path_openat+0x770/0xe3c)
> [  366.273214]  r10:c18d8848 r9:c19733b8 r8:00000241 r7:c19733b8 r6:00000000 r5:c666a640
> [  366.281054]  r4:c68bbe98
> [  366.283586] [<c01d0430>] (path_openat) from [<c01d12b8>] (do_filp_open+0x4c/0xb0)
> [  366.291098]  r10:00020000 r9:c68ba000 r8:c6068000 r7:00000001 r6:c68bbf50 r5:c0a03008
> [  366.298939]  r4:00000004
> [  366.301472] [<c01d126c>] (do_filp_open) from [<c01befac>] (do_sys_open+0x124/0x1e0)
> [  366.309150]  r7:00000241 r6:ffffff9c r5:c0a03008 r4:00000004
> [  366.314778] [<c01bee88>] (do_sys_open) from [<c01bf0dc>] (sys_creat+0x28/0x30)
> [  366.322026]  r10:00020000 r9:c68ba000 r8:c01011e4 r7:00000008 r6:00000000 r5:00009f74
> [  366.329869]  r4:00052c48
> [  366.332400] [<c01bf0b4>] (sys_creat) from [<c0101000>] (ret_fast_syscall+0x0/0x50)
> [  366.339980] Exception stack(0xc68bbfa8 to 0xc68bbff0)
> [  366.345009] bfa0:                   00052c48 00009f74 00052c48 00000180 00000064 00000000
> [  366.353205] bfc0: 00052c48 00009f74 00000000 00000008 beba98bc 0002da58 00000000 0002c170
> [  366.361396] bfe0: 00028010 beba9850 00013f3c b6cfe648
> [  366.366421] ---[ end trace c36e65bc21373d32 ]---


* Re: Trouble with new marvell_nand driver on PXA3xx
  2018-09-24  6:45 Trouble with new marvell_nand driver on PXA3xx Daniel Mack
  2018-09-24  7:20 ` Miquel Raynal
  2018-09-24  8:09 ` Boris Brezillon
@ 2018-09-26 21:19 ` Daniel Mack
  2 siblings, 0 replies; 8+ messages in thread
From: Daniel Mack @ 2018-09-26 21:19 UTC (permalink / raw)
  To: Miquel Raynal, Boris Brezillon, linux-mtd; +Cc: Chris Packham

On 24/9/2018 8:45 AM, Daniel Mack wrote:
> I'm having issues using the new marvell_nand driver on a PXA3xx based
> platform. My test does a ubiformat on the chip, then creates a volume,
> mounts it and runs bonnie++ on the file system. After some time (usually
> less than half a minute), the driver spits out a warning like the one
> below, and eventually the UBI layer bails out, which leads to a r/o
> remount and (possibly) file system corruptions.

Okay, I think I figured it out. What happens is that after the 
controller has been sent a command to process, it sets the RDY bits 
before the driver reaches marvell_nfc_wait_op(). Enabling the 
interrupt for RDY does not latch it if the RDY bit is already set, so 
the wait times out. IOW, it's a race: the controller is faster than 
the driver expects it to be.
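
To make the ordering problem concrete, here is a toy shell model of the race. It is not driver code, and every name in it is hypothetical. It assumes only what is described above: an interrupt is raised solely if it is enabled at the moment RDY becomes set. Also checking the status bit after enabling the interrupt is shown as one possible way to close the window, and is not necessarily what the patch does.

```shell
#!/bin/sh
# Toy model of the race, NOT driver code. All names are hypothetical.
#   rdy      - the controller's RDY status bit
#   irq_en   - whether the RDY interrupt is enabled
#   irq_seen - whether an interrupt has been delivered to the driver
# Assumption: an interrupt is only raised if it is enabled at the moment
# RDY becomes set; enabling it later does not latch an already-set bit.

controller_finishes() {
    rdy=1
    if [ "$irq_en" -eq 1 ]; then
        irq_seen=1
    fi
}

# $1: "fast" (controller is done before the driver waits) or "slow"
# $2: "irq_only" or "irq_or_status" (also check RDY after enabling the IRQ)
# Prints "ok" if the wait completes, "timeout" otherwise.
run_scenario() {
    rdy=0; irq_en=0; irq_seen=0

    if [ "$1" = fast ]; then controller_finishes; fi
    irq_en=1    # the driver enters its wait and enables the RDY interrupt
    if [ "$1" = slow ]; then controller_finishes; fi

    if [ "$irq_seen" -eq 1 ]; then
        echo ok
    elif [ "$2" = irq_or_status ] && [ "$rdy" -eq 1 ]; then
        echo ok
    else
        echo timeout
    fi
}

run_scenario slow irq_only        # ok
run_scenario fast irq_only        # timeout: RDY was set before the IRQ was enabled
run_scenario fast irq_or_status   # ok
```

Only the "fast" ordering fails in this model, which would fit a bug that shows up intermittently under load rather than on every write.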

I'll send a patch to fix that. With this applied, I wasn't able to 
reproduce the issue anymore. It could possibly be fixed by other means 
too, so I'm open for discussions and I can also test other approaches 
easily. Let me know what you think.

Thanks,
Daniel


