public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
* [MX25][MMC] mmc esdhc failure in 3.3-rc5 (still present in rc7)
       [not found] <5e57eb999780335721212bb8d411406f@mail.fqingenieria.es>
@ 2012-03-12 13:12 ` joancarles
       [not found] ` <20120312132423.GC2459@pengutronix.de>
  1 sibling, 0 replies; 18+ messages in thread
From: joancarles @ 2012-03-12 13:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi

On Tue, 06 Mar 2012 11:29:50 +0000, joancarles wrote:
> Hi
>
> I have the following error when trying to access the SD-card on my 
> MX25
> device:
>
> root at mx25-noah:~$> mount /media/card/
> root at mx25-noah:~$> df -kh /media/card
> Filesystem                Size      Used Available Use% Mounted on
> /dev/mmcblk0p1          123.6M      2.7M    120.9M   2% /media/card
> root at mx25-noah:~$> touch /media/card/file
> [   82.982920] mmcblk0: error -110 transferring data, sector 559, nr 
> 1,
> cmd response 0x900, card status 0x0
> [   82.993544] end_request: I/O error, dev mmcblk0, sector 559
> [   82.999169] Buffer I/O error on device mmcblk0p1, logical block 
> 497
> [   83.005458] lost page write due to I/O error on mmcblk0p1
> [   83.013922] mmcblk0: error -84 transferring data, sector 559, nr 
> 1, cmd
> response 0x900, card status 0x0
> [   83.024114] end_request: I/O error, dev mmcblk0, sector 559
> [   83.029965] FAT-fs (mmcblk0p1): Directory bread(block 497) failed
> [   83.038051] mmcblk0: error -84 transferring data, sector 560, nr 
> 1, cmd
> response 0x900, card status 0x0
> [   83.048579] end_request: I/O error, dev mmcblk0, sector 560
> [   83.054205] Buffer I/O error on device mmcblk0p1, logical block 
> 498
> [   83.060494] lost page write due to I/O error on mmcblk0p1
> touch: /media/card/file: Input/output error
>
> I have compiled the kernel 3.3-rc5 with driver core debug support and 
> I
> get the following information:
>
> root at mx25-noah:~$> dmesg | grep -i mmc
> [    0.228098] bus: 'mmc': registered
> [    0.228139] device class 'mmc_host': registering
> [    1.405180] bus: 'mmc': add driver mmcblk
> [    1.429969] device: 'mmc0::': device_add
> [    1.430249] PM: Adding info for No Bus:mmc0::
> [    1.430837] Registered led device: mmc0::
> [    1.430884] device: 'mmc0': device_add
> [    1.431114] PM: Adding info for No Bus:mmc0
> [    1.432953] mmc0: SDHCI controller on sdhci-esdhc-imx25.0
> [sdhci-esdhc-imx25.0] using DMA
> [    1.771322] mmc0: host does not support reading read-only switch.
> assuming write-enable.
> [    1.798584] mmc0: new high speed SD card at address 0002
> [    1.804018] device: 'mmc0:0002': device_add
> [    1.804283] bus: 'mmc': add device mmc0:0002
> [    1.804445] PM: Adding info for mmc:mmc0:0002
> [    1.815198] bus: 'mmc': driver_probe_device: matched device 
> mmc0:0002
> with driver mmcblk
> [    1.815256] bus: 'mmc': really_probe: probing driver mmcblk with 
> device
> mmc0:0002
> [    1.825154] mmcblk0: mmc0:0002 00000 1.86 GiB
> [    1.835076] device: 'mmcblk0': device_add
> [    1.835680] PM: Adding info for No Bus:mmcblk0
> [    1.849286]  mmcblk0: p1 p2
> [    1.852252] device: 'mmcblk0p1': device_add
> [    1.855232] PM: Adding info for No Bus:mmcblk0p1
> [    1.855371] device: 'mmcblk0p2': device_add
> [    1.855874] PM: Adding info for No Bus:mmcblk0p2
> [    1.866667] driver: 'mmc0:0002': driver_bound: bound to device 
> 'mmcblk'
> [    1.866720] bus: 'mmc': really_probe: bound device mmc0:0002 to 
> driver
> mmcblk
> [   82.982920] mmcblk0: error -110 transferring data, sector 559, nr 
> 1,
> cmd response 0x900, card status 0x0
> [   82.993544] end_request: I/O error, dev mmcblk0, sector 559
> [   82.999169] Buffer I/O error on device mmcblk0p1, logical block 
> 497
> [   83.005458] lost page write due to I/O error on mmcblk0p1
> [   83.013922] mmcblk0: error -84 transferring data, sector 559, nr 
> 1, cmd
> response 0x900, card status 0x0
> [   83.024114] end_request: I/O error, dev mmcblk0, sector 559
> [   83.029965] FAT-fs (mmcblk0p1): Directory bread(block 497) failed
> [   83.038051] mmcblk0: error -84 transferring data, sector 560, nr 
> 1, cmd
> response 0x900, card status 0x0
> [   83.048579] end_request: I/O error, dev mmcblk0, sector 560
> [   83.054205] Buffer I/O error on device mmcblk0p1, logical block 
> 498
> [   83.060494] lost page write due to I/O error on mmcblk0p1
>
> The same device booted into a 2.6.39.3 kernel works just fine. The
> initialization of the esdhc driver in the device section is as 
> follows:
>
> imx25_add_sdhci_esdhc_imx(0, NULL);
>
> Changing it to a pdata init as shown below does not change anything 
> with
> regard to the failure:
>
> imx25_add_sdhci_esdhc_imx(0, &noah_esdhc_pdata);
> static const struct esdhc_platform_data noah_esdhc_pdata __initconst 
> = {
>         .wp_gpio = -EINVAL,
>         .cd_gpio = -EINVAL,
>         .wp_type = ESDHC_WP_NONE,
>         .cd_type = ESDHC_CD_NONE,
> };
>
> What could have changed between 2.6.39.3 and 3.3-rc5 to trigger this
> behaviour? A quick look at
>
> git diff v2.6.39 drivers/mmc/host/sdhci-esdhc-imx.c
>
> only confuses me with regard to this issue.
>
> Regards

This is still present in 3.3-rc7. Anyone? Where could I start debugging 
this? The same device works just fine with the 2.6.39.3 kernel based 
sdhci driver (drivers/mmc/host/sdhci-esdhc-imx.c). Unfortunately, due to 
out-of-tree device support for a custom i.MX25 board that I have started 
with a fresh 3.3-rc2 git tree, I cannot bisect the tree down to 2.6.39, 
or at least I don't know how.

Best regards
-- 
Joan C. Abelaira

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
       [not found]       ` <a06e68c3687c7390e15950bc20831fd4@mail.fqingenieria.es>
@ 2012-03-27  7:32         ` joancarles
  2012-03-27  7:59           ` joancarles
  2012-03-27  8:12           ` Eric Bénard
  0 siblings, 2 replies; 18+ messages in thread
From: joancarles @ 2012-03-27  7:32 UTC (permalink / raw)
  To: linux-arm-kernel

G'day

This is still present, and we have absolutely no idea where to continue 
searching for the solution of the problem. Any pointers? Any git trees 
of mmc driver changes we could try? Anybody else with a MX25 that has 
this issue?

>> Hi:
>> -110 is the timeout problem.
>> You can dump the esdhc registers if you want to debug into this 
>> issue
>> when this timeout problem occurred.
>> The register dump can help to figure out what's wrong on the bus.
>
> I have no familiarity with the SDHC drivers, so there's nothing
> useful for me in the output as of this moment:
>
> [ 1080.818765] sdhci: ===========================================
> [ 1080.818830] mmc0: req done (CMD17): 0: 00000900 00000000 00000000 
> 00000000
> [ 1080.818862] mmc0:     512 bytes transferred: 0
> [ 1080.819395] mmc0: starting CMD24 arg 00008800 flags 000000b5
> [ 1080.819444] mmc0:     blksz 512 blocks 1 flags 00000100 tsac 300 
> ms nsac 0
> [ 1080.819544] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000009
> [ 1080.819595] sdhci [sdhci_data_irq()]: mmc0: DMA base 0x82547c00,
> transferred 0x038400 bytes, next 0x82580000
> [ 1080.819630] sdhci: =========== REGISTER DUMP (mmc0)===========
> [ 1080.819658] sdhci: Sys addr: 0x82547e00 | Version:  0x00000000
> [ 1080.819683] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
> [ 1080.819707] sdhci: Argument: 0x00008800 | Trn mode: 0x00000003
> [ 1080.819732] sdhci: Present:  0xfe880006 | Host ctl: 0x00000023
> [ 1080.819756] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
> [ 1080.819780] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
> [ 1080.819804] sdhci: Timeout:  0x0000000c | Int stat: 0x00100002
> [ 1080.819830] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
> [ 1080.819854] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
> [ 1080.819879] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
> [ 1080.819904] sdhci: Cmd:      0x0000183a | Max curr: 0x00000000
> [ 1080.819926] sdhci: Host ctl2: 0x00000000
> [ 1080.819942] sdhci: ===========================================
> [ 1080.819985] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00100002
> [ 1080.820048] mmc0: req done (CMD24): 0: 00000900 00000000 00000000 
> 00000000
> [ 1080.820082] mmc0:     0 bytes transferred: -110
> [ 1080.820134] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.820201] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.820259] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.820330] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.820391] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.820445] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.820504] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.820572] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.820627] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.820785] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.820879] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.820946] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.821021] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.821085] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.821140] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.821197] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.821255] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.821308] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.824944] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.825041] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.825112] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.825186] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.825248] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.825302] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.825364] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.825432] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.825485] mmc0: req done (CMD13): 0: 00000e00 00000000 00000000 
> 00000000
> [ 1080.825640] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.825736] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.825803] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 
> 00000000
> [ 1080.825883] mmcblk0: error -110 transferring data, sector 68, nr
> 1, cmd response 0x900, card status 0x0
> [ 1080.849435] mmc0: starting CMD55 arg 00020000 flags 00000095
> [ 1080.849531] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.849596] mmc0: req done (CMD55): 0: 00000920 00000000 00000000 
> 00000000
> [ 1080.850319] mmc0: starting CMD22 arg 00000000 flags 000000b5
> [ 1080.850367] mmc0:     blksz 4 blocks 1 flags 00000200 tsac 100 ms 
> nsac 0
> [ 1080.850468] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.850523] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x0020000a
> [ 1080.850591] mmc0: req done (CMD22): 0: 00000920 00000000 00000000 
> 00000000
> [ 1080.850624] mmc0:     0 bytes transferred: -84
> [ 1080.851339] end_request: I/O error, dev mmcblk0, sector 68
> [ 1080.856881] Buffer I/O error on device mmcblk0p1, logical block 6
> [ 1080.862998] lost page write due to I/O error on mmcblk0p1
> [ 1080.897203] mmc0: starting CMD17 arg 00008800 flags 000000b5
> [ 1080.897250] mmc0:     blksz 512 blocks 1 flags 00000200 tsac 100 
> ms nsac 0
> [ 1080.897350] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.898143] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x0020000a
> [ 1080.898241] mmc0: req done (CMD17): 0: 00000900 00000000 00000000 
> 00000000
> [ 1080.898276] mmc0:     0 bytes transferred: -84
> [ 1080.898949] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1080.899134] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1080.899204] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 
> 00000000
> [ 1080.899284] mmcblk0: error -84 transferring data, sector 68, nr 1,
> cmd response 0x900, card status 0x0
> [ 1080.908738] end_request: I/O error, dev mmcblk0, sector 68
> [ 1080.914465] FAT-fs (mmcblk0p1): FAT read failed (blocknr 6)
> [ 1085.809683] mmc0: starting CMD24 arg 00045e00 flags 000000b5
> [ 1085.809731] mmc0:     blksz 512 blocks 1 flags 00000100 tsac 300 
> ms nsac 0
> [ 1085.809831] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000009
> [ 1085.809880] sdhci [sdhci_data_irq()]: mmc0: DMA base 0x82548200,
> transferred 0x037e00 bytes, next 0x82580000
> [ 1085.809914] sdhci: =========== REGISTER DUMP (mmc0)===========
> [ 1085.809941] sdhci: Sys addr: 0x82548400 | Version:  0x00000000
> [ 1085.809967] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
> [ 1085.809992] sdhci: Argument: 0x00045e00 | Trn mode: 0x00000003
> [ 1085.810017] sdhci: Present:  0xff880000 | Host ctl: 0x00000021
> [ 1085.810042] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
> [ 1085.810067] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
> [ 1085.810093] sdhci: Timeout:  0x0000000c | Int stat: 0x00200002
> [ 1085.810119] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
> [ 1085.810143] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
> [ 1085.810169] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
> [ 1085.810193] sdhci: Cmd:      0x0000183a | Max curr: 0x00000000
> [ 1085.810214] sdhci: Host ctl2: 0x00000000
> [ 1085.810232] sdhci: ===========================================
> [ 1085.810275] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00200002
> [ 1085.810340] mmc0: req done (CMD24): 0: 00000900 00000000 00000000 
> 00000000
> [ 1085.810373] mmc0:     0 bytes transferred: -84
> [ 1085.810427] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1085.810489] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1085.810543] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 
> 00000000
> [ 1085.810616] mmc0: starting CMD13 arg 00020000 flags 00000195
> [ 1085.810690] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1085.810747] mmc0: req done (CMD13): 0: 00000900 00000000 00000000 
> 00000000
> [ 1085.810838] mmcblk0: error -84 transferring data, sector 559, nr
> 1, cmd response 0x900, card status 0x0
> [ 1085.834557] mmc0: starting CMD55 arg 00020000 flags 00000095
> [ 1085.834651] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1085.834722] mmc0: req done (CMD55): 0: 00000920 00000000 00000000 
> 00000000
> [ 1085.834802] mmc0: starting CMD22 arg 00000000 flags 000000b5
> [ 1085.834839] mmc0:     blksz 4 blocks 1 flags 00000200 tsac 100 ms 
> nsac 0
> [ 1085.834915] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x00000001
> [ 1085.834986] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 
> 0x0020000a
> [ 1085.835056] mmc0: req done (CMD22): 0: 00000920 00000000 00000000 
> 00000000
>
> What else can I provide you with? The MMC/SDHC/eSDHC driver framework
> is a huge piece of software, so I don't really know where to start
> looking. Electrically, there seems nothing really wrong with the
> signal of the clock; the only thing that looks a wee bit suspicious 
> is
> that the MX25 clock for the SDHC is currently offering a 33MHz signal
> (as gathered from the oscilloscope).

-- 
Joan C. Abelaira

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  7:32         ` [MX25][MMC] mmc esdhc failure in 3.3 joancarles
@ 2012-03-27  7:59           ` joancarles
  2012-03-27  8:12           ` Eric Bénard
  1 sibling, 0 replies; 18+ messages in thread
From: joancarles @ 2012-03-27  7:59 UTC (permalink / raw)
  To: linux-arm-kernel

ADDENDUM:

On Tue, 27 Mar 2012 09:32:36 +0200, joancarles wrote:
> G'day
>
> This is still present, and we have absolutely no idea where to
> continue searching for the solution of the problem. Any pointers? Any
> git trees of mmc driver changes we could try? Anybody else with a 
> MX25
> that has this issue?

Enabling CONFIG_MMC_CLKGATE, I have managed to create an oops simply be 
trying to remove a file:

[  192.821592] mmcblk0: error -110 transferring data, sector 559, nr 1, 
cmd response 0x900, card status 0x0
[  192.846928] mmc0: starting CMD55 arg 00020000 flags 00000095
[  192.846991] mmc0: clock 50000000Hz busmode 2 powermode 2 cs 0 Vdd 17 
width 2 timing 2
[  192.847048] sdhci-esdhc-imx sdhci-esdhc-imx25.0: desired SD clock: 
50000000, actual: 33250000
[  192.848090] mmc0: ungated MCI clock
[  192.848159] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 0x00000001
[  192.848221] mmc0: req done (CMD55): 0: 00000920 00000000 00000000 
00000000
[  192.848353] mmc0: clock 0Hz busmode 2 powermode 2 cs 0 Vdd 17 width 
2 timing 2
[  192.848400] mmc0: gated MCI clock
[  192.848458] Division by zero in kernel.
[  192.852384] [<c00141f0>] (unwind_backtrace+0x0/0x130) from 
[<c01c0bb4>] (Ldiv0+0x8/0x10)
[  192.867389] [<c01c0bb4>] (Ldiv0+0x8/0x10) from [<c029b288>] 
(mmc_sd_num_wr_blocks+0x148/0x230)
[  192.876924] [<c029b288>] (mmc_sd_num_wr_blocks+0x148/0x230) from 
[<c029c900>] (mmc_blk_issue_rw_rq+0x3fc/0x4ac)
[  192.887944] [<c029c900>] (mmc_blk_issue_rw_rq+0x3fc/0x4ac) from 
[<c029cbe8>] (mmc_blk_issue_rq+0x238/0x514)
[  192.898606] [<c029cbe8>] (mmc_blk_issue_rq+0x238/0x514) from 
[<c029d3b8>] (mmc_queue_thread+0x58/0x144)
[  192.908879] [<c029d3b8>] (mmc_queue_thread+0x58/0x144) from 
[<c0038b44>] (kthread+0x84/0x90)
[  192.918477] [<c0038b44>] (kthread+0x84/0x90) from [<c000f9a0>] 
(kernel_thread_exit+0x0/0x8)
[  192.927654] mmc0: starting CMD22 arg 00000000 flags 000000b5
[  192.927700] mmc0:     blksz 4 blocks 1 flags 00000200 tsac 100 ms 
nsac 0
[  192.927752] mmc0: clock 50000000Hz busmode 2 powermode 2 cs 0 Vdd 17 
width 2 timing 2
[  192.927810] sdhci-esdhc-imx sdhci-esdhc-imx25.0: desired SD clock: 
50000000, actual: 33250000
[  192.928852] mmc0: ungated MCI clock
[  192.928940] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 0x00000001
[  192.928991] sdhci [sdhci_irq()]: *** mmc0 got interrupt: 0x0000000a
[  192.929031] sdhci [sdhci_data_irq()]: mmc0: DMA base 0x839a3040, 
transferred 0x05cfc0 bytes, next 0x83a00000
[  192.929094] mmc0: req done (CMD22): 0: 00000920 00000000 00000000 
00000000
[  192.929126] mmc0:     4 bytes transferred: 0
[  192.929242] mmc0: clock 0Hz busmode 2 powermode 2 cs 0 Vdd 17 width 
2 timing 2
[  192.929288] mmc0: gated MCI clock

Maybe this helps. The file, in fact, has been removed, as far as I can 
see. Also, how come, the actual clock differs so much from the desired 
clock of 50MHz? As commented before, we do actually see the 33.25MHz 
clock in the oscilloscope.

Best regards
-- 
Joan C. Abelaira

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  7:32         ` [MX25][MMC] mmc esdhc failure in 3.3 joancarles
  2012-03-27  7:59           ` joancarles
@ 2012-03-27  8:12           ` Eric Bénard
  2012-03-27  8:46             ` joancarles
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Bénard @ 2012-03-27  8:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Le Tue, 27 Mar 2012 09:32:36 +0200,
joancarles <joancarles@fqingenieria.es> a ?crit :

> G'day
> 
> This is still present, and we have absolutely no idea where to continue 
> searching for the solution of the problem. Any pointers? Any git trees 
> of mmc driver changes we could try? Anybody else with a MX25 that has 
> this issue?
> 
does that attached patch fix your problem ?

Eric
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-esdhc-imx-unbreak-on-i.MX25.patch
Type: text/x-patch
Size: 1054 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20120327/6729fea5/attachment.bin>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  8:12           ` Eric Bénard
@ 2012-03-27  8:46             ` joancarles
  2012-03-27  8:58               ` Eric Bénard
  2012-03-27  9:01               ` Wolfram Sang
  0 siblings, 2 replies; 18+ messages in thread
From: joancarles @ 2012-03-27  8:46 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Eric

>> G'day
>>
>> This is still present, and we have absolutely no idea where to 
>> continue
>> searching for the solution of the problem. Any pointers? Any git 
>> trees
>> of mmc driver changes we could try? Anybody else with a MX25 that 
>> has
>> this issue?
>>
> does that attached patch fix your problem ?

YES, indeed it does to a certain degree. The timeout errors do not 
occur anymore and I can successfully perform many fs->ops, such as READ, 
WRITE, UNLINK ... . So thank you very much for this insight through your 
patch!

The speed of the card is excruciatingly slow, however. We roughly get 
between 34KB/s and 64KB/s from a dd if=/dev/zero write to any of the 
following partitions:

root at mx25-noah:~$> grep mmc /proc/mounts
/dev/mmcblk0p2 /media/card-ext3 ext3 
rw,sync,relatime,user_xattr,barrier=1,nodelalloc,data=ordered 0 0
/dev/mmcblk0p1 /media/card vfat 
rw,sync,relatime,fmask=0022,dmask=0022,codepage=cp437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 
0 0
root at mx25-noah:~$> df -kh | grep mmc
/dev/mmcblk0p2            1.7G    133.1M      1.5G   8% 
/media/card-ext3
/dev/mmcblk0p1          123.6M      2.7M    120.9M   2% /media/card
root at mx25-noah:~$> dd if=/dev/zero of=/media/card/test-10M bs=1024K 
count=10
10+0 records in
10+0 records out
10485760 bytes (10.0MB) copied, 296.004564 seconds, 34.6KB/s
root at mx25-noah:~$> dd if=/dev/zero of=/media/card-ext3/test-10M 
bs=1024K count=10
10+0 records in
10+0 records out
10485760 bytes (10.0MB) copied, 158.207481 seconds, 64.7KB/s

What's going on here, why is it so slow?

root at mx25-noah:~$> dmesg | grep -A1 CMD24
[  635.195042] mmc0: starting CMD24 arg 0e5c1600 flags 000000b5
[  635.195081] mmc0:     blksz 512 blocks 1 flags 00000100 tsac 300 ms 
nsac 0
--
[  635.199831] mmc0: req done (CMD24): 0: 00000900 00000000 00000000 
00000000
[  635.199866] mmc0:     512 bytes transferred: 0
--
[  635.200212] mmc0: starting CMD24 arg 0e5c1800 flags 000000b5
[  635.200251] mmc0:     blksz 512 blocks 1 flags 00000100 tsac 300 ms 
nsac 0
--
[  635.206516] mmc0: req done (CMD24): 0: 00000900 00000000 00000000 
00000000
[  635.206549] mmc0:     512 bytes transferred: 0
--
[  635.206892] mmc0: starting CMD24 arg 0e5c1a00 flags 000000b5
[  635.206932] mmc0:     blksz 512 blocks 1 flags 00000100 tsac 300 ms 
nsac 0
--
[  635.211763] mmc0: req done (CMD24): 0: 00000900 00000000 00000000 
00000000
[  635.211796] mmc0:     512 bytes transferred: 0
--
[  635.212409] mmc0: starting CMD24 arg 0e5c1c00 flags 000000b5
[  635.212449] mmc0:     blksz 512 blocks 1 flags 00000100 tsac 300 ms 
nsac 0

Also, debugfs values don't make sense to me:

root at mx25-noah:/sys/kernel/debug/mmc0$> cat clock
50000000
root at mx25-noah:/sys/kernel/debug/mmc0$> cat ios
clock:		50000000 Hz
vdd:		17 (2.9 ~ 3.0 V)
bus mode:	2 (push-pull)
chip select:	0 (don't care)
power mode:	2 (on)
bus width:	2 (4 bits)
timing spec:	2 (sd high-speed)
root at mx25-noah:/sys/kernel/debug/mmc0$> cat mmc0\:0002/stat
state   status
root at mx25-noah:/sys/kernel/debug/mmc0$> cat mmc0\:0002/state
0x00000005
root at mx25-noah:/sys/kernel/debug/mmc0$> cat mmc0\:0002/status
00000900

Best regards and thank you so much for this real improvement of our 
situation here.
-- 
Joan C. Abelaira

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  8:46             ` joancarles
@ 2012-03-27  8:58               ` Eric Bénard
  2012-03-27 13:03                 ` Zhu Richard-R65037
  2012-03-27  9:01               ` Wolfram Sang
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Bénard @ 2012-03-27  8:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Joan Carles,

Le Tue, 27 Mar 2012 10:46:45 +0200,
joancarles <joancarles@fqingenieria.es> a ?crit :
> >> This is still present, and we have absolutely no idea where to 
> >> continue
> >> searching for the solution of the problem. Any pointers? Any git 
> >> trees
> >> of mmc driver changes we could try? Anybody else with a MX25 that 
> >> has
> >> this issue?
> >>
> > does that attached patch fix your problem ?
> 
> YES, indeed it does to a certain degree. The timeout errors do not 
> occur anymore and I can successfully perform many fs->ops, such as READ, 
> WRITE, UNLINK ... . So thank you very much for this insight through your 
> patch!
> 
> The speed of the card is excruciatingly slow, however. We roughly get 
> between 34KB/s and 64KB/s from a dd if=/dev/zero write to any of the 
> following partitions:
> 
I have the same speed problem here on i.MX25 but still haven't found
the reason.
Shawn, Richard : do you have an idea of why the eSDHC is so slow on
i.MX25 ?

Thanks,
Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  8:46             ` joancarles
  2012-03-27  8:58               ` Eric Bénard
@ 2012-03-27  9:01               ` Wolfram Sang
  2012-03-27  9:06                 ` Eric Bénard
  1 sibling, 1 reply; 18+ messages in thread
From: Wolfram Sang @ 2012-03-27  9:01 UTC (permalink / raw)
  To: linux-arm-kernel


> >does that attached patch fix your problem ?
> 
> YES, indeed it does to a certain degree. The timeout errors do not
> occur anymore and I can successfully perform many fs->ops, such as

Interesting question is now why it worked on your older kernel? The code
around BROKEN_TIMEOUT is there for much longer, I'd think.

> The speed of the card is excruciatingly slow, however. We roughly
> get between 34KB/s and 64KB/s from a dd if=/dev/zero write to any of
> the following partitions:

Check the code below what is patched. That's currently the best solution
for the errata. There might be better ones, though, but you'd need to
find out.

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20120327/0b5c95e0/attachment.sig>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:01               ` Wolfram Sang
@ 2012-03-27  9:06                 ` Eric Bénard
  2012-03-27  9:14                   ` Wolfram Sang
  2012-03-27  9:33                   ` joancarles
  0 siblings, 2 replies; 18+ messages in thread
From: Eric Bénard @ 2012-03-27  9:06 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Wolfram,

Le Tue, 27 Mar 2012 11:01:49 +0200,
Wolfram Sang <w.sang@pengutronix.de> a ?crit :
> > >does that attached patch fix your problem ?
> > 
> > YES, indeed it does to a certain degree. The timeout errors do not
> > occur anymore and I can successfully perform many fs->ops, such as
> 
> Interesting question is now why it worked on your older kernel? The code
> around BROKEN_TIMEOUT is there for much longer, I'd think.
> 
not in fact it seems to have been broken from a long time and I think
I'm responsible of that in 37865fe91582582a6f6c00652f6a2b1ff71f8a78
"mmc: sdhci-esdhc-imx: fix timeout on i.MX's sdhc"
because unlike the i.MX35 it seems that the i.MX25 manages to read
properly the partition table even without the timeout quirk and it
seems that I didn't do more extensive tests for this patch.

Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:06                 ` Eric Bénard
@ 2012-03-27  9:14                   ` Wolfram Sang
  2012-03-27  9:23                     ` Eric Bénard
  2012-03-27  9:33                   ` joancarles
  1 sibling, 1 reply; 18+ messages in thread
From: Wolfram Sang @ 2012-03-27  9:14 UTC (permalink / raw)
  To: linux-arm-kernel


> > Interesting question is now why it worked on your older kernel? The code
> > around BROKEN_TIMEOUT is there for much longer, I'd think.
> > 
> not in fact it seems to have been broken from a long time and I think

I know and you are saying the same, in fact :) Your patch came in around
2.6.37 and here it was said that 2.6.39 works fine. Might be random,
though.

> because unlike the i.MX35 it seems that the i.MX25 manages to read
> properly the partition table even without the timeout quirk and it
> seems that I didn't do more extensive tests for this patch.

Please do next time.

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20120327/10d98581/attachment.sig>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:14                   ` Wolfram Sang
@ 2012-03-27  9:23                     ` Eric Bénard
  2012-03-27  9:36                       ` Wolfram Sang
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Bénard @ 2012-03-27  9:23 UTC (permalink / raw)
  To: linux-arm-kernel

Le Tue, 27 Mar 2012 11:14:25 +0200,
Wolfram Sang <w.sang@pengutronix.de> a ?crit :

> 
> > > Interesting question is now why it worked on your older kernel? The code
> > > around BROKEN_TIMEOUT is there for much longer, I'd think.
> > > 
> > not in fact it seems to have been broken from a long time and I think
> 
> I know and you are saying the same, in fact :) Your patch came in around
> 2.6.37 and here it was said that 2.6.39 works fine. Might be random,
> though.
> 
what is surprising is that no one reported this problem between 2.6.37
and 3.3

> > because unlike the i.MX35 it seems that the i.MX25 manages to read
> > properly the partition table even without the timeout quirk and it
> > seems that I didn't do more extensive tests for this patch.
> 
> Please do next time.
> 
well I should have said "enough extensive" tests as I did a lot of tests
on the 3 archs (i.MX25, i.MX35 and i.MX51). What I see here is that on
i.MX35 and i.MX51 the problem is very easy to reproduce without
extensive tests (card not detected) and that on i.MX25 it needs more
tests as the card is properly detected.

Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:06                 ` Eric Bénard
  2012-03-27  9:14                   ` Wolfram Sang
@ 2012-03-27  9:33                   ` joancarles
  2012-03-27  9:40                     ` Eric Bénard
  2012-03-27  9:45                     ` Wolfram Sang
  1 sibling, 2 replies; 18+ messages in thread
From: joancarles @ 2012-03-27  9:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hi

>> Interesting question is now why it worked on your older kernel? The 
>> code
>> around BROKEN_TIMEOUT is there for much longer, I'd think.
>>
> not in fact it seems to have been broken from a long time and I think
> I'm responsible of that in 37865fe91582582a6f6c00652f6a2b1ff71f8a78
> "mmc: sdhci-esdhc-imx: fix timeout on i.MX's sdhc"
> because unlike the i.MX35 it seems that the i.MX25 manages to read
> properly the partition table even without the timeout quirk and it
> seems that I didn't do more extensive tests for this patch.

Might be unrelated, however I have been keeping my eyes on the fix of 
ENGcm07207 quirk introduced with 16a790bcc. According to the 
IMX25CE.pdf, to abort data transfers on the AHB, software can reset the 
eSDHC by writing 1 to SYSCTL[24] (RSTA), which currently is not done 
with SDHCI_QUIRK_NO_MULTIBLOCK. It sets the max_blk_count to 1 instead 
of 65535. Not sure if this is also limiting the speed.

I have tried putting the SD card into an ALL-in-One reader via USB and 
I get 6MB/s read and 15MB/s write performance. Since I didn't know the 
exact class of the card, this reassures me that there is nothing 
substantially wrong with the card per se.

So, how can we find a solution to this speed issue? Also, do you plan 
on submitting your patch to revert the timeout quirk for MX25?

Cheers
-- 
Joan C. Abelaira

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:23                     ` Eric Bénard
@ 2012-03-27  9:36                       ` Wolfram Sang
  0 siblings, 0 replies; 18+ messages in thread
From: Wolfram Sang @ 2012-03-27  9:36 UTC (permalink / raw)
  To: linux-arm-kernel


> well I should have said "enough extensive" tests as I did a lot of tests
> on the 3 archs (i.MX25, i.MX35 and i.MX51). What I see here is that on
> i.MX35 and i.MX51 the problem is very easy to reproduce without
> extensive tests (card not detected) and that on i.MX25 it needs more
> tests as the card is properly detected.

Ah, okay, thanks for the clarification! Well, since you retested now and
found it is harder to trigger that could explain why nobody reported it
and why 2.6.39 seems to work (although it maybe doesn't). Thanks for
rechecking. If you submit the patch, you can add my

Acked-by: Wolfram Sang <w.sang@pengutronix.de>

Best regards.

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20120327/7b294c8d/attachment.sig>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:33                   ` joancarles
@ 2012-03-27  9:40                     ` Eric Bénard
  2012-03-27 10:56                       ` joancarles
  2012-03-27  9:45                     ` Wolfram Sang
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Bénard @ 2012-03-27  9:40 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Le Tue, 27 Mar 2012 11:33:57 +0200,
joancarles <joancarles@fqingenieria.es> a ?crit :
> >> Interesting question is now why it worked on your older kernel? The 
> >> code
> >> around BROKEN_TIMEOUT is there for much longer, I'd think.
> >>
> > not in fact it seems to have been broken from a long time and I think
> > I'm responsible of that in 37865fe91582582a6f6c00652f6a2b1ff71f8a78
> > "mmc: sdhci-esdhc-imx: fix timeout on i.MX's sdhc"
> > because unlike the i.MX35 it seems that the i.MX25 manages to read
> > properly the partition table even without the timeout quirk and it
> > seems that I didn't do more extensive tests for this patch.
> 
> Might be unrelated, however I have been keeping my eyes on the fix of 
> ENGcm07207 quirk introduced with 16a790bcc. According to the 
> IMX25CE.pdf, to abort data transfers on the AHB, software can reset the 
> eSDHC by writing 1 to SYSCTL[24] (RSTA), which currently is not done 
> with SDHCI_QUIRK_NO_MULTIBLOCK. It sets the max_blk_count to 1 instead 
> of 65535. Not sure if this is also limiting the speed.
> 
I tried to increase max_blk_count and it breaks after a few tests.
Using an oscilloscope it seems we have a big latency between each
transfer which could explain the low throughput, I didn't have the
time to look deeper at this problem.

> I have tried putting the SD card into an ALL-in-One reader via USB and 
> I get 6MB/s read and 15MB/s write performance. Since I didn't know the 
> exact class of the card, this reassures me that there is nothing 
> substantially wrong with the card per se.
> 
> So, how can we find a solution to this speed issue? Also, do you plan 
> on submitting your patch to revert the timeout quirk for MX25?
> 
yes, I'm writing a proper commit message and will send it in a few
minutes.

Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:33                   ` joancarles
  2012-03-27  9:40                     ` Eric Bénard
@ 2012-03-27  9:45                     ` Wolfram Sang
  1 sibling, 0 replies; 18+ messages in thread
From: Wolfram Sang @ 2012-03-27  9:45 UTC (permalink / raw)
  To: linux-arm-kernel


> Might be unrelated, however I have been keeping my eyes on the fix
> of ENGcm07207 quirk introduced with 16a790bcc. According to the
> IMX25CE.pdf, to abort data transfers on the AHB, software can reset
> the eSDHC by writing 1 to SYSCTL[24] (RSTA), which currently is not
> done with SDHCI_QUIRK_NO_MULTIBLOCK. It sets the max_blk_count to 1
> instead of 65535. Not sure if this is also limiting the speed.

Oh, it is limiting speed, for sure.

SDHCI_QUIRK_NO_MULTIBLOCK is a _generic_ quirk which can be enabled by
various buggy sdhci implementations, not only esdhc (which is pretty
good at being buggy). It is not there to fix MX25 specific issues,
that would have to be done seperately in sdhci-esdhc-imx.c.

So, the above quirk is a big hammer solution. There might be more
precise ones, but those still would have to be developed. I don't know
of anyone planning to do this.

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20120327/c71d2935/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  9:40                     ` Eric Bénard
@ 2012-03-27 10:56                       ` joancarles
  2012-03-28  9:26                         ` joancarles
  0 siblings, 1 reply; 18+ messages in thread
From: joancarles @ 2012-03-27 10:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi

>> Might be unrelated, however I have been keeping my eyes on the fix 
>> of
>> ENGcm07207 quirk introduced with 16a790bcc. According to the
>> IMX25CE.pdf, to abort data transfers on the AHB, software can reset 
>> the
>> eSDHC by writing 1 to SYSCTL[24] (RSTA), which currently is not done
>> with SDHCI_QUIRK_NO_MULTIBLOCK. It sets the max_blk_count to 1 
>> instead
>> of 65535. Not sure if this is also limiting the speed.
>>
> I tried to increase max_blk_count and it breaks after a few tests.

I will redo your tests in a jiffy, however the latency issues worry me 
a wee bit.

> Using an oscilloscope it seems we have a big latency between each
> transfer which could explain the low throughput, I didn't have the
> time to look deeper at this problem.

The latency we can see here in the oscilloscope is tremendous between 
two CMD24: 8ms! What is causing this? It is certainly unrelated to the 
MULTIBLOCK feature. Having a clock at 33.25MHz, our maximal read, as 
confirmed by dd is:

33.25[MHz] * 4[bits data] / 8[bits/Byte] * 30 [us write] / 8000 [us 
sleep] =~ 60KB/s

So, what's keeping the driver 8ms from sending the data??? Any pointers 
on where I could twiddle some bits in the driver? Where do I start 
looking?

Cheers
-- 
Joan C. Abelaira

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27  8:58               ` Eric Bénard
@ 2012-03-27 13:03                 ` Zhu Richard-R65037
  0 siblings, 0 replies; 18+ messages in thread
From: Zhu Richard-R65037 @ 2012-03-27 13:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Eric:
I'm on my business trip out of my FSL office, can't reproduce this issue on i.MX25 board now.
Sorry about that.

Best Regards
Richard Zhu

-----Original Message-----
From: Eric B?nard [mailto:eric at eukrea.com] 
Sent: 2012?3?27? 3:58
To: joancarles
Cc: Zhu Richard-R65037; shawn.guo at linaro.org; linux-mmc at vger.kernel.org; Wolfram Sang; linux-arm-kernel at lists.infradead.org
Subject: Re: [MX25][MMC] mmc esdhc failure in 3.3

Hi Joan Carles,

Le Tue, 27 Mar 2012 10:46:45 +0200,
joancarles <joancarles@fqingenieria.es> a ?crit :
> >> This is still present, and we have absolutely no idea where to 
> >> continue searching for the solution of the problem. Any pointers? 
> >> Any git trees of mmc driver changes we could try? Anybody else with 
> >> a MX25 that has this issue?
> >>
> > does that attached patch fix your problem ?
> 
> YES, indeed it does to a certain degree. The timeout errors do not 
> occur anymore and I can successfully perform many fs->ops, such as 
> READ, WRITE, UNLINK ... . So thank you very much for this insight 
> through your patch!
> 
> The speed of the card is excruciatingly slow, however. We roughly get 
> between 34KB/s and 64KB/s from a dd if=/dev/zero write to any of the 
> following partitions:
> 
I have the same speed problem here on i.MX25 but still haven't found the reason.
Shawn, Richard : do you have an idea of why the eSDHC is so slow on
i.MX25 ?

Thanks,
Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-27 10:56                       ` joancarles
@ 2012-03-28  9:26                         ` joancarles
  2012-03-28 14:01                           ` joancarles
  0 siblings, 1 reply; 18+ messages in thread
From: joancarles @ 2012-03-28  9:26 UTC (permalink / raw)
  To: linux-arm-kernel

ADDENDUM:

>> I tried to increase max_blk_count and it breaks after a few tests.
>
> I will redo your tests in a jiffy, however the latency issues worry
> me a wee bit.

Increasing max_blk_count did not help really, however I am not sure if 
setting it unconditionally to 65535 was the only change I had to make.

> So, what's keeping the driver 8ms from sending the data??? Any
> pointers on where I could twiddle some bits in the driver? Where do I
> start looking?

Using the following setup, I tried to get some more information:

diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c 
b/drivers/mmc/host/sdhci-esdhc-imx.c
index 0be4e20..4896f24 100644
--- a/drivers/mmc/host/sdhci-esdhc-imx.c
+++ b/drivers/mmc/host/sdhci-esdhc-imx.c
@@ -467,8 +467,7 @@ static int __devinit sdhci_esdhc_imx_probe(struct 
platform_device *pdev)
         clk_enable(clk);
         pltfm_host->clk = clk;

-       if (!is_imx25_esdhc(imx_data))
-               host->quirks |= SDHCI_QUIRK_BROKEN_TIMEOUT_VAL;
+       host->quirks |= SDHCI_QUIRK_BROKEN_TIMEOUT_VAL;

         if (is_imx25_esdhc(imx_data) || is_imx35_esdhc(imx_data))
                 /* Fix errata ENGcm07207 present on i.MX25 and i.MX35 
*/
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 8d66706..0b60d14 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -713,6 +713,8 @@ static void sdhci_prepare_data(struct sdhci_host 
*host, struct mmc_command *cmd)
                 sdhci_writeb(host, count, SDHCI_TIMEOUT_CONTROL);
         }

+       sdhci_dumpregs(host);
+
         if (!data)
                 return;

echo 'file core.c line 193 +p' > 
/sys/kernel/debug/dynamic_debug/control
echo 'file core.c line 158 +p' > 
/sys/kernel/debug/dynamic_debug/control
echo 'file sdhci.c line 72-115 +p' > 
/sys/kernel/debug/dynamic_debug/control
mount /media/card
dd if=/dev/urandom of=/media/card/file-4K bs=4K count=1

[  119.895460] mmc0: starting CMD24 arg 00045e00 flags 000000b5
[  119.895531] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.895570] sdhci: Sys addr: 0x82480000 | Version:  0x00000000
[  119.895602] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.895633] sdhci: Argument: 0x00049c00 | Trn mode: 0x00000013
[  119.895664] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.895693] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.895724] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.895754] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.895787] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.895819] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.895852] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.895883] sdhci: Cmd:      0x0000113a | Max curr: 0x00000000
[  119.895911] sdhci: Host ctl2: 0x00000000
[  119.895936] sdhci: ===========================================
[  119.900527] mmc0: starting CMD13 arg 00020000 flags 00000195
[  119.900585] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.900624] sdhci: Sys addr: 0x82452400 | Version:  0x00000000
[  119.900655] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.900687] sdhci: Argument: 0x00045e00 | Trn mode: 0x00000003
[  119.900718] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.900746] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.900776] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.900807] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.900840] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.900869] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.900902] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.900933] sdhci: Cmd:      0x0000183a | Max curr: 0x00000000
[  119.900962] sdhci: Host ctl2: 0x00000000
[  119.900985] sdhci: ===========================================
[  119.901357] mmc0: starting CMD17 arg 00009000 flags 000000b5
[  119.901419] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.901458] sdhci: Sys addr: 0x82452400 | Version:  0x00000000
[  119.901490] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.901520] sdhci: Argument: 0x00020000 | Trn mode: 0x00000003
[  119.901551] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.901579] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.901610] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.901641] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.901673] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.901703] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.901735] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.901765] sdhci: Cmd:      0x00000d1a | Max curr: 0x00000000
[  119.901793] sdhci: Host ctl2: 0x00000000
[  119.901816] sdhci: ===========================================
[  119.908072] mmc0: starting CMD17 arg 00009200 flags 000000b5
[  119.908141] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.908179] sdhci: Sys addr: 0x82480000 | Version:  0x00000000
[  119.908212] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.908243] sdhci: Argument: 0x00009000 | Trn mode: 0x00000013
[  119.908274] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.908304] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.908335] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.908367] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.908400] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.908432] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.908464] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.908495] sdhci: Cmd:      0x0000113a | Max curr: 0x00000000
[  119.908524] sdhci: Host ctl2: 0x00000000
[  119.908547] sdhci: ===========================================
[  119.910858] mmc0: starting CMD17 arg 00009400 flags 000000b5
[  119.910929] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.910969] sdhci: Sys addr: 0x82480000 | Version:  0x00000000
[  119.911002] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.911034] sdhci: Argument: 0x00009200 | Trn mode: 0x00000013
[  119.911065] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.911095] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.911126] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.911157] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.911191] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.911223] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.911255] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.911286] sdhci: Cmd:      0x0000113a | Max curr: 0x00000000
[  119.911314] sdhci: Host ctl2: 0x00000000
[  119.911338] sdhci: ===========================================
[  119.913935] mmc0: starting CMD24 arg 00009000 flags 000000b5
[  119.914004] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.914042] sdhci: Sys addr: 0x82480000 | Version:  0x00000000
[  119.914073] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.914105] sdhci: Argument: 0x00009400 | Trn mode: 0x00000013
[  119.914135] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.914164] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.914194] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.914224] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.914257] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.914288] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.914321] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.914352] sdhci: Cmd:      0x0000113a | Max curr: 0x00000000
[  119.914380] sdhci: Host ctl2: 0x00000000
[  119.914404] sdhci: ===========================================
[  119.921304] mmc0: starting CMD13 arg 00020000 flags 00000195
[  119.921368] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.921407] sdhci: Sys addr: 0x82403600 | Version:  0x00000000
[  119.921440] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.921472] sdhci: Argument: 0x00009000 | Trn mode: 0x00000003
[  119.921503] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.921532] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.921564] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.921594] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.921626] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.921658] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.921689] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.921720] sdhci: Cmd:      0x0000183a | Max curr: 0x00000000
[  119.921747] sdhci: Host ctl2: 0x00000000
[  119.921770] sdhci: ===========================================
[  119.921889] mmc0: starting CMD24 arg 00009200 flags 000000b5
[  119.921941] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.921976] sdhci: Sys addr: 0x82403600 | Version:  0x00000000
[  119.922007] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.922038] sdhci: Argument: 0x00020000 | Trn mode: 0x00000003
[  119.922070] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.922098] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.922129] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.922161] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.922193] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.922223] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.922256] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.922287] sdhci: Cmd:      0x00000d1a | Max curr: 0x00000000
[  119.922315] sdhci: Host ctl2: 0x00000000
[  119.922338] sdhci: ===========================================
[  119.927417] mmc0: starting CMD13 arg 00020000 flags 00000195
[  119.927481] sdhci: =========== REGISTER DUMP (mmc0)===========
[  119.927523] sdhci: Sys addr: 0x82403800 | Version:  0x00000000
[  119.927554] sdhci: Blk size: 0x00000200 | Blk cnt:  0x00000000
[  119.927586] sdhci: Argument: 0x00009200 | Trn mode: 0x00000003
[  119.927616] sdhci: Present:  0xff880000 | Host ctl: 0x00000023
[  119.927645] sdhci: Power:    0x00000000 | Blk gap:  0x00000000
[  119.927675] sdhci: Wake-up:  0x00000000 | Clock:    0x00000107
[  119.927707] sdhci: Timeout:  0x0000000e | Int stat: 0x00000000
[  119.927740] sdhci: Int enab: 0x107f000b | Sig enab: 0x107f000b
[  119.927772] sdhci: AC12 err: 0x00000000 | Slot int: 0x00001001
[  119.927804] sdhci: Caps:     0x07eb0000 | Caps_1:   0x04100410
[  119.927835] sdhci: Cmd:      0x0000183a | Max curr: 0x00000000
[  119.927864] sdhci: Host ctl2: 0x00000000

[CMD24/CMD13 changes go on like this ...]

Anybody with more insights? Why is the interval between two CMD24 more 
or less in the range of 6ms and the CMD13 follows in varying intervals 
between 1ms to 4ms? What's keeping the driver from bursting CMD24 and 
successive CMD13 requests?

Thanks and best regards
-- 
Joan C. Abelaira

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [MX25][MMC] mmc esdhc failure in 3.3
  2012-03-28  9:26                         ` joancarles
@ 2012-03-28 14:01                           ` joancarles
  0 siblings, 0 replies; 18+ messages in thread
From: joancarles @ 2012-03-28 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

> Anybody with more insights? Why is the interval between two CMD24
> more or less in the range of 6ms and the CMD13 follows in varying
> intervals between 1ms to 4ms? What's keeping the driver from bursting
> CMD24 and successive CMD13 requests?

A 4*128KB dd run shows the following patterns of latency between 
various CMDs issued:

http://i43.tinypic.com/2bcmsw.png

The latency distribution is plotted here:

http://i41.tinypic.com/9gvbqo.png

bin limits	freq count	pct
<0ms	0	0.0%
<1ms	922	43.6%
<6ms	491	23.2%
<10ms	411	19.4%
<20ms	113	5.3%
<100ms	163	7.7%
<300ms	9	0.4%
<1000ms	7	0.3%

It took 25 seconds to write 4*128KB to the SD-card using the current 
drivers. More than 50% of the CMD requests issued take more than 1ms 
from issuing to response between the host and the card. Anybody else 
with this issue on the i.MX25? How can I pinpoint this down to the 
culprits in the kernel source?

Regards
-- 
Joan C. Abelaira

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-03-28 14:01 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <5e57eb999780335721212bb8d411406f@mail.fqingenieria.es>
2012-03-12 13:12 ` [MX25][MMC] mmc esdhc failure in 3.3-rc5 (still present in rc7) joancarles
     [not found] ` <20120312132423.GC2459@pengutronix.de>
     [not found]   ` <9018463dfcd3c9e9a311aeed42b758bf@mail.fqingenieria.es>
     [not found]     ` <0E83723C55F66F43A6041464FE31119D099A34@039-SN2MPN1-011.039d.mgd.msft.net>
     [not found]       ` <a06e68c3687c7390e15950bc20831fd4@mail.fqingenieria.es>
2012-03-27  7:32         ` [MX25][MMC] mmc esdhc failure in 3.3 joancarles
2012-03-27  7:59           ` joancarles
2012-03-27  8:12           ` Eric Bénard
2012-03-27  8:46             ` joancarles
2012-03-27  8:58               ` Eric Bénard
2012-03-27 13:03                 ` Zhu Richard-R65037
2012-03-27  9:01               ` Wolfram Sang
2012-03-27  9:06                 ` Eric Bénard
2012-03-27  9:14                   ` Wolfram Sang
2012-03-27  9:23                     ` Eric Bénard
2012-03-27  9:36                       ` Wolfram Sang
2012-03-27  9:33                   ` joancarles
2012-03-27  9:40                     ` Eric Bénard
2012-03-27 10:56                       ` joancarles
2012-03-28  9:26                         ` joancarles
2012-03-28 14:01                           ` joancarles
2012-03-27  9:45                     ` Wolfram Sang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox