max_discard anomaly on certain Sandisk eMMC

* max_discard anomaly on certain Sandisk eMMC
@ 2013-12-13 22:43 Stephen Warren
  2013-12-16 23:18 ` Stephen Warren
       [not found] ` <52AB8DA2.9000001-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>
  0 siblings, 2 replies; 17+ messages in thread
From: Stephen Warren @ 2013-12-13 22:43 UTC (permalink / raw)
  To: Chris Ball; +Cc: linux-mmc@vger.kernel.org, linux-tegra@vger.kernel.org

On one of my eMMC devices, I see the following results from calling
mmc_do_calc_max_discard() with various parameters:

[    3.057263] MMC_DISCARD_ARG max_discard 1
[    3.057266] MMC_ERASE_ARG   max_discard 4096
[    3.057267] MMC_TRIM_ARG    max_discard 1

This causes mmc_calc_max_discard() to return 1, which makes the discard
IOCTL extremely slow.

For almost all my other eMMC devices, either:

* Both arguments to mmc_do_calc_max_discard() yield zero. Hence, the
discard IOCTL is not supported.

* Both arguments to mmc_do_calc_max_discard() yield some reasonable
large value. Hence, the discard IOCTL executes reasonably quickly.

Do you think that TRIM_ARG result is expected, or is the eMMC firmware
simply buggy?

If I modify mmc_calc_max_discard() to simply ignore the TRIM_ARG result
and always use the ERASE_ARG result, I see no errors when executing
discard operations from either mke2fs, or from the blkdiscard utility. I
have no idea if the discard operation is doing anything useful though.

As an aside, another eMMC device (with same manfid/oemid/name) I have
returns the same 1 for TRIM_ARG, but returns 0 for ERASE_ARG, and hence
discard is disabled, so I don't see this problem:

[    1.835747] MMC_DISCARD_ARG max_discard 1
[    1.839779] MMC_ERASE_ARG   max_discard 0
[    1.843791] MMC_TRIM_ARG    max_discard 1

To solve my slow discard operations, I'm tempted to modify
mmc_calc_max_discard() as follows:

if (max_discard == 1)
	max_discard = 0;

... but I'm not sure if that would be seen as a regression, since it'd
disable the discard operation completely on theoretically working (but
perhaps practically useless) systems.

Alternatively, perhaps I should replace:

	if (mmc_can_trim(card)) {
		max_trim = mmc_do_calc_max_discard(card, MMC_TRIM_ARG);
		if (max_trim < max_discard)
			max_discard = max_trim;

with:

	if (mmc_can_trim(card)) {
		max_trim = mmc_do_calc_max_discard(card, MMC_TRIM_ARG);
		if (max_trim > 1 && max_trim < max_discard)
			max_discard = max_trim;

Alternatively, should I install a quirk for the specific eMMC device,
which guards one of the changes above, or completely ignores the
TRIM_ARG result?
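
For illustration, the effect of the second proposal can be compared with the current behaviour in a small stand-alone model. This is only a sketch: pick_max_discard() and its ignore_degenerate_trim flag are hypothetical names, and the real mmc_calc_max_discard() has further branches that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-alone model of how the ERASE and TRIM limits are combined.
 * pick_max_discard() is a hypothetical helper, not a kernel function.
 */
unsigned int pick_max_discard(unsigned int max_erase, unsigned int max_trim,
			      bool can_trim, bool ignore_degenerate_trim)
{
	unsigned int max_discard = max_erase;

	if (can_trim) {
		/* Proposed guard: skip a TRIM limit of 0 or 1. */
		bool usable = !ignore_degenerate_trim || max_trim > 1;

		if (usable && max_trim < max_discard)
			max_discard = max_trim;
	}
	return max_discard;
}

/* Replays the values reported for the problematic device. */
void replay_reported_values(void)
{
	/* Current behaviour: the TRIM result of 1 wins. */
	assert(pick_max_discard(4096, 1, true, false) == 1);
	/* With the guard: the ERASE result of 4096 is kept. */
	assert(pick_max_discard(4096, 1, true, true) == 4096);
}
```

In this model the second device (ERASE result 0, TRIM result 1) still ends up with 0 either way, so discard would stay disabled there.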

The eMMC device in question is:

manfid = 0x45
oemid = 0x100
name = SEM16G

Strangely, this is apparently a Sandisk eMMC device, and there already
exist some quirks for a set of similarly named Sandisk devices, but they
are triggered by manfid == 2, not 0x45. I'm not sure why Sandisk uses
two separate manufacturer IDs...


* Re: max_discard anomaly on certain Sandisk eMMC
  2013-12-13 22:43 max_discard anomaly on certain Sandisk eMMC Stephen Warren
@ 2013-12-16 23:18 ` Stephen Warren
  2013-12-17  8:17   ` Adrian Hunter
       [not found] ` <52AB8DA2.9000001-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Stephen Warren @ 2013-12-16 23:18 UTC (permalink / raw)
  To: Chris Ball, Adrian Hunter
  Cc: linux-mmc@vger.kernel.org, linux-tegra@vger.kernel.org

On 12/13/2013 03:43 PM, Stephen Warren wrote:
> On one of my eMMC devices, I see the following results from calling
> mmc_do_calc_max_discard() with various parameters:
> 
> [    3.057263] MMC_DISCARD_ARG max_discard 1
> [    3.057266] MMC_ERASE_ARG   max_discard 4096
> [    3.057267] MMC_TRIM_ARG    max_discard 1
> 
> This causes mmc_calc_max_discard() to return 1, which makes the discard
> IOCTL extremely slow.

Further investigation shows that if I make a few hacks that essentially
revert e056a1b5b67b "mmc: queue: let host controllers specify maximum
discard timeout":

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 357bbc54fe4b..e66af930d0e3 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -167,13 +167,15 @@ static void mmc_queue_setup_discard(struct request_queue *q,
 		return;

 	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
-	q->limits.max_discard_sectors = max_discard;
+	q->limits.max_discard_sectors = UINT_MAX;
 	if (card->erased_byte == 0 && !mmc_can_discard(card))
 		q->limits.discard_zeroes_data = 1;
 	q->limits.discard_granularity = card->pref_erase << 9;
 	/* granularity must not be greater than max. discard */
+#if 0
 	if (card->pref_erase > max_discard)
 		q->limits.discard_granularity = 0;
+#endif
 	if (mmc_can_secure_erase_trim(card))
 		queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
 }

I end up with:

$ cat /sys/.../block/mmcblk1/queue/discard_granularity
2097152
$ cat /sys/.../block/mmcblk1/queue/discard_max_bytes
2199023255040
$ cat /sys/.../block/mmcblk1/queue/discard_zeroes_data
1
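
As a sanity check, the two byte values above are consistent with the patched code if sectors are 512 bytes and unsigned int is 32 bits wide (both assumptions, though common ones). The sketch below only reproduces that arithmetic; sectors_to_bytes() is a hypothetical helper, and the 4096-sector figure is inferred from the reported granularity via card->pref_erase << 9.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a count of 512-byte sectors to bytes (sectors << 9). */
uint64_t sectors_to_bytes(uint32_t sectors)
{
	return (uint64_t)sectors << 9;
}

/* Reproduces the two sysfs values shown above. */
void check_reported_sysfs_values(void)
{
	/* max_discard_sectors = UINT_MAX (32-bit) gives discard_max_bytes. */
	assert(sectors_to_bytes(UINT32_MAX) == 2199023255040ULL);
	/* A preferred erase size of 4096 sectors gives discard_granularity. */
	assert(sectors_to_bytes(4096) == 2097152ULL);
}
```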

With those values, mke2fs is fast, and I validated that "blkdiscard"
works; I filled a large partition with /dev/urandom, executed
"blkdiscard" on the 4M at the start, and saw zeroes when reading the
discarded part back.

This implies that the issue is simply the operation of
mmc_calc_max_discard(), rather than the eMMC device mis-reporting its
discard abilities, doesn't it?


* Re: max_discard anomaly on certain Sandisk eMMC
  2013-12-16 23:18 ` Stephen Warren
@ 2013-12-17  8:17   ` Adrian Hunter
       [not found]     ` <52B008AB.7060909-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Adrian Hunter @ 2013-12-17  8:17 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Chris Ball, linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

On 17/12/13 01:18, Stephen Warren wrote:
> On 12/13/2013 03:43 PM, Stephen Warren wrote:
>> On one of my eMMC devices, I see the following results from calling
>> mmc_do_calc_max_discard() with various parameters:
>>
>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>
>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>> IOCTL extremely slow.
> 
> Further investigation shows that if I make a few hacks that essentially
> revert e056a1b5b67b "mmc: queue: let host controllers specify maximum
> discard timeout":
> 
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index 357bbc54fe4b..e66af930d0e3 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -167,13 +167,15 @@ static void mmc_queue_setup_discard(struct request_queue *q,
>  		return;
> 
>  	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
> -	q->limits.max_discard_sectors = max_discard;
> +	q->limits.max_discard_sectors = UINT_MAX;
>  	if (card->erased_byte == 0 && !mmc_can_discard(card))
>  		q->limits.discard_zeroes_data = 1;
>  	q->limits.discard_granularity = card->pref_erase << 9;
>  	/* granularity must not be greater than max. discard */
> +#if 0
>  	if (card->pref_erase > max_discard)
>  		q->limits.discard_granularity = 0;
> +#endif
>  	if (mmc_can_secure_erase_trim(card))
>  		queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
>  }
> 
> I end up with:
> 
> $ cat /sys/.../block/mmcblk1/queue# cat discard_granularity
> 2097152
> $ cat /sys/.../block/mmcblk1/queue# cat discard_max_bytes
> 2199023255040
> $ cat /sys/.../block/mmcblk1/queue# cat discard_zeroes_data
> 1
> 
> With those values, mke2fs is fast, and I validated that "blkdiscard"
> works; I filled a large partition with /dev/urandom, executed
> "blkdiscard" on the 4M at the start, and saw zeroes when reading the
> discarded part back.
> 
> This implies that the issue is simply the operation of
> mmc_calc_max_discard(), rather than the eMMC device mis-reporting its
> discard abilities, doesn't it?

No.

The underlying problem is a combination of:
	a) JEDEC specified very large timeouts for erase operations, e.g. they
	   can be minutes for large erases
	b) SDHCI controllers have been implemented with high-frequency timeout
	   clocks, which limit the maximum timeout to a few seconds
	c) It is not possible to disable the timeout on SDHCI

What a) means is that you can get away with much larger erases than you can
specify the timeout for - which is what you have discovered.

To understand the timeouts, you should manually do the calculations.

Also note that using HC Erase Size may help (MMC_CAP2_HC_ERASE_SZ), but
beware of the partitioning implications of changing that.

The best solution is to change the hardware to use the lowest possible
frequency timeout clock, e.g. a 1 kHz timeout clock could support timeouts of
up to 36 hours.
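
The manual calculation mentioned above can be sketched roughly as follows. This is a model, not driver code: it assumes the longest programmable data timeout is 2^27 cycles of the timeout clock (an assumption based on the SDHCI specification, not stated in this thread), max_timeout_ms() is a hypothetical helper, and the 48 MHz clock is an arbitrary example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Longest timeout, in milliseconds, that a counter of 2^counter_bits
 * timeout-clock cycles can express. The clock is given in kHz, so
 * cycles / kHz yields milliseconds (rounded down).
 */
uint64_t max_timeout_ms(unsigned int counter_bits, uint32_t timeout_clk_khz)
{
	assert(counter_bits < 64 && timeout_clk_khz > 0);
	return ((uint64_t)1 << counter_bits) / timeout_clk_khz;
}

/* Two example clocks under the 2^27 assumption. */
void example_timeouts(void)
{
	/* 1 kHz timeout clock: 2^27 ms, a little over 37 hours. */
	assert(max_timeout_ms(27, 1) == 134217728ULL);
	/* 48 MHz timeout clock (hypothetical): under 3 seconds. */
	assert(max_timeout_ms(27, 48000) == 2796ULL);
}
```

Under these assumptions a 1 kHz clock lands in the same range as the figure of about 36 hours, while a clock in the tens of MHz leaves only a few seconds.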



* Re: max_discard anomaly on certain Sandisk eMMC
       [not found] ` <52AB8DA2.9000001-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>
@ 2013-12-17  9:25   ` Dong Aisheng
       [not found]     ` <CAA+hA=R3wnbuJrJQhfG9PQEHrwE9nrwg_+xSpyXryOeM2Wtwcw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Dong Aisheng @ 2013-12-17  9:25 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Chris Ball, linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

Hi Stephen,

On Sat, Dec 14, 2013 at 6:43 AM, Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org> wrote:
> On one of my eMMC devices, I see the following results from calling
> mmc_do_calc_max_discard() with various parameters:
>
> [    3.057263] MMC_DISCARD_ARG max_discard 1
> [    3.057266] MMC_ERASE_ARG   max_discard 4096
> [    3.057267] MMC_TRIM_ARG    max_discard 1
>
> This causes mmc_calc_max_discard() to return 1, which makes the discard
> IOCTL extremely slow.
>

IMX ran into a similar issue:
http://www.spinics.net/lists/linux-mmc/msg23375.html
It is caused by the max_discard_to supported by the host being too small.

I submitted patches to fix it:
http://www.spinics.net/lists/arm-kernel/msg294924.html
Please see whether they help you, especially patch #5.
It could increase max_discard_to if Tegra has the same problem.

Regards
Dong Aisheng

> For almost all my other eMMC devices, either:
>
> * Both arguments to mmc_do_calc_max_discard() yield zero. Hence, the
> discard IOCTL is not supported.
>
> * Both arguments to mmc_do_calc_max_discard() yield some reasonable
> large value. Hence, the discard IOCTL executes reasonably quickly.
>
> Do you think that TRIM_ARG result is expected, or is the eMMC firmware
> simply buggy?
>
> If I modify mmc_calc_max_discard() to simply ignore the TRIM_ARG result
> and always use the ERASE_ARG result, I see no errors when executing
> discard operations from either mke2fs, or from the blkdiscard utility. I
> have no idea if the discard operation is doing anything useful though.
>
> As an aside, another eMMC device (with same manfid/oemid/name) I have
> returns the same 1 for TRIM_ARG, but returns 0 for ERASE_ARG, and hence
> discard is disabled, so I don't see this problem:
>
> [    1.835747] MMC_DISCARD_ARG max_discard 1
> [    1.839779] MMC_ERASE_ARG   max_discard 0
> [    1.843791] MMC_TRIM_ARG    max_discard 1
>
> To solve my slow discard operations, I'm tempted to modify
> mmc_calc_max_discard() as follows:
>
> if (max_discard == 1)
>         max_discard = 0;
>
> ... but I'm not sure if that would be seen as a regression, since it'd
> disable the discard operation completely on theoretically working (but
> perhaps practically useless) systems.
>
> Alternatively, perhaps I should replace:
>
>         if (mmc_can_trim(card)) {
>                 max_trim = mmc_do_calc_max_discard(card, MMC_TRIM_ARG);
>                 if (max_trim < max_discard)
>                         max_discard = max_trim;
>
> with:
>
>         if (mmc_can_trim(card)) {
>                 max_trim = mmc_do_calc_max_discard(card, MMC_TRIM_ARG);
>                 if (max_trim > 1 && max_trim < max_discard)
>                         max_discard = max_trim;
>
> Alternatively, should I install a quirk for the specific eMMC device,
> which guards one of the changes above, or completely ignores the
> TRIM_ARG result?
>
> The eMMC device in question is:
>
> manfid = 0x45
> oemid = 0x100
> name = SEM16G
>
> Strangely, this is apparently a Sandisk eMMC device, yet there already
> exist some quirks for a set of similarly named Sandisk devices, yet they
> are triggered by manfid == 2, not 0x45. I'm not sure why Sandisk uses
> two separate manufacturer IDs...
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]     ` <52B008AB.7060909-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-12-17  9:40       ` Dong Aisheng
  2013-12-17  9:45         ` Vladimir Zapolskiy
  2013-12-17 10:04       ` Ulf Hansson
  1 sibling, 1 reply; 17+ messages in thread
From: Dong Aisheng @ 2013-12-17  9:40 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

On Tue, Dec 17, 2013 at 4:17 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 17/12/13 01:18, Stephen Warren wrote:
>> On 12/13/2013 03:43 PM, Stephen Warren wrote:
>>> On one of my eMMC devices, I see the following results from calling
>>> mmc_do_calc_max_discard() with various parameters:
>>>
>>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>>
>>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>>> IOCTL extremely slow.
>>
>> Further investigation shows that if I make a few hacks that essentially
>> revert e056a1b5b67b "mmc: queue: let host controllers specify maximum
>> discard timeout":
>>
>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>> index 357bbc54fe4b..e66af930d0e3 100644
>> --- a/drivers/mmc/card/queue.c
>> +++ b/drivers/mmc/card/queue.c
>> @@ -167,13 +167,15 @@ static void mmc_queue_setup_discard(struct request_queue *q,
>>               return;
>>
>>       queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
>> -     q->limits.max_discard_sectors = max_discard;
>> +     q->limits.max_discard_sectors = UINT_MAX;
>>       if (card->erased_byte == 0 && !mmc_can_discard(card))
>>               q->limits.discard_zeroes_data = 1;
>>       q->limits.discard_granularity = card->pref_erase << 9;
>>       /* granularity must not be greater than max. discard */
>> +#if 0
>>       if (card->pref_erase > max_discard)
>>               q->limits.discard_granularity = 0;
>> +#endif
>>       if (mmc_can_secure_erase_trim(card))
>>               queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
>>  }
>>
>> I end up with:
>>
>> $ cat /sys/.../block/mmcblk1/queue# cat discard_granularity
>> 2097152
>> $ cat /sys/.../block/mmcblk1/queue# cat discard_max_bytes
>> 2199023255040
>> $ cat /sys/.../block/mmcblk1/queue# cat discard_zeroes_data
>> 1
>>
>> With those values, mke2fs is fast, and I validated that "blkdiscard"
>> works; I filled a large partition with /dev/urandom, executed
>> "blkdiscard" on the 4M at the start, and saw zeroes when reading the
>> discarded part back.
>>
>> This implies that the issue is simply the operation of
>> mmc_calc_max_discard(), rather than the eMMC device mis-reporting its
>> discard abilities, doesn't it?
>
> No.
>
> The underlying problem is a combination of:
>         a) JEDEC specified very large timeouts for erase operations e.g. can be
> minutes for large erases
>         b) SDHCI controllers have been implemented with high frequency timeout
> clocks which limit the maximum timeout to a few seconds

Right, especially for controllers using SDCLK as the timeout clock.
I doubt that the timeout supported by the host was designed with erase
operations in mind, since there is such a huge gap between the two.
For IMX, when running at 198 MHz with an SD3.0 card, max_discard_to is 1355 ms.
However, I have a Toshiba SDHC U1 card whose ERASE_OFFSET is 2 s.
That means our host has no chance of supporting discard for such a card.

Now I'm wondering whether, for host controllers with a limited timeout, we
should use CMD13 to poll the status instead of using the HW timeout mechanism.
Actually, the mmc core already has some base code to support this.
The timeout is 10 minutes.

See the mmc_do_erase function and
/* If the device is not responding */
#define MMC_CORE_TIMEOUT_MS     (10 * 60 * 1000) /* 10 minute timeout */
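
To make the "no chance" remark concrete, here is a rough sketch under a simple linear timeout model. The model, the helper name max_erase_units() and the 250 ms per-unit cost are assumptions for illustration only; the 1355 ms host limit and the 2 s offset are the figures quoted above.

```c
#include <assert.h>

/*
 * Largest number of erase units whose estimated timeout still fits in
 * the host limit, under a simple linear model:
 *     timeout_ms(qty) = per_unit_ms * qty + offset_ms
 * Returns 0 when the fixed offset alone already exceeds the limit.
 */
unsigned int max_erase_units(unsigned int host_max_ms,
			     unsigned int per_unit_ms,
			     unsigned int offset_ms)
{
	assert(per_unit_ms > 0);
	if (host_max_ms < offset_ms)
		return 0;
	return (host_max_ms - offset_ms) / per_unit_ms;
}

/* Figures from the message above, plus a hypothetical per-unit cost. */
void example_erase_units(void)
{
	/* A 2 s fixed offset never fits into a 1355 ms host limit. */
	assert(max_erase_units(1355, 250, 2000) == 0);
	/* Without the offset, five 250 ms units would fit. */
	assert(max_erase_units(1355, 250, 0) == 5);
}
```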

Regards
Dong Aisheng

>         c) It is not possible to disable the timeout on SDHCI
>
> What a) means is that you can get away with much larger erases than you can
> specify the timeout for - which is what you have discovered.
>
> To understand the timeouts, you should manually do the calculations.
>
> Also note, that using HC Erase Size may help (MMC_CAP2_HC_ERASE_SZ), but
> beware of the partitioning implications of changing that.
>
> The best solution is to change the hardware to use the lowest possible
> frequency timeout clock e.g. a 1KHz timeout clock could support timeouts of
> up to 36 hours.
>


* Re: max_discard anomaly on certain Sandisk eMMC
  2013-12-17  9:40       ` Dong Aisheng
@ 2013-12-17  9:45         ` Vladimir Zapolskiy
  0 siblings, 0 replies; 17+ messages in thread
From: Vladimir Zapolskiy @ 2013-12-17  9:45 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: Adrian Hunter, Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org, linux-tegra@vger.kernel.org

On 12/17/13 10:40, Dong Aisheng wrote:
> On Tue, Dec 17, 2013 at 4:17 PM, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 17/12/13 01:18, Stephen Warren wrote:
>>> On 12/13/2013 03:43 PM, Stephen Warren wrote:
>>>> On one of my eMMC devices, I see the following results from calling
>>>> mmc_do_calc_max_discard() with various parameters:
>>>>
>>>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>>>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>>>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>>>
>>>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>>>> IOCTL extremely slow.
>>>
>>> Further investigation shows that if I make a few hacks that essentially
>>> revert e056a1b5b67b "mmc: queue: let host controllers specify maximum
>>> discard timeout":
>>>
>>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>>> index 357bbc54fe4b..e66af930d0e3 100644
>>> --- a/drivers/mmc/card/queue.c
>>> +++ b/drivers/mmc/card/queue.c
>>> @@ -167,13 +167,15 @@ static void mmc_queue_setup_discard(struct request_queue *q,
>>>                return;
>>>
>>>        queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
>>> -     q->limits.max_discard_sectors = max_discard;
>>> +     q->limits.max_discard_sectors = UINT_MAX;
>>>        if (card->erased_byte == 0 && !mmc_can_discard(card))
>>>                q->limits.discard_zeroes_data = 1;
>>>        q->limits.discard_granularity = card->pref_erase << 9;
>>>        /* granularity must not be greater than max. discard */
>>> +#if 0
>>>        if (card->pref_erase > max_discard)
>>>                q->limits.discard_granularity = 0;
>>> +#endif
>>>        if (mmc_can_secure_erase_trim(card))
>>>                queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
>>>   }
>>>
>>> I end up with:
>>>
>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_granularity
>>> 2097152
>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_max_bytes
>>> 2199023255040
>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_zeroes_data
>>> 1
>>>
>>> With those values, mke2fs is fast, and I validated that "blkdiscard"
>>> works; I filled a large partition with /dev/urandom, executed
>>> "blkdiscard" on the 4M at the start, and saw zeroes when reading the
>>> discarded part back.
>>>
>>> This implies that the issue is simply the operation of
>>> mmc_calc_max_discard(), rather than the eMMC device mis-reporting its
>>> discard abilities, doesn't it?
>>
>> No.
>>
>> The underlying problem is a combination of:
>>          a) JEDEC specified very large timeouts for erase operations e.g. can be
>> minutes for large erases
>>          b) SDHCI controllers have been implemented with high frequency timeout
>> clocks which limit the maximum timeout to a few seconds
>
> Right, especially for controllers using SDCLK as the timeout clock.
> I doubt that the timeout supported by the host was designed with erase
> operations in mind, since there is such a huge gap between the two.
> For IMX, when running at 198 MHz with an SD3.0 card, max_discard_to is 1355 ms.
> However, I have a Toshiba SDHC U1 card whose ERASE_OFFSET is 2 s.
> That means our host has no chance of supporting discard for such a card.
>
> Now I'm wondering whether, for host controllers with a limited timeout, we
> should use CMD13 to poll the status instead of using the HW timeout mechanism.
> Actually, the mmc core already has some base code to support this.
> The timeout is 10 minutes.

That's my point also. I presume JEDEC specifies a maximum safe timeout for
erase operations, but since it is so huge (if properly calculated it may
reach hours for multiple erase groups) and erase operations are so fast, I
don't think we should care much about the data line timeout on the
controller's side during erase/trim/discard.

> See mmc_do_erase function and
> /* If the device is not responding */
> #define MMC_CORE_TIMEOUT_MS     (10 * 60 * 1000) /* 10 minute timeout */
>
> Regards
> Dong Aisheng
>
>>          c) It is not possible to disable the timeout on SDHCI
>>
>> What a) means is that you can get away with much larger erases than you can
>> specify the timeout for - which is what you have discovered.
>>
>> To understand the timeouts, you should manually do the calculations.
>>
>> Also note, that using HC Erase Size may help (MMC_CAP2_HC_ERASE_SZ), but
>> beware of the partitioning implications of changing that.
>>
>> The best solution is to change the hardware to use the lowest possible
>> frequency timeout clock e.g. a 1KHz timeout clock could support timeouts of
>> up to 36 hours.
>>


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]     ` <52B008AB.7060909-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2013-12-17  9:40       ` Dong Aisheng
@ 2013-12-17 10:04       ` Ulf Hansson
  2013-12-17 11:05         ` Dong Aisheng
       [not found]         ` <CAPDyKFooKY7nyOdLxQS-u9oC_pZL3V8pH5kixLgQpUkPG=kqKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 2 replies; 17+ messages in thread
From: Ulf Hansson @ 2013-12-17 10:04 UTC (permalink / raw)
  To: Adrian Hunter, Stephen Warren
  Cc: Chris Ball, linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

On 17 December 2013 09:17, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 17/12/13 01:18, Stephen Warren wrote:
>> On 12/13/2013 03:43 PM, Stephen Warren wrote:
>>> On one of my eMMC devices, I see the following results from calling
>>> mmc_do_calc_max_discard() with various parameters:
>>>
>>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>>
>>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>>> IOCTL extremely slow.
>>
>> Further investigation shows that if I make a few hacks that essentially
>> revert e056a1b5b67b "mmc: queue: let host controllers specify maximum
>> discard timeout":
>>
>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>> index 357bbc54fe4b..e66af930d0e3 100644
>> --- a/drivers/mmc/card/queue.c
>> +++ b/drivers/mmc/card/queue.c
>> @@ -167,13 +167,15 @@ static void mmc_queue_setup_discard(struct request_queue *q,
>>               return;
>>
>>       queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
>> -     q->limits.max_discard_sectors = max_discard;
>> +     q->limits.max_discard_sectors = UINT_MAX;
>>       if (card->erased_byte == 0 && !mmc_can_discard(card))
>>               q->limits.discard_zeroes_data = 1;
>>       q->limits.discard_granularity = card->pref_erase << 9;
>>       /* granularity must not be greater than max. discard */
>> +#if 0
>>       if (card->pref_erase > max_discard)
>>               q->limits.discard_granularity = 0;
>> +#endif
>>       if (mmc_can_secure_erase_trim(card))
>>               queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
>>  }
>>
>> I end up with:
>>
>> $ cat /sys/.../block/mmcblk1/queue# cat discard_granularity
>> 2097152
>> $ cat /sys/.../block/mmcblk1/queue# cat discard_max_bytes
>> 2199023255040
>> $ cat /sys/.../block/mmcblk1/queue# cat discard_zeroes_data
>> 1
>>
>> With those values, mke2fs is fast, and I validated that "blkdiscard"
>> works; I filled a large partition with /dev/urandom, executed
>> "blkdiscard" on the 4M at the start, and saw zeroes when reading the
>> discarded part back.
>>
>> This implies that the issue is simply the operation of
>> mmc_calc_max_discard(), rather than the eMMC device mis-reporting its
>> discard abilities, doesn't it?
>
> No.
>
> The underlying problem is a combination of:
>         a) JEDEC specified very large timeouts for erase operations e.g. can be
> minutes for large erases
>         b) SDHCI controllers have been implemented with high frequency timeout
> clocks which limit the maximum timeout to a few seconds
>         c) It is not possible to disable the timeout on SDHCI
>
> What a) means is that you can get away with much larger erases than you can
> specify the timeout for - which is what you have discovered.
>
> To understand the timeouts, you should manually do the calculations.
>
> Also note, that using HC Erase Size may help (MMC_CAP2_HC_ERASE_SZ), but
> beware of the partitioning implications of changing that.
>
> The best solution is to change the hardware to use the lowest possible
> frequency timeout clock e.g. a 1KHz timeout clock could support timeouts of
> up to 36 hours.

I don't know the details of the limitations for SDHCI, but I guess
similar ones exist for other controllers as well.

I do get the impression that we have got a problem in the mmc
core/block layer for how erase/trim/discard timeouts are being
handled.

I don't think the mmc hw-controller should be waiting for the R1B
response from the CMD38 as long as this "timeout" we are discussing
here. According to the spec, at least how I interpret it, the card
should respond rather quickly to CMD38, then it will assert the DAT0
line to indicate busy.

The total time the card is allowed to stay busy, that is what the
timeout specifies. We may either use a mmc hw-controller busy
detection mechanism or send CMD13 to poll for status. The latter is
somewhat already being handled in mmc_do_erase(), but we are using
"MMC_CORE_TIMEOUT_MS" instead of the correct timeout.
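
A minimal sketch of the polling idea, using simulated time instead of real hardware. Everything here is hypothetical (struct fake_card, fake_card_busy(), fake_card_wait(), poll_until_ready()); it does not reproduce mmc_do_erase(), it only shows a status-poll loop bounded by a caller-supplied timeout rather than a fixed constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a card that stays busy for a while. */
struct fake_card {
	unsigned int busy_ms_left;	/* remaining busy time, in ms */
};

/* Models one status poll (CMD13): true while the card is still busy. */
bool fake_card_busy(const struct fake_card *card)
{
	return card->busy_ms_left > 0;
}

/* Advances simulated time by step_ms. */
void fake_card_wait(struct fake_card *card, unsigned int step_ms)
{
	if (step_ms > card->busy_ms_left)
		step_ms = card->busy_ms_left;
	card->busy_ms_left -= step_ms;
}

/*
 * Polls until the card is ready or timeout_ms of simulated time has
 * elapsed. Returns 0 on success and -1 on timeout.
 */
int poll_until_ready(struct fake_card *card, unsigned int timeout_ms,
		     unsigned int step_ms)
{
	unsigned int elapsed_ms = 0;

	assert(step_ms > 0);
	while (fake_card_busy(card)) {
		if (elapsed_ms >= timeout_ms)
			return -1;
		fake_card_wait(card, step_ms);
		elapsed_ms += step_ms;
	}
	return 0;
}
```

The point of passing timeout_ms in is that the caller can supply the erase timeout computed for the specific operation.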

Kind regards
Ulf Hansson

>


* Re: max_discard anomaly on certain Sandisk eMMC
  2013-12-17 10:04       ` Ulf Hansson
@ 2013-12-17 11:05         ` Dong Aisheng
  2013-12-17 12:33           ` Ulf Hansson
       [not found]         ` <CAPDyKFooKY7nyOdLxQS-u9oC_pZL3V8pH5kixLgQpUkPG=kqKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 17+ messages in thread
From: Dong Aisheng @ 2013-12-17 11:05 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Adrian Hunter, Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org, linux-tegra@vger.kernel.org

On Tue, Dec 17, 2013 at 6:04 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 17 December 2013 09:17, Adrian Hunter <adrian.hunter@intel.com> wrote:
>> On 17/12/13 01:18, Stephen Warren wrote:
>>> On 12/13/2013 03:43 PM, Stephen Warren wrote:
>>>> On one of my eMMC devices, I see the following results from calling
>>>> mmc_do_calc_max_discard() with various parameters:
>>>>
>>>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>>>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>>>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>>>
>>>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>>>> IOCTL extremely slow.
>>>
>>> Further investigation shows that if I make a few hacks that essentially
>>> revert e056a1b5b67b "mmc: queue: let host controllers specify maximum
>>> discard timeout":
>>>
>>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>>> index 357bbc54fe4b..e66af930d0e3 100644
>>> --- a/drivers/mmc/card/queue.c
>>> +++ b/drivers/mmc/card/queue.c
>>> @@ -167,13 +167,15 @@ static void mmc_queue_setup_discard(struct request_queue *q,
>>>               return;
>>>
>>>       queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
>>> -     q->limits.max_discard_sectors = max_discard;
>>> +     q->limits.max_discard_sectors = UINT_MAX;
>>>       if (card->erased_byte == 0 && !mmc_can_discard(card))
>>>               q->limits.discard_zeroes_data = 1;
>>>       q->limits.discard_granularity = card->pref_erase << 9;
>>>       /* granularity must not be greater than max. discard */
>>> +#if 0
>>>       if (card->pref_erase > max_discard)
>>>               q->limits.discard_granularity = 0;
>>> +#endif
>>>       if (mmc_can_secure_erase_trim(card))
>>>               queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
>>>  }
>>>
>>> I end up with:
>>>
>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_granularity
>>> 2097152
>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_max_bytes
>>> 2199023255040
>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_zeroes_data
>>> 1
>>>
>>> With those values, mke2fs is fast, and I validated that "blkdiscard"
>>> works; I filled a large partition with /dev/urandom, executed
>>> "blkdiscard" on the 4M at the start, and saw zeroes when reading the
>>> discarded part back.
>>>
>>> This implies that the issue is simply the operation of
>>> mmc_calc_max_discard(), rather than the eMMC device mis-reporting its
>>> discard abilities, doesn't it?
>>
>> No.
>>
>> The underlying problem is a combination of:
>>         a) JEDEC specified very large timeouts for erase operations e.g. can be
>> minutes for large erases
>>         b) SDHCI controllers have been implemented with high frequency timeout
>> clocks which limit the maximum timeout to a few seconds
>>         c) It is not possible to disable the timeout on SDHCI
>>
>> What a) means is that you can get away with much larger erases than you can
>> specify the timeout for - which is what you have discovered.
>>
>> To understand the timeouts, you should manually do the calculations.
>>
>> Also note, that using HC Erase Size may help (MMC_CAP2_HC_ERASE_SZ), but
>> beware of the partitioning implications of changing that.
>>
>> The best solution is to change the hardware to use the lowest possible
>> frequency timeout clock e.g. a 1KHz timeout clock could support timeouts of
>> up to 36 hours.
>
> Don't know the details about the limitations for SDHCI, but I guess
> similar exists for other controllers as well.
>
> I do get the impression that we have got a problem in the mmc
> core/block layer for how erase/trim/discard timeouts are being
> handled.
>
> I don't think the mmc hw-controller should be waiting for the R1B
> response from the CMD38 as long as this "timeout" we are discussing
> here. According to the spec, at least how I interpret it, the card
> should respond rather quickly to CMD38, then it will assert the DAT0
> line to indicate busy.
>

For IMX, CMD38 completes very quickly: due to an IP limitation, the
controller does not wait for the TC interrupt raised on DAT0 de-assertion.

> The total time the card is allowed to stay busy, that is what the
> timeout specifies. We may either use a mmc hw-controller busy
> detection mechanism or send CMD13 to poll for status. The latter is
> somewhat already being handled in mmc_do_erase(), but we are using
> "MMC_CORE_TIMEOUT_MS" instead of the correct timeout.
>

Maybe a better way would be to poll for status only if the erase timeout
is bigger than the host can support, and otherwise still prefer the hw
timeout mechanism to save CPU.
However, we would then have two issues:
1) Not waiting for R1B seems to be a bit of a violation of the spec.
It also increases the complexity of handling the R1B of the same command
in two different cases: using the hw timeout or polling status for CMD38.

2) In the current implementation, the data size to erase will not exceed
max_discard_bytes, which is calculated based on the host's max_discard_to.
How would we then specify max_discard_to if we want to use polling? UINT_MAX?
Would such an erase take so long that it affects other activity on the
same system?
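
To make point 2 concrete, here is a minimal sketch of how a host timeout
limit caps the erase size. It is not the kernel's
mmc_do_calc_max_discard(), only a simplified model: it assumes the erase
timeout grows linearly with the number of erase groups, and
erase_timeout_ms() and max_erase_groups() are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical model: each erase group costs a fixed number of milliseconds. */
static uint64_t erase_timeout_ms(uint32_t groups, uint32_t per_group_ms)
{
	return (uint64_t)groups * per_group_ms;
}

/*
 * Largest number of erase groups whose modelled timeout still fits within
 * the host limit (max_discard_to, in ms). Returns 0 if not even one fits.
 */
static uint32_t max_erase_groups(uint32_t per_group_ms, uint32_t max_discard_to)
{
	uint32_t qty = 0;
	uint32_t step;

	/* Greedily add the largest power-of-two step that still fits. */
	for (step = 1u << 31; step; step >>= 1) {
		if (erase_timeout_ms(qty + step, per_group_ms) <= max_discard_to)
			qty += step;
	}
	return qty;
}
```

With a made-up 600 ms per group and a 1000 ms host limit this returns 1,
which is the kind of result that would show up as a very small
max_discard; with 1500 ms per group it returns 0.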

Regards
Dong Aisheng

> Kind regards
> Ulf Hansson
>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]         ` <CAPDyKFooKY7nyOdLxQS-u9oC_pZL3V8pH5kixLgQpUkPG=kqKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-12-17 11:20           ` Adrian Hunter
       [not found]             ` <52B03373.3000505-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2013-12-17 13:14           ` Vladimir Zapolskiy
  1 sibling, 1 reply; 17+ messages in thread
From: Adrian Hunter @ 2013-12-17 11:20 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

On 17/12/13 12:04, Ulf Hansson wrote:
> 
> Don't know the details about the limitations for SDHCI, but I guess
> similar exists for other controllers as well.

Not necessarily.  For example omap_hsmmc just disables the timeout for erase
operations.
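
For comparison, the arithmetic behind the SDHCI limit can be sketched as
below. This assumes the largest programmable SDHCI data timeout is 2^27
cycles of the timeout clock; sdhci_max_timeout_ms() is a hypothetical
helper written for illustration, not a kernel function.

```c
#include <stdint.h>

/* Assumed largest SDHCI data timeout: 2^27 cycles of the timeout clock. */
#define SDHCI_MAX_TIMEOUT_CYCLES (1u << 27)

/* Longest timeout in ms for a timeout clock given in kHz (cycles per ms). */
static uint32_t sdhci_max_timeout_ms(uint32_t timeout_clk_khz)
{
	if (timeout_clk_khz == 0)
		return 0; /* unknown clock: report no usable timeout */
	return SDHCI_MAX_TIMEOUT_CYCLES / timeout_clk_khz;
}
```

Under that assumption a 1 kHz timeout clock gives 134217728 ms, roughly
37 hours and so the same order as the figure quoted earlier in the
thread, while an example 48 MHz timeout clock gives only 2796 ms.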


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]             ` <52B03373.3000505-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2013-12-17 12:25               ` Ulf Hansson
  0 siblings, 0 replies; 17+ messages in thread
From: Ulf Hansson @ 2013-12-17 12:25 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

On 17 December 2013 12:20, Adrian Hunter <adrian.hunter@intel.com> wrote:
> On 17/12/13 12:04, Ulf Hansson wrote:
>> Don't know the details about the limitations for SDHCI, but I guess
>> similar exists for other controllers as well.
>
> Not necessarily.  For example omap_hsmmc just disables the timeout for erase
> operations.
>

Interesting! :-) Actually, it is disabling the data timeout and
keeping the command timeout.


* Re: max_discard anomaly on certain Sandisk eMMC
  2013-12-17 11:05         ` Dong Aisheng
@ 2013-12-17 12:33           ` Ulf Hansson
       [not found]             ` <CAPDyKFpGpWJRz6AFNqb2AqQMoTuywwZz5Ekq5rc9kboYTGFg7A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Ulf Hansson @ 2013-12-17 12:33 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: Adrian Hunter, Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org, linux-tegra@vger.kernel.org

On 17 December 2013 12:05, Dong Aisheng <dongas86@gmail.com> wrote:
> On Tue, Dec 17, 2013 at 6:04 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> Don't know the details about the limitations for SDHCI, but I guess
>> similar exists for other controllers as well.
>>
>> I do get the impression that we have got a problem in the mmc
>> core/block layer for how erase/trim/discard timeouts are being
>> handled.
>>
>> I don't think the mmc hw-controller should be waiting for the R1B
>> response from the CMD38 as long as this "timeout" we are discussing
>> here. According to the spec, at least how I interpret it, the card
>> should respond rather quickly to CMD38, then it will assert the DAT0
>> line to indicate busy.
>>
>
> For IMX, CMD38 responds very quick since it does not wait for TC interrupt
> when DAT0 de-assertion due to IP limitation.
>
>> The total time the card is allowed to stay busy, that is what the
>> timeout specifies. We may either use a mmc hw-controller busy
>> detection mechanism or send CMD13 to poll for status. The latter is
>> somewhat already being handled in mmc_do_erase(), but we are using
>> "MMC_CORE_TIMEOUT_MS" instead of the correct timeout.
>>
>
> Maybe one better way may be using polling for status if erase timeout
> is bigger than
> host capability, else still prefer to use hw timeout mechanism instead
> to save CPU.

Nope, this won't work.

Just because we get the R1B response within some chosen timeout, that
does not mean the card has completed its operation.

To know that, we need to monitor whether the card is still signalling
busy after the R1B response has been received. Thus polling with CMD13
will be needed in any case.
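
A minimal sketch of such a polling loop, under stated assumptions: the
card_busy_fn callback stands in for "send CMD13 and check the busy
state", the budget is counted in polls rather than milliseconds to keep
the example deterministic, and fake_card is a simulated card. None of
these names come from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback standing in for "send CMD13 and test for busy". */
typedef bool (*card_busy_fn)(void *ctx);

/*
 * Poll until the card stops signalling busy or the budget of polls runs out.
 * Returns 0 when the card became ready, -1 on timeout.
 */
static int poll_until_ready(card_busy_fn busy, void *ctx, uint32_t max_polls)
{
	uint32_t i;

	for (i = 0; i < max_polls; i++) {
		if (!busy(ctx))
			return 0;
		/* A real driver would sleep or delay between polls. */
	}
	return -1;
}

/* Simulated card: reports busy for a fixed number of status reads. */
struct fake_card {
	uint32_t busy_reads_left;
};

static bool fake_card_busy(void *ctx)
{
	struct fake_card *card = ctx;

	if (card->busy_reads_left == 0)
		return false;
	card->busy_reads_left--;
	return true;
}
```

The point of the sketch is that completion is decided by the busy state,
not by the arrival of the R1B response; a real implementation would
bound the loop with the erase timeout instead of a poll count.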

Kind regards
Ulf Hansson


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]             ` <CAPDyKFpGpWJRz6AFNqb2AqQMoTuywwZz5Ekq5rc9kboYTGFg7A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-12-17 12:44               ` Ulf Hansson
  2013-12-18  3:11               ` Dong Aisheng
  1 sibling, 0 replies; 17+ messages in thread
From: Ulf Hansson @ 2013-12-17 12:44 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: Adrian Hunter, Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

On 17 December 2013 13:33, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 17 December 2013 12:05, Dong Aisheng <dongas86@gmail.com> wrote:
>> Maybe one better way may be using polling for status if erase timeout
>> is bigger than
>> host capability, else still prefer to use hw timeout mechanism instead
>> to save CPU.
>
> Nope, this wont work.
>
> Just because we get the R1B response within some chosen timeout that
> does not mean the card has completed it's operation.
>
> We need to monitor if the card is signalling busy, after the R1B
> response has been received to know. Thus polling with CMD13 will be
> needed, no matter how.

In this context I think it is worth mentioning how big the command
timeout can be expected to be.

"Ncr" (an abbreviation from the eMMC spec), the response timeout, can be
a value between 2 and 64 clock cycles.
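
To put that range into time units, here is a small worked calculation.
cycles_to_ns() is a hypothetical helper, and the bus clock rates
mentioned afterwards (400 kHz and 52 MHz) are example values only.

```c
#include <stdint.h>

/* Time taken by a number of bus clock cycles, in nanoseconds (rounded up). */
static uint64_t cycles_to_ns(uint32_t cycles, uint32_t clk_hz)
{
	if (clk_hz == 0)
		return 0; /* clock not running: no meaningful duration */
	return ((uint64_t)cycles * 1000000000ull + clk_hz - 1) / clk_hz;
}
```

At 52 MHz, 64 cycles come to about 1231 ns, and at 400 kHz to 160000 ns
(160 us), so the response timeout is many orders of magnitude shorter
than the busy time discussed in this thread.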

Kind regards
Ulf Hansson


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]         ` <CAPDyKFooKY7nyOdLxQS-u9oC_pZL3V8pH5kixLgQpUkPG=kqKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-12-17 11:20           ` Adrian Hunter
@ 2013-12-17 13:14           ` Vladimir Zapolskiy
  1 sibling, 0 replies; 17+ messages in thread
From: Vladimir Zapolskiy @ 2013-12-17 13:14 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Adrian Hunter, Stephen Warren, Chris Ball,
	linux-mmc@vger.kernel.org,
	linux-tegra@vger.kernel.org

On 12/17/13 11:04, Ulf Hansson wrote:
>
> Don't know the details about the limitations for SDHCI, but I guess
> similar exists for other controllers as well.
>
> I do get the impression that we have got a problem in the mmc
> core/block layer for how erase/trim/discard timeouts are being
> handled.
>
> I don't think the mmc hw-controller should be waiting for the R1B
> response from the CMD38 as long as this "timeout" we are discussing
> here. According to the spec, at least how I interpret it, the card
> should respond rather quickly to CMD38, then it will assert the DAT0
> line to indicate busy.
>
> The total time the card is allowed to stay busy, that is what the
> timeout specifies. We may either use a mmc hw-controller busy
> detection mechanism or send CMD13 to poll for status. The latter is
> somewhat already being handled in mmc_do_erase(), but we are using
> "MMC_CORE_TIMEOUT_MS" instead of the correct timeout.

What is the correct timeout? The currently implemented logic doesn't
allow erasing, let's say, 1000 erase groups at once (with the
corresponding timeout). On the other hand, if that were allowed, the
correct timeout might be tens of hours, which should not be permitted
from the user's perspective. I think the predefined MMC_CORE_TIMEOUT_MS
is good enough; the only missing part is permission to erase as many
erase groups as the user wants.

With best wishes,
Vladimir

> Kind regards
> Ulf Hansson


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]     ` <CAA+hA=R3wnbuJrJQhfG9PQEHrwE9nrwg_+xSpyXryOeM2Wtwcw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-12-17 17:27       ` Stephen Warren
       [not found]         ` <52B0897B.5010700-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Stephen Warren @ 2013-12-17 17:27 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: Chris Ball, linux-mmc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 12/17/2013 02:25 AM, Dong Aisheng wrote:
> Hi Stephen,
> 
> On Sat, Dec 14, 2013 at 6:43 AM, Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org> wrote:
>> On one of my eMMC devices, I see the following results from calling
>> mmc_do_calc_max_discard() with various parameters:
>>
>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>
>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>> IOCTL extremely slow.
>>
> 
> IMX met the similar issue.
> http://www.spinics.net/lists/linux-mmc/msg23375.html
> It's caused by the max_discard_to supported by host is too small.
> 
> I submitted the fix patches:
> http://www.spinics.net/lists/arm-kernel/msg294924.html
> Please see if it helps for you, especially patch #5.
> It could increase the max_discard_to if Tegra has same problem.

Thanks for the pointer!

Yes, Tegra has SDHCI_QUIRK_DATA_TIMEOUT_USES_SDCLK set, and has a max
clock of 208MHz specified in HW, yet we only run the HW at 48MHz
upstream, since we haven't actually implemented any of the
advanced/faster transfer rates yet. Hence that patch does avoid/solve
the issue on 2 of my boards.

However, the patch doesn't solve the problem on 2 other boards, since
the eMMC device on those boards specifies a much larger timeout, which
still causes max_discard to be set to 0. It sounds like the real
solution is what was discussed elsewhere in this thread; to use command
polling for erases?

Even on the boards where your patch solves the problem, isn't it just a
temporary measure? As soon as we upstream the changes to enable the
faster transfer modes, we'll have a faster SDCLK, and hence again be
limited in the discard size, perhaps down to a single sector again.

(Incidentally, I think the code should be limiting to a single erase
block, not a single sector. I'll send a separate patch to fix that, and
Cc everyone here).
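
To put rough numbers on the SDCLK point above, here is a minimal sketch
of the arithmetic. It assumes the timeout counter is 27 bits wide and is
clocked directly from SDCLK; max_timeout_ms() is a made-up helper for
illustration, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Longest timeout (in ms) a controller can program, assuming a timeout
 * counter that is counter_bits wide and ticks at clk_khz.
 * kHz means cycles per millisecond, so cycles / kHz yields milliseconds.
 */
static uint64_t max_timeout_ms(unsigned int counter_bits, uint64_t clk_khz)
{
	assert(counter_bits < 64 && clk_khz > 0);
	return (UINT64_C(1) << counter_bits) / clk_khz;
}
```

Under those assumptions, max_timeout_ms(27, 48000) is 2796 ms, while
max_timeout_ms(27, 208000) is only 645 ms, so a faster SDCLK shrinks the
longest erase the host can time. For comparison, max_timeout_ms(27, 1)
is 134217728 ms, which is roughly 37 hours.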


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]             ` <CAPDyKFpGpWJRz6AFNqb2AqQMoTuywwZz5Ekq5rc9kboYTGFg7A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2013-12-17 12:44               ` Ulf Hansson
@ 2013-12-18  3:11               ` Dong Aisheng
  1 sibling, 0 replies; 17+ messages in thread
From: Dong Aisheng @ 2013-12-18  3:11 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Adrian Hunter, Stephen Warren, Chris Ball,
	linux-mmc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Tue, Dec 17, 2013 at 8:33 PM, Ulf Hansson <ulf.hansson-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
> On 17 December 2013 12:05, Dong Aisheng <dongas86-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Tue, Dec 17, 2013 at 6:04 PM, Ulf Hansson <ulf.hansson-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
>>> On 17 December 2013 09:17, Adrian Hunter <adrian.hunter-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>>>> On 17/12/13 01:18, Stephen Warren wrote:
>>>>> On 12/13/2013 03:43 PM, Stephen Warren wrote:
>>>>>> On one of my eMMC devices, I see the following results from calling
>>>>>> mmc_do_calc_max_discard() with various parameters:
>>>>>>
>>>>>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>>>>>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>>>>>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>>>>>
>>>>>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>>>>>> IOCTL extremely slow.
>>>>>
>>>>> Further investigation shows that if I make a few hacks that essentially
>>>>> revert e056a1b5b67b "mmc: queue: let host controllers specify maximum
>>>>> discard timeout":
>>>>>
>>>>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>>>>> index 357bbc54fe4b..e66af930d0e3 100644
>>>>> --- a/drivers/mmc/card/queue.c
>>>>> +++ b/drivers/mmc/card/queue.c
>>>>> @@ -167,13 +167,15 @@ static void mmc_queue_setup_discard(struct
>>>>> request_queue *q,
>>>>>               return;
>>>>>
>>>>>       queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
>>>>> -     q->limits.max_discard_sectors = max_discard;
>>>>> +     q->limits.max_discard_sectors = UINT_MAX;
>>>>>       if (card->erased_byte == 0 && !mmc_can_discard(card))
>>>>>               q->limits.discard_zeroes_data = 1;
>>>>>       q->limits.discard_granularity = card->pref_erase << 9;
>>>>>       /* granularity must not be greater than max. discard */
>>>>> +#if 0
>>>>>       if (card->pref_erase > max_discard)
>>>>>               q->limits.discard_granularity = 0;
>>>>> +#endif
>>>>>       if (mmc_can_secure_erase_trim(card))
>>>>>               queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
>>>>>  }
>>>>>
>>>>> I end up with:
>>>>>
>>>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_granularity
>>>>> 2097152
>>>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_max_bytes
>>>>> 2199023255040
>>>>> $ cat /sys/.../block/mmcblk1/queue# cat discard_zeroes_data
>>>>> 1
>>>>>
>>>>> With those values, mke2fs is fast, and I validated that "blkdiscard"
>>>>> works; I filled a large partition with /dev/urandom, executed
>>>>> "blkdiscard" on the 4M at the start, and saw zeroes when reading the
>>>>> discarded part back.
>>>>>
>>>>> This implies that the issue is simply the operation of
>>>>> mmc_calc_max_discard(), rather than the eMMC device mis-reporting its
>>>>> discard abilities, doesn't it?
>>>>
>>>> No.
>>>>
>>>> The underlying problem is a combination of:
>>>>         a) JEDEC specified very large timeouts for erase operations e.g. can be
>>>> minutes for large erases
>>>>         b) SDHCI controllers have been implemented with high frequency timeout
>>>> clocks which limit the maximum timeout to a few seconds
>>>>         c) It is not possible to disable the timeout on SDHCI
>>>>
>>>> What a) means is that you can get away with much larger erases than you can
>>>> specify the timeout for - which is what you have discovered.
>>>>
>>>> To understand the timeouts, you should manually do the calculations.
>>>>
>>>> Also note, that using HC Erase Size may help (MMC_CAP2_HC_ERASE_SZ), but
>>>> beware of the partitioning implications of changing that.
>>>>
>>>> The best solution is to change the hardware to use the lowest possible
>>>> frequency timeout clock e.g. a 1KHz timeout clock could support timeouts of
>>>> up to 36 hours.
>>>
>>> Don't know the details about the limitations for SDHCI, but I guess
>>> similar exists for other controllers as well.
>>>
>>> I do get the impression that we have got a problem in the mmc
>>> core/block layer for how erase/trim/discard timeouts are being
>>> handled.
>>>
>>> I don't think the mmc hw-controller should be waiting for the R1B
>>> response from the CMD38 as long as this "timeout" we are discussing
>>> here. According to the spec, at least how I interpret it, the card
>>> should respond rather quickly to CMD38, then it will assert the DAT0
>>> line to indicate busy.
>>>
>>
>> For IMX, CMD38 responds very quick since it does not wait for TC interrupt
>> when DAT0 de-assertion due to IP limitation.
>>
>>> The total time the card is allowed to stay busy, that is what the
>>> timeout specifies. We may either use a mmc hw-controller busy
>>> detection mechanism or send CMD13 to poll for status. The latter is
>>> somewhat already being handled in mmc_do_erase(), but we are using
>>> "MMC_CORE_TIMEOUT_MS" instead of the correct timeout.
>>>
>>
>> Maybe one better way may be using polling for status if erase timeout
>> is bigger than
>> host capability, else still prefer to use hw timeout mechanism instead
>> to save CPU.
>
> Nope, this wont work.
>
> Just because we get the R1B response within some chosen timeout that
> does not mean the card has completed it's operation.
>

I mean: do not wait for the R1B busy signal, IOW replace R1B with R1,
and then use polling to check the card's busy status.
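
A minimal sketch of that polling idea, just to make it concrete. The R1
status macros match the kernel's include/linux/mmc/mmc.h, but struct
fake_card, fake_send_status() and poll_until_ready() are hypothetical
stand-ins: a real implementation would send CMD13 through the mmc core
and bound the loop by elapsed time rather than by a poll count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* R1 status bits, as defined in the kernel's include/linux/mmc/mmc.h */
#define R1_READY_FOR_DATA	(1u << 8)
#define R1_CURRENT_STATE(x)	(((x) & 0x00001E00u) >> 9)
#define R1_STATE_TRAN		4
#define R1_STATE_PRG		7

/* Hypothetical card model: stays busy for busy_polls status reads. */
struct fake_card {
	unsigned int busy_polls;
};

/* Hypothetical stand-in for sending CMD13 (SEND_STATUS) to the card. */
static uint32_t fake_send_status(struct fake_card *card)
{
	if (card->busy_polls > 0) {
		card->busy_polls--;
		/* still programming, READY_FOR_DATA not set */
		return (uint32_t)R1_STATE_PRG << 9;
	}
	return ((uint32_t)R1_STATE_TRAN << 9) | R1_READY_FOR_DATA;
}

/*
 * Poll the card status until it reports ready and has left the
 * programming state, or until max_polls status reads have been used.
 * Returns true if the card became ready within the budget.
 */
static bool poll_until_ready(struct fake_card *card, unsigned int max_polls)
{
	assert(card);

	for (unsigned int i = 0; i < max_polls; i++) {
		uint32_t status = fake_send_status(card);

		if ((status & R1_READY_FOR_DATA) &&
		    R1_CURRENT_STATE(status) != R1_STATE_PRG)
			return true;
	}
	return false;
}
```

The point of the sketch is that the budget (max_polls here) lives in
software, so it is not capped by the controller's hardware timeout.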

Regards
Dong Aisheng

> We need to monitor if the card is signalling busy, after the R1B
> response has been received to know. Thus polling with CMD13 will be
> needed, no matter how.
>
> Kind regards
> Ulf Hansson
>
>> However, then we have two issues:
>> 1) not waiting for R1B seems a bit violation with standard spec.
>> Also it increase complexity on handling the R1B of the same command
>> for two different
>> cases: using hw timeout or polling status for CMD38.
>>
>> 2) In current implementation, the data size to erase will not exceed
>> the max_discard_bytes
>> which is calculated based on max_discard_to of host.
>> Then how do we specify max_discard_to if want to use polling? UNIT_MAX?
>> Will it be too long to affect other activities in the same system?
>
>
>
>>
>> Regards
>> Dong Aisheng
>>
>>> Kind regards
>>> Ulf Hansson


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]         ` <52B0897B.5010700-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>
@ 2013-12-18  3:32           ` Dong Aisheng
       [not found]             ` <CAA+hA=Rm1b_ah1uTyps71BLjturJXDHKyWTgiU+z2ELYZKSAWw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Dong Aisheng @ 2013-12-18  3:32 UTC (permalink / raw)
  To: Stephen Warren
  Cc: Chris Ball, linux-mmc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Wed, Dec 18, 2013 at 1:27 AM, Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org> wrote:
> On 12/17/2013 02:25 AM, Dong Aisheng wrote:
>> Hi Stephen,
>>
>> On Sat, Dec 14, 2013 at 6:43 AM, Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org> wrote:
>>> On one of my eMMC devices, I see the following results from calling
>>> mmc_do_calc_max_discard() with various parameters:
>>>
>>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>>
>>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>>> IOCTL extremely slow.
>>>
>>
>> IMX met the similar issue.
>> http://www.spinics.net/lists/linux-mmc/msg23375.html
>> It's caused by the max_discard_to supported by host is too small.
>>
>> I submitted the fix patches:
>> http://www.spinics.net/lists/arm-kernel/msg294924.html
>> Please see if it helps for you, especially patch #5.
>> It could increase the max_discard_to if Tegra has same problem.
>
> Thanks for the pointer!
>
> Yes, Tegra has SDHCI_QUIRK_DATA_TIMEOUT_USES_SDCLK set, and has a max
> clock of 208MHz specified in HW, yet we only run the HW at 48MHz
> upstream, since we haven't actually implemented any of the
> advanced/faster transfer rates yet. Hence that patch does avoid/solve
> the issue on 2 of my boards.
>
> However, the patch doesn't solve the problem on 2 other boards, since
> the eMMC device on those boards specifies a much larger timeout, which
> still causes max_discard to be set to 0. It sounds like the real
> solution is what was discussed elsewhere in this thread; to use command
> polling for erases?
>
> Even on the boards where your patch solves the problem, isn't it just a
> temporary measure; as soon as we upstream the changes to enable the
> faster transfer modes, we'll have a faster SDCLK, and hence again be
> limited in the discard size, perhaps down to a single sector again.
>

Actually, my patch is intended to fix two things: 1) the incorrect max
timeout on IMX, and 2) max_clock being used to calculate max_discard_to
when SDCLK is the timeout clock.
The issue discussed here is a different one: the card timeout may still
be bigger than the host capability, and the question is how to use
discard in that case.
So your problem may still exist if you meet cards with even bigger
timeouts.
For IMX, when running at 50MHz, the max timeout is more than 5s.
That looks big enough for now; I tested many eMMC cards (Samsung,
Toshiba, Sandisk, Hynix) and they all worked well with discard after
the fix.
I don't know which eMMC cards you see the issue with, and I don't know
what the Tegra max timeout is.
Only with SD3.0 cards working at 200MHz did I observe a problem: one
Toshiba SDHC U1 card could not do discard, since its AU erase timeout is
2s+, which exceeds the host capability of 1335ms. Thus discard is
automatically disabled.
But another Sandisk SDXC card still works well, since it has a small
ERASE_OFFSET of 1s.

Regards
Dong Aisheng

> (Incidentally, I think the code should be limiting to a single erase
> block, not a single sector. I'll send a separate patch to fix that, and
> Cc everyone here).


* Re: max_discard anomaly on certain Sandisk eMMC
       [not found]             ` <CAA+hA=Rm1b_ah1uTyps71BLjturJXDHKyWTgiU+z2ELYZKSAWw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-12-18 18:42               ` Stephen Warren
  0 siblings, 0 replies; 17+ messages in thread
From: Stephen Warren @ 2013-12-18 18:42 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: Chris Ball, linux-mmc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 12/17/2013 08:32 PM, Dong Aisheng wrote:
> On Wed, Dec 18, 2013 at 1:27 AM, Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org> wrote:
>> On 12/17/2013 02:25 AM, Dong Aisheng wrote:
>>> Hi Stephen,
>>>
>>> On Sat, Dec 14, 2013 at 6:43 AM, Stephen Warren <swarren-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org> wrote:
>>>> On one of my eMMC devices, I see the following results from calling
>>>> mmc_do_calc_max_discard() with various parameters:
>>>>
>>>> [    3.057263] MMC_DISCARD_ARG max_discard 1
>>>> [    3.057266] MMC_ERASE_ARG   max_discard 4096
>>>> [    3.057267] MMC_TRIM_ARG    max_discard 1
>>>>
>>>> This causes mmc_calc_max_discard() to return 1, which makes the discard
>>>> IOCTL extremely slow.
>>>>
>>>
>>> IMX met the similar issue.
>>> http://www.spinics.net/lists/linux-mmc/msg23375.html
>>> It's caused by the max_discard_to supported by host is too small.
>>>
>>> I submitted the fix patches:
>>> http://www.spinics.net/lists/arm-kernel/msg294924.html
>>> Please see if it helps for you, especially patch #5.
>>> It could increase the max_discard_to if Tegra has same problem.
>>
...
>> Even on the boards where your patch solves the problem, isn't it just a
>> temporary measure; as soon as we upstream the changes to enable the
>> faster transfer modes, we'll have a faster SDCLK, and hence again be
>> limited in the discard size, perhaps down to a single sector again.
> 
> Actually my patch is intend to fix 1) IMX incorrect max timeout issue
> 2) should not use max_clock
> to calculate max_discard_to issue for using SDCLK as timeout clock.
> The issue discussed here is a different issue that the card timeout
> may still be bigger than
> the host capability and how to use discard for such case.
> So your problem may exist if you meet some more big timeout cards.
> For IMX, when running at 50Mhz, the max timeout is more than 5s.
> It looks bigger enough currently and i tested many eMMC cards(Samsung,
> Toshiba, Sandisk, Hynix)
> and all they worked well with discard after fix.
> I don't know which eMMC cards you meet the issue and don't know what
> is Tegra max timeout.
> Just for SD3.0 cards working on 200Mhz, i observed one Toshiba
> SDHC U1 card could not do discard, since its AU erase timeout is 2s+
> which exceeds the host
> capability 1335ms. Thus discard is automatically disabled.
> But another Sandisk SDXC can still work well since it has small
> ERASE_OFFSET as 1s.

On the more recent Tegra boards, the eMMC devices appear to have an
erase timeout of 4200ms for a single erase block! That's more than the
~2600ms max controller timeout at 48MHz on Tegra:-( (that is unless
Tegra also supports more than 27 bits of timeout register)
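
As a rough check of how wide the timeout counter would have to be, here
is a small sketch. It assumes the counter ticks at SDCLK and that the
maximum timeout is simply 2^bits clock cycles; min_counter_bits() is a
made-up helper, not something from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Smallest counter width (in bits) whose full range of 2^bits cycles
 * covers timeout_ms at a clock of clk_khz. kHz * ms = clock cycles.
 */
static unsigned int min_counter_bits(uint64_t clk_khz, uint64_t timeout_ms)
{
	uint64_t cycles = clk_khz * timeout_ms;
	unsigned int bits = 0;

	/* reject a zero clock and a multiplication that overflowed */
	assert(clk_khz > 0 && cycles / clk_khz == timeout_ms);

	while (bits < 63 && (UINT64_C(1) << bits) < cycles)
		bits++;
	return bits;
}
```

With these assumptions, min_counter_bits(48000, 4200) comes out as 28,
i.e. one bit more than a 27-bit counter provides at 48MHz. The plain
2^27 / clock formula gives about 2796 ms at 48MHz, in the same ballpark
as the ~2600ms above; the exact figure depends on how the driver derives
the timeout clock.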


end of thread, other threads:[~2013-12-18 18:42 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-13 22:43 max_discard anomaly on certain Sandisk eMMC Stephen Warren
2013-12-16 23:18 ` Stephen Warren
2013-12-17  8:17   ` Adrian Hunter
     [not found]     ` <52B008AB.7060909-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-12-17  9:40       ` Dong Aisheng
2013-12-17  9:45         ` Vladimir Zapolskiy
2013-12-17 10:04       ` Ulf Hansson
2013-12-17 11:05         ` Dong Aisheng
2013-12-17 12:33           ` Ulf Hansson
     [not found]             ` <CAPDyKFpGpWJRz6AFNqb2AqQMoTuywwZz5Ekq5rc9kboYTGFg7A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-12-17 12:44               ` Ulf Hansson
2013-12-18  3:11               ` Dong Aisheng
     [not found]         ` <CAPDyKFooKY7nyOdLxQS-u9oC_pZL3V8pH5kixLgQpUkPG=kqKw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-12-17 11:20           ` Adrian Hunter
     [not found]             ` <52B03373.3000505-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2013-12-17 12:25               ` Ulf Hansson
2013-12-17 13:14           ` Vladimir Zapolskiy
     [not found] ` <52AB8DA2.9000001-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>
2013-12-17  9:25   ` Dong Aisheng
     [not found]     ` <CAA+hA=R3wnbuJrJQhfG9PQEHrwE9nrwg_+xSpyXryOeM2Wtwcw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-12-17 17:27       ` Stephen Warren
     [not found]         ` <52B0897B.5010700-3lzwWm7+Weoh9ZMKESR00Q@public.gmane.org>
2013-12-18  3:32           ` Dong Aisheng
     [not found]             ` <CAA+hA=Rm1b_ah1uTyps71BLjturJXDHKyWTgiU+z2ELYZKSAWw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-12-18 18:42               ` Stephen Warren
