* [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
@ 2015-06-24 0:26 Jes.Sorensen
2015-06-24 0:26 ` [PATCH 1/1] raid0: Disable discard per default due to performance uncertainty Jes.Sorensen
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Jes.Sorensen @ 2015-06-24 0:26 UTC (permalink / raw)
To: neilb; +Cc: linux-raid
From: Jes Sorensen <Jes.Sorensen@redhat.com>
Neil,
I have been hitting issues with discard being ridiculously slow on
arrays with certain types of SSDs that seem to serialize discard
processing.
This is particularly bad because I have seen systems where the IMSM BIOS
defaults to a 4KB chunk size. Combined with these badly performing
drives, that can bump mkfs on an array from seconds to over 40
minutes. Most users will stick to the defaults and then hit the
problem during install without understanding why it goes wrong :(
The problem is that there is no way to benchmark our way to this or
somehow test if a drive performs discard at reasonable speed. I
suggest we take an approach similar to that of RAID456 and default to
disabling discard, except for the case where the user knows the drives
are safe.
Thoughts?
Cheers,
Jes
Jes Sorensen (1):
raid0: Disable discard per default due to performance uncertainty
drivers/md/raid0.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
--
2.4.3
* [PATCH 1/1] raid0: Disable discard per default due to performance uncertainty
2015-06-24 0:26 [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1 Jes.Sorensen
@ 2015-06-24 0:26 ` Jes.Sorensen
2015-06-24 5:00 ` [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1 Roman Mamedov
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Jes.Sorensen @ 2015-06-24 0:26 UTC (permalink / raw)
To: neilb; +Cc: linux-raid
From: Jes Sorensen <Jes.Sorensen@redhat.com>
Some SSDs handle discard requests badly (very slowly). As an example,
mkfs.xfs on a RAID0 array with 4KB chunk size, constructed from Intel
SSDSC2BF18 180GB SSDs, can jump from 1.2secs to 43mins.
There is no reliable way to easily determine whether a device handles
discard at decent speed, and we cannot just benchmark it, since doing
so would destroy data on the drive.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
---
drivers/md/raid0.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 4f19837..cc8c3c3 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -25,6 +25,11 @@
#include "raid0.h"
#include "raid5.h"
+static bool devices_discard_performance = false;
+module_param(devices_discard_performance, bool, 0644);
+MODULE_PARM_DESC(devices_discard_performance,
+ "Set to Y if all devices in each array handle discard requests at proper speed");
+
static int raid0_congested(struct mddev *mddev, int bits)
{
struct r0conf *conf = mddev->private;
@@ -277,6 +282,21 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
blk_queue_io_opt(mddev->queue,
(mddev->chunk_sectors << 9) * mddev->raid_disks);
+ /* Unfortunately, some devices have awful discard performance,
+ * especially for small sized requests. This is particularly
+ * bad for RAID0 with a small chunk size resulting in small
+ * DISCARD requests hitting the underlying drives.
+ * Only allow DISCARD if the sysadmin confirms that all devices
+ * in use can handle small DISCARD requests at reasonable speed,
+ * by setting a module parameter.
+ */
+ if (!devices_discard_performance) {
+ if (discard_supported) {
+ pr_info("md/raid0: discard support disabled due to performance uncertainty.\n");
+ pr_info("Set raid0.devices_discard_performance=Y to override.\n");
+ }
+ discard_supported = false;
+ }
if (!discard_supported)
queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);
else
--
2.4.3
* Re: [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
2015-06-24 0:26 [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1 Jes.Sorensen
2015-06-24 0:26 ` [PATCH 1/1] raid0: Disable discard per default due to performance uncertainty Jes.Sorensen
@ 2015-06-24 5:00 ` Roman Mamedov
2015-06-24 11:04 ` Jes Sorensen
2015-06-24 7:55 ` NeilBrown
2015-07-07 4:42 ` Mike Snitzer
3 siblings, 1 reply; 9+ messages in thread
From: Roman Mamedov @ 2015-06-24 5:00 UTC (permalink / raw)
To: Jes.Sorensen; +Cc: neilb, linux-raid
On Tue, 23 Jun 2015 20:26:12 -0400
Jes.Sorensen@redhat.com wrote:
> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>
> Neil,
>
> I have been hitting issues with discard being ridiculously slow on
> arrays with certain types of SSDs that seem to serialize discard
> processing.
>
> This is particularly bad as I have seen systems where the IMSM BIOS
> defaults to 4KB chunk size, combined with these badly performing
> drives, it could bump the mkfs on an array from seconds to over 40
> minutes. Most users will stick to the defaults and then hit the
> problem during install without understanding why it goes wrong :(
>
> The problem is that there is no way to benchmark our way to this or
> somehow test if a drive performs discard at reasonable speed. I
> suggest we take an approach similar to that of RAID456 and default to
> disabling discard, except for the case where the user knows the drives
> are safe.
>
> Thoughts?
It would be very unfortunate to cripple all the good SSD models because of
a few bad ones. No one will remember to explicitly set the override to enable
TRIM, or perhaps even know that it gets disabled in md in the first place. The
only thing they will notice later is reduced performance and lifespan of their
SSDs.
Also, most importantly, shouldn't this be handled at the lower level (individual
block devices), and not in md? There's already a mechanism to blacklist TRIM
on some specific SSD models (see libata-core); maybe there should be a way to
disable it by default. Or if the SSDs you mentioned really make it unusable,
maybe they should just be blacklisted as well.
--
With respect,
Roman
* Re: [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
2015-06-24 0:26 [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1 Jes.Sorensen
2015-06-24 0:26 ` [PATCH 1/1] raid0: Disable discard per default due to performance uncertainty Jes.Sorensen
2015-06-24 5:00 ` [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1 Roman Mamedov
@ 2015-06-24 7:55 ` NeilBrown
2015-06-24 11:02 ` Jes Sorensen
2015-07-07 4:42 ` Mike Snitzer
3 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2015-06-24 7:55 UTC (permalink / raw)
To: Jes.Sorensen; +Cc: linux-raid
On Tue, 23 Jun 2015 20:26:12 -0400
Jes.Sorensen@redhat.com wrote:
> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>
> Neil,
>
> I have been hitting issues with discard being ridiculously slow on
> arrays with certain types of SSDs that seem to serialize discard
> processing.
>
> This is particularly bad as I have seen systems where the IMSM BIOS
> defaults to 4KB chunk size, combined with these badly performing
> drives, it could bump the mkfs on an array from seconds to over 40
> minutes. Most users will stick to the defaults and then hit the
> problem during install without understanding why it goes wrong :(
>
> The problem is that there is no way to benchmark our way to this or
> somehow test if a drive performs discard at reasonable speed. I
> suggest we take an approach similar to that of RAID456 and default to
> disabling discard, except for the case where the user knows the drives
> are safe.
>
> Thoughts?
>
> Cheers,
> Jes
>
>
> Jes Sorensen (1):
> raid0: Disable discard per default due to performance uncertainty
>
> drivers/md/raid0.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
RAID1? RAID0?? I hate it when I do that!
Doesn't the scheduler merge adjacent discard requests?
Or is this some non-SATA/SCSI SSD that has a 'make_request_fn' driver?
I think I came across one of those before (NVMe).
In that case - the driver needs to be fixed.
NeilBrown
* Re: [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
2015-06-24 7:55 ` NeilBrown
@ 2015-06-24 11:02 ` Jes Sorensen
2015-06-25 1:05 ` Martin K. Petersen
0 siblings, 1 reply; 9+ messages in thread
From: Jes Sorensen @ 2015-06-24 11:02 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
NeilBrown <neilb@suse.com> writes:
> On Tue, 23 Jun 2015 20:26:12 -0400
> Jes.Sorensen@redhat.com wrote:
>
>> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>>
>> Neil,
>>
>> I have been hitting issues with discard being ridiculously slow on
>> arrays with certain types of SSDs that seem to serialize discard
>> processing.
>>
>> This is particularly bad as I have seen systems where the IMSM BIOS
>> defaults to 4KB chunk size, combined with these badly performing
>> drives, it could bump the mkfs on an array from seconds to over 40
>> minutes. Most users will stick to the defaults and then hit the
>> problem during install without understanding why it goes wrong :(
>>
>> The problem is that there is no way to benchmark our way to this or
>> somehow test if a drive performs discard at reasonable speed. I
>> suggest we take an approach similar to that of RAID456 and default to
>> disabling discard, except for the case where the user knows the drives
>> are safe.
>>
>> Thoughts?
>>
>> Cheers,
>> Jes
>>
>> Jes Sorensen (1):
>> raid0: Disable discard per default due to performance uncertainty
>>
>> drivers/md/raid0.c | 20 ++++++++++++++++++++
>> 1 file changed, 20 insertions(+)
>
> RAID1? RAID0?? I hate it when I do that!
>
> Doesn't the scheduler merge adjacent discard requests?
>
> Or is this some non-SATA/SCSI SSD that has a 'make_request_fn' driver?
> I think I came across one of those before (NVMe).
> In that case - the driver needs to be fixed.
Nope, the block layer doesn't merge the requests at this point. It
actually triggered an OOM on my test system, which has only 4GB of RAM,
because discard commands are not handled like normal commands and there
is no limit on how many outstanding commands are in flight.
Jes
* Re: [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
2015-06-24 5:00 ` [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1 Roman Mamedov
@ 2015-06-24 11:04 ` Jes Sorensen
2015-06-25 1:03 ` Martin K. Petersen
0 siblings, 1 reply; 9+ messages in thread
From: Jes Sorensen @ 2015-06-24 11:04 UTC (permalink / raw)
To: Roman Mamedov; +Cc: neilb, linux-raid
Roman Mamedov <rm@romanrm.net> writes:
> On Tue, 23 Jun 2015 20:26:12 -0400
> Jes.Sorensen@redhat.com wrote:
>
>> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>>
>> Neil,
>>
>> I have been hitting issues with discard being ridiculously slow on
>> arrays with certain types of SSDs that seem to serialize discard
>> processing.
>>
>> This is particularly bad as I have seen systems where the IMSM BIOS
>> defaults to 4KB chunk size, combined with these badly performing
>> drives, it could bump the mkfs on an array from seconds to over 40
>> minutes. Most users will stick to the defaults and then hit the
>> problem during install without understanding why it goes wrong :(
>>
>> The problem is that there is no way to benchmark our way to this or
>> somehow test if a drive performs discard at reasonable speed. I
>> suggest we take an approach similar to that of RAID456 and default to
>> disabling discard, except for the case where the user knows the drives
>> are safe.
>>
>> Thoughts?
>
> It's very unfortunate if you would cripple all the good SSD models because of
> a few bad ones. No one will remember to explicitly put the override to enable
> TRIM, or perhaps even know that it gets disabled in md in the first place. The
> only thing they will later notice is lowered performance and lifespan of their
> SSDs.
We already disable discard per default on raid456 in a similar manner
because some drives unreliably report discard_zeroes_data when in
reality they don't zero the data.
If there were a way to reliably detect these things it would be fine;
unfortunately there isn't.
Jes
* Re: [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
2015-06-24 11:04 ` Jes Sorensen
@ 2015-06-25 1:03 ` Martin K. Petersen
0 siblings, 0 replies; 9+ messages in thread
From: Martin K. Petersen @ 2015-06-25 1:03 UTC (permalink / raw)
To: Jes Sorensen; +Cc: Roman Mamedov, neilb, linux-raid
>>>>> "Jes" == Jes Sorensen <Jes.Sorensen@redhat.com> writes:
Jes> We already disable discard per default on raid456 in a similar
Jes> manner because some of them unreliably reports discard_zeroes_data
Jes> when they in reality don't.
Jes> If there was a way to reliably detect these things it would be
Jes> fine, unfortunately there isn't.
We whitelist drives that do the right thing now.
--
Martin K. Petersen Oracle Linux Engineering
* Re: [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
2015-06-24 11:02 ` Jes Sorensen
@ 2015-06-25 1:05 ` Martin K. Petersen
0 siblings, 0 replies; 9+ messages in thread
From: Martin K. Petersen @ 2015-06-25 1:05 UTC (permalink / raw)
To: Jes Sorensen; +Cc: NeilBrown, linux-raid
>>>>> "Jes" == Jes Sorensen <Jes.Sorensen@redhat.com> writes:
>> Or is this some non-SATA/SCSI SSD that has a 'make_request_fn'
>> driver? I think I came across one of those before (NVMe). In that
>> case - the driver needs to be fixed.
Jes> Nope the block layer doesn't merge the requests at this point.
Then something broke it. Discard merging used to work.
--
Martin K. Petersen Oracle Linux Engineering
* Re: [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1
2015-06-24 0:26 [PATCH 0/1] RFC: Unreliable discard performance can cripple RAID1 Jes.Sorensen
` (2 preceding siblings ...)
2015-06-24 7:55 ` NeilBrown
@ 2015-07-07 4:42 ` Mike Snitzer
3 siblings, 0 replies; 9+ messages in thread
From: Mike Snitzer @ 2015-07-07 4:42 UTC (permalink / raw)
To: Jes.Sorensen; +Cc: Neil Brown, linux-raid@vger.kernel.org, Mikulas Patocka
On Tue, Jun 23, 2015 at 8:26 PM, <Jes.Sorensen@redhat.com> wrote:
> From: Jes Sorensen <Jes.Sorensen@redhat.com>
>
> Neil,
>
> I have been hitting issues with discard being ridiculously slow on
> arrays with certain types of SSDs that seem to serialize discard
> processing.
>
> This is particularly bad as I have seen systems where the IMSM BIOS
> defaults to 4KB chunk size, combined with these badly performing
> drives, it could bump the mkfs on an array from seconds to over 40
> minutes. Most users will stick to the defaults and then hit the
> problem during install without understanding why it goes wrong :(
>
> The problem is that there is no way to benchmark our way to this or
> somehow test if a drive performs discard at reasonable speed. I
> suggest we take an approach similar to that of RAID456 and default to
> disabling discard, except for the case where the user knows the drives
> are safe.
>
> Thoughts?
>
> Cheers,
> Jes
>
>
> Jes Sorensen (1):
> raid0: Disable discard per default due to performance uncertainty
>
> drivers/md/raid0.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
As we've discussed in private (but for the benefit of others):
MD raid0 should do what dm-stripe does, which is to calculate the full
extent to discard for each member in the raid0.
This avoids issuing lots of small discards and hoping the block layer
merges them back up.
See this DM commit for reference:
http://git.kernel.org/linus/7b76ec11fec40203836b488496d2df082d5b2022